Regex to match a word, but only if it doesn't start with a non-alphanumeric character

Posted by 9fkzdhlc on 2021-07-13 in Java


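I have some sentences in which I want to identify words, but not if they start with a non-alphanumeric character. It is fine if they end with one.
An example of what I have done: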

import re

words = ["THIS", "THAT"]
sentences = ["I want to identify THIS word.", "And THAT!", "But I do not want to identify !THIS word", "Or [THIS] word"]

for sentence in sentences:
    for word in words:
        # \b on both sides: matches the word wherever a word boundary surrounds it
        word_re = re.search(r"\b(%s)\b" % word, sentence)
        if word_re:
            print("It's a match!")

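The code above reports a match for every sentence. I want something that matches only in the first two sentences. Is it possible to do this with a regex?
Thanks!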

python regex String

Source: https://stackoverflow.com/questions/67289995/regex-to-match-word-but-only-if-it-doesnt-start-with-a-non-alphanumerical-chara

1 Answer


4dc9hkyq1#

You can use a regex like

(?<!\S)(?:THIS|THAT)\b

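Details:
(?<!\S) - a left-hand whitespace boundary: the match must begin at the start of the string or immediately after a whitespace character
(?:THIS|THAT) - a non-capturing group matching THIS or THAT
\b - a word boundary.
See the Python demo: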

import re
words = ["THIS", "THAT"]
sentences = ["I want to identify THIS word.", "And THAT!", "But I do not want to identify !THIS word", "Or [THIS] word"] 

pattern = fr"(?<!\S)(?:{'|'.join(words)})\b"
for sentence in sentences:
    word_re = re.search(pattern, sentence) 
    if word_re:
        print(f"'{sentence}' is a match!")

# => 'I want to identify THIS word.' is a match!
#    'And THAT!' is a match!
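To see why the leading \b in the question's pattern accepted all four sentences, the two left-hand checks can be compared side by side. This is a minimal sketch; the names word_boundary, whitespace_boundary, boundary_hits and lookbehind_hits are illustrative only and do not come from the original post.

```python
import re

sentences = [
    "I want to identify THIS word.",
    "And THAT!",
    "But I do not want to identify !THIS word",
    "Or [THIS] word",
]

# Leading \b: a word boundary also exists between "!" or "[" and "T",
# so the word is found in every sentence.
word_boundary = re.compile(r"\b(?:THIS|THAT)\b")

# Leading (?<!\S): the match must begin at the start of the string
# or right after a whitespace character.
whitespace_boundary = re.compile(r"(?<!\S)(?:THIS|THAT)\b")

boundary_hits = [bool(word_boundary.search(s)) for s in sentences]
lookbehind_hits = [bool(whitespace_boundary.search(s)) for s in sentences]

print(boundary_hits)    # [True, True, True, True]
print(lookbehind_hits)  # [True, True, False, False]
```

The trailing \b is kept in both patterns, which is why "THIS word." and "THAT!" still match: punctuation after the word is allowed, only punctuation directly before it is rejected.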

If THIS or THAT can contain special characters, replace pattern = fr"(?<!\S)(?:{'|'.join(words)})\b" with pattern = fr"(?<!\S)(?:{'|'.join(map(re.escape, words))})\b".
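The effect of re.escape can be checked with a word that contains a regex metacharacter. The sketch below assumes a hypothetical word list containing "A.B"; that word, the sample texts and the variable names are made up for illustration.

```python
import re

# Hypothetical word list: "A.B" contains ".", which is a regex metacharacter.
words = ["A.B", "THAT"]

# Without escaping, "." matches any character.
raw_pattern = fr"(?<!\S)(?:{'|'.join(words)})\b"
# With re.escape, "." is matched literally.
escaped_pattern = fr"(?<!\S)(?:{'|'.join(map(re.escape, words))})\b"

raw_match = bool(re.search(raw_pattern, "Use AxB here"))
escaped_miss = bool(re.search(escaped_pattern, "Use AxB here"))
escaped_match = bool(re.search(escaped_pattern, "Use A.B here"))

print(raw_match)      # True: the unescaped "." matched "x"
print(escaped_miss)   # False: "AxB" is not the literal text "A.B"
print(escaped_match)  # True
```

Note that the trailing \b still expects each word to end in a word character. A word ending in a symbol such as "+" would not be followed by a word boundary before whitespace, so it would need a different right-hand check.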

Answered 2021-07-13
