Elasticsearch analyzer settings and matching data

xvw2m8pv  posted 8 months ago in ElasticSearch

I'm trying out an example that creates an index using the same settings as in the documentation:

{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_custom_analyzer": { 
          "char_filter": [
            "emoticons"
          ],
          "tokenizer": "punctuation",
          "filter": [
            "lowercase",
            "english_stop"
          ]
        }
      },
      "tokenizer": {
        "punctuation": { 
          "type": "pattern",
          "pattern": "[ .,!?]"
        }
      },
      "char_filter": {
        "emoticons": { 
          "type": "mapping",
          "mappings": [
            ":) => _happy_",
            ":( => _sad_"
          ]
        }
      },
      "filter": {
        "english_stop": { 
          "type": "stop",
          "stopwords": "_english_"
        }
      }
    }
  }
}

Then I index a document:

POST /my-index-000003/_doc/1
{
  "content": "I'm feeling :) today, but the weather is quite gloomy :("
}

However, when I search for :) or happy, I get no matches. Why?

5lwkijsr #1

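At index time, :) is replaced with _happy_ and :( is replaced with _sad_. As a result, the index no longer contains :) or :( as literal terms to search for.
If you don't want your emoticons to be replaced, you need to use a synonym token filter instead of a character filter.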
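
As a rough illustration of the synonym approach, the index could be defined along these lines. This is an untested sketch: the index name my-index-synonyms, the analyzer name my_synonym_analyzer, and the filter name emoticon_synonyms are made up for the example, and the synonym rules are only one possible choice. The emoticons character filter is dropped, so :) and :( survive tokenization and the synonym filter adds happy and sad alongside them.

```console
# Hypothetical index, analyzer, and filter names; not from the original question
PUT /my-index-synonyms
{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_synonym_analyzer": {
          "tokenizer": "punctuation",
          "filter": [
            "lowercase",
            "emoticon_synonyms",
            "english_stop"
          ]
        }
      },
      "tokenizer": {
        "punctuation": {
          "type": "pattern",
          "pattern": "[ .,!?]"
        }
      },
      "filter": {
        "emoticon_synonyms": {
          "type": "synonym",
          "synonyms": [
            ":), happy",
            ":(, sad"
          ]
        },
        "english_stop": {
          "type": "stop",
          "stopwords": "_english_"
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "content": {
        "type": "text",
        "analyzer": "my_synonym_analyzer"
      }
    }
  }
}
```

With a setup like this, a match query on content for either :) or happy should find the document, since both terms are indexed at the same position.
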
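If you search for happy, it will not match _happy_, because the pattern tokenizer splits only on the characters in [ .,!?] and therefore keeps the underscores as part of the token. Searching for _happy_ does work; I was able to reproduce this with the following query: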

POST test/_search
{
  "query": {
    "match": {
      "content": "_happy_"
    }
  }
}
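
To check which terms the analyzer actually produces, the _analyze API can be run against the index. The request below is a sketch that assumes the index my-index-000003 from the question was created with the settings shown there.

```console
# Assumes my-index-000003 exists and defines my_custom_analyzer
GET /my-index-000003/_analyze
{
  "analyzer": "my_custom_analyzer",
  "text": "I'm feeling :) today, but the weather is quite gloomy :("
}
```

The response should list the tokens i'm, feeling, _happy_, today, weather, quite, gloomy and _sad_: the emoticons have been rewritten, and but, the and is have been dropped as stop words. Neither happy nor :) appears as a term.
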

Note that searching for _happy_ only works if the content field is mapped with the my_custom_analyzer analyzer:

"mappings": {
    "properties": {
      "content": {
        "type": "text",
        "analyzer": "my_custom_analyzer"
      }
    }
  }
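
The analyzer of an existing text field cannot be changed, and indexing a document before defining the mapping lets dynamic mapping create content with the default standard analyzer. One way to apply the mapping is therefore to recreate the index with the settings and the mapping together. The following is a sketch, assuming the index name my-index-000003 from the question and that its current contents can be discarded:

```console
# Assumption: the existing index holds only test data and can be dropped
DELETE /my-index-000003

# Recreate it with the analysis settings and the field mapping in one request
PUT /my-index-000003
{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_custom_analyzer": {
          "char_filter": ["emoticons"],
          "tokenizer": "punctuation",
          "filter": ["lowercase", "english_stop"]
        }
      },
      "tokenizer": {
        "punctuation": {
          "type": "pattern",
          "pattern": "[ .,!?]"
        }
      },
      "char_filter": {
        "emoticons": {
          "type": "mapping",
          "mappings": [
            ":) => _happy_",
            ":( => _sad_"
          ]
        }
      },
      "filter": {
        "english_stop": {
          "type": "stop",
          "stopwords": "_english_"
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "content": {
        "type": "text",
        "analyzer": "my_custom_analyzer"
      }
    }
  }
}
```

After the document is indexed again, a match query for _happy_ on content should return it.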
