How do I get the exact position of text occurring in a PDF using Elasticsearch?

Posted by 2w2cym1i on 2021-06-13 in ElasticSearch

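I am building an application that needs to let users search for similar content inside PDFs (possibly using Elasticsearch). I have written some code that finds which PDF file matches using Elasticsearch, but I don't know how to get the exact position of the matched text, or how to highlight the search results. A code example would be very helpful.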

4szc88ey1#

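You can use highlighting. Highlighting lets you get highlighted snippets from one or more fields in your search results, so you can show users where the query matched.
Here is a working example with index data, a search query, and the search result.
Index data: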

{
  "content": "The nonprofit Wikimedia Foundation provides the essential infrastructure for free knowledge. We host Wikipedia, the free online encyclopedia, created, edited, and verified by volunteers around the world, as well as many other vital community projects. All of which is made possible thanks to donations from individuals like you. We welcome anyone who shares our vision to join us in collecting and sharing knowledge that fully represents human diversity."
}

Search query:

{
  "query": {
    "match": {
      "content": "online"
    }
  },
  "highlight": {
    "fields": {
      "content": {}
    }
  }
}
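
If you build the request from application code, the same body can be assembled as a plain dictionary. The sketch below is only an illustration: `build_highlight_query` is a hypothetical helper name, not part of any client library. It uses the standard highlighter options `pre_tags`, `post_tags`, and `number_of_fragments`; setting `number_of_fragments` to 0 asks Elasticsearch to return the whole field with the tags inserted instead of short fragments, which can make it easier to locate matches afterwards.

```python
import json


def build_highlight_query(field, text, pre_tag="<em>", post_tag="</em>",
                          whole_field=False):
    """Build a match query with highlighting for one field.

    Hypothetical helper: it only assembles the JSON body and does not
    talk to a cluster.
    """
    field_options = {}
    if whole_field:
        # 0 fragments means: return the entire field value, highlighted.
        field_options["number_of_fragments"] = 0
    return {
        "query": {"match": {field: text}},
        "highlight": {
            "pre_tags": [pre_tag],
            "post_tags": [post_tag],
            "fields": {field: field_options},
        },
    }


if __name__ == "__main__":
    # Print the body so it can be pasted into a _search request.
    print(json.dumps(build_highlight_query("content", "online"), indent=2))
```

With the defaults, the result has the same `query` and `highlight.fields` as the query above, plus explicit `<em>` tags.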

Search result:

"hits": [
      {
        "_index": "65463291",
        "_type": "_doc",
        "_id": "1",
        "_score": 0.2876821,
        "_source": {
          "content": "The nonprofit Wikimedia Foundation provides the essential infrastructure for free knowledge. We host Wikipedia, the free online encyclopedia, created, edited, and verified by volunteers around the world, as well as many other vital community projects. All of which is made possible thanks to donations from individuals like you. We welcome anyone who shares our vision to join us in collecting and sharing knowledge that fully represents human diversity."
        },
        "highlight": {                      // note this
          "content": [
            "We host Wikipedia, the free <em>online</em> encyclopedia, created, edited, and verified by volunteers around the"
          ]
        }
      }
    ]
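
The highlight fragment shows where the match is, but it does not give a numeric position. One possible way to get character offsets is to strip the tags from the fragment, find that plain text in `_source`, and then map each tagged term back. The function below is a minimal sketch under these assumptions: `highlight_offsets` is a hypothetical name, the fragment text appears verbatim in the source field, and the first occurrence is the right one. The offsets refer to the extracted text stored in Elasticsearch, not to page coordinates inside the PDF.

```python
def highlight_offsets(source, fragment, pre_tag="<em>", post_tag="</em>"):
    """Return (start, end, term) offsets in `source` for each highlighted term.

    Hypothetical helper; assumes the untagged fragment occurs verbatim
    in `source` and uses its first occurrence.
    """
    plain = fragment.replace(pre_tag, "").replace(post_tag, "")
    base = source.find(plain)
    if base < 0:
        return []  # fragment could not be located in the source text

    offsets = []
    pos = 0       # current scan position inside the tagged fragment
    removed = 0   # number of tag characters seen before the current term
    while True:
        start = fragment.find(pre_tag, pos)
        if start < 0:
            break
        end = fragment.find(post_tag, start + len(pre_tag))
        if end < 0:
            break
        term = fragment[start + len(pre_tag):end]
        plain_start = start - removed  # position of the term in the untagged text
        offsets.append((base + plain_start, base + plain_start + len(term), term))
        removed += len(pre_tag) + len(post_tag)
        pos = end + len(post_tag)
    return offsets


if __name__ == "__main__":
    source = "We host Wikipedia, the free online encyclopedia."
    fragment = "We host Wikipedia, the free <em>online</em> encyclopedia."
    for start, end, term in highlight_offsets(source, fragment):
        # Slicing the source with the offsets gives back the matched term.
        print(start, end, term, source[start:end] == term)
```

If the default highlighter's fragments do not line up with the source text for your data, requesting the whole field with `number_of_fragments` set to 0 and running the same tag scan on it is another option.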
