Python Elasticsearch: Failed to build [request] after last required field arrived

xa9qqrwz  posted on 2022-10-30  in Python

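I am running a ranking evaluation with Elasticsearch on the LinkSO dataset. I used Kibana to create the files needed for curl, and I wrote a ranking function. However, I get the following error: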

BadRequestError: BadRequestError(400, 'x_content_parse_exception', 'Failed to build [request] after last required field arrived')

This is my code for computing the NDCG scores:

def get_ndcg(dataframe, input_index="java_lm"):
    ndcg_list = []
    # visit one row per qid1 (rows are stepped through 30 at a time)
    for i in range(0, len(dataframe["qid1"]), 30):
        qid1_title = es.get(index=input_index, id=dataframe["qid1"][i])['_source']['title']

        # load ratings from the json file; the with-block closes the file afterwards
        with open("qids/" + input_index + "/" + str(dataframe["qid1"][i]) + ".json") as f:
            data = json.load(f)

        _search = ranking(dataframe["qid1"][i], qid1_title, ratings=data)

        result = es.rank_eval(index=input_index, body=_search)

        ndcg = result['metric_score']
        ndcg_list.append(ndcg)

    return ndcg_list

The error is raised by the es.rank_eval() call.
My ranking function is:

def ranking(qid1, qid1_title, ratings):
    _search = {
        "requests": [
            {
                "id": str(qid1),
                "request": {
                    "query": {
                        "bool": {
                            "must_not": {
                                "match": {
                                    "_id": qid1
                                }
                            },
                            "should": [
                                {
                                    "match": {
                                        "title": {
                                            "query": qid1_title,
                                            "boost": 3.0,
                                            "analyzer": "my_analyzer"
                                        }
                                    }
                                },
                                {
                                    "match": {
                                        "body": {
                                            "query": qid1_title,
                                            "boost": 0.5,
                                            "analyzer": "my_analyzer"
                                        }
                                    }
                                },
                                {
                                    "match": {
                                        "answer": {
                                            "query": qid1_title,
                                            "boost": 0.5,
                                            "analyzer": "my_analyzer"
                                        }
                                    }
                                }
                            ]
                        }
                    }
                },
                "ratings": ratings
            }
        ],
        "metric": {
            "dcg": {
                "k": 10,
                "normalize": True
            }
        }
    }
    return _search

The ratings passed into _search come from a JSON file of the following form:

[
{"_index": "java_lm", "_id": "15194804", "rating": 0},
{"_index": "java_lm", "_id": "18264178", "rating": 0},
{"_index": "java_lm", "_id": "16225177", "rating": 1},
{"_index": "java_lm", "_id": "16445238", "rating": 0},
{"_index": "java_lm", "_id": "17233226", "rating": 0}
]

The PUT command I ran in Kibana to create the index is:

PUT /java_lm
{
  "settings": {
    "similarity": {
      "LM": {
        "type": "LMDirichlet",
        "mu": 2000
      }
    },
    "analysis": {
      "analyzer": {
        "my_analyzer": {
          "tokenizer": "whitespace",
          "filter": [
            "lowercase",
            "porter_stem"
          ]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "title": {
        "type": "text",
        "analyzer": "my_analyzer",
        "similarity": "LM"
      },
      "body": {
        "type": "text",
        "analyzer": "my_analyzer",
        "similarity": "LM"
      },
      "answer": {
        "type": "text",
        "analyzer": "my_analyzer",
        "similarity": "LM"
      }
    }
  }
}

I can't figure out where I went wrong. Can anyone suggest how to fix this?

hts6caw3

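This error occurs when the JSON ratings file used for the rank evaluation was created incorrectly. Check that every entry in the file has the required properties: _index, _id, and rating.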
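
As a minimal sketch of that check, the helper below scans a ratings list before it is handed to es.rank_eval(). The name find_rating_problems is hypothetical, and the specific checks (missing keys, a non-integer rating, a repeated _index/_id pair) are assumptions about what may make Elasticsearch reject the request, not a confirmed diagnosis of the error in the question.

```python
import json

# Properties each rated document is expected to carry.
REQUIRED_KEYS = ("_index", "_id", "rating")


def find_rating_problems(ratings):
    """Return a list of human-readable problems found in a ratings list."""
    if not isinstance(ratings, list):
        return ["top-level JSON value is not a list"]

    problems = []
    seen = set()
    for pos, entry in enumerate(ratings):
        if not isinstance(entry, dict):
            problems.append(f"entry {pos}: not a JSON object")
            continue
        missing = [key for key in REQUIRED_KEYS if key not in entry]
        if missing:
            problems.append(f"entry {pos}: missing {', '.join(missing)}")
            continue
        rating = entry["rating"]
        # bool is a subclass of int in Python, so exclude it explicitly
        if isinstance(rating, bool) or not isinstance(rating, int):
            problems.append(f"entry {pos}: rating {rating!r} is not an integer")
        # assumption: the same document rated twice in one request is a problem
        key = (entry["_index"], str(entry["_id"]))
        if key in seen:
            problems.append(f"entry {pos}: duplicate document {key[0]}/{key[1]}")
        seen.add(key)
    return problems


# Illustrative data only: one duplicated _id and one entry without a rating.
sample = json.loads("""
[
  {"_index": "java_lm", "_id": "15194804", "rating": 0},
  {"_index": "java_lm", "_id": "16225177", "rating": 1},
  {"_index": "java_lm", "_id": "16225177", "rating": 0},
  {"_index": "java_lm", "_id": "17233226"}
]
""")
for problem in find_rating_problems(sample):
    print(problem)
```

In get_ndcg, a check like this could be run on data right after json.load(f), so that any qid1 whose file reports problems is logged or skipped before the request is sent.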
