Elasticsearch: why does GET index/_doc/entry return nothing? Not an error, not even the "_index", "_id", "_version" metadata, but really nothing?

Posted by uttx8gqw 5 months ago in ElasticSearch


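I want to ingest this PDF into Elasticsearch:
https://kerlaz.bzh/wp-content/uploads/2023/09/Compte-rendu-conseil-04-09-2023.pdf
I created an index (1) and, I believe, ingested the PDF successfully (2).
If I then search for the entries stored in the comptes_rendus_municipaux index, the Kibana console lists it: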

GET comptes_rendus_municipaux/_search?pretty=true
{
 "query": {
    "match_all": {
    }
  },
  
  "size": 9000, 
  "stored_fields": ["_id"]
}

However, if I try to view entry 29090-2023-09-04 of comptes_rendus_municipaux with GET comptes_rendus_municipaux/_doc/29090-2023-09-04, the Kibana console displays nothing.
Not an error, and not a response starting with:

{
  "_index": "comptes_rendus_municipaux",
  "_id": "29090-2023-09-04",
  "_version": 1,
  "_seq_no": 1,
  "_primary_term": 1,
  "found": true,
  "_source": {
    "data": "JVBERi0xLjUKJdDU..."

which is what I would expect.
Really nothing at all. Why?
(1) How I created the index

PUT comptes_rendus_municipaux/
{
  "aliases": {},
  "mappings": {
    "properties": {
      "attachment": {
        "properties": {
          "author": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          },
          "content": {
            "type": "text",
            "analyzer": "french",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          },
          "content_length": {
            "type": "long"
          },
          "content_type": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          },
          "creator_tool": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          },
          "date": {
            "type": "date"
          },
          "format": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          },
          "keywords": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          },
          "language": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          },
          "modified": {
            "type": "date"
          },
          "title": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          }
        }
      },
      "data": {
        "type": "text",
        "fields": {
          "keyword": {
            "type": "keyword",
            "ignore_above": 256
          }
        }
      }
    }
  }
}

(2) How I ingested the PDF
The index.sh script:

#!/bin/bash
export source=$1
export index=$2
export entree=$3

# Le paramètre source doit être alimenté
if [ -z "$source" ]; then
   echo "Le nom du fichier pdf à indexer dans Elastic est attendu en paramètre." >&2
   exit 1
fi

# Si le fichier source n'a pas d'extension, lui rajouter celle .pdf
if [[ "$source" != *"."* ]]; then
   source=$source.pdf
fi

# Il doit avoir l'extension pdf
if [[ "$source" != *".pdf" ]]; then
   echo "Le fichier à indexer dans Elastic doit avoir l'extension .pdf" >&2
   exit 1
fi

# Si l'index n'est pas alimenté, lui assigner apprentissage
if [ -z "$index" ]; then
  index=apprentissage
fi

# Si l'entrée n'est pas alimentée, lui assigner le nom du fichier sans extension
if [ -z "$entree" ]; then
  entree=$(basename "${source%.*}")
fi

host="http://localhost:9200"
user="elastic"
pwd="Ox-JoQT0po=pdOT-LeE*"

json_file=$(mktemp)
cur_url="$host/$index/_doc/$entree?pipeline=attachment"

echo '{"data"  : "'"$( base64 "$source" -w 0    )"'"}' >"$json_file"
# echo "transfert via $json_file vers $cur_url"

if ! ingest=$(curl -s -X PUT -H "Content-Type: application/json" -u "$user:$pwd" -d "@$json_file" "$cur_url"); then
  echo "Echec de l'ingestion dans Elastic de $source dans l'index $index, entrée $entree : $ingest" >&2
  rm -f "$json_file"
  # Inside this branch $? is 0 (the status of the negated test), so exit with 1 explicitly
  exit 1
fi

rm "$json_file"
echo "$source indexé dans l'entrée $entree de l'index $index d'Elastic"

Run from the command line:

index Kerlaz-Compte-rendu-conseil-04-09-2023 comptes_rendus_municipaux 29090-2023-09-04

With a successful output:

Kerlaz-Compte-rendu-conseil-04-09-2023.pdf indexé dans l'entrée 29090-2023-09-04 de l'index comptes_rendus_municipaux d'Elastic
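Note that the success message above only shows that curl exited with status 0. Without `--fail`, curl exits with 0 for any HTTP response it receives, including 4xx and 5xx, so the message alone does not prove the document was indexed. A minimal sketch of a stricter check on the response body, assuming the usual shape of Elasticsearch index responses (a `"result"` field on success, a top-level `"error"` object on failure); the helper name `ingest_succeeded` and the sample bodies are hypothetical:

```shell
#!/bin/bash
# Hypothetical helper: returns 0 when the response body reports a created or
# updated document, 1 otherwise. This is a crude string match, not a JSON parser.
ingest_succeeded() {
  local response=$1
  # A failed request carries an "error" object in its body.
  if printf '%s' "$response" | grep -q '"error"'; then
    return 1
  fi
  # A successful index request reports "result": "created" or "updated".
  printf '%s' "$response" | grep -Eq '"result"[[:space:]]*:[[:space:]]*"(created|updated)"'
}

# Illustrative bodies (abridged, not captured from a real cluster).
ok='{"_index":"comptes_rendus_municipaux","_id":"29090-2023-09-04","_version":1,"result":"created"}'
ko='{"error":{"type":"illegal_argument_exception","reason":"example"},"status":400}'

ingest_succeeded "$ok" && echo "ok body accepted"
ingest_succeeded "$ko" || echo "error body rejected"
```

In index.sh, the same test could be applied to `$ingest` before printing the final message.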


elasticsearch

Source: https://stackoverflow.com/questions/77695950/why-does-a-get-index-doc-entry-return-nothing-not-an-error-not-even-an-inde

1 Answer

xvw2m8pv1#

Postscript:

On my 8.11.3 installation, the request works if I run it through curl:

curl -X GET -H "Content-Type: application/json" \
   -u "elastic:Ox-JoQT0po=pdOT-LeE*" \
   "http://localhost:9200/comptes_rendus_municipaux/_doc/29090-2023-09-04"

The response is a single line of 4.6 MB (the original PDF is 3.4 MB).
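The size difference is mostly base64 overhead: the encoding emits 4 characters for every 3 input bytes, so a 3.4 MB PDF becomes roughly 4.5 MB of text before any extracted attachment fields are added. A small sketch of that arithmetic, cross-checked against the `base64` tool on a synthetic input (the helper name `encoded_size` is made up for illustration):

```shell
#!/bin/bash
# Base64 emits 4 output characters for every 3 input bytes; padding rounds up.
encoded_size() {
  local bytes=$1
  echo $(( (bytes + 2) / 3 * 4 ))
}

# Cross-check the formula against the real encoder on 3000 zero bytes.
# Newlines are stripped so only the encoded characters are counted.
actual=$(head -c 3000 /dev/zero | base64 | tr -d '\n' | wc -c | tr -d ' ')
echo "formula: $(encoded_size 3000), base64: $actual"
```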
I am probably facing a Kibana behavior: apart from a 200 - OK status, it displays nothing because the response is too large.
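One way to test that hypothesis is to request the same document without its large `data` field, using the `_source_excludes` query parameter of the get document API. If the metadata and the `attachment` fields then show up in the Kibana console, the size of the response is the likely cause. A minimal sketch that builds such a URL; the helper name `doc_url_without_field` is hypothetical, and the curl call is left commented out:

```shell
#!/bin/bash
# Build a GET document URL that leaves the named field out of the returned _source.
doc_url_without_field() {
  local host=$1 index=$2 id=$3 field=$4
  printf '%s/%s/_doc/%s?_source_excludes=%s\n' "$host" "$index" "$id" "$field"
}

url=$(doc_url_without_field "http://localhost:9200" comptes_rendus_municipaux 29090-2023-09-04 data)
echo "$url"

# To send it, with the credentials variables from index.sh (not run here):
# curl -s -u "$user:$pwd" "$url"
```

The equivalent Kibana console request would be GET comptes_rendus_municipaux/_doc/29090-2023-09-04?_source_excludes=data.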

But the lack of a more explicit warning is unsettling.
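If the raw base64 is not needed after text extraction, the stored document can be kept small at ingest time. The definition of the `attachment` pipeline is not shown in the question, so the following is only a sketch under that assumption: it uses the attachment processor's `remove_binary` option, available in recent 8.x releases, to drop the `data` field once the content has been extracted. The function name and the description text are illustrative:

```shell
#!/bin/bash
# Print a hypothetical pipeline body: extract text from "data", then drop the base64.
attachment_pipeline_body() {
  cat <<'EOF'
{
  "description": "Extract attachment information and discard the base64 source",
  "processors": [
    {
      "attachment": {
        "field": "data",
        "remove_binary": true
      }
    }
  ]
}
EOF
}

attachment_pipeline_body

# To register it, with the variables from index.sh (not run here):
# attachment_pipeline_body | curl -s -X PUT -H "Content-Type: application/json" \
#   -u "$user:$pwd" -d @- "$host/_ingest/pipeline/attachment"
```

With such a pipeline, `_source` would no longer contain `data`, so a GET on the document would return only the metadata and the `attachment` fields.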