Querying a JSON object in Python with two or more filters at different nesting levels

kninwzqo, posted on 2023-05-02 in Python

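I'm a bit lost trying to query a JSON object that looks like the one below. I want to filter by the child key "type": "header1" and by the grandchild key "type": "simpletxt", and get back all matching results so that I can loop over them.
This is the original raw JSON: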

{
  "obj": "typed",
  "results": [
      {
        "obj": "blocktext",
        "type": "header1",
        "header1": {
          "rich_text": [
            {
              "type": "simpletxt",
              "txt": "About me"
            }
          ]
        }
      },
      {
        "obj": "blocktext",
        "type": "header1",
        "header1": {
          "rich_text": [
            {
              "type": "simpletxt",
              "txt": "Imprint"
            }
          ]
        }
      },
      {
        "obj": "blocktext",
        "type": "header1",
        "header1": {
          "rich_text": [
            {
              "type": "simpletxt",
              "txt": "About me"
            }
          ]
        }
      },
      {
        "obj": "blocktext",
        "type": "code",
        "code": {
          "rich_text": [
            {
              "type": "javascript",
              "txt": "ABCDEFG"
            }
          ]
        }
      }
    ]
}

I'm trying to do this with Python in Google Colab, but I have no idea how to query a JSON object by two or more parameters at different levels.

Answer #1 (yws3nbqq)

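Assuming your JSON is in a string variable named jstr, you can convert it to a dict with json.loads, then use filter to select the sub-dicts in results that have 'type' == 'header1' and whose nested type equals 'simpletxt':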

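import json
dd = json.loads(jstr)  # parse the JSON string into a dict
# keep blocks of type 'header1' whose first rich_text item has type 'simpletxt'
res = list(filter(lambda r: r['type'] == 'header1' and r['header1']['rich_text'][0]['type'] == 'simpletxt', dd['results']))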

Output:

[
    {
        "obj": "blocktext",
        "type": "header1",
        "header1": {
            "rich_text": [
                {
                    "type": "simpletxt",
                    "txt": "About me"
                }
            ]
        }
    },
    {
        "obj": "blocktext",
        "type": "header1",
        "header1": {
            "rich_text": [
                {
                    "type": "simpletxt",
                    "txt": "Imprint"
                }
            ]
        }
    },
    {
        "obj": "blocktext",
        "type": "header1",
        "header1": {
            "rich_text": [
                {
                    "type": "simpletxt",
                    "txt": "About me"
                }
            ]
        }
    }
]
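
Since the question asks for results that can be looped over, here is a minimal sketch of the same filter written as a list comprehension, followed by a loop that collects the txt values. The data is a shortened copy of the sample from the question, and the titles variable is introduced only for illustration.

```python
import json

# Shortened copy of the sample data from the question.
jstr = '''
{
  "obj": "typed",
  "results": [
    {"obj": "blocktext", "type": "header1",
     "header1": {"rich_text": [{"type": "simpletxt", "txt": "About me"}]}},
    {"obj": "blocktext", "type": "header1",
     "header1": {"rich_text": [{"type": "simpletxt", "txt": "Imprint"}]}},
    {"obj": "blocktext", "type": "code",
     "code": {"rich_text": [{"type": "javascript", "txt": "ABCDEFG"}]}}
  ]
}
'''

dd = json.loads(jstr)

# Same two-level condition as the filter/lambda version above.
res = [
    r for r in dd['results']
    if r['type'] == 'header1'
    and r['header1']['rich_text'][0]['type'] == 'simpletxt'
]

# res is a plain list of dicts, so it can be looped over directly.
titles = []
for block in res:
    titles.append(block['header1']['rich_text'][0]['txt'])

print(titles)
```

The list comprehension and list(filter(...)) produce the same list here; which one to use is mostly a matter of taste.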

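Note that the above assumes (as in the sample data) that the rich_text list contains only one item. If it can contain several, you will need to change the filter to check whether all or any (depending on your needs) of the nested type values are 'simpletxt'. For example: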

# any: at least one rich_text item has type 'simpletxt'
res = list(filter(lambda r: r['type'] == 'header1' and any(rt['type'] == 'simpletxt' for rt in r['header1']['rich_text']), dd['results']))
# all: every rich_text item has type 'simpletxt'
res = list(filter(lambda r: r['type'] == 'header1' and all(rt['type'] == 'simpletxt' for rt in r['header1']['rich_text']), dd['results']))

For your sample data, both return the same result, because each rich_text list contains only one value.
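
The difference between any and all only shows up when a rich_text list holds more than one item. The sketch below wraps the check in a hypothetical helper, filter_blocks (not part of the original answer), and runs it on made-up data that includes a second item of type "link" and a block with no nested payload. It assumes, as in the sample data, that the nested key has the same name as the block's type, and it uses .get so that blocks with missing keys are skipped instead of raising KeyError.

```python
def filter_blocks(results, block_type, rich_type, match=any):
    """Hypothetical helper: select blocks by type and by rich_text item type.

    match is the built-in any or all. Assumes the nested payload lives
    under a key named after the block type, as in the sample data.
    """
    selected = []
    for block in results:
        if block.get('type') != block_type:
            continue
        items = block.get(block_type, {}).get('rich_text', [])
        # all() of an empty iterable is True, so skip empty lists explicitly.
        if items and match(item.get('type') == rich_type for item in items):
            selected.append(block)
    return selected


# Made-up data for illustration; the "link" type is not in the original JSON.
results = [
    {"type": "header1", "header1": {"rich_text": [
        {"type": "simpletxt", "txt": "About me"},
        {"type": "link", "txt": "more"}]}},
    {"type": "header1", "header1": {"rich_text": [
        {"type": "simpletxt", "txt": "Imprint"}]}},
    {"type": "code", "code": {"rich_text": [
        {"type": "javascript", "txt": "ABCDEFG"}]}},
    {"type": "header1"},  # no nested payload at all
]

any_hits = filter_blocks(results, 'header1', 'simpletxt', match=any)
all_hits = filter_blocks(results, 'header1', 'simpletxt', match=all)

print(len(any_hits), len(all_hits))
```

With this data, any keeps both header1 blocks that contain a 'simpletxt' item, while all keeps only the block whose every item is 'simpletxt'.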
