How do I read all JSON files from all subdirectories?

Posted by 3qpi33ja on 2021-08-20 in Java

I want to read a large number of JSON files from multiple subdirectories in one step. How can I do that?

import glob
hit = glob.glob("/./*.json")

7gyucuyw1#

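You can use the logic below to retrieve a list of file paths.
The Stack Overflow answer linked in the code comment covers a ridiculous number of other ways to do it.
Once you have the list of files, loading them is trivial: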

import os
import json

def get_files_from_path(path: str = '.', extension: str = None) -> list:
    """Return a list of file paths under path, optionally filtered by extension."""
    # See the answer on the link below for a ridiculously
    # complete answer for this. I tend to use this one.
    # Note that it also goes into subdirs of the path.
    # https://stackoverflow.com/a/41447012/9267296
    result = []
    for subdir, dirs, files in os.walk(path):
        for filename in files:
            filepath = os.path.join(subdir, filename)
            # No extension given: keep every file; otherwise match case-insensitively.
            if extension is None or filename.lower().endswith(extension.lower()):
                result.append(filepath)
    return result

filelist = get_files_from_path(extension='.json')
jsonlist = []
for filepath in filelist:
    with open(filepath, encoding='utf-8') as infile:
        jsonlist.append(json.load(infile))

from pprint import pprint
pprint(jsonlist)
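
The same walk-and-load idea can also be sketched with pathlib, which is a swapped-in technique and not what this answer uses. The helper name load_json_tree is hypothetical, the files are assumed to be UTF-8, and skipping unreadable files is a design choice of this sketch. Unlike the answer's case-insensitive check, Path.rglob("*.json") may be case-sensitive depending on the platform.

```python
import json
from pathlib import Path


def load_json_tree(root="."):
    """Map each *.json path under root (searched recursively) to its parsed content.

    Hypothetical helper for illustration; files that fail to parse are skipped.
    """
    loaded = {}
    for path in sorted(Path(root).rglob("*.json")):
        if not path.is_file():
            continue  # ignore directories that happen to end in .json
        try:
            with path.open(encoding="utf-8") as infile:
                loaded[str(path)] = json.load(infile)
        except (json.JSONDecodeError, UnicodeDecodeError) as err:
            # Report the problem and carry on instead of aborting the whole run.
            print(f"skipping {path}: {err}")
    return loaded
```

Keeping the path as the dictionary key makes it possible to tell which file each parsed object came from, which the plain jsonlist above does not record.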

wpx232ag2#

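A small change to your original code is enough to get all the file paths.
From the Python documentation:
If recursive is true, the pattern "**" will match any files and zero or more directories, subdirectories and symbolic links to directories.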

from glob import glob
hits = glob("**/*.json", recursive=True)

"""
You will get a list of paths like this
hits = [

# ....

'<path>/delta/job-3/33ce4079-c8db-11eb-8683-f30d80ef99b2.json',
 '<path>/delta/pair-1/cf81188b-c8d7-11eb-8367-5fb2efe63fc6.json',
 '<path>/sub/b1d91522-cab4-11eb-a5cb-113c64720fe0.json',
 '<path>/sub/979a7c2b-cab4-11eb-a5cb-b91d90206530.json',
 '<path>/sub/977e4199-cab4-11eb-a5cb-33b60824fb94.json',
 '<path>/sub/a5fb35cd-cab4-11eb-a5cb-5f6cd57276ff.json',
 '<path>/sub/a60520de-cab4-11eb-a5cb-2723d138a03b.json',
 '<path>/sub/9805e82c-cab4-11eb-a5cb-3987a3853fca.json'
]
"""
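
This answer stops at the list of paths, so here is a minimal sketch of the loading step on top of the same recursive glob. The function name load_json_recursive and its root parameter are hypothetical, and the files are assumed to be valid UTF-8 JSON. Because "**" matches zero or more directories, files directly inside root are included as well. By default glob does not match names that start with a dot, so hidden files and directories are left out.

```python
import json
import os
from glob import escape, glob


def load_json_recursive(root="."):
    """Parse every *.json file under root, including files in nested subdirectories.

    Hypothetical helper for illustration; raises if any file is not valid JSON.
    """
    # escape() keeps characters like '[' in the root path from being read as wildcards.
    pattern = os.path.join(escape(root), "**", "*.json")
    documents = []
    for path in sorted(glob(pattern, recursive=True)):
        with open(path, encoding="utf-8") as infile:
            documents.append(json.load(infile))
    return documents
```

Sorting the matches only makes the order reproducible between runs; glob itself returns results in arbitrary order.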
