How can I convert a text file in a specific format into a dictionary?

u4dcyp6a  posted on 2021-08-20 in Java

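This question already has an answer here:

Create a tree/deeply nested dict from an indented text file in Python (4 answers)
Closed 5 hours ago.
I have a file like this, and I want to convert it into a Python dictionary.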

Week1=>
    Monday=>
        Math=8:00
        English=9:00
    Tuesday=>
        Spanish=10:00
        Arts=3:00

The output should be:

{"Week1": {"Monday": {"Math": "8:00", "English": "9:00"}, "Tuesday": {"Spanish":"10:00", "Arts": "3:00"}}}

This is my actual code:

def ReadFile(filename, extension) -> dict:
        content = {} # Content will store the dictionary (from the file).

        prefsTXT = open(f"{filename}.{extension}", "r") # Open the file with read permissions.
        lines = prefsTXT.readlines() # Read lines.

        content = LinesToDict(lines) # Calls LinesToDict to get the dictionary.

        prefsTXT.close() # Closing file.

        return content # Return dictionary with file content.

def LinesToDict( textList: list) -> dict:
    result = {} # Result

    lines = [i.replace("\n", "") for i in textList] # Replace the line break with nothing.

    for e, line in enumerate(lines): # Iterate through each line (of the file).

        if line[0].strip() == "#": continue # If first character is # ignore the whole line.

        keyVal, lines = TextToDict(e, lines) # Interpret the line with TextToDict; it returns the key/value dict and the lines list (which TextToDict modifies).
        keyVal = list( keyVal.items() )[0] # Convert the dictionary to a list of tuples and take the first (key, value) pair.

        result[keyVal[0]] = keyVal[1] # Add key:value to result

    return result # Returns result

def TextToDict(textIndx: int, textList: list) -> tuple: # Returns (dict, list).
    result = {} # Result

    text = textList[textIndx].strip() # Strip the passed line (because of the tabulations).

    if text[0].strip() == "#": return # If first character is # ignore the whole line.

    keyVal = text.split("=", 1) # Split line by = which is the separator between key:value.

    if keyVal[1] == ">": # If value == ">"
        indentVal, textList = TextToDict(textIndx + 1, textList) # Calls itself to see the content of the next line.
        textList.pop(textIndx + 1) # Pop the interpreted content.

    else: # If value isn't ">"
        indentVal = keyVal[1] # Set indentVal to the val of keyVal

    result[keyVal[0]] = indentVal # Setting keyVal[0] as key and indentVal as value of result dictionary

    return result, textList # Return result and modified lines list

file = ReadFile("trial", "txt")
print(file)
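`ReadFile` reads the file and passes the lines to `LinesToDict`. `LinesToDict` iterates over those lines and passes each one to `TextToDict`. `TextToDict` splits the line on `=` and checks whether the value (`split[1]`) is `">"`; if it is, it calls itself with the next line and stores the returned value in the dictionary it returns.
But I get this: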

{'Week1': {'Monday': {'Math': '8:00'}}, 'English': '9:00', 'Tuesday': {'Spanish': '10:00'}, 'Arts': '3:00'}

Instead of this:

{"Week1": {"Monday": {"Math": "8:00", "English": "9:00"}, "Tuesday": {"Spanish":"10:00", "Arts": "3:00"}}}
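The description above relies on two facts about each line: what `split("=", 1)` returns, and how deeply the line is indented. The sketch below prints both for part of the sample file. It assumes four spaces per nesting level, as in the sample, and the helper name `describe_line` is hypothetical rather than part of the code in the question. Because `TextToDict` strips each line and consumes only the single line that follows a `=>`, the depth is never consulted, which is consistent with `English` and `Arts` ending up at the top level.

```python
def describe_line(line: str, width: int = 4) -> tuple:
    """Return (depth, key, value); value is None for 'key=>' lines.

    Assumes `width` spaces per nesting level (an assumption taken from the sample).
    """
    depth = (len(line) - len(line.lstrip(" "))) // width
    key, value = line.strip().split("=", 1)
    return depth, key, (None if value == ">" else value)


sample = [
    "Week1=>",
    "    Monday=>",
    "        Math=8:00",
    "        English=9:00",
]

for line in sample:
    print(describe_line(line))
```

`Math` and `English` share depth 2 under `Monday`, so a parser has to compare depths to decide where a nested block ends.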

iovurdzv (answer #1)

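The format looks very close to YAML, apart from the separators, so you can rewrite those with regular expressions and let PyYAML parse the result:

$ pip install pyyaml
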
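import re, yaml, json

with open("/tmp/txt.txt") as f:  # read your content here
    mytext = f.read()

# Turn "key=>" into "key:" and "key=value" into 'key: "value"', then parse the text as YAML.
as_yaml = re.sub(r"(\w+)=(.*)", r'\1: "\2"', re.sub(r"=>\n", ":\n", mytext))
mydict = yaml.safe_load(as_yaml)
print(json.dumps(mydict, indent=1))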
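To see what the two substitutions do before any YAML parsing happens, the intermediate text can be printed on its own. This is a small sketch using only the standard `re` module; the names `sample`, `step1` and `step2` are illustrative and not part of the answer's code.

```python
import re

sample = (
    "Week1=>\n"
    "    Monday=>\n"
    "        Math=8:00\n"
    "        English=9:00\n"
)

# Step 1: a trailing "=>" opens a nested block, which YAML writes as ":".
step1 = re.sub(r"=>\n", ":\n", sample)

# Step 2: the remaining "key=value" pairs become 'key: "value"'.
# The quotes ask the YAML parser to treat values such as 8:00 as strings.
step2 = re.sub(r"(\w+)=(.*)", r'\1: "\2"', step1)

print(step2)
```

The indentation is left untouched, so the nesting in the file carries over directly to the YAML text. Note that the first pattern only matches `=>` immediately followed by a newline, so trailing spaces after `=>` would need to be stripped first.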

Output:

{
 "Week1": {
  "Monday": {
   "Math": "8:00",
   "English": "9:00"
  },
  "Tuesday": {
   "Spanish": "10:00",
   "Arts": "3:00"
  }
 }
}
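If installing a dependency is not an option, the same result can be reached by tracking indentation with a stack. The following is a minimal sketch and not the answerer's method. It assumes every non-comment line contains `=`, that nested lines are indented more deeply than the `key=>` line that owns them, and that indentation is consistent. The function name `parse_indented` is hypothetical.

```python
def parse_indented(text: str) -> dict:
    """Parse 'key=>' and 'key=value' lines into nested dicts via indentation."""
    root = {}
    # Each entry is (indent of the 'key=>' line that owns the dict, the dict).
    stack = [(-1, root)]
    for raw in text.splitlines():
        stripped = raw.strip()
        if not stripped or stripped.startswith("#"):
            continue  # skip blank lines and comment lines
        indent = len(raw) - len(raw.lstrip())
        # Close every block whose owner is not less indented than this line.
        while stack[-1][0] >= indent:
            stack.pop()
        key, value = stripped.split("=", 1)
        parent = stack[-1][1]
        if value == ">":
            parent[key] = {}  # open a nested block
            stack.append((indent, parent[key]))
        else:
            parent[key] = value
    return root


sample = """Week1=>
    Monday=>
        Math=8:00
        English=9:00
    Tuesday=>
        Spanish=10:00
        Arts=3:00
"""

print(parse_indented(sample))
```

The stack always holds the chain of open `key=>` blocks, so a line is attached to the nearest open block that is less indented than the line itself.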
