python-3.x Pandas DataFrame: writing specific values to a file with a specific format?

mpbci0fu  posted 4 months ago in Python

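I want to update an external file by writing new data from Pandas DataFrames to it in a specific format.
Below is an example of the first two data blocks in the file (this is what I want the new data from the DataFrames to look like once it is written to the file). "identifieri" (identifier1, identifier2, ...) is the unique name of a data block and of its DataFrame.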

(Lines of comments, then)
identifier1       label2 = i \ label3        label4                                  
label5
A1 = -5563.88 B2 = -4998 C3 = -203.8888 D4 = 5926.8 
E5 = 24.99876 F6 = 100.6666 G7 = 30.008 H8 = 10.9999
J9 = 1000000 K10 = 1.0002 L11 = 0.1
M12

identifier2       label2 = i \ label3        label4                                  
label5
A1 = -788 B2 = -6554 C3 = -100.23 D4 = 7526.8 
E5 = 20.99876 F6 = 10.6666 G7 = 20.098 H8 = 10.9999
J9 = 1000000 K10 = 1.0002 L11 = 0.000
M12
...

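In each DataFrame, the new data is in two columns named Labels and Numbers. Every label corresponds to one number. For example, there are labels A1, B2, C3 and so on (see the example above), and for the "identifier2" data block the corresponding numbers are -788, -6554 and -100.23. The DataFrame also contains data I don't need, though; I only need the data from specific rows.
So far I have tried to pull the specific labels and numbers I need out of a DataFrame and write them to a file in append mode. I can print a single label and its number from a DataFrame: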

df.set_index("Labels", inplace=True)
print("Label 1", df.loc["label1", 'Numbers'])


My plan was to get every label and number I need and append them to a file, along with the appropriate spaces. However, when I try this:

df.loc["label1", 'Numbers'].to_csv('file.txt', mode='a', index = False, sep=' ')


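I get: AttributeError: 'float' object has no attribute 'to_csv'
My approach also feels clumsy, and it is hard to see how to add the spaces and line breaks. If anyone knows a simpler and more sophisticated way, I would appreciate any suggestions.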
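A note on that error: selecting one row label and one column label with `df.loc[...]` returns a single scalar rather than a DataFrame or Series, and a scalar has no `to_csv` method. The sketch below illustrates this on a small made-up DataFrame (the three rows are assumptions copied from the "identifier2" example, not real data) and appends the value with ordinary file I/O instead.

```python
import pandas as pd

# Hypothetical sample data, copied from the "identifier2" block above
df = pd.DataFrame({"Labels": ["A1", "B2", "C3"],
                   "Numbers": [-788.0, -6554.0, -100.23]})
df = df.set_index("Labels")

# One row label plus one column label selects a single scalar value
value = df.loc["C3", "Numbers"]
print(hasattr(value, "to_csv"))  # scalars do not provide to_csv

# Format the scalar by hand and append it to the file
line = f"C3 = {value} "
with open("file.txt", "a") as fh:
    fh.write(line + "\n")
```

Building the text by hand like this also gives direct control over where spaces and line breaks go.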

mcvgt66p  #1

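I wouldn't even bother with the built-in utilities here. This is an unusual output format, it only takes a few lines of code to go through the DataFrame anyway, and doing it by hand gives you a lot of control over the final output.
1. Create a str and add any headers, opening comments, etc.
2. Loop over each identifier
3. Get each value from the DataFrame and put it into the output str
4. Add newlines where appropriate so the result looks like the target format
Below is a runnable piece of code that shows how to do this (the block with rng and dfs only generates some sample data and can be ignored). Feel free to comment with any follow-up questions, or let me know if I got the format wrong and I'll adjust it.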

import pandas as pd
import numpy as np
import string

# Generate some random data that matches the description
rng = np.random.default_rng(seed=42)
dfs = {
    idname: pd.DataFrame(data=[
        {
            'Labels': string.ascii_uppercase[i] + str(i + 1),
            'Numbers': rng.integers(0, 1000)
        } for i in range(20)
    ]) for idname in ['identifier1', 'identifier2', 'identifier3']
}

# Make this list whatever output fields you want in the file
desired_fields = [string.ascii_uppercase[i] + str(i + 1) for i in range(11)]
# Seems like data is put into the file in groups of 4
stride = 4
# This is the final string to be written to file
outstr = ''
# Add "Lines of comments"
outstr += '// comment1\n// comment2\n// comment3\n// comment4\n'
for idname, id_data in dfs.items():
    # Append the header-like region (the backslash is escaped so it is written literally)
    outstr += f'{idname}       label2 = i \\ label3        label4\nlabel5\n'
    # Append the "field = #" region
    for i, field in enumerate(desired_fields):
        # Grab the value from the Numbers field for the row in this
        #   DataFrame that matches that Labels value
        try:
            value = str(id_data.loc[id_data['Labels'] == field].iloc[0]['Numbers'])
        except IndexError:
            # Handle missing data here
            value = 'N/A'

        # Format that name value pair, can easily be adjusted
        outstr += f'{field} = {value} '
        # If a newline is necessary, append it based on stride
        if i % stride == stride - 1:
            outstr += '\n'
    # If the stride leaves us without a newline, add one
    if not outstr.endswith('\n'):
        outstr += '\n'
    # Add a blank space between identifiers
    outstr += '\n'

# Write to file or print or whatever
print(outstr)
with open('outputfile.txt', 'w') as fh:
    fh.write(outstr)

Here is the output I get, which looks very similar to your example data:

// comment1
// comment2
// comment3
// comment4
identifier1       label2 = i \ label3        label4
label5
A1 = 89 B2 = 773 C3 = 654 D4 = 438 
E5 = 433 F6 = 858 G7 = 85 H8 = 697 
I9 = 201 J10 = 94 K11 = 526 

identifier2       label2 = i \ label3        label4
label5
A1 = 500 B2 = 370 C3 = 182 D4 = 926 
E5 = 781 F6 = 643 G7 = 402 H8 = 822 
I9 = 545 J10 = 443 K11 = 450 

identifier3       label2 = i \ label3        label4
label5
A1 = 165 B2 = 758 C3 = 700 D4 = 354 
E5 = 67 F6 = 970 G7 = 445 H8 = 893 
I9 = 677 J10 = 778 K11 = 759
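
As a possible variation on the loop above, the per-field filtering can be replaced by a label-to-number dictionary built once per DataFrame, with the rows assembled by slicing. This is only a sketch: `format_block` is a hypothetical helper name, the sample values are made up, and it assumes labels are unique within a DataFrame (with duplicates the last value wins, whereas the code above takes the first). It also omits the trailing space at the end of each row.

```python
import pandas as pd

def format_block(idname, df, fields, stride=4):
    """Return one text block for a DataFrame with 'Labels' and 'Numbers' columns."""
    # Build the label -> number mapping once instead of filtering per field
    lookup = dict(zip(df["Labels"], df["Numbers"]))
    # Missing labels fall back to 'N/A', as in the code above
    pairs = [f"{label} = {lookup.get(label, 'N/A')}" for label in fields]
    # Group the "label = value" pairs into rows of `stride` items
    rows = [" ".join(pairs[i:i + stride]) for i in range(0, len(pairs), stride)]
    header = f"{idname}       label2 = i \\ label3        label4\nlabel5"
    return "\n".join([header, *rows]) + "\n"

# Made-up sample data; "X0" is an unwanted row and "Z9" is a missing label
df = pd.DataFrame({"Labels": ["A1", "B2", "C3", "D4", "E5", "X0"],
                   "Numbers": [500, 370, 182, 926, 781, 1]})
block = format_block("identifier2", df, ["A1", "B2", "C3", "D4", "E5", "Z9"])
print(block)
```

Returning a string per block keeps the formatting separate from the file handling, so the blocks can be joined and written in one `fh.write` call.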
