Python writing to CSV formatting issue

Posted by djp7away 5 months ago in Python


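I'm having trouble writing data to a correctly formatted CSV. The data describes anime TV shows and includes fields such as title, genres, and synopsis. After parsing the data from an API call, everything is written to the CSV correctly except the genres and the synopsis. If an anime has several genres, only the first one seems to be wrapped in "", and the rest are merged into the synopsis, at its beginning.
Parse function: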

def parse_data(data):
    try: 
        
        # Genre parse
        genres_list = data['genres']
        genres = ', '.join(genre['name'] for genre in genres_list)
        print(genres)
        
        # Studio parse
        studio_name = "unknown"
        studio_parse = str(data.get('studios'))
        match = re.search(r"'name':\s*'([^']*)'", studio_parse)
        if match:
            studio_name = match.group(1)
        else:
            None
        
        # Synopsis parse
        synopsis_dirty = data['synopsis']
        synopsis = re.sub(r"\(Source: [^\)]+\)", "", synopsis_dirty).strip()
        synopsis = re.sub(r'\[Written by MAL Rewrite\]', '', synopsis).strip()

        
        details = str(data['id']) + ',' + data['title'].encode('utf-8').decode('cp1252', 'replace') + ',' + data['start_date'] + ',' + str(data['mean']) + ',' + str(data['rank']) + ',' + str(data['popularity']) + ',' + str(data['num_episodes']) + ',' + data['rating'] +  ',' + studio_name + ',' + genres + ',' + synopsis.encode('utf-8').decode('cp1252', 'replace')
        split_data = re.split(r"[,]", details)
        
        return split_data

Dictionary formatting:

def parsed_data_to_dict(data):
    try:
        dict = {
            "id" :  data[0],
            "title" : data[1].encode('utf-8').decode('cp1252', 'replace'),
            "start-date" : data[2],
            "mean"  : data[3],
            "rank"  : data[4],
            "popularity" : data[5],
            "num_episodes" : data[6],
            "rating" : data[7],
            "studio" : data[8],
            "genres" : data[9],
            "synopsis" : ''.join(data[10:]).encode('utf-8').decode('cp1251', 'replace')
        }

Expected CSV:

"8","Bouken Ou Beet","2004-09-30","6.93","4426","5274","52","pg","Toei Animation","Adventure, Fantasy, Shounen, Supernatural", "It is the dark century and the people are suffering under the rule of the devil Vandel who is able to manipulate monsters."

Actual CSV:

"8","Bouken Ou Beet","2004-09-30","6.93","4426","5274","52","pg","Toei Animation","Adventure"," Fantasy Shounen SupernaturalIt is the dark century and the people are suffering under the rule of the devil Vandel who is able to manipulate monsters."


Source: https://stackoverflow.com/questions/77736480/python-writing-to-csv-formatting-issue

2 Answers


tuwxkamq1#

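You are manually building a comma-separated string to use as raw CSV. This approach is error-prone, especially when the strings themselves contain commas. You would always have to handle the quoting and the delimiters yourself when combining the strings.
I suggest you use the csv standard library instead.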
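To make the failure mode concrete, here is a minimal sketch, assuming a shortened record modeled on the example row in the question (the `fields` list is hypothetical, not the asker's real data). It compares a manual join and split on `,` with `csv.writer` from the standard library:

```python
import csv
import io

# Hypothetical record modeled on the question's example row.
fields = ["8", "Bouken Ou Beet", "Adventure, Fantasy, Shounen, Supernatural"]

# Manual approach: join on "," and split again, as the question's code does.
naive_fields = ",".join(fields).split(",")
print(len(naive_fields))  # 6: the genre list was broken into four pieces

# csv.writer adds the quoting, so embedded commas stay inside one field.
buffer = io.StringIO()
csv.writer(buffer, quoting=csv.QUOTE_ALL).writerow(fields)
csv_text = buffer.getvalue()
print(csv_text)

# Reading the text back yields the original three fields.
round_trip = next(csv.reader(io.StringIO(csv_text)))
print(round_trip == fields)  # True
```

`quoting=csv.QUOTE_ALL` is used here only because the expected output in the question quotes every field; the default `csv.QUOTE_MINIMAL` would quote just the fields that need it.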


af7jpaap2#

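This line is the problem:

split_data = re.split(r"[,]", details)

Before this line you already have data, which is a dictionary, so it is well suited for use with csv.DictWriter (part of the standard library). It allows strings to contain , and also handles strings that contain ". Consider the following example: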

import csv
with open('names.csv', 'w', newline='') as csvfile:
    writer = csv.DictWriter(csvfile, fieldnames=['uno', 'dos', 'tres'])
    writer.writeheader()
    writer.writerow({'uno': 'simple', 'dos': 'with , comma', 'tres': 'with " quote'})

This will give the file names.csv the following content:

uno,dos,tres
simple,"with , comma","with "" quote"
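Applied to the data in the question, the same idea might look like the sketch below. The shape of the API record (`genres` and `studios` as lists of dicts with a `name` key) is assumed from the question's parsing code, and `to_row`, `FIELDNAMES`, and `sample` are hypothetical names introduced for illustration. An `io.StringIO` buffer stands in for a real file opened with `newline=''`.

```python
import csv
import io

FIELDNAMES = ["id", "title", "start_date", "mean", "rank", "popularity",
              "num_episodes", "rating", "studio", "genres", "synopsis"]

def to_row(data):
    """Flatten one API record (shape assumed from the question) into a flat dict."""
    studios = data.get("studios") or []
    return {
        "id": data["id"],
        "title": data["title"],
        "start_date": data["start_date"],
        "mean": data["mean"],
        "rank": data["rank"],
        "popularity": data["popularity"],
        "num_episodes": data["num_episodes"],
        "rating": data["rating"],
        "studio": studios[0]["name"] if studios else "unknown",
        # Joined with ", " and left for the csv module to quote.
        "genres": ", ".join(genre["name"] for genre in data["genres"]),
        "synopsis": data["synopsis"].strip(),
    }

# Hypothetical sample record built from the expected CSV row in the question.
sample = {
    "id": 8, "title": "Bouken Ou Beet", "start_date": "2004-09-30",
    "mean": 6.93, "rank": 4426, "popularity": 5274, "num_episodes": 52,
    "rating": "pg", "studios": [{"name": "Toei Animation"}],
    "genres": [{"name": "Adventure"}, {"name": "Fantasy"},
               {"name": "Shounen"}, {"name": "Supernatural"}],
    "synopsis": "It is the dark century and the people are suffering.",
}

out = io.StringIO()  # stand-in for open("anime.csv", "w", newline="", encoding="utf-8")
writer = csv.DictWriter(out, fieldnames=FIELDNAMES, quoting=csv.QUOTE_ALL)
writer.writeheader()
writer.writerow(to_row(sample))
print(out.getvalue())
```

Because the row stays a dictionary all the way to the writer, there is no intermediate comma-joined string to split, so the genres remain a single quoted field.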
