CSV matrix post-processing issue

Posted by jum4pzuy 8 months ago in Other

Suppose this is my array:

csv_arr_output = [
    ["",        "",       "Average", "",    "Red"     ],
    ["",        "",       "",        "",    "Red eyes"],
    ["",        "height", "weight",  "",    ""        ],
    ["Males",   "1.9",    "0.003",   "40%", ""        ],
    ["Females", "1.7",    "0.002",   "43%", ""        ],
]

It will later be converted to a CSV file, but first I post-process it with this function:

def post_processing(csv_array, common_percent_threshold=0.5):
    # If two vertically adjacent cells share enough words, keep the longer
    # text in the upper cell and blank the lower one
    for row in range(len(csv_array) - 1):
        for col in range(len(csv_array[row])):
            current_cell = csv_array[row][col]
            next_cell = csv_array[row + 1][col]

            if calculate_common_percent(current_cell, next_cell) >= common_percent_threshold:
                if len(current_cell) < len(next_cell):
                    csv_array[row][col] = next_cell
                    csv_array[row + 1][col] = ""

    # Merge empty cells in the same column
    # (as written, this only replaces "" with "", so the data is unchanged)
    for col in range(len(csv_array[0])):
        column_cells = [row[col] for row in csv_array]
        non_empty_cells = [cell for cell in column_cells if cell != ""]
        empty_cells = [""] * (len(column_cells) - len(non_empty_cells))

        for row in range(len(csv_array)):
            if csv_array[row][col] == "":
                csv_array[row][col] = empty_cells.pop(0)

    # Remove empty rows
    csv_array = [row for row in csv_array if any(cell != "" for cell in row)]

    return csv_array


def calculate_common_percent(cell1, cell2):
    # Compare the two cells as sets of whitespace-separated words
    set1 = set(cell1.split())
    set2 = set(cell2.split())

    # Check if both sets are empty
    if not set1 and not set2:
        return 0.0

    common_words = set1.intersection(set2)
    total_words = set1.union(set2)

    # Avoid division by zero
    if not total_words:
        return 0.0

    # Shared words divided by all distinct words
    return len(common_words) / len(total_words)
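
As a quick check of how the similarity test behaves on this data, here is a minimal sketch. `word_overlap` is a hypothetical stand-in (not part of the original code) that computes the same ratio as `calculate_common_percent`: shared words divided by all distinct words. With the default threshold of 0.5, "Red" and "Red eyes" just reach the threshold, which is why the two cells are merged.

```python
def word_overlap(a, b):
    """Shared words divided by all distinct words (hypothetical helper)."""
    words_a, words_b = set(a.split()), set(b.split())
    union = words_a | words_b
    # An empty union means both cells were empty, so report no overlap.
    return len(words_a & words_b) / len(union) if union else 0.0


print(word_overlap("Red", "Red eyes"))    # 0.5
print(word_overlap("Average", "height"))  # 0.0
```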

This is the output I get:

[
    ["",        "",       "Average", "",    "Red eyes"],
    ["",        "height", "weight",  "",    ""        ],
    ["Males",   "1.9",    "0.003",   "40%", ""        ],
    ["Females", "1.7",    "0.002",   "43%", ""        ],
]

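As you can see, all the cells below the "Red eyes" column header are empty. I would like to shift "Red eyes" to the left and then remove the empty column. Could you please help me integrate this logic into the function?
I expect the post-processing function to keep everything it does now and, in addition, shift any column header that has only empty cells below it, then remove all empty columns from the matrix.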

csv

Source: https://stackoverflow.com/questions/77427855/matrix-post-processing-issue

1 Answer

pgx2nnw81#

Starting from a 2D list that looks like:

l = [
    ["",        "",       "Average", "",    "Red"     ],
    ["",        "",       "",        "",    "Red eyes"],
    ["",        "height", "weight",  "",    ""        ],
    ["Males",   "1.9",    "0.003",   "40%", ""        ],
    ["Females", "1.7",    "0.002",   "43%", ""        ],
]

you can:

Move "Red eyes" up and over
Delete the empty row
Delete the last (5th) column

For example:

# move 'Red eyes' up and over
l[0][3] = l[1][4]

# delete unnecessary row
del l[1]

# trim 5th cell from each row (effectively deleting 5th column)
l = [x[0:4] for x in l]

print(l)

This gives me:

[
    ["",        "",       "Average", "Red eyes"],
    ["",        "height", "weight",  ""        ],
    ["Males",   "1.9",    "0.003",   "40%"     ],
    ["Females", "1.7",    "0.002",   "43%"     ],
]
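
The indices above are hard-coded for this one table. A more general version might look like the sketch below. It is only an illustration under stated assumptions: the function name `shift_headers_and_drop_empty_columns` and the `header_rows` parameter are hypothetical, the matrix is assumed to be rectangular, and a header is moved only one column to the left and only when the target cell is empty.

```python
def shift_headers_and_drop_empty_columns(matrix, header_rows=1):
    """Shift orphaned header cells one column left, then drop empty columns.

    Hypothetical helper: `header_rows` is an assumed parameter giving the
    number of rows at the top that hold headers rather than data.
    """
    if not matrix:
        return []

    # Work on a copy so the caller's list is not modified.
    rows = [list(row) for row in matrix]
    n_cols = len(rows[0])

    for col in range(1, n_cols):
        # Skip columns that contain data below the header rows.
        if any(row[col] != "" for row in rows[header_rows:]):
            continue
        for row in rows[:header_rows]:
            # Move the header left only if nothing would be overwritten.
            if row[col] != "" and row[col - 1] == "":
                row[col - 1], row[col] = row[col], ""

    # Keep only the columns that still contain a non-empty cell.
    keep = [c for c in range(n_cols) if any(row[c] != "" for row in rows)]
    return [[row[c] for c in keep] for row in rows]


# The matrix as returned by post_processing in the question.
processed = [
    ["",        "",       "Average", "",    "Red eyes"],
    ["",        "height", "weight",  "",    ""        ],
    ["Males",   "1.9",    "0.003",   "40%", ""        ],
    ["Females", "1.7",    "0.002",   "43%", ""        ],
]

result = shift_headers_and_drop_empty_columns(processed, header_rows=2)
for row in result:
    print(row)
```

One way to integrate it would be to call it on the value that `post_processing` returns, after the empty rows have been removed.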
