Dynamically iterating over a list or DataFrame of file paths

Posted by jrcvhitl on 2021-09-08 in Java

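I have working code that searches the directories I want for files, and I can save the results as either a DataFrame or a list. I then learned how to use findall() and find().text to pull tags and text out of a single XML file. Now I want to make this more dynamic, so that it runs once for each location stored in that DataFrame or list. When I try, I get the error message "expected str, bytes or os.PathLike object, not tuple". Where is my understanding wrong in this code? I am also not sure whether I am overcomplicating things with a DataFrame and should just use a list.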

import os
import pandas as pd
import xml.etree.ElementTree as ET

.....
current_dur = r'Workplace Investing'

# Empty list to collect the file paths in (converted to a DataFrame below).

file_results = []    

# logic to search through the directories.

for root, dirs, files in os.walk(current_dur):
    for file in files:
        # str.endswith accepts a tuple of suffixes.
        if file.endswith(('.ldm', '.cdm', '.pdm')):
            file_results.append(os.path.join(root, file))

filesdataframe = pd.DataFrame(file_results)
filesdataframe.rename(columns = { 0 :'Directory Path'}, inplace = True)

......

# This is an example of what is inside the filesdataframe. All of these values are inside a single column but have many rows.

# \\dmn1.fmr.com\FIDFILE\FS159\FIRSCODB\WI Data Modeling\PWI\WI - Workplace Investing\WORKPLACE INVESTING.cdm

# \\dmn1.fmr.com\FIDFILE\FS159\FIRSCODB\WI Data Modeling\PWI\WI - Workplace Investing\INTERACTION CDM.cdm

# \\dmn1.fmr.com\FIDFILE\FS159\FIRSCODB\WI Data Modeling\PWI\WI - Workplace Investing\DC - Defined Contributions\PARTY.ldm

# In my working code I set the xmlfile equal to a single directory path.

# This worked for one file. Now my thinking is maybe I set xmlfile equal to dataframe.iterrows().

# That way when I make my loop it will go by each row.

xmlfile = next(filesdataframe.iterrows())

for ind in filesdataframe.index:
    # The next line is where I am getting my issue.
    tree = ET.parse(xmlfile)

    root = tree.getroot()
.....
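
A note on the error in the loop above: `DataFrame.iterrows()` yields `(index, row)` tuples, so `next(filesdataframe.iterrows())` is a tuple rather than a path string, and `ET.parse` rejects it with the message quoted in the question. `xmlfile` is also assigned only once, before the loop, so every iteration would try to parse the same value. The following is a minimal sketch of one way around both problems, keeping the paths in a plain list and parsing each one in turn. The helper names `find_model_files` and `parse_all`, the temporary demo files, and their `<Model><Name>` content are assumptions made for illustration only; they are not taken from the real .cdm/.ldm files.

```python
import os
import tempfile
import xml.etree.ElementTree as ET

EXTENSIONS = ('.ldm', '.cdm', '.pdm')


def find_model_files(top):
    """Return the paths of all files under `top` ending in one of EXTENSIONS."""
    results = []
    for root, dirs, files in os.walk(top):
        for name in files:
            if name.endswith(EXTENSIONS):
                results.append(os.path.join(root, name))
    return results


def parse_all(paths):
    """Parse every path and return a dict mapping path -> root element."""
    roots = {}
    for path in paths:  # each `path` is a plain str, which ET.parse accepts
        tree = ET.parse(path)
        roots[path] = tree.getroot()
    return roots


# Demo on throwaway files with hypothetical content (illustration only).
with tempfile.TemporaryDirectory() as top:
    for name in ('a.cdm', 'b.ldm', 'notes.txt'):
        with open(os.path.join(top, name), 'w') as fh:
            fh.write('<Model><Name>%s</Name></Model>' % name)
    paths = find_model_files(top)
    roots = parse_all(paths)
    names = sorted(root.find('Name').text for root in roots.values())
    print(names)
```

If the DataFrame is kept, iterating over the column directly (`for path in filesdataframe['Directory Path']:`) also gives plain strings. With `iterrows()`, the tuple has to be unpacked first, as in `for index, row in filesdataframe.iterrows():` followed by `ET.parse(row['Directory Path'])`. For a single column of paths, a list is probably enough.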
