Python: parsing an HTML file with pandas to extract specific tables

pobjuy32 posted on 2021-07-14 in Java

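I am using the code below to try to count the tables in an HTML file and read the first two. I am new to Python and do not know how to handle the error that follows.
I am trying to determine the order of the tables because the data file has no fixed format; I am hoping the tables I need at least appear in the same order every time.
Code: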

import pandas as pd

file = r'C:\Users\Ahmed_Abdelmuniem\Desktop\ADC-12 ABQQ-48.html'
table = pd.read_html(file)[0]

print ('tables found:', len(table))
df1 = table[0]
df2 = table[1]

print ('First Table')
print (df1)
print ('Second Table')
print (df2)

Error:

C:\Users\Ahmed_Abdelmuniem\AppData\Local\Programs\Python\Python39\python.exe C:/Users/Ahmed_Abdelmuniem/PycharmProjects/PandaHTML/main.py
tables found: 201
Traceback (most recent call last):
  File "C:\Users\Ahmed_Abdelmuniem\AppData\Local\Programs\Python\Python39\lib\site-packages\pandas\core\indexes\base.py", line 3080, in get_loc
    return self._engine.get_loc(casted_key)
  File "pandas\_libs\index.pyx", line 70, in pandas._libs.index.IndexEngine.get_loc
  File "pandas\_libs\index.pyx", line 101, in pandas._libs.index.IndexEngine.get_loc
  File "pandas\_libs\hashtable_class_helper.pxi", line 4554, in pandas._libs.hashtable.PyObjectHashTable.get_item
  File "pandas\_libs\hashtable_class_helper.pxi", line 4562, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 0

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "C:\Users\Ahmed_Abdelmuniem\PycharmProjects\PandaHTML\main.py", line 7, in <module>
    df1 = table[0]
  File "C:\Users\Ahmed_Abdelmuniem\AppData\Local\Programs\Python\Python39\lib\site-packages\pandas\core\frame.py", line 3023, in __getitem__
    return self._getitem_multilevel(key)
  File "C:\Users\Ahmed_Abdelmuniem\AppData\Local\Programs\Python\Python39\lib\site-packages\pandas\core\frame.py", line 3074, in _getitem_multilevel
    loc = self.columns.get_loc(key)
  File "C:\Users\Ahmed_Abdelmuniem\AppData\Local\Programs\Python\Python39\lib\site-packages\pandas\core\indexes\multi.py", line 2876, in get_loc
    loc = self._get_level_indexer(key, level=0)
  File "C:\Users\Ahmed_Abdelmuniem\AppData\Local\Programs\Python\Python39\lib\site-packages\pandas\core\indexes\multi.py", line 3158, in _get_level_indexer
    idx = self._get_loc_single_level_index(level_index, key)
  File "C:\Users\Ahmed_Abdelmuniem\AppData\Local\Programs\Python\Python39\lib\site-packages\pandas\core\indexes\multi.py", line 2809, in _get_loc_single_level_index
    return level_index.get_loc(key)
  File "C:\Users\Ahmed_Abdelmuniem\AppData\Local\Programs\Python\Python39\lib\site-packages\pandas\core\indexes\base.py", line 3082, in get_loc
    raise KeyError(key) from err
KeyError: 0

Process finished with exit code 1
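
A likely cause, judging from the code and traceback above: `pd.read_html` returns a list of DataFrames, one per parsed table, and the trailing `[0]` keeps only the first of them. `len(table)` is then the row count of that one table (201), not the number of tables, and `table[0]` looks for a column labelled `0`, which raises `KeyError: 0`. The `multi.py` frames in the traceback also suggest that the first table has multi-level column headers. Below is a minimal sketch of the corrected pattern. The inline `html` string is a made-up stand-in for the real file so that the example runs on its own; it also assumes an HTML parser such as lxml is installed.

```python
from io import StringIO

import pandas as pd

# Hypothetical stand-in for the real report file: two small tables.
html = """
<table>
  <tr><th>id</th><th>value</th></tr>
  <tr><td>1</td><td>10</td></tr>
  <tr><td>2</td><td>20</td></tr>
</table>
<table>
  <tr><th>name</th><th>score</th></tr>
  <tr><td>a</td><td>0.5</td></tr>
</table>
"""

# Keep the whole list that read_html returns; do not index it with [0] here.
tables = pd.read_html(StringIO(html))

# len() of the list is the number of tables, not the number of rows.
print('tables found:', len(tables))

# Index the list to get the individual DataFrames.
df1 = tables[0]
df2 = tables[1]

print('First Table')
print(df1)
print('Second Table')
print(df2)
```

With the real file, the same pattern would be `tables = pd.read_html(file)`, followed by `tables[0]` and `tables[1]`.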

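Since the file has no fixed format, relying on table position alone may be fragile. One option, sketched here under the same assumptions (a made-up inline `html` string, an HTML parser such as lxml installed), is to print the index, shape, and columns of every table to see the order, and to use the `match` parameter of `pd.read_html` to keep only tables whose text matches a given string or regular expression. The search text `'score'` is purely illustrative.

```python
from io import StringIO

import pandas as pd

# Hypothetical stand-in for the real report file: two small tables.
html = """
<table>
  <tr><th>id</th><th>value</th></tr>
  <tr><td>1</td><td>10</td></tr>
</table>
<table>
  <tr><th>name</th><th>score</th></tr>
  <tr><td>a</td><td>0.5</td></tr>
</table>
"""

# List every table with its position, shape, and column labels
# to see in which order the tables appear in the file.
tables = pd.read_html(StringIO(html))
for i, t in enumerate(tables):
    print(i, t.shape, list(t.columns))

# Select tables by content instead of position: match keeps only
# the tables that contain text matching the given string or regex.
score_tables = pd.read_html(StringIO(html), match='score')
print('matching tables:', len(score_tables))
print(score_tables[0])
```

Note that `pd.read_html` raises a `ValueError` when no table matches, so a lookup of this kind may need a `try`/`except` around it.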