How do I fix ValueError: Columns must be same length as key?

hlswsv35 posted on 2021-09-08 in Java

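I have a huge dataset (104,259 rows), and somewhere in the game column one or more values contain more than one " - ", which is the separator I am trying to split that column on.
My sample DataFrame is: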

df:
+-----+--------------+----------------+-------------+--------+---------------------------------------+----------+-------------+-------------+-------------+-----------+------------------------------+
|     |   Unnamed: 0 |   Unnamed: 0.1 | date        | time   | game                                  | score    | home_odds   | draw_odds   | away_odds   | country   | league                       |
+=====+==============+================+=============+========+=======================================+==========+=============+=============+=============+===========+==============================+
|   0 |            0 |              0 | nan         | 15:30  | Iliria Kruja - Cerrik                 | 0:3      | -           | -           | -           | Albania   | First Division               |
+-----+--------------+----------------+-------------+--------+---------------------------------------+----------+-------------+-------------+-------------+-----------+------------------------------+
|   1 |            1 |              1 | 25 Jul 2020 | 15:30  | Elbasani - Devolli                    | 3:1      | -           | -           | -           | Albania   | First Division               |
+-----+--------------+----------------+-------------+--------+---------------------------------------+----------+-------------+-------------+-------------+-----------+------------------------------+
|   2 |            2 |              2 | 11 Jul 2020 | 15:30  | Beselidhja Lezha - Kastrioti          | 2:0      | 1.46        | 3.80        | 6.40        | Albania   | First Division               |
+-----+--------------+----------------+-------------+--------+---------------------------------------+----------+-------------+-------------+-------------+-----------+------------------------------+
|   3 |            3 |              3 | 05 Jul 2020 | 15:30  | Lushnja - Apolonia Fier               | 1:2      | 2.39        | 3.56        | 2.44        | Albania   | First Division               |
+-----+--------------+----------------+-------------+--------+---------------------------------------+----------+-------------+-------------+-------------+-----------+------------------------------+

When I run this part of the code:

df[['home_team', 'away_team']] = df['game'].str.split(' - ', expand=True)

I get this error:

Traceback (most recent call last):
  File "C:/Users/harsh/AppData/Roaming/JetBrains/PyCharmCE2021.1/scratches/scratch_37.py", line 22, in <module>
    df[['home_team', 'away_team']] = df['game'].str.split(' - ', expand=True)
  File "C:\Python\lib\site-packages\pandas\core\frame.py", line 3160, in __setitem__
    self._setitem_array(key, value)
  File "C:\Python\lib\site-packages\pandas\core\frame.py", line 3189, in _setitem_array
    raise ValueError("Columns must be same length as key")
ValueError: Columns must be same length as key

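I suspect that for one or more rows str.split produces more than two parts, but I am not sure which rows those are.
As I see it, I have two options:
If only a few rows contain such data (fewer than 10), I can safely drop them.
If there are more than 10 such rows, I can split them at the first occurrence of the separator.
I just don't know how to do either of these in code.
How can I check for this and handle it?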

zlhcx6iw1#

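Some of your rows probably contain more than one " - " in the string. If you are sure about the structure and want to split at the first occurrence of the separator, use the n parameter, like this: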

df['game'].str.split(' - ', expand=True, n=1)
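To write the result back into the two new columns, the same call can go on the right-hand side of the original assignment. The following is a minimal sketch on made-up data (the team names in the third row are hypothetical, chosen only to contain the separator twice), not the asker's real dataset:

```python
import pandas as pd

# Hypothetical sample data; the third row contains " - " twice.
df = pd.DataFrame({
    "game": [
        "Elbasani - Devolli",
        "Lushnja - Apolonia Fier",
        "Team A - Reserves - Team B",
    ]
})

# n=1 stops after the first " - ", so no row yields more than two parts
# and the result has exactly the two columns the assignment expects.
df[["home_team", "away_team"]] = df["game"].str.split(" - ", n=1, expand=True)

print(df)
```

With n=1, everything after the first separator ends up in away_team, so a row like the third one keeps the rest of the string, including the second " - ", on the away side.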

See the pandas documentation for Series.str.split.
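To find out which rows cause the problem before deciding what to do with them, one possible approach is to count the separators per row. This is only a sketch: the sample data, the SEP and MAX_ROWS_TO_DROP names, and the cleaned variable are assumptions for illustration, and the threshold of 10 is taken from the question.

```python
import re

import pandas as pd

# Hypothetical sample data; the third row contains " - " twice.
df = pd.DataFrame({
    "game": [
        "Elbasani - Devolli",
        "Lushnja - Apolonia Fier",
        "Team A - Reserves - Team B",
    ]
})

SEP = " - "
MAX_ROWS_TO_DROP = 10  # threshold from the question (assumed: "fewer than 10" means drop)

# str.count takes a regular expression, so escape the separator to match it literally.
sep_count = df["game"].str.count(re.escape(SEP))

# Rows with more than one separator are the ones that produce a third column.
extra = sep_count > 1
print(df.loc[extra, "game"])

if extra.sum() < MAX_ROWS_TO_DROP:
    # Few offending rows: drop them.
    cleaned = df.loc[~extra].copy()
else:
    # Many offending rows: keep them and rely on n=1 below.
    cleaned = df.copy()

# n=1 is harmless for the rows that have a single separator.
cleaned[["home_team", "away_team"]] = cleaned["game"].str.split(SEP, n=1, expand=True)
print(cleaned)
```

Printing df.loc[extra, "game"] first makes it possible to inspect the offending values by eye before choosing between dropping and splitting.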
