Pandas FutureWarning for idxmax() and idxmin()?

Posted by wqnecbli 5 months ago in Other

Python 3.9.13, pandas 2.1.3

blahblah.py:29: FutureWarning: The behavior of DataFrame.idxmax with all-NA values, or any-NA and skipna=False, is deprecated. In a future version this will raise ValueError
  df['max days'] = df[columns].idxmax(axis=1)

I get the same warning with skipna=False and with skipna=True.
Despite the warning, the program runs as expected.
Does anyone know what is going on? Google did not return anything useful.

pandas

Source: https://stackoverflow.com/questions/77616969/pandas-future-warning-for-idxmax-idxmin-question

1 Answer

ki0zmccv1#

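The warning is straightforward. It is emitted in two cases:
1. The column/row contains only NA values.
2. The column/row contains at least one NA value and you specify skipna=False.
The solutions are just as simple:

For case 1: drop or fill the columns/rows that contain only NA values.
For case 2: specify skipna=True.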
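The two conditions above can be written as a small check that runs before calling idxmax(). This is only a sketch: `idxmax_warning_case` is a hypothetical helper name, not a pandas function, and the sample Series use float NaN as an assumed stand-in for NA.

```python
from typing import Optional

import pandas as pd


def idxmax_warning_case(s: pd.Series, skipna: bool = True) -> Optional[str]:
    """Name the deprecated case that s.idxmax(skipna=skipna) would hit, else None.

    Hypothetical helper for illustration; it only inspects the data and
    never calls idxmax() itself.
    """
    na = s.isna()
    if na.all():
        # Case 1: there is no non-NA value to find, whatever skipna is.
        return "case 1: all values are NA"
    if na.any() and not skipna:
        # Case 2: an NA is present and the caller asked not to skip it.
        return "case 2: some values are NA and skipna=False"
    return None


all_na = pd.Series([float("nan"), float("nan")])
some_na = pd.Series([float("nan"), 2.0])

print(idxmax_warning_case(all_na))                 # case 1, regardless of skipna
print(idxmax_warning_case(some_na, skipna=False))  # case 2
print(idxmax_warning_case(some_na))                # None: the NA is skipped
```

A check like this makes it easier to see which of the two remedies applies to a given column or row.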

`Series.idxmax()` example

Take Series.idxmax() first.
Suppose I have the following DataFrame:

col1  col2
0  <NA>  <NA>
1  <NA>     2

col1 contains only NA values
col2 contains one NA value
row 0 contains only NA values
row 1 contains one NA value
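For anyone who wants to reproduce the frame, here is one possible construction. The nullable "Int64" dtype is an assumption based on the `<NA>` markers in the printout; the answer does not say how the frame was built. The isna() checks confirm the four statements above.

```python
import pandas as pd

# Assumed construction: nullable integers, so missing values print as <NA>.
df = pd.DataFrame({
    "col1": pd.array([pd.NA, pd.NA], dtype="Int64"),
    "col2": pd.array([pd.NA, 2], dtype="Int64"),
})

na = df.isna()
all_na_cols = na.all(axis=0)   # per column: every value is NA
any_na_cols = na.any(axis=0)   # per column: at least one value is NA
all_na_rows = na.all(axis=1)   # per row: every value is NA
any_na_rows = na.any(axis=1)   # per row: at least one value is NA

print(all_na_cols.tolist())  # [True, False]
print(any_na_cols.tolist())  # [True, True]
print(all_na_rows.tolist())  # [True, False]
print(any_na_rows.tolist())  # [True, True]
```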

When I call df['col1'].idxmax(), it emits the following warning:

C:\Users\Desktop\1.py:12: FutureWarning: The behavior of Series.idxmax with all-NA values, or any-NA and skipna=False, is deprecated. In a future version this will raise ValueError
  df['idxmax col1'] = df['col1'].idxmax()

With df['col1'].idxmax(skipna=True), the result is the same:

C:\Users\Desktop\1.py:12: FutureWarning: The behavior of Series.idxmax with all-NA values, or any-NA and skipna=False, is deprecated. In a future version this will raise ValueError
  df['idxmax col1'] = df['col1'].idxmax(skipna=True)

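The problem is that col1 contains only NA values, so it matches case 1 and the warning is emitted no matter what you pass for skipna.
df['col2'].idxmax(skipna=False), where col2 contains one NA value, matches case 2, so you get the warning: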

C:\Users\Desktop\1.py:11: FutureWarning: The behavior of Series.idxmax with all-NA values, or any-NA and skipna=False, is deprecated. In a future version this will raise ValueError
  df['col2'].idxmax(skipna=False)

df['col2'].idxmax() gives no warning because skipna defaults to True in Series.idxmax().
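One way to sidestep both cases for a single Series is to drop the NA values first and only call idxmax() when something is left. This is a sketch, not part of pandas: `idxmax_or_na` is a hypothetical helper, and returning pd.NA for an all-NA Series is an assumed convention.

```python
import pandas as pd


def idxmax_or_na(s: pd.Series):
    """Index label of the maximum of s, or pd.NA if s has no non-NA values.

    Hypothetical helper: idxmax() only ever sees NA-free data, so neither
    deprecated case can occur.
    """
    valid = s.dropna()
    if valid.empty:
        return pd.NA
    return valid.idxmax()


# Same values as col1 and col2 in the example frame (Int64 dtype is assumed).
col1 = pd.Series(pd.array([pd.NA, pd.NA], dtype="Int64"), name="col1")
col2 = pd.Series(pd.array([pd.NA, 2], dtype="Int64"), name="col2")

print(idxmax_or_na(col1))  # <NA>
print(idxmax_or_na(col2))  # 1
```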

`DataFrame.idxmax()` example

Back to DataFrame.idxmax(axis=1): the problem is that one of the rows contains only NA values, like row 0 in my example.

C:\Users\Desktop\1.py:11: FutureWarning: The behavior of DataFrame.idxmax with all-NA values, or any-NA and skipna=False, is deprecated. In a future version this will raise ValueError
  df.idxmax(axis=1)

In this case, you need to handle the all-NA rows first, for example by dropping them.

out = df.dropna(axis=0, how='all')  # Drop rows which contain all NA values.

res = out.idxmax(axis=1)            # skipna for DataFrame.idxmax default is True

print(out)

   col1  col2
1  <NA>     2

print(res)

1    col2
dtype: object

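After dropping the rows of df that contain only NA values, out.idxmax(axis=1) safely returns, for each remaining row, the column label of the first occurrence of the maximum.
Note that the remaining row of out hits case 2, so if you run out.idxmax(axis=1, skipna=False), you will get the warning as well: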

C:\Users\Desktop\1.py:13: FutureWarning: The behavior of DataFrame.idxmax with all-NA values, or any-NA and skipna=False, is deprecated. In a future version this will raise ValueError
  res = out.idxmax(axis=1, skipna=False)
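If the result has to stay aligned with the original index, for example to assign it back as a new column as in the question, an alternative to dropping rows is to compute idxmax only on the rows that have a value and reindex afterwards. This goes beyond what the answer shows: `rowwise_idxmax` is a hypothetical helper, and the frame uses float NaN instead of `<NA>` as a simplifying assumption.

```python
import warnings

import numpy as np
import pandas as pd


def rowwise_idxmax(df: pd.DataFrame) -> pd.Series:
    """Column label of each row's maximum; all-NA rows get a missing label."""
    has_value = df.notna().any(axis=1)         # False for all-NA rows (case 1)
    labels = df.loc[has_value].idxmax(axis=1)  # skipna=True by default (avoids case 2)
    return labels.reindex(df.index)            # put the skipped rows back as missing


# Same shape as the example frame, with NaN standing in for <NA>.
df = pd.DataFrame({"col1": [np.nan, np.nan], "col2": [np.nan, 2.0]})

# Record warnings to check that the idxmax deprecation is not triggered.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    res = rowwise_idxmax(df)

idxmax_warnings = [w for w in caught if "idxmax" in str(w.message)]
print(res)
print(len(idxmax_warnings))
```

Because the result keeps the original index, it can be assigned directly, e.g. df['max days'] = rowwise_idxmax(df[columns]) in the question's setup.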
