Split or group a DataFrame at each occurrence of a value (True) in a column

Posted by hivapdat on 2021-08-25 in Java


I have a DataFrame like this:

import pandas as pd

df = pd.DataFrame({
    'words': ['hi', 'this', 'is', 'a', 'sentence', 'this', 'is', 'another', 'sentence'],
    'indicator': [1, 0, 0, 0, 0, 1, 0, 0, 0],
})

This gives me:

      words  indicator
0        hi          1
1      this          0
2        is          0
3         a          0
4  sentence          0
5      this          1
6        is          0
7   another          0
8  sentence          0

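Now I want to merge all values of the column "words" that start at a "1" in the indicator column and run until the next "1" appears. The desired result would look like this: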

                      words  indicator  counter
0     hi this is a sentence          1        5
1  this is another sentence          1        4

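It is not easy to explain, which is why I am relying on the example. I tried groupby and split but could not find a solution. My last resort would be some kind of df.iterrows() loop, but I would rather avoid that for now because the real df is quite large.
Thanks in advance for your help!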

python DataFrame pandas group-by

Source: https://stackoverflow.com/questions/68301714/pandas-dataframe-split-or-groupby-dataframe-at-each-occurence-of-value-true-in

1 Answer


new9mtju1#

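You can take the cumulative sum of the indicator column, group by it, join all words in each group with a space, and count the number of words in each sentence.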

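# Turn the 0/1 indicator into a running group id: every 1 starts a new group.
df["indicator"] = df["indicator"].cumsum()
# Join the words of each group with spaces and count the rows per group.
df = df.groupby(
    "indicator", as_index=False
).agg(
    words=("words", " ".join),
    counter=("indicator", "size")
)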

#    indicator                     words  counter
# 0          1     hi this is a sentence        5
# 1          2  this is another sentence        4
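Note that the snippet above overwrites the indicator column with the group id, so it shows 1 and 2 rather than the 1 and 1 of the desired output. A minimal sketch of a variant that leaves the original df untouched and keeps indicator at 1 is shown here. The names `group_id`, `result` and `parts` are hypothetical and chosen only for this illustration, and the sketch assumes that the first row carries a 1 (any rows before the first 1 would end up in a group of their own with id 0).

```python
import pandas as pd

df = pd.DataFrame({
    "words": ["hi", "this", "is", "a", "sentence",
              "this", "is", "another", "sentence"],
    "indicator": [1, 0, 0, 0, 0, 1, 0, 0, 0],
})

# Running group id: increases by one at every row where indicator == 1.
# Kept as a separate Series so the original column is not modified.
group_id = df["indicator"].cumsum().rename("group_id")

result = (
    df.groupby(group_id)
    .agg(
        words=("words", " ".join),         # join the words of each group
        indicator=("indicator", "first"),  # the first row of each group holds the 1
        counter=("words", "size"),         # number of rows (words) per group
    )
    .reset_index(drop=True)
)
print(result)

# If the goal is to split rather than aggregate, the same key can be used
# to collect one sub-DataFrame per group.
parts = [group for _, group in df.groupby(group_id)]
```

Both steps rely only on vectorized pandas operations, so no df.iterrows() loop is needed.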

2021-08-25
