How can I find alternating True and False values column by column in Python?

Posted by lnxxn5zx on 2021-07-14 in Java

I have a DataFrame with about 100 columns. The subset below reproduces the problem:

df=pd.DataFrame({"a":["False","False","True","True","False","False","True","True","False"],
                "b":["False","False","True","True","True","False","True","True","False"]})

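I want to iterate over each column and do the following:
Search for the first occurrence of "True". If it is found, keep it as it is and set everything before it to np.nan.
Then search for the next "False". If it is found, keep it and set everything between that "True" and this "False" to np.nan.
Repeat this process, alternating between "True" and "False", until the last row.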

Desired output:

trl=pd.DataFrame({"a":[np.nan,np.nan,"True",np.nan,"False",np.nan,"True",np.nan,"False"],
                "b":[np.nan,np.nan,"True",np.nan,np.nan,"False","True",np.nan,"False"]})

Answer 1 (5m1hhzi4)

I found this approach:

import pandas as pd
import numpy as np
flag = "True" #Initial value
def checkAndInvert(x):
    global flag #Using the global value, because we need the previous states
    if x == flag:
        #If the flag matches then invert and return the same value
        flag = "False" if flag=="True" else "True"
        return x
    #if the flag does not match then return NAN
    return np.nan

df=pd.DataFrame({"a":["False","False","True","True","False","False","True","True","False"],
                "b":["False","False","True","True","True","False","True","True","False"]})

for col in df.columns:
    flag = "True" #Initialise the global value back to True 
    df[col] = df[col].apply(checkAndInvert)

df.to_dict(orient='list')

Output:

{'a': [nan, nan, 'True', nan, 'False', nan, 'True', nan, 'False'],
 'b': [nan, nan, 'True', nan, nan, 'False', 'True', nan, 'False']}
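The same idea can be written without a global variable by keeping the expected value as local state inside a per-column function. The following is only a minimal sketch of that variant: the name `keep_alternating` and its `first` parameter are hypothetical, and it assumes every cell holds the string "True" or "False", as in the question.

```python
import numpy as np
import pandas as pd

def keep_alternating(col, first="True"):
    """Keep a value only when it equals the currently expected flag.

    Hypothetical helper: `first` is the value the search starts with.
    """
    expected = first
    out = []
    for value in col:
        if value == expected:
            # Match: keep the value and flip what we look for next
            out.append(value)
            expected = "False" if expected == "True" else "True"
        else:
            # No match: blank out this cell
            out.append(np.nan)
    return pd.Series(out, index=col.index, dtype=object)

df = pd.DataFrame({"a": ["False", "False", "True", "True", "False", "False", "True", "True", "False"],
                   "b": ["False", "False", "True", "True", "True", "False", "True", "True", "False"]})

# DataFrame.apply calls the function once per column
result = df.apply(keep_alternating)
print(result.to_dict(orient="list"))
```

Because the state lives inside the function, nothing has to be reset between columns, and the original `df` is left unmodified.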

Answer 2 (4zcjmb1e)

Here is another solution that uses the DataFrame rolling functionality:

def alternate_values(df):
    # Start by making a row with only the "False" value
    first_row = df.iloc[0].replace({"True":"False"}).rename(-1).to_frame().T

    # We will prepend this row because we want to always start with "True"
    df = pd.concat([first_row, df])

    # Compare every element with its previous and set to NaN if they match
    def f(x):
        return np.nan if x.iloc[0] == x.iloc[1] else x.iloc[1]

    # Apply the rolling function on numeric values
    df = df.replace({"True":1, "False":0})\
           .rolling(2).apply(f)\
           .replace({1:"True", 0:"False"})

    # Finally drop the extra row at the beginning
    df = df.drop(-1, axis=0)
    return df

The function can be used like this:

>>> df = pd.DataFrame({"a": ["False","False","True","True","False","False","True","True","False"],
...                    "b": ["False","False","True","True","True","False","True","True","False"],
...                    "c": ["True","False","True","True","True","False","True","True","False"]})

>>> df
       a      b      c
0  False  False   True
1  False  False  False
2   True   True   True
3   True   True   True
4  False   True   True
5  False  False  False
6   True   True   True
7   True   True   True
8  False  False  False

>>> alternate_values(df)
       a      b      c
0    NaN    NaN   True
1    NaN    NaN  False
2   True   True   True
3    NaN    NaN    NaN
4  False    NaN    NaN
5    NaN  False  False
6   True   True   True
7    NaN    NaN    NaN
8  False  False  False
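Since each cell can only take two values, keeping the alternating "True"/"False" entries is equivalent to keeping the cells that differ from the cell directly above them, with an imaginary "False" placed above the first row. Under that assumption (only the strings "True" and "False", no missing values), the comparison can be sketched with `shift` and `where` instead of `rolling`. This is a swapped-in technique, not the answer's own method:

```python
import pandas as pd

df = pd.DataFrame({"a": ["False", "False", "True", "True", "False", "False", "True", "True", "False"],
                   "b": ["False", "False", "True", "True", "True", "False", "True", "True", "False"],
                   "c": ["True", "False", "True", "True", "True", "False", "True", "True", "False"]})

# Previous row's value for every cell; the first row is compared against "False"
previous = df.shift(fill_value="False")

# Keep a cell only where it differs from the one above; everything else becomes NaN
result = df.where(df.ne(previous))
print(result)
```

This avoids the round trip through numeric values, but it relies on the data being strictly binary. With a third value in a column, a change from the previous row would no longer imply the expected alternation.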
