How to unify column names with pandas in order to append DataFrames?

bvjxkvbb  posted on 2021-09-29 in Java

I have two DataFrames, as shown below:

import pandas as pd

df1 = pd.DataFrame({'person_id': [101,101,101,101,202,202,202],
                   'person_type':['A','A','B','C','D','B','A'],
                   'test_id':[1,2,3,3,4,4,5],
                   'login_date':['5/7/2013 09:27:00 AM','09/08/2013 11:21:00 AM','06/06/2014 08:00:00 AM','06/06/2014 05:00:00 AM','12/11/2011 10:00:00 AM','13/10/2012 12:00:00 AM','13/12/2012 11:45:00 AM']})

df2 = pd.DataFrame({'subject_id': [101,101,101,101,202,202,202],
                   'test_date':['5/7/2013 09:27:00 AM','09/08/2013 11:21:00 AM','06/06/2014 08:00:00 AM','06/06/2014 05:00:00 AM','12/11/2011 10:00:00 AM','13/10/2012 12:00:00 AM','13/12/2012 11:45:00 AM']})

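I want to reshape df2 to match df1. By "shape" I mean only the column names.
For example: I want df2 to look exactly like df1 in terms of column names, while keeping df2's values.
I tried the following: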

df2.rename(columns={'subject_id':'person_id', 'test_date':'login_date'}, inplace=True)
final_columns = df1.columns
previous_columns = df2.columns.tolist()
mapping = {previous_columns[i]: final_columns[i] for i in range(2)}
df2.rename(mapping, inplace=True)
final_df = df1.append(df2)

I would like my output to look like the following:

7xzttuei  #1

Try pd.concat:
import pandas as pd

pd.concat([
    df1.assign(Data_From="df1"),
    df2.assign(Data_From="df2")
       .rename(columns={"subject_id": "person_id", "test_date": "login_date"})
])

   person_id person_type  test_id              login_date Data_From
0        101           A      1.0    5/7/2013 09:27:00 AM       df1
1        101           A      2.0  09/08/2013 11:21:00 AM       df1
2        101           B      3.0  06/06/2014 08:00:00 AM       df1
3        101           C      3.0  06/06/2014 05:00:00 AM       df1
4        202           D      4.0  12/11/2011 10:00:00 AM       df1
5        202           B      4.0  13/10/2012 12:00:00 AM       df1
6        202           A      5.0  13/12/2012 11:45:00 AM       df1
0        101         NaN      NaN    5/7/2013 09:27:00 AM       df2
1        101         NaN      NaN  09/08/2013 11:21:00 AM       df2
2        101         NaN      NaN  06/06/2014 08:00:00 AM       df2
3        101         NaN      NaN  06/06/2014 05:00:00 AM       df2
4        202         NaN      NaN  12/11/2011 10:00:00 AM       df2
5        202         NaN      NaN  13/10/2012 12:00:00 AM       df2
6        202         NaN      NaN  13/12/2012 11:45:00 AM       df2
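
Note that the result above repeats the index labels 0 to 6 for each frame. As a minimal sketch, assuming a fresh 0..n-1 index is wanted, the same call can be given `ignore_index=True`. The two small frames below are shortened stand-ins for the ones in the question, not the original data.

```python
import pandas as pd

# Shortened stand-ins for the question's frames (two rows each).
df1 = pd.DataFrame({
    "person_id": [101, 202],
    "person_type": ["A", "D"],
    "test_id": [1, 4],
    "login_date": ["5/7/2013 09:27:00 AM", "12/11/2011 10:00:00 AM"],
})
df2 = pd.DataFrame({
    "subject_id": [101, 202],
    "test_date": ["06/06/2014 08:00:00 AM", "13/10/2012 12:00:00 AM"],
})

combined = pd.concat(
    [
        df1.assign(Data_From="df1"),
        # Rename df2's columns to df1's names so concat aligns them by name.
        df2.rename(columns={"subject_id": "person_id", "test_date": "login_date"})
           .assign(Data_From="df2"),
    ],
    ignore_index=True,  # renumber rows instead of repeating each frame's index
)
print(combined)
```

Columns that exist only in df1 (person_type, test_id) are filled with NaN for the rows that come from df2, which is also why test_id is shown as a float.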

n6lpvg4x  #2

Use the keys argument of concat:

df3 = pd.concat([df1,df2.rename(columns=
                      {'subject_id' : 'person_id',
                      'test_date' : 'login_date'})],
             join='outer',
             keys=['df1','df2'])

Then use .loc to slice your DataFrame:

print(df3.loc['df1'])

   person_id person_type  test_id              login_date
0        101           A      1.0    5/7/2013 09:27:00 AM
1        101           A      2.0  09/08/2013 11:21:00 AM
2        101           B      3.0  06/06/2014 08:00:00 AM
3        101           C      3.0  06/06/2014 05:00:00 AM
4        202           D      4.0  12/11/2011 10:00:00 AM
5        202           B      4.0  13/10/2012 12:00:00 AM
6        202           A      5.0  13/12/2012 11:45:00 AM

print(df3)

       person_id person_type  test_id              login_date
df1 0        101           A      1.0    5/7/2013 09:27:00 AM
    1        101           A      2.0  09/08/2013 11:21:00 AM
    2        101           B      3.0  06/06/2014 08:00:00 AM
    3        101           C      3.0  06/06/2014 05:00:00 AM
    4        202           D      4.0  12/11/2011 10:00:00 AM
    5        202           B      4.0  13/10/2012 12:00:00 AM
    6        202           A      5.0  13/12/2012 11:45:00 AM
df2 0        101         NaN      NaN    5/7/2013 09:27:00 AM
    1        101         NaN      NaN  09/08/2013 11:21:00 AM
    2        101         NaN      NaN  06/06/2014 08:00:00 AM
    3        101         NaN      NaN  06/06/2014 05:00:00 AM
    4        202         NaN      NaN  12/11/2011 10:00:00 AM
    5        202         NaN      NaN  13/10/2012 12:00:00 AM
    6        202         NaN      NaN  13/12/2012 11:45:00 AM
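
If the key is more useful as an ordinary column than as an index level, it can be moved out of the index afterwards. The following is only a sketch under assumptions: the level names "source" and "row" are made up for illustration, and the frames are shortened stand-ins for the question's data.

```python
import pandas as pd

# Shortened stand-ins for the question's frames (two rows each).
df1 = pd.DataFrame({
    "person_id": [101, 202],
    "person_type": ["A", "D"],
    "test_id": [1, 4],
    "login_date": ["5/7/2013 09:27:00 AM", "12/11/2011 10:00:00 AM"],
})
df2 = pd.DataFrame({
    "subject_id": [101, 202],
    "test_date": ["06/06/2014 08:00:00 AM", "13/10/2012 12:00:00 AM"],
})

stacked = pd.concat(
    [df1, df2.rename(columns={"subject_id": "person_id", "test_date": "login_date"})],
    keys=["df1", "df2"],
    names=["source", "row"],  # hypothetical names for the two index levels
)

# Select the rows that came from one frame by its key.
only_df2 = stacked.loc["df2"]

# Turn the key level into a regular column and drop the old row numbers.
flat = stacked.reset_index(level="source").reset_index(drop=True)
print(flat)
```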

l7mqbcuq  #3

First, add a column to both DataFrames that records where each row came from:

df1['DATA FROM']='df1'
df2['DATA FROM']='df2'

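Finally:
Via append() + rename() (note that DataFrame.append was deprecated in pandas 1.4 and removed in pandas 2.0, so this variant needs an older version):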

df1.append(df2.rename(columns={'subject_id':'person_id','test_date':'login_date'}))


Via concat() + rename():

pd.concat([df1,df2.rename(columns={'subject_id':'person_id','test_date':'login_date'})])

Output:

   person_id person_type  test_id              login_date DATA FROM
0        101           A      1.0    5/7/2013 09:27:00 AM       df1
1        101           A      2.0  09/08/2013 11:21:00 AM       df1
2        101           B      3.0  06/06/2014 08:00:00 AM       df1
3        101           C      3.0  06/06/2014 05:00:00 AM       df1
4        202           D      4.0  12/11/2011 10:00:00 AM       df1
5        202           B      4.0  13/10/2012 12:00:00 AM       df1
6        202           A      5.0  13/12/2012 11:45:00 AM       df1
0        101         NaN      NaN    5/7/2013 09:27:00 AM       df2
1        101         NaN      NaN  09/08/2013 11:21:00 AM       df2
2        101         NaN      NaN  06/06/2014 08:00:00 AM       df2
3        101         NaN      NaN  06/06/2014 05:00:00 AM       df2
4        202         NaN      NaN  12/11/2011 10:00:00 AM       df2
5        202         NaN      NaN  13/10/2012 12:00:00 AM       df2
6        202         NaN      NaN  13/12/2012 11:45:00 AM       df2
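
Both variants above pass the mapping through the `columns=` keyword of rename(). The sketch below illustrates why that keyword matters, which may be relevant to the attempt in the question, where the second call is `df2.rename(mapping, inplace=True)` without it. The small frame is a shortened stand-in for df2.

```python
import pandas as pd

# Shortened stand-in for the question's df2.
df2 = pd.DataFrame({
    "subject_id": [101, 202],
    "test_date": ["06/06/2014 08:00:00 AM", "13/10/2012 12:00:00 AM"],
})
mapping = {"subject_id": "person_id", "test_date": "login_date"}

# A mapping passed positionally is applied to the index labels (axis 0),
# so the column names are left untouched and no error is raised.
unchanged = df2.rename(mapping)

# Passing it as columns= is what actually renames the columns.
renamed = df2.rename(columns=mapping)

print(list(unchanged.columns))
print(list(renamed.columns))
```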
