#A little every day#
Python merge: combining data
1: a single key
import pandas as pd
left = pd.DataFrame(...)   # columns: a, b, key
right = pd.DataFrame(...)  # columns: key, c, d
# use the shared 'key' column as the join key
res = pd.merge(left, right, on='key')
Output:
    a   b key   c   d
0  a0  b0  k0  c0  d0
1  a1  b1  k1  c1  d1
2  a2  b2  k2  c2  d2
3  a3  b3  k3  c3  d3
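The constructor arguments are not shown above, so here is a minimal runnable sketch of the single-key merge. The column values and their order are assumptions, inferred from the printed result rather than taken from the original code.

```python
import pandas as pd

# Assumed data, inferred from the printed result above.
left = pd.DataFrame({'a': ['a0', 'a1', 'a2', 'a3'],
                     'b': ['b0', 'b1', 'b2', 'b3'],
                     'key': ['k0', 'k1', 'k2', 'k3']})
right = pd.DataFrame({'key': ['k0', 'k1', 'k2', 'k3'],
                      'c': ['c0', 'c1', 'c2', 'c3'],
                      'd': ['d0', 'd1', 'd2', 'd3']})

# Rows are paired wherever the 'key' values match; how defaults to 'inner'.
res = pd.merge(left, right, on='key')
print(res)
```

The result holds the left frame's columns first, then the right frame's columns other than the key, which is why 'key' sits in the middle.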
2: two keys
import pandas as pd
left = pd.DataFrame(...)   # columns: a, b, key1, key2
right = pd.DataFrame(...)  # columns: key1, key2, c, d
2.1 inner join
res1 = pd.merge(left, right, on=['key1', 'key2'], how='inner')
Output with how='inner':
    a   b key1 key2   c   d
0  a0  b0   k0   k0  c0  d0
1  a2  b2   k1   k0  c1  d1
2  a2  b2   k1   k0  c2  d2
2.2 outer join
res2 = pd.merge(left, right, on=['key1', 'key2'], how='outer')
Output with how='outer' (missing values are filled with NaN):
     a    b key1 key2    c    d
0   a0   b0   k0   k0   c0   d0
1   a1   b1   k0   k1  NaN  NaN
2   a2   b2   k1   k0   c1   d1
3   a2   b2   k1   k0   c2   d2
4   a3   b3   k2   k1  NaN  NaN
5  NaN  NaN   k2   k0   c3   d3
2.3 left join
res3 = pd.merge(left, right, on=['key1', 'key2'], how='left')
Output with how='left' (missing values are filled with NaN):
    a   b key1 key2    c    d
0  a0  b0   k0   k0   c0   d0
1  a1  b1   k0   k1  NaN  NaN
2  a2  b2   k1   k0   c1   d1
3  a2  b2   k1   k0   c2   d2
4  a3  b3   k2   k1  NaN  NaN
2.4 right join
res4 = pd.merge(left, right, on=['key1', 'key2'], how='right')
Output with how='right' (missing values are filled with NaN):
     a    b key1 key2   c   d
0   a0   b0   k0   k0  c0  d0
1   a2   b2   k1   k0  c1  d1
2   a2   b2   k1   k0  c2  d2
3  NaN  NaN   k2   k0  c3  d3
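To compare the four join types side by side, the sketch below runs the same two-key merge under each value of how. The frames are assumptions reconstructed from the four printed results, not the original data. Only row counts are compared, since they do not depend on how the rows are ordered.

```python
import pandas as pd

# Assumed data, inferred from the four results above.
left = pd.DataFrame({'a': ['a0', 'a1', 'a2', 'a3'],
                     'b': ['b0', 'b1', 'b2', 'b3'],
                     'key1': ['k0', 'k0', 'k1', 'k2'],
                     'key2': ['k0', 'k1', 'k0', 'k1']})
right = pd.DataFrame({'key1': ['k0', 'k1', 'k1', 'k2'],
                      'key2': ['k0', 'k0', 'k0', 'k0'],
                      'c': ['c0', 'c1', 'c2', 'c3'],
                      'd': ['d0', 'd1', 'd2', 'd3']})

# Run the same merge under each join type and record the row count.
row_counts = {}
for how in ['inner', 'outer', 'left', 'right']:
    res = pd.merge(left, right, on=['key1', 'key2'], how=how)
    row_counts[how] = len(res)
    print(how, len(res))
```

The counts are 3, 6, 5 and 4, matching the tables above. The pair (k1, k0) occurs twice in the right frame, so the single left row a2 appears twice in every result.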
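3: the indicator parameter
df1 = pd.DataFrame(...)  # columns: col1, col_left
df2 = pd.DataFrame(...)  # columns: col1, col_right
res = pd.merge(df1, df2, on='col1', how='outer', indicator=True)  # default is False
Output (missing values are filled with NaN):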
   col1 col_left  col_right      _merge
0     0        a        NaN   left_only
1     1        b        2.0        both
2     2      NaN        2.0  right_only
3     2      NaN        2.0  right_only
Renaming the indicator column
res2 = pd.merge(df1, df2, on='col1', how='outer', indicator='indicator_name')
Output (missing values are filled with NaN):
   col1 col_left  col_right indicator_name
0     0        a        NaN      left_only
1     1        b        2.0           both
2     2      NaN        2.0     right_only
3     2      NaN        2.0     right_only
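One common use of the indicator column is to pick out rows by where they came from. The sketch below does that with frames assumed from the results above; `left_only` is a name introduced here for illustration.

```python
import pandas as pd

# Assumed data, inferred from the results above.
df1 = pd.DataFrame({'col1': [0, 1], 'col_left': ['a', 'b']})
df2 = pd.DataFrame({'col1': [1, 2, 2], 'col_right': [2, 2, 2]})

# indicator=True adds a '_merge' column recording each row's origin.
res = pd.merge(df1, df2, on='col1', how='outer', indicator=True)

# Passing a string instead of True sets the name of that column.
res2 = pd.merge(df1, df2, on='col1', how='outer', indicator='indicator_name')

# Keep only the rows whose key appears in df1 but not in df2.
left_only = res[res['_merge'] == 'left_only']
print(left_only)
```

col_right prints as 2.0 rather than 2 because the column must hold NaN for the unmatched row, which turns it into a float column.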
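4: merging on the index
left = pd.DataFrame(...,   # columns: a, b
                    index=['k0', 'k1', 'k2'])
right = pd.DataFrame(...,  # columns: c, d
                     index=['k0', 'k2', 'k3'])
# left_index / right_index: join on each frame's row index instead of a column
res1 = pd.merge(left, right, left_index=True, right_index=True, how='outer')
Output (missing values are filled with NaN):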
      a    b    c    d
k0   a0   b0   c0   d0
k1   a1   b1  NaN  NaN
k2   a2   b2   c2   d2
k3  NaN  NaN   c3   d3
res2 = pd.merge(left, right, left_index=True, right_index=True, how='inner')
Output (only the index labels present on both sides remain):
     a   b   c   d
k0  a0  b0  c0  d0
k2  a2  b2  c2  d2
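The index-based merge can be sketched in runnable form as follows. The column data is an assumption inferred from the two results above; the index labels are the ones given in the original code.

```python
import pandas as pd

# Assumed column data, inferred from the results above.
left = pd.DataFrame({'a': ['a0', 'a1', 'a2'],
                     'b': ['b0', 'b1', 'b2']},
                    index=['k0', 'k1', 'k2'])
right = pd.DataFrame({'c': ['c0', 'c2', 'c3'],
                      'd': ['d0', 'd2', 'd3']},
                     index=['k0', 'k2', 'k3'])

# With left_index/right_index the row labels act as the join key.
outer = pd.merge(left, right, left_index=True, right_index=True, how='outer')
inner = pd.merge(left, right, left_index=True, right_index=True, how='inner')

print(sorted(outer.index))  # union of both indexes
print(sorted(inner.index))  # labels present on both sides
```

The outer result covers k0 through k3, with NaN wherever one side has no row for a label. The inner result keeps only k0 and k2.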
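boys = pd.DataFrame(...)   # columns: age, k
girls = pd.DataFrame(...)  # columns: age, k
# both frames have an 'age' column; suffixes tells the two apart in the result
res = pd.merge(boys, girls, on='k', suffixes=['_boys', '_girls'], how='outer')
Output (missing values are filled with NaN):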
   age_boys   k  age_girls
0       1.0  k0        4.0
1       1.0  k0        5.0
2       2.0  k1        NaN
3       3.0  k2        NaN
4       NaN  k3        6.0
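A runnable sketch of the overlapping-column case follows. The ages and keys are assumptions read off the printed result, not the original data.

```python
import pandas as pd

# Assumed data, inferred from the result above.
boys = pd.DataFrame({'age': [1, 2, 3], 'k': ['k0', 'k1', 'k2']})
girls = pd.DataFrame({'age': [4, 5, 6], 'k': ['k0', 'k0', 'k3']})

# 'age' exists in both frames, so each copy gets its own suffix.
res = pd.merge(boys, girls, on='k', suffixes=['_boys', '_girls'], how='outer')
print(res.columns.tolist())

# Without the suffixes argument, pandas falls back to '_x' and '_y'.
default = pd.merge(boys, girls, on='k', how='outer')
print(default.columns.tolist())
```

Only the clashing non-key columns are renamed; the join key 'k' keeps its name.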