5 4 Study Notes (pandas)

2021-09-23 14:23:39

Sorting by axis

import pandas as pd
import numpy as np

dates = pd.date_range('20190301', periods=6)
df = pd.DataFrame(np.random.randn(6, 4), index=dates, columns=list('abcd'))

# axis=1 sorts the column labels; ascending=False puts them in reverse order
print(df.sort_index(axis=1, ascending=False))

Result:

                   d         c         b         a
2019-03-01 -0.791684 -0.487994  0.609415 -0.130605
2019-03-02 -1.547686  0.424872  0.774110 -0.444310
2019-03-03  0.017846  1.498144 -0.384752 -2.453477
2019-03-04  0.727243 -0.112879  1.297718  0.278707
2019-03-05  0.506601  0.796999  1.532483 -1.643227
2019-03-06 -0.217591 -2.295332 -1.521089 -0.051233
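Because the frame above is filled with random numbers, the output differs on every run. As a reproducible check of what axis=1 does, here is a minimal sketch with a small hand-written frame (the name small and its values are made up for illustration and are not from the original notes):

```python
import pandas as pd

# Illustrative data only; the columns are deliberately out of order.
small = pd.DataFrame({"b": [1, 2], "d": [3, 4], "a": [5, 6], "c": [7, 8]})

# axis=1 sorts the column labels.
by_cols_desc = small.sort_index(axis=1, ascending=False)

# axis=0 (the default) sorts the row index instead.
by_rows_desc = small.sort_index(axis=0, ascending=False)

print(list(by_cols_desc.columns))
print(list(by_rows_desc.index))
```

sort_index returns a new object, so small itself keeps its original column order.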

Sorting by the values in column b

print(df.sort_values(by='b'))
Result:

                   a         b         c         d
2019-03-06 -0.051233 -1.521089 -2.295332 -0.217591
2019-03-03 -2.453477 -0.384752  1.498144  0.017846
2019-03-01 -0.130605  0.609415 -0.487994 -0.791684
2019-03-02 -0.444310  0.774110  0.424872 -1.547686
2019-03-04  0.278707  1.297718 -0.112879  0.727243
2019-03-05 -1.643227  1.532483  0.796999  0.506601

To sort in descending order:

print(df.sort_values(by='b', ascending=False))
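The two sort_values calls can be compared on fixed data. This is a rough sketch; the frame scores and its index labels are invented for the example:

```python
import pandas as pd

# Illustrative data only (not from the original notes).
scores = pd.DataFrame(
    {"a": [3, 1, 2], "b": [0.5, -1.0, 2.0]},
    index=["x", "y", "z"],
)

# Rows are reordered by the values in column "b"; whole rows move together.
asc = scores.sort_values(by="b")
desc = scores.sort_values(by="b", ascending=False)

print(list(asc.index))
print(list(desc.index))
```

Both calls return a new DataFrame, so scores keeps its original row order unless the result is assigned back.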

Selecting data in pandas

Selecting a single column yields a Series; this is equivalent to df.a

print(df['a'])
Result:

2019-03-01   -0.130605
2019-03-02   -0.444310
2019-03-03   -2.453477
2019-03-04    0.278707
2019-03-05   -1.643227
2019-03-06   -0.051233
Freq: D, Name: a, dtype: float64
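To make the equivalence of df['a'] and df.a concrete, here is a small sketch on made-up data (frame is a hypothetical name). It also shows that passing a list of column names returns a DataFrame rather than a Series:

```python
import pandas as pd

# Illustrative data only.
frame = pd.DataFrame({"a": [1.0, 2.0], "b": [3.0, 4.0]})

col_bracket = frame["a"]   # Series
col_attr = frame.a         # the same Series, via attribute access
sub_frame = frame[["a"]]   # a list of names gives a one-column DataFrame

print(type(col_bracket).__name__, type(sub_frame).__name__)
print(col_bracket.equals(col_attr))
```

Attribute access only works when the column name is a valid Python identifier and does not clash with an existing DataFrame attribute, so the bracket form is the safer default.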

Slicing rows with the [] operator (a slice acts on rows, axis=0, by default)

dates = pd.date_range('20190101', periods=6)
df = pd.DataFrame(np.random.randn(6, 4), index=dates, columns=list('abcd'))

# positional slice: the first three rows
print(df[0:3])

Result:

                   a         b         c         d
2019-01-01  0.856435 -1.004727  1.922469  0.137247
2019-01-02 -0.197909  1.979294  1.754650  0.900661
2019-01-03  0.298422  0.446164 -0.281074 -1.514126

print(df['20190102':'20190103'])
Result:

                   a         b         c         d
2019-01-02 -0.197909  1.979294  1.754650  0.900661
2019-01-03  0.298422  0.446164 -0.281074 -1.514126
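The two slices above behave differently at the end point: a positional slice excludes the stop position, while a label slice includes both end labels. A minimal sketch, using np.arange instead of random values so the result is reproducible (frame is an illustrative name):

```python
import numpy as np
import pandas as pd

dates = pd.date_range("20190101", periods=6)
# Deterministic values 0..23 instead of random numbers.
frame = pd.DataFrame(np.arange(24).reshape(6, 4), index=dates, columns=list("abcd"))

by_position = frame[0:3]                  # positions 0, 1, 2; position 3 is excluded
by_label = frame["20190102":"20190103"]   # both end dates are included

print(len(by_position), len(by_label))
print(by_label.index[0].date(), by_label.index[-1].date())
```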

Selection by label

Getting a cross-section using a label

import pandas as pd
import numpy as np

dates = pd.date_range('20190101', periods=6)
df = pd.DataFrame(np.random.randn(6, 4), index=dates, columns=list('abcd'))

# a single row label returns that row as a Series
print(df.loc[dates[0]])

Result:

a    1.755160
b    1.889182
c    1.646745
d    0.418026
Name: 2019-01-01 00:00:00, dtype: float64

print(df.loc[:,['a','b']])
Result:

                   a         b
2019-01-01  1.755160  1.889182
2019-01-02 -1.687622  1.189862
2019-01-03  0.242299 -0.810373
2019-01-04 -0.758775  0.855404
2019-01-05 -0.316525 -1.058533
2019-01-06  1.584005  1.586659

print(df.loc['20190102':'20190104',['a','c']])
Result:

                   a         c
2019-01-02 -1.687622  0.565117
2019-01-03  0.242299  0.703034
2019-01-04 -0.758775  0.297482

print(df.loc['20190102',['a','b']])
Result:

a   -1.687622
b    1.189862
Name: 2019-01-02 00:00:00, dtype: float64
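The .loc calls above return different kinds of objects depending on whether each selector is a single label, a list, or a slice. The sketch below summarizes this on deterministic data; frame and the variable names are illustrative, and the final scalar lookup is an extra case not shown in the original notes:

```python
import numpy as np
import pandas as pd

dates = pd.date_range("20190101", periods=6)
# Deterministic values 0.0..23.0, so row 0 is [0, 1, 2, 3], row 1 is [4, 5, 6, 7], ...
frame = pd.DataFrame(np.arange(24.0).reshape(6, 4), index=dates, columns=list("abcd"))

row = frame.loc[dates[0]]                              # one row label -> Series
cols = frame.loc[:, ["a", "b"]]                        # all rows, two columns -> DataFrame
block = frame.loc["20190102":"20190104", ["a", "c"]]   # label slice (inclusive) -> DataFrame
partial_row = frame.loc["20190102", ["a", "b"]]        # one row, two columns -> Series
cell = frame.loc[dates[0], "a"]                        # one row, one column -> scalar

print(cols.shape, block.shape)
print(partial_row.tolist(), cell)
```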
