The Series data structure in pandas

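pandas is an open-source, BSD-licensed Python library that provides high-performance, easy-to-use data structures and data analysis tools for the Python programming language. Python with pandas is used in a wide range of fields, both academic and commercial, including finance, economics, statistics, and analytics. In this tutorial we look at the features of pandas and how to use them in practice.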

Installation

pip install pandas

Import

import pandas as pd
from pandas import Series, DataFrame

>>> import pandas as pd
>>> obj=pd.Series([4,7,-5,3])
>>> obj
0    4
1    7
2   -5
3    3
dtype: int64

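The string representation of a Series shows the index on the left and the values on the right. Because we did not specify an index for the data, an integer index from 0 to N-1 (where N is the length of the data) is created automatically. You can get the array representation and the index object of a Series through its values and index attributes: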

>>> import pandas as pd
>>> obj.values
array([ 4,  7, -5,  3], dtype=int64)
>>> obj.index
RangeIndex(start=0, stop=4, step=1)
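
As a small sketch of the two attributes described above, the following rebuilds the same Series and checks what values and index give back. The variable names `values` and `index` are illustrative and not part of the original session.

```python
import numpy as np
import pandas as pd

obj = pd.Series([4, 7, -5, 3])

# .values exposes the underlying data as a NumPy array
values = obj.values
print(type(values).__name__)   # ndarray

# With no index given, pandas builds a RangeIndex covering 0..N-1
index = obj.index
print(list(index))             # [0, 1, 2, 3]
print(len(index) == len(obj))  # True
```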

Often you will want to create a Series with an index that labels each data point. Index labels and values correspond one to one:

>>> obj2=pd.Series([4,7,-5,3],index=['d','b','a','c'])
>>> obj2
d    4
b    7
a   -5
c    3
dtype: int64

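>>> obj2['a']
-5
>>> obj2['d']
4
>>> obj2['c','a','d']    # raises an error (KeyError in current pandas): the labels are read as one tuple key
>>> obj2[['c','a','d']]  # pass a list of labels instead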
c    3
a   -5
d    4
dtype: int64

>>> obj2[obj2>0]
d    4
b    7
c    3
dtype: int64

>>> obj2*2
d     8
b    14
a   -10
c     6
dtype: int64

>>> import numpy as np
>>> np.exp(obj2)
d      54.598150
b    1096.633158
a       0.006738
c      20.085537
dtype: float64

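You can also think of a Series as a fixed-length, ordered dict, since it maps index labels to data values. It can be used in many functions that expect a dict argument: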

>>> 'b' in obj2
True
>>> 'e' in obj2
False
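
To make the dict analogy a little more concrete, here is a minimal sketch using the same obj2. The names `has_b`, `has_7`, `as_dict`, and `missing` are illustrative only. Note that membership tests look at the index labels, not at the values.

```python
import pandas as pd

obj2 = pd.Series([4, 7, -5, 3], index=['d', 'b', 'a', 'c'])

# `in` checks the index labels, just as it checks the keys of a dict
has_b = 'b' in obj2
has_7 = 7 in obj2          # 7 is a value, not a label

# Convert to a plain dict; key order follows the index order
as_dict = obj2.to_dict()

# .get() returns a default instead of raising for a missing label
missing = obj2.get('e', 0)

print(has_b, has_7)        # True False
print(missing)             # 0
```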

1. Create a Series by passing in a dict

>>> sdata = {'ohio': 35000, 'texas': 71000, 'oregon': 16000, 'utah': 5000}
>>> obj3=pd.Series(sdata)
>>> obj3
ohio      35000
texas     71000
oregon    16000
utah       5000
dtype: int64

2. Pass a new index to change the order of the dict entries

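Because the newly added california has no corresponding value in the dict, it is shown as missing data (NaN). Labels that are in the dict but not in the new index, such as utah, are left out.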

>>> states = ['california', 'ohio', 'oregon', 'texas']
>>> obj4 = pd.Series(sdata, index=states)
>>> obj4
california        NaN
ohio          35000.0
oregon        16000.0
texas         71000.0
dtype: float64
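
The following sketch spells out what happens when a dict and an explicit index are combined. The contents of `sdata` are an assumption reconstructed from the output shown for obj3 above.

```python
import pandas as pd

# Assumed contents, reconstructed from the printed obj3
sdata = {'ohio': 35000, 'texas': 71000, 'oregon': 16000, 'utah': 5000}
states = ['california', 'ohio', 'oregon', 'texas']

obj4 = pd.Series(sdata, index=states)

# Only labels listed in `states` survive; 'utah' is dropped
print(list(obj4.index))               # ['california', 'ohio', 'oregon', 'texas']

# 'california' has no entry in the dict, so its value is NaN
print(pd.isna(obj4['california']))    # True

# NaN is a float, so the values are stored as float64 instead of int64
print(obj4.dtype)                     # float64
```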

3. Detect missing data

>>> pd.isnull(obj4)
california     True
ohio          False
oregon        False
texas         False
dtype: bool
>>> pd.notnull(obj4)
california    False
ohio           True
oregon         True
texas          True
dtype: bool
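
The same checks are also available as methods on the Series itself. The sketch below rebuilds obj4 directly from its printed values, so the construction is an assumption, and shows two common uses of the resulting boolean masks.

```python
import numpy as np
import pandas as pd

# Rebuilt from the printed obj4; np.nan marks the missing entry
obj4 = pd.Series([np.nan, 35000.0, 16000.0, 71000.0],
                 index=['california', 'ohio', 'oregon', 'texas'])

# Method forms of pd.isnull / pd.notnull
mask_missing = obj4.isnull()
mask_present = obj4.notnull()

# True counts as 1 when summed, giving the number of missing entries
n_missing = int(mask_missing.sum())

# Indexing with the mask keeps only the entries that have data
present = obj4[mask_present]

print(n_missing)            # 1
print(list(present.index))  # ['ohio', 'oregon', 'texas']
```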

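In arithmetic operations, Series are aligned automatically by index label. Simply put, values with the same index label are added together, and a label that appears in only one of the two Series gives NaN: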

>>> obj3
ohio      35000
texas     71000
oregon    16000
utah       5000
dtype: int64
>>> obj4
california        NaN
ohio          35000.0
oregon        16000.0
texas         71000.0
dtype: float64
>>> obj3+obj4
california         NaN
ohio           70000.0
oregon         32000.0
texas         142000.0
utah               NaN
dtype: float64
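
As a sketch of the alignment rule, the code below repeats the addition and then contrasts it with Series.add and its fill_value parameter, which is one way to avoid NaN for labels that exist in only one operand. The contents of `sdata` are an assumption reconstructed from the printed obj3.

```python
import pandas as pd

# Assumed contents, reconstructed from the printed obj3
sdata = {'ohio': 35000, 'texas': 71000, 'oregon': 16000, 'utah': 5000}
obj3 = pd.Series(sdata)
obj4 = pd.Series(sdata, index=['california', 'ohio', 'oregon', 'texas'])

# `+` aligns on labels; the result covers the union of both indexes
total = obj3 + obj4
print(pd.isna(total['utah']))        # True: 'utah' is absent from obj4

# fill_value=0 substitutes 0 where only one side is missing
filled = obj3.add(obj4, fill_value=0)
print(filled['utah'])                # 5000.0

# 'california' is missing on both sides, so it stays NaN
print(pd.isna(filled['california'])) # True
```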

>>> obj4.name='population'
>>> obj4.index.name='state'
>>> obj4
state
california        NaN
ohio          35000.0
oregon        16000.0
texas         71000.0
Name: population, dtype: float64

>>> obj
0    4
1    7
2   -5
3    3
dtype: int64
>>> obj.index=['bob','steve','jeff','ryan']
>>> obj
bob      4
steve    7
jeff    -5
ryan     3
dtype: int64
