pandas Study Notes (Part 1)

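A Series is a labeled one-dimensional array that can hold any data type (integers, strings, floats, Python objects, and so on). The axis labels are collectively called the index.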

import numpy as np
import pandas as pd

ar = np.random.rand(5)
pd.Series(ar, index=list('abcde'))  # index adds labels to the data, passed as a list
s = pd.Series(ar)
print(ar)
print(s)

index returns the index of a Series; when no index is specified, its type is RangeIndex.

values returns the values of a Series as an ndarray.
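
As a small sketch of the two attributes above (the sample values and variable names are made up for illustration), the default index is a RangeIndex, supplying labels gives a different index type, and values is a NumPy array:

```python
import numpy as np
import pandas as pd

# Illustrative data; these values are not from the original notes
s_default = pd.Series([0.1, 0.2, 0.3])
s_labeled = pd.Series([0.1, 0.2, 0.3], index=list('abc'))

print(type(s_default.index))   # RangeIndex when no index is given
print(type(s_labeled.index))   # a label-based Index when labels are supplied
print(type(s_default.values))  # numpy.ndarray
```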

sdata=
obj3 = pd.Series(sdata)
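
The value assigned to sdata is missing here. Assuming it was a dict (an assumption; the dict and names below are hypothetical and used only for illustration), building a Series from a dict would look roughly like this, with the keys becoming the index:

```python
import pandas as pd

# Hypothetical dict, for illustration only
sdata_example = {'x': 10, 'y': 20, 'z': 30}
obj_example = pd.Series(sdata_example)

print(obj_example)       # keys become the index, values become the data
print(obj_example['y'])  # look up a value by its label
```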

arr = np.random.rand(10)
s = pd.Series(arr)
print(s)

rename() gives a Series a new name and returns a new Series; the original Series is unchanged.
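
A minimal sketch of that behavior, using made-up names:

```python
import pandas as pd

s = pd.Series([1, 2, 3], name='original')
s_renamed = s.rename('renamed')  # returns a new Series carrying the new name

print(s.name)          # the original keeps its name
print(s_renamed.name)
```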

pd.isnull()
pd.notnull()
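
The two functions above test each element for missing values. A short sketch with illustrative data:

```python
import numpy as np
import pandas as pd

s = pd.Series([1.0, np.nan, 3.0], index=list('abc'))

missing = pd.isnull(s)   # True where the value is NaN
present = pd.notnull(s)  # the element-wise opposite

print(missing)
print(s[present])        # keep only the non-missing values
```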

>>> obj2 = pd.Series([4, 7, -5, 3], index=['d', 'b', 'a', 'c'])
>>> obj2
d    4
b    7
a   -5
c    3
dtype: int64
>>> obj2['a']
-5
>>> print(obj2[['d', 'b', 'a']])
d    4
b    7
a   -5
dtype: int64
>>> obj2['d'] = 6
>>> obj2[obj2 > 0]
d    6
b    7
c    3
dtype: int64
>>> obj2 * 2
d    12
b    14
a   -10
c     6
dtype: int64
>>> 'b' in obj2
True
# Slicing: s[1:4] slices by position; the end is excluded
>>> s['a':'c']  # slices by label; the end is included

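To select the values for several labels, pass a list of labels inside the brackets, for example obj2[['d','b','a']].

Indexing with multiple labels returns a new Series.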
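
The difference between positional slicing, label slicing, and multi-label selection can be sketched as follows (the Series and its values are illustrative):

```python
import pandas as pd

s = pd.Series([10, 20, 30, 40, 50], index=list('abcde'))

by_position = s[1:4]    # positions 1, 2, 3; the end is excluded
by_label = s['a':'c']   # labels a, b, c; the end is included
picked = s[['e', 'a']]  # a list of labels returns a new Series in that order

print(by_position, by_label, picked, sep='\n')
```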

>>> obj2 = pd.Series([4, 7, -5, 3], index=['d', 'b', 'a', 'c'])
>>> obj2
d    4
b    7
a   -5
c    3
dtype: int64
>>> obj2.name = 'population'
>>> obj2.index.name = 'states'

s = pd.Series(np.random.rand(15))

print(s.head(2))  # first 2 rows

print(s.tail())  # last 5 rows by default
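
With deterministic data the effect of head and tail is easier to see. A sketch, using a simple integer range instead of random values:

```python
import pandas as pd

s = pd.Series(range(15))

first_two = s.head(2)  # first 2 rows
last_five = s.tail()   # n defaults to 5, so this is the last 5 rows

print(first_two)
print(last_five)
```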

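reindex does not rename the index labels; it rearranges the data to match the new index. Labels that do not exist in the original are filled with NaN (a default can be set with fill_value).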

s = pd.Series(np.random.rand(5), index=['a', 'b', 'c', 'd', 'e'])

s.reindex(['c', 'd'], fill_value=0)
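
To see fill_value take effect, the new index has to contain a label the original lacks. A sketch with illustrative data, where 'x' is such a label:

```python
import pandas as pd

s = pd.Series([1.0, 2.0, 3.0], index=list('abc'))

r_nan = s.reindex(['c', 'a', 'x'])                 # 'x' is new, so it becomes NaN
r_zero = s.reindex(['c', 'a', 'x'], fill_value=0)  # 'x' is filled with 0 instead

print(r_nan)
print(r_zero)
```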

drop removes elements and returns a copy (inplace=False by default).

s = pd.Series(np.random.rand(5), index=list('sdsfg'))

s1 = s.drop('d')  # drops label d; s is not changed

s.drop('d', inplace=True)  # modifies s directly
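
The copy versus in-place distinction can be checked with a membership test. A sketch with illustrative labels:

```python
import pandas as pd

s = pd.Series([1, 2, 3], index=list('abc'))

s1 = s.drop('b')            # returns a copy; s still contains 'b'
print('b' in s, 'b' in s1)

s.drop('b', inplace=True)   # modifies s itself and returns None
print('b' in s)
```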

A DataFrame is a tabular data structure with both a row index (index) and a column index (columns).

pd.DataFrame()

data1=
data2=
df1 = pd.DataFrame(data1, columns=['b', 'c', 'a', 'd'])

data1=
data2=,index=['a','b','c']

ar = np.random.rand(9).reshape(3, 3)
df1 = pd.DataFrame(ar)
df2 = pd.DataFrame(ar,
                   index=['a', 'b', 'c'],
                   columns=['one', 'two', 'three'])
# If index and columns are not specified, both default to integer labels

data=[,]
df1 = pd.DataFrame(data)
df2 = pd.DataFrame(data, index=['a', 'b'])
df3 = pd.DataFrame(data, columns=['one', 'two'])
# Build a DataFrame from a list of dicts: the columns are the dict keys; if index is not given, it defaults to integer labels
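
Since the list literal is missing above, here is a sketch with hypothetical records (the keys and values are made up) showing how a list of dicts maps onto columns and rows:

```python
import pandas as pd

# Hypothetical records, for illustration only
records = [{'one': 1, 'two': 2}, {'one': 3, 'two': 4, 'three': 5}]

df_default = pd.DataFrame(records)                         # integer row labels
df_labeled = pd.DataFrame(records, index=['a', 'b'])       # custom row labels
df_subset = pd.DataFrame(records, columns=['one', 'two'])  # keep only these keys

print(df_default)  # the first dict has no 'three' key, so that cell is NaN
```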

data=,
'marry':
'tom':}
df1 = pd.DataFrame(data)
# Build a DataFrame from a dict of dicts: the columns are the outer dict keys and the index is the inner dict keys
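
The nested dict above is truncated, so the following sketch uses a hypothetical one (names and scores are made up) to show the mapping of outer keys to columns and inner keys to the index:

```python
import pandas as pd

# Hypothetical nested dict, for illustration only
scores = {
    'jack': {'math': 90, 'english': 89},
    'tom': {'math': 78, 'art': 95},
}
df = pd.DataFrame(scores)

print(df)  # outer keys become columns, inner keys become the index
```

Inner keys that appear in only one of the sub-dicts produce NaN in the other column.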

ar = np.random.rand(9).reshape(3, 3)
data1 = pd.DataFrame(ar,
                     index=['a', 'b', 'c'],
                     columns=['one', 'two', 'three'])
# Select columns (use column labels):
data1['one']
data1[['two', 'three']]
# Select rows (use index labels with loc):
data1.loc['a']
data1.loc[['a', 'b']]
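
A sketch of what each selection returns, using a small integer DataFrame (illustrative values) so the results are predictable:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(9).reshape(3, 3),
                  index=['a', 'b', 'c'],
                  columns=['one', 'two', 'three'])

col = df['one']              # one column label -> Series
cols = df[['two', 'three']]  # list of column labels -> DataFrame
row = df.loc['a']            # one row label -> Series indexed by column names
rows = df.loc[['a', 'b']]    # list of row labels -> DataFrame

print(col, cols, row, rows, sep='\n')
```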

ar = np.random.rand(9).reshape(3, 3)
data1 = pd.DataFrame(ar,
                     index=['a', 'b', 'c'],
                     columns=['one', 'two', 'three'])
data1.iloc[0]       # first row
data1.iloc[-1]      # last row
data1.iloc[0, 1]    # value in the first row, second column
data1.iloc[[0, 1]]  # first two rows
data1.iloc[0:2]     # first two rows as well; the end of the slice is excluded
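
The same positional lookups, sketched on an integer DataFrame (illustrative values) so each result can be read off directly:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(9).reshape(3, 3),
                  index=['a', 'b', 'c'],
                  columns=['one', 'two', 'three'])

first_row = df.iloc[0]    # row at position 0
last_row = df.iloc[-1]    # row at the last position
cell = df.iloc[0, 1]      # row 0, column 1
two_rows = df.iloc[0:2]   # positions 0 and 1; position 2 is excluded

print(first_row, last_row, cell, two_rows, sep='\n')
```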

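ar = np.random.rand(16).reshape(4, 4) * 100  # scaled to [0, 100) so the thresholds below are meaningful
df = pd.DataFrame(ar,
                  index=['one', 'two', 'three', 'four'],
                  columns=['a', 'b', 'c', 'd'])
# With no column selected, the comparison is applied to every value
# The result keeps the full shape: where the mask is True the value is kept, where it is False the value becomes NaN
b1 = df < 20
df[b1]  # or df[df < 20]
# Condition on a single column
# The result keeps only the rows where the condition is True, including their other columns
b2 = df['a'] > 50
df[b2]
# Condition on several columns
# The result keeps the full shape: True keeps the value, False becomes NaN
b3 = df[['a', 'b']] > 50
df[b3]
# Condition on several rows
# The result keeps the full shape: True keeps the value, False becomes NaN
b4 = df.loc[['one', 'three']] < 50
df[b4]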
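
The shapes of the four results are easier to verify with fixed numbers. A sketch using an illustrative 3x3 DataFrame and thresholds chosen for this data (not the ones in the notes):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(0, 90, 10).reshape(3, 3),
                  index=['one', 'two', 'three'],
                  columns=['a', 'b', 'c'])

whole = df[df < 40]                            # every cell tested; failures become NaN
by_col = df[df['a'] > 20]                      # Series mask: keeps whole rows
by_cols = df[df[['a', 'b']] > 20]              # column c is absent from the mask, so it is all NaN
by_rows = df[df.loc[['one', 'three']] < 65]    # row two is absent from the mask, so it is all NaN

print(whole, by_col, by_cols, by_rows, sep='\n\n')
```

Only the single-column mask drops rows; the DataFrame-shaped masks always return a result with the original shape.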

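df['a'].loc[['three', 'two']]  # column a, then rows three and two

df[['b', 'c', 'd']].iloc[:2]  # columns b, c, d, then the first two rows

df[df < 50].loc[['one', 'two']]  # values that are not below 50 become NaN, then rows one and two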

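# del removes a column in place
del df['a']
# drop() removes a row and returns a new DataFrame
df.drop('one')
# drop() with axis=1 removes a column and returns a new DataFrame
df.drop(['d'], axis=1)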
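
A sketch contrasting del, which changes the DataFrame itself, with drop, which returns a new one (the DataFrame here is illustrative):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(9).reshape(3, 3),
                  index=['one', 'two', 'three'],
                  columns=['a', 'b', 'c'])

del df['a']                      # removes column a from df in place
no_row = df.drop('one')          # new DataFrame without row one
no_col = df.drop(['c'], axis=1)  # new DataFrame without column c

print(df, no_row, no_col, sep='\n\n')
```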

Sorting by value

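ar = np.random.rand(9).reshape(3, 3)
df = pd.DataFrame(ar,
                  index=['a', 'b', 'c'],
                  columns=['one', 'two', 'three'])
# Sort by a single column
df.sort_values(['one'], ascending=True)
# Sort by several columns, in the order listed: column one first, then column three
df.sort_values(['one', 'three'])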
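
With random floats, ties never occur, so the second sort key has no visible effect. A sketch with illustrative data containing a tie in the first column:

```python
import pandas as pd

df = pd.DataFrame({'one': [2, 1, 2], 'two': [9, 8, 7]},
                  index=['a', 'b', 'c'])

by_one = df.sort_values(['one'], ascending=True)  # sort on column one only
by_both = df.sort_values(['one', 'two'])          # ties in one are broken by two

print(by_one)
print(by_both)
```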

Sorting by index

# Sort by index; the defaults are ascending=True and inplace=False
ar = np.random.rand(9).reshape(3, 3)
df = pd.DataFrame(ar,
                  index=['a', 'b', 'c'],
                  columns=['one', 'two', 'three'])
df.sort_index()
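
Because the index above is already in order, the sort has nothing to change. A sketch with an illustrative out-of-order index makes the effect visible:

```python
import pandas as pd

df = pd.DataFrame({'x': [1, 2, 3]}, index=['c', 'a', 'b'])

asc = df.sort_index()                  # ascending by default, returns a new DataFrame
desc = df.sort_index(ascending=False)  # descending order

print(asc)
print(desc)
```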
