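Basic characteristics of a DataFrame:
1. It is a tabular data structure.
2. It contains an ordered set of columns.
3. It can roughly be viewed as a collection of Series that share the same index.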
>>> import pandas as pd
>>> data={'name':['wangdachui','linling','niuyun'],'pay':[4000,5000,6000]}
>>> frame=pd.DataFrame(data)
>>> frame
         name   pay
0  wangdachui  4000
1     linling  5000
2      niuyun  6000
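To make the third characteristic concrete, here is a minimal sketch that builds the same table from two Series instead of a dict of lists. The variable names `name`, `pay` and `same_index` are chosen for illustration only.

```python
import pandas as pd

# Two Series built on the same default index (labels 0, 1, 2).
name = pd.Series(['wangdachui', 'linling', 'niuyun'])
pay = pd.Series([4000, 5000, 6000])

# A dict of Series becomes a DataFrame; the dict keys turn into column labels.
frame = pd.DataFrame({'name': name, 'pay': pay})

# Every column pulled back out is a Series carrying the frame's index.
same_index = frame['name'].index.equals(frame['pay'].index)
print(list(frame.columns))
print(same_index)
```

Because the columns share one index, selecting a column never loses track of which row each value belongs to.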
>>> import pandas as pd
>>> import numpy as np
>>> data=np.array([('wangdachui',4000),('linling',5000),('niuyun',6000)])
>>> frame=pd.DataFrame(data,index=range(1,4),columns=['name','pay'])
>>> frame
         name   pay
1  wangdachui  4000
2     linling  5000
3      niuyun  6000
>>> frame.index
RangeIndex(start=1, stop=4, step=1)
>>> frame.columns
Index(['name', 'pay'], dtype='object')
>>> frame.values
array([['wangdachui', '4000'],
       ['linling', '5000'],
       ['niuyun', '6000']], dtype=object)
>>> frame.index=[2,4,6]
>>> frame
         name   pay
2  wangdachui  4000
4     linling  5000
6      niuyun  6000
Basic DataFrame operations
· Selecting a row or a column of a DataFrame object yields a Series:
>>> frame['name']
2    wangdachui
4       linling
6        niuyun
Name: name, dtype: object
>>> frame.pay
2    4000
4    5000
6    6000
Name: pay, dtype: object
>>> frame.iloc[:2,1]
2    4000
4    5000
Name: pay, dtype: object
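The transcript above only selects columns. As a rough sketch of the row case, the block below picks one row by label with `loc` and by position with `iloc`. It assumes a frame rebuilt from a dict with integer `pay` values, so the dtypes differ from the string-based frame used in the transcript; `row_by_label` and `row_by_position` are illustrative names.

```python
import pandas as pd

frame = pd.DataFrame(
    {'name': ['wangdachui', 'linling', 'niuyun'], 'pay': [4000, 5000, 6000]},
    index=[2, 4, 6],
)

row_by_label = frame.loc[4]       # index label 4, i.e. the second row
row_by_position = frame.iloc[1]   # integer position 1, the same row

# A single row is also a Series; its index is the frame's column labels.
print(type(row_by_label).__name__)
print(row_by_label['name'], row_by_label['pay'])
```

With a non-default index such as `[2, 4, 6]`, `loc` and `iloc` take different arguments for the same row, which is why the two accessors are worth keeping apart.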
· Modifying and deleting parts of a DataFrame object:
>>> frame['name']='admin'
>>> frame
    name   pay
2  admin  4000
4  admin  5000
6  admin  6000
>>> del frame['pay']
>>> frame
    name
2  admin
4  admin
6  admin
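Assigning a scalar to a column overwrites every row, and `del` changes the frame in place. The sketch below shows two gentler alternatives that may be useful: `drop`, which returns a new frame, and `loc` assignment, which changes a single cell. The rebuilt frame with integer `pay` values and the replacement value 4500 are assumptions made for illustration.

```python
import pandas as pd

frame = pd.DataFrame(
    {'name': ['wangdachui', 'linling', 'niuyun'], 'pay': [4000, 5000, 6000]},
    index=[2, 4, 6],
)

# drop() returns a new DataFrame and leaves the original untouched.
without_pay = frame.drop(columns=['pay'])

# Assigning through loc[row_label, column_label] changes one cell only.
frame.loc[2, 'pay'] = 4500

print(list(without_pay.columns))
print(frame.loc[2, 'pay'], frame.loc[4, 'pay'])
```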
Statistical functions of a DataFrame
>>> import pandas as pd
>>> import numpy as np
>>> data=np.array([('wangdachui',4000),('linling',5000),('niuyun',6000)])
>>> frame=pd.DataFrame(data,index=range(1,4),columns=['name','pay'])
>>> frame
         name   pay
1  wangdachui  4000
2     linling  5000
3      niuyun  6000
>>> frame.pay.min()
'4000'
>>> frame[frame.pay>='5000']
      name   pay
2  linling  5000
3   niuyun  6000
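Note that `min()` returns the string '4000' and the filter compares against the string '5000': `np.array` coerces the mixed tuples to strings, so `pay` is compared lexicographically (for example, '10000' sorts before '5000'). A minimal sketch of one way around this, assuming the `pay` values are always whole numbers, is to convert the column with `astype(int)` first. The names `lowest`, `average` and `high_earners` are illustrative.

```python
import numpy as np
import pandas as pd

data = np.array([('wangdachui', 4000), ('linling', 5000), ('niuyun', 6000)])
frame = pd.DataFrame(data, index=range(1, 4), columns=['name', 'pay'])

# NumPy stored every value as a string, so convert pay to integers first.
frame['pay'] = frame['pay'].astype(int)

lowest = frame['pay'].min()                  # numeric minimum
average = frame['pay'].mean()                # only meaningful on numbers
high_earners = frame[frame['pay'] >= 5000]   # numeric comparison

print(lowest, average)
print(list(high_earners['name']))
```

After the conversion, numeric reductions such as `mean()` and `sum()` behave as expected, and comparisons no longer depend on the number of digits.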