Assign an index to the data, for example:
data = np.random.randn(5)
pd.Series(data, index=['a', 'b', 'c', 'd', 'e'])
>>>
a -0.287461
b 0.736157
c 1.759875
d -0.238167
e 0.621458
dtype: float64
pd.Series(np.random.randn(5))  # no index given, so pandas uses 0..n-1
>>>
0 -0.334205
1 -1.033102
2 -0.349577
3 -1.459086
4 0.148646
dtype: float64
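The difference between an explicit index and the default one can be sketched as follows. The values are fixed, hypothetical numbers rather than random ones, so the lookups are reproducible.

```python
import numpy as np
import pandas as pd

# Hypothetical fixed values (the original uses np.random.randn).
values = np.array([1.5, -0.5, 2.0, 0.0, 3.25])

labeled = pd.Series(values, index=['a', 'b', 'c', 'd', 'e'])
default = pd.Series(values)  # index defaults to 0, 1, 2, 3, 4

by_label = labeled['c']      # look up by the custom label
by_default = default[2]      # with the default index, 2 is the label of the third item

print(labeled.index.tolist())
print(default.index.tolist())
```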
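df1 = pd.DataFrame(..., index=[0, 1, 2, 3])  # data not shown
df2 = pd.DataFrame(..., index=[0, 1, 2, 3])  # data not shown
s = pd.concat([df1, df2], axis=1)  # axis=1 joins side by side (adds columns); axis=0 stacks vertically (adds rows)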
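A minimal sketch of both concatenation directions. The contents of the two frames are not shown in the original, so the columns `A` and `B` and their values here are purely hypothetical.

```python
import pandas as pd

# Hypothetical data; only the index matches the original example.
left = pd.DataFrame({'A': [1, 2, 3, 4]}, index=[0, 1, 2, 3])
right = pd.DataFrame({'B': [10, 20, 30, 40]}, index=[0, 1, 2, 3])

# axis=1: rows are aligned on the index and the columns sit side by side.
side_by_side = pd.concat([left, right], axis=1)

# axis=0: rows are appended; a column missing from one frame is filled with NaN.
stacked = pd.concat([left, right], axis=0)

print(side_by_side.shape)
print(stacked.shape)
```

With `axis=0` the index values 0 to 3 appear twice in the result; passing `ignore_index=True` renumbers them if that is not wanted.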
df.drop(['column'], axis=1)  # temporary: returns a new DataFrame, df itself is unchanged
df.drop(['column'], axis=1, inplace=True)  # permanent; equivalent to: df = df.drop(['column'], axis=1)
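To illustrate the difference between the temporary and the permanent form, here is a small sketch on a hypothetical two-column frame.

```python
import pandas as pd

# Hypothetical frame with one column to keep and one to drop.
df = pd.DataFrame({'keep': [1, 2], 'column': [3, 4]})

without = df.drop(['column'], axis=1)   # new frame without the column
still_there = 'column' in df.columns    # the original frame is untouched

df = df.drop(['column'], axis=1)        # reassign to make the removal stick
```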
df['column'] = df['column'].astype('int')
df['column'] = df.column.astype(int)  # same conversion, using attribute access
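A short sketch of the type conversion, assuming a hypothetical column that holds digits stored as strings.

```python
import pandas as pd

# Hypothetical data: numbers that were read in as text.
df = pd.DataFrame({'column': ['1', '2', '3']})

df['column'] = df['column'].astype(int)  # convert the whole column to integers
total = df['column'].sum()               # arithmetic now works on numbers

print(df['column'].dtype)
```

`astype(int)` raises an error if any value cannot be parsed or is missing, so missing values usually need to be filled first.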
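df['column'] = df['column'].fillna(value=0)  # assign back; inplace=True on a selected column may not update df
df.loc[df['column'] == value1, 'column'] = value2  # .loc avoids chained assignment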
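The two steps, filling missing values and then replacing one value with another, might look like this on hypothetical data. The sketch assigns the `fillna` result back and uses `.loc` for the conditional replacement, since chained forms such as `df['column'][mask] = ...` can end up modifying a temporary copy.

```python
import numpy as np
import pandas as pd

# Hypothetical column with one missing value.
df = pd.DataFrame({'column': [1.0, np.nan, 3.0, 1.0]})

# Fill NaN with 0 and assign the result back to the column.
df['column'] = df['column'].fillna(0)

# Replace every 1.0 with 9.0 in a single indexing operation.
df.loc[df['column'] == 1.0, 'column'] = 9.0

print(df['column'].tolist())
```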
This is similar to the SQL statement select df1.id from df1, df2 where df1.id = df2.id, written in pandas as:
df1[df1['id']==df2['id']]
This approach requires df1 and df2 to have the same index; otherwise it raises
ValueError: Can only compare identically-labeled Series objects
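When the two tables do not share an index, the element-wise comparison is not usable. One possible alternative (not the method shown above) is `merge` for an inner join, or `isin` to filter one table by the ids of the other. The tables below are hypothetical.

```python
import pandas as pd

# Hypothetical tables with different lengths and different indexes.
df1 = pd.DataFrame({'id': [1, 2, 3, 4]})
df2 = pd.DataFrame({'id': [3, 4, 5]}, index=[10, 11, 12])

# The direct comparison fails because the indexes differ.
try:
    df1[df1['id'] == df2['id']]
    comparison_failed = False
except ValueError:
    comparison_failed = True

# Inner join on id; like the SQL query, duplicate ids in df2 would repeat rows.
joined = df1.merge(df2, on='id', how='inner')['id']

# Filter: rows of df1 whose id appears in df2, each kept at most once.
matched = df1[df1['id'].isin(df2['id'])]
```

`merge` matches the SQL semantics more closely, while `isin` keeps the original index and row count of df1.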
col_dates = df.dtypes[df.dtypes == 'datetime64[ns]'].index
for d in col_dates:
    df[d] = df[d].dt.to_period('M')  # convert each datetime column to monthly periods
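A runnable sketch of the datetime-to-period conversion on a hypothetical frame. It finds the datetime columns with `select_dtypes`, a variation on the dtype comparison above that does not depend on the exact `[ns]` resolution.

```python
import pandas as pd

# Hypothetical frame with one datetime column and one numeric column.
df = pd.DataFrame({
    'issued': pd.to_datetime(['2015-01-15', '2015-02-28']),
    'amount': [100, 200],
})

# Pick out the datetime columns only.
col_dates = df.select_dtypes(include='datetime').columns

# Truncate each datetime column to a monthly period.
for d in col_dates:
    df[d] = df[d].dt.to_period('M')

print(df['issued'].astype(str).tolist())
```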
df['emp_length'] = df['emp_length'].fillna(df.emp_length.median())
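Filling missing values with the column median can be checked on a small, hypothetical `emp_length` column.

```python
import numpy as np
import pandas as pd

# Hypothetical values with one missing entry.
df = pd.DataFrame({'emp_length': [2.0, np.nan, 10.0, 4.0]})

# median() skips NaN, so this is the median of 2, 4 and 10.
median = df['emp_length'].median()

df['emp_length'] = df['emp_length'].fillna(median)

print(df['emp_length'].tolist())
```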