A First Look at Python: Using pandas

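pandas is a tool built on NumPy, created for data analysis tasks. It incorporates a number of libraries and some standard data models, and provides the tools needed to work efficiently with large datasets. pandas offers a large number of functions and methods for processing data quickly and conveniently, and it has good support for time series analysis.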

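Series and DataFrame are the basic data structures specific to pandas. Note that although pandas has these two data structures of its own, it is still a Python library, so the data types available in Python still apply here, and you can also define your own data types with classes.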

# data_structure.py
import pandas as pd
import numpy as np

series1 = pd.Series([1, 2, 3, 4])
print("series1:\n{}\n".format(series1))

series1:
0    1
1    2
2    3
3    4
dtype: int64

The last line shows that the data type is int64. In the output, the first column is the index and the second column is the values.

We can print the data and the index of a Series separately:

# data_structure.py
print("series1.values: {}\n".format(series1.values))
print("series1.index: {}\n".format(series1.index))

series1.values: [1 2 3 4]

series1.index: RangeIndex(start=0, stop=4, step=1)

The default index is a sequence of integers starting from 0.
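As a small illustration of what `values` and `index` return, the sketch below rebuilds `series1` and inspects both attributes. The variable names `values`, `labels`, and `doubled` are chosen here for illustration only.

```python
import numpy as np
import pandas as pd

series1 = pd.Series([1, 2, 3, 4])

# .values exposes the underlying data as a NumPy array
values = series1.values
print(type(values).__name__)   # ndarray

# The default index is a RangeIndex; list() materialises its labels
labels = list(series1.index)
print(labels)                  # [0, 1, 2, 3]

# Arithmetic is applied element-wise and the index is preserved
doubled = series1 * 2
print(doubled.tolist())        # [2, 4, 6, 8]
```

Because the data is a NumPy array underneath, element-wise operations such as `series1 * 2` need no explicit loop.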

The index does not have to consist of integers. It can hold other types of labels, for example strings:

# data_structure.py
series2 = pd.Series([1, 2, 3, 4, 5, 6, 7],
                    index=["c", "d", "e", "f", "g", "a", "b"])
print("series2:\n{}\n".format(series2))
print("e is {}\n".format(series2["e"]))

series2:
c    1
d    2
e    3
f    4
g    5
a    6
b    7
dtype: int64

e is 3
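With a string index, the same element can be reached either by its label or by its position. The following is a minimal sketch of both styles using `loc` and `iloc`; the names `by_label`, `by_position`, and `subset` are illustrative.

```python
import pandas as pd

series2 = pd.Series([1, 2, 3, 4, 5, 6, 7],
                    index=["c", "d", "e", "f", "g", "a", "b"])

by_label = series2.loc["e"]        # label-based lookup
by_position = series2.iloc[2]      # position-based lookup (third element)
subset = series2.loc[["a", "b"]]   # a list of labels returns a Series

print(by_label, by_position)       # 3 3
print(subset.tolist())             # [6, 7]
```

Using `loc` and `iloc` explicitly makes it clear whether a label or a position is meant, which matters once the labels are no longer the default integers.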

Create a 4x4 DataFrame from a NumPy array without specifying row or column labels:

# data_structure.py
df1 = pd.DataFrame(np.arange(16).reshape(4, 4))
print("df1:\n{}\n".format(df1))

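The output is as follows (the column labels are called columns, the row labels are called index, and both default to integers starting from 0):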

df1:
    0   1   2   3
0   0   1   2   3
1   4   5   6   7
2   8   9  10  11
3  12  13  14  15
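To check the default labels directly rather than reading them off the printed table, one can inspect `shape`, `index`, and `columns`. This is a short sketch; the names `row_labels` and `column_labels` are illustrative.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame(np.arange(16).reshape(4, 4))

row_labels = list(df1.index)        # default row labels
column_labels = list(df1.columns)   # default column labels

print(df1.shape)        # (4, 4)
print(row_labels)       # [0, 1, 2, 3]
print(column_labels)    # [0, 1, 2, 3]
```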

Create a DataFrame with explicit columns and index:

# data_structure.py
df2 = pd.DataFrame(np.arange(16).reshape(4, 4),
                   columns=["column1", "column2", "column3", "column4"],
                   index=["a", "b", "c", "d"])
print("df2:\n{}\n".format(df2))

The result is as follows:

df2:
   column1  column2  column3  column4
a        0        1        2        3
b        4        5        6        7
c        8        9       10       11
d       12       13       14       15
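Once rows and columns have names, those names can be used to pick out a column, a row, or a single cell. The sketch below shows one way to do this with `df2`; the names `column`, `row`, and `cell` are illustrative.

```python
import numpy as np
import pandas as pd

df2 = pd.DataFrame(np.arange(16).reshape(4, 4),
                   columns=["column1", "column2", "column3", "column4"],
                   index=["a", "b", "c", "d"])

column = df2["column2"]          # one column, returned as a Series
row = df2.loc["b"]               # one row, returned as a Series
cell = df2.loc["c", "column3"]   # a single value

print(column.tolist())   # [1, 5, 9, 13]
print(row.tolist())      # [4, 5, 6, 7]
print(cell)              # 10
```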

Create a DataFrame by specifying the data for each column:

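# data_structure.py
df3 = pd.DataFrame({
    "note": ["c", "d", "e", "f", "g", "a", "b"],
    "weekday": ["mon", "tue", "wed", "thu", "fri", "sat", "sun"],
})
print("df3:\n{}\n".format(df3))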

The result is as follows (different columns of a DataFrame can have different data types):

df3:
  note weekday
0    c     mon
1    d     tue
2    e     wed
3    f     thu
4    g     fri
5    a     sat
6    b     sun
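The point that columns may differ in data type can be checked with `dtypes`. The following is a minimal sketch on a small hypothetical frame (not the article's `df3`); the names `df`, `number_is_int`, and `text_is_int` are assumptions made for this example.

```python
import pandas as pd

# Hypothetical frame mixing a text column and an integer column
df = pd.DataFrame({"weekday": ["mon", "tue"], "no.": [1, 2]})

# dtypes lists the data type of every column
print(df.dtypes)

# pd.api.types offers predicates for checking a column's type
number_is_int = pd.api.types.is_integer_dtype(df["no."])
text_is_int = pd.api.types.is_integer_dtype(df["weekday"])
print(number_is_int, text_is_int)   # True False
```

The exact name printed for the text column's dtype can vary between pandas versions, so the sketch checks the integer column with a predicate instead of comparing dtype names.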

Add or delete a column:

# data_structure.py
df3["no."] = pd.Series([1, 2, 3, 4, 5, 6, 7])
print("df3:\n{}\n".format(df3))
del df3["weekday"]
print("df3:\n{}\n".format(df3))

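The result is as follows:

df3:
  note weekday  no.
0    c     mon    1
1    d     tue    2
2    e     wed    3
3    f     thu    4
4    g     fri    5
5    a     sat    6
6    b     sun    7

df3:
  note  no.
0    c    1
1    d    2
2    e    3
3    f    4
4    g    5
5    a    6
6    b    7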
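Two details of adding and removing columns are easy to miss: a Series assigned as a new column is aligned on the index, and `drop` removes a column without modifying the original frame. The sketch below uses a shortened, hypothetical three-row version of `df3` to show both; `trimmed` is an illustrative name.

```python
import pandas as pd

# Shortened, hypothetical version of df3 with three rows
df3 = pd.DataFrame({"note": ["c", "d", "e"],
                    "weekday": ["mon", "tue", "wed"]})

# The Series only covers index labels 0 and 1, so row 2 becomes NaN
df3["no."] = pd.Series([1, 2], index=[0, 1])
print(df3["no."].isna().tolist())   # [False, False, True]

# drop returns a new DataFrame and leaves df3 unchanged
trimmed = df3.drop(columns=["weekday"])
print(list(trimmed.columns))        # ['note', 'no.']
print(list(df3.columns))            # ['note', 'weekday', 'no.']
```

Compared with `del df3["weekday"]`, which changes `df3` in place, `drop` is convenient when the original frame is still needed.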

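Elements can be accessed by label with loc or by position with iloc:

# Access the elements whose row index labels are 0 and 1 and whose column label is note
print("note c, d is:\n{}\n".format(df3.loc[[0, 1], "note"]))
# Access the elements at row positions 0 and 1 and column position 0
print("note c, d is:\n{}\n".format(df3.iloc[[0, 1], 0]))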

The result is as follows (for df3, the row positions and the row index labels are the same):

note c, d is:
0    c
1    d
Name: note, dtype: object

note c, d is:
0    c
1    d
Name: note, dtype: object
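The two access styles only coincide because df3 uses the default integer index. The sketch below uses a hypothetical frame whose index labels are 10, 20, and 30 to show where label-based and position-based access diverge; the names `df`, `by_label`, `by_position`, and `label_zero_exists` are assumptions made for this example.

```python
import pandas as pd

# Hypothetical frame whose index labels differ from the row positions
df = pd.DataFrame({"note": ["c", "d", "e"]}, index=[10, 20, 30])

by_label = df.loc[[10, 20], "note"]   # rows with labels 10 and 20
by_position = df.iloc[[0, 1], 0]      # rows at positions 0 and 1

print(by_label.tolist())      # ['c', 'd']
print(by_position.tolist())   # ['c', 'd']

# Label 0 does not exist in this index, so loc raises KeyError
try:
    df.loc[0]
    label_zero_exists = True
except KeyError:
    label_zero_exists = False
print(label_zero_exists)      # False
```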
