Python Data Analysis 1

1.3 Viewing Columns, Rows, and Cells

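Why use Python's pandas library for data analysis instead of Excel? Excel tends to slow down, freeze, or produce errors once a workbook holds tens of thousands of rows, and a single worksheet cannot hold more than 1,048,576 rows. pandas handles data of that size comfortably as long as it fits in memory. Take it from your senior Liyu (鯉魚學長): pandas will stay popular for a long time, both while you are studying and later at work.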

pandas is a tool built on top of NumPy, created to solve data analysis tasks.

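For example:

import pandas  # plain import: the library is referred to as pandas
import pandas as pd  # aliased import: pd is a short name for the pandas library, so later calls can be written as pd.<name>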

For example:

df = pd.read_csv(r'c:\users\wly\desktop\python資料分析(活用pandas庫)\pandas_for_everyone-master\data\gapminder.tsv', sep='\t')
print(df.head())

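1. By default, the read_csv function reads comma-separated files.

2. Setting the sep parameter to \t tells read_csv that the fields are separated by tabs.

3. Calling the head() method shows only the first 5 rows of the data.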

Output:

       country continent  year  lifeExp       pop   gdpPercap
0  Afghanistan      Asia  1952   28.801   8425333  779.445314
1  Afghanistan      Asia  1957   30.332   9240934  820.853030
2  Afghanistan      Asia  1962   31.997  10267083  853.100710
3  Afghanistan      Asia  1967   34.020  11537966  836.197138
4  Afghanistan      Asia  1972   36.088  13079460  739.981106
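If you do not have the gapminder file at hand, the effect of sep can still be tried out. The following is only a minimal sketch: the text in tsv_text is a hypothetical in-memory stand-in for a tab-separated file (three rows copied from the output above), and the names tsv_text, df_tab, and df_default are made up for this illustration.

```python
import io

import pandas as pd

# Hypothetical stand-in for a small tab-separated file (not the real gapminder.tsv).
tsv_text = (
    "country\tyear\tpop\n"
    "Afghanistan\t1952\t8425333\n"
    "Afghanistan\t1957\t9240934\n"
    "Afghanistan\t1962\t10267083\n"
)

# With sep="\t", each tab-delimited field becomes its own column.
df_tab = pd.read_csv(io.StringIO(tsv_text), sep="\t")

# With the default separator (a comma), each whole line is read as one field.
df_default = pd.read_csv(io.StringIO(tsv_text))

print(df_tab.shape)      # (3, 3)
print(df_default.shape)  # (3, 1)

# head(n) returns the first n rows; without an argument it returns 5.
print(df_tab.head(2))
```

Reading the same text with the default separator gives a single column, which is usually the first sign that sep was forgotten.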

df = pd.read_csv(r'c:\users\wly\desktop\python資料分析(活用pandas庫)\pandas_for_everyone-master\data\gapminder.tsv', sep='\t')
print(type(df))

Output:

<class 'pandas.core.frame.DataFrame'>

print(df.shape)

Output:

(1704, 6)

In other words, this dataset has 1704 rows and 6 columns.
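Because shape is a plain tuple of (rows, columns), it can be unpacked directly. A small sketch, using a hypothetical frame df_small built from a few values in the output above rather than the full dataset:

```python
import pandas as pd

# Hypothetical three-row frame, used only for illustration.
df_small = pd.DataFrame({
    "country": ["Afghanistan", "Afghanistan", "Afghanistan"],
    "year": [1952, 1957, 1962],
})

# shape is an attribute, not a method, so it is written without parentheses.
n_rows, n_cols = df_small.shape
print(n_rows, n_cols)

# len() of a DataFrame is its number of rows.
print(len(df_small) == n_rows)
```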

print(df.columns)

Output:

Index(['country', 'continent', 'year', 'lifeExp', 'pop', 'gdpPercap'], dtype='object')

Here you can see that the column names are held in an Index whose dtype is object.
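The Index returned by columns behaves much like a list of names. A brief sketch on a hypothetical one-row frame (df_small and column_names are names invented for this example):

```python
import pandas as pd

# Hypothetical one-row frame, used only for illustration.
df_small = pd.DataFrame({
    "country": ["Afghanistan"],
    "continent": ["Asia"],
    "year": [1952],
})

# Convert the column Index to an ordinary Python list of names.
column_names = df_small.columns.tolist()
print(column_names)

# Membership tests work directly on the Index.
print("year" in df_small.columns)
print("gdpPercap" in df_small.columns)
```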

print(df.dtypes)

Output:

country       object
continent     object
year           int64
lifeExp      float64
pop            int64
gdpPercap    float64
dtype: object

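pandas type    Python type    Description
object         string         The most commonly used data type
int64          int            Integers
float64        float          Numbers with a decimal part
datetime64     datetime       The Python standard library includes datetime, but it is not loaded by default and must be imported before use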
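To see how Python values map onto the pandas types in the table, here is a rough sketch. The frame df_types and its measured_on column are hypothetical; they exist only to produce one column of each numeric and date type.

```python
from datetime import datetime  # standard library, but must be imported explicitly

import pandas as pd

# Hypothetical frame with one column per kind of value.
df_types = pd.DataFrame({
    "year": [1952, 1957],                 # Python int
    "lifeExp": [28.801, 30.332],          # Python float
    "measured_on": [datetime(1952, 1, 1), datetime(1957, 1, 1)],  # Python datetime
})

print(df_types.dtypes)

# The helpers in pd.api.types check a column's kind without depending on
# the exact dtype string printed by a particular pandas version.
print(pd.api.types.is_integer_dtype(df_types["year"]))
print(pd.api.types.is_float_dtype(df_types["lifeExp"]))
print(pd.api.types.is_datetime64_any_dtype(df_types["measured_on"]))
```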

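To get a single column from the data, for example the country column, select it by name. Here it is stored in a variable.

country_df = df['country']
print(country_df.head())  # show the first 5 rows
print(country_df.tail())  # show the last 5 rows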

Output:

# first 5 rows
0    Afghanistan
1    Afghanistan
2    Afghanistan
3    Afghanistan
4    Afghanistan
Name: country, dtype: object

# last 5 rows
1699    Zimbabwe
1700    Zimbabwe
1701    Zimbabwe
1702    Zimbabwe
1703    Zimbabwe
Name: country, dtype: object
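One detail worth noticing: selecting a single column with one pair of brackets gives back a pandas Series rather than a DataFrame, which is why the output ends with a Name line. A minimal sketch on a hypothetical frame (df_small and country_col are names made up here):

```python
import pandas as pd

# Hypothetical three-row frame, used only for illustration.
df_small = pd.DataFrame({
    "country": ["Afghanistan", "Afghanistan", "Afghanistan"],
    "continent": ["Asia", "Asia", "Asia"],
    "year": [1952, 1957, 1962],
})

# Single brackets with one column name return a Series.
country_col = df_small["country"]
print(type(country_col))
print(country_col.name)  # the Series keeps the column name

# head and tail accept the number of rows to show.
print(country_col.head(2))
print(country_col.tail(1))
```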

To get several columns, pass a list of column names.

subset = df[['country', 'continent', 'year']]
print(subset.head())
print(subset.tail())

Output:

       country continent  year
0  Afghanistan      Asia  1952
1  Afghanistan      Asia  1957
2  Afghanistan      Asia  1962
3  Afghanistan      Asia  1967
4  Afghanistan      Asia  1972

       country continent  year
1699  Zimbabwe    Africa  1987
1700  Zimbabwe    Africa  1992
1701  Zimbabwe    Africa  1997
1702  Zimbabwe    Africa  2002
1703  Zimbabwe    Africa  2007
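The double brackets are simply a Python list placed inside the selection brackets, and the result is always a DataFrame, even when the list has a single name. A short sketch on a hypothetical frame (df_small, subset_small, and one_col_frame are invented names):

```python
import pandas as pd

# Hypothetical three-row frame, used only for illustration.
df_small = pd.DataFrame({
    "country": ["Afghanistan", "Afghanistan", "Afghanistan"],
    "continent": ["Asia", "Asia", "Asia"],
    "year": [1952, 1957, 1962],
})

# A list of names selects those columns, in the order given.
subset_small = df_small[["year", "country"]]
print(subset_small.columns.tolist())

# A one-element list still returns a DataFrame, not a Series.
one_col_frame = df_small[["country"]]
print(type(one_col_frame))
print(one_col_frame.shape)
```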

I am using 64-bit Python 3.7 and 64-bit PyCharm 2017.1. Install whichever Python version and IDE suit your own computer.
