Python Visualization: Seaborn (Part 3)

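The choice of time-series model determines how accurate the results are, and a good visualization helps you get more out of the model. Thanks to its compatibility and interactivity, seaborn is well suited to visualizing time-series data.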

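The dataset used here was collected by a US national earthquake center and records the location and magnitude of all significant earthquakes worldwide (the date, time, location, depth, magnitude, and related data of every earthquake with a reported magnitude of 5.5 or higher since 1965).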

import pandas as pd
import numpy as np
data = pd.read_csv('earthquake.csv')
data.head()

Before calling seaborn's plot functions, derive year and month columns and keep only the earthquake records:

data['date'] = pd.to_datetime(data['date'])
data['year'] = data['date'].dt.year
data['month'] = data['date'].dt.month
data = data[data['type'] == 'earthquake']
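The preprocessing step can be tried on a few rows without the CSV file. The sketch below uses made-up sample rows and assumes the dates are stored as month/day/year strings; that format, like the sample values, is an assumption and not something the post states.

```python
import pandas as pd

# Hypothetical sample rows; the column names mirror those used in this post.
sample = pd.DataFrame({
    "date": ["01/02/1965", "03/15/1965", "12/30/2016"],
    "type": ["earthquake", "nuclear explosion", "earthquake"],
})

# Assumed format: passing it explicitly means day/month order is never guessed.
sample["date"] = pd.to_datetime(sample["date"], format="%m/%d/%Y")
sample["year"] = sample["date"].dt.year
sample["month"] = sample["date"].dt.month

# Keep only earthquake records, as in the filtering step above.
quakes = sample[sample["type"] == "earthquake"]
print(quakes[["year", "month"]])
```

If the real file uses a different date layout, only the `format` string needs to change.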

countplot

We can start with a countplot to see how many earthquakes occurred in each year between 1965 and 2016.

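import warnings
warnings.filterwarnings("ignore")
import seaborn as sns
import matplotlib.pyplot as plt
# Jupyter magic; drop this line when running as a plain script
%matplotlib inline
plt.figure(1, figsize=(12, 6))
# label every fifth year so the x axis stays readable
year = [i for i in range(1965, 2017, 5)]
idx = [i for i in range(0, 52, 5)]
# pass the column as x explicitly; recent seaborn no longer accepts it positionally
sns.countplot(x=data['year'])
plt.setp(plt.xticks(idx, year)[1], rotation=45)
plt.title('earthquake counts in history from year 1965 to year 2016')
plt.show()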
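To see what the countplot is drawing, the same numbers can be computed directly with pandas. This is a small sketch on a made-up `years` series standing in for `data['year']`; it also shows why tick positions are bar indices and not year values.

```python
import pandas as pd

# Hypothetical year column standing in for data['year'].
years = pd.Series([1965, 1965, 1966, 1970, 1970, 1970], name="year")

# countplot draws one bar per distinct value, with height equal to its frequency.
counts = years.value_counts().sort_index()
print(counts)

# Bars sit at positions 0, 1, 2, ... in sorted order, so ticks are set by position.
positions = {int(y): pos for pos, y in enumerate(counts.index)}
print(positions)
```

In this toy series 1970 sits at position 2 because 1967 to 1969 are absent. The `idx`/`year` tick mapping used in the post therefore relies on every year from 1965 to 2016 having at least one record.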

heatmap

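Next, we can build a heatmap by year and month to look at the earthquake records of the last ten years.

What characterizes a heatmap is that you define two meaningful dimensions and look at how the data is distributed across them, which makes comparisons easy.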

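# count records per (year, month)
test = data.groupby(['year', 'month'], as_index=False)['id'].count()
new = test[['year', 'month', 'id']]
# last 120 rows = most recent 10 years, assuming every month has at least one record
temp = new.iloc[-120:, :]
# pivot takes keyword arguments in current pandas
temp = temp.pivot(index='year', columns='month', values='id')
sns.heatmap(temp)
plt.show()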
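The year-by-month table behind the heatmap is easier to follow on a handful of rows. The following sketch uses made-up records (the `records` frame and its values are hypothetical) and stops at the pivoted table, which is the object that would be handed to `sns.heatmap`.

```python
import pandas as pd

# Hypothetical records: one row per event, with precomputed year and month.
records = pd.DataFrame({
    "id": range(7),
    "year": [2015, 2015, 2015, 2016, 2016, 2016, 2016],
    "month": [1, 1, 2, 1, 2, 2, 2],
})

# Count events per (year, month), then spread the months across the columns.
counts = records.groupby(["year", "month"], as_index=False)["id"].count()
table = counts.pivot(index="year", columns="month", values="id")
print(table)
```

Each cell holds the number of events for one year and month. A (year, month) pair with no events would have no row in `counts` and would appear as NaN in `table`, which is also why slicing the last 120 rows only equals ten years when no month is missing.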

timeseries

A time-series plot can also be used to examine the trend in the number of earthquakes per year.

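# count records per year
temp = data.groupby('year', as_index=False).count()
temp = temp.loc[:, ['year', 'id']]
plt.figure(1, figsize=(12, 6))
# the original call was sns.tsplot(temp.id, temp.year, color="r");
# tsplot is no longer available in current seaborn, so lineplot is used in its place
sns.lineplot(x='year', y='id', data=temp, color='r')
plt.show()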

regression

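We can fit a linear regression to the yearly earthquake counts. The two plots below show the first-order linear fit and the distribution of the residuals after fitting.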

plt.figure(figsize=(12, 6))
plt.subplot(121)
sns.regplot(x="year", y="id", data=temp, order=1)  # order defaults to 1
plt.ylabel(' ')
plt.title('regression fit of earthquake records by year,order = 1')
plt.subplot(122)
sns.residplot(x="year", y="id", data=temp)
plt.ylabel(' ')
plt.title('residual plot when using a simple regression model,order=1')
plt.show()

We can also try a second-order fit:

plt.figure(figsize=(12, 6))
plt.subplot(121)
sns.regplot(x="year", y="id", data=temp, order=2)  # second-order polynomial
plt.ylabel(' ')
plt.title('regression fit of earthquake records by year,order = 2')
plt.subplot(122)
sns.residplot(x="year", y="id", data=temp, order=2)
plt.ylabel(' ')
plt.title('residual plot when using a simple regression model,order=2')
plt.show()
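The residual plots compare the two fits visually; the same comparison can be made numerically. According to the seaborn documentation, `regplot` uses `numpy.polyfit` when `order` is greater than 1, so the sketch below does the same by hand on made-up yearly counts. The data and the helper `residual_sum_of_squares` are hypothetical and only illustrate the idea.

```python
import numpy as np

# Hypothetical yearly counts with a mild curve; not taken from the earthquake data.
year = np.arange(2000, 2010)
count = 400 + 3.0 * (year - 2000) + 1.5 * (year - 2000) ** 2

def residual_sum_of_squares(x, y, order):
    """Fit a polynomial of the given order and return the sum of squared residuals."""
    xc = x - x.mean()  # center x to keep the fit well conditioned
    coeffs = np.polyfit(xc, y, order)
    residuals = y - np.polyval(coeffs, xc)
    return float(np.sum(residuals ** 2))

rss1 = residual_sum_of_squares(year, count, order=1)
rss2 = residual_sum_of_squares(year, count, order=2)
print(rss1, rss2)
```

Because the toy data is exactly quadratic, the second-order residuals are essentially zero while the first-order fit leaves a systematic curve in its residuals. That curved pattern is what to look for in the left-over residuals of the first-order plot.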

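Alternatively, fit the depth and magnitude fields of the earthquake records against year; the code below uses a third-order polynomial.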

plt.figure(figsize=(12, 6))
plt.subplot(121)
# x_estimator averages the values within each year, so the points show the trend of the yearly mean
sns.regplot(x="year", y="depth", data=data, x_jitter=.05,
            x_estimator=np.mean, order=3)
plt.ylabel(' ')
plt.title('regression fit of depth,order = 3')
plt.subplot(122)
# same approach for magnitude: one averaged point per year
sns.regplot(x="year", y="magnitude", data=data, x_jitter=.05,
            x_estimator=np.mean, order=3)
plt.title('regression fit of magnitude,order=3')
plt.show()
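The effect of `x_estimator=np.mean` on the plotted points can be reproduced with a plain groupby. This is a minimal sketch on made-up depth values (the `records` frame is hypothetical), showing the one-point-per-year summary that replaces the raw scatter.

```python
import pandas as pd

# Hypothetical records; the values are made up for illustration.
records = pd.DataFrame({
    "year": [2014, 2014, 2015, 2015, 2015],
    "depth": [10.0, 30.0, 20.0, 40.0, 60.0],
})

# One mean per distinct year: the kind of point x_estimator=np.mean would plot.
yearly_mean = records.groupby("year")["depth"].mean()
print(yearly_mean)
```

Collapsing each year to its mean hides the spread within a year, so it is worth checking the raw scatter as well before reading much into the trend.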

