resample in pandas: resampling

In pandas, resample is a method for reprocessing an existing sample: a convenient way to resample regular time-series data and convert it to a different frequency.

The method signature is:

DataFrame.resample(rule, how=None, axis=0, fill_method=None, closed=None, label=None, convention='start', kind=None, loffset=None, limit=None, base=0)

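The parameters are explained below.

Parameters:

rule : string
    The offset string or object representing the target conversion.

axis : int, optional, default 0

closed : {'right', 'left'}
    Which side of the bin interval is closed.

label : {'right', 'left'}
    Which bin edge label to label the bucket with.

convention : {'start', 'end', 's', 'e'}

loffset : timedelta
    Adjust the resampled time labels.

base : int, default 0
    For frequencies that evenly subdivide one day, the "origin" of the aggregated intervals. For example, for a '5min' frequency, base can range from 0 through 4. Defaults to 0.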
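The effect of base is easier to see with a concrete run. The following is a minimal sketch, assuming pandas 1.1 or later, where the offset argument takes over the role of base (base itself was deprecated and later removed). The nine-point one-minute series and the 2-minute shift are illustrative choices, not part of the signature above.

```python
import pandas as pd

# Illustrative data: nine points, one per minute, starting at midnight.
index = pd.date_range("2000-01-01", periods=9, freq="min")
series = pd.Series(range(9), index=index)

# Shift the 5-minute bin edges by 2 minutes.
# Assumption: offset="2min" plays the role that base=2 played in the old signature.
shifted = series.resample("5min", offset="2min").sum()
print(shifted)
```

With a 2-minute shift the bin edges fall at 23:57, 00:02 and 00:07, so the first bin starts on the previous day and the three sums are 1, 20 and 15.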

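First, create a Series with nine data points sampled at a one-minute frequency.

import pandas as pd

# 'T' is the minute alias; newer pandas versions prefer 'min'
index = pd.date_range('1/1/2000', periods=9, freq='T')
series = pd.Series(range(9), index=index)
series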

2000-01-01 00:00:00    0
2000-01-01 00:01:00    1
2000-01-01 00:02:00    2
2000-01-01 00:03:00    3
2000-01-01 00:04:00    4
2000-01-01 00:05:00    5
2000-01-01 00:06:00    6
2000-01-01 00:07:00    7
2000-01-01 00:08:00    8
Freq: T, dtype: int64

Downsample the series into 3-minute bins and sum the values that fall into each bin.

series.resample('3T').sum()

2000-01-01 00:00:00     3
2000-01-01 00:03:00    12
2000-01-01 00:06:00    21
Freq: 3T, dtype: int64
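To make the binning explicit, here is a minimal sketch that reproduces the 3-minute sums by hand. It assumes the default settings (bins closed and labelled on the left) and uses the 'min' alias, which newer pandas releases prefer over 'T'. The name manual is illustrative only.

```python
import pandas as pd

index = pd.date_range("2000-01-01", periods=9, freq="min")
series = pd.Series(range(9), index=index)

# Built-in downsampling: left-closed, left-labelled 3-minute bins.
resampled = series.resample("3min").sum()

# Manual equivalent: round each timestamp down to the start of its bin,
# then group by that bin start and sum.
manual = series.groupby(series.index.floor("3min")).sum()

print(resampled.tolist())
print(manual.tolist())
```

Both approaches place 00:00 to 00:02 in the first bin, 00:03 to 00:05 in the second, and 00:06 to 00:08 in the third.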

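Downsample the series into 3-minute bins as before, but label each bin with its right edge instead of its left edge. Note that the value at the timestamp used as the label is not included in the bin it labels. For example, the original series has the value 3 at 2000-01-01 00:03:00, but the sum labelled 2000-01-01 00:03:00 does not include that 3 (if it did, the sum would be 6, not 3).

series.resample('3T', label='right').sum()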

2000-01-01 00:03:00     3
2000-01-01 00:06:00    12
2000-01-01 00:09:00    21
Freq: 3T, dtype: int64
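A quick way to confirm that label only renames the bins is to compare the two results side by side. This is a small sketch using the same nine-point series; the variable names left and right are illustrative.

```python
import pandas as pd

index = pd.date_range("2000-01-01", periods=9, freq="min")
series = pd.Series(range(9), index=index)

left = series.resample("3min").sum()                  # default label='left'
right = series.resample("3min", label="right").sum()  # same bins, right-edge labels

# The sums are identical; only the index moves forward by one bin width.
print(left.tolist() == right.tolist())
print((right.index - left.index).unique())
```

The bin membership is unchanged, so label affects how the result is read, not what is aggregated.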

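Downsample the series into 3-minute bins as before, but close the right side of each bin interval, so a timestamp that falls exactly on a bin edge belongs to the bin ending there.

series.resample('3T', label='right', closed='right').sum()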

2000-01-01 00:00:00     0
2000-01-01 00:03:00     6
2000-01-01 00:06:00    15
2000-01-01 00:09:00    15
Freq: 3T, dtype: int64
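The four-row result can look surprising at first, so here is a minimal sketch that rebuilds it by hand. It assumes right-closed, right-labelled bins, meaning each timestamp belongs to the bin whose right edge is that timestamp rounded up to a multiple of 3 minutes. The name manual is illustrative.

```python
import pandas as pd

index = pd.date_range("2000-01-01", periods=9, freq="min")
series = pd.Series(range(9), index=index)

closed_right = series.resample("3min", label="right", closed="right").sum()

# Manual equivalent: round each timestamp up to its right bin edge,
# then group by that edge and sum.
manual = series.groupby(series.index.ceil("3min")).sum()

print(closed_right.tolist())
print(manual.tolist())
```

The point at 00:00 sits exactly on an edge, so it forms a bin of its own, and the last bin (labelled 00:09) holds only the values at 00:07 and 00:08.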

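Upsample the series into 30-second bins. The newly created timestamps have no original value, so they are filled with NaN.

series.resample('30S').asfreq()[0:5]  # select the first 5 rows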

2000-01-01 00:00:00    0.0
2000-01-01 00:00:30    NaN
2000-01-01 00:01:00    1.0
2000-01-01 00:01:30    NaN
2000-01-01 00:02:00    2.0
Freq: 30S, dtype: float64
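Upsampling with asfreq can be thought of as reindexing onto a finer time grid. The sketch below illustrates that reading under the assumption that the grid spans from the first to the last original timestamp. The names grid and manual are illustrative.

```python
import pandas as pd

index = pd.date_range("2000-01-01", periods=9, freq="min")
series = pd.Series(range(9), index=index)

upsampled = series.resample("30s").asfreq()

# Manual equivalent: reindex onto a 30-second grid covering the same time range.
grid = pd.date_range(series.index[0], series.index[-1], freq="30s")
manual = series.reindex(grid)

print(len(upsampled))          # number of rows on the 30-second grid
print(upsampled.isna().sum())  # rows that have no original value
```

Every second row lands on a timestamp that did not exist before, which is why the result contains NaN and the dtype becomes float64.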

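Upsample the series into 30-second bins and fill the NaN values with the pad method, which carries the last known value forward.

series.resample('30S').pad()[0:5]  # newer pandas versions name this method ffill()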

2000-01-01 00:00:00    0
2000-01-01 00:00:30    0
2000-01-01 00:01:00    1
2000-01-01 00:01:30    1
2000-01-01 00:02:00    2
Freq: 30S, dtype: int64
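On recent pandas releases the same forward fill is spelled ffill. A minimal sketch, assuming a pandas version in which pad is no longer available on the resampler:

```python
import pandas as pd

index = pd.date_range("2000-01-01", periods=9, freq="min")
series = pd.Series(range(9), index=index)

# Forward fill: each new 30-second slot takes the previous known value.
filled = series.resample("30s").ffill()
print(filled.head(5))
```

Because no NaN remains after filling, the values stay integers instead of being converted to floats.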

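Upsample the series into 30-second bins and fill the NaN values with the bfill method, which takes the next known value backward.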
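A minimal sketch of the backward fill, assuming the same nine-point one-minute series used throughout:

```python
import pandas as pd

index = pd.date_range("2000-01-01", periods=9, freq="min")
series = pd.Series(range(9), index=index)

# Backward fill: each new 30-second slot takes the next known value.
filled = series.resample("30s").bfill()
print(filled.head(5))
```

Compared with the forward fill, the slot at 00:00:30 now takes the value 1 from 00:01:00 rather than the value 0 from 00:00:00.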

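A custom function can also be used to aggregate each bin:

import numpy as np

def custom_resampler(array_like):
    # Sum the values in the bin, then add 5
    return np.sum(array_like) + 5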
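The function above is only defined, not called. A minimal sketch of how it might be used, assuming it is handed to the resampler through apply and that the bins are the 3-minute bins from the earlier examples:

```python
import numpy as np
import pandas as pd


def custom_resampler(array_like):
    """Sum the values in one bin and add 5."""
    return np.sum(array_like) + 5


index = pd.date_range("2000-01-01", periods=9, freq="min")
series = pd.Series(range(9), index=index)

# apply calls custom_resampler once per 3-minute bin.
result = series.resample("3min").apply(custom_resampler)
print(result)
```

Each bin's plain sum (3, 12, 21) is increased by 5, giving 8, 17 and 26.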
