Here is an example:
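import datetime  # used to compute the difference between two dates

import pandas as pd


def datainterval(data1, data2):
    # %Y assumes four-digit years such as 2016-01-05 (lowercase %y only matches two digits)
    d1 = datetime.datetime.strptime(data1, '%Y-%m-%d')
    d2 = datetime.datetime.strptime(data2, '%Y-%m-%d')
    delta = d1 - d2
    return delta.days


def getinterval(arrlike):  # called by apply to compute the interval in days for one row
    publishedtime = arrlike['publishedtime']
    receivedtime = arrlike['receivedtime']
    # print(publishedtime.strip(), receivedtime.strip())
    days = datainterval(publishedtime.strip(), receivedtime.strip())  # strip surrounding whitespace
    return days


if __name__ == '__main__':
    filename = "ns_new.xls"
    df = pd.read_excel(filename)
    # The apply call is missing from this listing; 'interval' is an assumed column name.
    df['interval'] = df.apply(getinterval, axis=1)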
If the function takes more than one argument, the extra ones can be passed through the args=() and **kwds parameters of apply:
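import datetime  # used to compute the difference between two dates

import pandas as pd


def datainterval(data1, data2):
    # %Y assumes four-digit years such as 2016-01-05 (lowercase %y only matches two digits)
    d1 = datetime.datetime.strptime(data1, '%Y-%m-%d')
    d2 = datetime.datetime.strptime(data2, '%Y-%m-%d')
    delta = d1 - d2
    return delta.days


def getinterval_new(arrlike, before, after):  # called by apply; before and after are column names
    before_value = arrlike[before]
    after_value = arrlike[after]
    # print(before_value.strip(), after_value.strip())
    days = datainterval(after_value.strip(), before_value.strip())  # strip surrounding whitespace
    return days


if __name__ == '__main__':
    filename = "ns_new.xls"
    df = pd.read_excel(filename)
    # The start of each call is missing from this listing; 'interval' is an assumed column name.
    df['interval'] = df.apply(getinterval_new,
                              axis=1, args=('receivedtime', 'publishedtime'))  # calling style 1
    # the following call is equivalent to the one above
    df['interval'] = df.apply(getinterval_new,
                              axis=1, **{'before': 'receivedtime', 'after': 'publishedtime'})  # calling style 2
    # the following call is equivalent to the one above
    df['interval'] = df.apply(getinterval_new,
                              axis=1, before='receivedtime', after='publishedtime')  # calling style 3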
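
The listing above depends on the file ns_new.xls, so it cannot be run as it stands. Below is a minimal sketch of the same three calling styles on a small in-memory DataFrame. The sample dates are made up for illustration, and the four-digit-year format '%Y-%m-%d' is an assumption about the data.

```python
import datetime

import pandas as pd


def datainterval(data1, data2):
    """Return the number of days from data2 to data1 ('YYYY-MM-DD' strings)."""
    d1 = datetime.datetime.strptime(data1, '%Y-%m-%d')
    d2 = datetime.datetime.strptime(data2, '%Y-%m-%d')
    return (d1 - d2).days


def getinterval_new(arrlike, before, after):
    """Row-wise helper for apply: before and after are column names."""
    return datainterval(arrlike[after].strip(), arrlike[before].strip())


# Hypothetical sample rows standing in for ns_new.xls (note the stray spaces).
df = pd.DataFrame({
    'receivedtime': ['2016-01-05 ', ' 2016-02-01'],
    'publishedtime': ['2016-01-15', '2016-03-02 '],
})

# Style 1: extra positional arguments go in the args tuple.
by_args = df.apply(getinterval_new, axis=1,
                   args=('receivedtime', 'publishedtime'))

# Style 2: extra keyword arguments unpacked from a dict.
by_dict = df.apply(getinterval_new, axis=1,
                   **{'before': 'receivedtime', 'after': 'publishedtime'})

# Style 3: extra keyword arguments written out directly.
by_kwargs = df.apply(getinterval_new, axis=1,
                     before='receivedtime', after='publishedtime')

print(by_args.tolist())  # [10, 30]
```

All three calls return the same Series. With args the values are matched to the function's parameters by position, so their order matters. The keyword forms name each parameter explicitly, which is harder to get wrong.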