Tianchi Newcomer Competition: Constructing the Next-Day Purchase Feature

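# Construct the next-day purchase feature
# Import libraries
print('Constructing the next-day purchase feature')
import pandas
import numpy
from pandas import read_csv
from pandas import Series
# Read the data and set up the table
df=read_csv('d:\\sample.csv',low_memory=False)
# Drop the leftover index column and the unused 購買商品 column
df=df.drop(columns=['Unnamed: 0'])
del df['購買商品']
print(df)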
# Keep only the purchase rows (購買 == 1) from day 18 (日 == 18)
n=df.loc[df['日'].isin([18])]
n=n.loc[n['購買'].isin([1])]
n=n.drop_duplicates()
# Reset the index to 0..len(n)-1
a=numpy.arange(0,len(n['使用者標識']),1)
n.index=a
# Drop the behavior columns so that n keeps the user ID (使用者標識) and item ID (商品標識)
del n['購買']
del n['日']
del n['收藏']
del n['瀏覽']
del n['加購物車']
del n['商品分類']
del n['使用者行為']
'''m.rename(columns=,inplace=True)
n.rename(columns=,inplace=True)
'''
# Take the rows for days 14 to 18 and reset their index
group=df.loc[df['日'].isin([14,15,16,17,18])].copy()
print(group)
a=numpy.arange(0,len(group['使用者標識']),1)
group.index=a
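# Flag each row of group whose (user ID, item ID) pair was purchased on day 18
i = 0
j = 0
d = len(group['使用者標識'])
l = len(n['使用者標識'])
a = numpy.zeros(group.shape[0])
while i<d:
    j=0
    while j<l:
        if group['使用者標識'][i]==n['使用者標識'][j]:
            if group['商品標識'][i]==n['商品標識'][j]:
                a[i]=1
        j=j+1
    i=i+1
group.insert(9,'18日購買',a)  # build the array first, then insert it into the DataFrame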
print(n)
print(group)
print(sum(group['18日購買']))
group.to_csv('d:\\sample_1.3.csv')
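
The nested loop above compares every row of group with every row of n, so its cost grows with len(group) × len(n). As a possible alternative, the same 0/1 flag can be computed with a single left merge. The sketch below is only an illustration under assumptions: the helper name label_next_day_purchase and the tiny sample frame are hypothetical, and the column names 使用者標識, 商品標識, 日 and 購買 are assumed to mean what they do in the script.

```python
import pandas as pd


def label_next_day_purchase(df, target_day=18, window=(14, 15, 16, 17, 18)):
    """Flag rows whose (user, item) pair was purchased on target_day.

    Hypothetical helper; column names follow the script above.
    """
    keys = ['使用者標識', '商品標識']
    # Unique (user, item) pairs that were bought on the target day.
    bought = (
        df.loc[(df['日'] == target_day) & (df['購買'] == 1), keys]
        .drop_duplicates()
    )
    # Rows inside the observation window, with a fresh 0..n-1 index.
    group = df.loc[df['日'].isin(window)].reset_index(drop=True)
    # A left merge keeps every row of group in its original order; because
    # bought has no duplicate keys, no row of group is repeated. The
    # indicator column records whether a matching purchased pair exists.
    merged = group.merge(bought, on=keys, how='left', indicator=True)
    flag = (merged['_merge'] == 'both').astype(float).to_numpy()
    group[f'{target_day}日購買'] = flag
    return group


# Made-up sample data, for illustration only.
sample = pd.DataFrame({
    '使用者標識': [1, 1, 2, 2, 3],
    '商品標識': [10, 10, 20, 30, 10],
    '日': [17, 18, 17, 18, 13],
    '購買': [0, 1, 0, 1, 0],
})
labeled = label_next_day_purchase(sample)
print(labeled)
print(labeled['18日購買'].sum())
```

Unlike the script, this version appends the flag as the last column instead of inserting it at position 9, so the column order of the saved CSV would differ.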
