libsvm study notes

The tools directory of libsvm ships an all-in-one script, easy.py.

Out of curiosity, I read through its source code and took some notes.

import os
import sys
from subprocess import Popen, PIPE

if len(sys.argv) <= 1:
    print('usage: {0} training_file [testing_file]'.format(sys.argv[0]))
    raise SystemExit

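(1) sys.argv: passes arguments from the command line into the program; it is a plain list of strings.

sys.argv[0] --> the script itself, as it was invoked

(2) raise SystemExit --> exits the program

A message can also be supplied, in which case it is printed to stderr and the exit status is 1:

raise SystemExit('......')

(3) .format performs string formatting.

{0} marks the position where sys.argv[0] is substituted. Here that should be easy.py.

Besides positional indices, fields can also be filled by keyword or from a dictionary. .format also supports left and right alignment, precision, and number-base conversion.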

What advantage does this mapping style have over the older C-like % formatting (such as %f)?

Roughly this:

print('{0},{1},{0},{1}'.format('wyz','21'))

The mapping looks far more flexible.
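The formatting options mentioned above can be sketched in a few lines. This is only an illustration: the values 'wyz' and 21 and all variable names are made up for the demo and do not come from easy.py.

```python
# Illustrative values only; none of these names appear in easy.py.
by_position = '{0}, {1}, {0}'.format('wyz', '21')        # index 0 is reused
by_keyword = '{name} is {age}'.format(name='wyz', age=21)
by_dict = '{name} is {age}'.format(**{'name': 'wyz', 'age': 21})
aligned = '[{0:<6}] [{0:>6}]'.format('ab')               # left and right alignment
precise = '{0:.2f}'.format(3.14159)                      # two decimal places
based = '{0:b} {0:x}'.format(10)                         # binary and hexadecimal
old_style = '%s is %d' % ('wyz', 21)                     # C-like % formatting

print(by_position)
print(aligned)
print(precise, based)
```

With % formatting the arguments must appear in the order they are used, while .format lets a field be referenced by index or name as many times as needed.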

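Summary: this block checks the argument count. If sys.argv has only one element, i.e. the script was run as plain python easy.py, it prints the correct usage and exits.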
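As a variation on the check above, the usage test can be wrapped in functions so it can be tried without actually exiting the interpreter. This is a sketch, not how easy.py is written; check_usage and main are hypothetical names.

```python
def check_usage(argv):
    """Return a usage message if no training file was given, else None."""
    if len(argv) <= 1:
        return 'usage: {0} training_file [testing_file]'.format(argv[0])
    return None


def main(argv):
    """Exit with the usage message, or return the training file path."""
    message = check_usage(argv)
    if message is not None:
        # A string argument is printed to stderr and gives exit status 1.
        raise SystemExit(message)
    return argv[1]


print(check_usage(['easy.py']))
print(main(['easy.py', 'train.txt']))
```

Passing the message to SystemExit makes the failure visible in the exit status, whereas a bare raise SystemExit exits with status 0.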

is_win32 = (sys.platform == 'win32')
if not is_win32:
    svmscale_exe = "../svm-scale"
    svmtrain_exe = "../svm-train"
    svmpredict_exe = "../svm-predict"
    grid_py = "./grid.py"
    gnuplot_exe = "/usr/bin/gnuplot"
else:
    # example for windows
    svmscale_exe = r"..\windows\svm-scale.exe"
    svmtrain_exe = r"..\windows\svm-train.exe"
    svmpredict_exe = r"..\windows\svm-predict.exe"
    gnuplot_exe = r"c:\tmp\gnuplot\binary\pgnuplot.exe"
    grid_py = r".\grid.py"

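(1) sys.platform is a string naming the platform; on Windows its value is 'win32'.

(2) ./ refers to the current directory and ../ to the parent directory. This block defines the paths to the executables. Because the paths are relative, they are resolved against the current working directory, so the script expects to be run from the tools directory.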
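One way to make the platform switch easier to try out is to put the paths in a function that takes the platform string as a parameter. This is a possible restructuring, not what easy.py does; tool_paths and the dictionary keys are hypothetical, while the path strings are copied from the script.

```python
import sys


def tool_paths(platform=sys.platform):
    """Return the tool paths easy.py would pick for the given platform string."""
    if platform == 'win32':
        return {
            'svmscale': r'..\windows\svm-scale.exe',
            'svmtrain': r'..\windows\svm-train.exe',
            'svmpredict': r'..\windows\svm-predict.exe',
            'gnuplot': r'c:\tmp\gnuplot\binary\pgnuplot.exe',
            'grid': r'.\grid.py',
        }
    return {
        'svmscale': '../svm-scale',
        'svmtrain': '../svm-train',
        'svmpredict': '../svm-predict',
        'gnuplot': '/usr/bin/gnuplot',
        'grid': './grid.py',
    }


print(tool_paths('win32')['svmtrain'])
print(tool_paths('linux')['svmtrain'])
```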

assert os.path.exists(svmscale_exe),"svm-scale executable not found"
assert os.path.exists(svmtrain_exe),"svm-train executable not found"
assert os.path.exists(svmpredict_exe),"svm-predict executable not found"
assert os.path.exists(gnuplot_exe),"gnuplot executable not found"
assert os.path.exists(grid_py),"grid.py not found"
train_pathname = sys.argv[1]
assert os.path.exists(train_pathname),"training file not found"
file_name = os.path.split(train_pathname)[1]
scaled_file = file_name + ".scale"
model_file = file_name + ".model"
range_file = file_name + ".range"

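(1) os.path.exists() checks whether the given path actually exists.

(2) assert is a statement, not a function. Its form is:

assert expression [, arguments]

If expression is false, an AssertionError is raised carrying the message that follows it. Assertions are skipped when Python runs with -O.

(3) sys.argv[1] --> the data set to train on. os.path.split(train_pathname)[1] keeps only the file name, so the .scale, .model, and .range files are written to the current directory.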
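The file-name handling described above can be sketched as follows. The path data/train.txt and the helper names derived_names and require_file are made up for illustration; require_file shows an explicit check that, unlike assert, still runs under python -O.

```python
import os


def derived_names(train_pathname):
    """Return the (.scale, .model, .range) names built from the base file name."""
    file_name = os.path.split(train_pathname)[1]
    return file_name + '.scale', file_name + '.model', file_name + '.range'


def require_file(path, message):
    """Exit with message if path does not exist; otherwise return path."""
    if not os.path.exists(path):
        raise SystemExit(message)
    return path


# Hypothetical path, used only to show how the directory part is dropped.
names = derived_names('data/train.txt')
print(names)
```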

if len(sys.argv) > 2:
    test_pathname = sys.argv[2]
    file_name = os.path.split(test_pathname)[1]
    assert os.path.exists(test_pathname),"testing file not found"
    scaled_test_file = file_name + ".scale"
    predict_test_file = file_name + ".predict"

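This block checks whether a testing file name was supplied and, if so, derives the names of its scaled file and prediction output file.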

# scale the training data; the shell redirects the result into scaled_file
cmd = '{0} -s "{1}" "{2}" > "{3}"'.format(svmscale_exe, range_file, train_pathname, scaled_file)
print('scaling training data...')
Popen(cmd, shell = True, stdout = PIPE).communicate()
# run the grid search; its output is read later through the pipe f
cmd = '{0} -svmtrain "{1}" -gnuplot "{2}" "{3}"'.format(grid_py, svmtrain_exe, gnuplot_exe, scaled_file)
print('cross validation...')
f = Popen(cmd, shell = True, stdout = PIPE).stdout

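This is the hardest part to follow.

(1) cmd is a string of the form: svmscale_exe -s "range_file" "train_pathname" > "scaled_file"

(2) Popen is a class defined in the subprocess module.

Instantiating it spawns a child process. With shell=True and args given as a string, the string is run as a command line through the shell.

Popen's communicate method lets the parent process exchange data with the child. It returns a 2-tuple (stdoutdata, stderrdata) holding what was read from standard output and standard error.

Setting stdout=PIPE lets the parent receive the child's output. communicate blocks the parent until the child finishes. Since the scaling command already redirects its output to scaled_file with >, the pipe stays empty there, and communicate mainly serves to wait for the child to complete.

For details, see baby_ape's blog, which is thorough and easy to follow.

In short, this code first spawns a child process to scale the training set, then spawns another that uses grid.py and gnuplot to search for the best SVM parameters.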
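The behaviour of Popen and communicate can be tried without libsvm installed. The sketch below spawns a second Python interpreter as a stand-in child process; the command is made up for the demo, and it passes the arguments as a list without shell=True, which differs from easy.py.

```python
import sys
from subprocess import Popen, PIPE

# Hypothetical child command: another Python interpreter that prints one line.
cmd = [sys.executable, '-c', 'print("hello from child")']

child = Popen(cmd, stdout=PIPE)
# communicate() blocks until the child exits and returns (stdoutdata, stderrdata).
stdout_data, stderr_data = child.communicate()

text = stdout_data.decode().strip()   # the pipe delivers bytes
exit_code = child.returncode

print(text)
print(stderr_data)                    # None, because stderr was not piped
print(exit_code)
```

stderr_data is None here because only stdout was connected to a pipe; passing stderr=PIPE as well would capture the error stream too.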

line = ''
while True:
    last_line = line
    line = f.readline()
    if not line: break
c,g,rate = map(float,last_line.split())

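f is the stdout pipe of the grid.py child process, read like a file. c and g are the SVM parameters, and rate is the resulting cross-validation rate. The loop keeps only the last line that grid.py prints, which is the one parsed afterwards.

(1) readline() reads one whole line from a file object, including the trailing "\n". Usage: fileobject.readline()

(2) if not line: break exits the loop at the end of the output, where readline() returns an empty string. A blank line is still "\n", so it would not stop the loop.

(3) .split with no separator splits on any run of whitespace, including spaces, newlines (\n), and tabs (\t). Usage: str.split(sep=None, maxsplit=-1)

maxsplit is the maximum number of splits.

(4) map converts each piece to float, and the results are unpacked into c, g, and rate. The first argument of map, float here, can be any callable.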
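The "keep the last line, then parse it" idea can be sketched against an in-memory stream instead of a real grid.py pipe. The sample bytes below are invented for the demo and are not actual grid.py output; parse_last_line is a hypothetical helper.

```python
import io


def parse_last_line(stream):
    """Read a bytes stream to the end and parse its last line as three floats."""
    last_line = b''
    for line in stream:          # iterating yields the same lines readline() would
        last_line = line
    c, g, rate = map(float, last_line.decode().split())
    return c, g, rate


# Made-up output: one line to be ignored, then "c g rate" on the final line.
fake_output = io.BytesIO(b'some progress line\n8.0 0.125 91.5\n')
best = parse_last_line(fake_output)
print(best)
```

Decoding before splitting makes it explicit that a pipe opened by Popen yields bytes rather than text in Python 3.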

print('best c={0}, g={1} cv rate={2}'.format(c,g,rate))
cmd = '{0} -c {1} -g {2} "{3}" "{4}"'.format(svmtrain_exe,c,g,scaled_file,model_file)
print('training...')
Popen(cmd, shell = True, stdout = PIPE).communicate()
print('output model: {0}'.format(model_file))

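The training set is trained once more, this time with the best parameters, and the resulting model is written to model_file.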

if len(sys.argv) > 2:
    cmd = '{0} -r "{1}" "{2}" > "{3}"'.format(svmscale_exe, range_file, test_pathname, scaled_test_file)
    print('scaling testing data...')
    Popen(cmd, shell = True, stdout = PIPE).communicate()
    cmd = '{0} "{1}" "{2}" "{3}"'.format(svmpredict_exe, scaled_test_file, model_file, predict_test_file)
    print('testing...')
    Popen(cmd, shell = True).communicate()
    print('output prediction: {0}'.format(predict_test_file))

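If a test set was given, it is scaled with the ranges saved from the training set (-r range_file) and then predicted with the trained model.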
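The scaling commands rely on the shell's > to write their output to a file. A possible alternative, shown here only as a sketch and not as what easy.py does, is to hand an open file to subprocess so no shell is needed. run_to_file is a hypothetical helper, and a child Python interpreter stands in for svm-scale.

```python
import os
import sys
import tempfile
from subprocess import run


def run_to_file(args, out_path):
    """Run args (a list, no shell) and write the child's stdout to out_path."""
    with open(out_path, 'wb') as out:
        completed = run(args, stdout=out, check=True)
    return completed.returncode


# Hypothetical stand-in for svm-scale: a child Python that prints one line.
demo_args = [sys.executable, '-c', 'print("1 1:0.5 2:-1")']
demo_path = os.path.join(tempfile.mkdtemp(), 'demo.scale')
demo_status = run_to_file(demo_args, demo_path)

with open(demo_path) as handle:
    demo_text = handle.read().strip()
print(demo_status, demo_text)
```

Passing the arguments as a list avoids quoting problems with file names that contain spaces, and check=True raises an exception if the child exits with a non-zero status.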

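Summary: easy.py (1) checks the arguments, file paths, and so on

(2) spawns child processes to scale the data and uses grid.py to find the best parameters

(3) trains on the training set and predicts the test set with the best parameters, then writes out the results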
