A Lucene-Based Blog Search System

I have been looking at Lucene recently, so I decided to build a simple blog search system myself. It looks roughly like this:

First, let's get acquainted with Lucene. Official site:

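When learning something new, I prefer to learn how to use it first and study the internals afterwards (purely my personal view). The project provides detailed API documentation: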

Opening the core package, you can see the official demo, which gives a rough idea of how it is used:

OK, enough talk, on to the code. First we need to add the dependencies:

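<dependency>
    <groupId>org.apache.lucene</groupId>
    <artifactId>lucene-core</artifactId>
    <version>7.1.0</version>
</dependency>
<dependency>
    <groupId>org.apache.lucene</groupId>
    <artifactId>lucene-queryparser</artifactId>
    <version>7.1.0</version>
</dependency>
<dependency>
    <groupId>org.apache.lucene</groupId>
    <artifactId>lucene-analyzers-common</artifactId>
    <version>7.1.0</version>
</dependency>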

Next, create a Lucene utility class that builds the index and runs searches (the files here are plain .txt; other file types need their own handling):

public class LuceneUtil {
    // ... (the index-building method is truncated in the source)
        } catch (IOException e) {
            // ...
        }
    }

    // Run a search
    public static JSONObject doSearch(String value) {
        // ...
        try {
            // ...
                myDocument.setContent(content);
                list.add(myDocument);
            }
            jsonObject.put("doc", list);
            reader.close();
        } catch (Exception e) {
            // ...
        }
        return jsonObject;
    }

    // Convert the contents of a txt file into a string
    private static String txtToString(File file) {
        // ...
            br.close();
        } catch (Exception e) {
            // ...
        }
        return buffer.toString();
    }
}
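
A side note on the txt-to-string helper: below is a minimal sketch of the same idea written with try-with-resources, so the reader gets closed even when reading fails. It is not the original method. The class name TxtReader, the writeTempTxt demo helper, and the UTF-8 charset are all my own assumptions.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

/** Sketch only: a stand-alone variant of a txt-to-string helper. */
final class TxtReader {

    /** Reads a text file line by line into one string (UTF-8 is an assumption). */
    static String txtToString(Path file) {
        StringBuilder buffer = new StringBuilder();
        // try-with-resources closes the reader on both success and failure
        try (BufferedReader br = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            String line;
            while ((line = br.readLine()) != null) {
                buffer.append(line).append('\n');
            }
        } catch (IOException e) {
            // Surface the failure instead of silently returning partial text
            throw new UncheckedIOException(e);
        }
        return buffer.toString();
    }

    /** Hypothetical demo helper: writes text to a temporary .txt file. */
    static Path writeTempTxt(String text) {
        try {
            Path tmp = Files.createTempFile("post", ".txt");
            Files.write(tmp, text.getBytes(StandardCharsets.UTF_8));
            tmp.toFile().deleteOnExit();
            return tmp;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Every line comes back terminated by a newline, including the last one, which is harmless for indexing but worth knowing if the text is displayed as-is.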

Create a controller

@RestController
public class BlogController {
    public String upload(MultipartFile file, String title) {
        // ...
    }

    public Object doSearch(@RequestBody Map query) {
        // ...
    }
}

HTML for the home page

Write a blog post / Upload a file / Search results: records

HTML for the file upload page

Test results. Searching is quite fast.

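In full-text search, the analyzer (tokenizer) largely determines how accurate the results are. I use the smartcn analyzer here. Some commonly used Chinese analyzers: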

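Paoding: the Lucene Chinese word segmenter "庖丁解牛" (Paoding Analysis)

imdict: the intelligent Chinese word segmentation program used by the imdict intelligent dictionary

mmseg4j: a Chinese word segmenter that implements Chih-Hao Tsai's MMSEG algorithm

IK: uses its own "forward iterative finest-granularity segmentation algorithm" and a multi-sub-processor analysis mode

smartcn: derived from ICTCLAS, developed at the Chinese Academy of Sciences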
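
To give a feel for what a dictionary-based segmenter does, here is a minimal sketch of plain forward maximum matching: at each position, take the longest dictionary word that fits, and fall back to a single character otherwise. This is a generic textbook technique, not the algorithm of any analyzer listed above (MMSEG, for one, adds disambiguation rules on top of maximum matching). The class name ForwardMaxMatch and the caller-supplied dictionary are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

/** Sketch only: naive forward maximum matching over a caller-supplied dictionary. */
final class ForwardMaxMatch {
    private final Set<String> dictionary;
    private final int maxWordLength;

    ForwardMaxMatch(Set<String> dictionary) {
        this.dictionary = dictionary;
        int max = 1;
        for (String word : dictionary) {
            max = Math.max(max, word.length());
        }
        this.maxWordLength = max;
    }

    /** Splits text into dictionary words, longest match first. */
    List<String> segment(String text) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < text.length()) {
            int end = Math.min(text.length(), i + maxWordLength);
            // Shrink the candidate until it is a dictionary word or a single char.
            // Works on Java chars, so characters outside the BMP are not handled.
            while (end > i + 1 && !dictionary.contains(text.substring(i, end))) {
                end--;
            }
            tokens.add(text.substring(i, end));
            i = end;
        }
        return tokens;
    }
}
```

With a dictionary containing 部落格, 搜尋, 系統 and 搜尋系統, segmenting 部落格搜尋系統 yields 部落格 followed by 搜尋系統, because the longer entry wins over 搜尋. A real analyzer has to decide when that greedy choice is wrong, which is where the analyzers above differ.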

To highlight matches in the search results, you also need to add these dependencies:

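<dependency>
    <groupId>org.apache.lucene</groupId>
    <artifactId>lucene-analyzers-smartcn</artifactId>
    <version>7.1.0</version>
</dependency>
<dependency>
    <groupId>org.apache.lucene</groupId>
    <artifactId>lucene-highlighter</artifactId>
    <version>7.1.0</version>
</dependency>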
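
For readers who just want to see what "highlighting" produces, here is a toy string-level stand-in that wraps every occurrence of a term in a tag pair. It is not the lucene-highlighter API, which works from the analyzed query and token positions. The class name ToyHighlighter and the choice of <b> tags are assumptions for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Sketch only: wraps literal, case-insensitive matches of a term in <b> tags. */
final class ToyHighlighter {

    static String highlight(String text, String term) {
        if (term.isEmpty()) {
            return text; // nothing to highlight
        }
        // Pattern.quote treats the term as a literal, not as a regular expression
        Pattern pattern = Pattern.compile(
                Pattern.quote(term), Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE);
        Matcher matcher = pattern.matcher(text);
        StringBuffer out = new StringBuffer();
        while (matcher.find()) {
            // Keep the original casing of the matched text inside the tags
            matcher.appendReplacement(
                    out, Matcher.quoteReplacement("<b>" + matcher.group() + "</b>"));
        }
        matcher.appendTail(out);
        return out.toString();
    }
}
```

If the result is rendered as HTML, the surrounding text should be escaped first; the sketch leaves that out.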

How Lucene works: there are two key objects, IndexWriter and IndexSearcher. IndexWriter builds the index, and IndexSearcher runs queries against it.
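
The write-then-search split is easier to picture with a toy inverted index. The sketch below is not how Lucene is implemented and uses none of its classes; it only illustrates mapping each term to the documents that contain it, so a search is a lookup rather than a scan. The class name ToyIndex and the whitespace-and-punctuation tokenizer are assumptions (it would not segment Chinese text, which is what an analyzer is for).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeSet;

/** Sketch only: a toy inverted index, NOT Lucene's implementation. */
final class ToyIndex {
    // term -> sorted ids of the documents that contain the term
    private final Map<String, TreeSet<Integer>> postings = new HashMap<>();
    private int nextDocId = 0;

    /** Write side (the role IndexWriter plays): tokenize and record terms. */
    int addDocument(String text) {
        int docId = nextDocId++;
        for (String term : tokenize(text)) {
            postings.computeIfAbsent(term, k -> new TreeSet<>()).add(docId);
        }
        return docId;
    }

    /** Read side (the role IndexSearcher plays): look up one term. */
    List<Integer> search(String term) {
        TreeSet<Integer> hits = postings.get(term.toLowerCase(Locale.ROOT));
        if (hits == null) {
            return new ArrayList<>();
        }
        return new ArrayList<>(hits);
    }

    /** Hypothetical stand-in for an analyzer: lowercase, split on non-letters/digits. */
    private static List<String> tokenize(String text) {
        List<String> terms = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!t.isEmpty()) {
                terms.add(t);
            }
        }
        return terms;
    }
}
```

The same tokenizer is applied when indexing and when searching, which mirrors why the choice of analyzer matters so much for search accuracy.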
