ClickHouse usage and tips (personal notes only)

curl -s | sudo bash

sudo yum list 'clickhouse*'

sudo yum -y install 'clickhouse*'

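Data types: the version I used has no Boolean type (UInt8 with 0/1 is used instead; newer releases add Bool). Otherwise the types are basically the same as Hive's. See the official site for details.

ClickHouse data types

ClickHouse has many table engines; the most commonly used are the MergeTree family and the Distributed engine.

ClickHouse can create local tables, distributed tables, and cluster tables.

CREATE TABLE test (...) creates a local table

CREATE TABLE image_label_all AS image_label ENGINE = Distributed(distable, monchickey, image_label, rand()) creates a distributed table

CREATE TABLE test ON CLUSTER ... (...) creates a cluster table (the statement is run on every node of the named cluster)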
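The arguments of the Distributed engine are easy to mix up: cluster name, database, local table, and an optional sharding key. A minimal sketch of a helper that only prints the statement is shown here. The function name `distributed_ddl` is an assumption made for illustration, and the cluster, database, and table names are the ones from the example above.

```shell
# Print the DDL for a Distributed table that sits on top of an existing local table.
# Arguments: cluster name, database, local table, name of the new distributed table.
# distributed_ddl is a hypothetical helper, not part of ClickHouse.
distributed_ddl() {
  local cluster="$1"
  local db="$2"
  local local_table="$3"
  local dist_table="$4"
  # rand() as the sharding key spreads inserted rows randomly across the shards.
  printf 'CREATE TABLE %s AS %s ENGINE = Distributed(%s, %s, %s, rand())\n' \
    "$dist_table" "$local_table" "$cluster" "$db" "$local_table"
}

# Prints the statement from the text; nothing is sent to a server.
distributed_ddl distable monchickey image_label image_label_all
```

To actually run it, the printed statement could be passed to clickhouse-client with --query, in the database that holds image_label.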

Here is a complete CREATE TABLE statement that uses the ReplicatedMergeTree engine

CREATE TABLE m.n_mdw_pcg (
    storekey Int32,
    custkey Int32,
    cardholderkey Int32,
    pcg_main_cat_id Int32,
    pcg_main_cat_desc String,
    count Int32,
    quartly String
)
-- first argument: ZooKeeper path of the table; second: replica name (empty in the source)
ENGINE = ReplicatedMergeTree('/clickhouse/tables/m/n_mdw_pcg', '')
PARTITION BY (quartly, pcg_main_cat_id)
ORDER BY (storekey, custkey, cardholderkey)

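This ensures that the data is replicated.

Rows can be added with INSERT;

rows cannot be updated in place, and specific rows cannot be deleted (at least in the version I used);

A partition can be dropped, which deletes the data in it. From --help I saw that there is a TRUNCATE TABLE, but I have not actually used it. To delete all the data, you can drop the table, recreate it, and then check the data.

This can be done with a script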

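# Usage: <script> <database> <table>
# Empties a table by dropping it and recreating it from its own DDL.
database=$1
table=$2
echo "truncate table $table"
# Save the DDL before dropping; tr removes the backslashes the client adds when escaping its output.
create=$(clickhouse-client --database="$database" --query="show create table $table" | tr -d '\\')
clickhouse-client --database="$database" --query="drop table $table"
clickhouse-client --database="$database" --query="$create"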

After that, just import the data again.
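When only part of the data has to go, dropping a single partition avoids rebuilding the whole table. For the table above, which is partitioned by (quartly, pcg_main_cat_id), the statement could be built roughly like this. The helper name `drop_partition_sql` and the partition values '2019Q1' and 10 are assumptions for illustration only.

```shell
# Print an ALTER TABLE ... DROP PARTITION statement for a table whose partition
# key is the tuple (quartly, pcg_main_cat_id).
# Arguments: table name, quarter label (a String), category id (an Int32).
# drop_partition_sql is a hypothetical helper; the values below are made up.
drop_partition_sql() {
  local table="$1"
  local quarter="$2"
  local cat_id="$3"
  printf "ALTER TABLE %s DROP PARTITION ('%s', %s)\n" "$table" "$quarter" "$cat_id"
}

# Prints: ALTER TABLE m.n_mdw_pcg DROP PARTITION ('2019Q1', 10)
drop_partition_sql m.n_mdw_pcg 2019Q1 10
```

The printed statement can be handed to clickhouse-client with --query. Because the helper only prints, it is easy to review the statement before anything is deleted.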

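Importing data: ClickHouse supports many file formats; see the official documentation on file import and export for details.

Here are the two imports I use most often

TSV, fields separated by "\t"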

clickhouse-client -h badd52c42f08 --input_format_allow_errors_num=1 --input_format_allow_errors_ratio=0.1 --query="INSERT INTO tablename FORMAT TSV" < file

CSV, fields separated by "," or "|"

clickhouse-client -h adc3eaba589c --format_csv_delimiter="|" --query='INSERT INTO tablename FORMAT CSV' < file
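The two input_format_allow_errors settings let a load continue past a limited number of rows that fail to parse. Before loading, it can help to know how many rows are likely to fail. The sketch below is a plain awk pre-check, not a ClickHouse feature: it only counts lines whose field count differs from the table's column count. The helper name `count_bad_lines` and the sample data are assumptions for illustration.

```shell
# Count the lines of a delimited file whose number of fields differs from the
# expected column count.
# Arguments: file, field delimiter (as understood by awk -F), expected columns.
# count_bad_lines is a hypothetical helper, not part of clickhouse-client.
count_bad_lines() {
  awk -F"$2" -v n="$3" 'NF != n { bad++ } END { print bad + 0 }' "$1"
}

# Made-up sample for a 3-column table: two complete rows and one short row.
sample=$(mktemp)
printf '1\ta\tx\n2\tb\ty\n3\tc\n' > "$sample"

# Prints: 1
count_bad_lines "$sample" '\t' 3
```

For a pipe-delimited file the same helper would be called with '|' as the delimiter. It does not understand CSV quoting, so quoted fields that contain the delimiter are miscounted.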

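Querying data

For single-table queries, ClickHouse SQL is basically the same as standard SQL, and LIMIT pagination is supported. INNER JOIN, however, is written differently, and in my test an inner join of a table with more than 400 million rows against one with 20 million rows was very slow.

Comparing the two SQL statements: the INNER JOIN took nearly a minute, while the IN subquery took only about 3 seconds, so I recommend using IN wherever possible. Single-table queries in ClickHouse are very fast: a count distinct over 300 million rows takes only about 1 second.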
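The original two statements are not shown, so the following is only a sketch of the general shape of the IN form. The helper name `in_subquery_sql` and the table and column names are placeholders, not the tables that were measured.

```shell
# Print a count-distinct query that filters the large table with an IN subquery
# over the small table, instead of joining the two.
# Arguments: large table, small table, key column present in both.
# in_subquery_sql and all names passed to it are hypothetical.
in_subquery_sql() {
  local big="$1"
  local small="$2"
  local key="$3"
  printf 'SELECT count(DISTINCT %s) FROM %s WHERE %s IN (SELECT %s FROM %s)\n' \
    "$key" "$big" "$key" "$key" "$small"
}

# Prints: SELECT count(DISTINCT custkey) FROM big_table WHERE custkey IN (SELECT custkey FROM small_table)
in_subquery_sql big_table small_table custkey
```

The IN form only replaces a join when the result needs columns from the large table alone. It acts as a filter, so it neither returns columns of the small table nor repeats rows when the small table has duplicate keys.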

I have not yet looked into other tips and topics; I hope we can all learn and improve together.
