These are summary notes from revisiting earlier topics and walking through them again, combined with problems I ran into at work.
Spark needs to connect to Hive, so the Thrift Server has to be started first, with the command:
$SPARK_HOME/sbin/start-thriftserver.sh \
--hiveconf hive.server2.thrift.port=10000 \
--hiveconf hive.server2.thrift.bind.host=master \
--total-executor-cores 2 \
--master spark://master:7077
At first I left out the --total-executor-cores 2 parameter. My home cluster is only configured with 6 cores, and without this flag the default is 12 cores, so later, when I was preparing my blog example, the program I submitted would not run: the cluster had no resources left. After restarting the Thrift Server with this parameter added, everything worked.
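If you would rather not remember the flag on every restart, the same cap can also be set from inside an application. A minimal Scala sketch, assuming a standalone cluster and an illustrative app name (spark.cores.max is the property that --total-executor-cores sets under the hood in standalone mode):

import org.apache.spark.sql.SparkSession

// Cap this application at 2 cores so other jobs on a 6-core cluster still get resources
val spark = SparkSession.builder()
  .master("spark://master:7077")
  .appName("extract-demo") // illustrative name
  .config("spark.cores.max", "2") // same effect as --total-executor-cores 2
  .getOrCreate()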
For completeness, the connection steps:
$SPARK_HOME/bin/beeline
!connect jdbc:hive2://master:10000
The account defaults to your current username; the password is empty.
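Beeline aside, the same endpoint can be smoke-tested from code. A minimal sketch, assuming the org.apache.hive:hive-jdbc driver is on the classpath; the host, port, and empty password mirror the startup command above:

import java.sql.DriverManager

object ThriftServerSmokeTest {
  def main(args: Array[String]): Unit = {
    // JDBC 4 auto-registers the driver from the jar; user = current OS user, password empty
    val conn = DriverManager.getConnection(
      "jdbc:hive2://master:10000", System.getProperty("user.name"), "")
    val rs = conn.createStatement().executeQuery("show databases")
    while (rs.next()) println(rs.getString(1))
    conn.close()
  }
}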
Scenario: extracting partitioned data from Oracle fails with an error
org.apache.spark.sql.AnalysisException: Number of column aliases does not match number of columns. Number of column aliases: 1; number of columns: 18.; line 1 pos 14
at org.apache.spark.sql.catalyst.analysis.package$AnalysisErrorAt.failAnalysis(package.scala:42)
at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:70)
at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$.allowInvokingTransformsInAnalyzer(AnalysisHelper.scala:194)
at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$class.resolveOperatorsUp(AnalysisHelper.scala:86)
at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.resolveOperatorsUp(LogicalPlan.scala:29)
at org.apache.spark.sql.catalyst.trees.TreeNode.mapChildren(TreeNode.scala:327)
at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$.allowInvokingTransformsInAnalyzer(AnalysisHelper.scala:194)
at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$class.resolveOperatorsUp(AnalysisHelper.scala:86)
at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.resolveOperatorsUp(LogicalPlan.scala:29)
at org.apache.spark.sql.catalyst.trees.TreeNode.mapChildren(TreeNode.scala:327)
at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$.allowInvokingTransformsInAnalyzer(AnalysisHelper.scala:194)
at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$class.resolveOperatorsUp(AnalysisHelper.scala:86)
at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.resolveOperatorsUp(LogicalPlan.scala:29)
at scala.collection.LinearSeqOptimized$class.foldLeft(LinearSeqOptimized.scala:124)
at scala.collection.immutable.List.foldLeft(List.scala:84)
at scala.collection.immutable.List.foreach(List.scala:381)
at org.apache.spark.sql.catalyst.rules.RuleExecutor.execute(RuleExecutor.scala:76)
at org.apache.spark.sql.catalyst.analysis.Analyzer.org$apache$spark$sql$catalyst$analysis$Analyzer$$executeSameContext(Analyzer.scala:127)
at org.apache.spark.sql.catalyst.analysis.Analyzer.execute(Analyzer.scala:121)
at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$.markInAnalyzer(AnalysisHelper.scala:201)
at org.apache.spark.sql.catalyst.analysis.Analyzer.executeAndCheck(Analyzer.scala:105)
at org.apache.spark.sql.execution.QueryExecution.analyzed$lzycompute(QueryExecution.scala:57)
at org.apache.spark.sql.execution.QueryExecution.analyzed(QueryExecution.scala:55)
at org.apache.spark.sql.execution.QueryExecution.assertAnalyzed(QueryExecution.scala:47)
at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:78)
at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:642)
at com.wanwang.extract.GisCalDailytripFreqExtract$.main(GisCalDailytripFreqExtract.scala:28)
at com.wanwang.extract.GisCalDailytripFreqExtract.main(GisCalDailytripFreqExtract.scala)
The code in question was as follows:
val gisCalDailytripFreq = spark.read.jdbc(sqlUrl, "gis_cal_dailytrip_freq", properties)
gisCalDailytripFreq.createTempView("gisCalDailytripFreq")
val res = spark.sql(s"select * from gisCalDailytripFreq partition(p_$yesterdayNoFormat) where rownum<=100") // for testing
Suspected cause: once the data is registered as a temporary view, there is no such thing as a partition any more. Spark SQL has no PARTITION clause in a SELECT, so it seems to parse partition(p_...) as a table alias named partition with a one-entry column alias list, which would explain why the error complains about 1 column alias versus 18 columns.
Solution: specify the query when the connection is made, so the partition is resolved by Oracle itself:
import org.apache.spark.sql.DataFrame

val jdbcDF: DataFrame = spark.read
  .format("jdbc")
  .option("url", sqlUrl)
  // .option("query", s"select * from gis_cal_dailytrip_freq partition(p_$yesterdayNoFormat) where rownum<=100") // for testing
  // The query is pushed down to Oracle, so the PARTITION clause is handled there
  .option("query", s"select * from gis_cal_dailytrip_freq partition(p_$yesterdayNoFormat)")
  .option("user", user)
  .option("password", password)
  .load()
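For older Spark versions (the query option only appeared around Spark 2.4, if memory serves), the same push-down works through dbtable with an aliased subquery, which Oracle evaluates before Spark sees a single row. A sketch reusing the variables above:

// dbtable accepts anything valid in a FROM clause, so an aliased subquery
// lets Oracle resolve the PARTITION clause on its own side
val jdbcDF2: DataFrame = spark.read
  .format("jdbc")
  .option("url", sqlUrl)
  .option("dbtable", s"(select * from gis_cal_dailytrip_freq partition(p_$yesterdayNoFormat)) t")
  .option("user", user)
  .option("password", password)
  .load()

Either way the partition pruning happens inside Oracle, which also keeps the transferred data down to a single day's worth.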