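1. Job chaining
Several MapReduce jobs can be created up front and then run one after another, with each job starting only after the previous one has finished.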
Old API:
// Create a new JobConf
JobConf job = new JobConf(new Configuration(), MyJob.class);
// Specify various job-specific parameters
job.setJobName("myjob");
// Paths are set through the old-API (org.apache.hadoop.mapred) FileInputFormat/FileOutputFormat helpers
FileInputFormat.setInputPaths(job, new Path("in"));
FileOutputFormat.setOutputPath(job, new Path("out"));
job.setReducerClass(MyJob.MyReducer.class);
// Submit the job, then poll for progress until the job is complete
JobClient.runJob(job);
New API:
// Create a new Job
Job job = new Job(new Configuration());
job.setJarByClass(MyJob.class);
// Specify various job-specific parameters
job.setJobName("myjob");
// Paths are set through the new-API (org.apache.hadoop.mapreduce.lib) FileInputFormat/FileOutputFormat helpers
FileInputFormat.addInputPath(job, new Path("in"));
FileOutputFormat.setOutputPath(job, new Path("out"));
job.setReducerClass(MyJob.MyReducer.class);
// Submit the job, then poll for progress until the job is complete
job.waitForCompletion(true);
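The chaining idea itself can be sketched without Hadoop: run the steps in order and stop at the first failure. In the sketch below, the class JobChainSketch and the method runChain are hypothetical names, and each BooleanSupplier is only a stand-in for a call such as job.waitForCompletion(true); it is not part of any Hadoop API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BooleanSupplier;

// Plain-Java stand-in for a job chain; no Hadoop classes are used.
class JobChainSketch {

    // Runs the steps in order and stops at the first one that reports failure.
    // Each step plays the role of a call such as job.waitForCompletion(true).
    // Returns the number of steps that completed successfully.
    static int runChain(List<BooleanSupplier> steps) {
        int completed = 0;
        for (BooleanSupplier step : steps) {
            if (!step.getAsBoolean()) {
                break; // a later job must not read output that was never produced
            }
            completed++;
        }
        return completed;
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        List<BooleanSupplier> steps = new ArrayList<>();
        steps.add(() -> { log.add("job1"); return true; });
        steps.add(() -> { log.add("job2"); return false; }); // simulated failure
        steps.add(() -> { log.add("job3"); return true; });
        System.out.println(runChain(steps) + " " + log);
    }
}
```

In a real driver, the same shape is usually written as a series of checks on the boolean returned by waitForCompletion, exiting as soon as one job fails.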
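2. Job graphs
A job graph handles dependencies between jobs. A job may depend on several others, so the jobs together form a directed acyclic graph (DAG).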
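Old API:
// Job here is org.apache.hadoop.mapred.jobcontrol.Job
Job job1 = new Job(new JobConf());
Job job2 = new Job(new JobConf());
Job job3 = new Job(new JobConf());
// job3 starts only after job1 and job2 have completed
job3.addDependingJob(job1);
job3.addDependingJob(job2);
JobControl jobControl = new JobControl("controlGroupName");
jobControl.addJob(job1);
jobControl.addJob(job2);
jobControl.addJob(job3);
// run() keeps looping until stop() is called, so it is often started on its own thread
jobControl.run();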
New API:
// Assume job 3 depends on job 1 and job 2
Configuration jobConf1 = new Configuration();
// jobConf1 settings go here
Configuration jobConf2 = new Configuration();
// jobConf2 settings go here
Configuration jobConf3 = new Configuration();
// jobConf3 settings go here
ControlledJob cjob1 = new ControlledJob(jobConf1);
ControlledJob cjob2 = new ControlledJob(jobConf2);
ControlledJob cjob3 = new ControlledJob(jobConf3);
cjob3.addDependingJob(cjob1);
cjob3.addDependingJob(cjob2);
JobControl jobControl = new JobControl("controlGroupName");
jobControl.addJob(cjob1);
jobControl.addJob(cjob2);
jobControl.addJob(cjob3);
jobControl.run();
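To make the dependency rule concrete, here is a minimal plain-Java sketch of how an execution order can be derived from such a graph. The class JobGraphSketch and its methods are hypothetical and only borrow the addJob/addDependingJob naming for readability; this is not how Hadoop's JobControl is implemented, just one simple way to respect the same constraint.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java illustration of dependency ordering; no Hadoop classes are used.
class JobGraphSketch {
    // Job name -> names of the jobs it depends on.
    private final Map<String, List<String>> deps = new LinkedHashMap<>();

    void addJob(String name) {
        deps.putIfAbsent(name, new ArrayList<>());
    }

    void addDependingJob(String job, String dependsOn) {
        addJob(job);
        addJob(dependsOn);
        deps.get(job).add(dependsOn);
    }

    // Repeatedly picks every job whose dependencies have all finished.
    List<String> executionOrder() {
        List<String> done = new ArrayList<>();
        while (done.size() < deps.size()) {
            boolean progressed = false;
            for (Map.Entry<String, List<String>> e : deps.entrySet()) {
                if (!done.contains(e.getKey()) && done.containsAll(e.getValue())) {
                    done.add(e.getKey());
                    progressed = true;
                }
            }
            if (!progressed) {
                // No job became runnable, so the graph is not acyclic.
                throw new IllegalStateException("dependency cycle detected");
            }
        }
        return done;
    }

    public static void main(String[] args) {
        JobGraphSketch g = new JobGraphSketch();
        g.addDependingJob("job3", "job1");
        g.addDependingJob("job3", "job2");
        System.out.println(g.executionOrder()); // prints [job1, job2, job3]
    }
}
```

The cycle check shows why the graph has to be acyclic: if two jobs wait on each other, neither ever becomes runnable.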
3. Map/Reduce chains
Old API:
public class ChainTest
class Reducer1 implements Reducer
}
New API:
public class ChainTest
class Reducer1 extends Reducer
}
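The idea behind a map/reduce chain is that several map steps are piped together within one job, with a single reduce step. Hadoop provides the ChainMapper and ChainReducer classes for this; the sketch below does not use them and only illustrates the data flow in plain Java. The class MapReduceChainSketch, the method run, and the choice of trim and lower-case as mappers are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Function;

// Plain-Java illustration of chained map steps followed by one reduce step.
class MapReduceChainSketch {

    // Pipes each record through every mapper in order, then counts the results per key.
    static Map<String, Integer> run(List<String> records,
                                    List<Function<String, String>> mappers) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String record : records) {
            String value = record;
            for (Function<String, String> mapper : mappers) {
                value = mapper.apply(value); // output of one mapper feeds the next
            }
            counts.merge(value, 1, Integer::sum); // the single reduce step
        }
        return counts;
    }

    public static void main(String[] args) {
        List<Function<String, String>> mappers = new ArrayList<>();
        mappers.add(String::trim);
        mappers.add(String::toLowerCase);
        // prints {hadoop=2, oozie=1}
        System.out.println(run(Arrays.asList(" Hadoop", "hadoop ", "Oozie"), mappers));
    }
}
```

Chaining mappers this way avoids writing intermediate results between separate jobs, which is the main reason to prefer a chain over several small jobs.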
4. Complex workflows may need an external MapReduce workflow tool, such as Oozie.