Jobs with Multiple Mappers and Reducers

2022-05-06 05:30:10

@(hadoop)

For a complex MapReduce task, a single map and a single reduce are often not enough. The job may need n maps before the reduce, and then m more maps after the reduce.

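Usage pattern:

Job job = Job.getInstance(conf);

// Configuration for mapper A; if it needs nothing special, pass null or share a single conf.
// AMap and BMap stand for your own Mapper classes.
Configuration mapAConf = new Configuration(false);
ChainMapper.addMapper(job, AMap.class, LongWritable.class, Text.class,
    Text.class, Text.class, mapAConf);

Configuration mapBConf = new Configuration(false);
ChainMapper.addMapper(job, BMap.class, Text.class, Text.class,
    LongWritable.class, Text.class, mapBConf);

job.waitForCompletion(true);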

addMapper definition:

public static void addMapper(Job job,
    Class<? extends Mapper> klass,
    Class<?> inputKeyClass,
    Class<?> inputValueClass,
    Class<?> outputKeyClass,
    Class<?> outputValueClass,
    Configuration mapperConf)
    throws IOException

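Job job = Job.getInstance(conf);

// XReduce, CMap and DMap stand for your own Reducer and Mapper classes.
Configuration reduceConf = new Configuration(false);
ChainReducer.setReducer(job, XReduce.class, LongWritable.class, Text.class,
    Text.class, Text.class, reduceConf);

// Mappers added through ChainReducer run after the reducer.
ChainReducer.addMapper(job, CMap.class, Text.class, Text.class,
    LongWritable.class, Text.class, null);

ChainReducer.addMapper(job, DMap.class, LongWritable.class, Text.class,
    LongWritable.class, LongWritable.class, null);

job.waitForCompletion(true);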

setReducer definition:

public static void setReducer(Job job,
    Class<? extends Reducer> klass,
    Class<?> inputKeyClass,
    Class<?> inputValueClass,
    Class<?> outputKeyClass,
    Class<?> outputValueClass,
    Configuration reducerConf)
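To make the data flow concrete, here is a plain-Java model of a chain: maps, then one reduce, then more maps. It uses only the JDK and none of the Hadoop classes; `ChainModel`, `mapStage`, `reduceStage`, `pair` and the word-count stages are all invented for illustration. What it is meant to show is that each stage consumes exactly the key/value types the previous stage emits, which is the constraint the input and output class arguments express.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Map.Entry;
import java.util.TreeMap;
import java.util.function.BiFunction;
import java.util.function.Function;

// Illustrative model only: no Hadoop classes are involved.
final class ChainModel {

    static <K, V> Entry<K, V> pair(K key, V value) {
        return new SimpleEntry<>(key, value);
    }

    // Apply one "mapper" to every record; a mapper may emit zero or more records.
    static <K1, V1, K2, V2> List<Entry<K2, V2>> mapStage(
            List<Entry<K1, V1>> input,
            Function<Entry<K1, V1>, List<Entry<K2, V2>>> mapper) {
        List<Entry<K2, V2>> out = new ArrayList<>();
        for (Entry<K1, V1> record : input) {
            out.addAll(mapper.apply(record));
        }
        return out;
    }

    // Group values by key in sorted key order, then apply one "reducer" per key.
    static <K extends Comparable<K>, V, K2, V2> List<Entry<K2, V2>> reduceStage(
            List<Entry<K, V>> input,
            BiFunction<K, List<V>, Entry<K2, V2>> reducer) {
        Map<K, List<V>> groups = new TreeMap<>();
        for (Entry<K, V> record : input) {
            groups.computeIfAbsent(record.getKey(), k -> new ArrayList<>())
                  .add(record.getValue());
        }
        List<Entry<K2, V2>> out = new ArrayList<>();
        for (Entry<K, List<V>> group : groups.entrySet()) {
            out.add(reducer.apply(group.getKey(), group.getValue()));
        }
        return out;
    }

    public static List<Entry<String, Integer>> run(List<String> lines) {
        // Input records: (line offset, line text).
        List<Entry<Long, String>> input = new ArrayList<>();
        long offset = 0;
        for (String line : lines) {
            input.add(pair(offset, line));
            offset += line.length() + 1;
        }
        // Map 1: (offset, line) -> (word, 1)
        List<Entry<String, Integer>> words = mapStage(input, rec -> {
            List<Entry<String, Integer>> emitted = new ArrayList<>();
            for (String w : rec.getValue().split("\\s+")) {
                if (!w.isEmpty()) {
                    emitted.add(pair(w, 1));
                }
            }
            return emitted;
        });
        // Map 2: (word, 1) -> (lower-cased word, 1); its input types equal Map 1's output types.
        List<Entry<String, Integer>> lower = mapStage(words, rec -> {
            List<Entry<String, Integer>> emitted = new ArrayList<>();
            emitted.add(pair(rec.getKey().toLowerCase(Locale.ROOT), rec.getValue()));
            return emitted;
        });
        // Reduce: (word, [1, 1, ...]) -> (word, count)
        List<Entry<String, Integer>> counts =
                reduceStage(lower, (word, ones) -> pair(word, ones.size()));
        // Map after the reduce: keep only words seen more than once.
        return mapStage(counts, rec -> {
            List<Entry<String, Integer>> emitted = new ArrayList<>();
            if (rec.getValue() > 1) {
                emitted.add(rec);
            }
            return emitted;
        });
    }

    public static void main(String[] args) {
        System.out.println(run(List.of("Hadoop chain", "hadoop job chain", "one")));
    }
}
```

In a real chained job the stages would be `Mapper` and `Reducer` subclasses registered through `ChainMapper` and `ChainReducer`; the model only mirrors the order of the stages and the way the types line up.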

Testing with a demo program yielded two useful conclusions:

Example:

Job newJob = Job.getInstance(conf, jobName + "-sort");
newJob.setJarByClass(jarClass);

// Read the part files written by the previous job
FileInputFormat.setInputPaths(newJob, new Path(outPath + "/part-*"));
newJob.setInputFormatClass(TextInputFormat.class);

newJob.setMapOutputKeyClass(SortKey.class);
newJob.setMapOutputValueClass(NullWritable.class);

FileOutputFormat.setOutputPath(newJob, new Path(outPath + "/sort"));
newJob.setOutputFormatClass(TextOutputFormat.class);

newJob.waitForCompletion(true);
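The example uses `SortKey` as the map output key but does not show the class. The following is a guess at what such a key might look like, not the author's code: the fields (`count`, `word`), the sort order and the `roundTrip` helper are all assumptions. To stay runnable with the JDK alone it implements only `Comparable`; in an actual job the class would declare `implements WritableComparable<SortKey>` from `org.apache.hadoop.io`, whose `write`, `readFields` and `compareTo` methods have the shapes used here.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Objects;

// Hypothetical composite key: higher counts sort first, ties are broken by word.
final class SortKey implements Comparable<SortKey> {
    private long count;
    private String word = "";

    // Keys are created reflectively and then filled by readFields, so keep a no-arg constructor.
    public SortKey() {}

    public SortKey(long count, String word) {
        this.count = count;
        this.word = word;
    }

    // Serialize the fields in a fixed order.
    public void write(DataOutput out) throws IOException {
        out.writeLong(count);
        out.writeUTF(word);
    }

    // Read the fields back in the same order they were written.
    public void readFields(DataInput in) throws IOException {
        count = in.readLong();
        word = in.readUTF();
    }

    @Override
    public int compareTo(SortKey other) {
        int byCount = Long.compare(other.count, count); // descending by count
        return byCount != 0 ? byCount : word.compareTo(other.word);
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof SortKey
                && ((SortKey) o).count == count
                && ((SortKey) o).word.equals(word);
    }

    @Override
    public int hashCode() {
        return Objects.hash(count, word);
    }

    // With a NullWritable value, a text output line is just the key's toString().
    @Override
    public String toString() {
        return word + "\t" + count;
    }

    // Helper for the demo: serialize a key and read it back into a fresh instance.
    public static SortKey roundTrip(SortKey key) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            key.write(new DataOutputStream(buffer));
            SortKey copy = new SortKey();
            copy.readFields(new DataInputStream(new ByteArrayInputStream(buffer.toByteArray())));
            return copy;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(new SortKey(3, "hadoop")));
    }
}
```

Putting the whole record into the key and using `NullWritable` as the value lets the shuffle do the sorting, which is presumably why the example sets those two classes as the map output types.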
