java.lang.Object
org.apache.hadoop.conf.Configured
datafu.hourglass.jobs.AbstractJob
datafu.hourglass.jobs.TimeBasedJob
datafu.hourglass.jobs.IncrementalJob
datafu.hourglass.jobs.AbstractPartitionCollapsingIncrementalJob
datafu.hourglass.jobs.PartitionCollapsingIncrementalJob
public class PartitionCollapsingIncrementalJob
A concrete version of AbstractPartitionCollapsingIncrementalJob.
This provides an alternative to extending AbstractPartitionCollapsingIncrementalJob.
Rather than extending that class and implementing its abstract methods, this concrete version
can be used directly. Getters and setters are provided in place of the abstract methods.
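The two approaches can be contrasted with a minimal, self-contained sketch. It does not use the Hourglass types: `AbstractJobSketch`, `ConcreteJobSketch`, and `SetterConfigDemo` are hypothetical stand-ins, and `java.util.function.Function` takes the place of the mapper. It only illustrates the pattern of an abstract getter backed by a field and a setter.

```java
import java.util.function.Function;

// Hypothetical stand-in for an abstract job: subclasses must supply the mapper.
abstract class AbstractJobSketch {
    protected abstract Function<String, Integer> getMapper();

    // Applies whatever mapper the concrete class provides.
    int run(String input) {
        Function<String, Integer> mapper = getMapper();
        if (mapper == null) {
            throw new IllegalStateException("mapper has not been set");
        }
        return mapper.apply(input);
    }
}

// Hypothetical concrete version: the abstract getter returns a field
// that callers populate through a setter.
class ConcreteJobSketch extends AbstractJobSketch {
    private Function<String, Integer> mapper;

    public void setMapper(Function<String, Integer> mapper) {
        this.mapper = mapper;
    }

    @Override
    protected Function<String, Integer> getMapper() {
        return mapper;
    }
}

class SetterConfigDemo {
    public static void main(String[] args) {
        // Option 1: extend the abstract class and implement the method.
        AbstractJobSketch subclassed = new AbstractJobSketch() {
            @Override
            protected Function<String, Integer> getMapper() {
                return String::length;
            }
        };

        // Option 2: configure the concrete class through its setter.
        ConcreteJobSketch configured = new ConcreteJobSketch();
        configured.setMapper(String::length);

        // Both jobs apply the same mapping logic.
        System.out.println(subclassed.run("hourglass"));
        System.out.println(configured.run("hourglass"));
    }
}
```

The setter-based form may be convenient when the job is assembled at runtime, since no dedicated subclass is needed for each configuration.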
Nested Class Summary |
---|
Nested classes/interfaces inherited from class datafu.hourglass.jobs.AbstractPartitionCollapsingIncrementalJob |
---|
AbstractPartitionCollapsingIncrementalJob.Report |
Field Summary |
---|
Fields inherited from class datafu.hourglass.jobs.AbstractPartitionCollapsingIncrementalJob |
---|
_reusePreviousOutput |
Constructor Summary | |
---|---|
PartitionCollapsingIncrementalJob(java.lang.Class cls)
Initializes the job. |
Method Summary | |
---|---|
void | config(org.apache.hadoop.conf.Configuration conf) - Overridden to provide custom configuration before the job starts. |
Accumulator<org.apache.avro.generic.GenericRecord,org.apache.avro.generic.GenericRecord> | getCombinerAccumulator() - Gets the accumulator used for the combiner. |
protected org.apache.avro.Schema | getIntermediateValueSchema() - Gets the Avro schema for the intermediate value. |
protected org.apache.avro.Schema | getKeySchema() - Gets the Avro schema for the key. |
Mapper<org.apache.avro.generic.GenericRecord,org.apache.avro.generic.GenericRecord,org.apache.avro.generic.GenericRecord> | getMapper() - Gets the mapper. |
Merger<org.apache.avro.generic.GenericRecord> | getOldRecordMerger() - Gets the record merger that is capable of unmerging old partial output from the new output. |
protected org.apache.avro.Schema | getOutputValueSchema() - Gets the Avro schema for the output data. |
Merger<org.apache.avro.generic.GenericRecord> | getRecordMerger() - Gets the record merger that is capable of merging previous output with a new partial output. |
Accumulator<org.apache.avro.generic.GenericRecord,org.apache.avro.generic.GenericRecord> | getReducerAccumulator() - Gets the accumulator used for the reducer. |
void | setCombinerAccumulator(Accumulator<org.apache.avro.generic.GenericRecord,org.apache.avro.generic.GenericRecord> combiner) - Sets the accumulator for the combiner. |
void | setIntermediateValueSchema(org.apache.avro.Schema intermediateValueSchema) - Sets the Avro schema for the intermediate value. |
void | setKeySchema(org.apache.avro.Schema keySchema) - Sets the Avro schema for the key. |
void | setMapper(Mapper<org.apache.avro.generic.GenericRecord,org.apache.avro.generic.GenericRecord,org.apache.avro.generic.GenericRecord> mapper) - Sets the mapper. |
void | setMerger(Merger<org.apache.avro.generic.GenericRecord> merger) - Sets the record merger that is capable of merging previous output with a new partial output. |
void | setOldMerger(Merger<org.apache.avro.generic.GenericRecord> oldMerger) - Sets the record merger that is capable of unmerging old partial output from the new output. |
void | setOnSetup(Setup setup) - Sets the callback that provides custom configuration before the job begins execution. |
void | setOutputValueSchema(org.apache.avro.Schema outputValueSchema) - Sets the Avro schema for the output data. |
void | setReducerAccumulator(Accumulator<org.apache.avro.generic.GenericRecord,org.apache.avro.generic.GenericRecord> reducer) - Sets the accumulator for the reducer. |
Methods inherited from class datafu.hourglass.jobs.AbstractPartitionCollapsingIncrementalJob |
---|
getOutputSchemaName, getOutputSchemaNamespace, getReports, getReusePreviousOutput, initialize, run, setProperties, setReusePreviousOutput |
Methods inherited from class datafu.hourglass.jobs.IncrementalJob |
---|
getMaxIterations, getMaxToProcess, getSchemas, isFailOnMissing, setFailOnMissing, setMaxIterations, setMaxToProcess |
Methods inherited from class datafu.hourglass.jobs.TimeBasedJob |
---|
getDaysAgo, getEndDate, getNumDays, getStartDate, setDaysAgo, setEndDate, setNumDays, setStartDate, validate |
Methods inherited from class datafu.hourglass.jobs.AbstractJob |
---|
createRandomTempPath, ensurePath, getCountersParentPath, getFileSystem, getInputPaths, getName, getNumReducers, getOutputPath, getProperties, getRetentionCount, getTempPath, isUseCombiner, randomTempPath, setCountersParentPath, setInputPaths, setName, setNumReducers, setOutputPath, setRetentionCount, setTempPath, setUseCombiner |
Methods inherited from class org.apache.hadoop.conf.Configured |
---|
getConf, setConf |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Constructor Detail |
---|
public PartitionCollapsingIncrementalJob(java.lang.Class cls) throws java.io.IOException
Initializes the job.
Parameters: cls - class to base job name on
Throws: java.io.IOException
Method Detail |
---|
public Mapper<org.apache.avro.generic.GenericRecord,org.apache.avro.generic.GenericRecord,org.apache.avro.generic.GenericRecord> getMapper()
Description copied from class: AbstractPartitionCollapsingIncrementalJob
Gets the mapper.
Specified by: getMapper in class AbstractPartitionCollapsingIncrementalJob
public Accumulator<org.apache.avro.generic.GenericRecord,org.apache.avro.generic.GenericRecord> getCombinerAccumulator()
Description copied from class: AbstractPartitionCollapsingIncrementalJob
Gets the accumulator used for the combiner.
Specified by: getCombinerAccumulator in class AbstractPartitionCollapsingIncrementalJob
public Accumulator<org.apache.avro.generic.GenericRecord,org.apache.avro.generic.GenericRecord> getReducerAccumulator()
Description copied from class: AbstractPartitionCollapsingIncrementalJob
Gets the accumulator used for the reducer.
Specified by: getReducerAccumulator in class AbstractPartitionCollapsingIncrementalJob
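As a rough sketch of how a combiner accumulator and a reducer accumulator can relate, assume an accumulator consumes values one at a time and then yields a final result. The interface `AccumulatorSketch`, its method names, and the classes `SumAccumulator` and `AccumulatorDemo` are assumptions made for this illustration rather than the Hourglass `Accumulator` API, and `Long` stands in for `GenericRecord`.

```java
// Hypothetical stand-in for an accumulator; method names are assumptions.
interface AccumulatorSketch<In, Out> {
    void accumulate(In value); // consume one input value
    Out getFinal();            // produce the accumulated result
    void cleanup();            // reset state before the next key
}

// Sums its inputs; usable as both the combiner and the reducer here
// because addition is associative.
class SumAccumulator implements AccumulatorSketch<Long, Long> {
    private long sum;

    @Override
    public void accumulate(Long value) {
        sum += value;
    }

    @Override
    public Long getFinal() {
        return sum;
    }

    @Override
    public void cleanup() {
        sum = 0;
    }
}

class AccumulatorDemo {
    // Resets the accumulator, feeds it all values, and returns the result.
    static long total(AccumulatorSketch<Long, Long> acc, long... values) {
        acc.cleanup();
        for (long v : values) {
            acc.accumulate(v);
        }
        return acc.getFinal();
    }

    public static void main(String[] args) {
        SumAccumulator combiner = new SumAccumulator();
        SumAccumulator reducer = new SumAccumulator();

        // Combiner stage: collapse map output values into intermediate values.
        long partialA = total(combiner, 1, 2, 3);
        long partialB = total(combiner, 4, 5);

        // Reducer stage: collapse intermediate values into the output value.
        long output = total(reducer, partialA, partialB);
        System.out.println(output);
    }
}
```

In this sketch the combiner and reducer share one implementation. When the intermediate and output value types differ, they would presumably need separate accumulators.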
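protected org.apache.avro.Schema getKeySchema()
Description copied from class: IncrementalJob
Gets the Avro schema for the key. This is also used as the key for the map output.
Specified by: getKeySchema in class IncrementalJob
protected org.apache.avro.Schema getIntermediateValueSchema()
Description copied from class: IncrementalJob
Gets the Avro schema for the intermediate value. This is also used for the value for the map output.
Specified by: getIntermediateValueSchema in class IncrementalJob
protected org.apache.avro.Schema getOutputValueSchema()
Description copied from class: IncrementalJob
Gets the Avro schema for the output data.
Specified by: getOutputValueSchema in class IncrementalJob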
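public Merger<org.apache.avro.generic.GenericRecord> getRecordMerger()
Description copied from class: AbstractPartitionCollapsingIncrementalJob
Gets the record merger that is capable of merging previous output with a new partial output.
Specified by: getRecordMerger in class AbstractPartitionCollapsingIncrementalJob
public Merger<org.apache.avro.generic.GenericRecord> getOldRecordMerger()
Description copied from class: AbstractPartitionCollapsingIncrementalJob
Gets the record merger that is capable of unmerging old partial output from the new output.
Specified by: getOldRecordMerger in class AbstractPartitionCollapsingIncrementalJob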
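The roles of the two mergers can be illustrated with a small sketch for a windowed sum. It assumes the output is a simple count, so merging is addition and unmerging is subtraction. The class `WindowMergeSketch`, its `slide` helper, and the use of `BinaryOperator<Long>` in place of `Merger<GenericRecord>` are hypothetical and chosen only for illustration.

```java
import java.util.function.BinaryOperator;

class WindowMergeSketch {
    // Stand-in for merging previous output with a new partial output.
    static final BinaryOperator<Long> MERGE =
        (previous, newPartial) -> previous + newPartial;

    // Stand-in for unmerging old partial output from the new output.
    static final BinaryOperator<Long> UNMERGE =
        (current, oldPartial) -> current - oldPartial;

    // Advances the window by one partition: add the newest partial output,
    // then remove the partial output that fell out of the window.
    static long slide(long previousOutput, long newestPartial, long oldestPartial) {
        long withNew = MERGE.apply(previousOutput, newestPartial);
        return UNMERGE.apply(withNew, oldestPartial);
    }

    public static void main(String[] args) {
        // Hypothetical per-day partial outputs.
        long[] dailyCounts = {5, 3, 8, 2};

        // Previous output covers days 0..2.
        long previous = dailyCounts[0] + dailyCounts[1] + dailyCounts[2];

        // New output covers days 1..3, computed without re-reading days 1 and 2.
        long next = slide(previous, dailyCounts[3], dailyCounts[0]);
        System.out.println(next);
    }
}
```

This shortcut only works when the aggregate can be reversed. A sum or count can be unmerged this way, whereas a maximum cannot be recovered by subtraction.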
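public void setMapper(Mapper<org.apache.avro.generic.GenericRecord,org.apache.avro.generic.GenericRecord,org.apache.avro.generic.GenericRecord> mapper)
Sets the mapper.
Parameters: mapper
public void setCombinerAccumulator(Accumulator<org.apache.avro.generic.GenericRecord,org.apache.avro.generic.GenericRecord> combiner)
Sets the accumulator for the combiner.
Parameters: combiner - accumulator for the combiner
public void setReducerAccumulator(Accumulator<org.apache.avro.generic.GenericRecord,org.apache.avro.generic.GenericRecord> reducer)
Sets the accumulator for the reducer.
Parameters: reducer - accumulator for the reducer
public void setKeySchema(org.apache.avro.Schema keySchema)
Sets the Avro schema for the key. This is also used as the key for the map output.
Parameters: keySchema - key schema
public void setIntermediateValueSchema(org.apache.avro.Schema intermediateValueSchema)
Sets the Avro schema for the intermediate value. This is also used for the value for the map output.
Parameters: intermediateValueSchema - intermediate value schema
public void setOutputValueSchema(org.apache.avro.Schema outputValueSchema)
Sets the Avro schema for the output data.
Parameters: outputValueSchema - output value schema
public void setMerger(Merger<org.apache.avro.generic.GenericRecord> merger)
Sets the record merger that is capable of merging previous output with a new partial output.
Parameters: merger
public void setOldMerger(Merger<org.apache.avro.generic.GenericRecord> oldMerger)
Sets the record merger that is capable of unmerging old partial output from the new output.
Parameters: oldMerger - merger
public void setOnSetup(Setup setup)
Sets the callback that provides custom configuration before the job begins execution.
Parameters: setup - object with callback method
public void config(org.apache.hadoop.conf.Configuration conf)
Description copied from class: AbstractJob
Overridden to provide custom configuration before the job starts.
Overrides: config in class AbstractJob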