public abstract class AbstractNonIncrementalJob extends TimeBasedJob
Similar to AbstractPartitionCollapsingIncrementalJob, but without the incremental features.
Jobs extending this class consume input data partitioned according to yyyy/MM/dd. Only a single input path is supported. The output will be written to a directory in the output path with name format yyyyMMdd derived from the end of the time window that is consumed.
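The path layout described above can be illustrated with a small standalone sketch. It is not part of the library: the class `PartitionPaths`, its method names, and the example roots `/data/events` and `/output/events` are hypothetical, and treating the time window as inclusive of both its start and end day is an assumption made for illustration.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the library: shows the yyyy/MM/dd input
// layout and the yyyyMMdd output directory name.
class PartitionPaths {
    private static final DateTimeFormatter INPUT_FORMAT = DateTimeFormatter.ofPattern("yyyy/MM/dd");
    private static final DateTimeFormatter OUTPUT_FORMAT = DateTimeFormatter.ofPattern("yyyyMMdd");

    // Lists one input directory per day in the window.
    // Assumption: the window includes both the start and the end day.
    static List<String> inputPaths(String inputRoot, LocalDate start, LocalDate end) {
        List<String> paths = new ArrayList<>();
        for (LocalDate day = start; !day.isAfter(end); day = day.plusDays(1)) {
            paths.add(inputRoot + "/" + day.format(INPUT_FORMAT));
        }
        return paths;
    }

    // Names the output directory after the last day of the consumed window.
    static String outputPath(String outputRoot, LocalDate end) {
        return outputRoot + "/" + end.format(OUTPUT_FORMAT);
    }

    public static void main(String[] args) {
        LocalDate start = LocalDate.of(2013, 6, 1);
        LocalDate end = LocalDate.of(2013, 6, 3);
        inputPaths("/data/events", start, end).forEach(System.out::println);
        System.out.println(outputPath("/output/events", end));
    }
}
```

With a window from 2013-06-01 to 2013-06-03, the sketch lists three daily input directories and a single output directory named after the last day.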
This class has the same configuration and methods as TimeBasedJob.
In addition, it recognizes the following properties:
When combine.inputs is true, then CombinedAvroKeyInputFormat is used instead of AvroKeyInputFormat. This enables a single map task to consume more than one file.
The num.reducers.bytes.per.reducer property controls the number of reducers to use based on the input size. The total size of the input files is divided by this number and then rounded up.
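The reducer calculation described above amounts to a ceiling division. The following is a minimal sketch of that arithmetic, not the library's implementation: the class `ReducerEstimate` and its methods are hypothetical, and only the property name `num.reducers.bytes.per.reducer` comes from this page. How the job behaves when the property is missing is not specified here, so the sketch simply rejects that case.

```java
import java.util.Properties;

// Hypothetical helper, not part of the library: reproduces the
// "divide the input size, then round up" rule as integer arithmetic.
class ReducerEstimate {
    static final String BYTES_PER_REDUCER = "num.reducers.bytes.per.reducer";

    // Ceiling of totalInputBytes / bytesPerReducer, without floating point.
    static int numReducers(long totalInputBytes, long bytesPerReducer) {
        if (totalInputBytes < 0 || bytesPerReducer <= 0) {
            throw new IllegalArgumentException("sizes must be non-negative and bytesPerReducer positive");
        }
        long quotient = totalInputBytes / bytesPerReducer;
        long remainder = totalInputBytes % bytesPerReducer;
        return Math.toIntExact(remainder == 0 ? quotient : quotient + 1);
    }

    // Reads the bytes-per-reducer setting from job properties.
    // Assumption: a missing property is treated as an error in this sketch.
    static int numReducers(long totalInputBytes, Properties props) {
        String value = props.getProperty(BYTES_PER_REDUCER);
        if (value == null) {
            throw new IllegalArgumentException(BYTES_PER_REDUCER + " is not set");
        }
        return numReducers(totalInputBytes, Long.parseLong(value.trim()));
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty(BYTES_PER_REDUCER, "268435456"); // 256 MiB per reducer
        long totalInputBytes = 1_000_000_000L;
        System.out.println(numReducers(totalInputBytes, props)); // prints 4
    }
}
```

Under this rule, 1,000,000,000 input bytes at 268,435,456 bytes per reducer gives 3.73, which rounds up to 4 reducers.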
Modifier and Type | Class and Description |
---|---|
static class | AbstractNonIncrementalJob.BaseCombiner - Combiner base class for AbstractNonIncrementalJob. |
static class | AbstractNonIncrementalJob.BaseMapper - Mapper base class for AbstractNonIncrementalJob. |
static class | AbstractNonIncrementalJob.BaseReducer - Reducer base class for AbstractNonIncrementalJob. |
static class | AbstractNonIncrementalJob.Report - Reports files created and processed for an iteration of the job. |
Constructor and Description |
---|
AbstractNonIncrementalJob(java.lang.String name, java.util.Properties props) - Initializes the job. |
Modifier and Type | Method and Description |
---|---|
boolean | getCombineInputs() - Gets whether inputs should be combined. |
java.lang.Class<? extends AbstractNonIncrementalJob.BaseCombiner> | getCombinerClass() - Gets the combiner class. |
protected abstract org.apache.avro.Schema | getMapOutputKeySchema() - Gets the key schema for the map output. |
protected abstract org.apache.avro.Schema | getMapOutputValueSchema() - Gets the value schema for the map output. |
abstract java.lang.Class<? extends AbstractNonIncrementalJob.BaseMapper> | getMapperClass() - Gets the mapper class. |
protected abstract org.apache.avro.Schema | getReduceOutputSchema() - Gets the reduce output schema. |
abstract java.lang.Class<? extends AbstractNonIncrementalJob.BaseReducer> | getReducerClass() - Gets the reducer class. |
AbstractNonIncrementalJob.Report | getReport() - Gets a report summarizing the run. |
void | run() - Runs the job. |
void | setCombineInputs(boolean combineInputs) - Sets whether inputs should be combined. |
Methods inherited from class TimeBasedJob: getDaysAgo, getEndDate, getNumDays, getStartDate, setDaysAgo, setEndDate, setNumDays, setProperties, setStartDate, validate
Methods inherited from class AbstractJob: config, createRandomTempPath, ensurePath, getCountersParentPath, getFileSystem, getInputPaths, getName, getNumReducers, getOutputPath, getProperties, getRetentionCount, getTempPath, initialize, isUseCombiner, randomTempPath, setCountersParentPath, setInputPaths, setName, setNumReducers, setOutputPath, setRetentionCount, setTempPath, setUseCombiner
public AbstractNonIncrementalJob(java.lang.String name, java.util.Properties props) throws java.io.IOException
name - job name
props - configuration properties
java.io.IOException - IOException
public boolean getCombineInputs()
public void setCombineInputs(boolean combineInputs)
combineInputs - true to combine inputs
public AbstractNonIncrementalJob.Report getReport()
public void run() throws java.io.IOException, java.lang.InterruptedException, java.lang.ClassNotFoundException
run in class AbstractJob
java.io.IOException - IOException
java.lang.InterruptedException - InterruptedException
java.lang.ClassNotFoundException - ClassNotFoundException
protected abstract org.apache.avro.Schema getMapOutputKeySchema()
protected abstract org.apache.avro.Schema getMapOutputValueSchema()
protected abstract org.apache.avro.Schema getReduceOutputSchema()
public abstract java.lang.Class<? extends AbstractNonIncrementalJob.BaseMapper> getMapperClass()
public abstract java.lang.Class<? extends AbstractNonIncrementalJob.BaseReducer> getReducerClass()
public java.lang.Class<? extends AbstractNonIncrementalJob.BaseCombiner> getCombinerClass()