java.lang.Object
  org.apache.hadoop.conf.Configured
    datafu.hourglass.jobs.AbstractJob
public abstract class AbstractJob extends org.apache.hadoop.conf.Configured
Base class for Hadoop jobs.
This class defines a set of common methods and configuration shared by Hadoop jobs. Jobs can be configured either by providing properties or by calling setters. Each property has a corresponding setter.
This class recognizes the following properties:
The input.path property may be a comma-separated list of paths. More than one path implies that a join is to be performed. Alternatively, the paths may be listed separately. For example, input.path.first and input.path.second define two separate input paths.
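The two ways of listing input paths can be sketched with plain java.util.Properties. This is a minimal illustration and not the class's actual parsing code: the helper resolveInputPaths, the class name, and the example paths are hypothetical, and the ordering of separately named paths (sorted by property name here) is an assumption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Properties;
import java.util.TreeMap;

// Hypothetical helper, not part of datafu.hourglass: illustrates the two
// ways of listing input paths through properties.
class InputPathsSketch {
    static final String KEY = "input.path";

    static List<String> resolveInputPaths(Properties props) {
        List<String> paths = new ArrayList<>();
        String combined = props.getProperty(KEY);
        if (combined != null) {
            // Form 1: a single comma-separated list under input.path.
            for (String p : combined.split(",")) {
                String trimmed = p.trim();
                if (!trimmed.isEmpty()) {
                    paths.add(trimmed);
                }
            }
            return paths;
        }
        // Form 2: separately named properties such as input.path.first.
        // Assumption: paths are ordered by property name; the real class
        // may order them differently.
        Map<String, String> named = new TreeMap<>();
        for (String name : props.stringPropertyNames()) {
            if (name.startsWith(KEY + ".")) {
                named.put(name, props.getProperty(name));
            }
        }
        paths.addAll(named.values());
        return paths;
    }

    public static void main(String[] args) {
        Properties joined = new Properties();
        joined.setProperty("input.path", "/data/clicks,/data/views");
        System.out.println(resolveInputPaths(joined));

        Properties separate = new Properties();
        separate.setProperty("input.path.first", "/data/clicks");
        separate.setProperty("input.path.second", "/data/views");
        System.out.println(resolveInputPaths(separate));
    }
}
```

Both property sets in main describe the same two input paths, so both calls print the same list.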
The num.reducers property fixes the number of reducers. When it is not set, the number of reducers is computed from the input size.
The temp.path property defines the parent directory for temporary paths, not the temporary path itself. Temporary paths are created under this directory with an hourglass- prefix followed by a GUID.
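The relationship between the temp.path parent directory and the generated temporary paths can be sketched as follows. This is an illustration only, assuming the layout described above (parent directory, then hourglass- followed by a GUID). The helper tempPathUnder, the class name, and the /tmp/jobs directory are hypothetical, and plain strings stand in for Hadoop Path objects.

```java
import java.util.UUID;

// Hypothetical sketch, not the library's implementation: builds a temporary
// path of the form <parent>/hourglass-<GUID>.
class TempPathSketch {
    static String tempPathUnder(String tempParent) {
        // Drop a trailing slash so the result has exactly one separator.
        String parent = tempParent.endsWith("/")
            ? tempParent.substring(0, tempParent.length() - 1)
            : tempParent;
        // UUID.randomUUID() stands in for the GUID mentioned in the text.
        return parent + "/hourglass-" + UUID.randomUUID();
    }

    public static void main(String[] args) {
        // The value of temp.path is the parent, not the temporary path itself.
        System.out.println(tempPathUnder("/tmp/jobs"));
    }
}
```

Each call yields a different path under the same parent, which is why temp.path names the parent directory rather than a single temporary path.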
The input and output paths are the only required parameters. The rest are optional.
Hadoop configuration may be provided by setting a property whose name begins with the prefix hadoop-conf. (including the trailing period). For example, mapred.min.split.size can be configured by setting the property hadoop-conf.mapred.min.split.size to the desired value.
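The prefix convention can be sketched as a filter over the job properties. This is a minimal illustration under stated assumptions, not the class's actual code: the helper extractHadoopConf and the class name are hypothetical, a plain Map stands in for org.apache.hadoop.conf.Configuration, and the property values are made up.

```java
import java.util.Map;
import java.util.Properties;
import java.util.TreeMap;

// Hypothetical sketch: collects properties carrying the hadoop-conf. prefix
// and strips the prefix to recover the Hadoop setting name.
class HadoopConfPrefixSketch {
    static final String PREFIX = "hadoop-conf.";

    static Map<String, String> extractHadoopConf(Properties props) {
        // A sorted map stands in for a Hadoop Configuration object.
        Map<String, String> conf = new TreeMap<>();
        for (String name : props.stringPropertyNames()) {
            // Skip a bare prefix with no setting name after it.
            if (name.startsWith(PREFIX) && name.length() > PREFIX.length()) {
                conf.put(name.substring(PREFIX.length()), props.getProperty(name));
            }
        }
        return conf;
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("input.path", "/data/clicks");
        props.setProperty("hadoop-conf.mapred.min.split.size", "134217728");
        // Only the prefixed property is treated as Hadoop configuration.
        System.out.println(extractHadoopConf(props));
        // prints {mapred.min.split.size=134217728}
    }
}
```

Properties without the prefix, such as input.path, are left for the job itself to interpret.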
Constructor Summary | |
---|---|
AbstractJob() | Initializes the job. |
AbstractJob(java.lang.String name, java.util.Properties props) | Initializes the job with a job name and properties. |
Method Summary | | |
---|---|---|
void | config(org.apache.hadoop.conf.Configuration conf) | Overridden to provide custom configuration before the job starts. |
protected org.apache.hadoop.fs.Path | createRandomTempPath() | Creates a random temporary path within the file system. |
protected org.apache.hadoop.fs.Path | ensurePath(org.apache.hadoop.fs.Path path) | Creates a path, if it does not already exist. |
org.apache.hadoop.fs.Path | getCountersParentPath() | Gets the path where counters will be stored. |
protected org.apache.hadoop.fs.FileSystem | getFileSystem() | Gets the file system. |
java.util.List<org.apache.hadoop.fs.Path> | getInputPaths() | Gets the input paths. |
java.lang.String | getName() | Gets the job name. |
java.lang.Integer | getNumReducers() | Gets the number of reducers to use. |
org.apache.hadoop.fs.Path | getOutputPath() | Gets the output path. |
java.util.Properties | getProperties() | Gets the configuration properties. |
java.lang.Integer | getRetentionCount() | Gets the number of days of data which will be retained in the output path. |
org.apache.hadoop.fs.Path | getTempPath() | Gets the temporary path under which intermediate files will be stored. |
protected void | initialize() | Initialization required before running job. |
boolean | isUseCombiner() | Gets whether the combiner should be used. |
protected org.apache.hadoop.fs.Path | randomTempPath() | Generates a random temporary path within the file system. |
abstract void | run() | Run the job. |
void | setCountersParentPath(org.apache.hadoop.fs.Path countersParentPath) | Sets the path where counters will be stored. |
void | setInputPaths(java.util.List<org.apache.hadoop.fs.Path> inputPaths) | Sets the input paths. |
void | setName(java.lang.String name) | Sets the job name. |
void | setNumReducers(java.lang.Integer numReducers) | Sets the number of reducers to use. |
void | setOutputPath(org.apache.hadoop.fs.Path outputPath) | Sets the output path. |
void | setProperties(java.util.Properties props) | Sets the configuration properties. |
void | setRetentionCount(java.lang.Integer retentionCount) | Sets the number of days of data which will be retained in the output path. |
void | setTempPath(org.apache.hadoop.fs.Path tempPath) | Sets the temporary path where intermediate files will be stored. |
void | setUseCombiner(boolean useCombiner) | Sets whether the combiner should be used. |
protected void | validate() | Validation required before running job. |
Methods inherited from class org.apache.hadoop.conf.Configured |
---|
getConf, setConf |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Constructor Detail |
---|
public AbstractJob()
public AbstractJob(java.lang.String name, java.util.Properties props)
Parameters:
name - Job name
props - Configuration properties
Method Detail |
---|
public java.lang.String getName()
public void setName(java.lang.String name)
Parameters:
name - Job name
public java.util.Properties getProperties()
public void setProperties(java.util.Properties props)
Parameters:
props - Properties
public void config(org.apache.hadoop.conf.Configuration conf)
Parameters:
conf -
public java.lang.Integer getNumReducers()
public void setNumReducers(java.lang.Integer numReducers)
Parameters:
numReducers - Number of reducers to use
public boolean isUseCombiner()
public void setUseCombiner(boolean useCombiner)
Parameters:
useCombiner - True if a combiner should be used, otherwise false.
public org.apache.hadoop.fs.Path getCountersParentPath()
public void setCountersParentPath(org.apache.hadoop.fs.Path countersParentPath)
Parameters:
countersParentPath - Counters path
public java.lang.Integer getRetentionCount()
public void setRetentionCount(java.lang.Integer retentionCount)
Parameters:
retentionCount -
public java.util.List<org.apache.hadoop.fs.Path> getInputPaths()
public void setInputPaths(java.util.List<org.apache.hadoop.fs.Path> inputPaths)
Parameters:
inputPaths - Input paths
public org.apache.hadoop.fs.Path getOutputPath()
public void setOutputPath(org.apache.hadoop.fs.Path outputPath)
Parameters:
outputPath - Output path
public org.apache.hadoop.fs.Path getTempPath()
public void setTempPath(org.apache.hadoop.fs.Path tempPath)
Parameters:
tempPath - Temporary path
protected org.apache.hadoop.fs.FileSystem getFileSystem()
Throws:
java.io.IOException
protected org.apache.hadoop.fs.Path randomTempPath()
protected org.apache.hadoop.fs.Path createRandomTempPath() throws java.io.IOException
Throws:
java.io.IOException
protected org.apache.hadoop.fs.Path ensurePath(org.apache.hadoop.fs.Path path) throws java.io.IOException
Parameters:
path - Path to create
Throws:
java.io.IOException
protected void validate()
protected void initialize()
public abstract void run() throws java.io.IOException, java.lang.InterruptedException, java.lang.ClassNotFoundException
Throws:
java.io.IOException
java.lang.InterruptedException
java.lang.ClassNotFoundException