datafu.hourglass.mapreduce
Class CollapsingCombiner

java.lang.Object
  extended by datafu.hourglass.mapreduce.ObjectProcessor
      extended by datafu.hourglass.mapreduce.ObjectReducer
          extended by datafu.hourglass.mapreduce.CollapsingCombiner
All Implemented Interfaces:
DateRangeConfigurable, java.io.Serializable

public class CollapsingCombiner
extends ObjectReducer
implements DateRangeConfigurable, java.io.Serializable

The combiner used by AbstractPartitionCollapsingIncrementalJob and its derived classes.

An implementation of Accumulator is used to perform aggregation and produce the intermediate value.

Author:
"Matthew Hayes"
See Also:
Serialized Form

Constructor Summary
CollapsingCombiner()
           
 
Method Summary
 Accumulator<org.apache.avro.generic.GenericRecord,org.apache.avro.generic.GenericRecord> getAccumulator()
          Gets the accumulator used to perform aggregation.
 boolean getReuseOutput()
          Gets whether previous output is being reused.
 PartitionCollapsingSchemas getSchemas()
          Gets the schemas.
 void reduce(java.lang.Object keyObj, java.lang.Iterable<java.lang.Object> values, org.apache.hadoop.mapreduce.ReduceContext<java.lang.Object,java.lang.Object,java.lang.Object,java.lang.Object> context)
           
 void setAccumulator(Accumulator<org.apache.avro.generic.GenericRecord,org.apache.avro.generic.GenericRecord> acc)
          Sets the accumulator used to perform aggregation.
 void setOutputDateRange(DateRange dateRange)
          Sets the date range for the output.
 void setReuseOutput(boolean reuseOutput)
          Sets whether previous output is being reused.
 void setSchemas(PartitionCollapsingSchemas schemas)
          Sets the schemas.
 
Methods inherited from class datafu.hourglass.mapreduce.ObjectProcessor
close, getContext, setContext
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

CollapsingCombiner

public CollapsingCombiner()
Method Detail

reduce

public void reduce(java.lang.Object keyObj,
                   java.lang.Iterable<java.lang.Object> values,
                   org.apache.hadoop.mapreduce.ReduceContext<java.lang.Object,java.lang.Object,java.lang.Object,java.lang.Object> context)
            throws java.io.IOException,
                   java.lang.InterruptedException
Specified by:
reduce in class ObjectReducer
Throws:
java.io.IOException
java.lang.InterruptedException

setSchemas

public void setSchemas(PartitionCollapsingSchemas schemas)
Sets the schemas.

Parameters:
schemas -

getSchemas

public PartitionCollapsingSchemas getSchemas()
Gets the schemas.

Returns:
schemas

getReuseOutput

public boolean getReuseOutput()
Gets whether previous output is being reused.

Returns:
true if previous output is reused

setReuseOutput

public void setReuseOutput(boolean reuseOutput)
Sets whether previous output is being reused.

Parameters:
reuseOutput - true if previous output is reused

getAccumulator

public Accumulator<org.apache.avro.generic.GenericRecord,org.apache.avro.generic.GenericRecord> getAccumulator()
Gets the accumulator used to perform aggregation.

Returns:
The accumulator

setAccumulator

public void setAccumulator(Accumulator<org.apache.avro.generic.GenericRecord,org.apache.avro.generic.GenericRecord> acc)
Sets the accumulator used to perform aggregation.

Parameters:
acc - The accumulator

setOutputDateRange

public void setOutputDateRange(DateRange dateRange)
Description copied from interface: DateRangeConfigurable
Sets the date range for the output.

Specified by:
setOutputDateRange in interface DateRangeConfigurable
Parameters:
dateRange - output date range


Matthew Hayes