datafu.hourglass.mapreduce
Class CollapsingReducer

java.lang.Object
  extended by datafu.hourglass.mapreduce.ObjectProcessor
      extended by datafu.hourglass.mapreduce.ObjectReducer
          extended by datafu.hourglass.mapreduce.CollapsingReducer
All Implemented Interfaces:
DateRangeConfigurable, java.io.Serializable

public class CollapsingReducer
extends ObjectReducer
implements DateRangeConfigurable, java.io.Serializable

The reducer used by AbstractPartitionCollapsingIncrementalJob and its derived classes.

An implementation of Accumulator is used to perform aggregation and produce the output value.

Author:
"Matthew Hayes"
See Also:
Serialized Form

Field Summary
protected  long _beginTime
           
protected  long _endTime
           
 
Constructor Summary
CollapsingReducer()
           
 
Method Summary
 Accumulator<org.apache.avro.generic.GenericRecord,org.apache.avro.generic.GenericRecord> getNewAccumulator()
           
 Accumulator<org.apache.avro.generic.GenericRecord,org.apache.avro.generic.GenericRecord> getOldAccumulator()
           
 boolean getReuseOutput()
          Gets whether previous output is being reused.
 void reduce(java.lang.Object keyObj, java.lang.Iterable<java.lang.Object> values, org.apache.hadoop.mapreduce.ReduceContext<java.lang.Object,java.lang.Object,java.lang.Object,java.lang.Object> context)
           
 void setAccumulator(Accumulator<org.apache.avro.generic.GenericRecord,org.apache.avro.generic.GenericRecord> acc)
           
 void setOldRecordMerger(Merger<org.apache.avro.generic.GenericRecord> merger)
           
 void setOutputDateRange(DateRange dateRange)
          Sets the date range for the output.
 void setRecordMerger(Merger<org.apache.avro.generic.GenericRecord> merger)
           
 void setReuseOutput(boolean reuseOutput)
          Sets whether previous output is being reused.
 void setSchemas(PartitionCollapsingSchemas schemas)
          Sets the Avro schemas.
 
Methods inherited from class datafu.hourglass.mapreduce.ObjectProcessor
close, getContext, setContext
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

_beginTime

protected long _beginTime

_endTime

protected long _endTime
Constructor Detail

CollapsingReducer

public CollapsingReducer()
Method Detail

reduce

public void reduce(java.lang.Object keyObj,
                   java.lang.Iterable<java.lang.Object> values,
                   org.apache.hadoop.mapreduce.ReduceContext<java.lang.Object,java.lang.Object,java.lang.Object,java.lang.Object> context)
            throws java.io.IOException,
                   java.lang.InterruptedException
Specified by:
reduce in class ObjectReducer
Throws:
java.io.IOException
java.lang.InterruptedException

setSchemas

public void setSchemas(PartitionCollapsingSchemas schemas)
Sets the Avro schemas.

Parameters:
schemas -

getReuseOutput

public boolean getReuseOutput()
Gets whether previous output is being reused.

Returns:
true if previous output is reused

setReuseOutput

public void setReuseOutput(boolean reuseOutput)
Sets whether previous output is being reused.

Parameters:
reuseOutput - true if previous output is reused

setAccumulator

public void setAccumulator(Accumulator<org.apache.avro.generic.GenericRecord,org.apache.avro.generic.GenericRecord> acc)

getNewAccumulator

public Accumulator<org.apache.avro.generic.GenericRecord,org.apache.avro.generic.GenericRecord> getNewAccumulator()

getOldAccumulator

public Accumulator<org.apache.avro.generic.GenericRecord,org.apache.avro.generic.GenericRecord> getOldAccumulator()

setRecordMerger

public void setRecordMerger(Merger<org.apache.avro.generic.GenericRecord> merger)

setOldRecordMerger

public void setOldRecordMerger(Merger<org.apache.avro.generic.GenericRecord> merger)

setOutputDateRange

public void setOutputDateRange(DateRange dateRange)
Description copied from interface: DateRangeConfigurable
Sets the date range for the output.

Specified by:
setOutputDateRange in interface DateRangeConfigurable
Parameters:
dateRange - output date range


Matthew Hayes