datafu.hourglass.avro
Class AvroDateRangeMetadata

java.lang.Object
  extended by datafu.hourglass.avro.AvroDateRangeMetadata

public class AvroDateRangeMetadata
extends java.lang.Object

Manages the storage and retrieval of date ranges in the metadata of Avro files. AbstractPartitionCollapsingIncrementalJob uses this class so that, when it reuses previous output, it can determine the date range that the output data covers.

Author:
Matthew Hayes

Field Summary
static java.lang.String METADATA_DATE_END
           
static java.lang.String METADATA_DATE_START
           
 
Constructor Summary
AvroDateRangeMetadata()
           
 
Method Summary
static void configureOutputDateRange(org.apache.hadoop.conf.Configuration conf, DateRange dateRange)
          Updates the Hadoop configuration so that the Avro files that are written have the date range stored in their metadata.
static DateRange getOutputFileDateRange(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path path)
          Reads the date range from the metadata stored in an Avro file.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

METADATA_DATE_START

public static java.lang.String METADATA_DATE_START

METADATA_DATE_END

public static java.lang.String METADATA_DATE_END

Constructor Detail

AvroDateRangeMetadata

public AvroDateRangeMetadata()

Method Detail

getOutputFileDateRange

public static DateRange getOutputFileDateRange(org.apache.hadoop.fs.FileSystem fs,
                                               org.apache.hadoop.fs.Path path)
                                        throws java.io.IOException
Reads the date range from the metadata stored in an Avro file.

Parameters:
fs - file system used to access the path
path - path of the Avro file to read the date range from
Returns:
the date range read from the file's metadata
Throws:
java.io.IOException

configureOutputDateRange

public static void configureOutputDateRange(org.apache.hadoop.conf.Configuration conf,
                                            DateRange dateRange)
Updates the Hadoop configuration so that the Avro files that are written have the date range stored in their metadata. Use this in conjunction with AvroKeyValueWithMetadataRecordWriter.

Parameters:
conf - configuration in which to store the date range
dateRange - date range to store
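
The round trip that the two methods above provide (store a date range as metadata when writing, read it back later) can be illustrated with a minimal, self-contained sketch. It does not use the DataFu, Hadoop, or Avro APIs: a plain Map stands in for the Avro file metadata, and the class name, method names, key names, and ISO-8601 date format are all assumptions made for illustration. The actual values of METADATA_DATE_START and METADATA_DATE_END and the real serialization format are not stated in this documentation.

```java
import java.time.LocalDate;
import java.util.HashMap;
import java.util.Map;

// Illustrative only: a Map stands in for the key/value metadata of an Avro file.
public class DateRangeMetadataSketch {
    // Hypothetical key names; the real METADATA_DATE_START and
    // METADATA_DATE_END values are not given in this documentation.
    public static final String KEY_START = "sketch.date.start";
    public static final String KEY_END = "sketch.date.end";

    // Writer side: record the range as two string entries (ISO-8601 is an assumption).
    public static void storeDateRange(Map<String, String> metadata,
                                      LocalDate start, LocalDate end) {
        metadata.put(KEY_START, start.toString());
        metadata.put(KEY_END, end.toString());
    }

    // Reader side: returns {start, end}, or fails if either entry is missing.
    public static LocalDate[] readDateRange(Map<String, String> metadata) {
        String start = metadata.get(KEY_START);
        String end = metadata.get(KEY_END);
        if (start == null || end == null) {
            throw new IllegalStateException("date range metadata is missing");
        }
        return new LocalDate[] { LocalDate.parse(start), LocalDate.parse(end) };
    }

    public static void main(String[] args) {
        Map<String, String> metadata = new HashMap<>();
        storeDateRange(metadata, LocalDate.of(2013, 1, 1), LocalDate.of(2013, 1, 31));
        LocalDate[] range = readDateRange(metadata);
        System.out.println(range[0] + " to " + range[1]); // prints "2013-01-01 to 2013-01-31"
    }
}
```

In the real class, the writer side corresponds to configureOutputDateRange (which places the range in the Hadoop configuration for the record writer to pick up) and the reader side to getOutputFileDateRange.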

