datafu.hourglass.avro
Class AvroMultipleInputsUtil

java.lang.Object
  extended by datafu.hourglass.avro.AvroMultipleInputsUtil

public class AvroMultipleInputsUtil
extends java.lang.Object

Helper methods for dealing with multiple Avro input schemas. The job configuration stores a mapping from each input path to its corresponding schema, and the methods in this class store and load these mappings.
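The path-to-schema mapping described above can be illustrated with a minimal, dependency-free sketch. It is not the DataFu implementation. The class PathSchemaMappingSketch and its two methods are hypothetical names, a plain Map stands in for the Hadoop Configuration, schemas are kept as Avro JSON strings, and matching a split to an input path by path prefix is an assumption made for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/**
 * Hypothetical stand-in, not the DataFu implementation: a plain map plays the
 * role of the Hadoop Configuration, and schemas are kept as Avro JSON strings.
 */
class PathSchemaMappingSketch {

    /** Records the schema (as Avro JSON text) to use for files under inputPath. */
    static void setSchemaForPath(Map<String, String> conf, String schemaJson, String inputPath) {
        conf.put(inputPath, schemaJson);
    }

    /**
     * Returns the schema registered for the input path that contains splitPath,
     * or null if no registered path matches. Prefix matching is an assumption.
     */
    static String getSchemaForSplit(Map<String, String> conf, String splitPath) {
        for (Map.Entry<String, String> entry : conf.entrySet()) {
            String root = entry.getKey();
            // Append a separator so "/data/events" does not match "/data/events2".
            String prefix = root.endsWith("/") ? root : root + "/";
            if (splitPath.equals(root) || splitPath.startsWith(prefix)) {
                return entry.getValue();
            }
        }
        return null;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new LinkedHashMap<>();
        // Two inputs with different (primitive) Avro schemas.
        setSchemaForPath(conf, "\"string\"", "/data/events");
        setSchemaForPath(conf, "\"long\"", "/data/counts");

        // Each split resolves to the schema of the input path it falls under.
        System.out.println(getSchemaForSplit(conf, "/data/events/part-00000.avro"));
        System.out.println(getSchemaForSplit(conf, "/data/counts/part-00003.avro"));
        System.out.println(getSchemaForSplit(conf, "/data/other/part-00000.avro"));
    }
}
```

In a real job the corresponding calls would be setInputKeySchemaForPath during job setup and getInputKeySchemaForSplit once a split is known; the sketch only mirrors that store-then-look-up pattern.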

Author:
Matthew Hayes

Constructor Summary
AvroMultipleInputsUtil()
           
 
Method Summary
static org.apache.avro.Schema getInputKeySchemaForSplit(org.apache.hadoop.conf.Configuration conf, org.apache.hadoop.mapreduce.InputSplit split)
          Gets the schema for a particular input split.
static void setInputKeySchemaForPath(org.apache.hadoop.mapreduce.Job job, org.apache.avro.Schema schema, java.lang.String path)
          Sets the job input key schema for a path.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

AvroMultipleInputsUtil

public AvroMultipleInputsUtil()

Method Detail

getInputKeySchemaForSplit

public static org.apache.avro.Schema getInputKeySchemaForSplit(org.apache.hadoop.conf.Configuration conf,
                                                               org.apache.hadoop.mapreduce.InputSplit split)
Gets the schema for a particular input split.

Parameters:
conf - the configuration to read the schema mapping from
split - the input split to look up the schema for
Returns:
the schema mapped to the split's input path

setInputKeySchemaForPath

public static void setInputKeySchemaForPath(org.apache.hadoop.mapreduce.Job job,
                                            org.apache.avro.Schema schema,
                                            java.lang.String path)
Sets the job input key schema for a path.

Parameters:
job - the job to configure
schema - the input key schema
path - the input path to set the schema for

