datafu.hourglass.avro
Class AvroMultipleInputsKeyInputFormat<T>

java.lang.Object
  extended by org.apache.hadoop.mapreduce.InputFormat<K,V>
      extended by org.apache.hadoop.mapreduce.lib.input.FileInputFormat<org.apache.avro.mapred.AvroKey<T>,org.apache.hadoop.io.NullWritable>
          extended by datafu.hourglass.avro.AvroMultipleInputsKeyInputFormat<T>

public class AvroMultipleInputsKeyInputFormat<T>
extends org.apache.hadoop.mapreduce.lib.input.FileInputFormat<org.apache.avro.mapred.AvroKey<T>,org.apache.hadoop.io.NullWritable>

A MapReduce InputFormat that can handle Avro container files and multiple inputs. The input schema is determined from the split's input path; the mapping from input path to schema is stored in the job configuration.

Keys are AvroKey wrapper objects that contain the Avro data. Since Avro container files store only records (not key/value pairs), the value from this InputFormat is a NullWritable.
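A minimal sketch of configuring a job with this InputFormat is shown below. The example schema, the input path, and the AvroMultipleInputsUtil.setInputKeySchemaForPath helper used to register the path-to-schema mapping are assumptions for illustration; verify the helper's exact signature against the datafu.hourglass.avro package.

import org.apache.avro.Schema;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;

import datafu.hourglass.avro.AvroMultipleInputsKeyInputFormat;
import datafu.hourglass.avro.AvroMultipleInputsUtil;

public class JobSetupSketch
{
  public static Job configure(Configuration conf) throws Exception
  {
    Job job = Job.getInstance(conf, "avro-multiple-inputs-example");

    // Example schema for one of the inputs (hypothetical record type).
    Schema eventSchema = new Schema.Parser().parse(
        "{\"type\":\"record\",\"name\":\"Event\",\"fields\":["
        + "{\"name\":\"id\",\"type\":\"string\"},"
        + "{\"name\":\"time\",\"type\":\"long\"}]}");

    job.setInputFormatClass(AvroMultipleInputsKeyInputFormat.class);
    FileInputFormat.addInputPath(job, new Path("/data/events"));

    // Assumed helper from this package: stores the path-to-schema mapping in
    // the job configuration so the schema can be looked up per split.
    AvroMultipleInputsUtil.setInputKeySchemaForPath(job, eventSchema, "/data/events");

    return job;
  }
}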


Nested Class Summary
 
Nested classes/interfaces inherited from class org.apache.hadoop.mapreduce.lib.input.FileInputFormat
org.apache.hadoop.mapreduce.lib.input.FileInputFormat.Counter
 
Constructor Summary
AvroMultipleInputsKeyInputFormat()
           
 
Method Summary
 org.apache.hadoop.mapreduce.RecordReader<org.apache.avro.mapred.AvroKey<T>,org.apache.hadoop.io.NullWritable> createRecordReader(org.apache.hadoop.mapreduce.InputSplit split, org.apache.hadoop.mapreduce.TaskAttemptContext context)
          
 
Methods inherited from class org.apache.hadoop.mapreduce.lib.input.FileInputFormat
addInputPath, addInputPaths, computeSplitSize, getBlockIndex, getFormatMinSplitSize, getInputPathFilter, getInputPaths, getMaxSplitSize, getMinSplitSize, getSplits, isSplitable, listStatus, setInputPathFilter, setInputPaths, setInputPaths, setMaxInputSplitSize, setMinInputSplitSize
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

AvroMultipleInputsKeyInputFormat

public AvroMultipleInputsKeyInputFormat()

Method Detail

createRecordReader

public org.apache.hadoop.mapreduce.RecordReader<org.apache.avro.mapred.AvroKey<T>,org.apache.hadoop.io.NullWritable> createRecordReader(org.apache.hadoop.mapreduce.InputSplit split,
                                                                                                                                        org.apache.hadoop.mapreduce.TaskAttemptContext context)
                                                                                                                                 throws java.io.IOException,
                                                                                                                                        java.lang.InterruptedException

Specified by:
createRecordReader in class org.apache.hadoop.mapreduce.InputFormat<org.apache.avro.mapred.AvroKey<T>,org.apache.hadoop.io.NullWritable>
Throws:
java.io.IOException
java.lang.InterruptedException
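
The RecordReader returned here yields AvroKey/NullWritable pairs, so a mapper receives the Avro datum as its key and ignores the value. Below is a minimal mapper sketch consuming such pairs; the "id" field and the counting logic are hypothetical and for illustration only.

import java.io.IOException;

import org.apache.avro.generic.GenericRecord;
import org.apache.avro.mapred.AvroKey;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class EventCountMapper
    extends Mapper<AvroKey<GenericRecord>, NullWritable, Text, LongWritable>
{
  private final Text outKey = new Text();
  private static final LongWritable ONE = new LongWritable(1L);

  @Override
  protected void map(AvroKey<GenericRecord> key, NullWritable value, Context context)
      throws IOException, InterruptedException
  {
    // The Avro datum is wrapped in the AvroKey; the value is always NullWritable.
    GenericRecord record = key.datum();
    // "id" is a hypothetical field name used for illustration only.
    outKey.set(record.get("id").toString());
    context.write(outKey, ONE);
  }
}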


Author: Matthew Hayes