datafu.pig.bags
Class BagGroup

java.lang.Object
  extended by org.apache.pig.EvalFunc<T>
      extended by datafu.pig.util.ContextualEvalFunc<T>
          extended by datafu.pig.util.AliasableEvalFunc<org.apache.pig.data.DataBag>
              extended by datafu.pig.bags.BagGroup

public class BagGroup
extends AliasableEvalFunc<org.apache.pig.data.DataBag>

Performs an in-memory group operation on a bag. The first argument is the bag. The second argument is a projection of that bag onto the group key fields, for example input_bag.(k).

Example:

define BagGroup datafu.pig.bags.BagGroup();

data = LOAD 'input' AS (input_bag: bag {T: tuple(k: int, v: chararray)});
-- ({(1,A),(1,B),(2,A),(2,B),(2,C),(3,A)})

data2 = FOREACH data GENERATE BagGroup(input_bag, input_bag.(k)) as grouped;
-- data2: {grouped: {(group: int,input_bag: {T: (k: int,v: chararray)})}}
-- ({(1,{(1,A),(1,B)}),(2,{(2,A),(2,B),(2,C)}),(3,{(3,A)})})
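The grouping semantics of the example can be sketched in plain Java. This is only an illustration, not the DataFu implementation: the class BagGroupSketch, the Row type, and the groupByKey method are hypothetical names, a java.util.List stands in for a Pig DataBag, and the key is assumed to be a single int field, as in the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of an in-memory group-by over a bag of (k, v) rows.
// It does not use any Pig or DataFu classes.
public class BagGroupSketch {

    /** Hypothetical stand-in for a Pig tuple of the form (k: int, v: chararray). */
    static final class Row {
        final int k;
        final String v;

        Row(int k, String v) {
            this.k = k;
            this.v = v;
        }

        @Override
        public String toString() {
            return "(" + k + "," + v + ")";
        }
    }

    /**
     * Groups the rows of a bag by their key, holding every group in memory.
     * A LinkedHashMap keeps groups in order of first appearance of each key.
     */
    static Map<Integer, List<Row>> groupByKey(List<Row> bag) {
        Map<Integer, List<Row>> groups = new LinkedHashMap<>();
        for (Row row : bag) {
            // Create the group on first sight of the key, then append the row.
            groups.computeIfAbsent(row.k, key -> new ArrayList<>()).add(row);
        }
        return groups;
    }

    public static void main(String[] args) {
        // Same rows as the input bag in the Pig example.
        List<Row> inputBag = Arrays.asList(
            new Row(1, "A"), new Row(1, "B"),
            new Row(2, "A"), new Row(2, "B"), new Row(2, "C"),
            new Row(3, "A"));
        System.out.println(groupByKey(inputBag));
    }
}
```

With the input from the Pig example, keys 1, 2, and 3 map to groups of two, three, and one rows respectively, matching the shape of the grouped output. Because every group is built in memory, this approach suits bags that fit comfortably in the heap.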

Author:
wvaughan

Field Summary
 
Fields inherited from class org.apache.pig.EvalFunc
log, pigLogger, reporter, returnType
 
Constructor Summary
BagGroup()
           
 
Method Summary
 org.apache.pig.data.DataBag exec(org.apache.pig.data.Tuple input)
           
 org.apache.pig.impl.logicalLayer.schema.Schema getOutputSchema(org.apache.pig.impl.logicalLayer.schema.Schema input)
          Specify the output schema as in EvalFunc.outputSchema(Schema).
 
Methods inherited from class datafu.pig.util.AliasableEvalFunc
getBag, getBoolean, getDouble, getDouble, getFieldAliases, getFloat, getFloat, getInteger, getInteger, getLong, getLong, getObject, getPosition, getPosition, getPrefixedAliasName, getString, getString, outputSchema
 
Methods inherited from class datafu.pig.util.ContextualEvalFunc
getContextProperties, getInstanceName, getInstanceProperties, setUDFContextSignature
 
Methods inherited from class org.apache.pig.EvalFunc
finish, getArgToFuncMapping, getCacheFiles, getInputSchema, getLogger, getPigLogger, getReporter, getReturnType, getSchemaName, isAsynchronous, progress, setInputSchema, setPigLogger, setReporter, warn
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

BagGroup

public BagGroup()
Method Detail

getOutputSchema

public org.apache.pig.impl.logicalLayer.schema.Schema getOutputSchema(org.apache.pig.impl.logicalLayer.schema.Schema input)
Description copied from class: AliasableEvalFunc
Specify the output schema as in EvalFunc.outputSchema(Schema).

Specified by:
getOutputSchema in class AliasableEvalFunc<org.apache.pig.data.DataBag>
Returns:
outputSchema

exec

public org.apache.pig.data.DataBag exec(org.apache.pig.data.Tuple input)
                                 throws java.io.IOException
Specified by:
exec in class org.apache.pig.EvalFunc<org.apache.pig.data.DataBag>
Throws:
java.io.IOException


Matthew Hayes, Sam Shah