public class BagJoin extends AliasableEvalFunc<org.apache.pig.data.DataBag>
The format for invocation is BagJoin(bag, 'key', ...): each bag is followed by its key. This UDF expects every bag to be non-null and to have a corresponding key. Each key is the alias of the join field inside the bag that precedes it. By default, an 'inner' join is performed. To perform a 'left' or 'full' outer join instead, pass 'left' or 'full' to the constructor in the define statement.
Example:
define BagJoin datafu.pig.bags.BagJoin(); -- inner join
-- describe data:
-- data: {bag1: {(key1: chararray,value1: chararray)},bag2: {(key2: chararray,value2: int)}}
bag_joined = FOREACH data GENERATE BagJoin(bag1, 'key1', bag2, 'key2') as joined;
-- describe bag_joined:
-- bag_joined: {joined: {(bag1::key1: chararray, bag1::value1: chararray, bag2::key2: chararray, bag2::value2: int)}}
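
The three join types named above can be illustrated with a small stand-alone Java sketch. It is not the DataFu implementation: the class BagJoinSketch and its helpers are hypothetical, rows are fixed at two fields (key, value) to match the example schema, and the choices to pad unmatched sides with nulls and to never match null keys are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative only: models inner, left, and full outer join semantics on
// two in-memory "bags" of (key, value) rows. NOT the DataFu implementation.
class BagJoinSketch {

    // Assumption: every row has exactly two fields, so an unmatched side
    // is padded with two nulls.
    static final List<Object> NULL_ROW = Arrays.<Object>asList(null, null);

    // Hypothetical helper that builds one row from its fields.
    static List<Object> row(Object... fields) {
        return new ArrayList<>(Arrays.asList(fields));
    }

    // Concatenates a left row and a right row into one output row.
    static List<Object> concat(List<Object> a, List<Object> b) {
        List<Object> out = new ArrayList<>(a);
        out.addAll(b);
        return out;
    }

    // Joins two bags on field 0 of each row. joinType is "inner", "left",
    // or "full". Assumption: null keys never match anything.
    static List<List<Object>> join(List<List<Object>> bag1,
                                   List<List<Object>> bag2,
                                   String joinType) {
        if (!Arrays.asList("inner", "left", "full").contains(joinType)) {
            throw new IllegalArgumentException("Unknown join type: " + joinType);
        }
        boolean keepLeft = !joinType.equals("inner");
        boolean keepRight = joinType.equals("full");

        List<List<Object>> out = new ArrayList<>();
        boolean[] rightMatched = new boolean[bag2.size()];

        for (List<Object> left : bag1) {
            boolean matched = false;
            for (int j = 0; j < bag2.size(); j++) {
                List<Object> right = bag2.get(j);
                if (left.get(0) != null && left.get(0).equals(right.get(0))) {
                    out.add(concat(left, right));
                    matched = true;
                    rightMatched[j] = true;
                }
            }
            // Left and full joins keep left rows that found no partner.
            if (!matched && keepLeft) {
                out.add(concat(left, NULL_ROW));
            }
        }
        // Only a full join keeps right rows that found no partner.
        if (keepRight) {
            for (int j = 0; j < bag2.size(); j++) {
                if (!rightMatched[j]) {
                    out.add(concat(NULL_ROW, bag2.get(j)));
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<List<Object>> bag1 = Arrays.asList(row("a", "x"), row("b", "y"));
        List<List<Object>> bag2 = Arrays.asList(row("a", 1), row("c", 3));
        System.out.println("inner: " + join(bag1, bag2, "inner"));
        System.out.println("left:  " + join(bag1, bag2, "left"));
        System.out.println("full:  " + join(bag1, bag2, "full"));
    }
}
```

With the sample bags in main, the inner join keeps only the row for key a, the left join also keeps the unmatched b row from the first bag, and the full join additionally keeps the unmatched c row from the second bag.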

| Modifier and Type | Class and Description |
|---|---|
| static class | BagJoin.JoinType |

| Constructor and Description |
|---|
| BagJoin() |
| BagJoin(java.lang.String joinType) |

| Modifier and Type | Method and Description |
|---|---|
| org.apache.pig.data.DataBag | exec(org.apache.pig.data.Tuple input) |
| org.apache.pig.impl.logicalLayer.schema.Schema | getOutputSchema(org.apache.pig.impl.logicalLayer.schema.Schema input) Specify the output schema as in EvalFunc#outputSchema(Schema). |

Methods inherited from class AliasableEvalFunc: getBag, getBoolean, getDouble, getDouble, getFieldAliases, getFloat, getFloat, getInteger, getInteger, getLong, getLong, getObject, getPosition, getPosition, getPrefixedAliasName, getString, getString, outputSchema

Methods inherited from class ContextualEvalFunc: getContextProperties, getInstanceName, getInstanceProperties, onReady, setUDFContextSignature

Methods inherited from class org.apache.pig.EvalFunc: allowCompileTimeCalculation, finish, getArgToFuncMapping, getCacheFiles, getInputSchema, getLogger, getPigLogger, getReporter, getReturnType, getSchemaName, getSchemaType, getShipFiles, isAsynchronous, progress, setInputSchema, setPigLogger, setReporter, warn

public org.apache.pig.data.DataBag exec(org.apache.pig.data.Tuple input) throws java.io.IOException
Specified by: exec in class org.apache.pig.EvalFunc<org.apache.pig.data.DataBag>
Throws: java.io.IOException

public org.apache.pig.impl.logicalLayer.schema.Schema getOutputSchema(org.apache.pig.impl.logicalLayer.schema.Schema input)
Description copied from class: AliasableEvalFunc
Specify the output schema as in EvalFunc#outputSchema(Schema).
Specified by: getOutputSchema in class AliasableEvalFunc<org.apache.pig.data.DataBag>
Parameters: input - input schema