public class ZipBags
extends org.apache.pig.EvalFunc<org.apache.pig.data.DataBag>
-- input:
-- ({(1,2),(3,4),(5,6)},{(7,8),(9,10),(11,12)})
input = LOAD 'input' AS (OUTER: tuple(B1: bag {a:INT,b:INT}, B2: bag{c:INT,d:INT}));
-- output:
-- ({(1,2,7,8),(3,4,9,10),(5,6,11,12)})
output = FOREACH input GENERATE ZipBags(B1,B2);
For this to work as expected, each bag should be the same length. It will run as long as
the first bag is the shortest; however, this may not be the desired behavior.

Constructor and Description |
---|
ZipBags() |
Modifier and Type | Method and Description |
---|---|
org.apache.pig.data.DataBag | exec(org.apache.pig.data.Tuple input) |
org.apache.pig.impl.logicalLayer.schema.Schema | outputSchema(org.apache.pig.impl.logicalLayer.schema.Schema input) |
Methods inherited from class org.apache.pig.EvalFunc: allowCompileTimeCalculation, finish, getArgToFuncMapping, getCacheFiles, getInputSchema, getLogger, getPigLogger, getReporter, getReturnType, getSchemaName, getSchemaType, getShipFiles, isAsynchronous, progress, setInputSchema, setPigLogger, setReporter, setUDFContextSignature, warn
public org.apache.pig.data.DataBag exec(org.apache.pig.data.Tuple input) throws java.io.IOException
Specified by: exec in class org.apache.pig.EvalFunc<org.apache.pig.data.DataBag>
Throws: java.io.IOException
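
The class description says that tuples are paired by position and that the call only runs when the first bag is the shortest. The following is a minimal plain-Java sketch of that behavior, not the actual implementation of exec: it uses java.util lists in place of Pig's DataBag and Tuple, the class name ZipSketch and the method zip are hypothetical, and driving the iteration from the first bag is an assumption inferred from the caveat in the description.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Illustration only: plain lists stand in for Pig bags and tuples.
final class ZipSketch {

    // Concatenates the tuples found at the same position in both bags.
    // Assumption: the first bag drives the iteration.
    static List<List<Object>> zip(List<? extends List<?>> first,
                                  List<? extends List<?>> second) {
        List<List<Object>> out = new ArrayList<>();
        Iterator<? extends List<?>> other = second.iterator();
        for (List<?> left : first) {
            // Throws NoSuchElementException if the second bag runs out first.
            List<?> right = other.next();
            List<Object> joined = new ArrayList<>(left);
            joined.addAll(right);
            out.add(joined);
        }
        // Tuples remaining in a longer second bag are ignored.
        return out;
    }

    public static void main(String[] args) {
        List<List<Object>> zipped = zip(
            List.of(List.of(1, 2), List.of(3, 4), List.of(5, 6)),
            List.of(List.of(7, 8), List.of(9, 10), List.of(11, 12)));
        System.out.println(zipped); // prints [[1, 2, 7, 8], [3, 4, 9, 10], [5, 6, 11, 12]]
    }
}
```

In this sketch a longer second bag is truncated without any warning, while a shorter second bag fails outright, which is one way to read why equal-length bags are recommended.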
public org.apache.pig.impl.logicalLayer.schema.Schema outputSchema(org.apache.pig.impl.logicalLayer.schema.Schema input)
Overrides: outputSchema in class org.apache.pig.EvalFunc<org.apache.pig.data.DataBag>