public class ReverseEnumerate extends SimpleEvalFunc<org.apache.pig.data.DataBag>
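Enumerates a bag, appending to each tuple its index within the bag, with indices produced in descending order. For example:
{(A),(B),(C),(D)} => {(A,3),(B,2),(C,1),(D,0)}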
The optional first constructor parameter sets the starting index of the count, that is, the index assigned to the last tuple in the bag. Because reverse counting requires the size of the bag, this UDF does not implement the Accumulator interface and incurs a slight performance penalty from materializing the DataBag.
Example:
define ReverseEnumerate datafu.pig.bags.ReverseEnumerate('1');
-- input:
-- ({(100),(200),(300),(400)})
input = LOAD 'input' as (B: bag{T: tuple(v2:INT)});
-- output:
-- ({(100,4),(200,3),(300,2),(400,1)})
output = FOREACH input GENERATE ReverseEnumerate(B);
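The index arithmetic behind the example above can be sketched in plain Java. This is a minimal illustration, not the DataFu implementation: the class name `ReverseEnumerateSketch` and the use of `List` and `String` in place of Pig's `DataBag` and `Tuple` are assumptions made so the snippet runs without Pig on the classpath.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for the UDF: plain lists replace Pig bags and tuples.
public class ReverseEnumerateSketch {

    /** Pairs each element with a descending index; the last element receives {@code start}. */
    public static <T> List<String> reverseEnumerate(List<T> items, int start) {
        // The size must be known before the first element is emitted,
        // which is why the whole bag has to be materialized first.
        int index = items.size() - 1 + start;
        List<String> out = new ArrayList<>();
        for (T item : items) {
            out.add("(" + item + "," + index + ")");
            index--;
        }
        return out;
    }

    public static void main(String[] args) {
        // Mirrors the Pig example: start index '1'.
        System.out.println(reverseEnumerate(Arrays.asList(100, 200, 300, 400), 1));
        // Mirrors the class description: start index 0.
        System.out.println(reverseEnumerate(Arrays.asList("A", "B", "C", "D"), 0));
    }
}
```

With a start index of 1 the first of four elements receives 4 and the last receives 1, matching the Pig output shown above.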
Constructor and Description |
---|
ReverseEnumerate() |
ReverseEnumerate(java.lang.String start) |
Modifier and Type | Method and Description |
---|---|
org.apache.pig.data.DataBag | call(org.apache.pig.data.DataBag inputBag) |
org.apache.pig.impl.logicalLayer.schema.Schema | outputSchema(org.apache.pig.impl.logicalLayer.schema.Schema input) Overrides outputSchema so the input schema can be verified at Pig compile time instead of at runtime. |
exec, getReturnType
getContextProperties, getInstanceName, getInstanceProperties, onReady, setUDFContextSignature
public ReverseEnumerate()
public ReverseEnumerate(java.lang.String start)
public org.apache.pig.data.DataBag call(org.apache.pig.data.DataBag inputBag) throws java.io.IOException
Throws: java.io.IOException
public org.apache.pig.impl.logicalLayer.schema.Schema outputSchema(org.apache.pig.impl.logicalLayer.schema.Schema input)
SimpleEvalFunc
Overrides: outputSchema in class SimpleEvalFunc<org.apache.pig.data.DataBag>
Parameters: input - input schema