Uses reflection to make writing simple wrapper Pig UDFs easier.
For example, writing a simple string trimming UDF might look like
this:
  public class TRIM extends EvalFunc<String>
  {
    public String exec(Tuple input) throws IOException
    {
      if (input.size() != 1)
        throw new IllegalArgumentException("requires a parameter");
      try {
        Object o = input.get(0);
        // Let null through: instanceof is false for null, so without this
        // guard the null handling below could never be reached.
        if (o != null && !(o instanceof String))
          throw new IllegalArgumentException("expected a string");
        String str = (String)o;
        return (str == null) ? null : str.trim();
      }
      catch (Exception e) {
        throw WrappedIOException.wrap("error...", e);
      }
    }
  }
There is a lot of boilerplate to check the number of arguments and
the parameter types in the tuple.
Instead, with this class, you can derive from SimpleEvalFunc and
create a call() method (not exec!), declaring its parameters as you
would for a regular function. The class handles all the argument
checking and exception wrapping for you. So your code would be:
  public class TRIM2 extends SimpleEvalFunc<String>
  {
    public String call(String s)
    {
      return (s != null) ? s.trim() : null;
    }
  }
An example of this UDF in action with Pig:
  grunt> a = load 'test' as (x:chararray, y:chararray); dump a;
  (1 , 2)
  grunt> b = foreach a generate TRIM2(x); dump b;
  (1)
  grunt> c = foreach a generate TRIM2((int)x); dump c;
  datafu.pig.util.TRIM2(java.lang.String): argument type mismatch [#1]; expected java.lang.String, got java.lang.Integer
  grunt> d = foreach a generate TRIM2(x, y); dump d;
  datafu.pig.util.TRIM2(java.lang.String): got 2 arguments, expected 1.
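To illustrate the idea behind the argument checking shown above, here is a minimal sketch of how a base class could use reflection to locate a call() method and validate its arguments. It is not the actual SimpleEvalFunc implementation: ReflectiveFunc, TrimSketch and ReflectiveFuncDemo are hypothetical names, a plain List<Object> stands in for the Pig Tuple, the error messages only imitate the ones in the session above, and only reference-typed parameters are handled.

```java
import java.lang.reflect.Method;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for SimpleEvalFunc; a List<Object> plays the role of a Pig Tuple.
abstract class ReflectiveFunc<T> {
    private final Method call;

    protected ReflectiveFunc() {
        Method found = null;
        // Look up the subclass's public call(...) method once, at construction time.
        for (Method m : getClass().getMethods()) {
            if (m.getName().equals("call") && !m.isBridge()) {
                found = m;
                break;
            }
        }
        if (found == null) {
            throw new IllegalStateException(
                getClass().getName() + " must declare a public call() method");
        }
        found.setAccessible(true); // the subclass itself may not be public
        this.call = found;
    }

    // Builds a label such as "TrimSketch(java.lang.String)" for error messages.
    private String signature() {
        StringBuilder sb = new StringBuilder(getClass().getName()).append('(');
        Class<?>[] types = call.getParameterTypes();
        for (int i = 0; i < types.length; i++) {
            if (i > 0) sb.append(", ");
            sb.append(types[i].getName());
        }
        return sb.append(')').toString();
    }

    @SuppressWarnings("unchecked")
    public final T exec(List<Object> input) {
        Class<?>[] types = call.getParameterTypes();
        // Check the number of arguments against call()'s parameter list.
        if (input.size() != types.length) {
            throw new IllegalArgumentException(signature() + ": got " + input.size()
                + " arguments, expected " + types.length + ".");
        }
        Object[] args = input.toArray();
        // Check each non-null argument against the declared parameter type.
        for (int i = 0; i < args.length; i++) {
            if (args[i] != null && !types[i].isInstance(args[i])) {
                throw new IllegalArgumentException(signature()
                    + ": argument type mismatch [#" + (i + 1) + "]; expected "
                    + types[i].getName() + ", got " + args[i].getClass().getName());
            }
        }
        try {
            return (T) call.invoke(this, args);
        } catch (ReflectiveOperationException e) {
            // Wrap failures from the reflective call, much as exec() wraps exceptions.
            throw new RuntimeException("error invoking " + signature(), e);
        }
    }
}

// The subclass only declares call() with ordinary parameters.
class TrimSketch extends ReflectiveFunc<String> {
    public String call(String s) {
        return (s != null) ? s.trim() : null;
    }
}

class ReflectiveFuncDemo {
    public static void main(String[] args) {
        TrimSketch trim = new TrimSketch();
        System.out.println(trim.exec(Arrays.<Object>asList(" 1 ")));
        try {
            trim.exec(Arrays.<Object>asList("1", "2")); // wrong number of arguments
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Looking up the method once in the constructor keeps the per-record cost down to the argument checks and a single reflective invocation, which is the kind of trade-off a reflection-based wrapper like this has to make.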