datafu.pig.stats
Class WilsonBinConf

java.lang.Object
  extended by org.apache.pig.EvalFunc<T>
      extended by datafu.pig.util.SimpleEvalFunc<org.apache.pig.data.Tuple>
          extended by datafu.pig.stats.WilsonBinConf

public class WilsonBinConf
extends SimpleEvalFunc<org.apache.pig.data.Tuple>

Computes the Wilson score confidence interval for a binomial proportion.

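The constructor takes the alpha parameter, which acts as the significance level: the interval has confidence level 1 - alpha, so the 0.10 used in the example below yields a 90% interval. The UDF itself takes two arguments: the number of positive (success) outcomes and the total number of observations. It returns the (lower, upper) bounds of the confidence interval as a tuple.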

Example:

 -- the Wilsonian binomial proportion confidence interval for scoring
 %declare WILSON_ALPHA 0.10

 define WilsonBinConf datafu.pig.stats.WilsonBinConf('$WILSON_ALPHA');

 bar = FOREACH foo GENERATE WilsonBinConf(successes, totals).lower as score;
 quux = ORDER bar BY score DESC;
 top = LIMIT quux 10;
 
 


Field Summary
 
Fields inherited from class org.apache.pig.EvalFunc
log, pigLogger, reporter, returnType
 
Constructor Summary
WilsonBinConf(double alpha)
           
WilsonBinConf(java.lang.String alpha)
           
 
Method Summary
 org.apache.pig.data.Tuple binconf(java.lang.Long x, java.lang.Long n)
           
 org.apache.pig.data.Tuple call(java.lang.Number x, java.lang.Number n)
           
 org.apache.pig.impl.logicalLayer.schema.Schema outputSchema(org.apache.pig.impl.logicalLayer.schema.Schema input)
          Overrides outputSchema so that the input schema can be verified at Pig compile time instead of at runtime.
 
Methods inherited from class datafu.pig.util.SimpleEvalFunc
exec, getReturnType
 
Methods inherited from class org.apache.pig.EvalFunc
finish, getArgToFuncMapping, getCacheFiles, getInputSchema, getLogger, getPigLogger, getReporter, getSchemaName, isAsynchronous, progress, setInputSchema, setPigLogger, setReporter, setUDFContextSignature, warn
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

WilsonBinConf

public WilsonBinConf(double alpha)

WilsonBinConf

public WilsonBinConf(java.lang.String alpha)

Method Detail

call

public org.apache.pig.data.Tuple call(java.lang.Number x,
                                      java.lang.Number n)
                               throws java.io.IOException
Throws:
java.io.IOException

binconf

public org.apache.pig.data.Tuple binconf(java.lang.Long x,
                                         java.lang.Long n)
                                  throws java.io.IOException
Parameters:
x - The number of positive (success) outcomes
n - The number of observations
Returns:
A tuple holding the (lower, upper) bounds of the confidence interval
Throws:
java.io.IOException
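
For orientation, the textbook Wilson score interval can be sketched in plain Java as below. This is an illustrative sketch, not the DataFu source: the class name WilsonSketch and the method wilson are hypothetical, and the critical value z is passed in directly (about 1.6449 for alpha = 0.10, about 1.96 for alpha = 0.05) because the JDK has no built-in inverse normal CDF. Whether binconf applies exactly this variant of the formula is an assumption.

```java
/** Illustrative sketch of the Wilson score interval; not the DataFu implementation. */
final class WilsonSketch {

    /**
     * Returns {lower, upper} for x successes out of n observations.
     * z is the two-sided standard normal critical value for the chosen alpha
     * (supplied by the caller; this sketch does not derive it from alpha).
     */
    static double[] wilson(long x, long n, double z) {
        if (n <= 0 || x < 0 || x > n) {
            throw new IllegalArgumentException("require n > 0 and 0 <= x <= n");
        }
        double p = (double) x / n;            // observed success proportion
        double z2 = z * z;
        double center = p + z2 / (2.0 * n);   // proportion shifted toward 1/2
        double spread = z * Math.sqrt((p * (1.0 - p) + z2 / (4.0 * n)) / n);
        double denom = 1.0 + z2 / n;          // common normalizing factor
        return new double[] {(center - spread) / denom, (center + spread) / denom};
    }

    public static void main(String[] args) {
        // 10 successes out of 20 observations at roughly the 95% level.
        double[] ci = wilson(10, 20, 1.96);
        System.out.printf("lower=%.4f upper=%.4f%n", ci[0], ci[1]);
    }
}
```

Ranking by the lower bound, as the Pig example in the class description does, favors items whose success rate is both high and backed by enough observations, since small n widens the interval and pulls the lower bound down.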

outputSchema

public org.apache.pig.impl.logicalLayer.schema.Schema outputSchema(org.apache.pig.impl.logicalLayer.schema.Schema input)
Description copied from class: SimpleEvalFunc
Overrides outputSchema so that the input schema can be verified at Pig compile time instead of at runtime.

Overrides:
outputSchema in class SimpleEvalFunc<org.apache.pig.data.Tuple>
Parameters:
input - input schema
Returns:
The result of calling super.outputSchema, in case the schema was defined elsewhere


Matthew Hayes, Sam Shah