datafu.pig.sessions
Class SessionCount
java.lang.Object
org.apache.pig.EvalFunc<T>
org.apache.pig.AccumulatorEvalFunc<java.lang.Long>
datafu.pig.sessions.SessionCount
- All Implemented Interfaces:
- org.apache.pig.Accumulator<java.lang.Long>
public class SessionCount
- extends org.apache.pig.AccumulatorEvalFunc<java.lang.Long>
Performs a count of events, ignoring events which occur within the
same time window.
This is useful for tasks such as counting the number of page views per user since it:
a) prevents reloads and go-backs from overcounting actual views
b) captures the notion that views across multiple sessions are more meaningful
Input must be sorted by time in ascending order for this UDF to work.
Example:
%declare TIME_WINDOW 10m

define SessionCount datafu.pig.sessions.SessionCount('$TIME_WINDOW');

views = LOAD 'views' as (user_id:int, page_id:int, time:chararray);
views_grouped = GROUP views by (user_id, page_id);
view_counts = FOREACH views_grouped {
  views = order views by time;
  generate group.user_id as user_id,
           group.page_id as page_id,
           SessionCount(views.(time)) as count;
}
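The counting rule described above can be illustrated outside of Pig with a small standalone sketch. This is not the DataFu source: the class name SessionCountSketch and the helpers parseWindow and count are hypothetical, and the sketch assumes that an event is counted only when it occurs more than one time window after the immediately preceding event, which is one plausible reading of "ignoring events which occur within the same time window". It also assumes that a spec such as 10m can be read as an ISO-8601 duration (PT10M).

```java
import java.time.Duration;
import java.time.Instant;
import java.util.List;

public class SessionCountSketch {

    // Hypothetical helper: turns a spec such as "10m" into a Duration by
    // treating it as the time part of an ISO-8601 duration ("PT10M").
    public static Duration parseWindow(String timeSpec) {
        return Duration.parse("PT" + timeSpec.toUpperCase());
    }

    // Counts events in a time-sorted list, skipping any event that falls
    // within `window` of the event immediately before it (an assumption
    // about the rule, not the confirmed DataFu behavior).
    public static long count(List<Instant> sortedTimes, Duration window) {
        long count = 0;
        Instant previous = null;
        for (Instant t : sortedTimes) {
            if (previous != null && t.isBefore(previous)) {
                // The documentation requires ascending input, so reject anything else.
                throw new IllegalArgumentException("input is not sorted by time");
            }
            if (previous == null || t.isAfter(previous.plus(window))) {
                count++;
            }
            previous = t;
        }
        return count;
    }

    public static void main(String[] args) {
        Duration window = parseWindow("10m");
        List<Instant> views = List.of(
            Instant.parse("2012-01-01T00:00:00Z"),  // first view: counted
            Instant.parse("2012-01-01T00:03:00Z"),  // 3 minutes later: ignored
            Instant.parse("2012-01-01T00:30:00Z")); // 27 minutes later: counted
        System.out.println(count(views, window));   // prints 2
    }
}
```

Under these assumptions a reload three minutes after the first view does not add to the count, while a return visit half an hour later does.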
Fields inherited from class org.apache.pig.EvalFunc
log, pigLogger, reporter, returnType
Methods inherited from class org.apache.pig.AccumulatorEvalFunc
exec
Methods inherited from class org.apache.pig.EvalFunc
finish, getArgToFuncMapping, getCacheFiles, getInputSchema, getLogger, getPigLogger, getReporter, getReturnType, getSchemaName, isAsynchronous, outputSchema, progress, setInputSchema, setPigLogger, setReporter, setUDFContextSignature, warn
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
SessionCount
public SessionCount(java.lang.String timeSpec)
accumulate
public void accumulate(org.apache.pig.data.Tuple input)
throws java.io.IOException
- Specified by: accumulate in interface org.apache.pig.Accumulator<java.lang.Long>
- Specified by: accumulate in class org.apache.pig.AccumulatorEvalFunc<java.lang.Long>
- Throws: java.io.IOException
getValue
public java.lang.Long getValue()
- Specified by: getValue in interface org.apache.pig.Accumulator<java.lang.Long>
- Specified by: getValue in class org.apache.pig.AccumulatorEvalFunc<java.lang.Long>
cleanup
public void cleanup()
- Specified by: cleanup in interface org.apache.pig.Accumulator<java.lang.Long>
- Specified by: cleanup in class org.apache.pig.AccumulatorEvalFunc<java.lang.Long>
Matthew Hayes, Sam Shah