public class Hasher extends SimpleEvalFunc<java.lang.String>
'murmur3-32', [optional seed] or 'murmur3-128', [optional seed]
: Returns a murmur3 hash of the given
length. Murmur3 is fast and has exceptionally good statistical
properties; it's a good choice if all you need is good mixing of the
inputs. It is not cryptographically secure; that is, given an
output value from murmur3, there are efficient algorithms to find an input
yielding the same output value. Supply the seed as a string that
Integer.decode can handle. Examples: datafu.pig.hash.Hasher('murmur3-32', '0x56789abc')
or datafu.pig.hash.Hasher('murmur3-32', '-12345678').

'sip24', [optional seed]
: Returns a 64-bit SipHash-2-4 hash. SipHash is competitive in performance with Murmur3,
and is simpler and faster than the cryptographic algorithms below. When
used with a seed, it can be considered cryptographically secure: given
the output from a sip24 instance but not the seed used, we cannot
efficiently craft a message yielding the same output from that instance. To
supply a seed, pass in a 32-character string representing the seed in
hexadecimal. If none is given, k = '00010203…0e0f' is used.

'adler32'
: Returns an Adler-32 checksum (32 hash bits) by delegating to Java's Adler32 Checksum.

'crc32'
: Returns a CRC-32 checksum (32 hash bits) by delegating to Java's CRC32 Checksum.

'md5'
: Returns an MD5 hash (128 hash bits) using Java's MD5 MessageDigest.

'sha1'
: Returns a SHA-1 hash (160 hash bits) using Java's SHA-1 MessageDigest.

'sha256'
: Returns a SHA-256 hash (256 hash bits) using Java's SHA-256 MessageDigest.

'sha512'
: Returns a SHA-512 hash (512 hash bits) using Java's SHA-512 MessageDigest.

'good-{integer number of bits}'
: Returns a general-purpose,
non-cryptographic-strength, streaming hash function that produces
hash codes of length at least minimumBits (the integer given in the name). Users without specific
compatibility requirements and who do not persist the hash codes are
encouraged to choose this hash function. (Cryptographers, like dieticians
and fashionistas, occasionally realize that We've Been Doing it Wrong
This Whole Time. Using 'good-*' lets you track What the Experts From
(Milan|NIH|IEEE) Say To (Wear|Eat|Hash With) this Fall.) Expect values
returned by this hasher to change run-to-run.

Modifier and Type | Field and Description |
---|---|
protected com.google.common.hash.HashFunction | hash_func |
protected static java.lang.String | SEEDED_HASH_NAMES |
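
The hash-bit counts quoted for the checksum and digest algorithms can be checked against the JDK classes that the list says Hasher delegates to. The following is a minimal sketch using only the standard library; the class `DigestLengths` and its helper methods are hypothetical names introduced for illustration, and the sketch does not reproduce how Hasher encodes its output string.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.zip.Adler32;
import java.util.zip.CRC32;

// Hypothetical helper class, not part of DataFu.
final class DigestLengths {
    /** Number of hash bits produced by a JDK MessageDigest such as "SHA-512". */
    static int digestBits(String jdkName) {
        try {
            // getDigestLength() reports bytes, so convert to bits.
            return MessageDigest.getInstance(jdkName).getDigestLength() * 8;
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalArgumentException("Unknown digest: " + jdkName, e);
        }
    }

    /** CRC-32 of the UTF-8 bytes of s, via java.util.zip.CRC32. */
    static long crc32(String s) {
        CRC32 crc = new CRC32();
        crc.update(s.getBytes(StandardCharsets.UTF_8));
        return crc.getValue();
    }

    /** Adler-32 of the UTF-8 bytes of s, via java.util.zip.Adler32. */
    static long adler32(String s) {
        Adler32 adler = new Adler32();
        adler.update(s.getBytes(StandardCharsets.UTF_8));
        return adler.getValue();
    }

    public static void main(String[] args) {
        for (String name : new String[] {"MD5", "SHA-1", "SHA-256", "SHA-512"}) {
            System.out.println(name + ": " + digestBits(name) + " hash bits");
        }
        System.out.printf("crc32(\"123456789\")   = %08x%n", crc32("123456789"));
        System.out.printf("adler32(\"123456789\") = %08x%n", adler32("123456789"));
    }
}
```

Both checksums fit in 32 bits even though `getValue()` returns a `long`, which is why the sketch prints them as eight hexadecimal digits.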
Constructor and Description |
---|
Hasher(): Generates hash values according to murmur3-32, a non-cryptographic-strength hash function with good mixing. |
Hasher(java.lang.String algorithm): Generates hash values according to the hash function given by algorithm. |
Hasher(java.lang.String algorithm, java.lang.String seed): Generates hash values according to the hash function given by algorithm, with initial seed given by the seed. |
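
For the murmur3 variants, the class description says the seed string must be something Integer.decode can handle. The sketch below shows which strings that standard-library method accepts; `SeedDecoding`, `decodeSeed`, and `isDecodable` are hypothetical names used only for illustration, and the sketch makes no claim about how Hasher reports a rejected seed.

```java
// Hypothetical helper class, not part of DataFu.
final class SeedDecoding {
    /** Parses a seed string exactly as Integer.decode does (decimal, 0x/# hex, or leading-0 octal). */
    static int decodeSeed(String seed) {
        return Integer.decode(seed);
    }

    /** Returns true if Integer.decode accepts the string as a 32-bit signed integer. */
    static boolean isDecodable(String seed) {
        try {
            Integer.decode(seed);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String[] candidates = {"0x56789abc", "-12345678", "0xffffffff", "not-a-seed"};
        for (String s : candidates) {
            String result = isDecodable(s) ? String.valueOf(decodeSeed(s)) : "rejected";
            System.out.println(s + " -> " + result);
        }
    }
}
```

One consequence worth noting: Integer.decode parses a sign followed by a magnitude, so a hexadecimal literal above `0x7fffffff` is out of range and has to be written in its negative form instead.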
Modifier and Type | Method and Description |
---|---|
java.lang.String | call(java.lang.String val) |
static int | intFromHex(java.lang.String hex_str) |
static long | longFromHex(java.lang.String hex_str) |
protected void | makeHashFunc(java.lang.String algorithm, java.lang.String seed): Returns the HashFunction named by algorithm, with initial seed given by the seed. |
exec, getReturnType, outputSchema
getContextProperties, getInstanceName, getInstanceProperties, onReady, setUDFContextSignature
protected com.google.common.hash.HashFunction hash_func
protected static final java.lang.String SEEDED_HASH_NAMES
public Hasher() throws java.lang.IllegalArgumentException, java.lang.RuntimeException
Throws:
java.lang.IllegalArgumentException - for an internal error
java.lang.RuntimeException - for an internal error

public Hasher(java.lang.String algorithm) throws java.lang.IllegalArgumentException, java.lang.RuntimeException
Parameters:
algorithm - the hash algorithm to use
Throws:
java.lang.IllegalArgumentException - for an invalid algorithm
java.lang.RuntimeException - for an internal error
See Also:
makeHashFunc(String algorithm)
public Hasher(java.lang.String algorithm, java.lang.String seed) throws java.lang.IllegalArgumentException, java.lang.RuntimeException
Parameters:
algorithm - the hash algorithm to use
seed - the initial seed to use
Throws:
java.lang.IllegalArgumentException - for an invalid algorithm or seed
java.lang.RuntimeException - when the seed cannot be parsed, or for another internal error
See Also:
makeHashFunc(String algorithm, String seed)
protected void makeHashFunc(java.lang.String algorithm, java.lang.String seed) throws java.lang.IllegalArgumentException, java.lang.RuntimeException
Parameters:
algorithm - the hash algorithm to use
seed - the initial seed to use
Throws:
java.lang.IllegalArgumentException - for an invalid seed given the algorithm
java.lang.RuntimeException - when the seed cannot be parsed

public static long longFromHex(java.lang.String hex_str)
public static int intFromHex(java.lang.String hex_str)
public java.lang.String call(java.lang.String val)
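
The sip24 entry in the class description asks for a seed of 32 hexadecimal characters, which corresponds to a 128-bit key. As a rough sketch of what validating such a string could look like, the code below checks the length and alphabet and decodes the seed into 16 bytes. `SipSeedCheck` and `parseHexKey` are hypothetical names, the sample seed is made up, and nothing here is taken from the implementation of `longFromHex`, `intFromHex`, or `makeHashFunc`; in particular, the byte order Hasher uses when turning the seed into key words is not assumed.

```java
// Hypothetical helper class, not part of DataFu.
final class SipSeedCheck {
    /** Decodes a 32-character hexadecimal string into a 16-byte key, rejecting anything else. */
    static byte[] parseHexKey(String hex) {
        if (hex == null || !hex.matches("[0-9a-fA-F]{32}")) {
            throw new IllegalArgumentException("expected exactly 32 hexadecimal characters");
        }
        byte[] key = new byte[16];
        for (int i = 0; i < key.length; i++) {
            // Each pair of hex digits becomes one byte, left to right.
            key[i] = (byte) Integer.parseInt(hex.substring(2 * i, 2 * i + 2), 16);
        }
        return key;
    }

    public static void main(String[] args) {
        // Made-up seed used purely for illustration.
        byte[] key = parseHexKey("0123456789abcdef0123456789abcdef");
        System.out.println("key length: " + (key.length * 8) + " bits");
    }
}
```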