public abstract class AbstractStableDistributionFunction extends LSH
All p-stable LSH functions are parameterized with a quantization parameter (called w or r in the literature, depending on the source). Consider the following excerpt from Datar, M.; Immorlica, N.; Indyk, P.; Mirrokni, V.S. (2004). "Locality-Sensitive Hashing Scheme Based on p-Stable Distributions". Proceedings of the Symposium on Computational Geometry.
Decreasing the width of the projection (w) decreases the probability of collision for any two points. Thus, it has the same effect as increasing k. As a result, we would like to set w as small as possible and in this way decrease the number of projections we need to make.
In the literature, the quantization parameter (or width of the projection) is found empirically, given a sample of the data and the likely threshold for the metric. Tuning this parameter is very important for the performance of this algorithm. For more information, see the paper by Datar et al. cited above.
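To make the role of w concrete, here is a minimal, self-contained sketch of the hash family described by Datar et al., h(v) = floor((a . v + b) / w), where the entries of a are drawn from a p-stable distribution (Gaussian here, which corresponds to the L2 metric) and b is uniform in [0, w). It is not this class's implementation: the class name `StableHashSketch`, the use of `java.util.Random` in place of `RandomGenerator`, and the sample points are all assumptions made for illustration. The `collisionRate` helper estimates how often two fixed points collide for a given w, which is one simple way to explore the parameter empirically.

```java
import java.util.Random;

// Illustrative sketch only; names and structure are hypothetical, not the library's code.
public class StableHashSketch {
    private final double[] a; // random projection direction, Gaussian entries (2-stable)
    private final double b;   // random offset, uniform in [0, w)
    private final double w;   // quantization parameter (projection width)

    public StableHashSketch(int dim, double w, Random rand) {
        if (w <= 0) {
            throw new IllegalArgumentException("w must be positive");
        }
        this.w = w;
        this.a = new double[dim];
        for (int i = 0; i < dim; i++) {
            a[i] = rand.nextGaussian();
        }
        this.b = rand.nextDouble() * w;
    }

    // h(v) = floor((a . v + b) / w)
    public long apply(double[] v) {
        double dot = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * v[i];
        }
        return (long) Math.floor((dot + b) / w);
    }

    // Fraction of independently drawn hash functions under which x and y collide.
    public static double collisionRate(double[] x, double[] y, double w, int trials, long seed) {
        Random rand = new Random(seed);
        int hits = 0;
        for (int t = 0; t < trials; t++) {
            StableHashSketch h = new StableHashSketch(x.length, w, rand);
            if (h.apply(x) == h.apply(y)) {
                hits++;
            }
        }
        return (double) hits / trials;
    }

    public static void main(String[] args) {
        double[] x = {0.0, 0.0, 0.0};
        double[] y = {1.0, 0.0, 0.0}; // L2 distance 1 from x
        for (double w : new double[] {0.5, 1.0, 4.0, 16.0}) {
            System.out.printf("w=%.1f collision rate=%.3f%n", w, collisionRate(x, y, w, 20000, 42L));
        }
    }
}
```

Running `main` should show the estimated collision rate rising as w grows, consistent with the excerpt: a smaller w makes each projection more selective, so fewer projections are needed to separate distant points.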
Constructor and Description |
---|
AbstractStableDistributionFunction(int dim, double w, org.apache.commons.math.random.RandomGenerator rand) Constructs a new instance. |
Modifier and Type | Method and Description |
---|---|
long | apply(org.apache.commons.math.linear.RealVector vector) Compute the LSH for a given vector. |
protected abstract Sampler | getSampler() The sampler determines the metric which this LSH is associated with. |
void | reset(int dim, double w) |
Methods inherited from class LSH: getDim, getRandomGenerator
public AbstractStableDistributionFunction(int dim, double w, org.apache.commons.math.random.RandomGenerator rand) throws org.apache.commons.math.MathException
Parameters:
dim - The dimension of the vectors to be hashed
w - A double representing the quantization parameter (also known as the projection width)
rand - The random generator used
Throws:
org.apache.commons.math.MathException - MathException
public void reset(int dim, double w) throws org.apache.commons.math.MathException
Throws:
org.apache.commons.math.MathException
protected abstract Sampler getSampler()
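The note that the sampler determines the metric can be illustrated with a short sketch. In the p-stable scheme, the entries of the projection vector are drawn from a distribution that is stable for the target norm: the Gaussian distribution is 2-stable (L2 distance) and the Cauchy distribution is 1-stable (L1 distance). The `Sampler` interface of this library is not shown on this page, so the example below uses `java.util.function.DoubleSupplier` as a stand-in; the class name `StableSamplerSketch` and its methods are hypothetical.

```java
import java.util.Arrays;
import java.util.Random;
import java.util.function.DoubleSupplier;

// Illustrative sketch only; DoubleSupplier stands in for the library's Sampler type.
public class StableSamplerSketch {
    // Standard normal draws: Gaussian is 2-stable, so it pairs with L2 distance.
    public static DoubleSupplier gaussian(Random rand) {
        return rand::nextGaussian;
    }

    // Standard Cauchy draws via the inverse CDF: Cauchy is 1-stable, so it pairs with L1 distance.
    public static DoubleSupplier cauchy(Random rand) {
        return () -> Math.tan(Math.PI * (rand.nextDouble() - 0.5));
    }

    // Fill a projection vector of length dim with draws from the given sampler.
    public static double[] projection(int dim, DoubleSupplier sampler) {
        double[] a = new double[dim];
        for (int i = 0; i < dim; i++) {
            a[i] = sampler.getAsDouble();
        }
        return a;
    }

    public static void main(String[] args) {
        Random rand = new Random(7L);
        System.out.println("L2 projection: " + Arrays.toString(projection(4, gaussian(rand))));
        System.out.println("L1 projection: " + Arrays.toString(projection(4, cauchy(rand))));
    }
}
```

Under this reading, a concrete subclass would only need to choose which distribution its sampler draws from; the projection and quantization steps stay the same for every metric.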