Samples a hash function from an LSH family for the similarity `sim` defined over the function space ``L^p_{\\mu}(\\Omega)``. `sim` may be one of the following:
$(
    join(
        ["- `" * sim * "`" for sim in (_valid_ChebHash_similarities[1] .|>
                                       string |>
                                       collect |>
                                       sort!)
        ],
        "\n"
    )
)

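To see what the interpolated expression above renders to, here is a minimal sketch using a placeholder collection. The names `example_sims` and `rendered` are hypothetical, and the symbols in the tuple are stand-ins; the actual contents of `_valid_ChebHash_similarities[1]` are not shown in this diff.

```julia
# Placeholder stand-in for `_valid_ChebHash_similarities[1]` (an assumption).
example_sims = (:cossim, :L2, :L1)

# Same pipeline as in the docstring: stringify, collect, sort, then
# format each name as a Markdown bullet and join with newlines.
rendered = join(
    ["- `" * sim * "`" for sim in (example_sims .|> string |> collect |> sort!)],
    "\n"
)

print(rendered)
```

The result is one Markdown bullet per similarity name, sorted as strings, which Julia's docstring renderer then displays as a list.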
`ChebHash` works by approximating a function by Chebyshev polynomials. You can choose the degree of the approximation to trade off speed against how desirable the resulting hash collision probabilities are.

!!! info "ChebHash limitations"
    `ChebHash` can only hash function spaces of the form ``L^2([a,b])``, where ``[a,b]`` is an interval on the real line. For a more versatile option, check out [`MonteCarloHash`](@ref).

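The approximation step can be pictured in plain Julia, independent of LSHFunctions. The following is a minimal sketch, assuming functions on ``[-1,1]`` and interpolation at Chebyshev nodes. The helpers `cheb_coeffs`, `cheb_eval`, and `max_error` are hypothetical names introduced for illustration; they are not `ChebHash`'s internals, and how `ChebHash` turns coefficients into hashes is not shown here.

```julia
# Hypothetical helpers for illustration; not part of LSHFunctions.

# Coefficients of the degree-(N-1) Chebyshev interpolant of f on [-1, 1],
# computed from samples of f at the N Chebyshev nodes.
function cheb_coeffs(f, N::Integer)
    θ = [π * (k + 0.5) / N for k in 0:N-1]   # node angles
    fvals = f.(cos.(θ))                      # f at the nodes x_k = cos(θ_k)
    return [(2 / N) * sum(fvals .* cos.(j .* θ)) for j in 0:N-1]
end

# Evaluate c[1]/2 + Σ_j c[j+1] T_j(x), using T_j(x) = cos(j * acos(x)).
function cheb_eval(c::AbstractVector, x::Real)
    θ = acos(clamp(x, -1.0, 1.0))
    total = c[1] / 2
    for j in 1:length(c)-1
        total += c[j+1] * cos(j * θ)
    end
    return total
end

# Largest error of the N-term approximation on a uniform grid over [-1, 1].
function max_error(f, N::Integer)
    c = cheb_coeffs(f, N)
    return maximum(abs(f(x) - cheb_eval(c, x)) for x in LinRange(-1, 1, 201))
end

max_error(exp, 4) > max_error(exp, 12)   # more terms, smaller error
```

In this sketch, computing `N` coefficients costs on the order of `N^2` operations, which is the kind of speed-versus-accuracy trade-off the degree controls.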
# Arguments
- `sim`: the similarity function you want to hash on.
- `args...`: arguments to pass on when building the `LSHFunction` instance underlying the returned `ChebHash` struct.
- `kws...`: keyword arguments to pass on when building the `LSHFunction` instance underlying the returned `ChebHash` struct.

# Examples
Create a hash function for cosine similarity for functions in ``L^2([-1,1])``:

```jldoctest; setup = :(using LSHFunctions)
julia> hashfn = ChebHash(cossim, 50; interval=@interval(-1 ≤ x ≤ 1));

julia> n_hashes(hashfn)
50

julia> similarity(hashfn) == cossim
true

julia> hashtype(hashfn)
$(cossim |> LSHFunction |> hashtype)
```

Create a hash function for ``L^2`` distance defined over ``L^2([0,2\\pi])``. Hash the functions `f(x) = cos(x)` and `f(x) = x/(2π)` using the returned `ChebHash`:
Samples a hash function from an LSH family for the similarity `sim` defined over the function space ``L^p_{\\mu}(\\Omega)``. `sim` may be one of the following:
$(
    join(
        ["- `" * sim * "`" for sim in (_valid_MonteCarloHash_similarities[1] .|>
                                       string |>
                                       collect |>
                                       sort!)
        ],
        "\n"
    )
)

Given an input function ``f\\in L^p_{\\mu}(\\Omega)``, `MonteCarloHash` works by sampling ``f`` at randomly selected points in ``\\Omega`` and then hashing those samples.

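The sample-then-hash idea can be sketched in plain Julia. This is a minimal illustration under stated assumptions, not the `MonteCarloHash` implementation: `SketchMCHash` and `sketch_mc_hash` are hypothetical names, the default of 1024 samples is arbitrary, and the hashing step is a SimHash-style sign of a random projection, chosen here because it targets cosine similarity.

```julia
using Random

# Hypothetical type for illustration; not part of LSHFunctions.
struct SketchMCHash
    points::Vector{Float64}       # sample points drawn from Ω via ω
    directions::Matrix{Float64}   # one random projection direction per row
end

# Draw the sample points and projection directions once, so that every
# function is later hashed with the same randomness.
function sketch_mc_hash(ω, n_hashes::Integer; n_samples::Integer = 1024)
    points = Float64[ω() for _ in 1:n_samples]
    directions = randn(n_hashes, n_samples)
    return SketchMCHash(points, directions)
end

# Sample f at the stored points, then take the sign of each projection.
(h::SketchMCHash)(f) = h.directions * f.(h.points) .≥ 0

Random.seed!(0)
ω() = 2 * rand() - 1          # samples a random point from [-1, 1]
h = sketch_mc_hash(ω, 64)

f(x) = x
g(x) = 2x                     # positive multiple of f: cosine similarity 1
k(x) = -x                     # negative multiple of f: cosine similarity -1

h(f) == h(g)                  # identical hashes for f and g
```

The key design point is that the sample points are fixed when the hash function is created. Two input functions are only comparable if they are evaluated at the same points, and more samples give a closer picture of each function at a higher cost per hash.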
# Arguments
- `sim`: the similarity function you want to hash on.
- `ω`: a function that takes no inputs and samples a single point from ``\\Omega``. Alternatively, it can be viewed as a random variable with probability measure
- `n_samples::Integer` (default: `$(_MONTECARLOHASH_DEFAULT_N_SAMPLES)`): the number of points to sample from each function that is hashed by the `MonteCarloHash`. Larger values of `n_samples` tend to capture the input function better and will thus be more likely to achieve desirable collision probabilities.
- `kws...`: keyword arguments to pass on when building the `LSHFunction` instance underlying the returned `MonteCarloHash` struct.

# Examples
Create a hash function for cosine similarity for functions in ``L^2([-1,1])``:

```jldoctest; setup = :(using LSHFunctions)
julia> μ() = 2*rand()-1; # μ samples a random point from [-1,1]
```

Create a hash function for ``L^2`` distance in the function space ``L^2([0,2\\pi])``. Hash the functions `f(x) = cos(x)` and `f(x) = x/(2π)` using the returned `MonteCarloHash`.