|
| 1 | +# Hashing in ``L^p`` function spaces |
| 2 | + |
| 3 | +!!! warning "Under construction" |
| 4 | + This section is currently being developed. If you're interested in helping write this section, feel free to [open a pull request](https://github.com/kernelmethod/LSHFunctions.jl/pulls); otherwise, please check back later. |
| 5 | + |
| 6 | +LSHFunctions supports locality-sensitive hashing over ``L^p`` function spaces. In other words, you can hash functions such as `sin`, `exp`, and `f(x) = 5x^3 - 2x^2 - 9x + 1` with respect to several similarity measures. Here's an example using [`MonteCarloHash`](@ref) with cosine similarity: |
| 7 | + |
| 8 | +```jldoctest; setup = :(using Random; Random.seed!(0)) |
| 9 | +julia> using LSHFunctions; |
| 10 | +
|
| 11 | +julia> μ() = 2π*rand(); # μ samples a random point from [0,2π] |
| 12 | +
|
| 13 | +julia> hashfn = MonteCarloHash(cossim, μ, 3); |
| 14 | +
|
| 15 | +julia> hashfn(x -> 5x^3 - 2x^2 - 9x + 1) |
| 16 | +3-element BitArray{1}: |
| 17 | + 0 |
| 18 | + 1 |
| 19 | + 1 |
| 20 | +``` |
| 21 | + |
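For reference, the similarity measures used in the examples on this page can be written out explicitly. The formulas below assume the standard definitions for real-valued functions on a domain ``\Omega``; that domain symbol and the exact normalization are assumptions, since this section does not spell them out.

```math
\begin{aligned}
\langle f, g \rangle &= \int_{\Omega} f(x)\, g(x) \, dx, &
\|f\|_{L^2} &= \sqrt{\langle f, f \rangle}, \\
\text{cossim}(f, g) &= \frac{\langle f, g \rangle}{\|f\|_{L^2} \, \|g\|_{L^2}}, &
\|f - g\|_{L^2} &= \left( \int_{\Omega} |f(x) - g(x)|^2 \, dx \right)^{1/2}.
\end{aligned}
```

Under these definitions, two functions with high cosine similarity (or small ``L^2`` distance) should collide under a locality-sensitive hash more often than two dissimilar ones.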
| 22 | +## Function approximation-based hashing |
| 23 | + |
| 24 | +!!! warning "API subject to change" |
| 25 | + The API for both [`ChebHash`](@ref) and [`MonteCarloHash`](@ref), but especially the former, is changing rapidly. As a result, the docs below may change substantially in future versions of the LSHFunctions package. |
| 26 | + |
| 27 | +Create a hash function for cosine similarity for functions in ``L^2([-1,1])``: |
| 28 | + |
| 29 | +```jldoctest; setup = :(using LSHFunctions) |
| 30 | +julia> hashfn = ChebHash(cossim, 50; interval=@interval(-1 ≤ x ≤ 1)); |
| 31 | +
|
| 32 | +julia> n_hashes(hashfn) |
| 33 | +50 |
| 34 | +
|
| 35 | +julia> similarity(hashfn) == cossim |
| 36 | +true |
| 37 | +
|
| 38 | +julia> hashtype(hashfn) |
| 39 | +Bool |
| 40 | +``` |
| 41 | + |
| 42 | +Create a hash function for ``L^2`` distance defined over ``L^2([0,2\pi])``. Hash the functions `f(x) = cos(x)` and `g(x) = x/(2π)` using the returned [`ChebHash`](@ref): |
| 43 | + |
| 44 | +```jldoctest; setup = :(using LSHFunctions, Random; Random.seed!(0)) |
| 45 | +julia> hashfn = ChebHash(L2, 3; interval=@interval(0 ≤ x ≤ 2π)); |
| 46 | +
|
| 47 | +julia> hashfn(cos) |
| 48 | +3-element Array{Int32,1}: |
| 49 | + 3 |
| 50 | + -1 |
| 51 | + -2 |
| 52 | +
|
| 53 | +julia> hashfn(x -> x/(2π)) |
| 54 | +3-element Array{Int32,1}: |
| 55 | + 0 |
| 56 | + 1 |
| 57 | + 0 |
| 58 | +``` |
| 59 | + |
| 60 | +## Monte Carlo-based hashing |
| 61 | + |
| 62 | +Create a hash function for cosine similarity for functions in ``L^2([-1,1])``: |
| 63 | + |
| 64 | +```jldoctest; setup = :(using LSHFunctions) |
| 65 | +julia> μ() = 2*rand()-1; # μ samples a random point from [-1,1] |
| 66 | +
|
| 67 | +julia> hashfn = MonteCarloHash(cossim, μ, 50; volume=2.0); |
| 68 | +
|
| 69 | +julia> n_hashes(hashfn) |
| 70 | +50 |
| 71 | +
|
| 72 | +julia> similarity(hashfn) == cossim |
| 73 | +true |
| 74 | +
|
| 75 | +julia> hashtype(hashfn) |
| 76 | +Bool |
| 77 | +``` |
| 78 | + |
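The `volume` keyword in the examples of this section matches the length of the interval that `μ` samples from (`2.0` for ``[-1,1]``, `2π` for ``[0,2\pi]``). That is consistent with the usual Monte Carlo estimate of an integral, sketched below. This describes the presumed role of `volume`, not the library's actual implementation; the symbols ``V``, ``N``, and ``x_i`` are introduced here for illustration only.

```math
\langle f, g \rangle = \int_{\Omega} f(x)\, g(x) \, dx
\;\approx\; \frac{V}{N} \sum_{i=1}^{N} f(x_i)\, g(x_i),
\qquad x_i \sim \text{Uniform}(\Omega), \quad V = \text{vol}(\Omega)
```

Under this reading, a function is represented by its values at ``N`` random sample points, scaled so that inner products between the sampled vectors approximate inner products in ``L^2(\Omega)``.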
| 79 | +Create a hash function for ``L^2`` distance in the function space ``L^2([0,2\pi])``. Hash the functions `f(x) = cos(x)` and `g(x) = x/(2π)` using the returned [`MonteCarloHash`](@ref): |
| 80 | + |
| 81 | +```jldoctest; setup = :(using LSHFunctions, Random; Random.seed!(0)) |
| 82 | +julia> μ() = 2π * rand(); # μ samples a random point from [0,2π] |
| 83 | +
|
| 84 | +julia> hashfn = MonteCarloHash(L2, μ, 3; volume=2π); |
| 85 | +
|
| 86 | +julia> hashfn(cos) |
| 87 | +3-element Array{Int32,1}: |
| 88 | + -1 |
| 89 | + 3 |
| 90 | + 0 |
| 91 | +
|
| 92 | +julia> hashfn(x -> x/(2π)) |
| 93 | +3-element Array{Int32,1}: |
| 94 | + -1 |
| 95 | + -2 |
| 96 | + -1 |
| 97 | +``` |
| 98 | + |
| 99 | +Create a hash function that uses a different number of sample points by passing the `n_samples` keyword: |
| 100 | + |
| 101 | +```jldoctest; setup = :(using LSHFunctions) |
| 102 | +julia> μ() = rand(); # Samples a random point from [0,1] |
| 103 | +
|
| 104 | +julia> hashfn = MonteCarloHash(cossim, μ; volume=1.0, n_samples=512); |
| 105 | +
|
| 106 | +julia> length(hashfn.sample_points) |
| 107 | +512 |
| 108 | +``` |
| 109 | + |
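To build intuition for why more sample points help, here is a minimal sketch that estimates the cosine similarity between two functions directly from random samples. It uses only Julia's standard library. The helper `mc_cossim` is hypothetical and not part of LSHFunctions, and it is not a description of how `MonteCarloHash` works internally.

```julia
using Random

# Hypothetical helper (not part of LSHFunctions): Monte Carlo estimate of the
# cosine similarity between f and g, using points drawn by the sampler μ.
function mc_cossim(f, g, μ; n_samples=100_000)
    xs = [μ() for _ in 1:n_samples]
    fx = f.(xs)
    gx = g.(xs)
    # Any constant volume factor appears in the inner product and in both
    # norms, so it cancels in the ratio and is omitted here.
    return sum(fx .* gx) / sqrt(sum(abs2, fx) * sum(abs2, gx))
end

Random.seed!(0)
μ() = 2π * rand()   # uniform samples from [0, 2π)

sim_same = mc_cossim(cos, cos, μ)   # equals 1 up to rounding
sim_orth = mc_cossim(sin, cos, μ)   # near 0: sin and cos are orthogonal on [0, 2π]
```

Lowering `n_samples` makes the estimate for `sin` and `cos` noisier, which is the same trade-off between accuracy and cost that applies when choosing the number of sample points for a hash function.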