
Commit 089969a

Add docs for LSHFunction(). Add a page for the LSHFunction API, and a page stub for performance tips.
1 parent 62acbe9 commit 089969a

7 files changed: 99 additions and 29 deletions.
docs/make.jl

Lines changed: 3 additions & 1 deletion
@@ -12,11 +12,13 @@ makedocs(
 format = Documenter.HTML(),
 modules = [LSH],
 pages = ["Home" => "index.md",
+"The LSHFunction API" => "lshfunction_api.md",
 "Similarity functions" => [
 "Cosine similarity" => joinpath("similarities", "cosine.md"),
 raw"``\ell^p`` distance" => joinpath("similarities", "lp_distance.md"),
 "Jaccard similarity" => joinpath("similarities", "jaccard.md"),
-"Inner product similarity" => joinpath("similarities", "inner_prod.md")]
+"Inner product similarity" => joinpath("similarities", "inner_prod.md")],
+"Performance tips" => "performance.md",
 ]
 )

docs/src/index.md

Lines changed: 4 additions & 4 deletions
@@ -3,13 +3,13 @@
 LSH.jl is a Julia package for performing [locality-sensitive hashing](https://en.wikipedia.org/wiki/Locality-sensitive_hashing) with various similarity functions.

 ## Introduction
-One of the simplest methods for classifying, categorizing, and grouping data is to measure how similarities pairs of data points are. For instance, the classical [``k``-nearest neighbors algorithm](https://en.wikipedia.org/wiki/K-nearest_neighbors_algorithm) takes a similarity function
+One of the simplest methods for classifying, categorizing, and grouping data is to measure how similarities pairs of data points are. For instance, the classical [``k``-nearest neighbors algorithm](https://en.wikipedia.org/wiki/K-nearest_neighbors_algorithm) searches an input space ``X`` by taking a query point ``x\in X`` and a similarity function

 ```math
 s:X\times X\to\mathbb{R}
 ```

-and a query point ``x\in X``, where ``X`` is the input space. It then computes ``s(x,y)`` for every point ``y`` in a database, and keeps the ``k`` points that are closest to ``x``.
+It then computes ``s(x,y)`` for every point ``y`` in a database, and keeps the ``k`` points that are closest to ``x``.

 Broadly, there are two computational issues with this approach:

@@ -23,8 +23,8 @@ LSH.jl is a package that provides definitions of locality-sensitive hash functio

 - Cosine similarity (`cossim`)
 - Jaccard similarity (`jaccard`)
-- ``L^1`` (Manhattan / "taxicab") distance (`ℓ1`)
-- ``L^2`` (Euclidean) distance (`ℓ2`)
+- ``\ell^1`` (Manhattan / "taxicab") distance (`ℓ1`)
+- ``\ell^2`` (Euclidean) distance (`ℓ2`)
 - Inner product (`inner_prod`)
 - Function-space hashes (`L1`, `L2`, and `cossim`)
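
Reviewer note: the brute-force search that the `index.md` introduction describes (score every database point against the query, keep the ``k`` best) can be sketched in a few lines of Julia. This is only an illustration under stated assumptions: `cosine_similarity` and `brute_force_knn` are hypothetical local helpers written for this note, not functions exported by LSH.jl, and "closest" is read here as "highest similarity score".

```julia
using LinearAlgebra

# Local stand-in for a similarity function s: X × X → ℝ (not the package's `cossim`).
cosine_similarity(x, y) = dot(x, y) / (norm(x) * norm(y))

# Brute-force k-nearest neighbors: compute s(query, y) for every y in the
# database, then return the indices of the k highest-scoring points.
function brute_force_knn(s, query, database, k)
    scores = [s(query, y) for y in database]
    order = sortperm(scores; rev = true)   # indices sorted by descending similarity
    return order[1:min(k, length(order))]
end

database = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [-1.0, 0.0]]
query = [1.0, 0.05]
neighbors = brute_force_knn(cosine_similarity, query, database, 2)
```

Every query touches every database point, which is the linear cost that locality-sensitive hashing is meant to avoid.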

docs/src/lshfunction_api.md

Lines changed: 10 additions & 0 deletions
@@ -0,0 +1,10 @@
+# The LSHFunction API
+
+!!! warning "Under construction"
+This section is currently being developed. If you're interested in helping write this section, feel free to [open a pull request](https://github.com/kernelmethod/LSH.jl/pulls); otherwise, please check back later.
+
+## API reference
+
+```@docs
+lsh_family
+```

docs/src/performance.md

Lines changed: 4 additions & 0 deletions
@@ -0,0 +1,4 @@
+# Performance tips
+
+!!! warning "Under construction"
+This section is currently being developed. If you're interested in helping write this section, feel free to [open a pull request](https://github.com/kernelmethod/LSH.jl/pulls); otherwise, please check back later.

src/LSHBase.jl

Lines changed: 1 addition & 0 deletions
@@ -38,6 +38,7 @@ LSHFunction API
 ========================#

 macro register_similarity! end
+function LSHFunction end
 function lsh_family end

 #=
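
Reviewer note: `function LSHFunction end` declares a generic function with no methods, so that methods and documentation can be attached to it elsewhere. A minimal sketch of that Julia idiom follows; `demo_constructor` is a hypothetical name used only for this note, and nothing here is taken from the package's actual methods.

```julia
# Forward-declare a generic function that has no methods yet (hypothetical name).
function demo_constructor end

n_methods_before = length(methods(demo_constructor))

# Later, possibly in a different file, attach concrete methods to the same function.
demo_constructor(x::Integer) = "integer: $x"
demo_constructor(x::AbstractString) = "string: $x"

n_methods_after = length(methods(demo_constructor))
```

Because the name exists before any method does, other code can refer to `typeof(demo_constructor)` (for example, in a dispatch signature) regardless of the order in which files are included.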

src/hashes/lshfunction.jl

Lines changed: 77 additions & 1 deletion
@@ -4,6 +4,8 @@ Implementations of the LSHFunction() method for constructing new hash functions.

 ================================================================#

+using Markdown
+
 #========================
 Macros
 ========================#
@@ -88,9 +90,81 @@ end
 @reset_similarities!()

 #========================
-Documentation
+Documentation for various components of the LSHFunction API
 ========================#

+### similarity docs
+
+Docs.getdoc(::typeof(similarity)) = Markdown.parse("""
+similarity(hashfn::LSHFunction)
+
+Returns the similarity function that `hashfn` hashes on.
+
+# Arguments
+- `hashfn::AbstractLSHFunction`: the hash function whose similarity we would like to retrieve.
+
+# Returns
+Returns a similarity function, which is one of the following:
+
+```
+$(join(available_similarities_as_strings(), "\n"))
+```
+
+# Examples
+```jldoctest; setup = :(using LSH)
+julia> hashfn = LSHFunction(cossim);
+
+julia> similarity(hashfn) == cossim
+true
+```
+""") # similarity
+
+### LSHFunction docs
+
+Docs.getdoc(::typeof(LSHFunction)) = Markdown.parse("""
+LSHFunction(similarity, args...; kws...)
+
+Construct the default `LSHFunction` subtype that corresponds to the similarity function `similarity`.
+
+# Arguments
+- `similarity`: the similarity function you want to use. Can be any of the following:
+
+```
+$(join(available_similarities_as_strings(), "\n"))
+```
+
+- `args...`: arguments to pass on to the default `LSHFunction` constructor corresponding to `similarity`.
+- `kws...`: keyword parameters to pass on to the default `LSHFunction` constructor corresponding to `similarity`.
+
+# Returns
+Returns a subtype of `LSH.LSHFunction` that hashes the similarity function `similarity`.
+
+# Examples
+In the snippet below, we construct `$(lsh_family(cossim))` (the default hash function corresponding to cosine similarity) using `LSHFunction()`:
+
+```jldoctest; setup = :(using LSH)
+julia> hashfn = LSHFunction(cossim);
+
+julia> typeof(hashfn) <: $(lsh_family(cossim)) <: LSHFunction
+true
+```
+
+We can provide arguments and keyword parameters corresponding to the hash function that we construct:
+
+```jldoctest; setup = :(using LSH)
+julia> hashfn = LSHFunction(inner_prod, 100; dtype=Float64, maxnorm=10);
+
+julia> n_hashes(hashfn) == 100 &&
+typeof(hashfn) <: SignALSH{Float64} &&
+hashfn.maxnorm == 10
+true
+```
+
+See also: [`lsh_family`](@ref)
+""") # LSHFunction
+
+### lsh_family docs
+
 @doc """
 lsh_family(similarity)

@@ -120,4 +194,6 @@ else
 end
 )
 ```
+
+See also: [`LSHFunction`](@ref)
 """ lsh_family
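
Reviewer note: the `similarity` and `LSHFunction` docstrings in this file are produced by overriding `Docs.getdoc` for the function's type, so the text is rebuilt each time the documentation is requested and can reflect the current list of registered similarities. A minimal, self-contained sketch of that pattern follows. `DEMO_SIMILARITIES`, `demo_similarity_list`, and `demo_similarity` are hypothetical stand-ins invented for this note; they are not the package's `available_similarities_as_strings()` or `similarity`.

```julia
using Markdown

# Hypothetical registry standing in for the package's list of similarities.
const DEMO_SIMILARITIES = ["cossim", "jaccard"]

# Render the current registry as a Markdown bullet list.
demo_similarity_list() = join(("- " * name for name in DEMO_SIMILARITIES), "\n")

# Generic function with no methods; only its documentation matters here.
function demo_similarity end

# The method body runs on every lookup, so the docstring tracks the registry.
Docs.getdoc(::typeof(demo_similarity)) = Markdown.parse("""
    demo_similarity(hashfn)

Returns one of the following similarity functions:

$(demo_similarity_list())
""")

doc_before = Docs.getdoc(demo_similarity)
push!(DEMO_SIMILARITIES, "hamming")      # register a new (hypothetical) similarity
doc_after = Docs.getdoc(demo_similarity)
```

A static `@doc """ ... """` string, by contrast, is interpolated once when it is defined, which is presumably why the commit keeps the `Docs.getdoc` approach for docstrings that list the available similarities.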

src/similarities.jl

Lines changed: 0 additions & 23 deletions
@@ -414,26 +414,3 @@ emd = wasserstein1_1d

 @doc (@doc wasserstein_1d)
 wasserstein2_2d(f, g) = wasserstein_1d(f, g, 2)
-
-#====================
-Definitions for similarity function-related components of the AbstractLSHFunction
-API.
-====================#
-
-# Define documentation for `similarity` manually so that we can dynamically
-# modify it through the available_similarities list.
-Docs.getdoc(::typeof(similarity)) = Markdown.parse("""
-similarity(hashfn::AbstractLSHFunction)
-
-Returns the similarity function that the input `AbstractLSHFunction` hashes on.
-
-# Arguments
-- `hashfn::AbstractLSHFunction`: the hash function whose similarity we would like to retrieve.
-
-# Returns
-Returns a similarity function, which is one of the following:
-
-```
-$(join(available_similarities_as_strings(), "\n"))
-```
-""")
