Commit 77598c6

Stochastic Gradient Descent. (#44)
* Initial sketch of a stochastic gradient descent.
* refactor variable names
* adds a Nesterov mode for the gradient
* Adds a Momentum and an Average Direction Update Rule
* refactor Processor to be renamed/unified to the DirectionUpdateRule.
* introduces a tutorial for the stochastic gradient descent
1 parent 932c0bc commit 77598c6

51 files changed, with 6318 additions and 519 deletions.

Project.toml

Lines changed: 1 addition & 1 deletion
@@ -1,7 +1,7 @@
 name = "Manopt"
 uuid = "0fc0a36d-df90-57f3-8f93-d78a9fc72bb5"
 authors = ["Ronny Bergmann <manopt@ronnybergmann.net>"]
-version = "0.2.12"
+version = "0.2.13"
 
 [deps]
 ColorSchemes = "35d6a980-a343-548e-a6ea-1d62b119f2f4"

docs/make.jl

Lines changed: 8 additions & 2 deletions
@@ -5,10 +5,15 @@ tutorialsInputPath = joinpath(@__DIR__, "..", "src/tutorials")
 tutorialsRelativePath = "tutorials/"
 tutorialsOutputPath = joinpath(@__DIR__, "src/" * tutorialsRelativePath)
 tutorials = [
-"MeanAndMedian", "BezierCurves", "GradientOfSecondOrderDifference", "JacobiFields"
+"MeanAndMedian",
+"StochasticGradientDescent",
+"BezierCurves",
+"GradientOfSecondOrderDifference",
+"JacobiFields",
 ]
 menuEntries = [
 "get Started: Optimize!",
+"do stochastic gradient descent",
 "work with Bézier curves",
 "see the gradient of \$d_2\$",
 "use Jacobi Fields",
@@ -39,6 +44,7 @@ makedocs(;
 "Gradient Descent" => "solvers/gradient_descent.md",
 "Nelder–Mead" => "solvers/NelderMead.md",
 "Particle Swarm Optimization" => "solvers/particle_swarm.md",
+"Stochastic Gradient Descent" => "solvers/stochastic_gradient_descent.md",
 "Subgradient method" => "solvers/subgradient.md",
 "Steihaug-Toint TCG Method" =>
 "solvers/truncated_conjugate_gradient_descent.md",
@@ -49,7 +55,7 @@ makedocs(;
 "Bézier curves" => "functions/bezier.md",
 "Cost functions" => "functions/costs.md",
 "Differentials" => "functions/differentials.md",
-"Adjoint Differentials" => "functions/adjoint_differentials.md",
+"Adjoint Differentials" => "functions/adjointdifferentials.md",
 "Gradients" => "functions/gradients.md",
 "Jacobi Fields" => "functions/Jacobi_fields.md",
 "Proximal Maps" => "functions/proximal_maps.md",

docs/src/assets/images/tutorials/centerAndLargeData.asy

Lines changed: 5022 additions & 0 deletions
Large diffs are not rendered by default.

docs/src/functions/adjoint_differentials.md renamed to docs/src/functions/adjointdifferentials.md

Lines changed: 1 addition & 1 deletion
@@ -2,5 +2,5 @@
 
 ```@autodocs
 Modules = [Manopt]
-Pages = ["adjointdifferentials.jl"]
+Pages = ["adjoint_differentials.jl"]
 ```

docs/src/functions/differentials.md

Lines changed: 1 addition & 1 deletion
@@ -2,5 +2,5 @@
 
 ```@autodocs
 Modules = [Manopt]
-Pages = ["differentials.jl"]
+Pages = ["functions/differentials.jl"]
 ```

docs/src/plans/index.md

Lines changed: 2 additions & 10 deletions
@@ -107,16 +107,6 @@ Problem
 get_cost
 ```
 
-For any algorithm that involves a cyclic evalutaion, e.g.
-[`cyclic_proximal_point`](@ref), one can specify the [`EvalOrder`](@ref) as
-
-```@docs
-EvalOrder
-LinearEvalOrder
-RandomEvalOrder
-FixedRandomEvalOrder
-```
-
 ### Cost based problem
 
 ```@docs
@@ -127,7 +117,9 @@ CostProblem
 
 ```@docs
 GradientProblem
+StochasticGradientProblem
 get_gradient
+get_gradients
 ```
 
 ### Subgradient based problem

docs/src/solvers/conjugate_gradient_descent.md

Lines changed: 2 additions & 1 deletion
@@ -17,10 +17,11 @@ ConjugateGradientDescentOptions
 
 ## Available Coefficients
 
+The update rules act as [`DirectionUpdateRule`](@ref), which internally always first evaluate the gradient itself.
+
 ```@docs
 ConjugateDescentCoefficient
 DaiYuanCoefficient
-DirectionUpdateRule
 FletcherReevesCoefficient
 HagerZhangCoefficient
 HeestenesStiefelCoefficient

docs/src/solvers/gradient_descent.md

Lines changed: 13 additions & 0 deletions
@@ -11,9 +11,22 @@ CurrentModule = Manopt
 ## Options
 
 ```@docs
+AbstractGradientDescentOptions
 GradientDescentOptions
 ```
 
+## Direction Update Rules
+
+A field of the options is the `direction`, a [`DirectionUpdateRule`](@ref), which by default [`Gradient`](@ref) just evaluates the gradient but can be enhanced for example to
+
+```@docs
+DirectionUpdateRule
+Gradient
+MomentumGradient
+AverageGradient
+Nesterov
+```
+
 ## Debug Actions
 
 ```@docs

docs/src/solvers/stochastic_gradient_descent.md

Lines changed: 24 additions & 0 deletions
@@ -0,0 +1,24 @@
+# [Gradient Descent](@id StochasticGradientDescentSolver)
+
+```@meta
+CurrentModule = Manopt
+```
+
+```@docs
+stochastic_gradient_descent
+```
+
+## Options
+
+```@docs
+StochasticGradientDescentOptions
+```
+
+Additionally, the options share a [`DirectionUpdateRule`](@ref),
+so you can also apply [`MomentumGradient`](@ref) and [`AverageGradient`](@ref) here.
+The most inner one should always be.
+
+```@docs
+AbstractStochasticGradientProcessor
+StochasticGradient
+```

examples/ChambollePock/SPD_Image_CP_vs_CPPA.jl

Lines changed: 2 additions & 2 deletions
@@ -29,7 +29,7 @@ comparison_data = joinpath(
 f = artificial_SPD_image2(32)
 if export_original
 fn = joinpath(results_folder, experiment_name * "-orig.asy")
-asymptote_export_SPD(fn; data=f, scaleAxes=(7.5, 7.5, 7.5))
+asymptote_export_SPD(fn; data=f, scale_axes=(7.5, 7.5, 7.5))
 render_asymptote(fn; render=asy_render_detail)
 end
 sC = StopAfterIteration(400)
@@ -95,7 +95,7 @@ if export_result
 results_folder,
 experiment_name * "img-result-$(numIter)$(replace(string(α), "." => "-")).asy",
 )
-asymptote_export_SPD(fn; data=y, scaleAxes=(7.5, 7.5, 7.5))
+asymptote_export_SPD(fn; data=y, scale_axes=(7.5, 7.5, 7.5))
 render_asymptote(fn; render=asy_render_detail)
 end
 if export_table

examples/CyclicProximalPoint/SPD_Image_CPPA.jl

Lines changed: 2 additions & 2 deletions
@@ -35,7 +35,7 @@ f = artificial_SPD_image2(32)
 pixelM = SymmetricPositiveDefinite(3)
 if export_orig
 orig_filename = joinpath(results_folder, experiment_name * "-orig.asy")
-asymptote_export_SPD(orig_filename; data=f, scaleAxes=(7.5, 7.5, 7.5))
+asymptote_export_SPD(orig_filename; data=f, scale_axes=(7.5, 7.5, 7.5))
 render_asymptote(orig_filename; render=asy_render_detail)
 end
 
@@ -79,7 +79,7 @@ yRec = get_record(o)
 # Results
 if export_result
 result_filename = joinpath(results_folder, experiment_name * "-result.asy")
-asymptote_export_SPD(result_filename; data=y, scaleAxes=(7.5, 7.5, 7.5))
+asymptote_export_SPD(result_filename; data=y, scale_axes=(7.5, 7.5, 7.5))
 render_asymptote(result_filename; render=asy_render_detail)
 end
 if export_table

examples/DouglasRachford/SPD_Image_DR_vs_CPPA.jl

Lines changed: 2 additions & 2 deletions
@@ -38,7 +38,7 @@ comparison_data = joinpath(
 f = artificial_SPD_image2(32)
 if export_orig
 fn = joinpath(results_folder, experiment_name * "-orig.asy")
-asymptote_export_SPD(fn; data=f, scaleAxes=(7.5, 7.5, 7.5))
+asymptote_export_SPD(fn; data=f, scale_axes=(7.5, 7.5, 7.5))
 render_asymptote(fn; render=asy_render_detail)
 end
 sC = StopAfterIteration(400)
@@ -91,7 +91,7 @@ if export_result
 results_folder,
 experiment_name * "img-result-$(numIter)$(replace(string(α), "." => "-")).asy",
 )
-asymptote_export_SPD(fn; data=y, scaleAxes=(7.5, 7.5, 7.5))
+asymptote_export_SPD(fn; data=y, scale_axes=(7.5, 7.5, 7.5))
 render_asymptote(fn; render=asy_render_detail)
 end
 if export_table

examples/GradientDescent/S2_Bezier_Approximation.jl

Lines changed: 11 additions & 11 deletions
@@ -18,7 +18,7 @@ asy_export = true #export data and results to asyExport
 
 curve_samples = [range(0, 3; length=101)...] # sample curve for the gradient
 curve_samples_plot = [range(0, 3; length=201)...] # sample curve for asy exports
-cameraPosition = (-1.0, -0.7, 0.3)
+camera_position = (-1.0, -0.7, 0.3)
 curveColor = RGBA{Float64}(colorant"#000000")
 sColor = RGBA{Float64}(colorant"#BBBBBB")
 dColor = RGBA{Float64}(colorant"#EE7733") # data Color: Tol Vibrant Orange
@@ -38,7 +38,7 @@ if asy_export
 joinpath(results_folder, experiment_name * "-orig.asy");
 curves=[cP],
 points=[get_bezier_junctions(M, B), get_bezier_inner_points(M, B)],
-tVectors=[[
+tangent_vectors=[[
 Tuple(a)
 for
 a in
@@ -47,10 +47,10 @@ if asy_export
 colors=Dict(
 :curves => [curveColor], :points => [dColor, bColor], :tvectors => [ξColor]
 ),
-cameraPosition=cameraPosition,
-arrowHeadSize=10.0,
-lineWidths=[1.5, 1.5],
-dotSize=4.0,
+camera_position=camera_position,
+arrow_head_size=10.0,
+line_widths=[1.5, 1.5],
+dot_size=4.0,
 )
 render_asymptote(joinpath(results_folder, experiment_name * "-orig.asy"); render=4)
 end
@@ -97,7 +97,7 @@ if asy_export
 joinpath(results_folder, experiment_name * "-result.asy");
 curves=[res_curve, cP],
 points=[get_bezier_junctions(M, B_opt), get_bezier_inner_points(M, B_opt)],
-tVectors=[[
+tangent_vectors=[[
 Tuple(a)
 for
 a in zip(
@@ -110,10 +110,10 @@ if asy_export
 :points => [dColor, bColor],
 :tvectors => [ξColor],
 ),
-cameraPosition=cameraPosition,
-arrowHeadSize=10.0,
-lineWidths=[1.5, 0.75, 1.5],
-dotSize=4.0,
+camera_position=camera_position,
+arrow_head_size=10.0,
+line_widths=[1.5, 0.75, 1.5],
+dot_size=4.0,
 )
 render_asymptote(joinpath(results_folder, experiment_name * "-result.asy"); render=4)
 end

examples/GradientDescent/S2_Bezier_Approximation_lambda_video.jl

Lines changed: 6 additions & 6 deletions
@@ -33,7 +33,7 @@ colors = RGBA.(get.(Ref(viridis), range(0.0, 1.0; length=length(λRange))))
 curve_samples = [range(0, 3; length=101)...] # sample curve for the gradient
 curve_samples_plot = [range(0, 3; length=201)...] # sample curve for asy exports
 
-cameraPosition = (-1.0, -0.7, 0.3)
+camera_position = (-1.0, -0.7, 0.3)
 curveColor = RGBA{Float64}(colorant"#AAAAAA")
 M = Sphere(2)
 B = artificial_S2_composite_bezier_curve()
@@ -90,9 +90,9 @@ if asy_export
 colors=Dict(
 :curves => [colors[i], curveColor], :points => [colors[i], colors[i]]
 ),
-cameraPosition=cameraPosition,
-lineWidths=[1.0, 0.5],
-dotSize=2.0,
+camera_position=camera_position,
+line_widths=[1.0, 0.5],
+dot_size=2.0,
 )
 render_asymptote(s * "-$(@sprintf "%04.0f" i)-result.asy"; render=render_detail)
 end
@@ -103,8 +103,8 @@ if asy_export_summary
 s * "-Summary-result.asy";
 curves=[cP, resulting_curves...],
 colors=Dict(:curves => [curveColor, colors...]),
-cameraPosition=cameraPosition,
-lineWidths=[0.75, [1.5 for i in eachindex(λRange)]...],
+camera_position=camera_position,
+line_widths=[0.75, [1.5 for i in eachindex(λRange)]...],
 )
 render_asymptote(s * "-Summary-result.asy"; render=render_detail)
 end

examples/GradientDescent/S2_Bezier_Interplation.jl

Lines changed: 11 additions & 11 deletions
@@ -16,7 +16,7 @@ asy_export = true #export data and results to asyExport
 curve_samples = [range(0, 3; length=101)...] # sample curve for the gradient
 curve_samples_plot = [range(0, 3; length=201)...] # sample curve for asy exports
 
-cameraPosition = (-1.0, -0.7, 0.3)
+camera_position = (-1.0, -0.7, 0.3)
 curveColor = RGBA{Float64}(colorant"#000000")
 sColor = RGBA{Float64}(colorant"#BBBBBB")
 dColor = RGBA{Float64}(colorant"#EE7733") # data Color: Tol Vibrant Orange
@@ -35,7 +35,7 @@ if asy_export
 joinpath(results_folder, experiment_name * "-orig.asy");
 curves=[cP],
 points=[get_bezier_junctions(M, B), get_bezier_inner_points(M, B)],
-tVectors=[[
+tangent_vectors=[[
 Tuple(a)
 for
 a in
@@ -44,10 +44,10 @@ if asy_export
 colors=Dict(
 :curves => [curveColor], :points => [dColor, bColor], :tvectors => [ξColor]
 ),
-cameraPosition=cameraPosition,
-arrowHeadSize=10.0,
-lineWidths=[1.5, 1.5],
-dotSize=4.0,
+camera_position=camera_position,
+arrow_head_size=10.0,
+line_widths=[1.5, 1.5],
+dot_size=4.0,
 )
 render_asymptote(joinpath(results_folder, experiment_name * "-orig.asy"); render=4)
 end
@@ -87,7 +87,7 @@ if asy_export
 joinpath(results_folder, experiment_name * "-result.asy");
 curves=[de_casteljau(M, B_opt, curve_samples_plot), cP],
 points=[get_bezier_junctions(M, B_opt), get_bezier_inner_points(M, B_opt)],
-tVectors=[[
+tangent_vectors=[[
 Tuple(a)
 for
 a in zip(
@@ -100,10 +100,10 @@ if asy_export
 :points => [dColor, bColor],
 :tvectors => [ξColor],
 ),
-cameraPosition=cameraPosition,
-arrowHeadSize=10.0,
-lineWidths=[1.5, 0.75, 1.5],
-dotSize=4.0,
+camera_position=camera_position,
+arrow_head_size=10.0,
+line_widths=[1.5, 0.75, 1.5],
+dot_size=4.0,
 )
 render_asymptote(joinpath(results_folder, experiment_name * "-result.asy"); render=4)
 end
