
Gradient descent optimizer should find optimal coefficient values and regularize it

Ilya Gyrdymov edited this page Feb 16, 2019 · 3 revisions

Hyperparameters (as used in the substitutions below): learning rate η = 2, L2 regularization coefficient λ = 10, initial coefficients c = [0, 0, 0]. Update rule: c = (1 - 2 * η * λ) * c - η * g.

iteration 1:

g = [8, 8, 8]

c1 = (1 - 2 * η * λ) * c1 - η * g1 = (1 - 2 * 2 * 10) * 0 - 2 * 8 = -16

c2 = (1 - 2 * η * λ) * c2 - η * g2 = (1 - 2 * 2 * 10) * 0 - 2 * 8 = -16

c3 = (1 - 2 * η * λ) * c3 - η * g3 = (1 - 2 * 2 * 10) * 0 - 2 * 8 = -16

c = [-16, -16, -16]

iteration 2:

g = [8, 8, 8]

c1 = (1 - 2 * η * λ) * c1 - η * g1 = (1 - 2 * 2 * 10) * -16 - 2 * 8 = -39 * -16 - 16 = 608

c2 = (1 - 2 * η * λ) * c2 - η * g2 = (1 - 2 * 2 * 10) * -16 - 2 * 8 = -39 * -16 - 16 = 608

c3 = (1 - 2 * η * λ) * c3 - η * g3 = (1 - 2 * 2 * 10) * -16 - 2 * 8 = -39 * -16 - 16 = 608

c = [608.0, 608.0, 608.0]

iteration 3:

g_1 = [5, 5, 5]

g_2 = [3, 3, 3]

g = g_1 + g_2 = [8, 8, 8]

c1 = (1 - 2 * η * λ) * c1 - η * g1 = (1 - 2 * 2 * 10) * 608 - 2 * 8 = -39 * 608 - 16 = -23728

c2 = (1 - 2 * η * λ) * c2 - η * g2 = (1 - 2 * 2 * 10) * 608 - 2 * 8 = -39 * 608 - 16 = -23728

c3 = (1 - 2 * η * λ) * c3 - η * g3 = (1 - 2 * 2 * 10) * 608 - 2 * 8 = -39 * 608 - 16 = -23728

c = [-23728.0, -23728.0, -23728.0]
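The three iterations above can be reproduced with a minimal Python sketch of the same update rule, c = (1 - 2 * η * λ) * c - η * g. The values of η, λ, and the per-iteration gradients are taken from the walkthrough; the library under test is written in Dart, so this is only an illustration, not the library's implementation:

```python
eta = 2.0  # learning rate (η)
lam = 10.0  # L2 regularization coefficient (λ)

def update(c, g):
    # coefficient-wise update: c_i = (1 - 2*eta*lam) * c_i - eta * g_i
    return [(1 - 2 * eta * lam) * ci - eta * gi for ci, gi in zip(c, g)]

c = [0.0, 0.0, 0.0]  # initial coefficients
# iteration 3's gradient is the sum [5,5,5] + [3,3,3] = [8,8,8]
for g in ([8, 8, 8], [8, 8, 8], [8, 8, 8]):
    c = update(c, g)

print(c)  # [-23728.0, -23728.0, -23728.0]
```

Running the loop confirms the hand computation: iteration 1 gives [-16, -16, -16], iteration 2 gives [608, 608, 608], and iteration 3 gives [-23728, -23728, -23728].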
