@@ -80,6 +80,17 @@ given the prediction `ŷ` and true values `y`.
             | 0.5 * |ŷ - y|^2,            for |ŷ - y| <= δ
Huber loss = |
             | δ * (|ŷ - y| - 0.5 * δ),  otherwise
+
+ # Example
+ ```jldoctest
+ julia> ŷ = [1.1, 2.1, 3.1];
+
+ julia> Flux.huber_loss(ŷ, 1:3) # default δ = 1 > |ŷ - y|
+ 0.005000000000000009
+
+ julia> Flux.huber_loss(ŷ, 1:3, δ=0.05) # changes behaviour as |ŷ - y| > δ
+ 0.003750000000000005
+ ```
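+
+ With `ŷ .- y = [0.1, 0.1, 0.1]`, every residual is below the default `δ = 1`, so the
+ quadratic branch applies and the loss is `mean(0.5 .* 0.1^2) = 0.005` (up to
+ floating-point error). With `δ = 0.05` the linear branch applies instead:
+ `mean(0.05 .* (0.1 - 0.5 * 0.05)) = 0.00375`.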
"""
function huber_loss(ŷ, y; agg = mean, δ = ofeltype(ŷ, 1))
    _check_sizes(ŷ, y)
@@ -377,12 +388,22 @@ function kldivergence(ŷ, y; dims = 1, agg = mean, ϵ = epseltype(ŷ))
end

"""
- poisson_loss(ŷ, y)
+ poisson_loss(ŷ, y; agg = mean)

- Return how much the predicted distribution `ŷ` diverges from the expected Poisson
- distribution `y`; calculated as `sum(ŷ .- y .* log.(ŷ)) / size(y, 2)`.
+ Return how much the predicted distribution `ŷ` diverges from the expected Poisson
+ distribution `y`; calculated as:
+
+ `sum(ŷ .- y .* log.(ŷ)) / size(y, 2)`.

[More information.](https://peltarion.com/knowledge-center/documentation/modeling-view/build-an-ai-model/loss-functions/poisson).
+
+ # Example
+ ```jldoctest
+ julia> y_model = [1, 3, 3]; # data should only take integral values
+
+ julia> Flux.poisson_loss(y_model, 1:3)
+ 0.5023128522198171
+ ```
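+
+ Here `sum(ŷ .- y .* log.(ŷ)) = (1 - log(1)) + (3 - 2log(3)) + (3 - 3log(3)) = 7 - 5log(3)`,
+ so with the default `agg = mean` the loss is `(7 - 5log(3)) / 3 ≈ 0.5023`.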
"""
function poisson_loss(ŷ, y; agg = mean)
    _check_sizes(ŷ, y)
@@ -392,11 +413,32 @@ end
"""
hinge_loss(ŷ, y; agg = mean)

- Return the [hinge_loss loss ](https://en.wikipedia.org/wiki/Hinge_loss) given the
+ Return the [hinge_loss](https://en.wikipedia.org/wiki/Hinge_loss) given the
prediction `ŷ` and true labels `y` (containing 1 or -1); calculated as
- `sum(max.(0, 1 .- ŷ .* y)) / size(y, 2)`.

+ `sum(max.(0, 1 .- ŷ .* y)) / size(y, 2)`.
+
+ Usually used with classifiers like Support Vector Machines.

See also: [`squared_hinge_loss`](@ref)
+
+ # Example
+ ```jldoctest
+ julia> y_true = [1, -1, 1, 1];
+
+ julia> y_pred = [0.1, 0.3, 1, 1.5];
+
+ julia> Flux.hinge_loss(y_pred, y_true)
+ 0.55
+
+ julia> Flux.hinge_loss(y_pred[1], y_true[1]) # same sign but |ŷ| < 1
+ 0.9
+
+ julia> Flux.hinge_loss(y_pred[end], y_true[end]) # same sign and |ŷ| >= 1 -> loss = 0
+ 0.0
+
+ julia> Flux.hinge_loss(y_pred[2], y_true[2]) # opposite signs -> loss != 0
+ 1.3
+ ```
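+
+ The elementwise losses are `max.(0, 1 .- y_pred .* y_true) = [0.9, 1.3, 0.0, 0.0]`,
+ so the aggregate is `mean([0.9, 1.3, 0.0, 0.0]) = 0.55`.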
"""
function hinge_loss(ŷ, y; agg = mean)
    _check_sizes(ŷ, y)
squared_hinge_loss(ŷ, y)

Return the squared hinge loss given the prediction `ŷ` and true labels `y`
- (containing 1 or -1); calculated as `sum((max.(0, 1 .- ŷ .* y)).^2) / size(y, 2)`.
+ (containing 1 or -1); calculated as
+
+ `sum((max.(0, 1 .- ŷ .* y)).^2) / size(y, 2)`.
+
+ Usually used with classifiers like Support Vector Machines.

See also: [`hinge_loss`](@ref)
+
+ # Example
+ ```jldoctest
+ julia> y_true = [1, -1, 1, 1];
+
+ julia> y_pred = [0.1, 0.3, 1, 1.5];
+
+ julia> Flux.squared_hinge_loss(y_pred, y_true)
+ 0.625
+
+ julia> Flux.squared_hinge_loss(y_pred[1], y_true[1]) # same sign but |ŷ| < 1
+ 0.81
+
+ julia> Flux.squared_hinge_loss(y_pred[end], y_true[end]) # same sign and |ŷ| >= 1 -> loss = 0
+ 0.0
+
+ julia> Flux.squared_hinge_loss(y_pred[2], y_true[2]) # opposite signs -> loss != 0
+ 1.6900000000000002
+ ```
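+
+ Each term is the square of the corresponding hinge loss:
+ `mean([0.9^2, 1.3^2, 0.0, 0.0]) = 0.625`.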
"""
function squared_hinge_loss(ŷ, y; agg = mean)
    _check_sizes(ŷ, y)
Return a loss based on the dice coefficient.
Used in the [V-Net](https://arxiv.org/abs/1606.04797) image segmentation
architecture.
- Similar to the F1_score. Calculated as:
+ The dice coefficient is similar to the F1_score. Loss calculated as:

1 - (2*sum(|ŷ .* y|) + smooth) / (sum(ŷ.^2) + sum(y.^2) + smooth)
+
+ # Example
+ ```jldoctest
+ julia> y_pred = [1.1, 2.1, 3.1];
+
+ julia> Flux.dice_coeff_loss(y_pred, 1:3)
+ 0.000992391663909964
+
+ julia> 1 - Flux.dice_coeff_loss(y_pred, 1:3) # ~ F1 score for image segmentation
+ 0.99900760833609
+ ```
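+
+ Here `sum(y_pred .* (1:3)) = 14.6`, `sum(y_pred.^2) = 15.23` and `sum((1:3).^2) = 14`,
+ so with `smooth = 1` the loss is `1 - (2 * 14.6 + 1) / (15.23 + 14 + 1) ≈ 0.000992`.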
"""
function dice_coeff_loss(ŷ, y; smooth = ofeltype(ŷ, 1.0))
    _check_sizes(ŷ, y)
@@ -438,7 +513,23 @@ Return the [Tversky loss](https://arxiv.org/abs/1706.05721).
Used with imbalanced data to give more weight to false negatives.
Larger β weighs recall more than precision (by placing more emphasis on false negatives).
Calculated as:
+
1 - (sum(|y .* ŷ|) + 1) / (sum(y .* ŷ + β*(1 .- y) .* ŷ + (1 - β)*y .* (1 .- ŷ)) + 1)
+
+ # Example
+ ```jldoctest
+ julia> ŷ = [1, 0, 1, 1, 0];
+
+ julia> y = [1, 0, 0, 1, 0]; # one false positive prediction
+
+ julia> Flux.tversky_loss(ŷ, y)
+ 0.18918918918918926
+
+ julia> y = [1, 1, 1, 1, 0]; # no false positives, but one false negative
+
+ julia> Flux.tversky_loss(ŷ, y) # loss is smaller, as here the false negative is weighted by 1 - β
+ 0.06976744186046513
+ ```
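+
+ In the first case `sum(y .* ŷ) = 2` and the false positive enters the denominator with
+ weight `β = 0.7`, so the loss is `1 - (2 + 1) / (2 + 0.7 + 1) = 1 - 3/3.7 ≈ 0.189`.
+ In the second, the false negative enters with weight `1 - β = 0.3`, giving
+ `1 - (3 + 1) / (3 + 0.3 + 1) ≈ 0.0698`.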
"""
function tversky_loss(ŷ, y; β = ofeltype(ŷ, 0.7))
    _check_sizes(ŷ, y)
@@ -456,6 +547,8 @@ The input, 'ŷ', is expected to be normalized (i.e. [softmax](@ref Softmax) out

For `γ == 0`, the loss is mathematically equivalent to [`Losses.binarycrossentropy`](@ref).

+ See also: [`Losses.focal_loss`](@ref) for multi-class setting
+
# Example
```jldoctest
julia> y = [0 1 0
@@ -473,9 +566,6 @@ julia> ŷ = [0.268941 0.5 0.268941
julia> Flux.binary_focal_loss(ŷ, y) ≈ 0.0728675615927385
true
```
-
- See also: [`Losses.focal_loss`](@ref) for multi-class setting
-
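+
+ As a quick consistency check (not part of the doctest), `Flux.binary_focal_loss(ŷ, y, γ=0)`
+ should match `Flux.binarycrossentropy(ŷ, y)` up to the `ϵ` stabiliser, per the equivalence
+ noted above.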
"""
function binary_focal_loss(ŷ, y; agg = mean, γ = 2, ϵ = epseltype(ŷ))
    _check_sizes(ŷ, y)
@@ -536,7 +626,17 @@ which can be useful for training Siamese Networks. It is given by
agg(@. (1 - y) * ŷ^2 + y * max(0, margin - ŷ)^2)

Specify `margin` to set the baseline for distance at which pairs are dissimilar.
-
+
+ # Example
+ ```jldoctest
+ julia> ŷ = [0.5, 1.5, 2.5];
+
+ julia> Flux.siamese_contrastive_loss(ŷ, 1:3)
+ -4.833333333333333
+
+ julia> Flux.siamese_contrastive_loss(ŷ, 1:3, margin = 2)
+ -4.0
+ ```
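+
+ Note that `y` is conventionally a binary indicator (0 for similar pairs, 1 for dissimilar
+ ones); the `1:3` above merely exercises the formula, and values of `y` above 1 are what
+ make the result negative here, e.g. `(1 - 3) * 2.5^2 = -12.5` for the last pair.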
"""
function siamese_contrastive_loss(ŷ, y; agg = mean, margin::Real = 1)
    _check_sizes(ŷ, y)