Linkers evaluation
- run: April 11, 2019 on the soweego-1 VPS instance;
- output folder: /srv/dev/20190411;
- head commit: 1505429997b878568a9e24185dc3afa7ad4720eb;
- command: `python -m soweego linker evaluate ${Algorithm} ${Dataset} ${Entity}`;
- evaluation technique: stratified 5-fold cross-validation over training/test splits (see the sketch after this list);
- performance scores are averaged over the folds.
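For clarity, here is a minimal sketch of the stratified 5-fold evaluation, assuming scikit-learn; the feature matrix, labels, and the toy data at the bottom are hypothetical placeholders, not soweego's actual pipeline.

```python
# Minimal sketch of stratified 5-fold cross-validation (illustration only, not soweego's code).
import numpy as np
from sklearn.metrics import f1_score, precision_score, recall_score
from sklearn.model_selection import StratifiedKFold
from sklearn.naive_bayes import BernoulliNB


def evaluate(classifier, features, labels, folds=5):
    """Return the mean and standard deviation of precision, recall, and F-score over the folds."""
    skf = StratifiedKFold(n_splits=folds, shuffle=True, random_state=0)
    scores = []
    for train_index, test_index in skf.split(features, labels):
        classifier.fit(features[train_index], labels[train_index])
        predicted = classifier.predict(features[test_index])
        scores.append((
            precision_score(labels[test_index], predicted),
            recall_score(labels[test_index], predicted),
            f1_score(labels[test_index], predicted),
        ))
    scores = np.array(scores)
    return scores.mean(axis=0), scores.std(axis=0)


# Toy data just to make the sketch runnable; soweego uses real feature vectors
rng = np.random.RandomState(0)
features = rng.rand(1000, 8)
labels = rng.randint(0, 2, 1000)

# Example with the Naïve Bayes parameters listed below
mean, std = evaluate(BernoulliNB(alpha=0.0001, binarize=0.1), features, labels)
print('mean (P, R, F):', mean, 'std:', std)
```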
- Naïve Bayes (NB):
  - binarize = 0.1;
  - alpha = 0.0001;
- liblinear SVM (LSVM): default parameters as per scikit-learn `LinearSVC`;
- libsvm SVM (SVM):
  - kernel = linear;
  - other parameters as per scikit-learn `SVC` defaults;
- single-layer perceptron (SLP):
  - layer = fully connected (`Dense`);
  - activation = sigmoid;
  - optimizer = stochastic gradient descent;
  - loss = binary cross-entropy;
  - training batch size = 1,024;
  - training epochs = 100;
- multi-layer perceptron (MLP), see the Keras sketch after this list:
  - layers = 128 > BN > 32 > BN > 1, i.e., fully connected (`Dense`) layers followed by batch normalization (BN);
  - activation:
    - hidden layers = relu;
    - output layer = sigmoid;
  - optimizer = Adadelta;
  - loss = binary cross-entropy;
  - training batch size = 1,024;
  - training epochs = 1,000;
  - early stopping: patience = 100.
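As an illustration of the MLP configuration above, here is a minimal Keras sketch; it is not soweego's exact code, and `n_features` is a hypothetical input dimensionality. The SLP is the same idea reduced to the single sigmoid output layer, trained with SGD.

```python
# Minimal Keras sketch of the MLP described above (illustration, not soweego's exact code).
from keras.callbacks import EarlyStopping
from keras.layers import BatchNormalization, Dense
from keras.models import Sequential

n_features = 8  # hypothetical input dimensionality

# 128 > BN > 32 > BN > 1: fully connected layers, each followed by batch normalization
model = Sequential([
    Dense(128, activation='relu', input_dim=n_features),
    BatchNormalization(),
    Dense(32, activation='relu'),
    BatchNormalization(),
    Dense(1, activation='sigmoid'),  # output layer
])
model.compile(optimizer='adadelta', loss='binary_crossentropy')

# Early stopping on the validation loss, patience = 100 epochs
early_stop = EarlyStopping(monitor='val_loss', patience=100)
# Training call, commented out because `features` and `labels` are not defined here:
# model.fit(features, labels, batch_size=1024, epochs=1000,
#           validation_split=0.1, callbacks=[early_stop])
```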
| Algorithm | Dataset | Entity | Precision (std) | Recall (std) | F-score (std) |
|---|---|---|---|---|---|
| NB | Discogs | Band | .789 (.0031) | .941 (.0004) | .859 (.002) |
| LSVM | Discogs | Band | .785 (.0058) | .946 (.0029) | .858 (.0034) |
| SVM | Discogs | Band | .777 (.003) | .963 (.0016) | .86 (.0024) |
| SLP | Discogs | Band | .776 (.0041) | .956 (.0012) | .857 (.0029) |
| NB | Discogs | Musician | .836 (.0018) | .958 (.0012) | .893 (.0013) |
| SVM | Discogs | Musician | .814 (.0015) | .986 (.0003) | .892 (.001) |
| SLP | Discogs | Musician | .815 (.002) | .985 (.0006) | .892 (.0012) |
| NB | IMDb | Actor | TODO | TODO | TODO |
| SVM | IMDb | Actor | TODO | TODO | TODO |
| SLP | IMDb | Actor | TODO | TODO | TODO |
| MLP | IMDb | Actor | TODO | TODO | TODO |
| NB | IMDb | Director | .897 (.00195) | .971 (.0012) | .932 (.001) |
| SVM | IMDb | Director | .919 (.0031) | .942 (.0019) | .93 (.002) |
| SLP | IMDb | Director | .867 (.0115) | .953 (.0043) | .908 (.0056) |
| NB | IMDb | Musician | .891 (.0042) | .96 (.0022) | .924 (.0026) |
| SVM | IMDb | Musician | .917 (.0043) | .937 (.0034) | .927 (.003) |
| SLP | IMDb | Musician | .922 (.005) | .914 (.0092) | .918 (.0055) |
| NB | IMDb | Producer | .871 (.0023) | .97 (.0037) | .918 (.0011) |
| SVM | IMDb | Producer | .92 (.005) | .938 (.0038) | .929 (.0026) |
| SLP | IMDb | Producer | .862 (.0609) | .914 (.0648) | .883 (.0185) |
| NB | IMDb | Writer | .91 (.003) | .961 (.0022) | .935 (.0022) |
| SVM | IMDb | Writer | .936 (.0029) | .948 (.0025) | .942 (.0026) |
| SLP | IMDb | Writer | .903 (.0154) | .955 (.0147) | .928 (.0047) |
| NB | MusicBrainz | Band | .822 (.00169) | .985 (.0008) | .896 (.001) |
| SVM | MusicBrainz | Band | .943 (.0019) | .888 (.0027) | .914 (.0016) |
| SLP | MusicBrainz | Band | .93 (.0265) | .885 (.0103) | .907 (.0082) |
| NB | MusicBrainz | Musician | .955 (.0009) | .936 (.0011) | .946 (.00068) |
| SVM | MusicBrainz | Musician | .941 (.0011) | .962 (.001) | .952 (.0004) |
| SLP | MusicBrainz | Musician | .943 (.0018) | .956 (.0019) | .949 (.0007) |
The following plots display the confidence score distribution and the total number of predictions yielded by each algorithm on each target classification set.
Note that linear SVM is omitted, since it does not output probability scores.
Axes:
- x = # predictions;
- y = confidence score.
The plots above give a rough idea of the amount of confident predictions.
Threshold values used for the counters in the tables below (see the counting sketch after this list):
- # predictions = predictions with confidence score >= 0.0000000001, i.e., essentially all matches;
- # confident = predictions with confidence score >= 0.8.
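As a small sketch (with hypothetical variable names and made-up scores, not soweego's actual output handling), the two counters can be derived from an array of confidence scores as follows:

```python
# Counting predictions above the two thresholds (illustration only).
import numpy as np

# Hypothetical confidence scores produced by a classifier on the target set
confidence_scores = np.array([0.02, 0.35, 0.64, 0.81, 0.97])

ALL_MATCHES_THRESHOLD = 0.0000000001  # keeps essentially every match
CONFIDENT_THRESHOLD = 0.8

n_predictions = int((confidence_scores >= ALL_MATCHES_THRESHOLD).sum())
n_confident = int((confidence_scores >= CONFIDENT_THRESHOLD).sum())
print('# predictions:', n_predictions, '# confident:', n_confident)
```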
Discogs Band (WD items: 50,316)
| Measure | NB | LSVM | SVM | SLP | MLP |
|---|---|---|---|---|---|
| Precision | .789 | .785 | .777 | .776 | .833 |
| Recall | .941 | .946 | .963 | .957 | .914 |
| F-score | .859 | .858 | .86 | .857 | .872 |
| # predictions | 820 | 51 | 94,430 | 91,295 | 91,132 |
| # confident | 219 | N.A. | 1,660 | 5,355 | 11,114 |
Discogs Musician (WD items: 199,180)
| Measure | NB | LSVM | SVM | SLP | MLP |
|---|---|---|---|---|---|
| Precision | .836 | .814 | .815 | .815 | .849 |
| Recall | .958 | .986 | .985 | .985 | .961 |
| F-score | .893 | .892 | .892 | .892 | .902 |
| # predictions | 3,872 | 200 | 533,301 | 517,450 | 514,488 |
| # confident | 1,101 | N.A. | 98,172 | 58,437 | 57,184 |
IMDb Director (WD items: 9,249)
| Measure | NB | LSVM | SVM | SLP | MLP |
|---|---|---|---|---|---|
| Precision | .897 | .919 | .908 | .867 | .916 |
| Recall | .971 | .942 | .958 | .953 | .961 |
| F-score | .932 | .93 | .932 | .908 | .938 |
| # predictions | 192 | 10 | 17,557 | 17,187 | 16,881 |
| # confident | 60 | N.A. | 1,616 | 553 | 1,810 |
IMDb Musician (WD items: 217,139)
| Measure | NB | LSVM | SVM | SLP | MLP |
|---|---|---|---|---|---|
| Precision | .891 | .917 | .908 | .922 | .903 |
| Recall | .96 | .937 | .942 | .914 | .951 |
| F-score | .924 | .927 | .924 | .918 | .926 |
| # predictions | 4,806 | 218 | 406,674 | 398,346 | 376,857 |
| # confident | 1,341 | N.A. | 21,462 | 7,244 | 16,272 |
IMDb Producer (WD items: 2,251)
| Measure | NB | LSVM | SVM | SLP | MLP |
|---|---|---|---|---|---|
| Precision | .871 | .92 | .923 | .862 | .912 |
| Recall | .97 | .938 | .926 | .914 | .956 |
| F-score | .918 | .929 | .925 | .883 | .933 |
| # predictions | 56 | 3 | 5,249 | 5,116 | 5,094 |
| # confident | 15 | N.A. | 507 | 180 | 529 |
IMDb Writer (WD items: 16,446)
| Measure | NB | LSVM | SVM | SLP | MLP |
|---|---|---|---|---|---|
| Precision | .91 | .936 | .932 | .903 | .921 |
| Recall | .961 | .948 | .954 | .955 | .962 |
| F-score | .935 | .942 | .943 | .928 | .941 |
| # predictions | 428 | 17 | 45,122 | 44,338 | 43,868 |
| # confident | 138 | N.A. | 2,934 | 1,548 | 3,234 |
MusicBrainz Band (WD items: 32,658)
| Measure | NB | LSVM | SVM | SLP | MLP |
|---|---|---|---|---|---|
| Precision | .822 | .943 | .939 | .93 | .933 |
| Recall | .985 | .888 | .893 | .885 | .902 |
| F-score | .896 | .914 | .915 | .907 | .918 |
| # predictions | 265 | 33 | 39,618 | 38,012 | 33,981 |
| # confident | 46 | N.A. | 1,475 | 501 | 1,506 |
MusicBrainz Musician (WD items: 153,725)
| Measure | NB | LSVM | SVM | SLP | MLP |
|---|---|---|---|---|---|
| Precision | .955 | .941 | .95 | .943 | .940 |
| Recall | .936 | .962 | .938 | .956 | .968 |
| F-score | .946 | .952 | .944 | .949 | .954 |
| # predictions | 2,833 | 154 | 280,029 | 260,530 | 194,505 |
| # confident | 1,212 | N.A. | 7,496 | 7,339 | 8,470 |