Evaluate the performance of four cross-modal retrieval algorithms by mean average precision (mAP) on three datasets; a minimal sketch of the shared-subspace pipeline they have in common follows the list.
- CCA (Canonical Correlation Analysis)
- PLS (Partial Least Squares)
- BLM (Bilinear Model)
- GMMFA (Generalized Multiview Marginal Fisher Analysis)
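All four methods learn projections that map both modalities into a common latent space, where retrieval reduces to nearest-neighbor search. Below is a minimal sketch using scikit-learn's CCA; the array shapes, feature contents, and component count are illustrative assumptions, not the exact experimental settings.

```python
import numpy as np
from sklearn.cross_decomposition import CCA

# Illustrative stand-ins for paired image/text training features
# (e.g. GIST vs. word-frequency vectors); shapes are assumptions.
rng = np.random.default_rng(0)
X_img = rng.standard_normal((2808, 512))
X_txt = rng.standard_normal((2808, 200))

# Learn a shared latent space from paired samples, then project
# both modalities into it. PLS/BLM/GMMFA plug in the same way.
cca = CCA(n_components=10)
cca.fit(X_img, X_txt)
Z_img, Z_txt = cca.transform(X_img, X_txt)
```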
PASCAL VOC 2007: this experiment uses the preprocessed dataset provided by [1].
Wikipedia: http://www.svcl.ucsd.edu/projects/crossmodal/
NUS-WIDE: https://lms.comp.nus.edu.sg/wp-content/uploads/2019/research/nuswide/NUS-WIDE.html
PASCAL VOC 2007: train on the 2808 samples that contain only one object. The features are GIST (visual) and word-frequency vectors (tags), as described in [1].
Wikipedia: train on 2173 samples. Images are represented by 128-dimensional vector-quantized SIFT features, and text by 10-dimensional latent Dirichlet allocation topic vectors.
NUS-WIDE: train on 10000 samples (limited by machine capacity). The features are 225-dimensional block-wise color moments (visual) and 1000-dimensional word-frequency vectors (text).
For PASCAL VOC 2007, I evaluate the models on the 2841 test samples that contain only one object.
For Wikipedia, I evaluate the models on 693 test samples.
For NUS-WIDE, I evaluate the models on 5000 test samples.
Image-to-text: retrieve related text with an image query from the test set. Return an ordered list in which each element is the index of a retrieved text in the test set.
Text-to-image: retrieve related images with a text query from the test set. Return an ordered list in which each element is the index of a retrieved image in the test set.
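A hedged sketch of how such an ordered index list can be produced, assuming cosine similarity in the learned subspace (the actual similarity measure is not specified above); the test projections here are random stand-ins for the outputs of CCA/PLS/BLM/GMMFA:

```python
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity

def rank_gallery(query_emb, gallery_emb):
    """For each query, return gallery indices sorted from most to
    least similar; this is the ordered list described above."""
    sims = cosine_similarity(query_emb, gallery_emb)  # (n_queries, n_gallery)
    return np.argsort(-sims, axis=1)

# Z_img_test / Z_txt_test stand for test-set projections into the
# shared subspace; random values here just to keep this runnable.
rng = np.random.default_rng(0)
Z_img_test = rng.standard_normal((693, 10))
Z_txt_test = rng.standard_normal((693, 10))

i2t = rank_gallery(Z_img_test, Z_txt_test)  # image query -> text indices
t2i = rank_gallery(Z_txt_test, Z_img_test)  # text query  -> image indices
```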
In evaluation, a retrieved item counts as relevant if its object class matches the ground-truth class of the query.
mAP is computed following the convention used in recommender-system research; see [2].
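A minimal sketch of that mAP computation under the class-match relevance rule above; the function names are mine, and the rankings are assumed to come from the retrieval step:

```python
import numpy as np

def average_precision(ranked_labels, query_label):
    """AP for one query: precision is accumulated at each rank where the
    retrieved item's class matches the query's ground-truth class."""
    hits, prec_sum = 0, 0.0
    for k, label in enumerate(ranked_labels, start=1):
        if label == query_label:
            hits += 1
            prec_sum += hits / k
    return prec_sum / hits if hits else 0.0

def mean_average_precision(rankings, gallery_labels, query_labels):
    """mAP over all queries; `rankings` holds each query's ordered
    gallery indices, and `gallery_labels` must be a NumPy array."""
    return float(np.mean([
        average_precision(gallery_labels[r], q)
        for r, q in zip(rankings, query_labels)
    ]))
```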
[1] S. J. Hwang and K. Grauman, "Accounting for the Relative Importance of Objects in Image Retrieval," BMVC 2010.
[2] https://zhuanlan.zhihu.com/p/74429856
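In the tables below, the "+PCA" rows denote additionally reducing the features with PCA before fitting each model. A minimal sketch of that preprocessing, assuming each modality is reduced independently with scikit-learn; the component count is an illustrative choice, as the exact setting is not recorded here:

```python
import numpy as np
from sklearn.decomposition import PCA

# Assumption: each modality is reduced independently before fitting
# CCA/PLS/BLM/GMMFA; the 64-component target is illustrative only.
rng = np.random.default_rng(0)
X_img = rng.standard_normal((2808, 512))   # stand-in visual features
X_txt = rng.standard_normal((2808, 200))   # stand-in text features

X_img_pca = PCA(n_components=64).fit_transform(X_img)
X_txt_pca = PCA(n_components=64).fit_transform(X_txt)
```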
| Method | Image-to-text (mAP) | Text-to-image (mAP) |
| --- | --- | --- |
| CCA | 0.1962 | 0.1754 |
| PLS | 0.2266 | 0.1879 |
| BLM | 0.2419 | 0.2085 |
| GMMFA | 0.2424 | 0.2089 |
| CCA+PCA | 0.2252 | 0.1958 |
| PLS+PCA | 0.2450 | 0.2015 |
| BLM+PCA | 0.2450 | 0.2045 |
| GMMFA+PCA | 0.2465 | 0.2050 |
| Method | Image-to-text (mAP) | Text-to-image (mAP) |
| --- | --- | --- |
| CCA | 0.2435 | 0.1978 |
| PLS | 0.2075 | 0.1654 |
| BLM | 0.2589 | 0.2008 |
| GMMFA | 0.2481 | 0.1997 |
| CCA+PCA | 0.2649 | 0.2162 |
| PLS+PCA | 0.2477 | 0.2047 |
| BLM+PCA | 0.2607 | 0.2101 |
| GMMFA+PCA | 0.2471 | 0.2006 |
| Method | Image-to-text (mAP) | Text-to-image (mAP) |
| --- | --- | --- |
| CCA | 0.2316 | 0.2372 |
| PLS | 0.2194 | 0.2246 |
| BLM | 0.2519 | 0.2510 |
| GMMFA | 0.2503 | 0.2440 |
| CCA+PCA | 0.2261 | 0.2391 |
| PLS+PCA | 0.2331 | 0.2363 |
| BLM+PCA | 0.2470 | 0.2510 |
| GMMFA+PCA | 0.2507 | 0.2442 |
| Method | Image-to-text (mAP) | Text-to-image (mAP) |
| --- | --- | --- |
| CCA | 0.3552 | 0.3382 |
| PLS | 0.6611 | 0.6986 |
| BLM | 0.6355 | 0.6381 |
| GMMFA | 0.6374 | 0.6403 |
| Method | Image-to-text (mAP) | Text-to-image (mAP) |
| --- | --- | --- |
| CCA | 0.3126 | 0.2814 |
| PLS | 0.3879 | 0.3505 |
| BLM | 0.3840 | 0.3650 |
| GMMFA | 0.3950 | 0.3570 |
| Method | Image-to-text (mAP) | Text-to-image (mAP) |
| --- | --- | --- |
| CCA | 0.2831 | 0.2826 |
| PLS | 0.4231 | 0.4110 |
| BLM | 0.3949 | 0.4110 |
| GMMFA | 0.4074 | 0.4095 |