This paper follows the initial idea of iGPT: use an autoregressive objective to pre-train vision transformers on image patches, without any supervision or labels, using a pixel-level regression loss.
Moreover, AIM (the model proposed in this paper) introduces two architectural modifications, prefix attention and MLP prediction heads, and discusses the correlation between the pre-training objective and downstream performance.
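For the slides, prefix attention might be easiest to convey with a tiny mask example. This is a minimal sketch (assuming NumPy; the function name is illustrative, not from the paper's code): the first `prefix_len` patches attend to each other bidirectionally, while the remaining patches keep the causal pattern, which better matches the bidirectional attention used downstream.

```python
import numpy as np

def prefix_causal_mask(seq_len, prefix_len):
    """Boolean attention mask: True means position i may attend to position j."""
    # Standard causal mask: each position attends only to itself and earlier positions.
    mask = np.tril(np.ones((seq_len, seq_len), dtype=bool))
    # Prefix attention: positions inside the prefix attend to each other
    # bidirectionally; positions after the prefix stay causal.
    mask[:prefix_len, :prefix_len] = True
    return mask

mask = prefix_causal_mask(seq_len=6, prefix_len=3)
```

With `prefix_len=3`, patch 0 can see patch 2 (bidirectional inside the prefix), while patch 3 still cannot see patch 5, so the autoregressive loss is only computed on the non-prefix patches.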
I think this could be great additional information for the iGPT part. If possible, I can help you with making the slides.