
Hi there 👋

  • 👩 I’m Sheng, a PhD student from China, currently studying as a visiting student at the National University of Singapore.
  • 🧐 My focus is multimodal learning, especially VQA, and I’m currently exploring multimodal LLMs.
  • 💬 As an ENFJ-A, I thrive on meaningful collaboration and communication.
  • 📫 You can reach me at hzgn97@gmail.com—let’s connect!

Pinned

  1. EgoTextVQA — [CVPR'25] 🌟🌟 EgoTextVQA: Towards Egocentric Scene-Text Aware Video Question Answering (Python)

  2. ViTXT-GQA — [IEEE TMM'25] Scene-Text Grounding for Text-Based Video Question Answering (Python)

  3. Awesome-MLLM-TextVQA — ✨✨ Latest Research on Multimodal Large Language Models for Scene-Text VQA Tasks

  4. GPIN — [ACM TOMM'24] Graph Pooling Inference Network for Text-Based VQA (Python)

  5. SSGN — [IEEE TIP'23] Exploring Sparse Spatial Relation in Graph Inference for Text-Based VQA (Python)