I am Rongjie Huang (黄融杰). I completed my graduate studies at the College of Computer Science and Software, Zhejiang University, supervised by Prof. Zhou Zhao, and also obtained my Bachelor's degree at Zhejiang University. During my graduate studies, I was lucky to collaborate with the CMU Speech Team led by Prof. Shinji Watanabe and the Audio Research Team at Zhejiang University. I was grateful to intern at or collaborate with TikTok, Shanghai AI Lab (OpenGVLab), Tencent Seattle Lab, and Alibaba DAMO Academy, working with Yi Ren, Jinglin Liu, Chunlei Zhang, and Dong Yu.
My research interests include Multi-Modal Generative AI, Multi-Modal Language Processing, and AI4Science. I have published first-author papers at top international AI conferences such as NeurIPS, ICLR, ICML, ACL, and IJCAI.
I am actively looking for academic collaborations; feel free to drop me an email.
- Personal Pages: https://rongjiehuang.github.io (updated recently🔥)
- LinkedIn: https://www.linkedin.com/in/rongjie-huang-a362541b2
- Google Scholar: https://scholar.google.com/citations?user=iRHBUsgAAAAJ
Generative AI for Speech, Singing, and Audio: Spoken Large Language Models, Text-to-Audio Synthesis, Text-to-Speech Synthesis, Singing Voice Synthesis
Audio-Visual Language Processing: Audio-Visual Speech-to-Speech Translation, Self-Supervised Learning
My full paper list is available on my personal homepage.
- InstructSpeech: Following Speech Editing Instructions via Large Language Models. Rongjie Huang, Ruofan Hu, Yongqi Wang, Zehan Wang, Xize Cheng, Ziyue Jiang, Zhenhui Ye, Dongchao Yang, Luping Liu, Peng Gao, Zhou Zhao. ICML, 2024
- AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head. Rongjie Huang, Mingze Li, Dongchao Yang, Jiatong Shi, Xuankai Chang, Zhenhui Ye, Yuning Wu, Zhiqing Hong, Jiawei Huang, Jinglin Liu, Yi Ren, Zhou Zhao, Shinji Watanabe. AAAI, 2024
- UniAudio: An Audio Foundation Model Toward Universal Audio Generation. Dongchao Yang, Jinchuan Tian, Xu Tan, Rongjie Huang, Songxiang Liu, Xuankai Chang, Jiatong Shi, Sheng Zhao, Jiang Bian, Xixin Wu, Zhou Zhao, Shinji Watanabe, Helen Meng. ICML, 2024
- Make-An-Audio: Text-To-Audio Generation with Prompt-Enhanced Diffusion Models. Rongjie Huang, Jiawei Huang, Dongchao Yang, Yi Ren, Mingze Li, Zhenhui Ye, Jinglin Liu, Xiang Yin, Zhou Zhao. ICML, 2023
- Lumina-Next: Making Lumina-T2X Stronger and Faster with Next-DiT. Le Zhuo*, Ruoyi Du*, Han Xiao*, Yangguang Li*, Dongyang Liu*, Rongjie Huang*, Wenze Liu, Lirui Zhao, Fu-Yun Wang, Zhanyu Ma, Xu Luo, Zehan Wang, Kaipeng Zhang, Xiangyang Zhu, Si Liu, Xiangyu Yue, Dingning Liu, Wanli Ouyang, Ziwei Liu, Yu Qiao, Hongsheng Li, Peng Gao. ICLR, 2024
- Make-An-Audio 2: Improving Text-to-Audio with Dual Text Information Representation. Jiawei Huang, Yi Ren, Rongjie Huang, Dongchao Yang, Zhenhui Ye, Chen Zhang, Jinglin Liu, Xiang Yin, Zejun Ma, Zhou Zhao. arXiv, 2023
- Seamless Interaction: Dyadic Audiovisual Motion Modeling and Large-Scale Dataset. FAIR at Meta (core contributor)
- TranSpeech: Speech-to-Speech Translation With Bilateral Perturbation. Rongjie Huang, Jinglin Liu, Huadai Liu, Yi Ren, Lichao Zhang, Jinzheng He, and Zhou Zhao. ICLR, 2023
- AV-TranSpeech: Audio-Visual Robust Speech-to-Speech Translation. Rongjie Huang, Huadai Liu, Xize Cheng, Yi Ren, Linjun Li, Zhenhui Ye, Jinzheng He, Lichao Zhang, Jinglin Liu, Xiang Yin and Zhou Zhao. ACL, 2023
- GenerSpeech: Towards Style Transfer for Generalizable Out-Of-Domain Text-to-Speech. Rongjie Huang, Yi Ren, Jinglin Liu, Chenye Cui, and Zhou Zhao. NeurIPS, 2022
- FastDiff: A Fast Conditional Diffusion Model for High-Quality Speech Synthesis. Rongjie Huang, Max W.Y. Lam, Jun Wang, Dan Su, Dong Yu, Yi Ren, and Zhou Zhao. IJCAI, 2022 (Oral)
- Multi-Singer: Fast Multi-Singer Singing Voice Vocoder with a Large-Scale Corpus. Rongjie Huang, Feiyang Chen, Yi Ren, Jinglin Liu, Chenye Cui, and Zhou Zhao. ACM MM, 2021 (Oral)