🤗 About me
Hi, I am Erli Zhang, a final-year undergraduate at Nanyang Technological University 🇸🇬, majoring in Computer Science. My current research interests include Multimodal Large Language Models, Visual Quality Assessment, and AI in Healthcare. I aspire to contribute to the academic community, and I look forward to joining a PhD program where I can explore these fields further.
📬 Contact Me
- Email: zhangerlicarl@gmail.com or ezhang005@e.ntu.edu.sg
- Twitter: @zhang_erli
🔥 News
- 2024.01.16: 🎉🎉 Q-Bench was accepted by ICLR 2024 (spotlight)!
- 2023.07.26: 🎉🎉 MaxVQA was accepted by ACM MM 2023 (oral)!
- 2023.07.14: 🎉🎉 DOVER was accepted by ICCV 2023!
📝 Publications
Q-Instruct: Improving Low-level Visual Abilities for Multi-modality Foundation Models
Haoning Wu*, Zicheng Zhang*, Erli Zhang*, Chaofeng Chen, Liang Liao, Annan Wang, Kaixin Xu, Chunyi Li, Jingwen Hou, Guangtao Zhai, Geng Xue, Wenxiu Sun, Qiong Yan, Weisi Lin
- We construct Q-Instruct, the first instruction-tuning dataset focused on human queries about low-level vision.
- You can now run the Q-Instruct demos on your own device! See local demos for instructions. (Currently supports mplug_owl-2 only.)
Q-Bench: A Benchmark for General-Purpose Foundation Models on Low-Level Vision
Haoning Wu*, Zicheng Zhang*, Erli Zhang*, Chaofeng Chen, Liang Liao, Annan Wang, Chunyi Li, Wenxiu Sun, Qiong Yan, Guangtao Zhai, Weisi Lin
- We construct Q-Bench, a benchmark that examines the progress of MLLMs on low-level visual abilities. Anticipating that these large foundation models will become general-purpose intelligence that can ultimately relieve human effort, we propose that MLLMs should achieve three important and distinct abilities: perception of low-level visual attributes, language description of low-level visual information, and image quality assessment (IQA).
- Submit your model at our project page to compete with existing ones!
Towards Explainable Video Quality Assessment: A Database and a Language-Prompted Approach
Haoning Wu*, Erli Zhang*, Liang Liao, Chaofeng Chen, Jingwen Hou, Annan Wang, Wenxiu Sun, Qiong Yan, Weisi Lin
- We collect over two million human opinions on 13 dimensions of quality-related factors to establish the multi-dimensional Maxwell database. Furthermore, we propose MaxVQA, a language-prompted VQA approach that modifies CLIP to better capture the important quality issues observed in our analyses.
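As a rough illustration of the language-prompted idea (this is not the actual MaxVQA code), visual quality can be scored by comparing visual features against an antonym prompt pair in a shared CLIP-style embedding space. In the sketch below, `prompt_quality_prob` is a hypothetical helper, and its feature arguments stand in for real CLIP image/text embeddings:

```python
import numpy as np

def prompt_quality_prob(frame_feats, pos_text, neg_text, temp=100.0):
    """Probability that frames match the positive prompt (e.g. "a high
    quality photo") rather than the negative one ("a low quality photo"),
    via cosine similarity and a CLIP-style temperature-scaled softmax.

    Hypothetical helper for illustration; the arguments stand in for
    real CLIP image/text embeddings.
    """
    def norm(x):
        x = np.asarray(x, dtype=float)
        return x / np.linalg.norm(x, axis=-1, keepdims=True)

    f, p, n = norm(frame_feats), norm(pos_text), norm(neg_text)
    logits = temp * np.stack([f @ p, f @ n], axis=-1)   # shape: (frames, 2)
    e = np.exp(logits - logits.max(axis=-1, keepdims=True))
    probs = e / e.sum(axis=-1, keepdims=True)           # softmax over the pair
    return float(probs[..., 0].mean())                  # mean over frames
```

Using antonym pairs rather than a single prompt anchors the score: the softmax over the pair turns raw similarities into a calibrated probability per quality dimension.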
Exploring Video Quality Assessment on User Generated Contents from Aesthetic and Technical Perspectives
Haoning Wu*, Erli Zhang*, Liang Liao*, Chaofeng Chen, Jingwen Hou, Annan Wang, Wenxiu Sun, Qiong Yan, Weisi Lin
- The proposed Disentangled Objective Video Quality Evaluator (DOVER) achieves state-of-the-art performance (0.91 SRCC on KoNViD-1k, 0.89 SRCC on LSVQ, 0.89 SRCC on YouTube-UGC) on the UGC-VQA problem. More importantly, our subjective study constructs DIVIDE-3k, the first aesthetic-and-technical VQA database, showing that UGC-VQA is jointly affected by the two perspectives.
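The SRCC numbers above measure rank agreement between predicted scores and human mean opinion scores (MOS). A minimal sketch of how SRCC is computed (assuming no tied scores; real evaluations typically use a library routine with tie handling):

```python
import numpy as np

def srcc(pred, mos):
    """Spearman rank-order correlation coefficient: the Pearson correlation
    of the ranks of predicted scores vs. human mean opinion scores.
    Simplified sketch with no tie handling."""
    pred = np.asarray(pred, dtype=float)
    mos = np.asarray(mos, dtype=float)
    # argsort twice turns each value into its 0-based rank
    rp = pred.argsort().argsort().astype(float)
    rm = mos.argsort().argsort().astype(float)
    rp -= rp.mean()
    rm -= rm.mean()
    return float((rp @ rm) / np.sqrt((rp @ rp) * (rm @ rm)))
```

Because SRCC depends only on ranks, it rewards a predictor that orders videos correctly by quality even if its raw scores are on a different scale than the MOS.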
📖 Education
- 2020.08.10 - 2024.05.30 (expected), Undergraduate Student, Major in Computer Science, Nanyang Technological University
- Specialization: Artificial Intelligence & Data Science
- Final Year Project supervised by Prof. Weisi Lin
- Research Topic: Explainable Visual Quality Assessments.
- 2021.08.10 - 2021.12.01, SUSEP Exchange Student, National University of Singapore
🎖 Honors and Awards
- 2022.07 CFAR Internship Award for Research Excellence
- 2019.06 NTU Science and Engineering Undergraduate Scholarship
💻 Internships and Projects
- July 2023-Present, Center for Cognition, Vision, and Learning, Johns Hopkins University, Research Student
- Supervisor: Prof Alan L. Yuille
- Evaluate how the robustness of a sequential learning model changes with every new task relative to jointly trained neural models
- Adapt current robustness methods to continual learning setups and analyse whether they improve model robustness when learning continually
- May 2023-July 2023, Sunstella Foundation, Summer Research Scholar
- Supervisor: Prof Jimeng Sun
- Worked on MedBind, an AI model combining multiple modalities to generate synthetic patient records to enhance clinical research
- Contributed to PyHealth, a comprehensive deep learning toolkit for supporting clinical predictive modelling
- July 2022-May 2023, Institute for Infocomm Research, AI Research Engineer
- Supervisor: Dr Weimin Huang
- Conducted research in medical image processing, focusing on mammogram analysis
- Developed a weakly semi-supervised, transformer-based model that predicts breast cancer risk at multiple time points from conventional mammograms, common risk factors, and clinical data
- July 2021-May 2022, Undergraduate Research Experience on Campus, Nanyang Technological University, URECA Research Student
- Supervisor: Prof Weisi Lin
- Identified common factors that lead to bias in facial analysis, e.g., occlusions, pose variation, expressions, etc.
- Evaluated current state-of-the-art face recognition methods on various datasets with bias
- Compared common feature detection and description techniques in occluded datasets