πŸ‘€ About me

Hi, I am Erli Zhang, a final-year undergraduate at Nanyang Technological University πŸ‡ΈπŸ‡¬, majoring in Computer Science. My current research interests include Multimodal Large Language Models, Visual Quality Assessment, and AI in Healthcare. I aspire to contribute significantly to the academic community and look forward to joining a PhD program where I can explore these fields further.

πŸ“¬ Contact Me

  • Email: zhangerlicarl@gmail.com or ezhang005@e.ntu.edu.sg
  • Twitter: @zhang_erli

πŸ”₯ News

  • 2024.01.16: πŸŽ‰πŸŽ‰ Q-Bench was accepted by ICLR 2024 (spotlight)!
  • 2023.07.26: πŸŽ‰πŸŽ‰ MaxVQA was accepted by ACMMM 2023 (oral)!
  • 2023.07.14: πŸŽ‰πŸŽ‰ DOVER was accepted by ICCV 2023!

πŸ“ Publications

Preprint

Q-Instruct: Improving Low-level Visual Abilities for Multi-modality Foundation Models

Haoning Wu*, Zicheng Zhang*, Erli Zhang*, Chaofeng Chen, Liang Liao, Annan Wang, Kaixin Xu, Chunyi Li, Jingwen Hou, Guangtao Zhai, Geng Xue, Wenxiu Sun, Qiong Yan, Weisi Lin

  • We construct Q-Instruct, the first instruction-tuning dataset focused on human queries about low-level vision.
  • You can now run the Q-Instruct demos on your own device! See the local demos for instructions. (Currently supports mplug_owl-2 only.)

Preprint

Q-Bench: A Benchmark for General-Purpose Foundation Models on Low-Level Vision

Haoning Wu*, Zicheng Zhang*, Erli Zhang*, Chaofeng Chen, Liang Liao, Annan Wang, Chunyi Li, Wenxiu Sun, Qiong Yan, Guangtao Zhai, Weisi Lin

  • We construct Q-Bench, a benchmark that examines the progress of MLLMs on low-level visual abilities. Anticipating that these foundation models will become general-purpose intelligence capable of ultimately relieving human effort, we propose that MLLMs should achieve three important and distinct abilities: perception of low-level visual attributes, language description of low-level visual information, and image quality assessment (IQA).
  • Submit your model at our project page to compete with existing ones!

ACMMM 2023

Towards Explainable Video Quality Assessment: A Database and a Language-Prompted Approach

Haoning Wu*, Erli Zhang*, Liang Liao, Chaofeng Chen, Jingwen Hou, Annan Wang, Wenxiu Sun, Qiong Yan, Weisi Lin

GitHub, arXiv

  • We collect over two million human opinions on 13 dimensions of quality-related factors to establish the multi-dimensional Maxwell database. Furthermore, we propose MaxVQA, a language-prompted VQA approach that modifies CLIP to better capture the important quality issues observed in our analyses (a sketch of the language-prompting idea follows below).
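
To make "language-prompted" concrete, here is a minimal sketch of CLIP-style quality prompting. This is an illustration of the general idea, not MaxVQA's actual implementation: the `openai/clip-vit-base-patch32` checkpoint and the antonym prompt pair are assumptions made for the example.

```python
# A minimal, illustrative sketch of language-prompted quality scoring in the
# spirit of MaxVQA -- NOT the authors' implementation. Assumes the public
# openai/clip-vit-base-patch32 checkpoint from HuggingFace transformers; the
# antonym prompt pair below is a hypothetical design choice.
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

def quality_score(image: Image.Image) -> float:
    """Return the probability that the 'high quality' prompt fits the image."""
    prompts = ["a high quality photo", "a low quality photo"]  # hypothetical antonym pair
    inputs = processor(text=prompts, images=image, return_tensors="pt", padding=True)
    with torch.no_grad():
        logits = model(**inputs).logits_per_image  # shape (1, 2): image-text similarity
    # Softmax over the antonym pair; higher means closer to "high quality".
    return logits.softmax(dim=-1)[0, 0].item()

# Usage (hypothetical file): print(quality_score(Image.open("frame.png")))
```

MaxVQA itself goes further, modifying CLIP and prompting across the 13 quality dimensions of the Maxwell database, but the prompt-vs-image similarity above is the basic mechanism such approaches build on.
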
ICCV 2023

Exploring Video Quality Assessment on User Generated Contents from Aesthetic and Technical Perspectives

Haoning Wu*, Erli Zhang*, Liang Liao*, Chaofeng Chen, Jingwen Hou, Annan Wang, Wenxiu Sun, Qiong Yan, Weisi Lin

GitHub, arXiv

  • The proposed Disentangled Objective Video Quality Evaluator (DOVER) reaches state-of-the-art performance on the UGC-VQA problem (0.91 SRCC on KoNViD-1k, 0.89 SRCC on LSVQ, 0.89 SRCC on YouTube-UGC; see the sketch below for how SRCC is computed). More importantly, our subjective studies construct DIVIDE-3k, the first VQA database with separate aesthetic and technical annotations, demonstrating that UGC-VQA is jointly affected by both perspectives.
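
For context on the metric above: SRCC is the Spearman rank-order correlation between a model's predicted quality scores and human mean opinion scores (MOS), so 0.91 means the predicted quality ranking closely matches the human ranking. A minimal sketch of the computation using scipy (the numbers are placeholders, not DOVER outputs):

```python
# Minimal sketch: SRCC between predicted quality scores and human MOS.
# The arrays are hypothetical placeholders, not real benchmark data.
from scipy.stats import spearmanr

predicted = [3.1, 4.7, 2.2, 3.9, 1.5]  # model quality predictions (hypothetical)
mos = [3.0, 4.9, 2.5, 3.6, 1.2]        # human mean opinion scores (hypothetical)

srcc, p_value = spearmanr(predicted, mos)
print(f"SRCC = {srcc:.3f}")  # 1.000 here, since the two rankings agree exactly
```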

πŸ“– Education

  • 2020.08.10 - 2024.05.30 (expected), Undergraduate Student, Major in Computer Science, Nanyang Technological University
    • Specialization: Artificial Intelligence & Data Science
    • Final Year Project supervised by Prof. Weisi Lin
    • Research topic: Explainable Visual Quality Assessment
  • 2021.08.10 - 2021.12.01, SUSEP Exchange Student, National University of Singapore

πŸŽ– Honors and Awards

  • 2022.07 CFAR Internship Award for Research Excellence
  • 2019.06 NTU Science and Engineering Undergraduate Scholarship

πŸ’» Internships and Projects

  • July 2023-Present, Center for Cognition, Vision, and Learning, Johns Hopkins University, Research Student
    • Supervisor: Prof. Alan L. Yuille
    • Evaluate how the robustness of a sequential learning model changes with every new task relative to jointly trained neural models
    • Adapt current robustness methods to continual learning setups and analyse whether they improve model robustness when learning continually
  • May 2023-July 2023, Sunstella Foundation, Summer Research Scholar
    • Supervisor: Prof. Jimeng Sun
    • Worked on MedBind, a multimodal AI model that generates synthetic patient records to enhance clinical research
    • Contributed to PyHealth, a comprehensive deep learning toolkit for supporting clinical predictive modelling
  • July 2022-May 2023, Institute for Infocomm Research, AI Research Engineer
    • Supervisor: Dr. Weimin Huang
    • Conducted research on medical image processing, with a focus on mammogram analysis
    • Developed a weakly semi-supervised, transformer-based model that predicts breast cancer risk at multiple time points from traditional mammograms combined with common risk factors and clinical data
  • July 2021-May 2022, Undergraduate Research Experience on Campus, Nanyang Technological University, URECA Research Student
    • Supervisor: Prof. Weisi Lin
    • Identified common factors that lead to bias in facial analysis, such as occlusions, pose variation, and expressions
    • Evaluated current state-of-the-art face recognition methods on various datasets with bias
    • Compared common feature detection and description techniques in occluded datasets