---
layout: homepage
---

About Me

I am a Ph.D. student at Xidian University, advised by Prof. Bo Chen. My long-term research goal is to build an explainable multi-modality cognition system in which machines can reason, make logical decisions, and generate content as humans do.

Research Interests

My current research focuses on multi-modality understanding and generation. My research areas include:

  • Image-to-text understanding: VLM, visual captioning, VQA, prompt learning in vision-and-language models
  • Knowledge-aware machine learning: retrieval-augmented generation, knowledge enhancement
  • Remote sensing foundation models: vision foundation models for multi-modality remote sensing tasks
  • Cross-modality image synthesis: SAR-to-optical generation

News

  • [July 2024] Our paper about image captioning evaluation is accepted to ACM MM2024!
  • [Mar. 2024] Our paper about memory-augmented image captioning is accepted to CVPR2024!
  • [July 2023] Our paper about multi-label image classification is accepted to ICCV2023!
  • [Mar. 2023] Our paper about zero-shot image captioning is accepted to CVPR2023!
  • [Feb. 2022] Our paper about image paragraphing is accepted to IJCV2022!

{% include_relative _includes/publications.md %}

{% include_relative _includes/services.md %}