---
layout: homepage
---

About Me

I am a Ph.D. student at Xidian University, advised by Prof. Bo Chen. My long-term research goal is to build an explainable multi-modality cognition system in which machines can reason, make logical decisions, and generate content as humans do.

Research Interests

My current research focuses on multi-modality understanding and generation. My research areas include:

Image-to-text understanding: VLM, visual captioning, VQA, prompt learning in vision-and-language models
Knowledge-aware machine learning: retrieval-augmented generation, knowledge enhancement
Remote sensing foundation models: vision foundation models for multi-modality remote sensing tasks
Cross-modality image synthesis: SAR-to-optical generation

News

[July 2024] Our paper on image captioning evaluation has been accepted to ACM MM 2024!
[Mar. 2024] Our paper on memory-augmented image captioning has been accepted to CVPR 2024!
[July 2023] Our paper on multi-label image classification has been accepted to ICCV 2023!
[Mar. 2023] Our paper on zero-shot image captioning has been accepted to CVPR 2023!
[Feb. 2022] Our paper on image paragraphing has been accepted to IJCV 2022!

{% include_relative _includes/publications.md %}

{% include_relative _includes/services.md %}
