`Awesome Remote Sensing Foundation Models`

🌟 A collection of papers, datasets, benchmarks, code, and pre-trained weights for Remote Sensing Foundation Models (RSFMs).

📢 Latest Updates

🔥🔥🔥 Last Updated on 2024.04.03 🔥🔥🔥

2024.04.03: Update SAMRS and msGFM.
2024.04.01: Update PIS and H2RSVLM.
2024.03.27: Update Remote Sensing Task-specific Foundation Models and LuoJiaHOG.
2024.03.25: Update DOFA.

Remote Sensing Vision Foundation Models

Abbreviation	Title	Publication	Paper	Code & Weights
GeoKR	Geographical Knowledge-Driven Representation Learning for Remote Sensing Images	TGRS2021	GeoKR	link
-	Self-Supervised Learning of Remote Sensing Scene Representations Using Contrastive Multiview Coding	CVPRW2021	Paper	link
GASSL	Geography-Aware Self-Supervised Learning	ICCV2021	GASSL	link
SeCo	Seasonal Contrast: Unsupervised Pre-Training From Uncurated Remote Sensing Data	ICCV2021	SeCo	link
DINO-MM	Self-supervised Vision Transformers for Joint SAR-optical Representation Learning	IGARSS2022	DINO-MM	link
SatMAE	SatMAE: Pre-training Transformers for Temporal and Multi-Spectral Satellite Imagery	NeurIPS2022	SatMAE	link
RS-BYOL	Self-Supervised Learning for Invariant Representations From Multi-Spectral and SAR Images	JSTARS2022	RS-BYOL	null
GeCo	Geographical Supervision Correction for Remote Sensing Representation Learning	TGRS2022	GeCo	null
RingMo	RingMo: A remote sensing foundation model with masked image modeling	TGRS2022	RingMo	Code
RVSA	Advancing plain vision transformer toward remote sensing foundation model	TGRS2022	RVSA	link
RSP	An Empirical Study of Remote Sensing Pretraining	TGRS2022	RSP	link
MATTER	Self-Supervised Material and Texture Representation Learning for Remote Sensing Tasks	CVPR2022	MATTER	null
CSPT	Consecutive Pre-Training: A Knowledge Transfer Learning Strategy with Relevant Unlabeled Data for Remote Sensing Domain	RS2022	CSPT	link
-	Self-supervised Vision Transformers for Land-cover Segmentation and Classification	CVPRW2022	Paper	link
BFM	A billion-scale foundation model for remote sensing images	Arxiv2023	BFM	null
TOV	TOV: The original vision model for optical remote sensing image understanding via self-supervised learning	JSTARS2023	TOV	link
CMID	CMID: A Unified Self-Supervised Learning Framework for Remote Sensing Image Understanding	TGRS2023	CMID	link
RingMo-Sense	RingMo-Sense: Remote Sensing Foundation Model for Spatiotemporal Prediction via Spatiotemporal Evolution Disentangling	TGRS2023	RingMo-Sense	null
IaI-SimCLR	Multi-Modal Multi-Objective Contrastive Learning for Sentinel-1/2 Imagery	CVPRW2023	IaI-SimCLR	null
CACo	Change-Aware Sampling and Contrastive Learning for Satellite Images	CVPR2023	CACo	link
SatLas	SatlasPretrain: A Large-Scale Dataset for Remote Sensing Image Understanding	ICCV2023	SatLas	link
GFM	Towards Geospatial Foundation Models via Continual Pretraining	ICCV2023	GFM	link
Scale-MAE	Scale-MAE: A Scale-Aware Masked Autoencoder for Multiscale Geospatial Representation Learning	ICCV2023	Scale-MAE	link
DINO-MC	DINO-MC: Self-supervised Contrastive Learning for Remote Sensing Imagery with Multi-sized Local Crops	Arxiv2023	DINO-MC	link
CROMA	CROMA: Remote Sensing Representations with Contrastive Radar-Optical Masked Autoencoders	NeurIPS2023	CROMA	link
Cross-Scale MAE	Cross-Scale MAE: A Tale of Multiscale Exploitation in Remote Sensing	NeurIPS2023	Cross-Scale MAE	link
DeCUR	DeCUR: decoupling common & unique representations for multimodal self-supervision	Arxiv2023	DeCUR	link
Presto	Lightweight, Pre-trained Transformers for Remote Sensing Timeseries	Arxiv2023	Presto	link
CtxMIM	CtxMIM: Context-Enhanced Masked Image Modeling for Remote Sensing Image Understanding	Arxiv2023	CtxMIM	null
FG-MAE	Feature Guided Masked Autoencoder for Self-supervised Learning in Remote Sensing	Arxiv2023	FG-MAE	link
Prithvi	Foundation Models for Generalist Geospatial Artificial Intelligence	Arxiv2023	Prithvi	link
RingMo-lite	RingMo-lite: A Remote Sensing Multi-task Lightweight Network with CNN-Transformer Hybrid Framework	Arxiv2023	RingMo-lite	null
-	A Self-Supervised Cross-Modal Remote Sensing Foundation Model with Multi-Domain Representation and Cross-Domain Fusion	IGARSS2023	Paper	null
EarthPT	EarthPT: a foundation model for Earth Observation	NeurIPS2023 CCAI workshop	EarthPT	link
USat	USat: A Unified Self-Supervised Encoder for Multi-Sensor Satellite Imagery	Arxiv2023	USat	link
FoMo-Bench	FoMo-Bench: a multi-modal, multi-scale and multi-task Forest Monitoring Benchmark for remote sensing foundation models	Arxiv2023	FoMo-Bench	Coming soon
AIEarth	Analytical Insight of Earth: A Cloud-Platform of Intelligent Computing for Geospatial Big Data	Arxiv2023	AIEarth	link
-	Self-Supervised Learning for SAR ATR with a Knowledge-Guided Predictive Architecture	Arxiv2023	Paper	null
Clay	Clay Foundation Model	-	null	link
Hydro	Hydro--A Foundation Model for Water in Satellite Imagery	-	null	link
U-BARN	Self-Supervised Spatio-Temporal Representation Learning of Satellite Image Time Series	JSTARS2024	Paper	null
GeRSP	Generic Knowledge Boosted Pre-training For Remote Sensing Images	Arxiv2024	GeRSP	GeRSP
SwiMDiff	SwiMDiff: Scene-wide Matching Contrastive Learning with Diffusion Constraint for Remote Sensing Image	Arxiv2024	SwiMDiff	null
OFA-Net	One for All: Toward Unified Foundation Models for Earth Vision	Arxiv2024	OFA-Net	null
SMLFR	Generative ConvNet Foundation Model With Sparse Modeling and Low-Frequency Reconstruction for Remote Sensing Image Interpretation	TGRS2024	SMLFR	link
SpectralGPT	SpectralGPT: Spectral Foundation Model	TPAMI2024	SpectralGPT	link
S2MAE	S2MAE: A Spatial-Spectral Pretraining Foundation Model for Spectral Remote Sensing Data	CVPR2024	S2MAE	null
SatMAE++	Rethinking Transformers Pre-training for Multi-Spectral Satellite Imagery	CVPR2024	SatMAE++	link
msGFM	Bridging Remote Sensors with Multisensor Geospatial Foundation Models	CVPR2024	msGFM	link
SkySense	SkySense: A Multi-Modal Remote Sensing Foundation Model Towards Universal Interpretation for Earth Observation Imagery	CVPR2024	SkySense	Coming soon
MTP	MTP: Advancing Remote Sensing Foundation Model via Multi-Task Pretraining	Arxiv2024	MTP	link
DOFA	Neural Plasticity-Inspired Foundation Model for Observing the Earth Crossing Modalities	Arxiv2024	DOFA	link
PIS	Pretrain A Remote Sensing Foundation Model by Promoting Intra-instance Similarity	-	null	link

Remote Sensing Vision-Language Foundation Models

Abbreviation	Title	Publication	Paper	Code & Weights
RSGPT	RSGPT: A Remote Sensing Vision Language Model and Benchmark	Arxiv2023	RSGPT	link
RemoteCLIP	RemoteCLIP: A Vision Language Foundation Model for Remote Sensing	Arxiv2023	RemoteCLIP	link
GRAFT	Remote Sensing Vision-Language Foundation Models without Annotations via Ground Remote Alignment	ICLR2024	GRAFT	null
-	Charting New Territories: Exploring the Geographic and Geospatial Capabilities of Multimodal LLMs	Arxiv2023	Paper	link
-	Remote Sensing ChatGPT: Solving Remote Sensing Tasks with ChatGPT and Visual Models	Arxiv2024	Paper	link
SkyEyeGPT	SkyEyeGPT: Unifying Remote Sensing Vision-Language Tasks via Instruction Tuning with Large Language Model	Arxiv2024	Paper	link
EarthGPT	EarthGPT: A Universal Multi-modal Large Language Model for Multi-sensor Image Comprehension in Remote Sensing Domain	Arxiv2024	Paper	null
SkyCLIP	SkyScript: A Large and Semantically Diverse Vision-Language Dataset for Remote Sensing	AAAI2024	SkyCLIP	link
GeoChat	GeoChat: Grounded Large Vision-Language Model for Remote Sensing	CVPR2024	GeoChat	link
LHRS-Bot	LHRS-Bot: Empowering Remote Sensing with VGI-Enhanced Large Multimodal Language Model	Arxiv2024	Paper	link
H2RSVLM	H2RSVLM: Towards Helpful and Honest Remote Sensing Large Vision Language Model	Arxiv2024	Paper	link

Remote Sensing Generative Foundation Models

Abbreviation	Title	Publication	Paper	Code & Weights
Seg2Sat	Seg2Sat - Segmentation to aerial view using pretrained diffuser models	Github	null	link
-	Generate Your Own Scotland: Satellite Image Generation Conditioned on Maps	NeurIPSW2023	Paper	link
DiffusionSat	DiffusionSat: A Generative Foundation Model for Satellite Imagery	ICLR2024	DiffusionSat	link
CRS-Diff	CRS-Diff: Controllable Generative Remote Sensing Foundation Model	Arxiv2024	Paper	null

Remote Sensing Vision-Location Foundation Models

Abbreviation	Title	Publication	Paper	Code & Weights
CSP	CSP: Self-Supervised Contrastive Spatial Pre-Training for Geospatial-Visual Representations	ICML2023	CSP	link
GeoCLIP	GeoCLIP: Clip-Inspired Alignment between Locations and Images for Effective Worldwide Geo-localization	NeurIPS2023	GeoCLIP	link
SatCLIP	SatCLIP: Global, General-Purpose Location Embeddings with Satellite Imagery	Arxiv2023	SatCLIP	link

Remote Sensing Vision-Audio Foundation Models

Abbreviation	Title	Publication	Paper	Code & Weights
-	Self-supervised audiovisual representation learning for remote sensing data	JAG2022	Paper	link

Remote Sensing Task-specific Foundation Models

Abbreviation	Title	Publication	Paper	Code & Weights	Task
SS-MAE	SS-MAE: Spatial-Spectral Masked Auto-Encoder for Multi-Source Remote Sensing Image Classification	TGRS2023	Paper	link	Image Classification
TTP	Time Travelling Pixels: Bitemporal Features Integration with Foundation Model for Remote Sensing Image Change Detection	Arxiv2023	Paper	link	Change Detection
CSMAE	Exploring Masked Autoencoders for Sensor-Agnostic Image Retrieval in Remote Sensing	Arxiv2024	Paper	link	Image Retrieval
RSPrompter	RSPrompter: Learning to Prompt for Remote Sensing Instance Segmentation based on Visual Foundation Model	TGRS2024	Paper	link	Instance Segmentation
BAN	A New Learning Paradigm for Foundation Model-based Remote Sensing Change Detection	TGRS2024	Paper	link	Change Detection
-	Change Detection Between Optical Remote Sensing Imagery and Map Data via Segment Anything Model (SAM)	Arxiv2024	Paper	null	Change Detection (Optical & OSM data)
AnyChange	Segment Any Change	Arxiv2024	Paper	null	Zero-shot Change Detection
RS-CapRet	Large Language Models for Captioning and Retrieving Remote Sensing Images	Arxiv2024	Paper	null	Image Caption & Text-image Retrieval
-	Task Specific Pretraining with Noisy Labels for Remote Sensing Image Segmentation	Arxiv2024	Paper	null	Image Segmentation (Noisy labels)
RSBuilding	RSBuilding: Towards General Remote Sensing Image Building Extraction and Change Detection with Foundation Model	Arxiv2024	Paper	link	Building Extraction and Change Detection
SAM-Road	Segment Anything Model for Road Network Graph Extraction	Arxiv2024	Paper	link	Road Extraction

Benchmarks for RSFMs

Abbreviation	Title	Publication	Paper	Link	Downstream Tasks
-	Revisiting pre-trained remote sensing model benchmarks: resizing and normalization matters	Arxiv2023	Paper	link	Classification
GEO-Bench	GEO-Bench: Toward Foundation Models for Earth Monitoring	Arxiv2023	Paper	link	Classification & Segmentation
FoMo-Bench	FoMo-Bench: a multi-modal, multi-scale and multi-task Forest Monitoring Benchmark for remote sensing foundation models	Arxiv2023	FoMo-Bench	Coming soon	Classification & Segmentation & Detection for forest monitoring
PhilEO	PhilEO Bench: Evaluating Geo-Spatial Foundation Models	Arxiv2024	Paper	link	Segmentation & Regression estimation
SkySense	SkySense: A Multi-Modal Remote Sensing Foundation Model Towards Universal Interpretation for Earth Observation Imagery	CVPR2024	SkySense	Coming soon	Classification & Segmentation & Detection & Change detection & Multi-Modal Segmentation: Time-insensitive LandCover Mapping & Multi-Modal Segmentation: Time-sensitive Crop Mapping & Multi-Modal Scene Classification
VLEO-Bench	Good at captioning, bad at counting: Benchmarking GPT-4V on Earth observation data	Arxiv2024	VLEO-bench	link	Location Recognition & Captioning & Scene Classification & Counting & Detection & Change detection

(Large-scale) Pre-training Datasets

Abbreviation	Title	Publication	Paper	Attribute	Link
fMoW	Functional Map of the World	CVPR2018	fMoW	Vision	link
SEN12MS	SEN12MS -- A Curated Dataset of Georeferenced Multi-Spectral Sentinel-1/2 Imagery for Deep Learning and Data Fusion	-	SEN12MS	Vision	link
BEN-MM	BigEarthNet-MM: A Large Scale Multi-Modal Multi-Label Benchmark Archive for Remote Sensing Image Classification and Retrieval	GRSM2021	BEN-MM	Vision	link
MillionAID	On Creating Benchmark Dataset for Aerial Image Interpretation: Reviews, Guidances, and Million-AID	JSTARS2021	MillionAID	Vision	link
SeCo	Seasonal Contrast: Unsupervised Pre-Training From Uncurated Remote Sensing Data	ICCV2021	SeCo	Vision	link
fMoW-S2	SatMAE: Pre-training Transformers for Temporal and Multi-Spectral Satellite Imagery	NeurIPS2022	fMoW-S2	Vision	link
TOV-RS-Balanced	TOV: The original vision model for optical remote sensing image understanding via self-supervised learning	JSTARS2023	TOV	Vision	link
SSL4EO-S12	SSL4EO-S12: A Large-Scale Multi-Modal, Multi-Temporal Dataset for Self-Supervised Learning in Earth Observation	GRSM2023	SSL4EO-S12	Vision	link
SSL4EO-L	SSL4EO-L: Datasets and Foundation Models for Landsat Imagery	Arxiv2023	SSL4EO-L	Vision	link
SatlasPretrain	SatlasPretrain: A Large-Scale Dataset for Remote Sensing Image Understanding	ICCV2023	SatlasPretrain	Vision (Supervised)	link
CACo	Change-Aware Sampling and Contrastive Learning for Satellite Images	CVPR2023	CACo	Vision	Coming soon
SAMRS	SAMRS: Scaling-up Remote Sensing Segmentation Dataset with Segment Anything Model	NeurIPS2023	SAMRS	Vision	link
RSVG	RSVG: Exploring Data and Models for Visual Grounding on Remote Sensing Data	TGRS2023	RSVG	Vision-Language	link
RS5M	RS5M: A Large Scale Vision-Language Dataset for Remote Sensing Vision-Language Foundation Model	Arxiv2023	RS5M	Vision-Language	link
GEO-Bench	GEO-Bench: Toward Foundation Models for Earth Monitoring	Arxiv2023	GEO-Bench	Vision (Evaluation)	link
RSICap & RSIEval	RSGPT: A Remote Sensing Vision Language Model and Benchmark	Arxiv2023	RSGPT	Vision-Language	Coming soon
Clay	Clay Foundation Model	-	null	Vision	link
SATIN	SATIN: A Multi-Task Metadataset for Classifying Satellite Imagery using Vision-Language Models	ICCVW2023	SATIN	Vision-Language	link
SkyScript	SkyScript: A Large and Semantically Diverse Vision-Language Dataset for Remote Sensing	AAAI2024	SkyScript	Vision-Language	link
ChatEarthNet	ChatEarthNet: A Global-Scale, High-Quality Image-Text Dataset for Remote Sensing	Arxiv2024	ChatEarthNet	Vision-Language	Coming soon
LuoJiaHOG	LuoJiaHOG: A Hierarchy Oriented Geo-aware Image Caption Dataset for Remote Sensing Image-Text Retrieval	Arxiv2024	LuoJiaHOG	Vision-Language	null

Survey Papers

Title	Publication	Paper	Attribute
Self-Supervised Remote Sensing Feature Learning: Learning Paradigms, Challenges, and Future Works	TGRS2023	Paper	Vision & Vision-Language
Vision-Language Models in Remote Sensing: Current Progress and Future Trends	Arxiv2023	Paper	Vision-Language
The Potential of Visual ChatGPT For Remote Sensing	Arxiv2023	Paper	Vision-Language
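Remote Sensing Large Models: Progress and Prospects (遥感大模型：进展与前瞻)	Geomatics and Information Science of Wuhan University 2023	Paper	Vision & Vision-Language
Geographic Artificial Intelligence Samples: Models, Quality, and Services (地理人工智能样本：模型、质量与服务)	Geomatics and Information Science of Wuhan University 2023	Paper	-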
Brain-Inspired Remote Sensing Foundation Models and Open Problems: A Comprehensive Survey	JSTARS2023	Paper	Vision & Vision-Language
Revisiting pre-trained remote sensing model benchmarks: resizing and normalization matters	Arxiv2023	Paper	Vision
An Agenda for Multimodal Foundation Models for Earth Observation	IGARSS2023	Paper	Vision
Transfer learning in environmental remote sensing	RSE2024	Paper	Transfer learning
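A Review of Remote Sensing Foundation Model Development and Future Prospects (遥感基础模型发展综述与未来设想)	National Remote Sensing Bulletin 2023	Paper	-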
On the Promises and Challenges of Multimodal Foundation Models for Geographical, Environmental, Agricultural, and Urban Planning Applications	Arxiv2023	Paper	Vision-Language

Cite

If you find this repository useful, please consider giving a star ⭐ and citation:

@InProceedings{guo2023skysense,
      title={SkySense: A Multi-Modal Remote Sensing Foundation Model Towards Universal Interpretation for Earth Observation Imagery}, 
      author={Xin Guo and Jiangwei Lao and Bo Dang and Yingying Zhang and Lei Yu and Lixiang Ru and Liheng Zhong and Ziyuan Huang and Kang Wu and Dingxiang Hu and Huimei He and Jian Wang and Jingdong Chen and Ming Yang and Yongjun Zhang and Yansheng Li},
      booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
      month     = {},
      year      = {2024},
      pages     = {}
}
