A Survey and an Empirical Evaluation of Multi-view Clustering Approaches

Our paper is now available online: A Survey and an Empirical Evaluation of Multi-view Clustering Approaches.

This repository is a collection of state-of-the-art (SOTA) and novel work on incomplete and complete multi-view clustering (papers, code, and datasets). If you have any problems, please contact [email protected], [email protected] and [email protected]. If you find this repository useful for your research or work, we would really appreciate a star. ❤️


Contents


What's Multi-view data?

Multi-view data means that the same sample is described from different perspectives, and each perspective describes one class of features of the sample, called a view. In other words, the same sample can be represented by multiple heterogeneous features, and each feature representation corresponds to a view. Xu, Tao et al. 2013 provided several intuitive examples: a) a web document is represented by its URL and the words on the page; b) a web image is depicted by its surrounding text separately from its visual information; c) a 3D object is captured by images taken from different viewpoints; d) video clips are combinations of audio signals and visual frames; e) multilingual documents have one view in each language.

Figure 1: multi-view data
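As a rough illustration of the definition above, complete multi-view data can be held as one feature matrix per view, with row i of every view describing the same sample. The sketch below is only an assumption for illustration: the function name `describe_views` and the toy "web document" values are hypothetical and are not part of this repository's code.

```python
from typing import Dict, List, Tuple


def describe_views(views: Dict[str, List[List[float]]]) -> Tuple[int, Dict[str, int]]:
    """Return (number of samples, feature dimension per view) for complete multi-view data."""
    sample_counts = {len(rows) for rows in views.values()}
    if len(sample_counts) != 1:
        # In complete multi-view data every sample appears in every view.
        raise ValueError("every view must describe the same set of samples")
    dims = {name: len(rows[0]) for name, rows in views.items()}
    return sample_counts.pop(), dims


# Hypothetical toy data: 4 web documents, each described by two heterogeneous views.
web_documents = {
    "url_tokens": [[1, 0, 0], [1, 1, 0], [0, 0, 1], [0, 1, 1]],  # view 1: 3 features
    "page_words": [[2, 0], [3, 1], [0, 4], [1, 5]],              # view 2: 2 features
}

n_samples, dims = describe_views(web_documents)
print(n_samples, dims)
```

The views may have different dimensions (3 and 2 here), but they must agree on the number and order of samples.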

What's Incomplete Multi-view data?

Multi-view data may be complete or incomplete. Complete multi-view data means that every feature has been collected and every sample appears in every view, while incomplete multi-view data means that some samples are missing their observations in some views (i.e., missing samples) or are available with only part of their features (i.e., missing features). (Zhao, Lyu et al. 2022) gave several specific examples: in multilingual document clustering, documents are translated into different languages to form different views, but many documents have only one or two language versions because it is difficult to obtain every document in every language; in social multimedia, some samples may lack visual or audio information due to sensor failure; in health informatics, some patients may not take certain lab tests, which causes missing views or missing values; in video surveillance, some views are missing because the corresponding cameras are out of action or suffer from occlusions. (Zong, Miao et al. 2021) also considered the case of missing clusters, i.e. some clusters may be missing in some views. Figure 2 illustrates the cases of missing samples and missing clusters, where samples in the same cluster are represented by the same shape but distinguished by color, and the marker “×” denotes missing samples and missing clusters. In Figure 2 (a), clusters and instances are complete; in Figure 2 (b), clusters are complete but four samples are missing; in Figure 2 (c), two clusters and two samples are missing.

Figure 2: incomplete multi-view data
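The "missing samples" case described above is often encoded with a binary presence (indicator) matrix. The following is a minimal sketch under that assumption; the names `presence_mask` and `complete_samples`, and the toy multilingual data, are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Optional

# A view stores one feature vector per sample, or None when the sample is unobserved there.
View = List[Optional[List[float]]]


def presence_mask(views: Dict[str, View]) -> List[List[int]]:
    """mask[i][v] is 1 if sample i is observed in the v-th view, else 0."""
    names = list(views)
    n_samples = len(views[names[0]])
    return [[0 if views[name][i] is None else 1 for name in names]
            for i in range(n_samples)]


def complete_samples(mask: List[List[int]]) -> List[int]:
    """Indices of samples that are observed in every view."""
    return [i for i, row in enumerate(mask) if all(row)]


# Hypothetical toy data: 4 documents, some without an English or a French version.
documents = {
    "english": [[0.1, 0.9], [0.8, 0.2], None, [0.7, 0.3]],
    "french": [[0.2, 0.8], None, [0.9, 0.1], [0.6, 0.4]],
}

mask = presence_mask(documents)
print(mask)                    # [[1, 1], [1, 0], [0, 1], [1, 1]]
print(complete_samples(mask))  # [0, 3]
```

A conventional (complete) method could only use samples 0 and 3 here, whereas an incomplete multi-view method would also try to exploit samples 1 and 2.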

What's Multi-view clustering?

Multi-view clustering (MVC) aims to group samples (objects/instances/points) with similar structures or patterns into the same group (cluster) and samples with dissimilar ones into different groups by combining the available feature information of different views and searching for consistent clusters across different views.

Figure 3: multi-view clustering

What's Incomplete Multi-view clustering?

Conventional multi-view clustering methods commonly require that all views of the data are complete. However, this requirement often cannot be satisfied, because some views of some samples are frequently missing in real-world applications, especially in disease diagnosis and webpage clustering. Such incomplete views cause conventional multi-view methods to fail. The problem of clustering incomplete multi-view data is known as incomplete multi-view clustering (IMVC) (or partial multi-view clustering) (Hu and Chen 2019). The purpose of IMVC is to group multi-view data points with incomplete feature views into different clusters by using the data instances observed in the different views. IMVC consists of missing multi-view clustering, uncertain multi-view clustering, and incremental multi-view clustering.


Principles related to MVC

Two significant principles ensure the effectiveness of MVC: the consensus and complementary principles (Xu, Tao et al. 2013). The consistency of multi-view data means that there is some common knowledge across different views (e.g. two pictures of dogs both show contour and facial features), while the complementarity of multi-view data refers to unique knowledge contained in one view that is not available in the other views (e.g. one view shows the side of a dog and the other shows its front, so together the two views give a more complete depiction of the dog). Therefore, the consensus principle aims to maximize the agreement across multiple distinct views to improve the understanding of what the observed samples have in common, while the complementary principle states that, in a multi-view context, each view of the data may contain particular knowledge that the other views do not have, and that this knowledge can complement the other views. (Yang and Wang 2018) illustrated the complementary and consensus principles intuitively by mapping a data sample with two views into a latent data space, where part A and part C exist only in view 1 and view 2 respectively, indicating the complementarity of the two views, while part B is shared by both views, showing the consensus between them.

Figure 4: multi-view principles
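One simple way to make "agreement across views" in the consensus principle concrete is to cluster each view separately and measure how often two views agree on whether a pair of samples belongs together. The sketch below uses the Rand index for this; that choice is an assumption made for illustration and is not taken from the cited papers, and the label vectors are hypothetical.

```python
from itertools import combinations
from typing import Sequence


def rand_index(labels_a: Sequence[int], labels_b: Sequence[int]) -> float:
    """Fraction of sample pairs on which two partitions agree (together vs. apart)."""
    pairs = list(combinations(range(len(labels_a)), 2))
    agreements = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agreements / len(pairs)


# Hypothetical cluster labels obtained independently from two views of 6 samples.
# Label ids are arbitrary per view; only the grouping matters.
view1_labels = [0, 0, 1, 1, 2, 2]
view2_labels = [1, 1, 0, 0, 2, 0]

agreement = rand_index(view1_labels, view2_labels)
print(round(agreement, 3))
```

A value of 1.0 means the two views induce exactly the same grouping; the pairs on which the views disagree are where view-specific (complementary) information may be present.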

Papers

  1. A survey on multi-view clustering [paper]

  2. A survey of multi-view representation learning [paper]

  3. A survey of multi-view machine learning [paper]

  4. A Survey on Multi-view Learning [paper]

  5. Multi-view clustering: A survey [paper]

  6. Representation Learning in Multi-view Clustering: A Literature Review [paper]

  7. Survey on deep multi-modal data analytics: Collaboration, rivalry, and fusion [paper]

  8. A Comprehensive Survey on Multi-view Clustering [paper]

The information fusion strategy

The strategies for fusing information from multiple views can be divided into three categories: direct-fusion, early-fusion, and late-fusion according to the fusion stage. They are also called data level, feature level, and decision level fusion respectively, i.e. fusion in the data, fusion in the projected features, and fusion in the results. The direct-fusion approaches directly incorporate multi-view data into the clustering process through optimizing some particular loss functions.

Early fusion:

Early fusion fuses the multiple feature or graph-structure representations of multi-view data into a single representation or a consensus affinity graph across the views; any known single-view clustering algorithm (such as k-means) can then be applied to partition the data samples.
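The simplest instance of early fusion is feature concatenation followed by a single-view algorithm. The sketch below assumes that form: the helper names `concatenate_views` and `kmeans`, the toy data, and the naive "first k points" initialisation are all illustrative assumptions, not the method of any paper listed here.

```python
import math
from typing import List, Sequence


def concatenate_views(views: Sequence[List[List[float]]]) -> List[List[float]]:
    """Early fusion by concatenation: one long feature vector per sample."""
    n_samples = len(views[0])
    return [sum((view[i] for view in views), []) for i in range(n_samples)]


def kmeans(points: List[List[float]], k: int, n_iter: int = 20) -> List[int]:
    """Plain Lloyd's k-means; the first k points serve as initial centers (toy choice)."""
    centers = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: each point goes to its nearest center (Euclidean distance).
        labels = [min(range(k), key=lambda c: math.dist(p, centers[c])) for p in points]
        # Update step: each center becomes the mean of its assigned points.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Hypothetical toy data: 4 samples, two views with 2 and 1 features.
view_a = [[0.0, 0.1], [0.2, 0.0], [5.0, 5.1], [5.2, 4.9]]
view_b = [[1.0], [1.1], [9.0], [9.2]]

fused = concatenate_views([view_a, view_b])
labels = kmeans(fused, k=2)
print(labels)
```

In practice the views usually differ in scale and dimension, so each view is typically normalised (or projected) before fusion; that step is omitted here to keep the sketch short.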

Late fusion:

Late-fusion approaches first cluster the data in each view and then fuse the results of all views by consensus to obtain the final clustering. Late fusion can be further divided into integrated learning and collaborative training. The input to an integrated clustering algorithm is the set of clustering results obtained from the individual views.
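A late-fusion step can be sketched as follows, assuming the per-view clusterings are already available. The fusion rule used here, a co-association matrix followed by majority-vote linking, is one common consensus heuristic chosen for illustration; it is an assumption, not necessarily what the listed papers use, and all names and data are hypothetical.

```python
from typing import List


def co_association(partitions: List[List[int]]) -> List[List[float]]:
    """Entry (i, j) is the fraction of views whose clustering puts samples i and j together."""
    n = len(partitions[0])
    return [
        [sum(p[i] == p[j] for p in partitions) / len(partitions) for j in range(n)]
        for i in range(n)
    ]


def consensus_labels(coassoc: List[List[float]], threshold: float = 0.5) -> List[int]:
    """Link samples that co-occur in more than `threshold` of the views,
    then return the connected components as the consensus clusters."""
    n = len(coassoc)
    labels = [-1] * n
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(n):
                if labels[j] == -1 and coassoc[i][j] > threshold:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels


# Hypothetical per-view clustering results for 5 samples (label ids are arbitrary per view).
per_view = [
    [0, 0, 1, 1, 1],  # view 1
    [1, 1, 0, 0, 1],  # view 2
    [0, 0, 0, 1, 1],  # view 3
]

coassoc = co_association(per_view)
final_labels = consensus_labels(coassoc)
print(final_labels)  # [0, 0, 1, 1, 1]
```

Because only the per-view partitions are fused, this step never touches the raw features, which is what distinguishes decision-level fusion from early fusion.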

The clustering routine

One-step routine

The one-step routine integrates representation learning and the clustering task into a unified framework, which simultaneously learns a graph for each view, a partition for each view, and a consensus partition. Based on an iterative optimization strategy, high-quality consensus clustering results can be obtained directly and are used to guide the graph construction and the updating of the basic partitions, which in turn contribute to a new consensus partition. The joint optimization co-trains clustering together with representation learning, leveraging the inherent interactions between the two tasks so that they benefit each other. In the one-step routine, the cluster label of each data point is assigned directly without any post-processing, which reduces the instability in clustering performance caused by the uncertainty of post-processing operations.

Two-step routine

The two-step routine first extracts a low-dimensional representation of the multi-view data and then applies a traditional clustering approach (such as k-means) to the obtained representation. That is, the two-step routine usually needs a post-processing step, i.e. applying a simple clustering method to the learned representation or fusing the clustering results of the individual views, to produce the final clustering results.

The weighting strategy

  1. Self-weighted Multiview Clustering with Multiple Graphs [paper|code]

  2. Multiview spectral embedding [paper|code]

  3. Multi-view content-context information bottleneck for image clustering [paper|[code]

  4. Parameter-free auto-weighted multiple graph learning: a framework for multiview clustering and semi-supervised classification [paper|code]

  5. Multi-view clustering based on two-level weights (基于两级权重的多视角聚类) [paper|[code]

  6. Confidence level auto-weighting robust multi-view subspace clustering [paper|[code]

  7. Weighted multi-view clustering with feature selection [paper|[code]

  8. Two-level weighted collaborative k-means for multi-view clustering [paper|code]

  9. Weighted multi-view co-clustering (WMVCC) for sparse data [paper|[code]

  10. A cluster-weighted kernel K-means method for multi-view clustering[paper|[code]

  11. Multi-graph fusion for multi-view spectral clustering [paper|code]

  12. A doubly weighted multi-view clustering method (一种双重加权的多视角聚类方法) [paper|[code]

  13. View-Wise Versus Cluster-Wise Weight: Which Is Better for Multi-View Clustering?[paper|[code]

COMPLETE MULTI-VIEW CLUSTERING

Spectral clustering-based approaches

  1. Multi-view clustering via canonical correlation analysis [paper|[code]

  2. Multi-view kernel spectral clustering [paper|[code]

  3. Correlational spectral clustering [paper|[code]

  4. (TKDE,2020): Multi-view spectral clustering with high-order optimal neighborhood laplacian matrix. [paper|code]

  5. (KBS,2020): Multi-view spectral clustering by simultaneous consensus graph learning and discretization. [paper|code]

Co-regularization and co-training spectral clustering

  1. Co-regularized multi-view spectral clustering[paper|code]

  2. A co-training approach for multi-view spectral clustering[paper|[code]

  3. Combining labeled and unlabeled data with co-training[paper|[code]

Constrained spectral clustering

  1. Heterogeneous image feature integration via multi-modal spectral clustering[paper|[code]

  2. Robust multi-view spectral clustering via low-rank and sparse decomposition [paper|[code]

  3. Multiview clustering via adaptively weighted procrustes[paper|[code]

  4. One-step multi-view spectral clustering[paper|[code]]

  5. Multi-graph fusion for multi-view spectral clustering[paper|code]

  6. Multi-view spectral clustering with adaptive graph learning and tensor schatten p-norm[paper|[code]

  7. Multi-view spectral clustering via integrating nonnegative embedding and spectral embedding[paper|code]

  8. Multi-view spectral clustering via constrained nonnegative embedding[paper|[code]

  9. Low-rank tensor constrained co-regularized multi-view spectral clustering[paper|[code]

Fast spectral clustering

  1. Large-scale multi-view spectral clustering via bipartite graph[paper|code]

  2. Refining a k-nearest neighbor graph for a computationally efficient spectral clustering[paper|[code]

  3. Multi-view clustering based on generalized low rank approximation [paper|[code]

  4. Multi-view spectral clustering by simultaneous consensus graph learning and discretization[paper|code]

NMF-based approaches

  1. Multi-view clustering via joint nonnegative matrix factorization [paper|code]

  2. Multi-view clustering via concept factorization with local manifold regularization [paper|code]

  3. Multi-view clustering via multi-manifold regularized non-negative matrix factorization [paper|[code]]

  4. Semi-supervised multi-view clustering with graph-regularized partially shared non-negative matrix factorization [paper|code]

  5. Semi-supervised multi-view clustering based on constrained nonnegative matrix factorization [paper|[code]

  6. Semi-supervised multi-view clustering based on orthonormality-constrained nonnegative matrix factorization [paper|[code]

  7. Multi-view clustering by non-negative matrix factorization with co-orthogonal constraints [paper|code]

  8. Dual regularized multi-view non-negative matrix factorization for clustering [paper|[code]

  9. A network-based sparse and multi-manifold regularized multiple non-negative matrix factorization for multi-view clustering [paper|[code]

  10. Multi-view clustering with the cooperation of visible and hidden views [paper|[code]

Fast NMF

  1. Binary Multi-View Clustering [paper|code]

  2. Fast Multi-View Clustering via Nonnegative and Orthogonal Factorization [paper|[code]

Deep NMF

  1. Multi-View Clustering via Deep Matrix Factorization [paper|code]

  2. Multi-view clustering via deep concept factorization [paper|code]

  3. Deep Multi-View Concept Learning[paper|[code]

  4. Deep graph regularized non-negative matrix factorization for multi-view clustering[paper|code]

  5. Multi-view clustering via deep matrix factorization and partition alignment[paper|[code]

  6. Deep multiple non-negative matrix factorization for multi-view clustering[paper|[code]

Multiple kernel learning

  1. Auto-weighted multi-view clustering via kernelized graph learning[paper|code]

  2. Multiple kernel subspace clustering with local structural graph and low-rank consensus kernel learning [paper|[code]

  3. Jointly Learning Kernel Representation Tensor and Affinity Matrix for Multi-View Clustering [paper|[code]

  4. Kernelized Multi-view Subspace Clustering via Auto-weighted Graph Learning[paper|[code]

Graph learning

  1. Refining a k-nearest neighbor graph for a computationally efficient spectral clustering[paper|[code]]

  2. Robust unsupervised feature selection via dual self representation and manifold regularization [paper|[code]

  3. Robust graph learning from noisy data [paper|[code]

  4. Graph learning for multiview clustering [paper|code]

  5. Multi-view projected clustering with graph learning[paper|code]

  6. Parameter-Free Weighted Multi-View Projected Clustering with Structured Graph Learning [paper|[code]

  7. Learning robust affinity graph representation for multi-view clustering[paper|[code]

  8. GMC: Graph-based multi-view clustering[paper|code]

  9. Multiview consensus graph clustering[paper|code]

  10. A study of graph-based system for multi-view clustering [paper|code]

  11. Multi-view Clustering with Latent Low-rank Proxy Graph Learning[paper|[code]

  12. Learning latent low-rank and sparse embedding for robust image feature Extraction [paper|[code]

  13. Robust multi-view graph clustering in latent energy-preserving embedding space[paper|[code]

  14. Robust multi-view data clustering with multi-view capped-norm k-means[paper|[code]

Embedding learning

  1. Robust multi-view graph clustering in latent energy-preserving embedding space[paper|[code]

  2. COMIC: Multi-view clustering without parameter selection[paper|code]

  3. Multi-view clustering in latent embedding space[paper|code]

  4. Relaxed multi-view clustering in latent embedding space[paper|code]

  5. Auto-weighted multi-view clustering via spectral embedding[paper|[code]

  6. Robust graph-based multi-view clustering in latent embedding space[paper|[code]

  7. Efficient correntropy-based multi-view clustering with anchor graph embedding[paper|[code]

  8. Self-supervised discriminative feature learning for multi-view clustering[paper|code]

  9. Deep Multiple Auto-Encoder-Based Multi-view Clustering[paper|[code]]

  10. Joint deep multi-view learning for image clustering[paper|[code]]

  11. Deep embedded multi-view clustering with collaborative training[paper|[code]]

  12. Trio-based collaborative multi-view graph clustering with multiple constraints[paper|[code]

  13. Multi-view graph embedding clustering network: Joint self-supervision and block diagonal representation[paper|[code]

  14. Multi-view fuzzy clustering of deep random walk and sparse low-rank embedding[paper|[code]]

  15. Differentiable Bi-Sparse Multi-View Co-Clustering[paper|[code]]

Alignment learning

  1. Multi-view Clustering via Late Fusion Alignment Maximization[paper|code]

  2. End-to-end adversarial-attention network for multi-modal clustering[paper|code]

  3. Reconsidering representation alignment for multi-view clustering[paper|code]

  4. Multiview Subspace Clustering With Multilevel Representations and Adversarial Regularization[paper|code]

  5. Partially view-aligned clustering[paper|code]

Subspace learning

  1. Consistent and diverse multi-View subspace clustering with structure constraint[paper|[code]]

  2. Consistent and specific multi-view subspace clustering[paper|code]

  3. Flexible Multi-View Representation Learning for Subspace Clustering[paper|code]

  4. Learning a joint affinity graph for multiview subspace clustering[paper|[code]

  5. Exclusivity-consistency regularized multi-view subspace clustering[paper|code]

  6. Multi-view subspace clustering with intactness-aware similarity[paper|code]

  7. Diversity-induced multi-view subspace clustering[paper|code]

  8. Split multiplicative multi-view subspace clustering[paper|code]

  9. Learning a consensus affinity matrix for multi-view clustering via subspaces merging on Grassmann manifold[paper|[code]

  10. Clustering on multi-layer graphs via subspace analysis on Grassmann manifolds[paper|[code]

  11. Deep multi-view subspace clustering with unified and discriminative learning[paper|[code]

  12. Attentive multi-view deep subspace clustering net[paper|[code]

  13. Dual shared-specific multiview subspace clustering[paper|code]

  14. Multi-view subspace clustering with consistent and view-specific latent factors and coefficient matrices[paper|[code]

  15. Robust low-rank kernel multi-view subspace clustering based on the schatten p-norm and correntropy[paper|[code]]

  16. Multiple kernel low-rank representation-based robust multi-view subspace clustering[paper|[code]]

  17. One-step kernel multi-view subspace clustering[paper|[code]]

  18. Deep low-rank subspace ensemble for multi-view clustering[paper|[code]]

  19. Multi-view subspace clustering with adaptive locally consistent graph regularization[paper|[code]]

  20. Multi-view subspace clustering networks with local and global graph information[paper|[code]

  21. Deep Multimodal Subspace Clustering Networks[paper|code]

  22. Multi-view Deep Subspace Clustering Networks[paper|code|pytorch]

  23. Multiview subspace clustering via tensorial t-product representation[paper|[code]]

  24. Latent complete row space recovery for multi-view subspace clustering[paper|[code]]

  25. Fast Parameter-Free Multi-View Subspace Clustering With Consensus Anchor Guidance[paper|[code]]

  26. (Information Sciences): Multi-view subspace clustering via partition fusion. [paper|[code]]

  27. Semi-Supervised Structured Subspace Learning for Multi-View Clustering[paper|[code]

  28. Doubly weighted multi-view subspace clustering algorithm (双加权多视角子空间聚类算法)[paper|[code]]

  29. Fast Self-guided Multi-view Subspace Clustering [paper|code]

Self-paced learning

  1. Self-paced learning for latent variable models[paper|[code]

  2. Multi-view self-paced learning for clustering[paper|[code]

  3. Self-paced and auto-weighted multi-view clustering[paper|[code]]

  4. Dual self-paced multi-view clustering[paper|[code]

Co-Clustering-based approaches

  1. A generalized maximum entropy approach to bregman co-clustering and matrix approximation[paper|[code]

  2. Multi-view information-theoretic co-clustering for co-occurrence data[paper|[code]

  3. Dynamic auto-weighted multi-view co-clustering[paper|[code]

  4. Auto-weighted multi-view co-clustering with bipartite graphs[paper|[code]]

  5. Auto-weighted multi-view co-clustering via fast matrix factorization[paper|[code]]

  6. Differentiable Bi-Sparse Multi-View Co-Clustering[paper|[code]

  7. Weighted multi-view co-clustering (WMVCC) for sparse data[paper|[code]

Multi-task-based approaches

  1. Multi-task multi-view clustering for non-negative data[paper|[code]

  2. A Multi-task Multi-view based Multi-objective Clustering Algorithm[paper|[code]]

  3. Multi-task multi-view clustering[paper|[code]]

  4. Co-clustering documents and words using bipartite spectral graph partitioning[paper|[code]

  5. Self-paced multi-task multi-view capped-norm clustering[paper|[code]]

  6. Learning task-driving affinity matrix for accurate multi-view clustering through tensor subspace learning[paper|[code]]

Incomplete Multi-view clustering

Imputation-based IMVC

  1. Doubly aligned incomplete multi-view clustering[paper|[code]]

  2. Incomplete multiview spectral clustering with adaptive graph learning[paper|[code]]

  3. Late fusion incomplete multi-view clustering[paper|[code]]

  4. Consensus graph learning for incomplete multi-view clustering[paper|[code]

  5. Multi-view kernel completion[paper|[code]

  6. Unified embedding alignment with missing views inferring for incomplete multi-view clustering[paper|[code]]

  7. One-Stage Incomplete Multi-view Clustering via Late Fusion[paper|[code]]

  8. Spectral perturbation meets incomplete multi-view data[paper|[code]]

  9. Efficient and effective regularized incomplete multi-view clustering[paper|[code]]

  10. Adaptive partial graph learning and fusion for incomplete multi‐view clustering[paper|[code]

  11. Unified tensor framework for incomplete multi-view clustering and missing-view inferring[paper|[code]]

  12. Incomplete multi-view clustering with cosine similarity [paper|[code]]

  13. COMPLETER: Incomplete Multi-view Clustering via Contrastive Prediction [paper|code]

  14. Learning Disentangled View-Common and View-Peculiar Visual Representations for Multi-View Clustering [paper|code]

Transformation-based IMVC

  1. Partial multi-view clustering via consistent GAN[paper|[code]

  2. One-step multi-view subspace clustering with incomplete views[paper|[code]]

  3. Consensus guided incomplete multi-view spectral clustering[paper|[code]

  4. Incomplete multi-view subspace clustering with adaptive instance-sample mapping and deep feature fusion[paper|[code]]

  5. Dual Alignment Self-Supervised Incomplete Multi-View Subspace Clustering Network[paper|[code]]

  6. Structural Deep Incomplete Multi-view Clustering Network[paper|[code]]

  7. One-Step Graph-Based Incomplete Multi-View Clustering[paper|code]

  8. Structured anchor-inferred graph learning for universal incomplete multi-view clustering [paper|code]

The unified IMVC

  1. Complete/incomplete multi‐view subspace clustering via soft block‐diagonal‐induced regulariser[paper|[code]

  2. A novel consensus learning approach to incomplete multi-view clustering[paper|[code]]

  3. Adaptive graph completion based incomplete multi-view clustering[paper|[code]

  4. Incomplete multi-view clustering via contrastive prediction [paper|[code]

Uncertain multi-view clustering

  1. Outlier-robust multi-view clustering for uncertain data[paper|[code]

  2. Multi-view spectral clustering for uncertain objects[paper|[code]

Incremental multi-view clustering

  1. Incremental multi-view spectral clustering with sparse and connected graph learning [paper|[code]]

  2. Incremental multi-view spectral clustering [paper|[code]]

  3. Multi-graph fusion for multi-view spectral clustering[paper|[code]

  4. Incremental learning through deep adaptation[paper|[code]]

  5. Clustering-Induced Adaptive Structure Enhancing Network for Incomplete Multi-View Data [paper|[code]]

Code

  • Database :database files

    1. 3sources: example data

      1. "Desc.md" records which papers use the current dataset
  • Dataset : read data files

    1. dataset.py: load datasets

  • img : some pictures

  • utils : some code to process data

    1. process.py: process data

    2. utils.py: tool code

  • configMain.yaml : some configuration files

  • demo.py

Benchmark Datasets

We have also collected some datasets and uploaded them to Baidu Yun: address (code: f3n4). Links to some other datasets: Multi-view-Datasets

1. Text Datasets

The text datasets consist of news datasets (3Sources, BBC, BBCSport, Newsgroup), multilingual document datasets (Reuters, Reuters-21578), a citation dataset (Citeseer), the WebKB webpage datasets (Cornell, Texas, Washington and Wisconsin), an article dataset (Wikipedia), and a disease dataset (Derm).

| Dataset | #views | #classes | #instances | F-Type(#View1) | F-Type(#View2) | F-Type(#View3) | F-Type(#View4) | F-Type(#View5) | F-Type(#View6) |
|---|---|---|---|---|---|---|---|---|---|
| 3Sources | 3 | 6 | 169 | BBC(3560) | Reuters(3631) | Guardian(3068) | | | |
| BBC | 4 | 5 | 685 | seg1(4659) | seg2(4633) | seg3(4665) | seg4(4684) | | |
| BBCSport | 3 | 5 | 544/282 | seg1(3183/2582) | seg2(3203/2544) | /seg3(2465) | | | |
| Newsgroup | 3 | 5 | 500 | (2000) | (2000) | (2000) | | | |
| Reuters | 5 | 6 | 600/1200 | English(9749/2000) | French(9109/2000) | German(7774/2000) | /Italian(2000) | /Spanish(2000) | |
| Reuters-21578 | 5 | 6 | 1500 | English(21531) | French(24892) | German(34251) | Italian(15506) | Spanish(11547) | |
| Citeseer | 2 | 6 | 3312 | citations(4732) | word vector(3703) | | | | |
| Cornell | 2 | 5 | 195 | Citation(195) | Content(1703) | | | | |
| Texas | 2 | 5 | 187 | Citation(187) | Content(1398) | | | | |
| Washington | 2 | 5 | 230 | Citation(230) | Content(2000) | | | | |
| Wisconsin | 2 | 5 | 265 | Citation(265) | Content(1703) | | | | |
| Wikipedia | 2 | 10 | 693 | | | | | | |
| Derm | 2 | 6 | 366 | Clinical(11) | Histopathological(22) | | | | |

2. Image Datasets

The image datasets consist of facial image datasets (Yale, Yale-B, Extended-Yale, VIS/NIR, ORL, Notting-Hill, YouTube Faces), handwritten digit datasets (UCI, Digits, HW2sources, Handwritten, MNIST-USPS, MNIST-10000, Noisy MNIST-Rotated MNIST), object image datasets (NUS-WIDE, MSRC, MSRCv1, COIL-20, Caltech101), the Microsoft Research Asia Internet Multimedia Dataset 2.0 (MSRA-MM2.0), natural scene datasets (Scene, Scene-15, Out-Scene, Indoor), a plant species dataset (100leaves), the Animals with Attributes dataset (AWA), a multi-temporal remote sensing dataset (Forest), a fashion dataset with items such as T-shirts, dresses and coats (Fashion-10K), a sports event dataset (Event), and general image datasets (ALOI, ImageNet, Corel, Cifar-10, SUN1k, Sun397).

| Dataset | #views | #classes | #instances | F-Type(#View1) | F-Type(#View2) | F-Type(#View3) | F-Type(#View4) | F-Type(#View5) | F-Type(#View6) |
|---|---|---|---|---|---|---|---|---|---|
| Yale | 3 | 15 | 165 | Intensity(4096) | LBP(3304) | Gabor(6750) | | | |
| Yale-B | 3 | 10 | 650 | Intensity(2500) | LBP(3304) | Gabor(6750) | | | |
| Extended-Yale | 2 | 28 | 1774 | LBP(900) | COV(45) | | | | |
| VIS/NIR | 2 | 22 | 1056 | VL(10000) | NIRI(10000) | | | | |
| ORL | 3 | 40 | 400 | Intensity(4096) | LBP(3304) | Gabor(6750) | | | |
| Notting-Hill | 3 | 5 | 550 | Intensity(2000) | LBP(3304) | Gabor(6750) | | | |
| YouTube Faces | 3 | 66 | 152549 | CH(768) | GIST(1024) | HOG(1152) | | | |
| Digits | 3 | 10 | 2000 | FAC(216) | FOU(76) | KAR(64) | | | |
| HW2sources | 2 | 10 | 2000 | FOU(76) | PIX(240) | | | | |
| Handwritten | 6 | 10 | 2000 | FOU(76) | FAC(216) | KAR(64) | PIX(240) | ZER(47) | MOR(6) |
| MNIST-USPS | 2 | 10 | 5000 | MNIST(28×28) | USPS(16×16) | | | | |
| MNIST-10000 | 2 | 10 | 10000 | VGG16 FC1(4096) | Resnet50(2048) | | | | |
| MNIST-10000 | 3 | 10 | 10000 | ISO(30) | LDA(9) | NPE(30) | | | |
| Noisy MNIST-Rotated MNIST | 2 | 10 | 70000 | Noisy MNIST(28×28) | Rotated | | | | |
| NUS-WIDE-Obj | 5 | 31 | 30000 | CH(65) | CM(226) | CORR(145) | ED(74) | WT(129) | |
| NUSWIDE | 6 | 12 | 2400 | CH(64) | CC(144) | EDH(73) | WAV(128) | BCM(255) | SIFT(500) |
| MSRC | 5 | 7 | 210 | CM(48) | LBP(256) | HOG(100) | SIFT(200) | GIST(512) | |
| MSRCv1 | 5 | 7 | 210 | CM(24) | HOG(576) | GIST(512) | LBP(256) | GENT(254) | |
| COIL-20 | 3 | 20 | 1440 | Intensity(1024) | LBP(3304) | Gabor(6750) | | | |
| Caltech101-7/20/102 | 6 | 7/20/102 | 1474/2386/9144 | Gabor(48) | WM(40) | Centrist(254) | HOG(1984) | GIST(512) | LBP(928) |
| MSRA-MM2.0 | 4 | 25 | 5000 | HSV-CH(64) | CORRH(144) | EDH(75) | WT(128) | | |
| Scene | 4 | 8 | 2688 | GIST(512) | CM(432) | HOG(256) | LBP(48) | | |
| Scene-15 | 3 | 15 | 4485 | GIST(1800) | PHOG(1180) | LBP(1240) | | | |
| Out-Scene | 4 | 8 | 2688 | GIST(512) | LBP(48) | HOG(256) | CM(432) | | |
| Indoor | 6 | 5 | 621 | SURF(200) | SIFT(200) | GIST(512) | HOG(680) | WT(32) | |
| 100leaves | 3 | 100 | 1600 | TH(64) | FSM(64) | SD(64) | | | |
| Animal with attributes | 6 | 50 | 4000/30475 | CH(2688) | LSS(2000) | PHOG(252) | SIFT(2000) | RGSIFT(2000) | (2000) |
| Forest | 2 | 4 | 524 | RS(9) | GWSV(18) | | | | |
| Fashion-10K | 2 | 10 | 70000 | Test set(28×28) | sampled set(28×28) | | | | |
| Event | 6 | 8 | 1579 | SURF(500) | SIFT(500) | GIST(512) | HOG(680) | WT(32) | LBP(256) |
| ALOI | 4 | 100 | 110250 | RGB-CH(77) | HSV-CH(13) | CS(64) | Haralick(64) | | |
| ImageNet | 3 | 50 | 12000 | HSV-CH(64) | GIST(512) | SIFT(1000) | | | |
| Corel | 3 | 50 | 5000 | CH(9) | EDH(18) | WT(9) | | | |
| Cifar-10 | 3 | 10 | 60000 | CH(768) | GIST(1024) | HOG(1152) | | | |
| SUN1k | 3 | 10 | 1000 | SIFT(6300) | HOG(6300) | TH(10752) | | | |
| Sun397 | 3 | 397 | 108754 | CH(768) | GIST(1024) | HOG(1152) | | | |

3. Other datasets (text-gene, image-text, and video)

The prokaryotic species dataset (Prok) is a text-gene dataset that consists of 551 prokaryotic samples belonging to 4 classes. The species are represented by 1 textual view and 2 genomic views. The textual descriptions are summarized into a document-term matrix that records the TF-IDF re-weighted word frequencies. The genomic views are the proteome composition and the gene repertoire.
The image-text datasets consist of Wikipedia’s featured articles dataset (Wikipedia), a drosophila embryos dataset (BDGP), the NBA-NASCAR Sport dataset (NNSpt), an indoor scenes dataset (SentencesNYU v2 (RGB-D)), the Pascal dataset (VOC), an object dataset (NUS-WIDE-C5), and a photographic image dataset (MIR Flickr 1M).
The video datasets consist of a passenger actions dataset (DTHC), a pedestrian video shot dataset (Lab), a body motion sequences dataset (CMU Mobo), face video sequence datasets (YouTubeFace_sel, Honda/UCSD), and the Columbia Consumer Video dataset (CCV).

| Dataset | #views | #classes | #instances | F-Type(#View1) | F-Type(#View2) | F-Type(#View3) | F-Type(#View4) | F-Type(#View5) |
|---|---|---|---|---|---|---|---|---|
| Prokaryotic | 3 | 4 | 551 | 438 | 3 | 393 | | |
| Wikipedia | 2 | 10 | 693/2866 | image | article | | | |
| BDGP | 2 | 5 | 2500 | Visual(1750) | Textual(79) | | | |
| NNSpt | 2 | 2 | 840 | image(1024) | TF-IDF(296) | | | |
| SentencesNYU v2 (RGB-D) | 2 | 13 | 1449 | image(2048) | text(300) | | | |
| VOC | 2 | 20 | 5,649 | Image: Gist(512) | Text(399) | | | |
| NUS-WIDE-C5 | 2 | 5 | 4000 | visual codeword vector(500) | annotation vector(1000) | | | |
| MIR Flickr 1M | 4 | 10 | 2000 | HOG(300) | LBP(50) | HSV CORRH(114) | TF-IDF(60) | |
| DTHC | 3 cameras | Dispersing from the center quickly | 3 video sequences | 151 frames/video | resolution 135 × 240 | | | |
| Lab | 4 cameras | 4 people | 16 video sequences | 3915 frames/video | resolution 144 × 180 | | | |
| CMU Mobo(CAGL) | 4 videos | 24 objects | 96 video sequences | about 300 frames/video | resolution 40 × 40 | | | |
| Honda/UCSD(CAGL) | at least 2 videos/person | 20 objects | 59 video sequences | 12 to 645 frames/video | resolution 20 × 20 | | | |
| YouTubeFace_sel | 5 | 31 | 101499 | 64 | 512 | 64 | 647 | 838 |
| CCV | 3 | 20 | 6773 YouTube videos | SIFT(5000) | STIP(5000) | MFCC(4000) | | |