- The Pile: An 800GB Dataset of Diverse Text for Language Modeling - [Arxiv] [QA]
- Directed Beam Search: Plug-and-Play Lexically Constrained Language Generation - [Arxiv] [QA]
- Refine and Imitate: Reducing Repetition and Inconsistency in Persuasion Dialogues via Reinforcement Learning and Human Demonstration - [Arxiv] [QA]
- A Simple Fine-tuning Is All You Need: Towards Robust Deep Learning Via Adversarial Fine-tuning - [Arxiv] [QA]
- Evolution Is All You Need: Phylogenetic Augmentation for Contrastive Learning - [Arxiv] [QA]
- ProofWriter: Generating Implications, Proofs, and Abductive Statements over Natural Language - [Arxiv] [QA]
- Learning Dense Representations of Phrases at Scale - [Arxiv] [QA]
- Towards Overcoming False Positives in Visual Relationship Detection - [Arxiv] [QA]
- A Distributional Approach to Controlled Text Generation - [Arxiv] [QA]
- OBoW: Online Bag-of-Visual-Words Generation for Self-Supervised Learning - [Arxiv] [QA]
- Taming Transformers for High-Resolution Image Synthesis - [Arxiv] [QA]
- Transformer Interpretability Beyond Attention Visualization - [Arxiv] [QA]
- Neural Volume Rendering: NeRF And Beyond - [Arxiv] [QA]
- Keyword-Guided Neural Conversational Model - [Arxiv] [QA]
- CARE: Commonsense-Aware Emotional Response Generation with Latent Concepts - [Arxiv] [QA]
- Understanding the Behaviour of Contrastive Loss - [Arxiv] [QA]
- Image Inpainting Guided by Coherence Priors of Semantics and Textures - [Arxiv] [QA]
- Contrastive Learning with Adversarial Perturbations for Conditional Text Generation - [Arxiv] [QA]
- A Comprehensive Study of Deep Video Action Recognition - [Arxiv] [QA]
- Differential Evolution for Neural Architecture Search - [Arxiv] [QA]
- Few-Shot Segmentation Without Meta-Learning: A Good Transductive Inference Is All You Need? - [Arxiv] [QA]
- Spatially Conditioned Graphs for Detecting Human-Object Interactions - [Arxiv] [QA]
- Equivalent Causal Models - [Arxiv] [QA]
- Explainable Link Prediction for Privacy-Preserving Contact Tracing - [Arxiv] [QA]
- The Counterfactual NESS Definition of Causation - [Arxiv] [QA]
- Distilling Knowledge from Reader to Retriever for Question Answering - [Arxiv] [QA]
- Active Learning: Problem Settings and Recent Developments - [Arxiv] [QA]
- Sheaf Neural Networks - [Arxiv] [QA]
- Challenging common interpretability assumptions in feature attribution explanations - [Arxiv] [QA]
- Practical No-box Adversarial Attacks against DNNs - [Arxiv] [QA]
- RPT: Relational Pre-trained Transformer Is Almost All You Need towards Democratizing Data Preparation - [Arxiv] [QA]
- pixelNeRF: Neural Radiance Fields from One or Few Images - [Arxiv] [QA]
- Learned Initializations for Optimizing Coordinate-Based Neural Representations - [Arxiv] [QA]
- Neural Prototype Trees for Interpretable Fine-grained Image Recognition - [Arxiv] [QA]
- Just Ask: Learning to Answer Questions from Millions of Narrated Videos - [Arxiv] [QA]
- CPM: A Large-scale Generative Chinese Pre-trained Language Model - [Arxiv] [QA]
- Feature Learning in Infinite-Width Neural Networks - [Arxiv] [QA]
- How Well Do Self-Supervised Models Transfer? - [Arxiv] [QA]
- Can Temporal Information Help with Contrastive Self-Supervised Learning? - [Arxiv] [QA]
- All You Need is a Good Functional Prior for Bayesian Deep Learning - [Arxiv] [QA]
- DeRF: Decomposed Radiance Fields - [Arxiv] [QA]
- GIRAFFE: Representing Scenes as Compositional Generative Neural Feature Fields - [Arxiv] [QA]
- Hierarchically Decoupled Spatial-Temporal Contrast for Self-supervised Video Representation Learning - [Arxiv] [QA]
- ROME: Robustifying Memory-Efficient NAS via Topology Disentanglement and Gradient Accumulation - [Arxiv] [QA]
- Exploring Simple Siamese Representation Learning - [Arxiv] [QA]
- A Reputation Mechanism Is All You Need: Collaborative Fairness and Adversarial Robustness in Federated Learning - [Arxiv] [QA]
- Propagate Yourself: Exploring Pixel-Level Consistency for Unsupervised Visual Representation Learning - [Arxiv] [QA]
- MixMix: All You Need for Data-Free Compression Are Feature and Data Mixing - [Arxiv] [QA]
- Is Independent Learning All You Need in the StarCraft Multi-Agent Challenge? - [Arxiv] [QA]
- Contextual Fusion For Adversarial Robustness - [Arxiv] [QA]
- Functorial Manifold Learning - [Arxiv] [QA]
- Unsupervised Video Representation Learning by Bidirectional Feature Prediction - [Arxiv] [QA]
- Multimodal Pretraining for Dense Video Captioning - [Arxiv] [QA]
- Topological properties of basins of attraction and expressiveness of width bounded neural networks - [Arxiv] [QA]
- A Broad Dataset is All You Need for One-Shot Object Detection - [Arxiv] [QA]
- Long Range Arena: A Benchmark for Efficient Transformers - [Arxiv] [QA]
- Feature Removal Is a Unifying Principle for Model Explanation Methods - [Arxiv] [QA]
- Language Model is All You Need: Natural Language Understanding as Question Answering - [Arxiv] [QA]
- This Looks Like That, Because ... Explaining Prototypes for Interpretable Image Recognition - [Arxiv] [QA]
- Fast Biconnectivity Restoration in Multi-Robot Systems for Robust Communication Maintenance - [Arxiv] [QA]
- Non-Autoregressive Predictive Coding for Learning Speech Representations from Local Dependencies - [Arxiv] [QA]
- A Survey on Contrastive Self-supervised Learning - [Arxiv] [QA]
- HOI Analysis: Integrating and Decomposing Human-Object Interaction - [Arxiv] [QA]
- Pretext-Contrastive Learning: Toward Good Practices in Self-supervised Video Representation Leaning - [Arxiv] [QA]
- Learning to Actively Learn: A Robust Approach - [Arxiv] [QA]
- Class-incremental learning: survey and performance evaluation on image classification - [Arxiv] [QA]
- Cycle-Contrast for Self-Supervised Video Representation Learning - [Arxiv] [QA]
- How Does the Task Landscape Affect MAML Performance? - [Arxiv] [QA]
- One Solution is Not All You Need: Few-Shot Extrapolation via Structured MaxEnt RL - [Arxiv] [QA]
- RSPNet: Relative Speed Perception for Unsupervised Video Representation Learning - [Arxiv] [QA]
- Interpretation of NLP models through input marginalization - [Arxiv] [QA]
- Attention is All You Need in Speech Separation - [Arxiv] [QA]
- Model Interpretability through the Lens of Computational Complexity - [Arxiv] [QA]
- Towards falsifiable interpretability research - [Arxiv] [QA]
- The Turking Test: Can Language Models Understand Instructions? - [Arxiv] [QA]
- An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale - [Arxiv] [QA]
- Transcription Is All You Need: Learning to Separate Musical Mixtures with Score as Supervision - [Arxiv] [QA]
- MAM: Masked Acoustic Modeling for End-to-End Speech-to-Text Translation - [Arxiv] [QA]
- Distilling Dense Representations for Ranking using Tightly-Coupled Teachers - [Arxiv] [QA]
- Counterfactual Explanations and Algorithmic Recourses for Machine Learning: A Review - [Arxiv] [QA]
- CR-Walker: Tree-Structured Graph Reasoning and Dialog Acts for Conversational Recommendation - [Arxiv] [QA]
- PROP: Pre-training with Representative Words Prediction for Ad-hoc Retrieval - [Arxiv] [QA]
- Improving Dialog Systems for Negotiation with Personality Modeling - [Arxiv] [QA]
- Self-supervised Co-training for Video Representation Learning - [Arxiv] [QA]
- Solving relaxations of MAP-MRF problems: Combinatorial in-face Frank-Wolfe directions - [Arxiv] [QA]
- For self-supervised learning, Rationality implies generalization, provably - [Arxiv] [QA]
- RocketQA: An Optimized Training Approach to Dense Passage Retrieval for Open-Domain Question Answering - [Arxiv] [QA]
- What is More Likely to Happen Next? Video-and-Language Future Event Prediction - [Arxiv] [QA]
- NeRF++: Analyzing and Improving Neural Radiance Fields - [Arxiv] [QA]
- Representable Markov Categories and Comparison of Statistical Experiments in Categorical Probability - [Arxiv] [QA]
- Pretrained Transformers for Text Ranking: BERT and Beyond - [Arxiv] [QA]
- HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis - [Arxiv] [QA]
- Fairness-aware Agnostic Federated Learning - [Arxiv] [QA]
- Automated Concatenation of Embeddings for Structured Prediction - [Arxiv] [QA]
- GRF: Learning a General Radiance Field for 3D Representation and Rendering - [Arxiv] [QA]
- A Mathematical Exploration of Why Language Models Help Solve Downstream Tasks - [Arxiv] [QA]
- Automatic Backward Filtering Forward Guiding for Markov processes and graphical models - [Arxiv] [QA]
- Unsupervised Representation Learning by Invariance Propagation - [Arxiv] [QA]
- Like hiking? You probably enjoy nature: Persona-grounded Dialog with Commonsense Expansions - [Arxiv] [QA]
- Beyond [CLS] through Ranking by Generation - [Arxiv] [QA]
- A Transformer-based Framework for Multivariate Time Series Representation Learning - [Arxiv] [QA]
- Improving Efficient Neural Ranking Models with Cross-Architecture Knowledge Distillation - [Arxiv] [QA]
- MIME: MIMicking Emotions for Empathetic Response Generation - [Arxiv] [QA]
- Sharpness-Aware Minimization for Efficiently Improving Generalization - [Arxiv] [QA]
- DecAug: Augmenting HOI Detection via Decomposition - [Arxiv] [QA]
- DIRV: Dense Interaction Region Voting for End-to-End Human-Object Interaction Detection - [Arxiv] [QA]
- All You Need Is CONSTRUCT - [Arxiv] [QA]
- SparTerm: Learning Term-based Sparse Representation for Fast Text Retrieval - [Arxiv] [QA]
- Understanding Self-supervised Learning with Dual Deep Networks - [Arxiv] [QA]
- Answering Complex Open-Domain Questions with Multi-Hop Dense Retrieval - [Arxiv] [QA]
- Learning to Plan and Realize Separately for Open-Ended Dialogue Systems - [Arxiv] [QA]
- From Pixel to Patch: Synthesize Context-aware Features for Zero-shot Semantic Segmentation - [Arxiv] [QA]
- Learned Low Precision Graph Neural Networks - [Arxiv] [QA]
- Generation-Augmented Retrieval for Open-domain Question Answering - [Arxiv] [QA]
- SelfAugment: Automatic Augmentation Policies for Self-Supervised Learning - [Arxiv] [QA]
- Simplified TinyBERT: Knowledge Distillation for Document Retrieval - [Arxiv] [QA]
- BERT-QE: Contextualized Query Expansion for Document Re-ranking - [Arxiv] [QA]
- Efficient Transformers: A Survey - [Arxiv] [QA]
- Enhancing Unsupervised Video Representation Learning by Decoupling the Scene and the Motion - [Arxiv] [QA]
- Understanding the Role of Individual Units in a Deep Neural Network - [Arxiv] [QA]
- KNN-DBSCAN: a DBSCAN in high dimensions - [Arxiv] [QA]
- Generative Language Modeling for Automated Theorem Proving - [Arxiv] [QA]
- Measuring Massive Multitask Language Understanding - [Arxiv] [QA]
- Sensors, Safety Models and A System-Level Approach to Safe and Scalable Automated Vehicles - [Arxiv] [QA]
- Sample-Efficient Automated Deep Reinforcement Learning - [Arxiv] [QA]
- Learning to summarize from human feedback - [Arxiv] [QA]
- WaveGrad: Estimating Gradients for Waveform Generation - [Arxiv] [QA]
- Zero-Shot Human-Object Interaction Recognition via Affordance Graphs - [Arxiv] [QA]
- Neural Architecture Search For Keyword Spotting - [Arxiv] [QA]
- Self-supervised Video Representation Learning by Uncovering Spatio-temporal Statistics - [Arxiv] [QA]
- A Survey of Deep Active Learning - [Arxiv] [QA]
- Against Membership Inference Attack: Pruning is All You Need - [Arxiv] [QA]
- A Survey of Evaluation Metrics Used for NLG Systems - [Arxiv] [QA]
- Automated Search for Resource-Efficient Branched Multi-Task Networks - [Arxiv] [QA]
- Contrastive learning, multi-view redundancy, and linear models - [Arxiv] [QA]
- A Lip Sync Expert Is All You Need for Speech to Lip Generation In The Wild - [Arxiv] [QA]
- PARADE: Passage Representation Aggregation for Document Reranking - [Arxiv] [QA]
- Monocular Expressive Body Regression through Body-Driven Attention - [Arxiv] [QA]
- Automated Machine Learning -- a brief review at the end of the early years - [Arxiv] [QA]
- HiPPO: Recurrent Memory with Optimal Polynomial Projections - [Arxiv] [QA]
- A Survey of Active Learning for Text Classification using Deep Neural Networks - [Arxiv] [QA]
- Context-aware Feature Generation for Zero-shot Semantic Segmentation - [Arxiv] [QA]
- ConsNet: Learning Consistency Graph for Zero-Shot Human-Object Interaction Detection - [Arxiv] [QA]
- Adaptive Learning of Tensor Network Structures - [Arxiv] [QA]
- SpeedySpeech: Efficient Neural Speech Synthesis - [Arxiv] [QA]
- Spatiotemporal Contrastive Video Representation Learning - [Arxiv] [QA]
- A Boundary Based Out-of-Distribution Classifier for Generalized Zero-Shot Learning - [Arxiv] [QA]
- Polysemy Deciphering Network for Robust Human-Object Interaction Detection - [Arxiv] [QA]
- Self-supervised Video Representation Learning Using Inter-intra Contrastive Framework - [Arxiv] [QA]
- Pose-based Modular Network for Human-Object Interaction Detection - [Arxiv] [QA]
- Predicting What You Already Know Helps: Provable Self-Supervised Learning - [Arxiv] [QA]
- Explainable Face Recognition - [Arxiv] [QA]
- Demystifying Contrastive Self-Supervised Learning: Invariances, Augmentations and Dataset Biases - [Arxiv] [QA]
- Self-supervised Learning for Large-scale Item Recommendations - [Arxiv] [QA]
- Visual Compositional Learning for Human-Object Interaction Detection - [Arxiv] [QA]
- Self-Supervised Learning Across Domains - [Arxiv] [QA]
- Understanding BERT Rankers Under Distillation - [Arxiv] [QA]
- Video Representation Learning by Recognizing Temporal Transformations - [Arxiv] [QA]
- Learning Joint Spatial-Temporal Transformations for Video Inpainting - [Arxiv] [QA]
- Mixture Representation Learning with Coupled Autoencoders - [Arxiv] [QA]
- Leveraging Seen and Unseen Semantic Relationships for Generative Zero-Shot Learning - [Arxiv] [QA]
- Towards Deeper Graph Neural Networks - [Arxiv] [QA]
- DVI: Depth Guided Video Inpainting for Autonomous Driving - [Arxiv] [QA]
- Detecting Human-Object Interactions with Action Co-occurrence Priors - [Arxiv] [QA]
- Hopfield Networks is All You Need - [Arxiv] [QA]
- Natural Graph Networks - [Arxiv] [QA]
- Few-shot Scene-adaptive Anomaly Detection - [Arxiv] [QA]
- Rethinking Image Inpainting via a Mutual Encoder-Decoder with Feature Equalizations - [Arxiv] [QA]
- A Graph-based Interactive Reasoning for Human-Object Interaction Detection - [Arxiv] [QA]
- TERA: Self-Supervised Learning of Transformer Encoder Representation for Speech - [Arxiv] [QA]
- Accuracy Prediction with Non-neural Model for Neural Architecture Search - [Arxiv] [QA]
- GOLD-NAS: Gradual, One-Level, Differentiable - [Arxiv] [QA]
- GRAF: Generative Radiance Fields for 3D-Aware Image Synthesis - [Arxiv] [QA]
- Confidence-Aware Learning for Deep Neural Networks - [Arxiv] [QA]
- The Fyodorov-Hiary-Keating Conjecture. I - [Arxiv] [QA]
- Approximate Nearest Neighbor Negative Contrastive Learning for Dense Text Retrieval - [Arxiv] [QA]
- Interactive Path Reasoning on Graph for Conversational Recommendation - [Arxiv] [QA]
- Data Movement Is All You Need: A Case Study on Optimizing Transformers - [Arxiv] [QA]
- ERNIE-ViL: Knowledge Enhanced Vision-Language Representations Through Scene Graph - [Arxiv] [QA]
- PLATO-2: Towards Building an Open-Domain Chatbot via Curriculum Learning - [Arxiv] [QA]
- Lipschitzness Is All You Need To Tame Off-policy Generative Adversarial Imitation Learning - [Arxiv] [QA]
- RepBERT: Contextualized Text Embeddings for First-Stage Retrieval - [Arxiv] [QA]
- Video Representation Learning with Visual Tempo Consistency - [Arxiv] [QA]
- GPT-GNN: Generative Pre-Training of Graph Neural Networks - [Arxiv] [QA]
- Space-Time Correspondence as a Contrastive Random Walk - [Arxiv] [QA]
- Practical applications of metric space magnitude and weighting vectors - [Arxiv] [QA]
- Generative causal explanations of black-box classifiers - [Arxiv] [QA]
- Gaining Insight into SARS-CoV-2 Infection and COVID-19 Severity Using Self-supervised Edge Features and Graph Neural Networks - [Arxiv] [QA]
- A Constructive, Type-Theoretic Approach to Regression via Global Optimisation - [Arxiv] [QA]
- Unsupervised Evaluation of Interactive Dialog with DialoGPT - [Arxiv] [QA]
- Efficient Hyperparameter Optimization in Deep Learning Using a Variable Length Genetic Algorithm - [Arxiv] [QA]
- Logarithmic Pruning is All You Need - [Arxiv] [QA]
- Towards Understanding Label Smoothing - [Arxiv] [QA]
- wav2vec 2.0: A Framework for Self-Supervised Learning of Speech Representations - [Arxiv] [QA]
- Self-Supervised Prototypical Transfer Learning for Few-Shot Classification - [Arxiv] [QA]
- Denoising Diffusion Probabilistic Models - [Arxiv] [QA]
- Neural Parameter Allocation Search - [Arxiv] [QA]
- Contrastive learning of global and local features for medical image segmentation with limited annotations - [Arxiv] [QA]
- Stochastic Bandits with Linear Constraints - [Arxiv] [QA]
- Self-supervised Learning on Graphs: Deep Insights and New Direction - [Arxiv] [QA]
- Big Self-Supervised Models are Strong Semi-Supervised Learners - [Arxiv] [QA]
- GCC: Graph Contrastive Coding for Graph Neural Network Pre-Training - [Arxiv] [QA]
- Unsupervised Learning of Visual Features by Contrasting Cluster Assignments - [Arxiv] [QA]
- Cross-lingual Retrieval for Iterative Self-Supervised Training - [Arxiv] [QA]
- When Does Self-Supervision Help Graph Convolutional Networks? - [Arxiv] [QA]
- Augmented Sliced Wasserstein Distances - [Arxiv] [QA]
- Self-supervised Learning: Generative or Contrastive - [Arxiv] [QA]
- DeeperGCN: All You Need to Train Deeper GCNs - [Arxiv] [QA]
- IsarStep: a Benchmark for High-level Mathematical Reasoning - [Arxiv] [QA]
- Interpretable Neural Architecture Search via Bayesian Optimisation with Weisfeiler-Lehman Kernels - [Arxiv] [QA]
- Rethinking the Value of Labels for Improving Class-Imbalanced Learning - [Arxiv] [QA]
- Self-Supervised Relational Reasoning for Representation Learning - [Arxiv] [QA]
- Diagnosing Rarity in Human-Object Interaction Detection - [Arxiv] [QA]
- Contrastive Multi-View Representation Learning on Graphs - [Arxiv] [QA]
- Self-supervised Learning from a Multi-view Perspective - [Arxiv] [QA]
- FastSpeech 2: Fast and High-Quality End-to-End Text to Speech - [Arxiv] [QA]
- Differentiable Neural Input Search for Recommender Systems - [Arxiv] [QA]
- CoCon: A Self-Supervised Approach for Controlled Text Generation - [Arxiv] [QA]
- M3P: Learning Universal Representations via Multitask Multilingual Multimodal Pre-training - [Arxiv] [QA]
- Situated and Interactive Multimodal Conversations - [Arxiv] [QA]
- Bayesian Updates Compose Optically - [Arxiv] [QA]
- Explainable Artificial Intelligence: a Systematic Review - [Arxiv] [QA]
- Language Models are Few-Shot Learners - [Arxiv] [QA]
- SCAN: Learning to Classify Images without Labels - [Arxiv] [QA]
- High-Resolution Image Inpainting with Iterative Confidence Feedback and Guided Upsampling - [Arxiv] [QA]
- Novel Human-Object Interaction Detection via Adversarial Domain Generalization - [Arxiv] [QA]
- Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks - [Arxiv] [QA]
- Glow-TTS: A Generative Flow for Text-to-Speech via Monotonic Alignment Search - [Arxiv] [QA]
- Novel Policy Seeking with Constrained Optimization - [Arxiv] [QA]
- Is MAP Decoding All You Need? The Inadequacy of the Mode in Neural Machine Translation - [Arxiv] [QA]
- Mirror Descent Policy Optimization - [Arxiv] [QA]
- Normalized Attention Without Probability Cage - [Arxiv] [QA]
- Vector-Quantized Autoregressive Predictive Coding - [Arxiv] [QA]
- Semantic Photo Manipulation with a Generative Image Prior - [Arxiv] [QA]
- Is Your Goal-Oriented Dialog Model Performing Really Well? Empirical Analysis of System-wise Evaluation - [Arxiv] [QA]
- Multi-band MelGAN: Faster Waveform Generation for High-Quality Text-to-Speech - [Arxiv] [QA]
- Local Self-Attention over Long Text for Efficient Document Retrieval - [Arxiv] [QA]
- Categorical Stochastic Processes and Likelihood - [Arxiv] [QA]
- Condensed Movies: Story Based Retrieval with Contextual Embeddings - [Arxiv] [QA]
- DramaQA: Character-Centered Video Story Understanding with Hierarchical QA - [Arxiv] [QA]
- The Cascade Transformer: an Application for Efficient Answer Sentence Selection - [Arxiv] [QA]
- Evaluating Explainable AI: Which Algorithmic Explanations Help Users Predict Model Behavior? - [Arxiv] [QA]
- Does Visual Self-Supervision Improve Learning of Speech Representations for Emotion Recognition? - [Arxiv] [QA]
- Learning an Unreferenced Metric for Online Dialogue Evaluation - [Arxiv] [QA]
- POINTER: Constrained Progressive Text Generation via Insertion-based Generative Pre-training - [Arxiv] [QA]
- HERO: Hierarchical Encoder for Video+Language Omni-representation Pre-training - [Arxiv] [QA]
- Sparse, Dense, and Attentional Representations for Text Retrieval - [Arxiv] [QA]
- Consistent Video Depth Estimation - [Arxiv] [QA]
- Training Curricula for Open Domain Answer Re-Ranking - [Arxiv] [QA]
- Efficient Document Re-Ranking for Transformers by Precomputing Term Representations - [Arxiv] [QA]
- Pre-training Is (Almost) All You Need: An Application to Commonsense Reasoning - [Arxiv] [QA]
- Complementing Lexical Retrieval with Semantic Residual Embedding - [Arxiv] [QA]
- Image Augmentation Is All You Need: Regularizing Deep Reinforcement Learning from Pixels - [Arxiv] [QA]
- Recipes for building an open-domain chatbot - [Arxiv] [QA]
- Modularized Transfomer-based Ranking Framework - [Arxiv] [QA]
- ColBERT: Efficient and Effective Passage Search via Contextualized Late Interaction over BERT - [Arxiv] [QA]
- All you need is a second look: Towards Tighter Arbitrary shape text detection - [Arxiv] [QA]
- Multi-Domain Dialogue Acts and Response Co-Generation - [Arxiv] [QA]
- Beyond 512 Tokens: Siamese Multi-depth Transformer-based Hierarchical Encoder for Long-Form Document Matching - [Arxiv] [QA]
- A survey on domain adaptation theory: learning bounds and theoretical guarantees - [Arxiv] [QA]
- Learning Term Discrimination - [Arxiv] [QA]
- Supervised Contrastive Learning - [Arxiv] [QA]
- Federated Stochastic Gradient Langevin Dynamics - [Arxiv] [QA]
- Distilling Knowledge for Fast Retrieval-based Chat-bots - [Arxiv] [QA]
- Considering Likelihood in NLP Classification Explanations with Occlusion and Language Modeling - [Arxiv] [QA]
- Detailed 2D-3D Joint Representation for Human-Object Interaction - [Arxiv] [QA]
- Oscar: Object-Semantics Aligned Pre-training for Vision-Language Tasks - [Arxiv] [QA]
- Will I Sound Like Me? Improving Persona Consistency in Dialogues through Pragmatic Self-Consciousness - [Arxiv] [QA]
- Spatially-Attentive Patch-Hierarchical Network for Adaptive Motion Deblurring - [Arxiv] [QA]
- Dense Passage Retrieval for Open-Domain Question Answering - [Arxiv] [QA]
- TextGAIL: Generative Adversarial Imitation Learning for Text Generation - [Arxiv] [QA]
- There and Back Again: Revisiting Backpropagation Saliency Methods - [Arxiv] [QA]
- PaStaNet: Toward Human Activity Knowledge Engine - [Arxiv] [QA]
- A Survey on Conversational Recommender Systems - [Arxiv] [QA]
- How Useful is Self-Supervised Pretraining for Visual Tasks? - [Arxiv] [QA]
- Learning Human-Object Interaction Detection using Interaction Points - [Arxiv] [QA]
- InterBERT: Vision-and-Language Interaction for Multi-modal Pretraining - [Arxiv] [QA]
- VIOLIN: A Large-Scale Dataset for Video-and-Language Inference - [Arxiv] [QA]
- Rethinking Few-Shot Image Classification: a Good Embedding Is All You Need? - [Arxiv] [QA]
- Deformable Style Transfer - [Arxiv] [QA]
- Distributional Reinforcement Learning with Ensembles - [Arxiv] [QA]
- Model-based Asynchronous Hyperparameter and Neural Architecture Search - [Arxiv] [QA]
- Pre-trained Models for Natural Language Processing: A Survey - [Arxiv] [QA]
- Latent Embedding Feedback and Discriminative Features for Zero-Shot Classification - [Arxiv] [QA]
- XPersona: Evaluating Multilingual Personalized Chatbot - [Arxiv] [QA]
- Guidance and Evaluation: Semantic-Aware Image Inpainting for Mixed Scenes - [Arxiv] [QA]
- VCNet: A Robust Approach to Blind Image Inpainting - [Arxiv] [QA]
- Document Ranking with a Pretrained Sequence-to-Sequence Model - [Arxiv] [QA]
- VSGNet: Spatial Attention Network for Detecting Human Object Interactions Using Graph Convolutions - [Arxiv] [QA]
- Building and Interpreting Deep Similarity Models - [Arxiv] [QA]
- xCos: An Explainable Cosine Metric for Face Verification Task - [Arxiv] [QA]
- Video2Commonsense: Generating Commonsense Descriptions to Enrich Video Captioning - [Arxiv] [QA]
- ReZero is All You Need: Fast Convergence at Large Depth - [Arxiv] [QA]
- Improved Baselines with Momentum Contrastive Learning - [Arxiv] [QA]
- How to Train Your Super-Net: An Analysis of Training Heuristics in Weight-Sharing NAS - [Arxiv] [QA]
- Cascaded Human-Object Interaction Recognition - [Arxiv] [QA]
- A Safety Framework for Critical Systems Utilising Deep Neural Networks - [Arxiv] [QA]
- De Finetti's construction as a categorical limit - [Arxiv] [QA]
- AlignTTS: Efficient Feed-Forward Text-to-Speech System without Explicit Alignment - [Arxiv] [QA]
- XGPT: Cross-modal Generative Pre-Training for Image Captioning - [Arxiv] [QA]
- Benchmarking Graph Neural Networks - [Arxiv] [QA]
- DC-BERT: Decoupling Question and Document for Efficient Contextual Encoding - [Arxiv] [QA]
- Estimation-Action-Reflection: Towards Deep Interaction Between Conversational and Recommender Systems - [Arxiv] [QA]
- Automatic Shortcut Removal for Self-Supervised Representation Learning - [Arxiv] [QA]
- Disentangled Speech Embeddings using Cross-modal Self-supervision - [Arxiv] [QA]
- Gradient Boosting Neural Networks: GrowNet - [Arxiv] [QA]
- Information Condensing Active Learning - [Arxiv] [QA]
- UniVL: A Unified Video and Language Pre-Training Model for Multimodal Understanding and Generation - [Arxiv] [QA]
- A Simple Framework for Contrastive Learning of Visual Representations - [Arxiv] [QA]
- REALM: Retrieval-Augmented Language Model Pre-Training - [Arxiv] [QA]
- Pre-training Tasks for Embedding-based Large-scale Retrieval - [Arxiv] [QA]
- Unsupervised pretraining transfers well across languages - [Arxiv] [QA]
- Dynamic Knowledge Routing Network For Target-Guided Open-Domain Conversation - [Arxiv] [QA]
- Proving the Lottery Ticket Hypothesis: Pruning is All You Need - [Arxiv] [QA]
- Learning Robust and Multilingual Speech Representations - [Arxiv] [QA]
- Selective Weak Supervision for Neural Information Retrieval - [Arxiv] [QA]
- Multi-task self-supervised learning for Robust Speech Recognition - [Arxiv] [QA]
- TVR: A Large-Scale Dataset for Video-Subtitle Moment Retrieval - [Arxiv] [QA]
- Scaling Laws for Neural Language Models - [Arxiv] [QA]
- Safety Concerns and Mitigation Approaches Regarding the Use of Deep Learning in Safety-Critical Perception Tasks - [Arxiv] [QA]
- Discriminator Soft Actor Critic without Extrinsic Rewards - [Arxiv] [QA]
- Latency-Aware Differentiable Neural Architecture Search - [Arxiv] [QA]
- MixPath: A Unified Approach for One-shot Neural Architecture Search - [Arxiv] [QA]
- A Categorical Framework for Learning Generalised Tree Automata - [Arxiv] [QA]
- Classifying All Interacting Pairs in a Single Shot - [Arxiv] [QA]
- Visually Guided Self Supervised Learning of Speech Representations - [Arxiv] [QA]
- ProphetNet: Predicting Future N-gram for Sequence-to-Sequence Pre-training - [Arxiv] [QA]
- Visual-Semantic Graph Attention Networks for Human-Object Interaction Detection - [Arxiv] [QA]
- Correctness of Automatic Differentiation via Diffeologies and Categorical Gluing - [Arxiv] [QA]
- Deeper Insights into Weight Sharing in Neural Architecture Search - [Arxiv] [QA]