Awesome hypernetworks

A curated list of awesome hypernetwork resources, inspired by awesome-computer-vision and awesome implicit representations.

Introduction

Hypernetworks have become very common in deep learning and appear, in one form or another, in thousands of papers. This list therefore does not try to be exhaustive; instead, it collects resources that are representative of the most interesting concepts around hypernetworks. It is also biased towards my own papers.

Please get in touch if you think I missed important references.

What are Hypernetworks?

Hypernetworks are simply neural networks that produce and/or adapt the parameters of another parametrized model. Unsurprisingly, the idea dates back at least to the early 1990s and Schmidhuber's work on meta-learning and self-referential networks. Hypernetworks have since been applied in a very wide range of deep learning contexts and applications, which I try to cover below.
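As a rough illustration (not taken from any particular paper below), here is a minimal PyTorch sketch of the idea: a small MLP acts as the hypernetwork and maps a conditioning embedding to the weight matrix and bias of a target linear layer. All names and sizes are made up.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class HyperNetwork(nn.Module):
    def __init__(self, embed_dim, target_in, target_out):
        super().__init__()
        self.target_in, self.target_out = target_in, target_out
        # The hypernetwork itself is an ordinary MLP; its output is reshaped
        # into the weight matrix and bias of the target layer.
        self.net = nn.Sequential(
            nn.Linear(embed_dim, 64),
            nn.ReLU(),
            nn.Linear(64, target_out * target_in + target_out),
        )

    def forward(self, embedding):
        params = self.net(embedding)
        n_w = self.target_out * self.target_in
        weight = params[:n_w].view(self.target_out, self.target_in)
        bias = params[n_w:]
        return weight, bias

hyper = HyperNetwork(embed_dim=8, target_in=16, target_out=4)
embedding = torch.randn(8)          # e.g. a task or layer embedding
weight, bias = hyper(embedding)     # generated target-network parameters
x = torch.randn(32, 16)
y = F.linear(x, weight, bias)       # the target layer uses the generated weights
```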

Papers

Adaptive Layers

The core idea of adaptive layers is to make the parameters of a certain layer of the neural network adapt to the computation that preceded that layer. In contrast, the computational node of a normal parameter usually has no parent, and the node's value is static during the forward computation.
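To make this concrete, here is a hypothetical PyTorch sketch of an input-conditioned linear layer whose weights are generated on the fly from the activations of the preceding computation; module names and dimensions are purely illustrative.

```python
import torch
import torch.nn as nn

class AdaptiveLinear(nn.Module):
    def __init__(self, in_features, out_features):
        super().__init__()
        self.in_features, self.out_features = in_features, out_features
        # Generate a weight matrix and bias per input example.
        self.weight_gen = nn.Linear(in_features, out_features * in_features)
        self.bias_gen = nn.Linear(in_features, out_features)

    def forward(self, h):
        # h: (batch, in_features) activations from the preceding layer
        W = self.weight_gen(h).view(-1, self.out_features, self.in_features)
        b = self.bias_gen(h)
        return torch.bmm(W, h.unsqueeze(-1)).squeeze(-1) + b

layer = AdaptiveLinear(16, 8)
h = torch.relu(torch.randn(4, 16))   # output of some earlier computation
out = layer(h)                       # weights were produced on the fly from h
```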

Fast weights and work on RNNs.

Work on CNNs.

Work on generative models. The following two papers simply condition the generator of a GAN on side information. There is probably more interesting work; please contact me if you know of any. I also list my paper "Continual learning with hypernetworks" here because we use a hypernetwork, among other things, to generate the weights of a decoder in a variational autoencoder.

An overview of multiplicative interactions and hypernetworks

Self-attention

Self-attention is a form of adaptive layer. I will not cover the transformer literature here, but I do want to mention this paper by Schlag, Irie and Schmidhuber that discusses the equivalence to fast weights:
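As a rough sketch of the fast-weight view (assuming an unnormalized, linearized form of attention with an illustrative feature map, not the exact formulation of the paper): keys and values "program" a fast weight matrix, and queries read it out.

```python
import torch
import torch.nn.functional as F

def fast_weight_attention(q, k, v):
    # q, k, v: (seq_len, dim); phi is a simple positive feature map (assumption).
    phi = lambda x: F.elu(x) + 1.0
    d = q.size(-1)
    W = torch.zeros(d, d)                          # fast weight matrix, starts empty
    outputs = []
    for t in range(q.size(0)):
        W = W + torch.outer(v[t], phi(k[t]))       # "write": Hebbian-style update
        outputs.append(W @ phi(q[t]))              # "read": query the fast weights
    return torch.stack(outputs)

q = k = v = torch.randn(5, 8)
y = fast_weight_attention(q, k, v)                 # unnormalized linear attention
```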

Architecture search and Hypernetworks used in Neuroevolution

There have been some very nice ideas that use hypernetworks in architecture search. This list is probably far from complete.

Implicit Neural Representations

Implicit Neural Representations are continuous functions, usually neural networks, that simply map a domain (e.g. spatial coordinates) to signal values. Interestingly, hypernetworks are used intensively in this framework.
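As an illustration of how the two ideas combine, here is a hedged sketch in which a hypernetwork maps a per-signal latent code to the weights of a small coordinate MLP; the architecture, activation and sizes are invented for the example.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class INRHyperNetwork(nn.Module):
    def __init__(self, latent_dim, hidden=32):
        super().__init__()
        self.hidden = hidden
        # Produce weights for a two-layer coordinate MLP: (x, y) -> rgb.
        n_params = (2 * hidden + hidden) + (hidden * 3 + 3)
        self.net = nn.Sequential(nn.Linear(latent_dim, 128), nn.ReLU(),
                                 nn.Linear(128, n_params))

    def forward(self, z, coords):
        p = self.net(z)
        h = self.hidden
        w1, p = p[:2 * h].view(h, 2), p[2 * h:]
        b1, p = p[:h], p[h:]
        w2, b2 = p[:h * 3].view(3, h), p[h * 3:]
        feat = torch.sin(F.linear(coords, w1, b1))   # sine activation, SIREN-style
        return F.linear(feat, w2, b2)

hyper = INRHyperNetwork(latent_dim=16)
z = torch.randn(16)                       # latent code describing one signal
coords = torch.rand(1024, 2)              # query coordinates in [0, 1]^2
rgb = hyper(z, coords)                    # predicted signal values
```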

Meta- and Continual Learning

Many algorithms tackle meta- and continual learning with the help of hypernetworks. Naturally, one can view these problems as acting on different time scales and formulate them as bilevel optimization problems, or related formulations, where hypernetworks can work well.
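As a simplified, hypothetical sketch of one such mechanism (loosely inspired by the output regularizer in "Continual learning with hypernetworks"; names, sizes and details are made up): a hypernetwork maps learned task embeddings to target-network weights, and a penalty keeps the weights it generates for old task embeddings close to stored targets.

```python
import torch
import torch.nn as nn

def hypernet_cl_regularizer(hnet, old_embeddings, old_outputs, beta=0.01):
    # Penalize drift of the weights generated for previously learned tasks.
    reg = 0.0
    for e, w_star in zip(old_embeddings, old_outputs):
        reg = reg + ((hnet(e) - w_star) ** 2).sum()
    return beta * reg

# hnet maps a learned task embedding to a flat vector of target-network weights.
hnet = nn.Sequential(nn.Linear(8, 64), nn.ReLU(), nn.Linear(64, 100))
old_embeddings = [torch.randn(8)]                          # embedding of task 1
old_outputs = [hnet(e).detach() for e in old_embeddings]   # weights before task 2
reg_loss = hypernet_cl_regularizer(hnet, old_embeddings, old_outputs)
```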

Reinforcement learning

I have not seen many papers so far that use hypernetworks to tackle RL problems explicitly. Please contact me if you know of any.

Modeling distributions

The following papers use hypernetworks to model a distribution over the weights of the target network. For example, one can use a hypernetwork to transform a simple normal distribution into a potentially complex weight distribution that captures the epistemic uncertainty of the model.
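For illustration, here is a minimal sketch of this noise-to-weights idea, with invented names and sizes: the hypernetwork transforms Gaussian noise into weight samples, and the spread of predictions over several samples gives a crude picture of epistemic uncertainty.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class WeightDistribution(nn.Module):
    def __init__(self, noise_dim, target_in, target_out):
        super().__init__()
        self.noise_dim = noise_dim
        self.target_in, self.target_out = target_in, target_out
        self.net = nn.Sequential(nn.Linear(noise_dim, 64), nn.ReLU(),
                                 nn.Linear(64, target_out * target_in + target_out))

    def sample_weights(self):
        z = torch.randn(self.noise_dim)                 # simple base distribution
        p = self.net(z)                                 # transformed into weights
        n_w = self.target_out * self.target_in
        return p[:n_w].view(self.target_out, self.target_in), p[n_w:]

dist = WeightDistribution(noise_dim=8, target_in=16, target_out=2)
x = torch.randn(5, 16)
preds = torch.stack([F.linear(x, *dist.sample_weights()) for _ in range(10)])
mean, std = preds.mean(0), preds.std(0)   # spread reflects weight uncertainty
```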

Others

Hypernetwork papers that do not fall into the categories above.

Links to Code

The following links point to implementations of different hypernetworks in PyTorch.

Talks

License

License: MIT
