Batch input activations #33

Merged
jalammar merged 29 commits into main on Feb 25, 2021
Conversation

jalammar
Owner

  • Adds a documentation portal.
  • LM now has a __call__() function so MLMs like BERT can be supported without requiring text generation. Closes #18 (Add ability to run models via __call__(), and not just using generate()).
  • __call__() and the other functions now all support a batch dimension. The exception is generate(), which works on a single input sequence rather than a batch. Closes #19 (Support batched inference). A usage sketch follows this list.
  • Sets up groundwork towards BERT/MLM support (#6). BERT is now supported for activation collection and an earlier version of NMF factorization. EccoJS still needs to clean up partial-token characters like "##". Or, better yet, EccoJS should remain dumb: we hand it tokens that are already cleaned up, and it only worries about displaying them.
  • Part of the groundwork for supporting additional models is the model-config.yml file, which should lay out how to connect ecco.LM with the underlying language model.
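A minimal sketch of that flow, assuming the ecco.from_pretrained() entry point; the model name, the activations flag, and the tokenizer call are illustrative:

```python
# Sketch: run a masked LM through the new __call__ path (no generation).
import ecco

# activations=True asks ecco to collect FFNN activations during the run.
lm = ecco.from_pretrained('bert-base-uncased', activations=True)

# Tokenize a batch of inputs; __call__ performs a plain forward pass,
# so MLMs like BERT work without generate().
inputs = lm.tokenizer(["The capital of France is [MASK]."], return_tensors="pt")
output = lm(inputs)
```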

…generation.

- Activation collection and processing now supports batched inputs [still work in progress]
- Updated existing tests to support the new shape of the activations tensor with a batch dimension
- Creating a mockGPT to more properly test lm functionality [work in progress]
…n collection. No support for saliency or other features yet.

- Activation collection and processing now supports batched inputs [still work in progress]
- Creating a mockGPT to more properly test lm functionality [work in progress]
…and NMF visualization. Tested with a batch of one input.

- Activation collection and processing now supports batched inputs [still work in progress]
- Creating a mockGPT to more properly test lm functionality [work in progress]
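The NMF step this commit mentions, as a hedged sketch; run_nmf() and explore() follow ecco's documented API, but the component count is made up and `output` is the result of the __call__ sketch above:

```python
# Sketch: factorize the collected activations with NMF, then hand the
# result to EccoJS for interactive display.
nmf = output.run_nmf(n_components=8)  # the earlier-version NMF factorization
nmf.explore()                         # rendered in the browser via EccoJS
```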
# Conflicts:
#	setup.py
#	src/ecco/__init__.py
#	src/ecco/lm.py
#	tests/lm_test.py
#	tests/output_test.py
- Pytest passes locally
…els to be supported and define their key layers (embeddings for saliency, and FFNN for activations) in YAML.

- Defined a batch of initial models in model-config.yaml. The top pytorch models and two dummy models for testing purposes.
- Started writing tests in lm_test.py that act as integration tests with HF Transformers. These use tiny GPT/BERT models to ensure functionality works between ecco and the models. This automates checks that were previously done manually in Jupyter notebooks before each release.
- Switched docs from Sphinx to MkDocs.
- Wrote a couple of docs pages and set up a skeleton for navigation. Expanding on docstrings.
- Added a 'verbose' parameter to LM to suppress printing tokens during generation.
- Removed unused MockGPT code.
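A hypothetical model-config.yaml entry to illustrate the idea; the key names and layer patterns below are assumptions, not the file's confirmed schema:

```yaml
# Hypothetical entry: each supported model declares its key layers.
distilgpt2:
  embedding: 'transformer.wte'   # embeddings layer, used for saliency
  activations:
    - 'mlp\.c_proj'              # pattern for FFNN layers, used for activation collection
```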
…hed from layer name, to using a regex pattern. This way we can be more specific and not collect other layers by mistake (example: both the attention and FFNN layers of BERT contain "output.dense" layers).

- Started documenting lm as a module. It looks messy for now; will see how to clean it up later.
- Can now force LM to use the CPU even if a GPU is available, by setting the "gpu" parameter to False in ecco.from_pretrained().
- __call__ now automatically moves input tokens to the GPU if the model is on the GPU.
- In _get_activations_hook, extracting the layer number is now done more precisely using regex.
- To better support new models, AutoModel is now used on everything that doesn't have "GPT2" in its model name. Stopgap for now.
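A minimal sketch of the regex idea for BERT; the pattern and the hook wiring are illustrative, not the PR's exact code:

```python
import re

# Match only BERT's FFNN output layers ("encoder.layer.N.output.dense").
# The attention layers ("encoder.layer.N.attention.output.dense") don't
# match because ".attention." sits between the layer number and "output".
# The capture group extracts the layer number precisely.
ffnn_pattern = re.compile(r"encoder\.layer\.(\d+)\.output\.dense")

for name, module in model.named_modules():
    match = ffnn_pattern.search(name)
    if match:
        layer_number = int(match.group(1))
        # ...register a forward hook on `module` to collect its activations
```

And per the bullet above, forcing the CPU would look like lm = ecco.from_pretrained('bert-base-uncased', gpu=False).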
…as not consistent between outputs of "generate()" and "__call__".

- generate() now produces 'token_ids' and 'tokens' with the same shape as __call__: dims (batch, position). OutputSeq should likely verify the dims of its inputs.
- __call__ now returns 'token_ids' without the 'input_ids' dict key. Consumers shouldn't know about the distinction.
- Added tests for nmf pipeline for both dummy bert and GPT.
- More docs
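A hedged illustration of the now-consistent shapes; the prompt and token counts are made up, and lm.generate's generate= parameter follows ecco's existing API:

```python
# Both paths now return token_ids and tokens with dims (batch, position).
gen_output = lm.generate("The keys to the cabinet", generate=3)
call_output = lm(lm.tokenizer(["The keys to the cabinet"], return_tensors="pt"))

print(gen_output.token_ids.shape)   # e.g. (1, 8): one sequence, eight positions
print(len(call_output.tokens))      # 1  -> the batch dimension
print(len(call_output.tokens[0]))   # number of positions in the first sequence
```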
…n_ dimensions of token_ids and tokens

- Created an index for the docs
- Added simple CSS to the docs template
- Added notebook links
@jalammar jalammar merged commit 8639e3a into main Feb 25, 2021