Batch input activations #33

Merged 29 commits on Feb 25, 2021

Commits on Jan 4, 2021

  1. - Created a __call__ method in lm to allow running the model without generation.

    - Activation collection and processing now support batched inputs [still work in progress]
    - Updated existing tests to support the new shape of the activations tensor with a batch dimension
    - Creating a mockGPT to test lm functionality more thoroughly [work in progress]
    jalammar committed Jan 4, 2021
    65c6516

Commits on Jan 19, 2021

  1. - Initial BERT/DistilBERT support now works in __call__ for activation collection. No support for saliency or other features yet.

    - Activation collection and processing now support batched inputs [still work in progress]
    - Creating a mockGPT to test lm functionality more thoroughly [work in progress]
    jalammar committed Jan 19, 2021
    43a9450

Commits on Jan 20, 2021

  1. - BERT/DistilBERT now work for activation collection, NMF reduction, and NMF visualization. Tested with a batch of one input.

    - Activation collection and processing now support batched inputs [still work in progress]
    - Creating a mockGPT to test lm functionality more thoroughly [work in progress]
    jalammar committed Jan 20, 2021
    02d20d6

Commits on Feb 11, 2021

  1. - BERT/DistilBERT now work for activation collection, NMF reduction, and NMF visualization. Tested with a batch of one input.

    - Activation collection and processing now support batched inputs [still work in progress]
    - Creating a mockGPT to test lm functionality more thoroughly [work in progress]
    jalammar committed Feb 11, 2021
    fb47b7b
  2. Merge remote-tracking branch 'origin/main' into batch-input-activations

    # Conflicts:
    #	setup.py
    #	src/ecco/__init__.py
    #	src/ecco/lm.py
    #	tests/lm_test.py
    #	tests/output_test.py
    jalammar committed Feb 11, 2021
    583e0af
  3. - Merged main v0.0.13 changes

    - Pytest passes locally
    jalammar committed Feb 11, 2021
    eb88b7d

Commits on Feb 13, 2021

  1. - Carved out model configurations into model-config, so that more models can be supported and their key layers (embeddings for saliency, FFNN for activations) defined in YAML.

    - Defined a batch of initial models in model-config.yaml: the top PyTorch models and two dummy models for testing purposes.
    - Started writing tests in lm_test.py that act as integration tests with HF Transformers. These use tiny GPT/BERT models to ensure functionality works between ecco and the models. This automates tests that were previously run manually in Jupyter notebooks before release.
    - Switched docs from Sphinx to MkDocs
    - Wrote a couple of docs pages, set up a skeleton for navigation. Expanding on docstrings
    - Added a 'verbose' parameter to LM to suppress printing tokens during generation.
    - Removed unused MockGPT code.
    jalammar committed Feb 13, 2021
    0bace9d
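
As a rough illustration of the idea in 0bace9d (per-model key layers kept in configuration rather than in code), here is a minimal sketch. It uses a plain Python dict to stay dependency-free, whereas the commit stores this in model-config.yaml. The key names (`embedding`, `activations`), the layer names, and the `get_model_config` helper are all assumptions for illustration; the actual schema of model-config.yaml is not shown in this PR log.

```python
# Hypothetical stand-in for the contents of model-config.yaml.
# Key names and layer names are illustrative assumptions, not the real schema.
MODEL_CONFIG = {
    "gpt2": {
        "embedding": "transformer.wte.weight",   # layer used for saliency
        "activations": [r"mlp\.c_proj"],         # FFNN layer pattern(s)
    },
    "bert-base-uncased": {
        "embedding": "embeddings.word_embeddings.weight",
        "activations": [r"layer\.\d+\.output\.dense"],
    },
}


def get_model_config(model_name):
    """Look up the key layers for a model, failing loudly for unknown models."""
    try:
        return MODEL_CONFIG[model_name]
    except KeyError:
        raise ValueError(
            f"No configuration for model {model_name!r}; add an entry first."
        ) from None


config = get_model_config("gpt2")
print(config["embedding"], config["activations"])
```

Keeping this mapping in data means that supporting a new model is, in principle, a matter of adding an entry rather than editing the collection code.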

Commits on Feb 14, 2021

  1. - When specifying the layer name to collect activations for, we switched from a layer name to a regex pattern. This way we can be more specific and not collect other layers by mistake (example: both attention and FFNN layers of BERT contain "output.dense" layers).

    - Started documenting lm as a module. Looks messy for now. Will see how to clean it up later.
    - Can now force LM to use CPU even if GPU is available. That is done by setting the "gpu" parameter to False in ecco.from_pretrained().
    - __call__ now automatically moves input tokens to GPU if the model is on GPU.
    - In _get_activations_hook, extracting the layer number is now done more precisely using regex.
    jalammar committed Feb 14, 2021
    dab2111
  2. - Fixed typo in setup.py

    - To better support new models, AutoModel is now used on everything that doesn't have "GPT2" in its model name. Stopgap for now.
    jalammar committed Feb 14, 2021
    181feb6
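
The switch from matching on a layer name to matching on a regex pattern (dab2111) can be illustrated with a minimal sketch. The module names below are hypothetical, modeled on how BERT-style encoders in HF Transformers typically name their submodules. The pattern and the `layer_number` helper are assumptions for illustration, not the pattern or the `_get_activations_hook` code in this PR.

```python
import re

# Hypothetical module names in the style of a BERT encoder.
module_names = [
    "encoder.layer.0.attention.output.dense",
    "encoder.layer.0.intermediate.dense",
    "encoder.layer.0.output.dense",
    "encoder.layer.11.attention.output.dense",
    "encoder.layer.11.output.dense",
]

# Substring matching picks up both the attention and the FFNN output layers.
substring_hits = [n for n in module_names if "output.dense" in n]

# Illustrative regex: "output.dense" must directly follow the layer index,
# which excludes "attention.output.dense".
ffnn_pattern = re.compile(r"layer\.\d+\.output\.dense$")
regex_hits = [n for n in module_names if ffnn_pattern.search(n)]


def layer_number(name):
    """Return the layer index embedded in a module name, or None if absent."""
    match = re.search(r"layer\.(\d+)\.", name)
    return int(match.group(1)) if match else None


print(len(substring_hits), "substring matches;", len(regex_hits), "regex matches")
print([layer_number(n) for n in regex_hits])
```

Capturing the index with a group, rather than splitting the name on dots, is one way to keep the layer-number extraction robust when different models nest their layers at different depths.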

Commits on Feb 18, 2021

  1. - Fixed NMF.explore() issue where the resulting 'factors' parameter was not consistent between outputs of "generate()" and "__call__".

    - generate() now produces 'token_ids' and 'tokens' in the same shape as __call__: (batch, position). OutputSeq should likely verify the dims of its inputs.
    - __call__ now returns 'token_ids' without the 'input_ids' dict key. Consumers shouldn't know about the distinction.
    - Added tests for the NMF pipeline for both dummy BERT and GPT.
    - More docs
    jalammar committed Feb 18, 2021
    7901faa
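
The note in 7901faa that OutputSeq "should likely verify the dims of inputs" could look something like the sketch below. `ensure_batch_dim` is a hypothetical helper, not part of ecco. It works on plain Python lists rather than tensors to stay dependency-free, and unwrapping an `input_ids` key is only an assumption about what a tokenizer-style dict might contain.

```python
def ensure_batch_dim(token_ids):
    """Return token_ids as a list of lists with dims (batch, position).

    Accepts a flat sequence of ints (treated as a batch of one), a nested
    sequence that is already (batch, position), or a dict carrying the ids
    under an 'input_ids' key.
    """
    if isinstance(token_ids, dict):
        # Hide the tokenizer-style dict from downstream consumers.
        token_ids = token_ids["input_ids"]
    if not isinstance(token_ids, (list, tuple)):
        raise TypeError("token_ids must be a list, tuple, or dict")
    if all(isinstance(t, int) for t in token_ids):
        # Flat sequence: wrap it as a batch of one.
        return [list(token_ids)]
    if all(
        isinstance(row, (list, tuple)) and all(isinstance(t, int) for t in row)
        for row in token_ids
    ):
        # Already (batch, position).
        return [list(row) for row in token_ids]
    raise ValueError("token_ids must have dims (position,) or (batch, position)")


print(ensure_batch_dim([101, 2023, 102]))
print(ensure_batch_dim({"input_ids": [[101, 2023, 102]]}))
```

Normalizing once at the boundary means rankings, saliency, and NMF code can all assume the same two-dimensional layout, whichever entry point produced the output.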

Commits on Feb 20, 2021

  1. - More docs

    jalammar committed Feb 20, 2021
    5b37c34
  2. - More docs

    jalammar committed Feb 20, 2021
    c11b079
  3. - More docs work

    jalammar committed Feb 20, 2021
    9f72b64
  4. - More docs work

    jalammar committed Feb 20, 2021
    218c5e4
  5. - More docs work

    jalammar committed Feb 20, 2021
    73f7d77
  6. - More docs work

    jalammar committed Feb 20, 2021
    10ebce7
  7. - More docs work

    jalammar committed Feb 20, 2021
    c79ff19
  8. - More docs work

    jalammar committed Feb 20, 2021
    fc0c78d
  9. - More docs work

    jalammar committed Feb 20, 2021
    50f1bc2

Commits on Feb 21, 2021

  1. - More docs work

    jalammar committed Feb 21, 2021
    c8d6686
  2. - More docs work

    jalammar committed Feb 21, 2021
    124c7f4
  3. - More docs work

    jalammar committed Feb 21, 2021
    2abe152
  4. - More docs work

    jalammar committed Feb 21, 2021
    4e68c64

Commits on Feb 22, 2021

  1. - Fixed rankings() and saliency() to adapt to the new (batch, position) dimensions of token_ids and tokens

    - Created index of docs
    - Added simple CSS to the docs template
    jalammar committed Feb 22, 2021
    7ce6e47
  2. - Adding images to docs

    - Adding notebook links
    jalammar committed Feb 22, 2021
    dcff20f

Commits on Feb 24, 2021

  1. a8d20c8
  2. - Docs

    jalammar committed Feb 24, 2021
    c78a435
  3. - Docs

    jalammar committed Feb 24, 2021
    86f048b
  4. - Docs

    jalammar committed Feb 24, 2021
    86cdb9e