[DO NOT MERGE] Vinayak/moe final hashem #127

Open
wants to merge 24 commits into
base: main

Commits on Jul 30, 2024

  1. Microbenchmark for fused moe

    vgokhale committed Jul 30, 2024
    dc5b660
  2. Revert broken load

    vgokhale committed Jul 30, 2024
    d5564d3

Commits on Aug 1, 2024

  1. 1934f71
  2. Config issues

    vgokhale committed Aug 1, 2024
    e01cd34

Commits on Aug 2, 2024

  1. 24317bb

Commits on Aug 6, 2024

  1. 6696142

Commits on Aug 8, 2024

  1. Latest perf

    vgokhale committed Aug 8, 2024
    077cb78
  2. eb861ac
  3. Add masking back

    vgokhale committed Aug 8, 2024
    6e70d89
  4. Add accuracy test

    vgokhale committed Aug 8, 2024
    2b94a75
  5. Merge all changes

    vgokhale committed Aug 8, 2024
    c406afc
  6. Fixes

    vgokhale committed Aug 8, 2024
    5b92aa6

Commits on Aug 9, 2024

  1. Fix shuffling bug

    vgokhale committed Aug 9, 2024
    98a31f2
  2. 75031c5
  3. 7c64459

Commits on Aug 10, 2024

  1. Use shuffled layout in UT

    vgokhale committed Aug 10, 2024
    b213dfe
  2. Fix test bugs

    vgokhale committed Aug 10, 2024
    7df58ad
  3. 1ca0ad6
  4. Enable LDS bypass

    vgokhale committed Aug 10, 2024
    0cbe892
  5. f2f1ca5
  6. Add new configs

    vgokhale committed Aug 10, 2024
    c43e3e8

Commits on Aug 11, 2024

  1. ce8a86e
  2. Add batched prefill via VLLM_SCHED_PREFILL_COUNT

    To ensure we don't run prefills repeatedly during decode, provide a
    mechanism to queue up a certain number of prefills before executing.
    VLLM_SCHED_PREFILL_COUNT is the minimum number of queued prefills required
    before executing.  One caveat: --scheduler-delay-factor should be used to
    enforce a longer prefill scheduling delay.  If not explicitly provided, it
    is set to the value of VLLM_SCHED_PREFILL_COUNT.  This is needed because an
    uneven number of prefills can leave the queue short of
    VLLM_SCHED_PREFILL_COUNT indefinitely, causing the server to hang.
    dllehr-amd authored and valarLip committed Aug 11, 2024
    8148b54
  3. add script for decode

    carlushuang committed Aug 11, 2024
    b9f05ff
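
The batched-prefill behavior described in the "Add batched prefill via VLLM_SCHED_PREFILL_COUNT" commit message can be sketched roughly as below. This is not the code from that commit: `PrefillGate`, `gate_from_env`, and `last_prompt_latency_s` are hypothetical names for illustration, and the timeout (delay factor multiplied by the previous prompt latency) is an assumption about how the delay factor keeps a short queue from stalling. Only the VLLM_SCHED_PREFILL_COUNT variable and the fallback of the delay factor to its value come from the commit message.

```python
import os
import time
from collections import deque


class PrefillGate:
    """Hypothetical helper: hold prefill requests until enough are queued."""

    def __init__(self, min_count, max_wait_s, clock=time.monotonic):
        self.min_count = min_count    # minimum queued prefills before running
        self.max_wait_s = max_wait_s  # assumed timeout so a short queue still drains
        self.clock = clock            # injectable clock, useful for testing
        self.waiting = deque()
        self.oldest_arrival = None    # arrival time of the oldest queued request

    def add(self, request):
        if not self.waiting:
            self.oldest_arrival = self.clock()
        self.waiting.append(request)

    def pop_batch(self):
        """Return all queued prefills if the gate is open, else an empty list."""
        if not self.waiting:
            return []
        enough = len(self.waiting) >= self.min_count
        timed_out = self.clock() - self.oldest_arrival >= self.max_wait_s
        if not (enough or timed_out):
            return []  # keep decoding; do not interrupt for a small prefill batch
        batch = list(self.waiting)
        self.waiting.clear()
        self.oldest_arrival = None
        return batch


def gate_from_env(delay_factor=None, last_prompt_latency_s=1.0):
    """Build a gate from VLLM_SCHED_PREFILL_COUNT (0 means no batching)."""
    min_count = int(os.environ.get("VLLM_SCHED_PREFILL_COUNT", "0"))
    if delay_factor is None:
        # Per the commit message: default the delay factor to the prefill count.
        delay_factor = float(min_count)
    return PrefillGate(min_count, delay_factor * last_prompt_latency_s)
```

With a minimum count of 3, two queued prefills stay held until either a third arrives or the assumed timeout elapses, which is the case the commit message warns would otherwise hang the server.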