[WIP] Support for cached multi-query attention towards speculative decoding #893
