Inquiry about Key/Value Storage and Matrix Merging in DeepSeekerV2 Inference Code #92

xlim1996 · 2024-09-26T09:45:11Z

Dear DeepSeekerV2 team,

First of all, I would like to thank you for your incredible work on DeepSeekerV2. I am very interested in the model and have been exploring it in detail. However, I have a couple of questions related to the implementation of your inference process.

In the paper, you mention that only the compressed latent vectors for keys and values (c_t^{KV}) need to be cached during inference. However, when I checked the HuggingFace implementation, I noticed that key_states and value_states are still cached separately, in full, during inference. Could you clarify how this aligns with the approach described in the paper?
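To make the gap concrete, here is a rough count of cached elements per token per layer under the two schemes. This is only a back-of-the-envelope sketch: the function name is hypothetical, and the dimensions (128 heads, 128-dim non-rotary and 64-dim rotary key parts, 128-dim values, 512-dim latent) are my assumption of the published configuration, so please correct me if they are off.

```python
def kv_cache_elems_per_token(n_heads, qk_nope_dim, qk_rope_dim, v_dim, kv_latent_dim):
    """Count cached elements per token per layer (hypothetical helper).

    full:   every head stores its full key (non-rotary + rotary part) and value.
    latent: one shared compressed latent plus one shared rotary key part.
    """
    full = n_heads * (qk_nope_dim + qk_rope_dim) + n_heads * v_dim
    latent = kv_latent_dim + qk_rope_dim
    return full, latent


# Assumed dimensions, not read from the official config file.
full, latent = kv_cache_elems_per_token(
    n_heads=128, qk_nope_dim=128, qk_rope_dim=64, v_dim=128, kv_latent_dim=512
)
print(f"full K/V cache:  {full} elements per token per layer")
print(f"latent cache:    {latent} elements per token per layer")
print(f"ratio:           {full / latent:.1f}x")
```

If these numbers are right, caching key_states and value_states directly gives up most of the memory saving that the paper attributes to the latent cache.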

Additionally, the paper discusses absorbing W^UV into W^O and W^UK into W^Q so that keys and values never have to be materialized from the latent cache. However, I couldn't locate this merging step in the code either. Could you provide some insights or point me in the right direction on how this is implemented?
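For reference, this is how I understand the merging, written as a toy single-head sketch in plain Python. All names and sizes are made up for illustration, and it ignores the decoupled RoPE part of the key, so it is my reading of the paper rather than the official implementation.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(r) for r in zip(*a)]


def rand(rows, cols, rng):
    return [[rng.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]


def max_abs_diff(a, b):
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


rng = random.Random(0)
d_model, d_head, d_c, seq = 6, 4, 3, 5  # toy sizes (assumed)

W_Q = rand(d_head, d_model, rng)   # query projection
W_UK = rand(d_head, d_c, rng)      # key up-projection from the latent
W_UV = rand(d_head, d_c, rng)      # value up-projection from the latent
W_O = rand(d_model, d_head, rng)   # output projection

h = rand(d_model, 1, rng)          # hidden state of the current token (column)
C = rand(d_c, seq, rng)            # cached latents, one column per past token

# Naive path: rebuild K and V from the latents, then attend.
q = matmul(W_Q, h)                          # d_head x 1
K = matmul(W_UK, C)                         # d_head x seq
V = matmul(W_UV, C)                         # d_head x seq
scores_naive = matmul(transpose(q), K)      # 1 x seq

# Absorbed path: fold W_UK into the query side and score against C directly.
W_Q_abs = matmul(transpose(W_UK), W_Q)      # d_c x d_model
q_lat = matmul(W_Q_abs, h)                  # d_c x 1
scores_abs = matmul(transpose(q_lat), C)    # 1 x seq

# Softmax over the (identical) scores gives the attention weights.
m = max(scores_naive[0])
exps = [math.exp(s - m) for s in scores_naive[0]]
w = [[e / sum(exps)] for e in exps]         # seq x 1

# Naive output uses V; absorbed output folds W_UV into W_O.
out_naive = matmul(W_O, matmul(V, w))       # d_model x 1
W_O_abs = matmul(W_O, W_UV)                 # d_model x d_c
out_abs = matmul(W_O_abs, matmul(C, w))     # d_model x 1

print("score diff: ", max_abs_diff(scores_naive, scores_abs))
print("output diff:", max_abs_diff(out_naive, out_abs))
```

Both differences should be at floating-point noise level, which is why I expected the inference code to cache only the latents and use the merged matrices.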

Thank you again for your fantastic work, and I really look forward to your guidance on these points.

Best regards,
lucas
