Inquiry about Key/Value Storage and Matrix Merging in DeepSeek-V2 Inference Code #92

Open
xlim1996 opened this issue Sep 26, 2024 · 0 comments

Dear DeepSeek-V2 team,

First of all, thank you for your incredible work on DeepSeek-V2. I have been exploring the model in detail and have two questions about the implementation of the inference process.

In the paper, you mention that during inference the compressed latent vectors for keys and values (c_t^{KV}) are what gets cached. However, when I checked the HuggingFace implementation, I noticed that `key_states` and `value_states` are still cached separately during inference. Could you clarify how this aligns with the approach described in the paper?
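To make the first question concrete, here is a minimal sketch of the two caching strategies as I understand them. It is not taken from the DeepSeek-V2 code: the matrix names follow the paper's notation (W^{DKV}, W^{UK}, W^{UV}), while the toy sizes, the single head, and the helper functions are my own assumptions, and the decoupled RoPE component is left out entirely. Plain Python lists are used so it runs without any dependencies.

```python
import random

def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def rand_matrix(rows, cols, rng):
    """Hypothetical helper: random matrix standing in for trained weights."""
    return [[rng.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]

rng = random.Random(0)
d_model, d_c, d_head = 8, 2, 4  # toy sizes (assumed), one attention head

W_DKV = rand_matrix(d_c, d_model, rng)  # down-projection: h_t -> c_t^{KV}
W_UK = rand_matrix(d_head, d_c, rng)    # up-projection:   c_t^{KV} -> k_t
W_UV = rand_matrix(d_head, d_c, rng)    # up-projection:   c_t^{KV} -> v_t

hidden_states = [[rng.uniform(-1, 1) for _ in range(d_model)] for _ in range(3)]

kv_cache = []      # strategy A: store full keys and values per token
latent_cache = []  # strategy B: store only the compressed latent per token
for h in hidden_states:
    c = matvec(W_DKV, h)
    kv_cache.append((matvec(W_UK, c), matvec(W_UV, c)))
    latent_cache.append(c)

# Keys and values can be rebuilt from the latent cache whenever they are needed.
rebuilt = [(matvec(W_UK, c), matvec(W_UV, c)) for c in latent_cache]

floats_per_token_full = 2 * d_head  # key + value
floats_per_token_latent = d_c       # latent only
```

In this toy setup both caches yield the same keys and values, but strategy B stores `d_c` numbers per token instead of `2 * d_head`. My question is whether the HuggingFace code is effectively doing strategy A.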

Additionally, the paper discusses absorbing W^{UV} into W^O and W^{UK} into W^Q for efficiency. However, I couldn't locate this merging step in the code either. Could you provide some insight or point me to where this is implemented?
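For reference, this is the kind of merging I expected to find, written as a small self-contained sketch. It rests on my own simplifying assumptions: a single head, no query compression, no RoPE (which, as I understand the paper, is handled by a separate decoupled component precisely because it does not commute with this merging), and no 1/sqrt(d) scaling. The names `W_QK` and `W_OV` are hypothetical labels for the merged matrices, not identifiers from the repository.

```python
import math
import random

def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def matmul(A, B):
    """Multiply matrix A by matrix B (both lists of rows)."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def weighted_sum(weights, vectors):
    return [sum(w * v[j] for w, v in zip(weights, vectors)) for j in range(len(vectors[0]))]

def rand_matrix(rows, cols, rng):
    """Hypothetical helper: random matrix standing in for trained weights."""
    return [[rng.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]

rng = random.Random(0)
d_model, d_c, d_head = 8, 2, 4  # toy sizes (assumed), one attention head

W_Q = rand_matrix(d_head, d_model, rng)
W_UK = rand_matrix(d_head, d_c, rng)
W_UV = rand_matrix(d_head, d_c, rng)
W_O = rand_matrix(d_model, d_head, rng)

h_t = [rng.uniform(-1, 1) for _ in range(d_model)]                     # current token
latents = [[rng.uniform(-1, 1) for _ in range(d_c)] for _ in range(3)]  # cached c_i^{KV}

# Unmerged path: up-project every cached latent to a key and a value first.
q = matvec(W_Q, h_t)
keys = [matvec(W_UK, c) for c in latents]
values = [matvec(W_UV, c) for c in latents]
scores_plain = [dot(q, k) for k in keys]
out_plain = matvec(W_O, weighted_sum(softmax(scores_plain), values))

# Merged path: fold W_UK into the query side and W_UV into the output side,
# so attention runs directly on the cached latents.
W_QK = matmul(transpose(W_UK), W_Q)  # (W^{UK})^T W^Q, shape d_c x d_model
W_OV = matmul(W_O, W_UV)             # W^O W^{UV},     shape d_model x d_c
q_latent = matvec(W_QK, h_t)
scores_merged = [dot(q_latent, c) for c in latents]
out_merged = matvec(W_OV, weighted_sum(softmax(scores_merged), latents))
```

Both paths give the same attention scores and output up to floating-point rounding, since q^T (W^{UK} c) = ((W^{UK})^T q)^T c and W^O (W^{UV} c) = (W^O W^{UV}) c. In the merged path, keys and values are never materialized, which is the behavior I could not find in the released code.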

Thank you again for your fantastic work. I look forward to your guidance on these points.

Best regards,
lucas
