more traces? #66

Open
liecn opened this issue May 20, 2024 · 7 comments
Labels
help wanted Extra attention is needed question Further information is requested

Comments

@liecn

liecn commented May 20, 2024

Hi,

Could you please share more ET traces, such as the LLaMA traces you mentioned in previous issues?

Currently, I only have the converted traces from ASTRA-sim 1.0 and the Megatron trace mentioned in issue #176.

It would be really helpful if you could share more traces.

Thanks!

@liecn liecn added the question Further information is requested label May 20, 2024
@liecn
Author

liecn commented May 21, 2024

Additionally, it appears that the functionality to parse the text files Transformer_HybridParallel.txt and Transformer_HybridParallel_Fwd_In_Bckwd.txt is missing.

@srinivas212
Contributor

These text files are artifacts of ASTRA-sim 1.0, not Chakra.

The best way to get these traces is to collect them yourself by running a PyTorch model with the profiler enabled. Are you looking for instructions on how to collect them?

@liecn
Author

liecn commented May 22, 2024

Thank you for your response. I appreciate the instructions on the Wiki and find them clear.

I'm just interested in whether I could obtain the measured traces from your end, particularly those involving many nodes, as they would be highly beneficial for my simulation.

@srinivas212
Contributor

Yes, I understand!

What scale are you looking at?

We are updating the comms group info in PyTorch and collecting a few traces. I will check whether we can share them externally. We do want to eventually set up a DB of traces, but hosting the DB and keeping the traces up to date are TBD.

@liecn
Author

liecn commented May 22, 2024

Understood.

I currently need some traces for transformers and LLaMA involving tens of nodes.

Once again, I really appreciate your outstanding work!

@srinivas212 srinivas212 added the help wanted Extra attention is needed label Jun 14, 2024
@tyn513

tyn513 commented Jun 18, 2024

Are there any traces available now?

@32HD

32HD commented Sep 5, 2024

If more multi-node traces could be made public, it would be very helpful to me. Thank you!

rvinaybharadwaj pushed a commit to rvinaybharadwaj/chakra that referenced this issue Sep 23, 2024
4 participants