[Core][Debugger] Info about node IP for each breakpoint is not exposed in Ray Debugger #48129

Closed
ArturNiederfahrenhorst opened this issue Oct 21, 2024 · 4 comments · Fixed by #48202
Labels
debugger enhancement Request for new feature and/or capability triage Needs triage (eg: priority, bug/not-bug, and owning component)

Comments

@ArturNiederfahrenhorst
Contributor

Description

When using the Ray Debugger in multi-node settings where we hit numerous breakpoints, it would be helpful to see which node/IP matches which breakpoint.

Use case

No response

@rynewang
Contributor

Context: the Ray CLI debugger currently shows a breakpoint menu with these columns:

NUM | TIME | task/actor name | pwd

But if the same task or actor has 10 instances that have all hit a breakpoint, they look identical in the menu, so the user can't tell which one to attach to.
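
To make the ambiguity concrete, here is a minimal sketch. It is not the Ray CLI's actual code: the `Breakpoint` record, its field names, and the `worker-N` values are hypothetical stand-ins. It renders menu rows from the four columns above, then again with one extra identifying column, and checks whether the rows can be told apart by anything other than their index.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Breakpoint:
    # Hypothetical record; field names are illustrative, not Ray's internal schema.
    timestamp: str
    name: str
    location: str
    worker_id: str


def render_rows(breakpoints, include_worker_id=False):
    """Render one 'index | timestamp | name | location [| worker_id]' row per breakpoint."""
    rows = []
    for index, bp in enumerate(breakpoints):
        cells = [str(index), bp.timestamp, bp.name, bp.location]
        if include_worker_id:
            cells.append(bp.worker_id)
        rows.append(" | ".join(cells))
    return rows


def distinguishable(rows):
    """Return True if the rows differ in something other than the leading index."""
    bodies = [row.split(" | ", 1)[1] for row in rows]
    return len(set(bodies)) == len(bodies)


# Three instances of the same task, all stopped at the same line (made-up worker IDs).
breakpoints = [
    Breakpoint("2024-10-22 21:10:31", "ray::f", "simple.py:13", f"worker-{i}")
    for i in range(3)
]

old_menu = render_rows(breakpoints)
new_menu = render_rows(breakpoints, include_worker_id=True)

print(distinguishable(old_menu))  # False: only the index differs
print(distinguishable(new_menu))  # True: the extra column separates the instances
```

Any per-instance identifier would serve the same purpose here; the worker ID is only one possible choice.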

@ArturNiederfahrenhorst
Contributor Author

Thanks for adding the context. Can we commit to a timeline on this?

@rynewang
Contributor

I made a PR to expose "Node ID", "Worker ID", "Actor ID", "Task ID". I guess this is enough for most use cases.

index | timestamp           | Ray task | filename:lineno                              | Node ID                                                  | Worker ID                                                | Actor ID                         | Task ID                                         
0     | 2024-10-22 21:10:31 | ray::f   | /Users/ruiyangwang/tmp/debugger/simple.py:13 | fa5968b1c7df5fb95f4d33128b37f55e3a5edfd15312eef521c30215 | b23dcddb0f990534adf5833b5e274d5d7b807e54f1ed83e4448df6fb |                                  | fc6ad546348b7319ffffffffffffffffffffffff08000000
1     | 2024-10-22 21:10:31 | ray::f   | /Users/ruiyangwang/tmp/debugger/simple.py:13 | fa5968b1c7df5fb95f4d33128b37f55e3a5edfd15312eef521c30215 | 6f030680e72a63753a36c8224229bae34196699035785a562c92e3ec |                                  | 64a7bfb65a47b860ffffffffffffffffffffffff08000000
2     | 2024-10-22 21:10:31 | ray::f   | /Users/ruiyangwang/tmp/debugger/simple.py:13 | fa5968b1c7df5fb95f4d33128b37f55e3a5edfd15312eef521c30215 | f43c05719f1e141554e464df015057e0ea68ae25cd34ab90f3446f95 |                                  | e80e45b286b5e0c0ffffffffffffffffffffffff08000000
3     | 2024-10-22 21:10:31 | ray::f   | /Users/ruiyangwang/tmp/debugger/simple.py:13 | fa5968b1c7df5fb95f4d33128b37f55e3a5edfd15312eef521c30215 | 47397d8fa58d37eb7681f81d942d286bb6728ffcc8957bd82512155c |                                  | aa519d4ad69f6ba0ffffffffffffffffffffffff08000000
4     | 2024-10-22 21:10:31 | ray::f   | /Users/ruiyangwang/tmp/debugger/simple.py:13 | fa5968b1c7df5fb95f4d33128b37f55e3a5edfd15312eef521c30215 | a0a4338b4d27858e5c94eb7713d77e2afd6ae7dfb1d17f39e6563c0d |                                  | 46d8fe0751d2bcb9ffffffffffffffffffffffff08000000
5     | 2024-10-22 21:10:31 | ray::f   | /Users/ruiyangwang/tmp/debugger/simple.py:13 | fa5968b1c7df5fb95f4d33128b37f55e3a5edfd15312eef521c30215 | b0b9ee330c99b005136986b88e5fc17557c5ac52e4856d91ba9ceded |                                  | b76042b39a3502c3ffffffffffffffffffffffff08000000
6     | 2024-10-22 21:10:31 | ray::f   | /Users/ruiyangwang/tmp/debugger/simple.py:13 | fa5968b1c7df5fb95f4d33128b37f55e3a5edfd15312eef521c30215 | 5092f33e441450950d11b1c948afdf591665d4037093eb02ee8eb2b4 |                                  | 9a9110e6cefca514ffffffffffffffffffffffff08000000
7     | 2024-10-22 21:10:31 | ray::f   | /Users/ruiyangwang/tmp/debugger/simple.py:13 | fa5968b1c7df5fb95f4d33128b37f55e3a5edfd15312eef521c30215 | 5c7f420545c79a89e25913680151ee8030143ba655313eda5b20bd03 |                                  | 052dccbf15cfaa1cffffffffffffffffffffffff08000000
8     | 2024-10-22 21:10:31 | ray::f   | /Users/ruiyangwang/tmp/debugger/simple.py:13 | fa5968b1c7df5fb95f4d33128b37f55e3a5edfd15312eef521c30215 | 62fddfe85885f0fd03dc958280d611e15f5064c15533bd5ef09411ed |                                  | 6765b80abd7cac2effffffffffffffffffffffff08000000
9     | 2024-10-22 21:10:31 | ray::A.m | /Users/ruiyangwang/tmp/debugger/simple.py:18 | fa5968b1c7df5fb95f4d33128b37f55e3a5edfd15312eef521c30215 | 7de0eae2f1458c0f7dbcb230c2faca7df832106ce5ba0979426c4425 | 47b2983c22a74f8a3907e06408000000 | f301eb8e7946b29d47b2983c22a74f8a3907e06408000000
10    | 2024-10-22 21:10:31 | ray::f   | /Users/ruiyangwang/tmp/debugger/simple.py:13 | fa5968b1c7df5fb95f4d33128b37f55e3a5edfd15312eef521c30215 | 0e2a598a0eda540939edf19e16e735aa9cf8bd6eaf3558a08334b98f |                                  | c9fc41409abba0ebffffffffffffffffffffffff08000000
Enter breakpoint index or press enter to refresh:
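
Since the original request was about node IPs while the menu shows node IDs, a small lookup can bridge the two. The sketch below is an assumption-laden illustration, not part of the PR: it assumes node-info dicts shaped like the ones `ray.nodes()` returns in recent Ray versions (keys `NodeID`, `NodeManagerAddress`, `Alive`), and the helper name `node_ip_by_id` plus the sample IDs and addresses are made up for the example.

```python
def node_ip_by_id(nodes):
    """Map node ID -> IP address for nodes that are marked alive.

    `nodes` is assumed to be a list of dicts with the keys "NodeID",
    "NodeManagerAddress" and "Alive", as returned by `ray.nodes()`.
    """
    return {
        node["NodeID"]: node["NodeManagerAddress"]
        for node in nodes
        if node.get("Alive", True)
    }


# In a live cluster this list would come from `ray.nodes()`.
# The entries below are hand-written placeholders, not real cluster data.
sample_nodes = [
    {"NodeID": "node-a", "NodeManagerAddress": "10.0.0.1", "Alive": True},
    {"NodeID": "node-b", "NodeManagerAddress": "10.0.0.2", "Alive": True},
    {"NodeID": "node-c", "NodeManagerAddress": "10.0.0.3", "Alive": False},
]

ip_by_id = node_ip_by_id(sample_nodes)
print(ip_by_id.get("node-a"))  # 10.0.0.1
print(ip_by_id.get("node-c"))  # None, because the node is not alive
```

With such a map, the Node ID shown for a breakpoint can be translated to an address by hand when needed.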

@rynewang
Contributor

Code:

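import ray

ray.init()

# Plain remote task: each invocation stops at its own breakpoint.
@ray.remote
def f():
    ray.util.pdb.set_trace()

# Actor whose method also stops at a breakpoint.
@ray.remote
class A:
    def m(self):
        ray.util.pdb.set_trace()

a = A.remote()

# 10 task breakpoints plus 1 actor-method breakpoint, matching the 11 menu rows above.
objs = [f.remote() for _ in range(10)] + [a.m.remote()]
ray.get(objs)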
