Compatibility with torch.distributed.rpc? #568
Comments
I guess the easiest way is to just try it out. I don't think the RPCs matter in this case; it's all about whether you can spawn your process with viztracer. If …
Thanks for the response! I just tried it today with my toy program, which uses multiprocessing and RPC. It does generate a result.json, but when I tried opening it with vizviewer, Perfetto seems to get stuck and doesn't load the trace at all. My result.json file is 138MB, if that is of any help.
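One way to tell whether the problem is in the trace file or in the viewer is to load result.json outside the browser. The snippet below is a minimal sketch, assuming the file is in Chrome Trace Event format with a top-level "traceEvents" list (or a bare JSON array of events); `summarize_trace` is a hypothetical helper name, not part of VizTracer.

```python
import json
from collections import Counter


def summarize_trace(path):
    """Return (event_count, events_per_pid) for a trace JSON file.

    Assumes Chrome Trace Event format: either an object with a
    "traceEvents" list, or a bare list of event objects.
    """
    with open(path, "r", encoding="utf-8") as f:
        data = json.load(f)  # raises json.JSONDecodeError if the file is corrupt

    events = data["traceEvents"] if isinstance(data, dict) else data

    # Count events per process id, to confirm every spawned process was recorded.
    per_pid = Counter(ev.get("pid") for ev in events if isinstance(ev, dict))
    return len(events), per_pid
```

If this parses cleanly and shows one pid per expected process, the file itself is probably fine and the issue is more likely on the viewer side.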
138MB is not huge. The HTTP server of vizviewer is not the most stable software :) Try refreshing the webpage a few times; that normally works for me. There are a few alternatives.
Hi,
I have a multiprocess program with an agent and multiple observers. The processes are spawned using mp.spawn(run_worker, args, nproces.....).
In run_worker, I use torch.distributed.rpc.init_rpc() to create an RPC node for the agent and for each of the observer processes. RPC manages a pool of threads within each process for handling the RPC communication.
I use RPC from the agent to initiate some jobs on the observers. The observers then return results back to the agent, also using RPC. Some of the calls are async RPCs and some are sync.
That's my setup.
My question is: is VizTracer able to support this setup?
Regards,
Dean
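For readers who want to reproduce the question, the setup described above can be sketched roughly as follows. This is a simplified illustration, not Dean's actual program: `worker_name`, `job`, `WORLD_SIZE`, the port number, and the single-machine addresses are all assumptions made for the example, and the observers here only return values rather than calling back into the agent.

```python
import os

WORLD_SIZE = 3  # hypothetical: one agent (rank 0) plus two observers


def worker_name(rank):
    """Map a rank to an RPC node name (naming scheme is an assumption)."""
    return "agent" if rank == 0 else f"observer{rank}"


def job(x):
    """Placeholder for the work an observer does on behalf of the agent."""
    return x * x


def run_worker(rank, world_size):
    # Imported inside the function so the helpers above can be used
    # without torch installed.
    import torch.distributed.rpc as rpc

    # Single-machine rendezvous settings (assumed values).
    os.environ.setdefault("MASTER_ADDR", "localhost")
    os.environ.setdefault("MASTER_PORT", "29500")

    rpc.init_rpc(worker_name(rank), rank=rank, world_size=world_size)

    if rank == 0:
        # Async RPCs: start a job on every observer, then wait for all results.
        futures = [
            rpc.rpc_async(worker_name(r), job, args=(r,))
            for r in range(1, world_size)
        ]
        async_results = [fut.wait() for fut in futures]

        # Sync RPC: block until one observer returns.
        sync_result = rpc.rpc_sync(worker_name(1), job, args=(10,))
        print(async_results, sync_result)

    # Blocks until all RPC nodes have finished their outstanding work.
    rpc.shutdown()


def main():
    import torch.multiprocessing as mp

    # mp.spawn calls run_worker(rank, WORLD_SIZE) in each new process.
    mp.spawn(run_worker, args=(WORLD_SIZE,), nprocs=WORLD_SIZE, join=True)
```

To launch it, call `main()` under an `if __name__ == "__main__":` guard, which process spawning requires. Running that script under viztracer is then the "just try it out" experiment suggested in the reply.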