Windows bench_command memory tracking fails #216

HunterAP23 opened this issue Apr 23, 2025 · 3 comments

HunterAP23 commented Apr 23, 2025

Creating a new issue to track a PR I'm working on to fix the issue in the title. Related issue from May 2021: #97

I was attempting to track the memory usage of command benchmarks on Windows, but got the following errors when doing so:

Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "D:\my_project\.venv\Lib\site-packages\pyperf\__main__.py", line 769, in <module>
    main()
  File "D:\my_project\.venv\Lib\site-packages\pyperf\__main__.py", line 765, in main
    func()
  File "D:\my_project\.venv\Lib\site-packages\pyperf\__main__.py", line 734, in cmd_bench_command
    runner.bench_command(name, command)
  File "D:\my_project\.venv\Lib\site-packages\pyperf\_runner.py", line 747, in bench_command
    return self._main(task)
           ^^^^^^^^^^^^^^^^
  File "D:\my_project\.venv\Lib\site-packages\pyperf\_runner.py", line 460, in _main
    bench = self._worker(task)
            ^^^^^^^^^^^^^^^^^^
  File "D:\my_project\.venv\Lib\site-packages\pyperf\_runner.py", line 434, in _worker
    run = task.create_run()
          ^^^^^^^^^^^^^^^^^
  File "D:\my_project\.venv\Lib\site-packages\pyperf\_worker.py", line 299, in create_run
    self.compute()
  File "D:\my_project\.venv\Lib\site-packages\pyperf\_command.py", line 70, in compute
    raise RuntimeError("failed to get the process RSS")
RuntimeError: failed to get the process RSS
Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "\.venv\Scripts\pyperf.exe\__main__.py", line 10, in <module>
  File "D:\my_project\.venv\Lib\site-packages\pyperf\__main__.py", line 765, in main
    func()
  File "D:\my_project\.venv\Lib\site-packages\pyperf\__main__.py", line 734, in cmd_bench_command
    runner.bench_command(name, command)
  File "D:\my_project\.venv\Lib\site-packages\pyperf\_runner.py", line 747, in bench_command
    return self._main(task)
           ^^^^^^^^^^^^^^^^
  File "D:\my_project\.venv\Lib\site-packages\pyperf\_runner.py", line 465, in _main
    bench = self._manager()
            ^^^^^^^^^^^^^^^
  File "D:\my_project\.venv\Lib\site-packages\pyperf\_runner.py", line 678, in _manager
    bench = Manager(self).create_bench()
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\my_project\.venv\Lib\site-packages\pyperf\_manager.py", line 243, in create_bench
    worker_bench, run = self.create_worker_bench()
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\my_project\.venv\Lib\site-packages\pyperf\_manager.py", line 142, in create_worker_bench
    suite = self.create_suite()
            ^^^^^^^^^^^^^^^^^^^
  File "D:\my_project\.venv\Lib\site-packages\pyperf\_manager.py", line 132, in create_suite
    suite = self.spawn_worker(self.calibrate_loops, 0)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\my_project\.venv\Lib\site-packages\pyperf\_manager.py", line 118, in spawn_worker
    raise RuntimeError("%s failed with exit code %s"
RuntimeError: D:\my_project\.venv\Scripts\python.exe failed with exit code 1

I debugged my way through the code and ended up getting the root cause, which is located here:

try:
    import resource
except ImportError:
    resource = None


def get_max_rss(*, children):
    if resource is not None:
        if children:
            resource_type = resource.RUSAGE_CHILDREN
        else:
            resource_type = resource.RUSAGE_SELF
        usage = resource.getrusage(resource_type)
        if sys.platform == 'darwin':
            return usage.ru_maxrss
        return usage.ru_maxrss * 1024
    else:
        return 0

In short, this function gets the process's maximum resident set size (RSS) through the resource module, but that module is only available on Unix-like systems such as Linux and macOS. On Windows the import fails, so the function simply returns 0, and the downstream callers treat that value as an error and abort the benchmark entirely.
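The failure path described above can be reproduced in isolation. This is only a minimal sketch: `load_resource_module`, `max_rss_or_zero`, and `require_rss` are hypothetical names made up for illustration, and `require_rss` is a stand-in for the check in `_command.py` that raises the error seen in the traceback, not a copy of pyperf's code.

```python
import importlib


def load_resource_module():
    """Return the stdlib resource module, or None where it is missing (e.g. Windows)."""
    try:
        return importlib.import_module("resource")
    except ImportError:
        return None


def max_rss_or_zero(resource_module):
    """Mimic the fallback: report 0 when no resource module is available."""
    if resource_module is None:
        return 0
    usage = resource_module.getrusage(resource_module.RUSAGE_SELF)
    return usage.ru_maxrss


def require_rss(value):
    """Hypothetical caller: a zero reading is treated as a failure."""
    if not value:
        raise RuntimeError("failed to get the process RSS")
    return value


if __name__ == "__main__":
    # On Windows load_resource_module() returns None, so require_rss() raises.
    print(require_rss(max_rss_or_zero(load_resource_module())))
```

Passing `None` explicitly, as in `require_rss(max_rss_or_zero(None))`, simulates the Windows case on any platform.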

I began working on a fork that instead uses psutil to get the current process's RSS, but I noticed that psutil.Process().memory_info().rss returns higher values than the measurements from the resource module. I'm seeing roughly 25% to 35% higher RSS with psutil, which leads to a dilemma in terms of consistency across operating systems. We have a few options:

  1. psutil works cross-platform, but its RSS values do not match what the resource module reports. We can opt to use only psutil moving forward, but that would invalidate all existing command benchmark results until they are re-run.
  2. We can use psutil only on Windows, but this leads to a memory usage discrepancy between operating systems. On my Mac Mini, the resource and psutil RSS values differed by a wide margin, so Windows systems would falsely appear to use more memory than Mac systems (and presumably Linux ones as well).
  3. We can use some other data point, such as the Unique Set Size (USS) from psutil through psutil.Process().memory_full_info().uss. USS is closer to what the resource module reports for RSS, but it is about 15% smaller than the resource module's RSS. USS is supposed to be the closest representation of a process's own memory usage, which would make it a better fit than RSS or peak RSS.
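To compare the three options on a given machine, something like the following sketch could be used. `collect_readings` is a hypothetical helper, not part of pyperf; psutil is an optional third-party dependency here, and the sketch only measures the current process, not child processes.

```python
import sys

try:
    import resource  # Unix-only stdlib module
except ImportError:
    resource = None

try:
    import psutil  # third-party; assumed to be installed for options 1-3
except ImportError:
    psutil = None


def collect_readings():
    """Return the memory readings available on this platform, in bytes."""
    readings = {}
    if resource is not None:
        usage = resource.getrusage(resource.RUSAGE_SELF)
        # ru_maxrss is a peak value: bytes on macOS, kilobytes on Linux.
        scale = 1 if sys.platform == "darwin" else 1024
        readings["resource_max_rss"] = usage.ru_maxrss * scale
    if psutil is not None:
        proc = psutil.Process()
        # memory_info().rss is a point-in-time reading, not a peak.
        readings["psutil_rss"] = proc.memory_info().rss
        readings["psutil_uss"] = proc.memory_full_info().uss
    return readings


if __name__ == "__main__":
    for name, value in collect_readings().items():
        print(f"{name}: {value / 1024 / 1024:.1f} MiB")
```

Since `ru_maxrss` is a maximum and `memory_info().rss` is sampled at the moment of the call, the two numbers are not measuring quite the same thing, which may account for part of any discrepancy.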

I'm not aware of any other way to get the memory usage of a process without writing some C bindings. What's more confusing is that there is also the _win_memory.py file, which uses Windows-native functionality to track memory usage, but from my testing it isn't used for this measurement. If it were, I wouldn't be getting the above error.
I see in both _runner.py and _worker.py that we choose which method to use based on the running OS. If we go with psutil to unify the memory tracking of command benchmarks, should we do the same for regular benchmarks?

HunterAP23 changed the title from "Windows memory tracking fails" to "Windows bench_command memory tracking fails" on Apr 23, 2025
HunterAP23 (Author) commented

Hate to bother but I figure this could be a big change - any opinions @mdboom ?

vstinner (Member) commented

Using psutil.Process().memory_info().rss on Windows sounds like a good idea.

> What's more confusing is that there is also the _win_memory.py file that uses Windows-native functionality to track memory usage, but from my testing that's not used correctly - if it was then I wouldn't be getting the above error.

It's used by collect_metadata to set mem_peak_pagefile_usage metadata. I don't know how it compares to RSS memory.

What does psutil use to compute the RSS memory?

HunterAP23 (Author) commented

On Windows they use a C extension that opens the process with the PROCESS_QUERY_LIMITED_INFORMATION access right and reads the PROCESS_MEMORY_COUNTERS struct:
https://github.com/giampaolo/psutil/blob/master/psutil/arch/windows/proc.c#L359-L386
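
For reference, the same counters can be read from pure Python with ctypes. This is a rough sketch, not psutil's or pyperf's implementation: `windows_memory_counters` is a hypothetical helper, it only inspects the current process (so no OpenProcess call or access right is needed), and it returns None on non-Windows platforms.

```python
import ctypes
import sys


class PROCESS_MEMORY_COUNTERS(ctypes.Structure):
    """Layout of the Win32 PROCESS_MEMORY_COUNTERS struct (DWORD, then SIZE_T fields)."""
    _fields_ = [
        ("cb", ctypes.c_uint32),
        ("PageFaultCount", ctypes.c_uint32),
        ("PeakWorkingSetSize", ctypes.c_size_t),
        ("WorkingSetSize", ctypes.c_size_t),
        ("QuotaPeakPagedPoolUsage", ctypes.c_size_t),
        ("QuotaPagedPoolUsage", ctypes.c_size_t),
        ("QuotaPeakNonPagedPoolUsage", ctypes.c_size_t),
        ("QuotaNonPagedPoolUsage", ctypes.c_size_t),
        ("PagefileUsage", ctypes.c_size_t),
        ("PeakPagefileUsage", ctypes.c_size_t),
    ]


def windows_memory_counters():
    """Return memory counters for the current process, or None off Windows."""
    if sys.platform != "win32":
        return None
    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
    psapi = ctypes.WinDLL("psapi", use_last_error=True)
    # Declare pointer-sized handle types so the pseudo-handle is not truncated.
    kernel32.GetCurrentProcess.restype = ctypes.c_void_p
    psapi.GetProcessMemoryInfo.argtypes = [
        ctypes.c_void_p,
        ctypes.POINTER(PROCESS_MEMORY_COUNTERS),
        ctypes.c_uint32,
    ]
    psapi.GetProcessMemoryInfo.restype = ctypes.c_int
    counters = PROCESS_MEMORY_COUNTERS()
    counters.cb = ctypes.sizeof(counters)
    handle = kernel32.GetCurrentProcess()
    if not psapi.GetProcessMemoryInfo(handle, ctypes.byref(counters), counters.cb):
        raise ctypes.WinError(ctypes.get_last_error())
    return {
        "working_set": counters.WorkingSetSize,
        "peak_working_set": counters.PeakWorkingSetSize,
        "peak_pagefile_usage": counters.PeakPagefileUsage,
    }
```

The working set is the closest Windows counterpart to RSS, while the peak pagefile figure corresponds to the mem_peak_pagefile_usage metadata mentioned earlier in the thread.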

For Linux they instead parse the process's /proc/<pid>/statm file:
https://github.com/giampaolo/psutil/blob/master/psutil/_pslinux.py#L1913-L1930
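
A minimal sketch of the statm approach, assuming a Linux procfs layout where the second field is the number of resident pages. `rss_from_statm` is a hypothetical name, and this is a simplified illustration rather than psutil's code.

```python
import os


def rss_from_statm(pid="self"):
    """Return the current RSS in bytes from /proc/<pid>/statm, or None if unavailable."""
    try:
        with open(f"/proc/{pid}/statm") as statm:
            fields = statm.read().split()
    except OSError:
        # No procfs (e.g. Windows, macOS) or the process is gone.
        return None
    resident_pages = int(fields[1])  # fields: size resident shared text lib data dt
    return resident_pages * os.sysconf("SC_PAGE_SIZE")


if __name__ == "__main__":
    print(rss_from_statm())
```

Like psutil's rss, this is a point-in-time value rather than a peak.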

And lastly, on macOS they have this C extension, but I'm having trouble following where the data in their proc_taskinfo struct is pulled from:
https://github.com/giampaolo/psutil/blob/master/psutil/arch/osx/proc.c#L431-L475
