
Add memory usage monitor callback #21245


Open · wants to merge 20 commits into master

Conversation

DimiChatzipavlis

We made the memory usage monitor callback (CPU/GPU monitoring) according to the devs' instructions in issue #21150 (TensorBoard integration, support for all backends). We look forward to your feedback!
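For reviewers who want a quick feel for the intended API, here is a minimal, hypothetical usage sketch; the import path and constructor arguments (monitor_gpu, log_every_batch, tensorboard_log_dir) are assumptions based on the docstring excerpts quoted later in this thread, not confirmed public API:

import numpy as np
import keras
from keras.callbacks import MemoryUsageCallback  # import path assumed for this PR

# Toy data and model, purely for illustration.
x = np.random.rand(256, 32).astype("float32")
y = np.random.randint(0, 10, size=(256,))

model = keras.Sequential(
    [
        keras.layers.Dense(64, activation="relu"),
        keras.layers.Dense(10, activation="softmax"),
    ]
)
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")

# Log CPU/accelerator memory at epoch boundaries and write scalars to TensorBoard.
memory_cb = MemoryUsageCallback(
    monitor_gpu=True,
    log_every_batch=False,
    tensorboard_log_dir="./logs/memory",  # hypothetical argument name
)
model.fit(x, y, epochs=2, batch_size=32, callbacks=[memory_cb])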


google-cla bot commented May 3, 2025

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

@codecov-commenter

codecov-commenter commented May 3, 2025

Codecov Report

Attention: Patch coverage is 60.41667% with 76 lines in your changes missing coverage. Please review.

Project coverage is 79.60%. Comparing base (6b74cb0) to head (8f37649).
Report is 40 commits behind head on master.

Files with missing lines | Patch % | Lines
keras/src/callbacks/memory_usage_callback.py | 60.31% | 67 Missing and 8 partials ⚠️
keras/api/_tf_keras/keras/callbacks/__init__.py | 0.00% | 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master   #21245      +/-   ##
==========================================
+ Coverage   76.84%   79.60%   +2.75%     
==========================================
  Files         565      566       +1     
  Lines       54799    55265     +466     
  Branches     8509     8603      +94     
==========================================
+ Hits        42112    43992    +1880     
+ Misses      10543     9233    -1310     
+ Partials     2144     2040     -104     
Flag | Coverage | Δ
keras | 79.41% <58.85%> | (+2.72%) ⬆️
keras-jax | 63.47% <54.16%> | (-0.09%) ⬇️
keras-numpy | 58.53% <17.18%> | (-0.18%) ⬇️
keras-openvino | ? |
keras-tensorflow | 63.88% <56.25%> | (?)
keras-torch | 63.50% <53.12%> | (-0.12%) ⬇️

Flags with carried forward coverage won't be shown.


@DimiChatzipavlis DimiChatzipavlis changed the title from "Add memory uage monitor callback" to "Add memory usage monitor callback" on May 3, 2025
@fchollet
Collaborator

fchollet commented May 5, 2025

Thanks for the PR! Can you link to a Colab showing the callback in action (maybe with different backends)?

@DimiChatzipavlis
Author

Hi everyone!
I’ve put together a Colab demo showing the MemoryUsageCallback in action (including a TensorBoard integration):
https://colab.research.google.com/drive/1-vV1D98TtGN5A9Cx37aW_7qE-CtoFBfd?usp=sharing

My callback works for CPU as well as the TensorFlow, PyTorch, and JAX backends, and it even writes scalars to TensorBoard. However, OpenVINO doesn't expose a memory-stats API (and isn't typically used for training workloads), so it isn't strictly required here, and its tests keep failing in CI. Does anyone have suggestions for:

  • Skipping or mocking out the OpenVINO memory tests?

  • Alternatively, adding minimal OpenVINO support (even just a warning) so that the import_test passes without installation?

Thanks in advance; I'm eager for your feedback on both the Colab and the OpenVINO tests.
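One common option, sketched here as a suggestion rather than the PR's actual approach, is to skip the accelerator-memory tests whenever the active backend exposes no memory-stats API; the marker and test names below are hypothetical:

import pytest

from keras.src import backend

# Skip accelerator-memory assertions on backends without a memory-stats API.
requires_memory_stats = pytest.mark.skipif(
    backend.backend() == "openvino",
    reason="OpenVINO exposes no memory-stats API; only CPU RSS is logged.",
)


@requires_memory_stats
def test_gpu_memory_is_logged():
    ...  # hypothetical test body exercising MemoryUsageCallback on GPU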

@fchollet fchollet added the keras-team-review-pending Pending review by a Keras team member. label May 27, 2025
@divyashreepathihalli
Collaborator

Thank you for the PR @DimiChatzipavlis - taking a look!

via `tf.summary` (TensorBoard).

Args:
monitor_gpu (bool): If True, attempt to measure accelerator memory.

can you add automatic detection instead of this arg?
if running_on_gpu:
..
if running_on_tpu:
...

logic to add these detections

import keras


def running_on_tpu():
    """Best-effort check for a TPU device under the current Keras backend."""
    backend = keras.config.backend()
    if backend == "jax":
        import jax

        devices = jax.devices()
        return any(d.platform == "tpu" for d in devices)
    elif backend == "tensorflow":
        import tensorflow as tf

        return bool(tf.config.list_logical_devices("TPU"))
    elif backend == "torch":
        # PyTorch/XLA TPU support is out of scope here.
        return False
    return False


def running_on_gpu():
    """Best-effort check for a GPU device under the current Keras backend."""
    backend = keras.config.backend()
    if backend == "jax":
        import jax

        devices = jax.devices()
        return any(d.platform == "gpu" for d in devices)
    elif backend == "tensorflow":
        import tensorflow as tf

        return bool(tf.config.list_logical_devices("GPU"))
    elif backend == "torch":
        import torch

        return torch.cuda.is_available()
    return False
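For illustration, the callback's constructor could then drop the monitor_gpu argument in favor of these helpers; the attribute names below are assumptions, not the PR's actual code:

class MemoryUsageCallback(keras.callbacks.Callback):
    def __init__(self, log_every_batch=False):
        super().__init__()
        # Auto-detect accelerators instead of taking a monitor_gpu flag.
        self._monitor_gpu = running_on_gpu()
        self._monitor_tpu = running_on_tpu()
        self._log_every_batch = log_every_batch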


Args:
monitor_gpu (bool): If True, attempt to measure accelerator memory.
log_every_batch (bool): If True, also log after each batch.

What is the default behavior? Log at the end of each epoch? Please document the default behavior in the docstring.
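For example, the docstring fragment quoted above could spell out the default along these lines (hypothetical wording):

    Args:
        monitor_gpu (bool): If True, attempt to measure accelerator memory.
        log_every_batch (bool): If True, also log after each batch.
            Defaults to False, in which case memory is logged only at the
            start and end of each epoch.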

if psutil is None:
    raise ImportError(
        "MemoryUsageCallback requires the 'psutil' library. "
        "Install via `pip install psutil`."

NIT : "To install please use pip install psutil"
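For reference, the usual optional-dependency pattern looks roughly like this (a sketch; the helper name _require_psutil is made up for illustration):

try:
    import psutil
except ImportError:
    psutil = None  # optional dependency; checked when the callback is built


def _require_psutil():
    if psutil is None:
        raise ImportError(
            "MemoryUsageCallback requires the `psutil` library. "
            "To install, please use `pip install psutil`."
        )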

self._log_epoch("start", epoch)

def on_epoch_end(self, epoch, logs=None):
self._log_epoch("end", epoch, offset=1)

From the Colab output I am observing that the epoch end is not logged when log_every_batch is False.

def _get_cpu_memory(self):
    return self._proc.memory_info().rss / (1024**2)

def _get_gpu_memory(self):

Another function to get TPU memory would be needed as well.
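A rough sketch of what such a helper could look like, assuming JAX's Device.memory_stats() and TensorFlow's tf.config.experimental.get_memory_info("TPU:0") are available on the runtime in question (neither is guaranteed on every TPU setup):

def _get_tpu_memory(self):
    """Best-effort TPU memory usage in MiB; returns None if unavailable."""
    backend = keras.config.backend()
    if backend == "jax":
        import jax

        tpu_devices = [d for d in jax.devices() if d.platform == "tpu"]
        if not tpu_devices:
            return None
        total = 0
        for d in tpu_devices:
            stats = d.memory_stats() or {}  # may be None on some runtimes
            total += stats.get("bytes_in_use", 0)
        return total / (1024**2)
    elif backend == "tensorflow":
        import tensorflow as tf

        if not tf.config.list_logical_devices("TPU"):
            return None
        info = tf.config.experimental.get_memory_info("TPU:0")
        return info["current"] / (1024**2)
    return None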

@divyashreepathihalli divyashreepathihalli (Collaborator) left a comment

Thanks for the PR!
I tried the Colab on an A100 GPU with monitor_gpu set to True, but the logs did not show GPU memory usage.

@google-ml-butler google-ml-butler bot added kokoro:force-run ready to pull Ready to be merged into the codebase labels May 29, 2025
@divyashreepathihalli divyashreepathihalli (Collaborator) left a comment

Left some comments!

@fchollet
Collaborator

Please make sure to insert line breaks in log messages so that the logs do not interfere too much with the progress bar printouts.

@google-ml-butler google-ml-butler bot removed the ready to pull Ready to be merged into the codebase label May 31, 2025
@DimiChatzipavlis
Author

Hi all, thanks for the feedback! I've merged in:
  • Cleaner epoch/batch prints with leading newlines
  • Minor docstring fixes (including pip install hints)
  • Code changes for better functionality

Colab Link: https://colab.research.google.com/drive/1-vV1D98TtGN5A9Cx37aW_7qE-CtoFBfd?usp=sharing

I’m still seeing CI failures because openvino isn’t installed (API gen and integration tests try to import it). Any advice on conditionally skipping or wrapping OpenVINO so tests pass would be hugely appreciated!

Thanks in advance; I look forward to your feedback!

@divyashreepathihalli
Collaborator

Thank you for the updates!
Looks like GPU is working but not TPU.

@DimiChatzipavlis
Author

Hi everyone! I've added two small tweaks:
  • For TensorFlow TPUs, running_on_tpu() now calls TPUClusterResolver + initialize_tpu_system() before checking list_logical_devices("TPU"), so Colab's TPU actually comes up.
  • In _get_tpu_memory(), we fall back to summing bytes_in_use for JAX TPUs.

These changes should finally let the callback detect TPUs in a Colab TPU runtime. Unfortunately, I haven't been able to verify this end-to-end in Colab's TPU runtime (the TPU device list always comes back empty, for both me and my colleague, perhaps due to a shortage of TPU resources in Colab), so any tips on a working TPU setup would be appreciated.

Thanks in advance; I look forward to your feedback!
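For reference, the TensorFlow side of that change presumably looks something like the sketch below (function name and error handling are assumptions, not the exact diff):

def _maybe_init_tf_tpu():
    """Try to bring up the TPU system in a TF Colab runtime; return True if a TPU is visible."""
    import tensorflow as tf

    try:
        resolver = tf.distribute.cluster_resolver.TPUClusterResolver()
        tf.config.experimental_connect_to_cluster(resolver)
        tf.tpu.experimental.initialize_tpu_system(resolver)
    except (ValueError, tf.errors.NotFoundError):
        return False  # no TPU reachable in this runtime
    return bool(tf.config.list_logical_devices("TPU"))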

@laxmareddyp
Collaborator

Hi @DimiChatzipavlis ,

I cloned your repository and ran the notebook, but noticed that the GPU memory allocation reported during each epoch is much lower than expected. For example, with a batch size of 64 and an image size of 224x224, the first layer should output a tensor of shape 64x224x224x32, which requires about 411 MB (calculated as 4×64×224×224×32 bytes). However, the callback only reports 60 MiB, which is far below the actual memory needed for training, even without rematerialization. This suggests the reported values do not accurately reflect true GPU memory consumption.
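As a quick back-of-the-envelope check on that figure (not part of the PR):

# float32 activations of the first layer: batch 64, 224x224 spatial, 32 channels.
bytes_needed = 4 * 64 * 224 * 224 * 32
print(bytes_needed / 1e6)    # ~411.0 MB
print(bytes_needed / 2**20)  # ~392.0 MiB, still far above the reported 60 MiB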

Refer to this gist: rematerizalization-with-callback

Labels: awaiting review, keras-team-review-pending (Pending review by a Keras team member), size:L
7 participants