
Refactor keras/src/export/export_lib and add export_onnx #20710


Merged: 4 commits into keras-team:master on Jan 4, 2025

Conversation

@james77777778 (Contributor) commented Jan 2, 2025

This PR:

  1. Refactors export_lib.py to separate the export logic into export_utils.py, saved_model.py, tfsm_layer.py and onnx.py
  2. Adds export_onnx and the corresponding tests

For the TF and JAX backends, the ONNX export relies on export_saved_model and the tf2onnx library.

Now, we can export ONNX artifacts using TF, JAX and Torch backends!
However, the exported artifacts might not be fully optimized. Users are encouraged to see the guidelines here:
https://onnxruntime.ai/docs/performance/
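The refactor essentially splits the export entry point into per-format modules (saved_model.py, onnx.py, etc.). A minimal pure-Python sketch of that dispatch pattern follows; the names (EXPORT_FORMATS, export_model, the exporter stubs) are illustrative, not the actual Keras internals in keras/src/export/:

```python
# Hypothetical sketch of per-format export dispatch. The function and
# table names are illustrative; see keras/src/export/ for the real code.

def export_saved_model(model, filepath):
    # Stub standing in for the SavedModel exporter.
    return f"saved_model written to {filepath}"

def export_onnx(model, filepath):
    # Stub standing in for the ONNX exporter. Per the PR, the real TF/JAX
    # path goes through export_saved_model plus the tf2onnx library.
    return f"onnx written to {filepath}"

EXPORT_FORMATS = {
    "tf_saved_model": export_saved_model,
    "onnx": export_onnx,
}

def export_model(model, filepath, fmt="tf_saved_model"):
    # Route to the module that owns the requested format.
    try:
        exporter = EXPORT_FORMATS[fmt]
    except KeyError:
        raise ValueError(
            f"Unknown format: {fmt!r}. Available: {sorted(EXPORT_FORMATS)}"
        )
    return exporter(model, filepath)
```

The design choice mirrored here is that adding a new format (such as ONNX in this PR) only requires a new module plus one table entry, rather than growing a monolithic export_lib.py.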

EDITED:
tf2onnx breaks when using numpy>=2.0.0
onnx/tensorflow-onnx#2373
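Until the linked tf2onnx issue is fixed, a caller could guard on the numpy version before attempting an ONNX export. A minimal sketch; the helper name and the "< 2.0" constraint are assumptions based on the issue above, and the check only parses a version string so it runs without numpy installed:

```python
# Minimal version guard, assuming (per onnx/tensorflow-onnx#2373) that
# tf2onnx currently breaks with numpy >= 2.0.

def numpy_supported_by_tf2onnx(numpy_version: str) -> bool:
    """Return True if this numpy version predates the 2.0 breaking change."""
    major = int(numpy_version.split(".")[0])
    return major < 2

# In real usage you would pass numpy.__version__ before calling tf2onnx.
assert numpy_supported_by_tf2onnx("1.26.4")
assert not numpy_supported_by_tf2onnx("2.0.0")
```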

@codecov-commenter commented Jan 2, 2025

Codecov Report

Attention: Patch coverage is 83.97790% with 29 lines in your changes missing coverage. Please review.

Project coverage is 81.93%. Comparing base (5b29974) to head (5a94db2).
Report is 3 commits behind head on master.

Files with missing lines Patch % Lines
keras/src/export/onnx.py 71.11% 5 Missing and 8 partials ⚠️
keras/src/export/export_utils.py 85.71% 4 Missing and 5 partials ⚠️
keras/src/export/tfsm_layer.py 93.33% 1 Missing and 2 partials ⚠️
keras/src/layers/layer.py 66.66% 1 Missing and 1 partial ⚠️
keras/api/_tf_keras/keras/export/__init__.py 0.00% 1 Missing ⚠️
keras/src/models/model.py 83.33% 0 Missing and 1 partial ⚠️
Additional details and impacted files
@@           Coverage Diff           @@
##           master   #20710   +/-   ##
=======================================
  Coverage   81.93%   81.93%           
=======================================
  Files         548      551    +3     
  Lines       51190    51264   +74     
  Branches     7912     7924   +12     
=======================================
+ Hits        41942    42003   +61     
- Misses       7310     7315    +5     
- Partials     1938     1946    +8     
Flag Coverage Δ
keras 81.75% <82.87%> (-0.01%) ⬇️
keras-jax 64.01% <59.11%> (+0.01%) ⬆️
keras-numpy 58.91% <25.41%> (-0.03%) ⬇️
keras-openvino 29.88% <24.86%> (+0.01%) ⬆️
keras-tensorflow 64.68% <73.48%> (+<0.01%) ⬆️
keras-torch 64.05% <55.80%> (-0.01%) ⬇️

Flags with carried forward coverage won't be shown.


@fchollet (Collaborator) left a comment:
Thanks for the PR! LGTM

from keras.src.utils.module_utils import tensorflow as tf


@keras_export("keras.layers.TFSMLayer")
Collaborator comment:
We might want to move the file to keras/layers/ in the future. Fine to keep it here for now.

@fchollet (Collaborator) commented Jan 4, 2025:

tf2onnx breaks when using numpy>=2.0.0

Pretty big limitation! Do you know if the package is maintained? Who maintains it?

@fchollet (Collaborator) left a comment:

Awesome work -- thank you for the contribution! 👍

@google-ml-butler google-ml-butler bot added kokoro:force-run ready to pull Ready to be merged into the codebase labels Jan 4, 2025
@fchollet fchollet merged commit 94977dd into keras-team:master Jan 4, 2025
7 checks passed
@google-ml-butler google-ml-butler bot removed ready to pull Ready to be merged into the codebase kokoro:force-run labels Jan 4, 2025
@james77777778 (Contributor, Author):

Pretty big limitation! Do you know if the package is maintained? Who maintains it?

Yes, this is a big limitation.

I believe the ONNX team is working on fixing it:
onnx/tensorflow-onnx#2377

Kindly ping @fatcat-z

@james77777778 james77777778 deleted the add-export-onnx branch January 4, 2025 12:53
fchollet added a commit that referenced this pull request Jan 29, 2025
* Allow some TF kernels fusion: tf.nn.bias_add as special case of tf.add (#20386)

* tf.nn.bias_add as special case of tf.add

* More comments

* Update softmax.py (#20400)

Updated keras.layers.activations.Softmax() to keras.layers.Softmax(); otherwise an AttributeError is raised.

* Add GLU activation (#20392)

* Add GLU activation function

* Add test cases for GLU

* Update assert statement to ValueError

* Updated keras.layers.activations.ReLU API with keras.layers.ReLU in Example from relu.py file (#20403)

`keras.layers.activations.ReLU` throws `AttributeError: module 'keras.api.layers' has no attribute 'activations'`. Replaced it with `keras.layers.ReLU`.

* [Visualization utils] Add visualization utils for plotting images(plain, with bounding boxes and segmentation masks) (#20401)

* api gen

* add plot image gallery function

* add `plot_ bounding_box_gallery`

* correct label key

* add segmentation mask draw and plot functions

* few arg corrections and docstrings

* nit

* add missing args for plotting segmentation masks; use cols for each mask to make the aspect ratio of each subplot correct

* add missing argument for color

* Fix serialization / deserialization. (#20406)

- Serialization was not taking the registered name and package from the registry.
- Deserialization was selecting symbols by postfix as a fallback.

* Fixed the Value error in Example from activation.py (#20404)

* Fixed the Value error in Example from activation.py

Passing a Python list directly to the Keras layer object in the Example from activation.py throws a ValueError. Fixed the error by passing a tensor as input. Here is the [gist](https://colab.sandbox.google.com/gist/LakshmiKalaKadali/caefd982bfff4ff6c4139784236c3a17/quickstart_colab.ipynb#scrollTo=F3hV2zfCb7Nu).

Thank You

* Update activation.py

* Add hard_tanh activation function (#20405)

* Add hard_tanh activation function

* Fix the logic to match dtype of output

* Patch to support TF1 in TF numpy backend (#20413)

eb5c5ae broke Dense layers in TF1, since `.shape` returns a list of
Dimensions which are unhashable types. Adding `.as_list()` enables this
check in both TF1 and TF2.

```
{tf.constant([1, 2]).shape.as_list()[0],}
```
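The commit above explains the fix: per its description, TF1's `Dimension` objects are unhashable, so they cannot be set members, while the plain ints that `.as_list()` yields can. A small backend-free illustration of that hashability point (the `is_hashable` helper is ours, not Keras code, and a plain list stands in for the unhashable Dimension case):

```python
# Why the set expression above needs .as_list(): set members must be
# hashable. Ints (what .as_list() yields) are hashable; list-like
# objects are not, mirroring the unhashable TF1 Dimension case
# described in the commit message.

def is_hashable(obj) -> bool:
    try:
        hash(obj)
        return True
    except TypeError:
        return False

assert is_hashable(2)           # an int dimension works as a set member
assert not is_hashable([1, 2])  # a list of dimensions does not
```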

* Add `mean_with_sample_weight` reduction to `Loss` (#20410)

* Add `normalize_by_sample_weight` to `Loss`

* Add `"mean_with_sample_weight"` reduction for `Loss`

* Minimize code changes

* Fix CI bug

* Jax tracing fix (#20412)

* `JAXTrainer`: refactoring and fixes
Fix for https://github.com/keras-team/keras/issues/20402
Fix for https://github.com/keras-team/keras/issues/20411

* CI setup

* Fix tests

* Revert CI branch to master

* `function` -> `iterator_step`

* Add log_sigmoid activation (#20416)

* correct misspelling and test case (#20417)

* Fix additional shape comparison for TF1 compatibility (#20422)

I missed this in #20413. Confirmed this fixes the issue in Colab.

* Add error for empty PyDataset

* Add `tree.flatten_with_path` and `tree.assert_same_paths` methods. (#20431)

* Add `tree.flatten_with_path` and `tree.assert_same_paths` methods.

* Add methods in `__init__.py`

* Fix api generated files.

* `CompileLoss`: Allow different but reconcilable structures for `y_true` and `y_pred` (#20426)

* - Allow different but reconcilable structures for `y_true` and `y_pred`

* Fix test

* fix too much relaxation

* Use `assert_same_paths` for structures reconciliation checks

* Add `from_sorted_ids` option to `SparseTopKCategoricalAccuracy`. (#20433)

to consume sorted IDs of top N categories instead of scores for all categories.

* Move project metadata from setup.py to pyproject.toml (#20427)

* Move project metadata from setup.py to pyproject.toml

* Override black target version (for now) to avoid other changes

* PR feedback

* Move explicit list of dependencies from setup.py to pyproject.toml

* pathlib was already imported

* Fix 5D shape validation issues with concat layer

* Bump the python group with 5 updates (#20436)

Updates the requirements on [tensorflow-cpu](https://github.com/tensorflow/tensorflow), [tensorflow](https://github.com/tensorflow/tensorflow), torch, torchvision and [tensorflow[and-cuda]](https://github.com/tensorflow/tensorflow) to permit the latest version.

Updates `tensorflow-cpu` to 2.18.0
- [Release notes](https://github.com/tensorflow/tensorflow/releases)
- [Changelog](https://github.com/tensorflow/tensorflow/blob/master/RELEASE.md)
- [Commits](https://github.com/tensorflow/tensorflow/compare/v2.17.0...v2.18.0)

Updates `tensorflow` to 2.18.0
- [Release notes](https://github.com/tensorflow/tensorflow/releases)
- [Changelog](https://github.com/tensorflow/tensorflow/blob/master/RELEASE.md)
- [Commits](https://github.com/tensorflow/tensorflow/compare/v2.17.0...v2.18.0)

Updates `torch` from 2.4.1+cu121 to 2.5.1+cu121

Updates `torchvision` from 0.19.1+cu121 to 0.20.1+cu121

Updates `tensorflow[and-cuda]` to 2.18.0
- [Release notes](https://github.com/tensorflow/tensorflow/releases)
- [Changelog](https://github.com/tensorflow/tensorflow/blob/master/RELEASE.md)
- [Commits](https://github.com/tensorflow/tensorflow/compare/v2.17.0...v2.18.0)

---
updated-dependencies:
- dependency-name: tensorflow-cpu
  dependency-type: direct:production
  dependency-group: python
- dependency-name: tensorflow
  dependency-type: direct:production
  dependency-group: python
- dependency-name: torch
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: python
- dependency-name: torchvision
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: python
- dependency-name: tensorflow[and-cuda]
  dependency-type: direct:production
  dependency-group: python
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump the github-actions group with 2 updates (#20435)

Bumps the github-actions group with 2 updates: [actions/upload-artifact](https://github.com/actions/upload-artifact) and [github/codeql-action](https://github.com/github/codeql-action).


Updates `actions/upload-artifact` from 4.4.0 to 4.4.3
- [Release notes](https://github.com/actions/upload-artifact/releases)
- [Commits](https://github.com/actions/upload-artifact/compare/50769540e7f4bd5e21e526ee35c689e35e0d6874...b4b15b8c7c6ac21ea08fcf65892d2ee8f75cf882)

Updates `github/codeql-action` from 3.26.10 to 3.27.0
- [Release notes](https://github.com/github/codeql-action/releases)
- [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md)
- [Commits](https://github.com/github/codeql-action/compare/e2b3eafc8d227b0241d48be5f425d47c2d750a13...662472033e021d55d94146f66f6058822b0b39fd)

---
updated-dependencies:
- dependency-name: actions/upload-artifact
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: github-actions
- dependency-name: github/codeql-action
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: github-actions
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Fix typos (#20434)

* Fix typos

* Manually fix E501, lines too long

* Fix keras.ops.quantile implementation for floating point inputs that are not tf.float32. (#20438)

* Use temporary folder for testing model saving in file editor (#20439)

* Fix encoding issue (#20443)

* Fix encoding issue

* Fix CI

* Replace isort and flake8 with Ruff checker (#20442)

* Replace isort and flake8 with Ruff checker

* Resolve issue with shell/api_gen.sh and correction to fix/check logic

* Resolve E721 to use `is` and `is not` for type comparisons

* Workaround for pydataset hanging issue

* Replace Black with Ruff formatter (#20445)

* adding `ifft2` method to ops (#20447)

* adding ifft2 method to ops

* fixes all test checks

* using built-in versions in backends

* Fix profiling for Tensorflow and JAX (#20450)

* Fix profiling for tensorflow and JAX

* Update doc

* Test fix

* Fix for https://github.com/keras-team/keras/issues/20425 (#20453)

The issue was caused by the fact that the iterator was not fully consumed and `on_epoch_end` was not called.

Added an exception to catch this situation in the future.

Added a unit test to test `model.fit()` with all the combinations of data adapters.

* Tweaked documentation of `Model`'s `fit`, `evaluate` and `predict`. (#20454)

Clearly documented what all the options are for `x` and all the implications for other arguments.

Also made the documentation more consistent between the arguments and between `fit`, `evaluate` and `predict`.

* Suppress warnings for mismatched tuples and lists in functional models. (#20456)

* Add Circle Loss Function for Similarity/Metric Learning Tasks. (#20452)

* update keras/src/losses/__init__.py, losses.py, losses_test.py and numerical_utils.py

* ruff fixes

* hotfix for logsumexp numerical unstability with -inf values

* actual fix for logsumexp -inf unstability

* Add tests, fix numpy logsumexp, and update Circle Loss docstrings.

* run api_gen.sh

* Docstring nits

* `TensorFlowTrainer`: Add low-level API `unrolled_steps_per_execution` parameter (#20451)

* `TensorFlowTrainer`: Add `unrolled_steps_per_execution` parameter.

* Fix test

* Get rid of mask related warnings when using MHA layer with mask

* Fix steps for `TensorBoard` callback for evaluation and batch metrics. (#20461)

This bug caused batch level metrics and evaluation metrics to all be reported for step 0, which would not show a graph.

Epoch-level metrics were not affected by this bug.

* Attempt to fix nightly

* Add support for direct tensor as initializer (#20457)

* Add support for direct tensor as initializer

* Update docstrings and improve fn for direct tensor as initializer

* Switch jnp.reshape from newshape to shape paramter. (#20469)

The newshape parameter was deprecated in JAX v0.4.28, and will soon be removed.

* Enable flash attention (#20448)

* Enable flash attention

* code reformat

* address review comments

* add docstring

* update docstring

* add numerical correctness test

* code reformat

* use causal mask from call method

* address review comments

* update if

* fix tests

* update tests

* enable flash attention on TPU JAX

* update code

* minor fix

* address review comments

* fix tests

* run api_gen

* code reformat

* fix mask issue

* disable causal mask in dpa because it is computed in compute_attention_mask

* fix masks tests

* code reformat

* disable tests if env is not supported

* fix code reformat error

* fix torch GPU tests

* fix torch gpu tests

* make everything contiguous

* check if mask is not None before calling contiguous

* disable pytorch GPU test

* merge master

* code reformat

* set bias to None

* disable GPU test

* Implement transform_bounding_boxes for random_flip (#20468)

* Implement transform_bounding_boxes for random_flip

* fix test case for torch env

* Add channel first test cases also

* Add condition for channel_first

* `CompileLoss`: fix for partially defined loss with different `y_pred` and `y_true` structures (#20477)

* `CompileLoss`: fix for partially defined loss with different `y_pred` and `y_true` structures.

* - added test

* Update CompileLoss to report unweighted metric values (breaking change) (#20476)

Fixes #20343. Thanks to rivershah@ for pointing this out.

This changes CompileLoss metrics to reporting the values before
weights are applied, which brings it in line with Keras 2 behavior.

* Add loss call fastpath

* Attempt to fix torch gpu CI

* Add hard_shrink activation function (#20470)

* Add hard_shrink activation function

* Correct test case failed

* Change threshold name from lambd to threshold

* Change threshold name from lambd to threshold

* Docstring nits

* - Better handling of partial loss configs (#20478)

* Double backup (#20465)

* Double backup

* Do not remove the previous backup in case some epoch fails twice

* Fix PR comments

* Allow np object arrays containing strings as sliceable inputs

* Add tanh_shrink activation (#20480)

* remove arg `return_attention_scores` from `_compute_attention` (#20482)

* Improve the consistency of the names of initializers (#20484)

* Fix `Orthogonal` initializer and improve consistency of the names of initializers.

* Rename `STFTInitializer` to `STFT`

* Fix CI

* Fix `attention_mask` computation in `MultiHeadAttention` (#20488)

* Fix `dot_product_attention` in `MultiHeadAttention`

* Simplify tests

* Refactor `dot_product_attention` to use flash attention when available (#20489)

* Refactor `dot_product_attention`

* Fix CI and improve compatibility for torch backend.

* Minor condition update.

* Fix CI.

* Fix CI

* Fix GPU CI

* Minor updates for tests

* Fixing example code for BinaryFocalCrossentropy in losses.py file (#20492)

* Add soft_shrink activation (#20494)

* Enhance the robustness of the flash attention check (#20495)

* Enhance the robustness of the flash attention check.

* Fix CI

* Fix CI again

* Fix GPU CI again and again...

* No raise in tests

* Pin coverage==7.6.1

* Fix the comment

* Unpin coverage (#20499)

* implement transform_bounding_boxes for center_crop (#20491)

* implement transform_bounding_boxes for center_crop

* Add test case

* Add support for XPU device for torch

* Fix rendering issue (#20501)

* Add exp2 op (#20506)

* Add Exp2

* Api

* fix and format

* fix format

* fix

* Fix Docstring

* Update API files

* More flexible output_shape computation in keras.layers.MultiHeadAttention (#20503)

* Made the compute_output_shape method more flexible; now _output_shape can be either an integer or a tuple (as previously required).
Fix discussed in #19769

* Added unit test

* Minor changes to comments in unit test

* Minor changes to comments in unit test

* Minor fix

* Fix tensorflow `_dot_product_attention_xla` and update `enable_flash_attention` (#20510)

* Fix tensorflow `_dot_product_attention_xla` and update MHA tests

* Fix tests

* Add squareplus activation (#20508)

* Add squareplus activation

* correct spelling

* Fix docstrings

* `MultiHeadAttention._compute_attention_mask()` always returns a bool tensor. (#20511)

Previously, if a non-bool `attention_mask` was passed and no other mask was passed, the original `attention_mask` was returned unchanged.

Now, it is always cast to bool.

* Allow `convert_to_tensor` to take a value with the wrong `dtype` on Tensorflow. (#20513)

`ops.convert_to_tensor(1.0, "int32")` would fail with the TensorFlow backend. This case is now supported.

Note that other backends already supported this.

* Avoid call to deprecated xla_bridge.get_backend() (#20512)

This function was deprecated in JAX v0.4.32, and will soon be removed.

* Update GQA to use flash attention and move the config to `backend.config` (#20514)

* Make test resilient to spurious warnings. (#20516)

Test was counting warnings, but some other components can throw unrelated warnings.

This makes sure we only count the warnings we're looking for.

* Update losses.py (#20523)

* Fix and update GQA tests (#20522)

* Fix incorrect argument name and update description in RNN documentation (#20525)

* Replace `np.iinfo` and `np.finfo` with `ml_dtypes` (#20528)

* Raise error when calling `save_weights` and `load_weights` with the unbuilt model (#20530)

* Allow EarlyStopping to be reused between multiple `fit`s. (#20533)

All values were already reset properly in `on_train_begin` except `best`.

Fixes https://github.com/keras-team/keras/issues/20521

* implement transform_bounding_boxes for random_zoom (#20526)

* implement transform_bounding_boxes for random_zoom

* Add test cases

* Update test case & correct code

* Revert "Update test case & correct code"

This reverts commit 3288fc7164f802a66948b27905df7f4bce9d7df9.

* Update test case & correct code

* move inline method to layer level

* Fix `BaseOptimizer` with mixed `tf.Variable` and `KerasVariable` (#20534)

* Add support for symbolic tensors to `convert_to_tensor`. (#20539)

`convert_to_tensor(x, sparse=False)` is the API to densify sparse tensors. When used in that manner, the input is already a backend tensor. For this scenario, it makes sense to support symbolic tensors so that one can build a functional model using `convert_to_tensor`.

Also improved the documentation of `convert_to_tensor`.

* implement transform_bounding_boxes for random_translation (#20540)

* Propagate the `aggregation` property when creating a `tf.Variable` (#20541)

* Fix TF variable aggregation

* Add `none` to aggregation

* Fix torch GPU CI, I suppose...

* Add inner op (#20532)

* add inner op

* Fix tensorflow implementation

* fix

* api

* fix lint

* format

* Remove `output_shape` property in MHA (#20543)

* Simplify `output_shape` logic in MHA and remove `output_shape` property.

* Fix CI

* Update test

* Update test

* Fix issue with list/dict losses

* Tiny bit of battle testing function dict inputs

* Fix CI

* Improve `keras.Variable` by exposing docstrings and ensuring consistency in the codebase (#20544)

* Improve `keras.Variable` by exposing docstrings and ensuring consistency in the codebase

* Fix CI

* Update docstrings

* Fix cloning of Sequential models w. input_tensors argument (#20550)

* Fix cloning for Sequential w. input tensor

* Add missing test for input_tensor argument

* Add Sequential wo. Input to test, build model to ensure defined inputs

* Better input validation for InputLayer with input_tensor provided

* Duplicate Example title removed (#20553)

Duplicate Example title removed in the metrics CosineSimilarity method.

* Add diagflat op (#20547)

* diagflat

* api

* Add sparse_plus activation (#20545)

* Add sparse_plus activation

* correct test cases failed

* Tiny fix

* Update ModelCheckpoint support ".h5" support (#20561)

* Update ModelCheckpoint support ".h5" support

* ModelCheckpoint support ".h5" and ".keras" both filetype

* Minor touch ups

* Addition of Sparsemax activation (#20558)

* add: sparsemax ops

* add: sparsemax api references to inits

* add: sparsemax tests

* edit: changes after test

* edit: test case

* rename: function in numpy

* add: pointers to rest inits

* edit: docstrings

* change: x to logits in docstring

* Add parameter axis to tversky loss (#20563)

* Add axis to tversky loss

* Add tests for tversky loss

* Fix line-too-long error

* Reformat code

* Duplicate Example title removed in regression_metrics.py (#20565)

* Un-disable legacy saving tests.

* FIX BUG in load_weights_from_hdf5_group_by_name" legacy_h5_format.py (#20537)

* FIX BUG in load_weights_from_hdf5_group_by_name" legacy_h5_format.py

* add top_level_model_weights to get_subclassed_model

* Minor fixes.

* Major rework of `optree_impl` and `dmtree_impl` for consistency. (#20481)

The `optree` implementation and the `dmtree` implementation of the `tree` API had a number of discrepancies. Running unit tests without `optree` installed would fail on a number of tests.

The documentation and behavior of the `tree` API was not internally consistent. There was contradictory documentation about the handling of `OrderedDict`s. The behavior of the `optree` implementation was to use the key-sorted order for `pack_sequence_as` but the sequence order for `flatten`, so that `flatten` + `pack_sequence_as` did not round-trip correctly (as discovered in https://github.com/keras-team/keras/issues/20538).

The exceptions used to report non-matching structures were different between the two implementations. Where `optree` uses `ValueError` for all mismatches, `dmtree` would distinguish between `ValueError` and `TypeError` in some cases. This caused a number of bugs because `TypeError` was often not caught, only `ValueError`.

The `check_types` argument of `assert_same_structure` is deprecated and no longer does anything. The `optree` implementation would check the types of the *leaves*, whereas `dmtree` would check the types of the *collections*. So `check_types=False` with `optree` was fairly similar, although not identical, to `check_types=True` with `dmtree`. The rule is that no two collection types are considered the same, except for `dict`, `OrderedDict` and `defaultdict`.

Because `optree` is the default implementation used and `dmtree` is only a fallback, this PR changes the `tree` API behavior to conform to the `optree` approach everywhere. This makes the `optree` implementation a thin wrapper on top of `optree`, whereas large portions of the `dmtree` wrapper are now reimplemented in Python. Note that the `tree` API was initially modelled after `dmtree`, not `optree`.

There are a couple of fairly niche known discrepancies between the `optree` implementation and the `dmtree` implementation. They are documented in `dmtree_impl.py`.

- Fixed references to `unflatten_as` in documentation, which doesn't exist.
- Fixed contradicting documentation in `flatten` and `pack_sequence_as` related to the handling of `OrderedDict`s. The documentation now states that the sequence order is always used.
- Made handling of `OrderedDict`s follow the spec with both `optree` and `dmtree`.
- Made the exceptions raised consistent and documented them. `TypeError` is only for major programmer error (e.g. `func` is not callable), and `ValueError` is used for all structure mismatches.
- Removed all cases where `TypeError` was caught after a `assert_same_structure`.
- Fixed the discrepancy in the behavior for `namedtuple`s. The `optree` behavior is now followed, meaning that the path for fields are indices, not field names.
- Deprecated the `check_types` argument in `assert_same_structure` and implemented the `optree` behavior in `dmtree`.
- Removed the `sequence_fn` argument of `pack_sequence_as`, which was unused and forced the `optree` implementation to be fully rewritten in Python.
- Added `MAP_TO_NONE` to the API, added support for it in both implementations of `traverse`. This feature was documented, but not accessible and not actually implemented.
- Added support for registered classes with `dmtree`, both with `flatten` and `unflatten` passed at registration time and methods on the class.
- Tracked collections are now supported with `dmtree` (`TrackedList`, `TrackedSet` and `TrackedDict`). In particular, `TrackedSet` would be handled as a leaf and never traversed.
- Removed dependency of tracked collections on `optree` in `tree_flatten` and `tree_unflatten`.
- Tensorflow's `ListWrapper` and `_DictWrapper` are now supported with `dmtree`.
- Implemented a more efficient way for `optree` to verify structures are the same while traversing them with `map_structure` and `map_structure_up_to`. This avoids multiple traversals.
- Added documentation for `list_to_tuples` and `map_shape_structure`.
- Completely rewrote all tree API unit tests, which are now painfully thorough.
- `map_shape_structure` is now unit tested.
- Fixed unintended use of `tree` instead of `keras.tree` in unit test.
- Ran unit tests for all backends with `optree` uninstalled.

Fixes https://github.com/keras-team/keras/issues/20538
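The round-trip property this rework enforces can be sketched in pure Python. A minimal `flatten` / `pack_sequence_as` pair that handles only dicts and lists, always in sequence/insertion order (the real `keras.tree` API handles many more container types; this is an illustration, not the Keras implementation):

```python
def flatten(structure):
    """Flatten nested dicts/lists to a leaf list, in sequence order."""
    if isinstance(structure, dict):
        leaves = []
        for value in structure.values():  # insertion order, never key-sorted
            leaves.extend(flatten(value))
        return leaves
    if isinstance(structure, list):
        leaves = []
        for value in structure:
            leaves.extend(flatten(value))
        return leaves
    return [structure]  # anything else is a leaf

def pack_sequence_as(structure, flat):
    """Inverse of flatten: rebuild `structure`'s shape from `flat` leaves."""
    flat_iter = iter(flat)

    def rebuild(node):
        if isinstance(node, dict):
            return {k: rebuild(v) for k, v in node.items()}
        if isinstance(node, list):
            return [rebuild(v) for v in node]
        return next(flat_iter)

    return rebuild(structure)

# Because both functions walk in the same (insertion) order, the pair
# round-trips, which is exactly the invariant the bug above violated
# when pack_sequence_as used key-sorted order instead.
nested = {"b": [1, 2], "a": 3}  # "b" inserted before "a"
assert flatten(nested) == [1, 2, 3]
assert pack_sequence_as(nested, flatten(nested)) == nested
```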

* Fix CI

* Add threshold activation (#20555)

* Add threshold activation

* Add implementations for ops, activations.py & test cases

* Adjust arg names

* Fix dtype of tf argmax

* Bump the github-actions group with 2 updates (#20571)

Bumps the github-actions group with 2 updates: [codecov/codecov-action](https://github.com/codecov/codecov-action) and [github/codeql-action](https://github.com/github/codeql-action).


Updates `codecov/codecov-action` from 4 to 5
- [Release notes](https://github.com/codecov/codecov-action/releases)
- [Changelog](https://github.com/codecov/codecov-action/blob/main/CHANGELOG.md)
- [Commits](https://github.com/codecov/codecov-action/compare/v4...v5)

Updates `github/codeql-action` from 3.27.0 to 3.27.5
- [Release notes](https://github.com/github/codeql-action/releases)
- [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md)
- [Commits](https://github.com/github/codeql-action/compare/662472033e021d55d94146f66f6058822b0b39fd...f09c1c0a94de965c15400f5634aa42fac8fb8f88)

---
updated-dependencies:
- dependency-name: codecov/codecov-action
  dependency-type: direct:production
  update-type: version-update:semver-major
  dependency-group: github-actions
- dependency-name: github/codeql-action
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: github-actions
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Fix the compatibility of the quantization for MHA  (#20562)

* Fix MHA with int8 quant

* Propagate and delete mask in MHA

* Fix CI

* Random seed doc (#20575)

* Fixed error in doc of random number generators concerning seed argument.

* Update class documentation for SeedGenerator

Clarify the facts that 

- a global SeedGenerator is used by all random number generating functions in keras,
- a SeedGenerator is required for jit compilation with the JAX backend.

* Minor reformulation.

* Refined remark on JAX and tracing.

* Fixed column length.

* Fixed line length of documentation.

* Reformatted with black.

* Reformatted with black.

* Still some lines too long?

* Another long line that was introduced by black.

* Minor nits

* Add unravel_index op (#20559)

* Add unravel_index

* Fix Tensorflow

* Fix Tensorflow impl

* fix default np.int64

* fix

* Fix torch

* fix numpy and torch

* api

* fix

* Fix tensorflow impl and docstring

* fix

* shape None case

* shape None case

* fix

* Add support for jax named scope. (#20580)

* Better handling of variable creation with tensor as initializer. (#20557)

- when the variable dtype is not specified, the dtype of the tensor/array passed as initializer is used instead of defaulting to `backend.floatx()`.
- with the JAX backend, don't needlessly create a copy of the initializer value, reuse it if possible.

* Add Equalization Layer (#20570)

* Add Equalization Layer

* api and fix format

* lint

* Add tf-data test

* data format

* Update Doc String

* Minor fix

* Duplicate Example title removed in OneHotIoU (#20587)

Duplicate Example title removed in the OneHotIoU file.

* [TextVectorization Layer] Added tests for testing the functionality (#20586)

* [TextVectorization Layer] Added tests for functionality

* Fix formatting

* Skip flaky TF test

* Fix masking when `_keras_mask` is unset during `call` (#20594)

* Fix using a Python number as an initializer in JAX (#20595)

* Fix the issue when using python number as the initializer in jax

* Add rise

* Fix using lambda expression as the initializer

* Fix the issue when using `Model.compile` multiple times. (#20602)

* Fix loss scaling with `tf.distribute.MirroredStrategy` and `keras.regularizers` (#20609)

* Fix loss scaling when using `tf.distribute.MirroredStrategy`

* Fix regularizer

* Remove unused fn

* Add implementations for mix_up (#20590)

* Add implementations for mix_up

* Add updated init files

* Applied some corrections

* Remove sample beta method

* Add test cases

* Correct failed test cases

* Correct failed test cases

* Add tf compatibility test case

* Update example in the code

* Fix failed test case

* Update for numpy 2.2 bool array changes (#20614)

* Update constraints_test.py

* Turn double negative into positive assert

* Fix `SeedGenerator` in `tf.distribute.MirroredStrategy` (#20612)

* Unscale loss value in TF (#20610)

* Fix issue with unsorted dict (#20613)

* Code style fix

* Use lower precision in DPA (#20615)

* Fix confusion matrix type (#20584)

* fix: fix confusion matrix float32 problem

use int

* Use float32 for threshold comparisons and include warnings when the weights are float but the dtype is int
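
A minimal sketch of the idea behind this fix (the `binarize` helper is hypothetical, not the actual metrics code): do the threshold comparison in float32, then cast the boolean result to the requested integer dtype for the confusion-matrix counts.

```python
import numpy as np

def binarize(predictions, threshold=0.5, dtype="int32"):
    # Compare in float32, then cast the boolean result to the
    # requested integer dtype used for confusion-matrix counts.
    preds = np.asarray(predictions, dtype="float32")
    return (preds > np.float32(threshold)).astype(dtype)
```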

* fix torch test on mix_up (#20623)

* Fix torch gpu ci

* update distribution_lib files docstrings. (#20625)

* Improve implementation of TF shuffle and make it XLA compilable

* Fix CI I guess

* Correct bug for MixUp initialization. (#20630)

* Correct bug for MixUp initialization.

* Update format indent

* Fix Layer normalization issue with scalar mean & variance (#20626)

* Fix normalization issue with scalar mean & variance

* Add unit test for normalization with scalar mean and variance

* Fix code format

* Add `IOU`, `CIOU` and minor fixes to bounding boxes (#20635)

* Add compute affine matrix method and reformat some of the bounding box arguments

* Add rotation for boxes

* proper reshape of the rotation matrix

* iou and random rotation using affine

* bounding boxes iou

* - add encode and decode to deltas for bounding boxes
- add iou and ciou methods

* add api points for encode and decode methods of bounding boxes

* fix arg name and proper for args for test_affine

* correct dtype mul

* Fix torch gpu ci

* Fix GPU CI (#20637)

* Fix GPU CI

* Fix dtype issue

* Remove duplicate tests

* Fix typos in simple_rnn (#20636)

I observed a few typos in simple_rnn.

* Add implementations for random_hue (#20620)

* Add implementations for random_hue

* Correct failed test cases

* Correct misspellings

* Update example on description

* Correct test case failed.

* Fix code style

* Docstring nit

* FEAT add scikit-learn wrappers (#20599)

* FEAT add scikit-learn wrappers

* import cleanup

* run black

* linters

* lint

* add scikit-learn to requirements-common

* generate public api

* fix tests for sklearn 1.5

* check fixes

* skip numpy tests

* xfail instead of skip

* apply review comments

* change names to SKL* and add transformer example

* fix API and imports

* fix for new sklearn

* sklearn1.6 test

* review comments and remove random_state

* add another skipped test

* rename file

* change imports

* unindent

* docstrings

* Rework `Model.export` and `keras.export.ExportArchive` to support exporting in TFLite and ONNX formats in the future (#20631)

* Rework `Model.export` and `keras.export.ExportArchive`

* Try fixing PyDatasetAdapterTest CI issues

* Fix random hue layer

* Update example and logic for mix_up (#20643)

* Add sklearn (#20644)

* Update example and logic for mix_up (#20642)

* Update example and logic for mix_up

* remove tf from example

* Add RandomGrayscale Layer (#20639)

* Add RandomGrayscale Layer

* Fix torch tests

* format

* fix

* fix

* Fix torch tests

* Fix torch ci

* Fix typo

* Fix issues with randomgrayscale layer

* Fix Randomhue (#20652)

* Small fix in random hue

* use self.backend for seed

* test: add test for class weights (py_dataset adapter) (#20638)

* test: add test for class weights (py_dataset adapter)

* "call _standardize_batch from enqueuer"

m

* add more tests, handle pytorch astype issue

m

* convert to numpy to ensure consistent handling of operations

* Add implementations for random_saturation (#20646)

* Correct bug for MixUp initialization.

* Update format indent

* Add implementations for random_saturation

* change parse_factor method to inner method.

* correct test cases failed.

* correct failed test cases

* Add training argument check condition

* correct source code

* add value_range args description

* update description example

* change _apply_random_saturation method to inline

* Fix random_saturation

* Fix paths for pytest in contribution guide (#20655)

* Add preliminary support of OpenVINO as Keras 3 backend (#19727)

* [POC][OV] Support OpenVINO as Keras 3 backend

Signed-off-by: Kazantsev, Roman <[email protected]>

* Mark all unsupported ops from numpy space

Signed-off-by: Kazantsev, Roman <[email protected]>

* Mark unsupported ops in core, image, and linalg spaces

Signed-off-by: Kazantsev, Roman <[email protected]>

* Mark unsupported ops in math, nn, random, and rnn spaces

Signed-off-by: Kazantsev, Roman <[email protected]>

* Fix sorting imports

Signed-off-by: Kazantsev, Roman <[email protected]>

* Format imports

Signed-off-by: Kazantsev, Roman <[email protected]>

* Fix sorting imports

Signed-off-by: Kazantsev, Roman <[email protected]>

* Fix sorting imports

Signed-off-by: Kazantsev, Roman <[email protected]>

* Fix inference

Signed-off-by: Kazantsev, Roman <[email protected]>

* Remove openvino specific code in common part

Signed-off-by: Kazantsev, Roman <[email protected]>

* Fix typo

* Clean-up code

Signed-off-by: Kazantsev, Roman <[email protected]>

* Recover imports

Signed-off-by: Kazantsev, Roman <[email protected]>

* Sort imports properly

Signed-off-by: Kazantsev, Roman <[email protected]>

* Format source code

Signed-off-by: Kazantsev, Roman <[email protected]>

* Format the rest of source code

Signed-off-by: Kazantsev, Roman <[email protected]>

* Continue format adjustment

Signed-off-by: Kazantsev, Roman <[email protected]>

* Add OpenVINO dependency

Signed-off-by: Kazantsev, Roman <[email protected]>

* Fix inference using OV backend

Signed-off-by: Kazantsev, Roman <[email protected]>

* Support bert_base_en_uncased and mobilenet_v3_small from Keras Hub

Signed-off-by: Kazantsev, Roman <[email protected]>

* Remove extra openvino specific code from layer.py

Signed-off-by: Kazantsev, Roman <[email protected]>

* Apply code-style formatting

Signed-off-by: Kazantsev, Roman <[email protected]>

* Apply code-style formatting

Signed-off-by: Kazantsev, Roman <[email protected]>

* Fix remained code-style issue

Signed-off-by: Kazantsev, Roman <[email protected]>

* Run tests for OpenVINO backend in GHA

Signed-off-by: Kazantsev, Roman <[email protected]>

* Add config file for openvino backend validation

Signed-off-by: Kazantsev, Roman <[email protected]>

* Add import test for openvino backend

Signed-off-by: Kazantsev, Roman <[email protected]>

* Fix error in import_test.py

Signed-off-by: Kazantsev, Roman <[email protected]>

* Add import_test for openvino backend

Signed-off-by: Kazantsev, Roman <[email protected]>

* Add openvino specific integration tests in GHA

Signed-off-by: Kazantsev, Roman <[email protected]>

* Exclude coverage for OpenVINO

Signed-off-by: Kazantsev, Roman <[email protected]>

* remove coverage for openvino backend

Signed-off-by: Kazantsev, Roman <[email protected]>

* Try layer tests for openvino backend

Signed-off-by: Kazantsev, Roman <[email protected]>

* Run layer tests for openvino backend selectively

Signed-off-by: Kazantsev, Roman <[email protected]>

* Mark enabled tests for openvino backend in a different way

Signed-off-by: Kazantsev, Roman <[email protected]>

* Update .github/workflows/actions.yml

* Fix import for BackendVariable

Signed-off-by: Kazantsev, Roman <[email protected]>

* Fix errors in layer tests for openvino backend

Signed-off-by: Kazantsev, Roman <[email protected]>

* Add test for Elu via openvino backend

Signed-off-by: Kazantsev, Roman <[email protected]>

* Fix sorted imports

Signed-off-by: Kazantsev, Roman <[email protected]>

* Extend testing for attention

Signed-off-by: Kazantsev, Roman <[email protected]>

* Update keras/src/layers/attention/attention_test.py

* Switch on activation tests for openvino backend

Signed-off-by: Kazantsev, Roman <[email protected]>

* Switch on attention tests for openvino backend

Signed-off-by: Kazantsev, Roman <[email protected]>

* Update keras/src/layers/attention/additive_attention_test.py

* Update keras/src/layers/attention/grouped_query_attention_test.py

* Run conv tests for openvino backend

Signed-off-by: Kazantsev, Roman <[email protected]>

* Fix convolution in openvino backend

Signed-off-by: Kazantsev, Roman <[email protected]>

* Work around constant creation for tuple

Signed-off-by: Kazantsev, Roman <[email protected]>

* Work around constant creation in reshape

Signed-off-by: Kazantsev, Roman <[email protected]>

* Run depthwise conv tests for openvino backend

Signed-off-by: Kazantsev, Roman <[email protected]>

* Fix get_ov_output for other x types

Signed-off-by: Kazantsev, Roman <[email protected]>

* Fix elu translation

Signed-off-by: Kazantsev, Roman <[email protected]>

* Fix softmax and log_softmax for None axis

Signed-off-by: Kazantsev, Roman <[email protected]>

* Run nn tests for openvino backend

Signed-off-by: Kazantsev, Roman <[email protected]>

* Fix numpy operations for axis to be None

Signed-off-by: Kazantsev, Roman <[email protected]>

* Run operation_test for openvino_backend

Signed-off-by: Kazantsev, Roman <[email protected]>

* Switch on math_test for openvino backend

Signed-off-by: Kazantsev, Roman <[email protected]>

* Switch on image tests for openvino backend

Signed-off-by: Kazantsev, Roman <[email protected]>

* Switch on linalg test for openvino backend

Signed-off-by: Kazantsev, Roman <[email protected]>

* Extend OpenVINOKerasTensor with new built-in methods and fix shape op

Signed-off-by: Kazantsev, Roman <[email protected]>

* Switch on core tests for openvino backend

Signed-off-by: Kazantsev, Roman <[email protected]>

* Use different way of OpenVINO model creation that supports call method

Signed-off-by: Kazantsev, Roman <[email protected]>

* Unify integration test for openvino

Signed-off-by: Kazantsev, Roman <[email protected]>

* Support new operations abs, mod, etc.

Signed-off-by: Kazantsev, Roman <[email protected]>

* Add support for more operations like squeeze, max

Signed-off-by: Kazantsev, Roman <[email protected]>

* Try to use excluded test files list

Signed-off-by: Kazantsev, Roman <[email protected]>

* Apply formatting for normalization_test.py

Signed-off-by: Kazantsev, Roman <[email protected]>

* Correct GHA yml file

Signed-off-by: Kazantsev, Roman <[email protected]>

* Test that openvino backend is used

Signed-off-by: Kazantsev, Roman <[email protected]>

* Revert testing change in excluded test files list

Signed-off-by: Kazantsev, Roman <[email protected]>

* Include testing group

Signed-off-by: Kazantsev, Roman <[email protected]>

* Include legacy test group

Signed-off-by: Kazantsev, Roman <[email protected]>

* Exclude legacy group of tests

Signed-off-by: Kazantsev, Roman <[email protected]>

* Include initializers tests

Signed-off-by: Kazantsev, Roman <[email protected]>

* Skip tests for initializers group

Signed-off-by: Kazantsev, Roman <[email protected]>

* Remove export test group from ignore

Signed-off-by: Kazantsev, Roman <[email protected]>

* Include dtype_policies test group

Signed-off-by: Kazantsev, Roman <[email protected]>

* Reduce ignored tests

Signed-off-by: Kazantsev, Roman <[email protected]>

* Fix ops.cast

Signed-off-by: Kazantsev, Roman <[email protected]>

* Add decorator for custom_gradient

Signed-off-by: Kazantsev, Roman <[email protected]>

* Shorten line in custom_gradient

Signed-off-by: Kazantsev, Roman <[email protected]>

* Ignore dtype_policy_map test

Signed-off-by: Kazantsev, Roman <[email protected]>

* Include callback tests

Signed-off-by: Kazantsev, Roman <[email protected]>

* Switch on backend tests

Signed-off-by: Kazantsev, Roman <[email protected]>

* Exclude failing tests

Signed-off-by: Kazantsev, Roman <[email protected]>

* Correct paths to excluded tests

Signed-off-by: Kazantsev, Roman <[email protected]>

* Switch on some layers tests

Signed-off-by: Kazantsev, Roman <[email protected]>

* Remove pytest.mark.openvino_backend

Signed-off-by: Kazantsev, Roman <[email protected]>

* Register mark requires_trainable_backend

Signed-off-by: Kazantsev, Roman <[email protected]>

* Ignore test files in a different way

Signed-off-by: Kazantsev, Roman <[email protected]>

* Try different way to ignore test files

Signed-off-by: Kazantsev, Roman <[email protected]>

* Fix GHA yml

Signed-off-by: Kazantsev, Roman <[email protected]>

* Support tuple axis for logsumexp

Signed-off-by: Kazantsev, Roman <[email protected]>

* Switch on some ops tests

Signed-off-by: Kazantsev, Roman <[email protected]>

* Switch on some callbacks tests

Signed-off-by: Kazantsev, Roman <[email protected]>

* Add openvino export

Signed-off-by: Kazantsev, Roman <[email protected]>

* Update sklearn tests

Signed-off-by: Kazantsev, Roman <[email protected]>

* Add a comment to skip numerical_test

Signed-off-by: Kazantsev, Roman <[email protected]>

* Add custom requirements file for OpenVINO

Signed-off-by: Kazantsev, Roman <[email protected]>

* Add reqs of openvino installation for api changes check

Signed-off-by: Kazantsev, Roman <[email protected]>

* Fix types of Variables and switch on some variables tests

Signed-off-by: Kazantsev, Roman <[email protected]>

* Fix nightly code check

Signed-off-by: Kazantsev, Roman <[email protected]>

---------

Signed-off-by: Kazantsev, Roman <[email protected]>

* Make sklearn dependency optional (#20657)

* Add a condition to verify training status during image processing (#20650)

* Add a condition to verify training status during image processing

* resolve merge conflict

* fix transform_bounding_boxes logic

* add transform_bounding_boxes test

* Fix recurrent dropout for GRU. (#20656)

The simplified implementation, which used the same recurrent dropout masks for all the previous states, didn't work and caused the training not to converge with large enough recurrent dropout values.

This new implementation is now the same as Keras 2. Note that recurrent dropout requires "implementation 1" to be turned on.

Fixes https://github.com/keras-team/keras/issues/20276
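
A rough sketch of the restored behavior, under the assumption described above (one independent dropout mask per recurrent gate, as in Keras 2 "implementation 1", rather than a single shared mask); the helper name is illustrative, not the actual layer code:

```python
import numpy as np

def make_recurrent_dropout_masks(rng, num_gates, units, rate):
    # One independent inverted-dropout mask per recurrent gate
    # (GRU has 3 gates), instead of a single mask reused everywhere.
    keep_prob = 1.0 - rate
    return [
        (rng.random((units,)) < keep_prob).astype("float32") / keep_prob
        for _ in range(num_gates)
    ]

masks = make_recurrent_dropout_masks(np.random.default_rng(0), 3, 8, 0.5)
```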

* Fix example title in probabilistic_metrics.py (#20662)

* Change recurrent dropout implementation for LSTM. (#20663)

This change is to make the implementation of recurrent dropout consistent with GRU (changed as of https://github.com/keras-team/keras/pull/20656 ) and Keras 2.

Also fixed a bug where the GRU fix would break when using CUDNN with a dropout and no recurrent dropout. The solution is to create multiple masks only when needed (implementation == 1).

Added coverage for the case when dropout is set and recurrent dropout is not set.

* Never pass enable_xla=False or native_serialization=False in tests (#20664)

These are invalid options in the latest version of jax2tf; they
will just immediately throw.

* Fix `PyDatasetAdapterTest::test_class_weight` test with Torch on GPU. (#20665)

The test was failing because arrays on device and on cpu were compared.

* Fix up torch GPU failing test for mix up (#20666)

We need to make sure any tensors are placed on CPU before using
them in the TensorFlow backend during preprocessing.

* Adjust value_range for random_contrast and random_hue (#20671)

* Adjust value_range for random_contrast and random_hue

* Add value_range description

* Correct failed test cases

* Duplicate Example title removed in OneHotMeanIoU function (#20669)

The duplicate Example title has been removed from the OneHotMeanIoU function.

* Add random_color_jitter processing layer (#20673)

* Add implementations for random_saturation

* change parse_factor method to inner method.

* Add implementations for random_color_jitter

* Add random_color_jitter processing layer

* Add random_color_jitter test

* Update test cases

* Correct failed test case

* Correct failed test case

* Correct failed test case

---------

Signed-off-by: Kazantsev, Roman <[email protected]>
Co-authored-by: IMvision12 <[email protected]>
Co-authored-by: Enrico <[email protected]>
Co-authored-by: Marco <[email protected]>
Co-authored-by: Roman Kazantsev <[email protected]>
Co-authored-by: Matt Watson <[email protected]>
Co-authored-by: hertschuh <[email protected]>
Co-authored-by: Jasmine Dhantule <[email protected]>

* Add training status condition during image processing (#20677)

* Add training status condition during image processing

* Revert "Add training status condition during image processing"

This reverts commit 8fc5ae2c28c239663fe0f2e8ac7fa15037f41a7d.

* Reapply "Add training status condition during image processing"

This reverts commit 25a4bd1332c7a5794dc872f5aa6ddddf6ed1606b.

* Revert center_crop

* Import `pydot` first before trying backups (#20682)

* Fix: Return Attention Scores when `return_attention_scores=True` (#20684)

* Fix: Ensure Attention Layer Returns Attention Scores when `return_attention_scores=True`

This pull request addresses an issue in the Attention layer where the return_attention_scores parameter wasn't correctly handled in the compute_output_shape method. This fix ensures that attention scores are returned when return_attention_scores=True.

## Changes Made
Modified compute_output_shape method to return the shape of both the attention output and the attention scores when return_attention_scores=True.

* Formatting

* Fixed score return and added unit tests for return_attention_scores=True

* Removed debug print statement
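
The corrected `compute_output_shape` logic can be sketched as follows; this is an illustrative standalone function with assumed shape conventions (attention scores shaped `(batch, Tq, Tv)`), not the layer's actual implementation:

```python
def compute_output_shape(query_shape, value_shape, return_attention_scores=False):
    # Output keeps the query's leading dims with the value's feature dim;
    # when requested, also return the scores' shape (batch, Tq, Tv).
    output_shape = (*query_shape[:-1], value_shape[-1])
    if return_attention_scores:
        scores_shape = (query_shape[0], query_shape[1], value_shape[1])
        return output_shape, scores_shape
    return output_shape
```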

* Add random_color_degeneration processing layer (#20679)

* Add random_color_degeneration processing layer

* Fix typo

* Correct failed test case

* fix attention output with symbolic tensors and attention scores (#20689)

* minor: Fix Functional API guide (#20694)

Add an empty line so the list is rendered as a list, not as a single line of text

* Introduces support for exporting `SavedModel` in the torch backend using `torch-xla` (#20685)

* Add support for exporting savedmodel in the torch backend

* Fix `actions.yml`

* Fix CI

* Remove unused `_mangle_tf_root_scope_name` and add `import_error_msg` to `LazyModule`

* Ignore `export_lib_test` in torch GPU CI

* Add random_posterization processing layer (#20688)

* Add random_posterization processing layer

* Add test cases

* correct failed case

* Fix torch gpu CI (#20696)

* Add random_sharpness processing layer (#20697)

* Add random_sharpness.py

* Update random_sharpness

* Add test cases

* Fix failed test case

* Add random_shear processing layer (#20702)

* Add random_shear processing layer

* Update method name

* Fix failed test case

* Fix failed test case

* Fix failed test case

* Fix the aggregation in the codebase (#20703)

* Bump the github-actions group with 2 updates (#20707)

Bumps the github-actions group with 2 updates: [actions/upload-artifact](https://github.com/actions/upload-artifact) and [github/codeql-action](https://github.com/github/codeql-action).


Updates `actions/upload-artifact` from 4.4.3 to 4.5.0
- [Release notes](https://github.com/actions/upload-artifact/releases)
- [Commits](https://github.com/actions/upload-artifact/compare/b4b15b8c7c6ac21ea08fcf65892d2ee8f75cf882...6f51ac03b9356f520e9adb1b1b7802705f340c2b)

Updates `github/codeql-action` from 3.27.5 to 3.28.0
- [Release notes](https://github.com/github/codeql-action/releases)
- [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md)
- [Commits](https://github.com/github/codeql-action/compare/f09c1c0a94de965c15400f5634aa42fac8fb8f88...48ab28a6f5dbc2a99bf1e0131198dd8f1df78169)

---
updated-dependencies:
- dependency-name: actions/upload-artifact
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: github-actions
- dependency-name: github/codeql-action
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: github-actions
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* fix: Torch MPS backend failing test (#20709)

* implement transform_bounding_boxes for random_shear (#20704)

* Fix torch GPU CI

* Update BackupAndRestore class example (#20714)

* Update BackupAndRestore class example

* Update backup_and_restore.py

---------

Co-authored-by: François Chollet <[email protected]>

* Update version number

* Refactor `keras/src/export/export_lib` and add `export_onnx` (#20710)

* Refactor export_lib and add export_onnx

Add tf2onnx requirements

* Add onnxruntime dep

* Update numpy dep

* Resolve comments
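
A minimal sketch of the per-format dispatch that a refactor like this enables, with stub exporters standing in for the real `saved_model.py` and `onnx.py` modules (the function body and return strings are illustrative, not Keras's actual code):

```python
def export(model, filepath, format="tf_saved_model"):
    # Each format delegates to its own exporter module after the
    # refactor; these lambdas are stubs for the real exporters.
    exporters = {
        "tf_saved_model": lambda m, p: f"saved_model:{p}",
        "onnx": lambda m, p: f"onnx:{p}",
    }
    if format not in exporters:
        raise ValueError(
            f"Unrecognized format={format!r}. "
            f"Expected one of {sorted(exporters)}."
        )
    return exporters[format](model, filepath)
```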

* Patch `tf2onnx` to ensure compatibility with `numpy>=2.0.0` (#20725)

* Patch tf2onnx to support numpy 2

* Fix warnings

* Update export_onnx

* Add build method to supress warning (#20729)

* Specify window_length dtype requirement in tf.keras.ops.istft in math.py (#20728)

The `window_length` parameter in `tf.keras.ops.istft` requires `tf.int32` dtype, but this isn't documented. This can cause unexpected `ValueError` when using `tf.int64` and `tf.int16`

Here is the Example case:
```
import tensorflow as tf

input_dict = {
    'stfts': tf.constant([[-0.87817144+1.14583987j, -0.32066484+0.25565411j]], dtype=tf.complex128),
    'frame_length': tf.constant(256, dtype=tf.int16),
    'frame_step': tf.constant(5120,dtype=tf.int64)
}
result = tf.signal.inverse_stft(**input_dict)
print(result)
```
The code throws the following error:
```
ValueError: window_length: Tensor conversion requested dtype int32 for Tensor with dtype int64
```

* Add rand_augment processing layer (#20716)

* Add rand_augment init

* Update rand_augment init

* Add rand_augment

* Add NotImplementedError

* Add some test cases

* Fix failed test case

* Update rand_augment

* Update rand_augment test

* Fix random_rotation bug

* Add build method to supress warning.

* Add implementation for transform_bboxes

* Fixing batch_dim_name attribute (#20674)

* fixing wrong trainer assumption that batch dim is always the first one in the mesh

* need functools partial

* lint

* fix test failure when distribution=None

* lint2

* fix for test failure

* added data sharding for 3D+ meshes

* lint3

* added @property for batch_dim_name + refactoring

* fix typo

* Add support for `dtype` / `DTypePolicy` to `JaxLayer` and `FlaxLayer`. (#20732)

The `dtype` / `DTypePolicy` is applied to all float variables.

* Allow dynamic shape in `STFTSpectrogram` layer. (#20736)

by simply using `ops.shape(x)` instead of `x.shape`.

* Remove duplicate export tests in `model_test`. (#20735)

The same te…
fchollet added a commit that referenced this pull request May 12, 2025
* Add random_posterization processing layer (#20688)

* Add random_posterization processing layer

* Add test cases

* correct failed case

* Fix torch gpu CI (#20696)

* Add random_sharpness processing layer (#20697)

* Add random_sharpness.py

* Update random_sharpness

* Add test cases

* Fix failed test case

* Add random_shear processing layer (#20702)

* Add random_shear processing layer

* Update method name

* Fix failed test case

* Fix failed test case

* Fix failed test case

* Fix the aggregation in the codebase (#20703)

* Bump the github-actions group with 2 updates (#20707)

Bumps the github-actions group with 2 updates: [actions/upload-artifact](https://github.com/actions/upload-artifact) and [github/codeql-action](https://github.com/github/codeql-action).


Updates `actions/upload-artifact` from 4.4.3 to 4.5.0
- [Release notes](https://github.com/actions/upload-artifact/releases)
- [Commits](https://github.com/actions/upload-artifact/compare/b4b15b8c7c6ac21ea08fcf65892d2ee8f75cf882...6f51ac03b9356f520e9adb1b1b7802705f340c2b)

Updates `github/codeql-action` from 3.27.5 to 3.28.0
- [Release notes](https://github.com/github/codeql-action/releases)
- [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md)
- [Commits](https://github.com/github/codeql-action/compare/f09c1c0a94de965c15400f5634aa42fac8fb8f88...48ab28a6f5dbc2a99bf1e0131198dd8f1df78169)

---
updated-dependencies:
- dependency-name: actions/upload-artifact
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: github-actions
- dependency-name: github/codeql-action
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: github-actions
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* fix: Torch MPS backend failing test (#20709)

* implement transform_bounding_boxes for random_shear (#20704)

* Fix torch GPU CI

* Update BackupAndRestore class example (#20714)

* Update BackupAndRestore class example

* Update backup_and_restore.py

---------

Co-authored-by: François Chollet <[email protected]>

* Update version number

* Refactor `keras/src/export/export_lib` and add `export_onnx` (#20710)

* Refactor export_lib and add export_onnx

Add tf2onnx requirements

* Add onnxruntime dep

* Update numpy dep

* Resolve comments

* Patch `tf2onnx` to ensure compatibility with `numpy>=2.0.0` (#20725)

* Patch tf2onnx to support numpy 2

* Fix warnings

* Update export_onnx

* Add build method to suppress warning (#20729)

* Specify window_length dtype requirement in tf.keras.ops.istft in math.py (#20728)

The `window_length` parameter in `tf.keras.ops.istft` requires the `tf.int32` dtype, but this isn't documented. This can cause an unexpected `ValueError` when using `tf.int64` or `tf.int16`.

Here is an example case:
```python
import tensorflow as tf

input_dict = {
    'stfts': tf.constant([[-0.87817144+1.14583987j, -0.32066484+0.25565411j]], dtype=tf.complex128),
    'frame_length': tf.constant(256, dtype=tf.int16),
    'frame_step': tf.constant(5120,dtype=tf.int64)
}
result = tf.signal.inverse_stft(**input_dict)
print(result)
```
The code throws the following error:
```
ValueError: window_length: Tensor conversion requested dtype int32 for Tensor with dtype int64
```

* Add rand_augment processing layer (#20716)

* Add rand_augment init

* Update rand_augment init

* Add rand_augment

* Add NotImplementedError

* Add some test cases

* Fix failed test case

* Update rand_augment

* Update rand_augment test

* Fix random_rotation bug

* Add build method to suppress warning.

* Add implementation for transform_bboxes

* Fixing batch_dim_name attribute (#20674)

* fixing wrong trainer assumption that batch dim is always the first one in the mesh

* need functools partial

* lint

* fix test failure when distribution=None

* lint2

* fix for test failure

* added data sharding for 3D+ meshes

* lint3

* added @property for batch_dim_name + refactoring

* fix typo

* Add support for `dtype` / `DTypePolicy` to `JaxLayer` and `FlaxLayer`. (#20732)

The `dtype` / `DTypePolicy` is applied to all float variables.

* Allow dynamic shape in `STFTSpectrogram` layer. (#20736)

by simply using `ops.shape(x)` instead of `x.shape`.

* Remove duplicate export tests in `model_test`. (#20735)

The same tests exist at:
- https://github.com/keras-team/keras/blob/master/keras/src/export/saved_model_test.py#L66
- https://github.com/keras-team/keras/blob/master/keras/src/export/onnx_test.py#L62

The goal is to isolate the use of `onnxruntime` to a single file, `onnx_test.py`.

* Add OpenVINO into README.md (#20739)

* Add OpenVINO into README.md

Signed-off-by: Kazantsev, Roman <[email protected]>

* Update README.md

---------

Signed-off-by: Kazantsev, Roman <[email protected]>

* Removed duplicated Example titles in the metrics.MeanIoU method (#20738)

Removed duplicated Example titles in the metrics.MeanIoU method

* Fix JAX GPU CI and make formatter happy (#20749)

* Fix JAX GPU CI

* Makes formatter happy

* Makes formatter happy - 2

* Add checks to deserialization. (#20751)

In particular for functional models.

* feat(ops): Add keras.ops.numpy.rot90 operation (#20723) (#20745)

* feat(ops): Add keras.ops.image.rot90 operation

Adds a new operation to rotate tensors by 90 degrees in the specified plane:
- Implements rot90 operation in keras.ops.image module
- Adds support for multiple rotations (k parameter) and custom axes
- Matches numpy.rot90 behavior and API for consistency
- Adds comprehensive test coverage including batch images support
- Handles input validation for tensor dimensions and axes
- Supports symbolic tensor execution
The operation follows the same interface as numpy.rot90 and tf.image.rot90:
rot90(array, k=1, axes=(0, 1))

* feat: add JAX, NumPy and PyTorch backends for rot90

Add implementations of rot90() for multiple backend frameworks:
- JAX backend implementation
- NumPy backend implementation
- PyTorch backend implementation

* Move rot90 from image to numpy ops

Move rot90 operation to numpy.py files in backend implementations since it's a numpy op (https://numpy.org/doc/stable/reference/generated/numpy.rot90.html). Now exported as both keras.ops.rot90 and keras.ops.numpy.rot90.

* Fix dtype conflict in PyTorch backend's rot90 function

Resolved the 'Invalid dtype: object' error by explicitly using  to avoid naming conflicts with the custom  function.

* Replace experimental NumPy rot90 with core TF ops

Replace tf.experimental.numpy.rot90 with core TF ops for XLA compatibility.
Use convert_to_tensor for input handling.

* Fix code format

* Fix code format following ruff update

* Fix Torch GPU CI

* Update API ref
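Since the op is documented to match `numpy.rot90`, its semantics can be illustrated directly with NumPy (a reference illustration, not the Keras implementation):

```python
import numpy as np

x = np.arange(6).reshape(2, 3)      # [[0, 1, 2], [3, 4, 5]]

r1 = np.rot90(x, k=1, axes=(0, 1))  # one counter-clockwise rotation
r4 = np.rot90(x, k=4)               # four rotations are the identity

# Batched images: rotate only the spatial plane of a (batch, H, W) tensor.
batch = np.stack([x, x])
rb = np.rot90(batch, k=1, axes=(1, 2))
```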

* Fix flaky `JaxLayer` test. (#20756)

The `DTypePolicy` test produces lower precision results.

* Fix serialization of domain packages. (#20755)

Not all of their symbols are exported.

* Preliminary parts needed for ragged support, including densification. (#20757)

Added `ragged` option to `KerasTensor`, `InputLayer` and `convert_to_tensor`. The logic is the same as for sparse tensors.

Fixes https://github.com/keras-team/keras/issues/20731

* Disallow pickle loading in npz files

* Implemented more generic asset tracking mechanism in saved model export. (#20758)

This new implementation is in line with what was done in Keras 2. It tracks all `TrackableResource`s, and lookup tables and hashmaps are subclasses of `TrackableResource`.

This allows users to attach preprocessing functions that are not solely based on Keras preprocessing layers.

* [Keras Ops] Add einops-style `rearrange()` to `keras.ops` (#20733)

* Add einops-style rearrange to keras.ops.einops

* Address PR comments

* Add any_symbolic_tensors() check on call

* Pass all arguments in symbolic_call

* Remove constructor and fix call

* Add basic couple of tests

* Add more tests

* Add examples to docstring

* Skip tests if backend is openvino

* Remove numpy from tests in lieu of keras.ops

* Skip tests for openvino when the testing operation isn't supported

* Remove all type annotations for consistency. (#20762)

Some tools don't like the mix of code with and without type hints.

* Porting TF fake_quant_with_min_max functions (#20641)

* QAT (squashed this time) (#1)

* adds fake_quant_with_min_max functions from TF to keras3

* Addresses PR review comments

* drops another type hint

* swaps out if statements, change float() to ops.cast and adds fake_quant_with_min_max_vars function

* fix missed if statement, adds gradient tests via main function for tf and torch

* fix unbound variable error when not using torch or tf backend (#2)

Refactor to use backend specific gradient functions in tests and merges logic into single function

* More QAT function revisions (#3)

This PR addresses review feedback to fix implementation and to move tests to using named_parameters rather  than individual functions.

* Qat revisions (#4)

Adds axis param and fixes logic for per channel function

* updated implementation

* removed redundant functions
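For intuition, fake quantization rounds values to an integer grid defined by `[min, max]` and maps them back to floats. A minimal NumPy sketch of the idea (simplified: it omits the gradient handling and the range nudging the real TF op performs; `fake_quant` is a hypothetical helper name):

```python
import numpy as np

def fake_quant(x, min_val, max_val, num_bits=8):
    """Simulate quantization: clip to [min_val, max_val], snap to the
    (2**num_bits - 1)-step grid, and map back to float."""
    quant_max = 2**num_bits - 1
    scale = (max_val - min_val) / quant_max
    x = np.clip(x, min_val, max_val)
    q = np.round((x - min_val) / scale)
    return q * scale + min_val

y = fake_quant(np.array([0.0, 0.1, 0.5, 2.0]), min_val=0.0, max_val=1.0)
```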

* Add aug_mix processing layer (#20759)

* Add implementation for AugMix

* Update implementation for aug_mix

* Update description for aug_mix

* Fix some issues that was from review

* `JaxLayer` now uses the global dtype policy by default. (#20767)

All floats will now follow the global dtype policy unless a specific dtype policy is passed to the layer.

* fix(ops): Fix issue with map_coordinates for uint8 dtype (#20768)

The issue arose from improper handling of out-of-bound coordinates, causing invalid indexing when using dtype='uint8' with TensorFlow backend.

Changes made:
- Improved processing of coordinates to handle all fill_mode cases, including 'reflect', correctly.
- Simplified the logic for gathering and applying fill values, ensuring consistent behavior across data types.
- Added test cases for uint8, float32, and various fill_mode settings to validate the fix.

Tests for uint8 and float32 now succeed, and the logic for nearest fill_mode and manual casting is also fixed.

Fixes #20608
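Out-of-bound handling for `fill_mode="reflect"` boils down to folding indices back into range. A sketch of the edge-repeating reflect convention (`d c b a | a b c d | d c b a`); `reflect_index` is a hypothetical helper, not the actual backend code:

```python
def reflect_index(i, n):
    """Fold index i into [0, n) by mirroring at the edges, with the
    edge sample repeated: pattern d c b a | a b c d | d c b a."""
    period = 2 * n
    j = i % period  # Python's % already returns a non-negative result
    return j if j < n else period - 1 - j

# Indices -4..7 folded into a length-4 axis.
folded = [reflect_index(i, 4) for i in range(-4, 8)]
```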

* Removed duplicated Example titles in metrics.BinaryIoU (#20775)

* fix(ops): Fix inconsistent padding calculation in PyTorch backend ops (#20774)

* Fix "same" padding torch issue

* format

* fix type

* add condition for channels first and last

* fix(ops): Fix inconsistent padding calculation in PyTorch backend ops

Was able to still reproduce the error, the PyTorch backend had inconsistent
behavior between static shape inference and dynamic execution for pooling
operations, particularly with 'same' padding and non-unit strides, figured
that the root cause was by incorrect padding calculation logic that didn't
properly handle asymmetric padding cases.

Key changes:
- Rewrote _compute_padding_length() to handle stride-based padding
- Fixed padding calculation to properly support asymmetric padding cases
- Standardize channels_first/channels_last conversion in pooling ops
- Cleaned up padding application in _apply_same_padding()
- Added proper handling of data_format throughout pooling pipeline

This fixes the issue where MaxPooling2D with 'same' padding would produce
different shapes between compute_output_shape() and actual execution
(e.g. (1,5,2,2) vs (1,5,2,1)).

Rebased on top of Sachin's September 2024 PR to incorporate
latest keras:master changes.

---------

Co-authored-by: sachin prasad <[email protected]>
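The asymmetric "same" padding arithmetic behind the fix can be sketched in a few lines (a reference sketch, not the PyTorch backend code; `same_padding` is a hypothetical name):

```python
def same_padding(in_size, kernel, stride):
    """Left/right padding so that output = ceil(in_size / stride).

    The total padding can be odd, in which case the extra pixel goes
    on the right -- the asymmetric case the fix handles."""
    out = -(-in_size // stride)  # ceil division
    total = max((out - 1) * stride + kernel - in_size, 0)
    left = total // 2
    return left, total - left

# Length 5, kernel 2, stride 2: output 3, one pixel of padding on the right.
pads = same_padding(5, 2, 2)
```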

* Improve `fake_quant_with_min_max_vars` (#20772)

* Fix fake_quant_with_min_max_vars

* Add FakeQuantWithMinMaxVars operation and use shortcut for TF backend.

* Fix memory leaks in `model.evaluate`. (#20779)

The history is only used in `model.fit`, no need to create it for `evaluate` and `predict`. The history is attached to the model and therefore lives for as long as the model is around.

The executor used in `CallbackList` was never shut down, causing it to keep a thread around, which in turn had thread locals that were leaked.

* fix(applications): Improve validation and error handling for ConvNeXt weights and fix broadcasting in EfficientNetV2 (#20785)

* fix(applications): Improve validation and error handling for ConvNeXt weights

- Validate architecture and weights compatibility before API request.
- Enhance error messages for mismatched model name and weights.

* fix: Correct spurious change, and fix mean/variance shapes for channels_first preprocessing in EfficientNetV2

- Reshaped mean and variance tensors to [1,3,1,1] for proper broadcasting in channels_first mode.
- Ensured compatibility with channels_last format while addressing broadcasting errors.

* fix ciou implementation bug (#20784)

* Add cut_mix processing layer (#20776)

* Add cut_mix processing layer

* Update implementation

* Update logic and refactoring

* correct test case failed.

* Update cut_mix.py

* correct gpu test case failed.

---------

Co-authored-by: François Chollet <[email protected]>

* Add random_invert layer (#20787)

* fix(metrics): Fix BinaryAccuracy metric to handle boolean inputs (#20782)

* Fix BinaryAccuracy metric to handle boolean inputs

Previously, BinaryAccuracy would return incorrect results when given boolean
inputs in JAX backend, and would raise errors in TensorFlow backend. This was
because the metric expects numerical values (floats/integers) but wasn't
properly handling boolean array inputs.

Fix by casting y_true and y_pred to floatx() in MeanMetricWrapper.update_state().
This ensures consistent behavior across backends and proper handling of boolean
inputs.

* fix: Make the linter happy :)

* fix: Align update_state casting with metric's data type

* Fix issue with Masking layer with Tensor as `mask_value` (#20791)

* Fix issue with Masking layer with Tensor as `mask_value`

* fix formatting

* Fix reference to nonexistent namespace (#20810)

The error message produced when using, for example, a tensorflow math
operation in a layer referenced a nonexistent keras.operations namespace
(which makes fixing the issue a lot more difficult for newcomers, given
that they will encounter it while following examples from the book Deep
Learning with Python, 2nd edition). The correct name of the implied
namespace is keras.ops.

* extract metrics update logic into a helper method (#20805)

this change will allow users to customize what happens in the step
function while being able to use existing metrics update logic
without needing to duplicate it

Co-authored-by: Zoe Kendall <[email protected]>

* Turn the attribute `_return_attention_scores` into an argument (#20803)

* Add random_erasing layer (#20798)

* Add initial random_erasing

* Update random_erasing logic

* Update description and add test case

* fix value range bug

* add seed for random fill_value

* fix torch backend resize with `pad_to_aspect_ratio` set to `True` (#20797)

* fix torch backend resize with `pad_to_aspect_ratio` set to `True`

* fix axis for single image

* Fix issue for when running gpu

* add missing device type

* add unit test when pad_to_aspect_ratio set to True

* fix numpy backend

* nit

* fix api method

* fix if condition for channels_first

* Update fill_mode argument default value in RandomZoom class (#20796)

* Update fill_mode argument default value in RandomZoom class

* Update fill_mode argument default value in RandomZoom document

* fix(ops): Handle floating-point edge cases in ops.argmax() (#20808)

* fix(ops): Handle floating-point edge cases in argmax

- Adjust input for negative zero values in argmax.

- Modify implementation to use core ops with floating-point handling.

* fix: Make the linter happy :)

* fix: Resolve spurious change with TensorFlow graph mode compatibility issues

- Improved negative zero handling and axis resolution with graph-compatible tensor ops.

* test: Add negative zero handling test for backends (not supported for OpenVINO)

* fix: Change to self.assertEqual

* fix(ops): Fix ops.argmin() handling of subnormal float values in Keras backends (#20812)

- Update JAX and NumPy backends to handle subnormal float comparisons

- Add test case to verify subnormal float value handling

* Add random_gaussian_blur layer (#20817)

* Add random_gaussian_blur

* Update description and add test cases

* Correct failed test case

* fix(layers): Fix incorrect masked mean/variance in BatchNormalization layer (#20815)

* fix(layers): Fix incorrect masked mean/variance in BatchNormalization layer

Update masked moments calculation to properly account for broadcast dimensions when summing mask weights.

Added test to verify broadcast mask handling produces zero-centered outputs.

* change: skip test for OpenVINO

* fix: Fix OpenVINO compatibility in BatchNormalization layer ops

- Convert tuple reduction axes to list format for compatibility with OpenVINO's constant op

- Remove OpenVINO skip decorator after fixing axis format

* fix: Normalize reduction_axes to list during build

Avoid repeated type checks and conversions during forward pass.

* fix: Double type-casting
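The corrected quantity is a mask-weighted mean/variance, with the mask broadcast against the reduction axes. In NumPy terms (an illustration of the math, not the layer code):

```python
import numpy as np

x = np.array([[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]])
mask = np.array([1.0, 1.0, 0.0])[:, None]   # third sample is padding

# Weights sum to 1 over the reduction axis, so padded rows drop out.
weight = mask / mask.sum(axis=0, keepdims=True)
mean = (x * weight).sum(axis=0)
var = (((x - mean) ** 2) * weight).sum(axis=0)
```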

* Update SECURITY.md

* Fix for deserializing custom functions serialized with Keras <= 3.6. (#20824)

Fixes https://github.com/keras-team/keras/issues/20806

This a workaround for an incompatibility between 3.6 and 3.7 introduced by serialization bug fix https://github.com/keras-team/keras/pull/20406

* Fix jax version (#20827)

* Update requirements-jax-cuda.txt jax version

* Update requirements-jax-cuda.txt

* Fix CI breakage with torch-xla. (#20828)

Error with torch 2.6:
```
ImportError: /opt/hostedtoolcache/Python/3.9.21/x64/lib/python3.9/site-packages/_XLAC.cpython-39-x86_64-linux-gnu.so: undefined symbol: _ZN5torch8autograd12VariableInfoC1ERKN2at6TensorE

/opt/hostedtoolcache/Python/3.9.21/x64/lib/python3.9/site-packages/torch_xla/__init__.py:20: Impo
```

* Add `signbit` and fix `argmin` and `argmax` (#20821)

* Add `signbit` op and fix `argmin` and `argmax`.

* Add APIs

* Fix CI

* Fix torch CI

* Simplify the logic

* Fix TF GPU CI
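The negative-zero problem exists because IEEE-754 comparison treats `-0.0 == 0.0` as true, so `argmax`/`argmin` alone cannot order the two; `signbit` is the primitive that distinguishes them. A NumPy illustration of the edge case (the adjustment shown is one possible fix, not necessarily the one used here):

```python
import numpy as np

a = np.array([-0.0, 0.0])

equal = a[0] == a[1]       # True: comparison ignores the sign of zero
signs = np.signbit(a)      # [True, False] tells the two zeros apart
naive = np.argmax(a)       # 0: the tie keeps the first index

# One fix: nudge negative zeros just below positive zeros before argmax.
key = np.where(np.signbit(a) & (a == 0), -np.finfo(a.dtype).tiny, a)
adjusted = np.argmax(key)  # now picks the +0.0 at index 1
```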

* Pin version of torch-xla to 2.5.1. (#20834)

This is needed to make it compatible with the pinned version of torch we're using.

Note that torch-xla 2.6 doesn't support GPU https://pypi.org/project/torch-xla/2.6.0/
GPU support will be coming back with 2.7.

* fix(trainers): Add support for DistributedDatasetsFromFunction in data adapters (#20829)

The is_tf_dataset() function in data adapters now recognizes DistributedDatasetsFromFunction as a valid TensorFlow dataset type. This allows for properly handling distributed datasets created via strategy.distribute_datasets_from_function()

- Added test case to verify distributed datasets from function support

* Bump the github-actions group with 2 updates (#20840)

Bumps the github-actions group with 2 updates: [actions/upload-artifact](https://github.com/actions/upload-artifact) and [github/codeql-action](https://github.com/github/codeql-action).


Updates `actions/upload-artifact` from 4.5.0 to 4.6.0
- [Release notes](https://github.com/actions/upload-artifact/releases)
- [Commits](https://github.com/actions/upload-artifact/compare/6f51ac03b9356f520e9adb1b1b7802705f340c2b...65c4c4a1ddee5b72f698fdd19549f0f0fb45cf08)

Updates `github/codeql-action` from 3.28.0 to 3.28.8
- [Release notes](https://github.com/github/codeql-action/releases)
- [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md)
- [Commits](https://github.com/github/codeql-action/compare/48ab28a6f5dbc2a99bf1e0131198dd8f1df78169...dd746615b3b9d728a6a37ca2045b68ca76d4841a)

---
updated-dependencies:
- dependency-name: actions/upload-artifact
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: github-actions
- dependency-name: github/codeql-action
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: github-actions
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* update random_erasing factor description. (#20837)

* [OpenVINO backend] Provide more granular tests exclusion mechanism (#20845)

* [OpenVINO backend] Provide more granular tests exclusion mechanism

This mechanism is required for open-source community who will provide PRs for each operation.
In order to validate PR with the concrete operation support, they should remove the corresponding line.

Signed-off-by: Kazantsev, Roman <[email protected]>

* Optimize code in conftest.py

Signed-off-by: Kazantsev, Roman <[email protected]>

* Format code file

Signed-off-by: Kazantsev, Roman <[email protected]>

* Update keras/src/backend/openvino/excluded_concrete_tests.txt

---------

Signed-off-by: Kazantsev, Roman <[email protected]>

* Use Python 3.10 for testing environment (#20846)

* Use Python 3.10 for testing environment.

* Fix TF GPU CI

* Update requirements-jax-cuda.txt (#20852)

* Don't duplicate frozen parameters during predict() (#20851)

On the Jax backend we were not using donate_argnums during predict.
This works when a model is mostly trainable, but when a model is mostly
or all frozen, this will result in 2x the memory jump (which is why
we use donate_argnums for fit and evaluate).

This change adds donate_argnums to the predict function to avoid the
memory spike. But because this means all incoming state (including the
trainable variables) will be deleted by jax, this means we need to
sync the trainable variables state much like in fit and evaluate. An
alternative would be to change the predict_step signature (so we
could only donate non-trainable variables), but this would be a
breaking change and confusing.

* Add labels to the plot image gallery method when y_true or y_pred is provided (#20853)

* Fix convnext to work with any custom input tensors (#20854)

* Add applications_test.py test for custom input tensors that currently breaks convnext networks

* Fix convnext to work with any custom input tensors

* Fix code formatting

* Fix code formatting

* Fix code formatting

* Prevent information leakage and improve the ONNX export for the torch backend (#20859)

* Use a better setting for `verbose` and improve the onnx export for the torch backend

* Fix torch CI

* Add Rematerialization to Keras (#20743)

* add remat op

* update test

* remove print statements

* remove memory testing

* run api_gen.sh

* update docstring

* add remat scope

* code reformat

* update scope to return all configs

* add remat wrapper to layer

* add output size mode

* add activation mode to remat

* add warnings and ops to numpy and openvino backend

* fix torch implementation

* update tests

* fix tests

* update numpy and openvino

* address review comments

* fix indentation

* skip tests in numpy and openvino

* also wrap quantized call

* fix jax test

* fix test

* update docstring and example and expose rematscope

* run api_gen

* address review comments

* update core.py

* fix tests

* update get remat mode

* update exposed apis

* update docstring

* run api_gen.sh

* address review comments

* add mock tests to verify remat being called

* address review comments

* update quantization test

* add functional model test

* skip tests for numpy and openvino

* update remat docstring

* fix torch test

* rollback changes to test

* fix torch test

* fix format errors

* move remat wrapping logic to operations.py

* change jax cuda version to see if array deallocation gets resolved

* disable jax gpu test

* fix jax version

* Add random_perspective layer (#20836)

* Add random_perspective layer

* Add range check for scale

* Update quote for description string

* Update transform_bounding_boxes method.

* Clear JAX state sharding after `fit`, `evaluate` and `predict`. (#20865)

The state sharding is leaked at the end of `fit`, `evaluate` and `predict`. The values are not reused if `fit`, `evaluate` and `predict` is called again.

* add backticks to docstring string code keywords (#20863)

* add backticks to docstring string code keywords

* Update remat.py

* fix(layers): Update Conv2D docstring to clarify numerical precision across backends (#20867)

* fix(layers): Update Conv2D docstring to clarify numerical precision across backends

Clarify that Conv2D operations may exceed the documented 1e-7 precision difference across backends

Document that large convolutions can show notable variations due to accumulated floating-point operations

* Update conv2d.py

---------

Co-authored-by: François Chollet <[email protected]>

* Remove `torchvision` dep and simplify `resize` and `rgb_to_grayscale` in torch backend (#20868)

* Remove `torchvision` dependency and simplify `resize`.

* Add pillow as the testing requirement

* fix time_distributed layer with mask and partial_batch_size (#20765)

* fix time_distributed layer with mask and partial_batch_size

* Fix test fails for non TF backends

* Fix formatting issue

* test case and inline import of TF

* Disable testcase for Numpy backend

* Fix lint error

* Fix Torch GPU CI (#20877)

* fix solve method on linalg (#20879)

* [OpenVINO backend] Support numpy.amax and numpy.amin (#20883)

Signed-off-by: Kazantsev, Roman <[email protected]>

* `HashedCrossing` layer preserves the static batch size when known. (#20889)

Previously, the output of `HashedCrossing` would always have `None` batch size as a result of the underlying TensorFlow `tf.sparse.cross_hashed`.

The previous reshaping logic in `HashedCrossing` would fix the last dimension (expected to be 1) but not the batch dimension.

* `TextVectorization` with `output_sequence_length` returns outputs with a static last dimension of `output_sequence_length`. (#20892)

When handling a ragged intermediate tensor, the padding code would still be executed even though `Ragged.to_tensor` already pads correctly. Changed control flow to skip padding.

When handling a dense intermediate tensor, the padding is applied from the dynamic shape. Added `set_shape` to apply the static `output_sequence_length`.

* fix(ops): Fix TensorFlow backend keras.ops.rot90 shape transformation and improve test coverage (#20882)

* fix(ops): Correct TF rot90 shape transformation and improve test coverage

Fix shape handling in TF rot90 to correctly swap height/width dimensions based on k rotations.

Refactor test suite to use parameterized test cases and cover edge conditions more thoroughly.

* refactor: Make linter happy :)

* fix ifft2 op with TF backend (#20905)

* docs: add params to Sequential.pop docstring (#20896)

* docs: add params to Sequential.pop docstring in  sequential.py

* Remove trailing white space in Sequential.pop docstring in sequential.py

* Remove trailing white space in sequential.py

* docs: add the default argument value in Sequential.pop docstring in sequential.py

* style: reformat sequential.py with black

* docs: fix Sequential.pop docstring formatting

* Always allow `ExportArchive.track` to track TensorFlow resources. (#20906)

Previously, `track` would only work with `Layer`s or `Model`s unless the backend was TensorFlow. It would raise an error on JAX for instance.

It is now possible to export saved models with a mix of Keras models and TensorFlow native preprocessing involving resources even with the JAX backend.

- Added example on how to use `ExportArchive` to export a function combining a model with some TensorFlow native preprocessing with a resource.
- Added unit test testing the combining of a model with some TensorFlow native preprocessing with a resource.
- Renamed `track` to `_track_layer` in backend specific `ExportArchive` classes because that is the use case.
- Use `super()` instead of `BackendExportArchive` for consistency.

* Add iterations property to LossScaleOptimizer (#20901)

Fixes #20878. TensorBoard isn't able to report the correct step because
this optimizer doesn't forward the `iterations` property.
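The fix is plain delegation: expose the inner optimizer's counter as a read-only property. A toy sketch of the pattern (hypothetical class names, not the Keras classes):

```python
class InnerOptimizer:
    """Stand-in for the wrapped optimizer; counts apply() calls."""
    def __init__(self):
        self.iterations = 0

    def apply(self):
        self.iterations += 1

class LossScaleWrapper:
    """Forwards the step counter so callbacks such as TensorBoard
    report the true training step."""
    def __init__(self, inner):
        self.inner = inner

    @property
    def iterations(self):
        return self.inner.iterations

opt = LossScaleWrapper(InnerOptimizer())
opt.inner.apply()
opt.inner.apply()
```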

* Fix cloning with compiled sequential model (#20888)

* Fix cloning with compiled sequential model

* Fix cloning with compiled functional model

* remove redundant code

* Remove redundant code

* Add perspective_transform for ops (#20899)

* Add perspective_transform for ops

* Add perspective_transform for torch

* Add perspective_transform for jax

* Add perspective_transform for ops

* Add perspective_transform test

* Fix failed test cases

* Fix failed test on torch ci
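Under the hood a perspective transform maps points through a 3x3 homography. The geometry can be sketched in NumPy (a reference sketch of the math, not the Keras op; `apply_homography` is a hypothetical name):

```python
import numpy as np

def apply_homography(H, points):
    """Map (N, 2) points through a 3x3 projective matrix H."""
    pts = np.hstack([points, np.ones((len(points), 1))])  # homogeneous coords
    mapped = pts @ H.T
    return mapped[:, :2] / mapped[:, 2:3]  # divide out the projective factor

# A pure translation by (2, 3) expressed as a homography.
H = np.array([[1.0, 0.0, 2.0],
              [0.0, 1.0, 3.0],
              [0.0, 0.0, 1.0]])
out = apply_homography(H, np.array([[0.0, 0.0], [1.0, 1.0]]))
```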

* Update random_perspective to use ops.perspective_transform (#20915)

* Update get_perspective_matrix method

* Update bbox logic

* refactoring random_perspective

* apply tensor cast

* add dtype conversion

* Update base scale factor

* correct failed test case

* correct failed test case

* correct failed test case

* Remove scale zero test case

* update the logic to use perspective_transform on image layer

* Update test cases

* Only load OpenVINO excludes file when backend is "openvino". (#20923)

It is not necessary to decorate excluded openvino tests with other backends.

* Fix `masking_test.py` saving a file in the current folder. (#20924)

Tests should only write files in a temp folder.

* Recognize placer as a remote location (#20926)

* Recognize placer as a remote location

Recognize `/placer` paths as remote locations,
allowing users to save Keras models directly to
Placer paths.

* Running ./shell/format.sh

* [OpenVINO backend] Support arctan2. (#29010) (#20921)

* support arctan2 ov backend

* fix format

* fix corner case: both x1 and x2 equal zero

* [OpenVINO Backend] Include NumpyDtype tests (#20929)

Signed-off-by: Kazantsev, Roman <[email protected]>

* Remove unused dependency (#20932)

* Fix failing jax remat test (#20935)

* add jit compile for jax training

* change to dense

* [Keras Ops] Add `keras.ops.polar` operation  (#20930)

* Add polar operation and tests

* Fix values for correctness test

* Specify dtype
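`polar` constructs complex numbers from magnitude and phase, i.e. `abs * exp(i * angle)` (assuming the `torch.polar` convention). The NumPy equivalent of the operation:

```python
import numpy as np

def polar(abs_, angle):
    """Complex tensor from magnitude and phase: abs * exp(1j * angle)."""
    return abs_ * np.exp(1j * angle)

z = polar(np.array([1.0, 2.0]), np.array([0.0, np.pi / 2]))
```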

* merge conflicts (#20934)

Co-authored-by: Mohamed I. Hammad <[email protected]>

* [Openvino Backend] support arange, modify dtype check (#20941)

* Fix mean metrics to allow non-tensor inputs (#20954)

* Fix tril/triu ops (#20900)

* Fix tril/triu ops

* Small change

* Facepalm

* Handle tensors

* Add comment

* Address comments

* Fix `BinaryAccuracy` to handle boolean inputs. (#20956)

This is a follow up to https://github.com/keras-team/keras/pull/20782 and a replacement for https://github.com/keras-team/keras/pull/20782

We cannot cast `y_pred` and `y_true` to the expected output dtype in `MeanMetricWrapper`. Some metrics expect integers (indices or IDs for instance) and fail if `y_pred` and `y_true` are provided as floats.

It is the responsibility of the metric function to cast as needed.

In this case, the correct approach in `BinaryAccuracy` is to use the regular type promotion rules to ensure that the comparison between `y_pred` and `threshold` is done without losing precision. `ops.greater` already does the type promotion correctly. Previously, `threshold` was incorrectly cast to the `y_pred` dtype, which in this case would lower its precision.
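A hedged illustration of the promotion point above, using NumPy as a stand-in for the backend ops (the values and dtypes are made up for the example):

```python
import numpy as np

# Casting a float64 threshold down to the prediction dtype (the old
# behavior) can make the comparison lose precision; promoting y_pred
# instead keeps the comparison exact.
y_pred = np.array([0.5004], dtype=np.float16)
threshold = 0.5003  # Python float, i.e. float64

lossy = y_pred > np.float16(threshold)            # threshold cast down
promoted = y_pred.astype(np.float64) > threshold  # type promotion
```

Here both values round to the same float16, so the lossy comparison returns `False` while the promoted one correctly returns `True`.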

* Add gaussian_blur for image (#20943)

* Add Gaussian Blur

* Add Gaussian Blur for ops

* Add gaussian_blur test

* Update gaussian_blur args

* Correct bug for numpy implementation

* Update argument base value

* [OpenVINO backend] Support arctan2, pass the NumpyDtypeTest::arctan2 test (#20928)

* pass NumpyDtypeTest::arctan2 and add some test cases in NumpyTwoInputOpsCorrectnessTest::arctan2

* newly add NumpyDtypeTest::test_arctan2

* Fix JAX CPU tests - saved_model_export.py (#20962)

With JAX 0.5.1, `jax2tf` exports XLA that is not compatible with TensorFlow 2.18, making the `saved_model_export.py` tests fail.

Since Tensorflow 2.19 is not out yet, we pin JAX to 0.5.0 for now.

* Update the RandomGaussianBlur layer to utilize the image layer method (#20958)

* Update random_gaussian_blur layer to use image layer method

* Combine two statements into one

* Add `antialias` to `layers.Resizing` and add more tests. (#20972)

* fix legacy model saving & reloading with axis argument in its layer (#20973)

* fix legacy model saving & reloading with axis arg in layer

* fix formatting issue

* add temp_file_path

* Make gaussian_blur to use scipy convolve2d (#20974)

* [OpenVino BackEnd]support np.count_nonzero for ov BackEnd (#20945)

* support np.count_nonzero for ov BackEnd

* modifying function vars to lowercase

* Bump the github-actions group with 3 updates (#20975)

Bumps the github-actions group with 3 updates: [ossf/scorecard-action](https://github.com/ossf/scorecard-action), [actions/upload-artifact](https://github.com/actions/upload-artifact) and [github/codeql-action](https://github.com/github/codeql-action).


Updates `ossf/scorecard-action` from 2.4.0 to 2.4.1
- [Release notes](https://github.com/ossf/scorecard-action/releases)
- [Changelog](https://github.com/ossf/scorecard-action/blob/main/RELEASE.md)
- [Commits](https://github.com/ossf/scorecard-action/compare/62b2cac7ed8198b15735ed49ab1e5cf35480ba46...f49aabe0b5af0936a0987cfb85d86b75731b0186)

Updates `actions/upload-artifact` from 4.6.0 to 4.6.1
- [Release notes](https://github.com/actions/upload-artifact/releases)
- [Commits](https://github.com/actions/upload-artifact/compare/65c4c4a1ddee5b72f698fdd19549f0f0fb45cf08...4cec3d8aa04e39d1a68397de0c4cd6fb9dce8ec1)

Updates `github/codeql-action` from 3.28.8 to 3.28.10
- [Release notes](https://github.com/github/codeql-action/releases)
- [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md)
- [Commits](https://github.com/github/codeql-action/compare/dd746615b3b9d728a6a37ca2045b68ca76d4841a...b56ba49b26e50535fa1e7f7db0f4f7b4bf65d80d)

---
updated-dependencies:
- dependency-name: ossf/scorecard-action
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: github-actions
- dependency-name: actions/upload-artifact
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: github-actions
- dependency-name: github/codeql-action
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: github-actions
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* [OpenVINO backend] Support numpy.append (#20951)

* [OpenVINO backend] Support numpy.append

Signed-off-by: Lim, Kuan Xian <[email protected]>

* Remove NumpyDtype test_append_ from exclude list

Signed-off-by: Lim, Kuan Xian <[email protected]>

* Fix attribute error

Signed-off-by: Lim, Kuan Xian <[email protected]>

* Fix NumpyDtypeTest error

Signed-off-by: Lim, Kuan Xian <[email protected]>

* Update concat to append

Signed-off-by: Lim, Kuan Xian <[email protected]>

---------

Signed-off-by: Lim, Kuan Xian <[email protected]>

* Fix PyTorch stateful RNN/LSTM gradient computation error resolves #20875 (#20916)

* Fix PyTorch stateful RNN gradient computation error

* Updates post feedback

* [Keras Ops and Layer] Add keras.ops.rms_norm() and keras.layers.RMSNormalization() (#20911)

* Add RMSNorm and rms_norm

* math.square -> numpy.square

* Update docstrings

* Add RMSNormalization Layer

* Update docstrings

* Lint with new ruff version

* Add tests for layer

* Address comments

* Convert to tensor if not - avoid openvino and torch typing issues if scale is scalar

* address comments

* Fix tests

* Add reference to paper

* Fix docstring to remove input_dim argument

* Update layer_normalization.py

---------

Co-authored-by: François Chollet <[email protected]>

* Fix docstring

* Update version number

* Enable cuDNN RNNs when dropout is set and `training=True` (#20983)

* Fix `Discretization` serialization when `num_bins` is used. (#20971)

Previously, serialization / deserialization would fail if:
- the layer was saved / restored before `adapt` was called
- the layer was saved / restored after `adapt` was called, but the dataset was such that the number of bins learned was fewer than `num_bins`

The fix consists in adding a `from_config` to handle `bin_boundaries` separately. This is because at initial creation, `bin_boundaries` and `num_bins` cannot be both set, but when restoring the layer after `adapt`, they are both set.

Tightened the error checking:
- never allow `num_bins` and `bin_boundaries` to be specified at the same time, even if they match (same as `tf_keras`)
- don't allow `num_bins` and `bin_boundaries` to be `None` at the same time
- verify that `adapt` has been called in `call`

Also removed `init_bin_boundaries` as the value was never used and its presence can be inferred.
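A simplified sketch of the `from_config` fix described above (hypothetical, not the real layer): after `adapt`, both `num_bins` and `bin_boundaries` appear in the config, but `__init__` forbids setting both, so `bin_boundaries` is applied separately.

```python
class Discretization:
    def __init__(self, num_bins=None, bin_boundaries=None):
        # Tightened checks: exactly one of the two must be provided.
        if num_bins is not None and bin_boundaries is not None:
            raise ValueError("Pass either num_bins or bin_boundaries, not both.")
        if num_bins is None and bin_boundaries is None:
            raise ValueError("One of num_bins or bin_boundaries must be set.")
        self.num_bins = num_bins
        self.bin_boundaries = bin_boundaries

    @classmethod
    def from_config(cls, config):
        # Pull bin_boundaries out so __init__ sees only num_bins,
        # then restore the adapted state afterwards.
        config = dict(config)
        bin_boundaries = config.pop("bin_boundaries", None)
        layer = cls(**config)
        if layer.bin_boundaries is None:
            layer.bin_boundaries = bin_boundaries
        return layer


layer = Discretization.from_config({"num_bins": 3, "bin_boundaries": [0.0, 1.0]})
```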

* Add access to native mesh and layout distribution objects. (#20897)

- Added `backend_mesh` property to `keras.distribution.DeviceMesh` to access the native mesh object.
- Added `backend_layout` property to `keras.distribution.TensorLayout` to access the native layout or sharding object.

The values are cached. Changed the code to access these directly instead of calling the conversion functions every time.

Made the following renames so that these functions can be used in backend agnostic code:
- `_to_jax_device` to `_to_backend_device`
- `_to_jax_mesh` and `_to_dtensor_mesh` to `_to_backend_mesh`
- `_to_jax_layout` and `_to_dtensor_layout` to `_to_backend_layout`

* Don't require jax on the numpy backend (#20989)

We can still use it for the resize op, but we shouldn't fail to import
without jax installed.

* Fixes inconsistent serialization logic for inputs (#20993)

* Removes unnesting logic for input tensors in functional model deserialization flow

* Adds test case for verifying nested input restoration after deserialization

removes unnecessary imports

* fixes imports

* Fix flash attention TPU error (#20994)

* Fix flash attention TPU error

* fix space

* fix default mask

* update default mask if none check in wrapping function instead

* Add optional arg for attention logits soft cap for jax tpu backend (#20999)

* Fix flash attention TPU error

* fix space

* fix default mask

* update default mask if none check in wrapping function instead

* allow dot_product attention to accept optional logits soft cap value

* add optional attention soft cap arg

* fix test and add error message

* fix import error

* code reformat

* remove jax dependency from numpy image layer (#21000)

* Wrap tf variables in keras variables for TFSMLayer (#20995)

Fixes #20955

* [OpenVINO Backend] Support numpy exp and expand_dims (#21006)

Signed-off-by: Kazantsev, Roman <[email protected]>

* [Good First Issue][Keras 3 OpenVINO Backend]: Support numpy.dot operation #29119 (#20982)

* Implement dot operation for openvino

* Enable dot tests

* Add pytest.ini in the root directory

* Fix style issues

* Handle scalar inputs and fix code format

* Handle scalar inputs and fix code format

* Delete pytest.ini

* Remove scalar handling

* Handle scalar inputs

* Handle scalars and style format

* update scalar handling

* Fix the format of the numpy.py file

* Fix styling issues

* Fix styling issues

---------

Co-authored-by: Saif Mohammed <[email protected]>

* Add elastic_transform processing for image.py (#20977)

* Add elastic_transform for numpy

* Add elastic_transform for torch

* Add elastic_transform for jax

* Add elastic_transform for tensorflow

* Add seed generator for elastic_transform

* Add interpolation args

* Add fill_mode and fill_value args

* Add elastic_transform for ops layer

* Add test cases

* Ensures that the layer is marked as built when `build` is not overridden (#20880)

* Ensure that the layer is correctly marked as built.

* Add `_build_at_init` in `Layer` and use it everywhere.

* Fix typos and add a test case for elastic_transform (#21007)

* Fix typo

* Add test case

* Re-run test case CI

* [OpenVINO backend]: Support numpy.bincount (#20940)

* feat: implement numpy.bincount for openvino backend

rebased

fix: hardcode dtype int32 when weights=None

Signed-off-by: 11happy <[email protected]>

fix: use np.expand_dims

Signed-off-by: 11happy <[email protected]>

remove unnecessary headers

Signed-off-by: 11happy <[email protected]>

style: reformat numpy_test.py

Signed-off-by: 11happy <[email protected]>

* fix: correct test files

Signed-off-by: 11happy <[email protected]>

* fix: reshape depth to scalar

Signed-off-by: 11happy <[email protected]>

* fix: use reshape correctly

Signed-off-by: 11happy <[email protected]>

* fix: take reference from transpose impl to use scalar shape

Signed-off-by: 11happy <[email protected]>

* fix use squeeze

Signed-off-by: 11happy <[email protected]>

* revert to previous impl

Signed-off-by: 11happy <[email protected]>

* fix: scalar type issue

Signed-off-by: 11happy <[email protected]>

* refactor: reduce on rank-1 to have correct results

Signed-off-by: 11happy <[email protected]>

---------

Signed-off-by: 11happy <[email protected]>

* Fix torch CI

* [OpenVINO backend] Support numpy.argsort (#20913)

* [OpenVINO backend] Support numpy.argsort

* [OpenVINO backend] explicitly specify bf16 in get_ov_output from bfloat16 numpy arrays

* remove NumpyOneInputOpsCorrectnessTest::test_argsort

* Fix argsort to handle dynamic shapes

* Fix incorrect argument in JAX flash attention. (#21014)

The mask is named `array` in `NumpyMask`.

* Restore variables on `fit()` interrupt with Jax backend (#21019)

* restore variables on `fit()` interrupt

* fix test

* linter fixes

* [OpenVINO backend] Support numpy.full_like (#21008)

* [OpenVino BackEnd] support np.diff for ov BackEnd (#20950)

* [OpenVino BackEnd] support np.diff for ov BackEnd

* [OpenVino BackEnd] support np.diff for ov BackEnd

* [OpenVino BackEnd] support np.diff for ov BackEnd

* [OpenVino BackEnd] support np.diff for ov BackEnd

* [OpenVino BackEnd] support np.diff for ov BackEnd

* [OpenVino BackEnd] support np.diff for ov BackEnd

* [OpenVino BackEnd] support np.diff for ov BackEnd

* [OpenVINO backend] Support numpy.empty (#21010)

* [OpenVINO Backend] numpy.empty implementation

* fix: reformatted

* fix: fixed final lint issues

* fix: updated empty logic

* Add RandomElasticTransform layer (#21018)

* Add random_elastic_transform

* Add random_elastic_transform test case

* Correct random_elastic_transform failed test case

* Make `import_test.py` debuggable from console output. (#21033)

Previously, if no wheel was found, the `[-1]` subscript would fail, preventing the `if not whl_path` clause from outputting the error message.
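The guard-ordering bug is easy to reproduce in isolation. A hedged sketch (hypothetical function names, not the actual `import_test.py` code):

```python
import glob


def find_wheel_buggy(pattern):
    # Subscripting an empty glob result raises IndexError before any
    # "not found" check downstream can produce a helpful message.
    return glob.glob(pattern)[-1]


def find_wheel_fixed(pattern):
    # Check for emptiness first, then subscript.
    matches = glob.glob(pattern)
    if not matches:
        raise FileNotFoundError(f"No wheel matching {pattern!r}")
    return matches[-1]
```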

* Make code compatible with Numpy >= 2.1. (#21032)

Starting with 2.1, the first argument of `np.reshape` is positional only.

Removed keyword `a` and for consistency did the same with other backends.
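A minimal illustration of the compatible call form:

```python
import numpy as np

# The positional call works on every NumPy version; the keyword form
# np.reshape(a=x, ...) raises a TypeError on NumPy >= 2.1, where the
# first argument became positional-only.
x = np.arange(6)
y = np.reshape(x, (2, 3))
```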

* Fix bitwise `left_shift` and `right_shift` result dtype... (#21034)

when second argument is a constant int.

Previously, a `convert_to_tensor` was applied to the second argument, making it an `int32` or `int64`. The result dtype would take into account this dtype, which could upgrade the dtype of the result.

The expectation is that if the second argument is a constant, the result dtype is the same as the first argument. This is already supported correctly by all underlying backend implementations.
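The dtype expectation can be sketched with NumPy standing in for the backend (illustrative values, not the Keras test suite):

```python
import numpy as np

x = np.array([1, 2, 4], dtype=np.int8)

# Shifting by a plain Python int keeps the first argument's dtype...
same_dtype = np.left_shift(x, 2)

# ...whereas converting the constant to an int64 array first (analogous
# to the old convert_to_tensor behavior) promotes the result dtype.
promoted = np.left_shift(x, np.array([2], dtype=np.int64))
```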

* [OpenVINO Backend] Get back tests for exp and expand_dims to precommit (#21038)

Signed-off-by: Kazantsev, Roman <[email protected]>

* [Documentation] Updated Binary Focal Crossentropy Loss Docstring (#21036)

* updated binary focal loss docstring

* update to docstring comment

* fixed typo

* Fix optree registration (#21049)

The following will break as reimporting Keras will try to re-register
the Tensorflow list/dict wrappers. Presumably anything that forced an
actual reimport of `keras` would trigger the same crash.

```python
import keras

keras.config.set_backend("tensorflow")
```

* Lion typo fix (#21056)

* Add support for torch tensors on meta device (#21053)

* Add support for torch tensors on meta device

* Add unitary test

* Fix unitary test

* feat: add Categorical Generalized Cross Entropy (GCE) loss (#21024)

* feat: add Categorical Generalized Cross Entropy (GCE) loss

* run api generation

* docs: Align docstrings with Keras style guide

* docs: more docstring changes

* Fix torch gpu tests. (#21063)

* Introduce weights sharding (#21022)

* Introduce weights sharding

* Address comments and update the format of the config file.

* Update docstring

* Resolve comments and add more basic tests for `H5IOStore` and `ShardedH5IOStore`.

* Improve `H5IOStore`. (#21067)

* [Documentation] Added Dice Loss Function Example to Docstring (#21064)

* added example to dice loss function

* linted with ruff

* Allow synchronization value to be set on Variables (#21072)

And use on_read synchronization for Metric variables.

* implement of muon optimizer (#21037)

* implement of muons

* format

* renew note

* api_gen

* api_gen

* api_gen

* fix argument and args

* fix argument and args

* Docstring fixes for Muon optimizer.

* Add pre-commit hooks (#21074)

* Add pre-commit hooks

* Add instructions to run pre-commit manually

* Use tf.int32.min rather than relying on integer overflow (#21077)

* Fix warning for random_saturation (#21066)

* Fix warning for random_saturation

* Update random_saturation.py

* Update random_saturation.py

* Update 1e-6 to epsilon()

* merge master

---------

Co-authored-by: François Chollet <[email protected]>

* Special handling of Torch DDP in callback (#21081)

* Special handling of Torch DDP in callback

* Use inheritance tree for DDP check

Modified DDP check to use isinstance rather than type().__name__ for
robustness. Fixed additional whitespace

* Fixing comment.

* inlining DDP import where it's needed.

* fix muon document (#21079)

* fix muon argument

* fix muon argument

* change behavior

* add some test

* add some test

* fix

* fix

* [OpenVINO backend] Support numpy.log10 (#21042)

* [OpenVINO backend] Support numpy.log10

* Address review feedback on log10 implementation

* Fix log function and update excluded_concrete_tests.txt

* Raise error if inputs are not connected with output in functional model (#20705)

* Raise error if inputs are not connected with output in functional model

* Fix Failing test case for unconnected inputs/outputs

* fix formatting issue

* Fix functional dict inputs to support optional ones (#21030)

* Fix functional dict inputs to support optional ones

* Add unit test for optional dict inputs

* Fix unit test formatting

* [OpenVino BackEnd] support np.log2 for ov BackEnd (#21048)

* [OpenVino BackEnd] support np.log2 for ov BackEnd

* [OpenVino BackEnd] support np.log2 for ov BackEnd

* [OpenVino BackEnd] support np.log2 for ov BackEnd

* [OpenVino BackEnd] support np.log2 for ov BackEnd

* Fix `Model.export` to Saved Model for models with dict inputs. (#21095)

Fixes https://github.com/keras-team/keras/issues/20835

Also changed multi-input tests to exercise `model.export()` and its signature inference logic.

* Fix scatter_update for torch (#21101)

* Refactor ModelCheckpoint Save Logic (#21100)

The _save_model method combined the logic to determine if the checkpoint should be saved, and the logic to create the paths and save the checkpoint.

This commit separates the check the determine whether the checkpoint should be saved from the I/O logic, and in doing so resolves two bugs in the current implementation:

1) Host directory is created for every save iteration, regardless of whether the model will be saved or not. For example, when `save_freq == 'epoch'` and `save_best_only == True`, a folder is created for every epoch, even though the model is only saved when the monitored condition is satisfied. This results in a large number of empty folders and makes it difficult to identify the most recently saved checkpoint.

With this commit, the directory to save the model or model weights is only created when necessary.

2) If `save_best_only=True`, and the monitored value is an np.ndarray or backend tensor, then it falls back to `save_best_only=False` and saves the model. However, in this scenario, it saves the whole model without regard to the value of `self.save_weights_only`.

This commit uses consistent save logic that always checks the value of `self.save_weights_only`.
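The separation of concerns described above can be sketched roughly as follows (hypothetical helper names, not the actual `ModelCheckpoint` code):

```python
import os


def should_save(save_best_only, current, best):
    """Pure decision logic: no filesystem side effects."""
    if not save_best_only:
        return True
    return current is not None and (best is None or current < best)


def save_checkpoint(filepath, save_weights_only, current, best,
                    save_best_only=True):
    """I/O logic: only touches the filesystem when a save will happen."""
    if not should_save(save_best_only, current, best):
        return False  # no directory created for skipped epochs
    os.makedirs(os.path.dirname(filepath), exist_ok=True)
    # ...then save weights or the full model, honoring save_weights_only.
    return True
```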

* Add verification to remat tests. (#21102)

The functions that go through `remat()` should actually be called, if not, remat is not really applied.

* Fix Remat error when called with a model (#21094)

* add print

* fix remat issue

* simplify code

* enable traceback filtering and update the function sig

* add a wrapper for activations

* change to except

* add layer call decorator

* fix remat call

* `TrackingTest` no longer assigns None to variables (#21106)

JAX will soon fail when `jnp.array` is called with None, so this test will be broken under newer JAX versions if kept as is.

* #21088: fixes activation layer serialization/deserialization logic (#21117)

* fixes activation layer serialization logic

* adds additional test case for string identifiers

* makes pre-commit happy

* fixed torch version issue for macOS (#21136)

* Add alpha argument description to elu docstring (#21142)

* [OpenVINO backend] Support numpy.expm1 (#21141)

* [OpenVINO backend] Support numpy.expm1

* remove a line with NumpyOneInputOpsCorrectnessTest::test_expm1

* does nothing

* does nothing

* Fix Functional model graph under global dtype policy. (#21134)

When constructing a Functional model with a global dtype policy, a spurious `Cast` operation would appear in the graph before each layer. This cast is part of the layer `__call__` method and should not appear separately.

* Bump the github-actions group with 2 updates (#21113)

Bumps the github-actions group with 2 updates: [actions/upload-artifact](https://github.com/actions/upload-artifact) and [github/codeql-action](https://github.com/github/codeql-action).


Updates `actions/upload-artifact` from 4.6.1 to 4.6.2
- [Release notes](https://github.com/actions/upload-artifact/releases)
- [Commits](https://github.com/actions/upload-artifact/compare/4cec3d8aa04e39d1a68397de0c4cd6fb9dce8ec1...ea165f8d65b6e75b540449e92b4886f43607fa02)

Updates `github/codeql-action` from 3.28.10 to 3.28.13
- [Release notes](https://github.com/github/codeql-action/releases)
- [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md)
- [Commits](https://github.com/github/codeql-action/compare/b56ba49b26e50535fa1e7f7db0f4f7b4bf65d80d...1b549b9259bda1cb5ddde3b41741a82a2d15a841)

---
updated-dependencies:
- dependency-name: actions/upload-artifact
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: github-actions
- dependency-name: github/codeql-action
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: github-actions
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Update training_with_built_in_methods.py (#21098)

Clarify parameter name

* [OpenVINO backend]: Implement numpy.identity (#21083)

* openvino backend implement numpy.identity

Signed-off-by: 11happy <[email protected]>

* use openvino DTYPES and exclude test

Signed-off-by: 11happy <[email protected]>

---------

Signed-off-by: 11happy <[email protected]>

* Enable SparseCategoricalCrossentropy to accept and propagate axis (#21104)

* feat: Enable SparseCategoricalCrossentropy to accept and propagate axis; minor PyTorch implementation update to support channel-first layouts

* formatting

* Modified Example code in numerical_utils (#21125)

* Add configurable lora_alpha parameter for LoRA in multiple Keras layers (#21139)

* feat: Add alpha parameter to enable_lora

Adds an alpha scaling parameter to LoRA layers, defaulting to rank for backward compatibility.

* feat: Add lora_alpha tests to Dense, Embedding, and EinsumDense layers

* fix: Fix LoRA test failures by using ops to do numpy conversion

* fix: remove .numpy() in LoRA tests

* docs: Apply backticks to keywords per review

Updated docstrings to enclose parameters like 'alpha' and 'rank' in backticks as requested in PR review.

* Add OpenVINO backend support for argmin and argmax (#21060)

* Update numpy.py

* Update excluded_concrete_tests.txt

* all issues fixed

* Update numpy.py

* numpy.py reformatted

* Update excluded_concrete_tests.txt

* Update excluded_concrete_tests.txt

* Update excluded_concrete_tests.txt

* Update excluded_concrete_tests.txt

* Update excluded_concrete_tests.txt

* Update excluded_concrete_tests.txt

* Add support for dynamic dimensions for ops handling `tf.IndexedSlices`. (#21148)

Fixes https://github.com/keras-team/keras/issues/21069

* [OpenVINO backend] Added support for numpy.isclose operation (#21138)

* Added decomposition for numpy.isclose

* Removed test from excluded list

* Fixed failed test cases

* Fixed dtype error

* Aligns Softmax masking behavior with JAX for fully masked axis (#21149)

* Fixes softmax masking logic to match JAX behavior

* fix comment

* use backend.numpy.multipy for element-wise multiplication

* Removing references to jax.config.spmd_mode('allow_all'). (#21164)

This flag no longer does anything in jax.

* allow TorchModuleWrapper compute output shape (#21160)

* allow TorchModuleWrapper compute output shape

* modify

* Add details when `TestCase.run_layer_test` output verification fails. (#21165)

Adds the expected/actual output shapes/dtypes in the failure message.

Also greatly simplifies the code by using `keras.tree`.

* Improve `tf.RaggedTensor` support in `DataAdapter`s. (#21170)

Previously, only 2D Tensorflow ragged tensors were supported. This adds support for any rank.

Also added tests for ragged tensors with `GeneratorDataAdapter`.

* WIP: Add PyTorch backend support for LSTM with CuDNN optimization (#21135)

* WIP: Add PyTorch backend support for LSTM with CuDNN optimization

* WIP: Add PyTorch backend support for LSTM with CuDNN optimization

* Add backward compatibility to PyTorch-backed LSTM implementation with cuDNN support

* Updates to adress failed tests

* Handling formatting errors

* Add `tf.RaggedTensor` support to `Embedding` layer. (#21171)

Adds support for indices in the form of a `tf.RaggedTensor` to the `Embedding` layer by adding support to `ops.take`. The output is also ragged.

Also:
- adds support for negative indices in the sparse tensor use case.
- adds support for ragged tensors in `TestCase.run_layer_test`.

* [OpenVINO Backend]: support numpy.ndim (#21176)

* feat: support numpy.ndim

Signed-off-by: 11happy <[email protected]>

* use shapeof shapeof method

Signed-off-by: 11happy <[email protected]>

---------

Signed-off-by: 11happy <[email protected]>

* Fix Embedding test with ragged tensors on GPU. (#21177)

The loss needs to not have any non-compilable op.

* Add sparse_sigmoid activation (#21175)

* Add sparse_sigmoid activation layer

* Correct typo

* [OpenVINO BACKEND] - feat: implement numpy.nonzero for openvino backend (#21163)

* feat: implement numpy.nonzero for openvino backend

Signed-off-by: 11happy <[email protected]>

* format code

Signed-off-by: 11happy <[email protected]>

---------

Signed-off-by: 11happy <[email protected]>

* Add sparse support to `ops.ones_like` and `ops.zeros_like`. (#21181)

`ops.zeros_like` is in particular useful for creating a mask of the populated values in the sparse tensor.

* Fix dtype detection for JAX types. (#21184)

The jax types like `jax.float32` have a string representation of
```
<class 'jax.numpy.float32'>
```
so with the previous code, would be "standardized" as `float32'>` (trailing quote and angle bracket),
which is an invalid type.  But, the JAX dtypes _do_ have a `__name__` property, so should be
properly detected if we switch the order around.

Kept the old `jax.numpy` string version in place in case that worked with older versions of JAX.
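The ordering fix can be shown with NumPy scalar types as a stand-in for the JAX ones (illustrative helper names): parsing `str(dtype_cls)` yields an invalid name with trailing repr characters, while `__name__` gives the clean dtype string, so `__name__` should be checked first.

```python
import numpy as np


def standardize_via_repr(dtype_cls):
    # Fragile: splits the class repr "<class 'numpy.float32'>" on "."
    return str(dtype_cls).split(".")[-1]


def standardize_via_name(dtype_cls):
    # Robust: dtype classes carry a clean __name__.
    return dtype_cls.__name__


bad = standardize_via_repr(np.float32)   # "float32'>"
good = standardize_via_name(np.float32)  # "float32"
```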

* Bump the python group with 5 updates (#21114)

Updates the requirements on [tensorflow-cpu](https://github.com/tensorflow/tensorflow), [tensorflow](https://github.com/tensorflow/tensorflow), [torch](https://github.com/pytorch/pytorch), [torch-xla](https://github.com/pytorch/xla) and [tensorflow[and-cuda]](https://github.com/tensorflow/tensorflow) to permit the latest version.

Updates `tensorflow-cpu` to 2.18.1
- [Release notes](https://github.com/tensorflow/tensorflow/releases)
- [Changelog](https://github.com/tensorflow/tensorflow/blob/v2.18.1/RELEASE.md)
- [Commits](https://github.com/tensorflow/tensorflow/compare/v2.18.0...v2.18.1)

Updates `tensorflow` to 2.18.1
- [Release notes](https://github.com/tensorflow/tensorflow/releases)
- [Changelog](https://github.com/tensorflow/tensorflow/blob/v2.18.1/RELEASE.md)
- [Commits](https://github.com/tensorflow/tensorflow/compare/v2.18.0...v2.18.1)

Updates `torch` from 2.5.1+cu121 to 2.6.0
- [Release notes](https://github.com/pytorch/pytorch/releases)
- [Changelog](https://github.com/pytorch/pytorch/blob/main/RELEASE.md)
- [Commits](https://github.com/pytorch/pytorch/commits/v2.6.0)

Updates `torch-xla` from 2.5.1 to 2.6.0
- [Release notes](https://github.com/pytorch/xla/releases)
- [Commits](https://github.com/pytorch/xla/compare/v2.5.1...v2.6.0)

Updates `tensorflow[and-cuda]` to 2.18.1
- [Release notes](https://github.com/tensorflow/tensorflow/releases)
- [Changelog](https://github.com/tensorflow/tensorflow/blob/v2.18.1/RELEASE.md)
- [Commits](https://github.com/tensorflow/tensorflow/compare/v2.18.0...v2.18.1)

---
updated-dependencies:
- dependency-name: tensorflow-cpu
  dependency-type: direct:production
  dependency-group: python
- dependency-name: tensorflow
  dependency-type: direct:production
  dependency-group: python
- dependency-name: torch
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: python
- dependency-name: torch-xla
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: python
- dependency-name: tensorflow[and-cuda]
  dependency-type: direct:production
  dependency-group: python
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Fix `Embedding.compute_output_spec` with a non-`KerasTensor` input. (#21192)

The `ragged` attribute exists only with `KerasTensor`s.

Minor fix of a unit tests that was using the same local variable for two nested loops.

* Allow `Embedding` subclasses to only override `compute_output_shape`. (#21195)

Without the need to also override `compute_output_spec`.

* Return explicitly layout if already set on variable. (#21194)

If explicitly overwriting a variable._layout, we want to keep this
layout in any future calls.  This allows auxiliary variables (e.g.
optimizer gradients, momentums) to use the same explicit layout.

* Don't scale gradients if overwriting variable with gradient. (#21193)

If overwriting, the gradient represents the desired final value of the variable,
so if we did scale it, we're changing that value.

* Redundant imports; no path hacking in package (#21187)

* Add back shell/format.sh, but it just runs pre-commit (#21197)

For folks who are used to the old format, this will print instructions.
And for people like me, saves needing to remember
`SKIP=api-gen pre-commit run --all-files`
When I just want the formatter. api_gen.py is too slow to run every time.

* Add openvino to the basic requirements file (#21198)

Unlike jax/torch/tensorflow, which all vie for a particular CUDA version, I
don't think openvino has trouble co-installing.

And without it, the basic requirements.txt will not give a working
dev environment. You can't run pre-commit without openvino installed.

* [Keras 3 OpenVINO Backend]: Support numpy.log1p operation #29487 (#21129)

* Supports numpy.log1p operation

* Applied api-gen hook modifications

* Revert "Applied api-gen hook modifications"

This reverts commit 2b880fa3a3c47650fdbd32ebc98005fa1949e887.

* Excluded Concrete Tests

* Put Blank Line

* Add pre-commit to the common requirements file (#21199)

We also want it for cuda installations.

* Fix nightly releases (#21203)

They have been broken for a month

* Update version number

* [OpenVINO Backend] Support numpy min operation (#21168)

* Add numpy min for OV Backend

* Add boolean case

* Fix failing tests issue

* Update implementation

* Adds Support For Custom Call-Context Arguments (#21204)

* Adds support for call context args

* formatting fixes

* passes kwargs to compute_output_spec of each layer for a sequential model

* removes requirement for outer layers to declare context args in call signature

* renames call_context_flags to call_context_args

* Adds default return value for dictionary lookup

* addresses comments

* fixup comments

* modifies test case to not handle context-arg in intermediate layer

* fix comment

* Recognize /tfhub as a remote location. (#21211)

* Recognize /tfhub as a remote location.

* Add test

* Fix Trainer.get_compile_config base case (empty dict) (#21212)

* Implement angle function in keras.ops (#21200)

* Add first version of angle operation on numpy

* Skip test with bfloat16 on numpy

* Remove bfloat16 checking on Angle

* Fix test case for float16 on torch cuda

* exclude openvino test case

* exclude openvino test case

* exclude openvino test case

* Update init files

* Fix warnings

* [OpenVINO Backend] : add support for numpy.nan_to_num (#21186)

* feat: add support for numpy.nan_to_num

Signed-off-by: 11happy <[email protected]>

* use np.inf

Signed-off-by: 11happy <[email protected]>

* correct implementation based on new tests

Signed-off-by: 11happy <[email protected]>

* use np only torch having import errors

Signed-off-by: 11happy <[email protected]>

* use inf approach

Signed-off-by: 11happy <[email protected]>

* refactor code

Signed-off-by: 11happy <[email protected]>

---------

Signed-off-by: 11happy <[email protected]>
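The `nan_to_num` support described above follows NumPy's semantics: `nan` maps to zero by default, and infinities map to the largest/lowest finite floats, with all three replacements overridable. A short NumPy sketch of the behavior the backend op is expected to match:

```python
import numpy as np

x = np.array([np.nan, np.inf, -np.inf, 2.0])

# Defaults: nan -> 0.0, +inf -> largest finite float, -inf -> lowest finite float.
y_default = np.nan_to_num(x)

# The replacement values can also be chosen explicitly.
y_custom = np.nan_to_num(x, nan=-1.0, posinf=1e6, neginf=-1e6)

print(y_default)
print(y_custom)  # [-1.0, 1e6, -1e6, 2.0]
```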

* Clear static loss-scale for inner optimizer in LossScaleOptimizer. (#21233)

The outer `LossScaleOptimizer` ignores the inner optimizer's loss-scale
factor when scaling the loss. When computing unscaled gradients, we
therefore need to clear the inner's loss-scale factor; otherwise the
gradients get incorrectly scaled.
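A minimal numeric illustration of the double-scaling bug described above, using hypothetical scale values (no Keras code involved): only the factor actually applied when scaling the loss may be divided out when unscaling the gradients.

```python
# Hypothetical scale factors for illustration only.
outer_scale = 2.0 ** 15
inner_scale = 2.0 ** 4   # stale static scale left on the inner optimizer

grad = 0.5                        # true gradient of the unscaled loss
scaled_grad = grad * outer_scale  # gradient of (outer_scale * loss)

# Correct: divide by the factor that was actually applied.
unscaled_ok = scaled_grad / outer_scale

# Buggy: also dividing by the inner optimizer's stale factor shrinks
# the gradient by inner_scale.
unscaled_bad = scaled_grad / (outer_scale * inner_scale)

print(unscaled_ok, unscaled_bad)  # 0.5 0.03125
```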

* Update conftest.py (#21220)

* Update conftest.py

Updated the `requires_trainable_backend` decorator to use the `in` operator when checking backend values.

* Update conftest.py

* Adds `_register_call_context_args` to declare and use call-context arguments. (#21222)

* Adds register_call_context_args API to layer class for better UX

* remove type hints

* Fixes typo + adds tests

* Fixes comment

* Improves test coverage

* Added tests

* Makes methods underscore-private

* Rename @property to _call_context_args

* makes _register_call_context_args the canonical way to use call context args

* minor test fix

* Updated confusion_metrics.py (#21227)

Modified the `compile()` API code.

* Don't create unused optimizer variables. (#21232)

If `variable.overwrite_with_gradient == True`, then the only optimizer
variable ever used for that variable is `base_optimizer._accumulated_gradients`.
All other optimizer variables are unused.  This can be extremely wasteful
if the training variables are large, for example in the case of large embedding
tables that span multiple hosts/devices.

Added a convenience function in the base optimizer `add_optimizer_variables(...)`
that loops through the variable list and automatically adds a variable only
if appropriate.  If a variable would otherwise be unused, a `None` is inserted
into the list.  This is needed to keep `optimizer._get_variable_index()` consistent.
Updated all built-in optimizers to use this.

NOTE: if a custom optimizer that exists out in the wild still does create
unused optimizer variables, the optimizer should still work - it will just
be wasteful.  IOW this should not be a breaking change.
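The `add_optimizer_variables(...)` pattern described above can be sketched roughly as follows. This is a simplified, hypothetical stand-in (plain Python objects instead of real Keras variables); the point is that skipped slots are recorded as `None` so positional indices stay aligned for `_get_variable_index()`:

```python
# Hypothetical sketch: allocate a slot variable per trainable variable,
# but insert None for variables that overwrite with their gradient,
# since no slot would ever be read for them.
class Var:
    def __init__(self, name, overwrite_with_gradient=False):
        self.name = name
        self.overwrite_with_gradient = overwrite_with_gradient

def add_optimizer_variables(var_list, slot_name):
    slots = []
    for v in var_list:
        if v.overwrite_with_gradient:
            # Only the accumulated-gradient variable is used for v;
            # keep a None placeholder so indices stay consistent.
            slots.append(None)
        else:
            slots.append(f"{v.name}/{slot_name}")  # stand-in for a real slot variable
    return slots

slots = add_optimizer_variables([Var("kernel"), Var("table", True)], "momentum")
print(slots)  # ['kernel/momentum', None]
```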

* Implement bartlett function in keras.ops (#21214)

* Add bartlett for ops

* Update excluded_concrete_tests.txt
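`np.bartlett(M)` produces a triangular window of length `M`, and the `bartlett` op added above is intended to match it. A quick NumPy sketch of the expected values:

```python
import numpy as np

# Triangular (Bartlett) window of length 5; endpoints are zero,
# values rise linearly to 1 at the center.
w = np.bartlett(5)
print(w)  # values: 0, 0.5, 1, 0.5, 0
```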

* Fix stacked RNN with mask in JAX & Numpy backends (#21224)

* Fix stacked RNN with mask in JAX backend

* Add unit test for stacked RNN mask

* Fix stacked RNN with mask in Numpy backend

* Move unit test to stacked_rnn_cells_test

* Bump github/codeql-action in the github-actions group (#21237)

Bumps the github-actions group with 1 update: [github/codeql-ac…