Skip to content

Support for bounded dynamic model #701

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 9 commits into from
Jun 14, 2025
Merged

Conversation

preetha-intel
Copy link

Description

Introduce an user option to convert a fully dynamic model into a bounded dynamic model for efficiency.

@jatinwadhwa921 jatinwadhwa921 force-pushed the jatin_ub_lb_redesign branch from 624ba6f to 010200c Compare June 6, 2025 13:24
Add support for output copy post ov infer to ORT for NPU with bounded
dims
ov_tensor_data.tensor_ptr = std::make_shared<ov::Tensor>(input_info.type, input_info.ov_shape.get_shape(),
const_cast<void*>(tensor.GetTensorRawData()));
if (is_cpu) {
tensor_ptr = std::make_shared<ov::Tensor>(input_info.type, input_tensor_shape, (void*)tensor_data);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fix this using cpp style cast

@ankitm3k ankitm3k requested a review from Copilot June 11, 2025 07:25
Copilot

This comment was marked as outdated.

@ankitm3k ankitm3k requested a review from Copilot June 11, 2025 16:19
Copy link

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR adds support for bounded dynamic models by introducing a new user option (reshape_input) to convert fully dynamic models into bounded dynamic ones for better efficiency. Key changes include:

  • Enhancements to the OpenVINO provider factory and parser utilities to parse and validate the new reshape_input configuration.
  • Updates to backends and session tests to handle dynamic input shape validation and reshaping.
  • Minor formatting and logging improvements in various OpenVINO-related components.

Reviewed Changes

Copilot reviewed 16 out of 16 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
onnxruntime/test/providers/openvino/openvino_ep_context_test.cc Validates error handling when a folder path is provided for ep_context_file_path.
onnxruntime/test/perftest/ort_test_session.cc Adds support for the new "reshape_input" configuration in performance tests.
onnxruntime/python/tools/quantization/matmul_nbits_quantizer.py Reformats conditions for readability without functional changes.
onnxruntime/core/providers/openvino/ov_interface.h Adds necessary includes for dynamic shapes.
onnxruntime/core/providers/openvino/openvino_provider_factory.cc Incorporates parsing logic for the new reshape_input option.
onnxruntime/core/providers/openvino/openvino_parser_utils.{h,cc} Implements parsing of reshape_input using regular expressions and helper functions.
onnxruntime/core/providers/openvino/contexts.h Introduces the reshape_t type to represent input reshaping settings.
onnxruntime/core/providers/openvino/backends/basic_backend.{h,cc} Uses dynamic_flags to track static, fully dynamic, and bounded dynamic states and introduces dimension validation.
onnxruntime/core/providers/openvino/backend_utils.cc Applies the reshape configuration to the OV model.
onnxruntime/core/providers/openvino/backend_manager.{h,cc} Implements dynamic input validation and ensures reshape_input covers all dynamic inputs.
Comments suppressed due to low confidence (1)

onnxruntime/test/perftest/ort_test_session.cc:826

  • Consider adding test cases to verify that the new 'reshape_input' key is correctly handled and its effects are appropriately validated in performance tests.
ov_options[key] = value;

int64_t min_dim = ov_dim.get_min_length();
int64_t max_dim = ov_dim.get_max_length();
if (ort_dim < min_dim || ort_dim > max_dim) {
ORT_THROW(" ORT Dimension is out of range");
Copy link
Preview

Copilot AI Jun 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Enhance the error message in ValidateOrtDimsAgainstPartialShape to include additional context (such as the dimension index and expected range) to aid debugging.

Copilot uses AI. Check for mistakes.

@@ -146,6 +146,10 @@ CreateOVModel(std::string&& model,
try {
auto ov_model = OVCore::Get()->ReadModel(std::move(model), session_context.onnx_model_path_name.string());

if (!session_context.reshape.empty()) {
LOGS_DEFAULT(INFO) << log_tag << "Reshaping the ov tensor to specified shape";
Copy link
Preview

Copilot AI Jun 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nitpick] Consider logging the specific reshape details (e.g., target shape values) to provide clearer insight during debugging and validation of the reshaping process.

Suggested change
LOGS_DEFAULT(INFO) << log_tag << "Reshaping the ov tensor to specified shape";
LOGS_DEFAULT(INFO) << log_tag << "Reshaping the ov tensor to specified shape: " << session_context.reshape;

Copilot uses AI. Check for mistakes.

Copy link

@sfatimar sfatimar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approved

@sfatimar
Copy link

@preetha-intel @jatinwadhwa921 please fix Internal CI ..

@sfatimar
Copy link

@vthaniel please approve from your end.

@vthaniel
Copy link

@preetha-intel @sfatimar @jatinwadhwa921
Tested the functionality with a python sample. Seems to work fine.
Please fix the unittest failures (same failures are seen on Windows also)

@preetha-intel
Copy link
Author

All unit tests pass. Internal CI building fine.

@preetha-intel preetha-intel merged commit ca06b7a into ovep-develop Jun 14, 2025
6 of 8 checks passed
javier-intel pushed a commit that referenced this pull request Jun 15, 2025
* Refactored the code for reshape feature

* Refactor the inference logic accomodating  bounded dimensions

* Fix lint issues

* Refactor OV shapes classification to be a part of bindings struct

* Refactor the provider options key verification for python interface

* Restrict removal of model proto when CPU fallback is enabled and fix unit test failures

---------

Co-authored-by: jatinwadhwa921 <[email protected]>
// Unified OV compile_model is efficient when ov model caching is enabled
// Unified OV compile_model API is supported with AUTO from version 2024.3 and above
// Inputs with static dimensions
// Not enabled for models with external weights and when ep context is set.
const std::string model = model_proto->SerializeAsString();
// we have the serialized string, so we can release model proto to lower the peak memory consumption
model_proto.reset();
if (disable_cpu_fallback) model_proto.reset();

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@preetha-intel .. Wouldn't this introduce a compile time memory regression if CPU fallback is not disabled and model fully compiles on NPU

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants