DeserializerWrapper prevents the use of valid input Content-Types during inference - SageMaker endpoints built with ModelBuilder and SchemaBuilder
#5006
Describe the bug
DeserializerWrapper overwrites the content_type provided to deserialize() to be the value of Accept. This DeserializerWrapper is used for both input and output de-serialization when using SchemaBuilder. The comments mention:
We need to overwrite the accept type because model servers like XGBOOST always returns "text/html"
but this effectively prevents endpoints deployed through ModelBuilder and SchemaBuilder from supporting additional Content-Types, despite the various implementations of BaseDeserializer supporting multiple Content-Types.
For example, an inference endpoint deployed via ModelBuilder and SchemaBuilder which takes an np.array as input cannot be invoked with a Content-Type such as application/json or text/csv because of this overwriting. This also makes the developer experience confusing: the stack trace shows the execution path for the application/x-npy Content-Type, despite a different Content-Type being explicitly provided.
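The wrapping pattern described above can be sketched with local stand-in classes. This is an illustrative reconstruction of the behavior, not the SDK's actual source; RecordingDeserializer is a hypothetical helper used only to show which content type reaches the wrapped deserializer:

```python
import io


class RecordingDeserializer:
    """Stand-in deserializer that records the content_type it receives."""

    def __init__(self):
        self.seen_content_type = None

    def deserialize(self, stream, content_type):
        self.seen_content_type = content_type
        return stream.read()


class DeserializerWrapper:
    """Sketch of the wrapping pattern: the caller's content_type is discarded."""

    def __init__(self, deserializer, accept):
        self._deserializer = deserializer
        self._accept = accept

    def deserialize(self, stream, content_type):
        # The content_type passed by the caller never reaches the wrapped
        # deserializer; the wrapper's accept value is forwarded instead.
        return self._deserializer.deserialize(stream, self._accept)


inner = RecordingDeserializer()
wrapper = DeserializerWrapper(inner, accept="application/x-npy")
wrapper.deserialize(io.BytesIO(b"[[1, 2]]"), content_type="application/json")
print(inner.seen_content_type)  # application/x-npy, not application/json
```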
To reproduce
```python
import io

import numpy as np
import sagemaker

schema_builder = sagemaker.serve.SchemaBuilder(
    sample_input=np.array([[6.4, 2.8, 5.6, 2.2]]),
    sample_output=np.array(
        [[0.09394703, 0.4797692, 0.42628378]], dtype=np.float32
    ),
)

# similar usage to that defined in unit tests for base NumpyDeserializer:
# https://github.com/aws/sagemaker-python-sdk/blob/master/tests/unit/sagemaker/deserializers/test_deserializers.py#L113
schema_builder.input_deserializer.deserialize(
    io.BytesIO(b"[[6.4, 2.8, 5.6, 2.2]]"), content_type="application/json"
)

# Below is not needed to reproduce, but is provided as an example of how I am
# using SchemaBuilder
model_builder = sagemaker.serve.ModelBuilder(
    mode=sagemaker.serve.mode.function_pointers.Mode.SAGEMAKER_ENDPOINT,
    model_metadata={
        "MLFLOW_MODEL_PATH": f"models:/{model_name}/{model_version}",
        "MLFLOW_TRACKING_ARN": TRACKING_SERVER_ARN,
    },
    model_server=sagemaker.serve.ModelServer.TENSORFLOW_SERVING,
    role_arn=sagemaker.get_execution_role(),
    schema_builder=schema_builder,
)
model = model_builder.build()
model.deploy(
    initial_instance_count=1,
    instance_type="ml.m5.xlarge",
)
```
Expected behavior
I would like the content_type not to be overwritten for input de-serialization, so that I can use SchemaBuilder for inference endpoints while providing a Content-Type other than application/x-npy, such as application/json via curl.
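The expected behavior can be sketched with a hypothetical stand-in function (not SDK code): dispatch on the content_type the caller actually provides, which is what the multi-Content-Type support in the base deserializers implies should happen:

```python
import io
import json


def deserialize(stream, content_type):
    """Illustrative stand-in: honor the caller's content_type."""
    data = stream.read().decode("utf-8")
    if content_type == "application/json":
        return json.loads(data)
    if content_type == "text/csv":
        # Parse each CSV row into a list of floats.
        return [[float(v) for v in row.split(",")] for row in data.splitlines()]
    raise ValueError(f"Unsupported content type: {content_type}")


print(deserialize(io.BytesIO(b"[[6.4, 2.8, 5.6, 2.2]]"), "application/json"))
print(deserialize(io.BytesIO(b"6.4,2.8,5.6,2.2"), "text/csv"))
# Both calls return [[6.4, 2.8, 5.6, 2.2]]
```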
System information
A description of your system. Please provide:
SageMaker Python SDK version: 2.237.3
Framework name (eg. PyTorch) or algorithm (eg. KMeans): TensorFlow (but applies to all)
Framework version: 2.16 (applies to all)
Python version: 3.12.7
CPU or GPU: CPU
Custom Docker image (Y/N): N
Additional context
model.deploy() returns an instance of Predictor, which does work properly. However, this limits inference to Python clients who have created a Predictor instance. Clients in other languages are unable to invoke the inference server, for example.
Workarounds may be possible by implementing CustomPayloadTranslator and providing it via input_translator, but I have not yet tested this.