Skip to content

Idea: Use constrained decoding (structured output) so that you always get back the same 10 items you put in #11

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
danielbodart opened this issue Mar 1, 2025 · 1 comment

Comments

@danielbodart
Copy link

danielbodart commented Mar 1, 2025

Instead of needing to retry and incure higher costs, you can enforce the model to always return the items you have asked it to rank. The process is called constrained decoding and OpenAI supports a subset of the capability via structured outputs and JSON Schema.

Here is an example JSON Schema that says the response is going to be a JSON Array with 10 unique items in it

{
"$schema": "https://json-schema.org/draft/2020-12/schema",
"type": "array",
"minItems": 10,
"maxItems": 10,
"uniqueItems": true,
"items": {
"type": "string",
"enum": ["A", "B", "C", "D", "E", "F", "G", "H", "I", "J"]
}
}

So you could just change the minItems, maxItems and enum to match your input.

According to the OpenAI docs there is a latency delay as it converts the schema into a grammar that is then injected into the model execution context, so this idea might have a downside or potentially they don't support this set of schema features.

(Good article BTW)

@noperator
Copy link

I'm currently maintaining Raink over at https://github.com/noperator/raink (updates will occasionally flow upstream to this source as this repo's maintainers have time).

Thanks for the idea. Will report back when if/when I give it a try.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants