Extract code blocks only after Code marker #1223
@@ -187,6 +187,7 @@ def parse_code_blobs(text: str) -> str:
         ValueError: If no valid code block is found in the text.
     """
     pattern = r"```(?:py|python)?\s*\n(.*?)\n```"
+    text = text.split("Code:")[-1]
This change assumes that only one "Code:" marker appears in the model output.
@aymeric-roucher do you think this is a sensible assumption? Alternative assumptions?
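To make the assumption concrete, here is a minimal sketch of how `text.split("Code:")[-1]` behaves with zero, one, or several markers. The helper name `text_after_last_marker` and the sample strings are hypothetical and exist only for illustration.

```python
FENCE = "`" * 3  # triple backtick, built here to keep the example easy to embed


def text_after_last_marker(text: str, marker: str = "Code:") -> str:
    """Hypothetical helper mirroring the added line: keep what follows the last marker."""
    return text.split(marker)[-1]


# No marker: split() returns a single element, so the text is unchanged.
no_marker = f"Thought: compute.\n{FENCE}py\nprint(1)\n{FENCE}"

# One marker: everything before it (the thought) is dropped.
one_marker = f"Thought: compute.\nCode:\n{FENCE}py\nprint(1)\n{FENCE}"

# Two markers: only the text after the LAST marker survives.
two_markers = f"Code:\n{FENCE}py\na = 1\n{FENCE}\nCode:\n{FENCE}py\nb = 2\n{FENCE}"

for sample in (no_marker, one_marker, two_markers):
    print(repr(text_after_last_marker(sample)))
```

With several markers, any code block that precedes the last "Code:" is silently discarded, which is the case the question above is about.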
I think it's more reasonable to assume that the model will often forget to put the "Code:" header. To solve #1219, it would be more appropriate IMO to just enforce in the regex "header is
This PR does not enforce the presence of the "Code:" marker: it can handle model output with or without "Code:" marker.
Additionally, note that the word "py" or "python" after the triple backtick is always optional. My question above was:
|
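As a rough illustration of the two points in this comment, the sketch below combines the split with the pattern from the diff. The function name `extract_after_marker` is hypothetical, and returning a list via `re.findall` with `re.DOTALL` is an assumption about how the pattern is applied, not a copy of the library code.

```python
import re

FENCE = "`" * 3  # triple backtick
# Same pattern as in the diff, assembled from FENCE; the py/python tag is optional.
PATTERN = FENCE + r"(?:py|python)?\s*\n(.*?)\n" + FENCE


def extract_after_marker(text: str) -> list:
    """Hypothetical helper: drop text before the last "Code:" marker, then find blocks."""
    text = text.split("Code:")[-1]  # a no-op when the marker is absent
    return re.findall(PATTERN, text, re.DOTALL)


with_marker = f"Thought: x\nCode:\n{FENCE}py\nprint(1)\n{FENCE}"
without_marker_or_tag = f"{FENCE}\nprint(2)\n{FENCE}"
python_tag = f"{FENCE}python\nx = 3\n{FENCE}"

print(extract_after_marker(with_marker))
print(extract_after_marker(without_marker_or_tag))
print(extract_after_marker(python_tag))
```

All three inputs yield one block, so under these assumptions the marker is a filter when present rather than a requirement.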
To clarify: I know that currently the header Sometimes LLMs generate their action in 2 parts, and forget to put the "Code:" header This is why I think it makes more sense to enforce "action code blobs have a mandatory header
Thanks for the clarification, @aymeric-roucher. Just a naive question: do you think it could be plausible that the model might generate a py/python code block within the "Thought" section (e.g., as part of reasoning or planning), which should not be parsed as an action code block? If so, maybe a combined approach would be more solid... Curious to hear your thoughts on that edge case.
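The edge case can be sketched as follows, assuming the pattern from the diff is applied with `re.findall` and `re.DOTALL`. The sample outputs and the helper names `blocks_pattern_only` and `blocks_after_marker` are made up for illustration.

```python
import re

FENCE = "`" * 3  # triple backtick
PATTERN = FENCE + r"(?:py|python)?\s*\n(.*?)\n" + FENCE


def blocks_pattern_only(text: str) -> list:
    """Hypothetical: apply the pattern to the whole model output."""
    return re.findall(PATTERN, text, re.DOTALL)


def blocks_after_marker(text: str) -> list:
    """Hypothetical: apply the pattern only after the last "Code:" marker."""
    return re.findall(PATTERN, text.split("Code:")[-1], re.DOTALL)


# A code block inside the thought, followed by the real action.
thought_and_action = (
    f"Thought: I could call\n{FENCE}py\nplan()\n{FENCE}\n"
    f"Code:\n{FENCE}py\nact()\n{FENCE}"
)
# A code block inside the thought, and the model never emits the marker.
thought_only = f"Thought: something like\n{FENCE}py\nplan()\n{FENCE}\nwould work."

print(blocks_pattern_only(thought_and_action))   # both blocks are picked up
print(blocks_after_marker(thought_and_action))   # only the block after the marker
print(blocks_after_marker(thought_only))         # the thought block is still parsed
```

The split removes the false positive only when the marker is actually present; without it, a block in the thought is still treated as an action.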
@albertvillanova it is possible indeed! But in terms of reducing false positives / false negatives, I think the solution "force the heading with
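For comparison, a mandatory-header variant could look like the sketch below. It assumes, purely for illustration, that the required header is a literal `Code:` line immediately before the fence; the pattern `HEADER_PATTERN` and the helper `blocks_with_header` are hypothetical and are not taken from the PR.

```python
import re

FENCE = "`" * 3  # triple backtick
# Assumed header form: "Code:" followed by a newline, then the opening fence.
HEADER_PATTERN = r"Code:\s*\n" + FENCE + r"(?:py|python)?\s*\n(.*?)\n" + FENCE


def blocks_with_header(text: str) -> list:
    """Hypothetical: accept only code blocks directly preceded by the header."""
    return re.findall(HEADER_PATTERN, text, re.DOTALL)


thought_and_action = (
    f"Thought: I could call\n{FENCE}py\nplan()\n{FENCE}\n"
    f"Code:\n{FENCE}py\nact()\n{FENCE}"
)
two_actions = f"Code:\n{FENCE}py\na = 1\n{FENCE}\nCode:\n{FENCE}py\nb = 2\n{FENCE}"
missing_header = f"{FENCE}py\nact()\n{FENCE}"

print(blocks_with_header(thought_and_action))  # thought block is ignored
print(blocks_with_header(two_actions))         # both headed blocks are kept
print(blocks_with_header(missing_header))      # nothing matches
```

Under this assumption, blocks in the thought are excluded and multiple headed blocks are all kept, at the cost of rejecting output where the model forgot the header.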
Extract code blocks only after Code marker.
Fix #1219.