
Add ThinkAgents/ThinkAgent-1B #928

Open · wants to merge 21 commits into main
Conversation

@0xayman commented Mar 3, 2025

Add ThinkAgents/ThinkAgent-1B model and model handler

@HuanzhiMao (Collaborator) left a comment

Thanks for the PR, @0xayman!
Is there a reason why the `_format_prompt` method uses different chat-template logic than the one linked on Hugging Face (link)?

@0xayman (Author) commented Mar 13, 2025

When I used the default one, the generations were messy, so I changed it to the same formatting function I used for fine-tuning the model.

@HuanzhiMao (Collaborator)

> When I used the default one, the generations were messy, so I changed it to the same formatting function I used for fine-tuning the model.

In that case, if you’re using a custom chat template that provides better generation results, please document it in the model card. This way, users will know exactly how to replicate your function-calling setup, and we’ll benchmark the model using your recommended approach so the score accurately reflects the typical user experience.

@0xayman (Author) commented Mar 17, 2025

I have updated the model tokenizer to use the correct chat template used in the `_format_prompt` function. Please review it and let me know if further updates are needed.

@HuanzhiMao (Collaborator)

I think the chat template and the `_format_prompt` function are still misaligned.
For example, your chat template has the following section:

{%- for message in messages %}
    {%- if not (message.role == 'tool' or 'tool_calls' in message) %}
        {{- '<|start_header_id|>' + message['role'] + '<|end_header_id|>' + message['content'] | trim + '<|eot_id|>' }}
        ...
    {%- endif %}
{%- endfor %}

but in your `_format_prompt`, you only have this:

for message in messages:
    formatted_prompt += f"{message['content']}<|eot_id|>\n"

Notice how the `<|start_header_id|>` and `<|end_header_id|>` tokens are missing.
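
For reference, a loop that matched the template above would wrap each message in the role header tokens; a minimal sketch (not the PR's actual code, and `format_messages` is a hypothetical helper):

def format_messages(messages: list[dict]) -> str:
    # Mirror the Jinja branch above: every non-tool message gets the
    # Llama 3 role header tokens, not just its raw content.
    prompt = ""
    for message in messages:
        prompt += (
            f"<|start_header_id|>{message['role']}<|end_header_id|>"
            f"{message['content'].strip()}<|eot_id|>"
        )
    return prompt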

@HuanzhiMao (Collaborator)

@0xayman Would you be fine if I directly made modifications to your branch? I can also raise a PR to your branch instead. Either way is fine with me.

@0xayman (Author) commented Mar 18, 2025

Yes, it's fine; you can go ahead and make any required modifications.

@HuanzhiMao (Collaborator)

According to your chat template, are you using the `tools_in_user_message` approach or not?

@0xayman (Author) commented Mar 19, 2025

Yes, I'm passing the tools in the user message instead of the system prompt. I found this to work better.
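
For context, Llama-3.1-style chat templates usually expose this choice as a template variable; a minimal sketch, assuming a transformers tokenizer whose template supports the `tools_in_user_message` flag (the example messages and tools are placeholders):

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("ThinkAgents/ThinkAgent-1B")

messages = [{"role": "user", "content": "Find Italian restaurants in NYC."}]
tools = [...]  # JSON-schema tool definitions

# Extra kwargs to apply_chat_template are forwarded to the Jinja template,
# so a template can branch on tools_in_user_message.
prompt = tokenizer.apply_chat_template(
    messages,
    tools=tools,
    tools_in_user_message=True,  # inject tool docs into the first user turn
    tokenize=False,
    add_generation_prompt=True,
)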

@HuanzhiMao (Collaborator)

I got the following for data_overall.csv. Does this align with what you obtained?

Rank,Overall Acc,Model,Model Link,Cost ($ Per 1k Function Calls),Latency Mean (s),Latency Standard Deviation (s),Latency 95th Percentile (s),Non-Live AST Acc,Non-Live Simple AST,Non-Live Multiple AST,Non-Live Parallel AST,Non-Live Parallel Multiple AST,Non-Live Exec Acc,Non-Live Simple Exec,Non-Live Multiple Exec,Non-Live Parallel Exec,Non-Live Parallel Multiple Exec,Live Acc,Live Simple AST,Live Multiple AST,Live Parallel AST,Live Parallel Multiple AST,Multi Turn Acc,Multi Turn Base,Multi Turn Miss Func,Multi Turn Miss Param,Multi Turn Long Context,Relevance Detection,Irrelevance Detection,Organization,License
1,18.03%,ThinkAgent-1B,https://huggingface.co/ThinkAgents/ThinkAgent-1B,N/A,11.12,18.31,61.04,56.38%,49.00%,73.00%,59.00%,44.50%,0.00%,0.00%,0.00%,0.00%,0.00%,27.05%,42.64%,28.11%,31.25%,16.67%,0.00%,0.00%,0.00%,0.00%,0.00%,61.11%,19.33%,ThinkAgents,apache-2.0

@0xayman (Author) commented Mar 19, 2025

Can you please share the csv file so I can check it?

@HuanzhiMao (Collaborator)

[shared the csv file as an attachment]

@0xayman (Author) commented Mar 19, 2025

It is close to what I get for the parallel_multiple and multiple AST tests, but very far off in the simple and parallel AST tests.
These are the only four measures we're interested in.

@0xayman (Author) commented Mar 19, 2025

My latest evaluation records:
Simple: 77.25
Parallel Multiple: 45
Parallel: 65.5
Multiple: 71.5

@0xayman (Author) commented Mar 20, 2025

I've updated the handler and attached the latest evaluation results.
score.zip

@HuanzhiMao (Collaborator)

I generated another run, and attached the fully formatted prompt before it hit the completion endpoint for test case id simple_399. Is that expected? There seem to be conflicting formatting instructions in the system and user prompts, plus duplicated function documentation in both. That’s why I didn’t include the default system prompt earlier.

If everything looks good to you, I’ll go ahead and merge the PR and update the leaderboard with your model’s score!

<|begin_of_text|><|start_header_id|>system<|end_header_id|>Cutting Knowledge Date: December 2023Today Date: 07 Dec 2024You are an expert in composing functions. You are given a question and a set of possible functions. Based on the question, you will need to make one or more function/tool calls to achieve the purpose.\nIf none of the functions can be used, point it out. If the given question lacks the parameters required by the function, also point it out.\nYou should only return the function calls in your response.\n\nIf you decide to invoke any of the function(s), you MUST put it in the format of [func_name1(params_name1=params_value1, params_name2=params_value2...), func_name2(params)]\nYou SHOULD NOT include any other text in the response.\n\nAt each turn, you should try your best to complete the tasks requested by the user within the current turn. Continue to output functions to call until you have fulfilled the user's request to the best of your ability. Once you have no more functions to call, the system will consider the current turn complete and proceed to the next turn or task.\n\nHere is a list of functions in JSON format that you can invoke.\n[{'name': 'restaurant_search', 'description': 'Locates top rated restaurants based on specific criteria such as type of cuisine, ratings, and facilities. Note that the provided function is in Python 3 syntax.', 'parameters': {'type': 'dict', 'properties': {'location': {'type': 'string', 'description': 'The city and state, e.g. New York City, NY'}, 'cuisine': {'type': 'string', 'description': 'Preferred type of cuisine e.g., Italian, Indian, American, etc.'}, 'rating': {'type': 'integer', 'description': 'Minimum average customer rating out of 5'}, 'accepts_credit_cards': {'type': 'boolean', 'description': 'If the restaurant should accept credit cards.'}}, 'required': ['location', 'cuisine', 'rating', 'accepts_credit_cards']}}]<|eot_id|><|start_header_id|>user<|end_header_id|>Given the following functions, please respond with a JSON for a function call with its proper arguments that best answers the given prompt.Respond in the format {"name": function name, "parameters": dictionary of argument name and its value}.Do not use variables.{\n "name": "restaurant_search",\n "description": "Locates top rated restaurants based on specific criteria such as type of cuisine, ratings, and facilities. Note that the provided function is in Python 3 syntax.",\n "parameters": {\n "location": {\n "type": "string",\n "description": "The city and state, e.g. New York City, NY"\n },\n "cuisine": {\n "type": "string",\n "description": "Preferred type of cuisine e.g., Italian, Indian, American, etc."\n },\n "rating": {\n "type": "integer",\n "description": "Minimum average customer rating out of 5"\n },\n "accepts_credit_cards": {\n "type": "boolean",\n "description": "If the restaurant should accept credit cards."\n }\n }\n}Find me the best Italian restaurants in New York City with average customer ratings of more than 4 and accepts credit cards.<|eot_id|><|start_header_id|>assistant<|end_header_id|>

@0xayman (Author) commented Mar 22, 2025

Yes, the default system prompt should not be included. Here is the formatting function I was using initially:

def _format_prompt(self, messages, function):
    # We first format the function signature and then add the messages
    tools = self._convert_functions_format(function)

    formatted_prompt = f"""<|begin_of_text|><|start_header_id|>system<|end_header_id|>
Cutting Knowledge Date: December 2023
Today Date: 07 Dec 2024

<|eot_id|><|start_header_id|>user<|end_header_id|>

Given the following functions, please respond with a JSON for a function call with its proper arguments that best answers the given prompt.

Respond in the format {{"name": function name, "parameters": dictionary of argument name and its value}}.Do not use variables.

{tools}

"""

    for message in messages:
        formatted_prompt += f"{message['content']}<|eot_id|>\n"

    formatted_prompt += "<|start_header_id|>assistant<|end_header_id|>\n"
    return formatted_prompt

Can you please tell me if there are any particular reasons we can't use it?

@HuanzhiMao (Collaborator)

> Yes, the default system prompt should not be included. Here is the formatting function I was using initially: [...] Can you please tell me if there are any particular reasons we can't use it?

As explained here, the issue with your formatting function is that it is not aligned with what the chat template on the Hugging Face model card suggests.
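
One way to check that alignment is to compare the handler's output against the template itself; a minimal sketch, assuming a transformers tokenizer and that `handler`, `messages`, and `tools` stand in for the handler instance and its inputs (hypothetical names, not the repo's API):

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("ThinkAgents/ThinkAgent-1B")
reference = tokenizer.apply_chat_template(
    messages,
    tools=tools,
    tokenize=False,
    add_generation_prompt=True,
)
# The hand-rolled prompt should reproduce the template's output exactly.
assert handler._format_prompt(messages, tools) == reference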

@0xayman (Author) commented Mar 22, 2025

I think I misunderstood the last message, but where is this part of the prompt coming from:

<|begin_of_text|><|start_header_id|>system<|end_header_id|>Cutting Knowledge Date: December 2023Today Date: 07 Dec 2024You are an expert in composing functions. You are given a question and a set of possible functions. [...] Here is a list of functions in JSON format that you can invoke.\n[{'name': 'restaurant_search', ...}]<|eot_id|>

If I'm not mistaken, the formatting function in the current version of the code uses the following logic:

formatted_prompt = "<|begin_of_text|>"

system_message = ""
remaining_messages = messages
if messages[0]["role"] == "system":
    system_message = messages[0]["content"].strip()
    remaining_messages = messages[1:]

formatted_prompt += "<|start_header_id|>system<|end_header_id|>"
formatted_prompt += "Cutting Knowledge Date: December 2023"
formatted_prompt += "Today Date: 07 Dec 2024"
formatted_prompt += system_message + "<|eot_id|>"

It cuts off the default system prompt and then appends my custom system prompt.

@HuanzhiMao (Collaborator)

The default system prompt is not cut off. It's still included through these two lines.

system_message = messages[0]["content"].strip()
formatted_prompt += system_message + "<|eot_id|>"
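
If the goal is to drop the default system prompt entirely, the handler would need to skip that append; a minimal sketch, assuming the benchmark supplies its default instructions as a leading system message:

remaining_messages = messages
if messages and messages[0]["role"] == "system":
    # Discard the default system message rather than re-appending its content.
    remaining_messages = messages[1:]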

@0xayman (Author) commented Mar 22, 2025

Please review the latest commit I've made.
I removed the redundant system prompt. Now the model should be fine.

@0xayman (Author) commented Mar 27, 2025

@HuanzhiMao Just a reminder to check if everything is going well.

@HuanzhiMao (Collaborator) commented Mar 28, 2025

Regarding your last commit, 7f1f62f, it makes sense to not include the default system prompt here. However, these changes don't make sense. They have nothing to do with the system prompt, and you are not following your own chat template.

For example, this part of the chat template, covering the function doc format, does not translate to just `formatted_prompt += f"{tools}\n"`:

{%- for t in tools %}
    {{- {"name": t.name, "description": t.description, "parameters": t.parameters.properties} | tojson(indent=4) }}
    {{- "" }}
{%- endfor %}
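
For reference, a rough Python equivalent of that Jinja loop might be (a sketch, assuming `tools` is a list of dicts matching the template's schema):

import json

for t in tools:
    # Serialize each tool doc individually with 4-space indentation,
    # matching the template's tojson(indent=4) filter.
    formatted_prompt += json.dumps(
        {
            "name": t["name"],
            "description": t["description"],
            "parameters": t["parameters"]["properties"],
        },
        indent=4,
    ) + "\n"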

@0xayman (Author) commented Mar 28, 2025

I've fixed the chat template and made a new commit.

@0xayman (Author) commented Apr 2, 2025

@HuanzhiMao Any updates so far?

@HuanzhiMao (Collaborator)

> Regarding your last commit, 7f1f62f, it makes sense to not include the default system prompt here. However, these changes don't make sense. [...] For example, this part of the chat template, covering the function doc format, does not translate to just `formatted_prompt += f"{tools}\n"`.

I believe you haven't addressed my above concern in your new commit.

@0xayman (Author) commented Apr 5, 2025

The part you've mentioned is no longer part of the chat_template; it was replaced with this code in the last commit. The model's chat_template has also been updated on Hugging Face.

@HuanzhiMao (Collaborator)

@0xayman (Author) commented Apr 9, 2025

Are you sure you are viewing the latest commit? https://github.com/0xayman/gorilla/tree/382b4957f60a3245c37a5446a2a96cb758e645f6

@HuanzhiMao (Collaborator)

@0xayman (Author) commented Apr 10, 2025

I'm not sure if it will affect the results, but I've removed them to be consistent.
Please let me know if further updates are needed.
