Why are there so much redundant and irrelevant prompt data in the question classifier ? #17035
Replies: 3 comments 2 replies
-
I found 2 similar discussions that might be helpful:
|
Beta Was this translation helpful? Give feedback.
-
问题分类器 确实不怎么好用,你有什么更好的方式提高他的准确性吗? |
Beta Was this translation helpful? Give feedback.
-
It seems that the redundant and irrelevant prompts come from this file :https://github.com/langgenius/dify/blob/main/api/core/workflow/nodes/question_classifier/template_prompts.py |
Beta Was this translation helpful? Give feedback.
-
Self Checks
Content
The question classifier often fails to classify accurately.
I found in the running trace log that many irrelevant prompt words were added in the question classifier. Will this consume too many tokens and affect the performance of the LLM as well as the classification judgment?
I only inputted two words as "nice job", but the question classifier got a lot of prompts.
The complete JSON text is shown as follows:
{
"model_mode": "chat",
"prompts": [
{
"role": "system",
"text": "\n ### Job Description',\n You are a text classification engine that analyzes text data and assigns categories based on user input or automatically determined categories.\n ### Task\n Your task is to assign one categories ONLY to the input text and only one category may be assigned returned in the output. Additionally, you need to extract the key words from the text that are related to the classification.\n ### Format\n The input text is in the variable input_text. Categories are specified as a category list with two filed category_id and category_name in the variable categories. Classification instructions may be included to improve the classification accuracy.\n ### Constraint\n DO NOT include anything other than the JSON array in your response.\n ### Memory\n Here are the chat histories between human and assistant, inside XML tags.\n \n \n \n",
"files": []
},
{
"role": "user",
"text": "\n { "input_text": ["I recently had a great experience with your company. The service was prompt and the staff was very friendly."],\n "categories": [{"category_id":"f5660049-284f-41a7-b301-fd24176a711c","category_name":"Customer Service"},{"category_id":"8d007d06-f2c9-4be5-8ff6-cd4381c13c60","category_name":"Satisfaction"},{"category_id":"5fbbbb18-9843-466d-9b8e-b9bfbb9482c8","category_name":"Sales"},{"category_id":"23623c75-7184-4a2e-8226-466c2e4631e4","category_name":"Product"}],\n "classification_instructions": ["classify the text based on the feedback provided by customer"]}\n",
"files": []
},
{
"role": "assistant",
"text": "\n
json\n {\"keywords\": [\"recently\", \"great experience\", \"company\", \"service\", \"prompt\", \"staff\", \"friendly\"],\n \"category_id\": \"f5660049-284f-41a7-b301-fd24176a711c\",\n \"category_name\": \"Customer Service\"}\n
\n","files": []
},
{
"role": "user",
"text": "\n {"input_text": ["bad service, slow to bring the food"],\n "categories": [{"category_id":"80fb86a0-4454-4bf5-924c-f253fdd83c02","category_name":"Food Quality"},{"category_id":"f6ff5bc3-aca0-4e4a-8627-e760d0aca78f","category_name":"Experience"},{"category_id":"cc771f63-74e7-4c61-882e-3eda9d8ba5d7","category_name":"Price"}],\n "classification_instructions": []}\n",
"files": []
},
{
"role": "assistant",
"text": "\n
json\n {\"keywords\": [\"bad service\", \"slow\", \"food\", \"tip\", \"terrible\", \"waitresses\"],\n \"category_id\": \"f6ff5bc3-aca0-4e4a-8627-e760d0aca78f\",\n \"category_name\": \"Experience\"}\n
\n","files": []
},
{
"role": "user",
"text": "\n '{"input_text": ["nice job"],',\n '"categories": [{"category_id": "1711529038361", "category_name": "正面评价"}, {"category_id": "1711529041725", "category_name": "负面评价"}], ',\n '"classification_instructions": [""]}'\n",
"files": []
},
{
"role": "user",
"text": "nice job",
"files": []
}
],
"usage": {
"prompt_tokens": 729,
"prompt_unit_price": "0.0005",
"prompt_price_unit": "0.001",
"prompt_price": "0.0003645",
"completion_tokens": 38,
"completion_unit_price": "0.0015",
"completion_price_unit": "0.001",
"completion_price": "0.000057",
"total_tokens": 767,
"total_price": "0.0004215",
"currency": "USD",
"latency": 0.6129214520005917
},
"finish_reason": "stop"
}
Beta Was this translation helpful? Give feedback.
All reactions