Question Classification function in workflow may be abnormal #8430

tigflanker · 2024-09-14T07:43:43Z

Self Checks

This is only for bug report, if you would like to ask a question, please head to Discussions.
I have searched for existing issues, including closed ones.
I confirm that I am using English to submit this report (I have read and agree to the Language Policy).
[FOR CHINESE USERS] Please be sure to submit issues in English, otherwise they will be closed. Thank you! :)
Please do not modify this template :) and fill in all the required fields.

Dify version

v0.7.3

Cloud or Self Hosted

Self Hosted (Docker)

Steps to reproduce

Hello experts, I conducted a comparison test in version v0.7.3.

On the left is a regular model invocation, and on the right is the question classification node within the workflow.

The green box represents the same prompt, the yellow box is the test question and the response, and the red box on the right indicates potential issues.

In this file (https://github.com/langgenius/dify/blob/main/api/core/workflow/nodes/question_classifier/template_prompts.py), it seems that some leftover prompt text was not cleaned up, which might be the reason for the inaccurate classification. Please check this issue.

✔️ Expected Behavior

No response

❌ Actual Behavior

No response

tigflanker · 2024-09-14T07:44:05Z

Author

In addition, I replaced the Question Classification module with a general-purpose large model plus a prompt, and it runs smoothly with a high accuracy rate. 👍

Another idea I have is to collect bad cases using a knowledge base approach to create a mapping from questions to classifications. Therefore, this issue is not too urgent.
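The bad-case idea above could be sketched roughly as follows: keep a small table of questions that were misclassified, each with its corrected class, and consult it before calling the classifier. This is only a sketch under assumptions. The `BAD_CASES` entries, the class names, the `lookup_bad_case` helper, and the 0.9 similarity threshold are all hypothetical, and a real knowledge base would more likely use embedding retrieval than string similarity.

```python
from difflib import SequenceMatcher

# Hypothetical bad cases: questions the classifier got wrong,
# each paired with the class assigned by hand.
BAD_CASES = {
    "how do i reset my password": "account",
    "where is my invoice": "billing",
}


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so near-identical questions match."""
    return " ".join(text.lower().split())


def lookup_bad_case(question: str, threshold: float = 0.9):
    """Return the stored class of the most similar bad case, or None."""
    query = normalize(question)
    best_label, best_score = None, 0.0
    for known, label in BAD_CASES.items():
        score = SequenceMatcher(None, query, known).ratio()
        if score > best_score:
            best_label, best_score = label, score
    # Only trust the table when the match is close enough.
    return best_label if best_score >= threshold else None
```

When `lookup_bad_case` returns `None`, the workflow would fall through to the normal classifier.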

3 replies

tu-jiajun Dec 17, 2024

Have you found a solution?

Copilotes Dec 26, 2024

What prompt do you use for the LLM? Could you share it with me? I'm afraid I'm in the same situation.

liujiawei9 Mar 28, 2025

There is a problem with using a large model node in place of the question classifier node: the large model node cannot reliably constrain its output. It often does not respond in the format I require, which leads to errors in the subsequent steps.
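One possible way to reduce this kind of failure is to validate the model's raw text against the fixed set of classes before passing it on. The sketch below is an assumption, not part of Dify: `ALLOWED_LABELS`, `parse_label`, and the `"other"` fallback are hypothetical names, and the logic could sit in a code step placed after the large model node.

```python
import re

# Hypothetical label set; replace with the classes used in your workflow.
ALLOWED_LABELS = ["billing", "account", "other"]


def parse_label(raw_output: str, fallback: str = "other") -> str:
    """Map free-form model output onto exactly one allowed label."""
    text = raw_output.strip().lower()
    # Accept an exact match directly.
    if text in ALLOWED_LABELS:
        return text
    # Otherwise accept the output only if exactly one label
    # appears in it as a whole word.
    found = [
        label
        for label in ALLOWED_LABELS
        if re.search(rf"\b{re.escape(label)}\b", text)
    ]
    return found[0] if len(found) == 1 else fallback
```

Ambiguous or unrelated output falls back to a default class instead of breaking the downstream branches.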

Copilotes · 2024-12-30T16:48:15Z

Perhaps it's time to improve this classifier. It currently seems to work very poorly for me, assigning classes almost at random most of the time.

0 replies

tigflanker · 2025-03-28T08:36:09Z

Author

Ultimately, I did not continue using this feature, because the intent-recognition accuracy required in real business scenarios is likely to be higher than what a prompt engineering solution can achieve.

I recommend fine-tuning Tiny-BERT for multi-class classification; the training examples can be generated entirely by a large model from the classification keywords.
Bad cases can be collected later for further optimization.
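As a rough illustration of the data preparation step, the sketch below turns per-class example sentences into labeled training rows and an index-to-name mapping. The mapping uses string keys because the example service later in this thread looks labels up as `label_mapping[str(idx)]`. The `examples_by_class` contents and the `build_training_data` helper are hypothetical; in practice the sentences would come from a large model, and the fine-tuning itself is not shown.

```python
import json

# Hypothetical examples; in practice a large model would generate many
# more sentences per class from the classification keywords.
examples_by_class = {
    "billing": ["Where is my invoice?", "I was charged twice."],
    "account": ["How do I reset my password?"],
}


def build_training_data(examples):
    """Return (rows, label_mapping) for a multi-class classifier."""
    class_names = sorted(examples)
    # String keys, because the mapping is stored as JSON.
    label_mapping = {str(i): name for i, name in enumerate(class_names)}
    rows = [
        {"text": text, "label": i}
        for i, name in enumerate(class_names)
        for text in examples[name]
    ]
    return rows, label_mapping


rows, label_mapping = build_training_data(examples_by_class)
# ensure_ascii=False keeps non-ASCII class names readable in the file.
print(json.dumps(label_mapping, ensure_ascii=False))
```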

Additionally, in the actual workflow, the HTTP service may be called multiple times, so it is advisable to build the model service using Flask + Gunicorn.
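A minimal Gunicorn configuration for such a service might look like the sketch below. All values are assumptions to adjust: the file name `gunicorn.conf.py`, the bind address, and the worker and thread counts are not from the original post. A single worker is used here because every worker process would load its own copy of the model.

```python
# gunicorn.conf.py (hypothetical file name and values)

# Listen on the same port as the example service; the host is an assumption.
bind = "0.0.0.0:5678"

# Each worker process loads its own copy of the model, so keep this low
# when the model lives on a single GPU.
workers = 1

# Serve several concurrent requests inside the one worker.
threads = 4

# Seconds a worker may stay silent before it is restarted;
# generous enough to cover slow model loading.
timeout = 120
```

Assuming the Flask code is saved as `intent_recognition.py`, it could then be started with `gunicorn -c gunicorn.conf.py intent_recognition:app`.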

1 reply

tigflanker Mar 28, 2025
Author

Example:

import os
# Pin the process to GPU 3; set this before torch initializes CUDA.
os.environ["CUDA_VISIBLE_DEVICES"] = "3"

import json
import operator

import torch
from flask import Flask, request, jsonify
from torch.utils.data import DataLoader, Dataset
from transformers import AutoTokenizer, AutoModelForSequenceClassification

app = Flask(__name__)

# Tell Flask not to escape Chinese (non-ASCII) characters in JSON responses.
# Flask 2.3+ removed this config key; use app.json.ensure_ascii = False there.
app.config['JSON_AS_ASCII'] = False

# Load the model and tokenizer
model_path = '/data/notebooks/model/intent_recognition'  # replace with your model path
tokenizer = AutoTokenizer.from_pretrained(model_path, clean_up_tokenization_spaces=True)

# Load the model, ignoring parameters whose shapes do not match
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model = AutoModelForSequenceClassification.from_pretrained(model_path, num_labels=65, ignore_mismatched_sizes=True).to(device)
model.eval()

# Load the label mapping table (class index as a string -> label name)
label_mapping_path = '/data/notebooks/model/intent_recognition/label_mapping.json'
with open(label_mapping_path, 'r', encoding='utf-8') as f:
    label_mapping = json.load(f)

# Dataset that tokenizes one sentence per item
class SentenceDataset(Dataset):
    def __init__(self, sentences, tokenizer, max_length):
        self.sentences = sentences
        self.tokenizer = tokenizer
        self.max_length = max_length

    def __len__(self):
        return len(self.sentences)

    def __getitem__(self, idx):
        sentence = self.sentences[idx]
        inputs = self.tokenizer(sentence, return_tensors='pt', padding='max_length', truncation=True, max_length=self.max_length)
        return {k: v.squeeze(0) for k, v in inputs.items()}

def predict_sentences(sentences, batch_size=32, max_length=512):
    # Build the dataset and DataLoader
    dataset = SentenceDataset(sentences, tokenizer, max_length)
    dataloader = DataLoader(dataset, batch_size=batch_size, shuffle=False)

    # One entry per input sentence
    all_top1_labels = []
    all_other_predict = []

    # Run inference
    with torch.no_grad():
        for batch in dataloader:
            # Move the inputs to the same device as the model
            batch = {k: v.to(device) for k, v in batch.items()}

            # Print the device of the input tensors (debugging aid)
            print(f"Input tensor device: {next(iter(batch.values())).device}")

            # Forward pass; take the logits
            logits = model(**batch).logits

            # Softmax gives a probability distribution; convert it to a numpy array
            probabilities = torch.nn.functional.softmax(logits, dim=-1).cpu().numpy()

            # Label with the highest probability for each sentence
            top1_indices = probabilities.argmax(axis=1)
            top1_labels = [label_mapping[str(idx)] for idx in top1_indices]

            # Dict of label -> probability for each sentence
            label_probabilities = [{label_mapping[str(i)]: float(prob) for i, prob in enumerate(p)} for p in probabilities]

            # Keep the five most probable labels
            top5_probs = [dict(sorted(probs.items(), key=operator.itemgetter(1), reverse=True)[:5]) for probs in label_probabilities]

            # Format the other candidate classes for every sentence in the batch
            # ("概率" means "probability")
            other_predict = ['\n'.join(f'{label} (概率：{prob:.2%})' for label, prob in probs.items()) for probs in top5_probs]

            # Collect the results for the whole batch, not just its first sentence
            all_top1_labels.extend(top1_labels)
            all_other_predict.extend(other_predict)

    return all_top1_labels, all_other_predict

@app.route('/predict', methods=['POST'])
def predict():
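    # Read the sentences from the JSON request body
    data = request.get_json()
    sentences = data['sentences']

    # Print the device the model is on (debugging aid)
    print(f"Model device: {next(model.parameters()).device}")

    # Run the prediction
    top1_labels, other_predict = predict_sentences(sentences)

    # Return the result for the first sentence
    # ("意图识别" means "Intent", "其他可能" means "Other candidates")
    response = {
        'predict': f'意图识别：{top1_labels[0]}。\n其他可能：\n{other_predict[0]}'
    }

    return jsonify(response)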

if __name__ == '__main__':
    app.run(host='xx.xx.xx.xx', port=5678, debug=True)

    # nohup /data/miniconda3/envs/llm/bin/python /data/notebooks/model/intent_recognition.py > output.log 2>&1 &
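For reference, a client for the `/predict` endpoint above might look like the following sketch, using only the standard library. The request shape (`{"sentences": [...]}`) and the `predict` response field come from the code above; the helper names `build_predict_request` and `classify`, the base URL, and the timeout are assumptions.

```python
import json
import urllib.request


def build_predict_request(base_url: str, sentences: list) -> urllib.request.Request:
    """Build a POST request for the /predict endpoint of the example service."""
    body = json.dumps({"sentences": sentences}, ensure_ascii=False).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/predict",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def classify(base_url: str, sentence: str, timeout: float = 10.0) -> str:
    """Send one sentence and return the 'predict' field of the JSON reply."""
    request = build_predict_request(base_url, [sentence])
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))["predict"]
```

In a Dify workflow the same call would typically be made from an HTTP request node, with the `sentences` array filled from the user's question.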
