I'm experiencing issues with the Whisper-Large-v3-turbo model when using it for transcription tasks with the Transformers library (version 4.38.3).
Problems:
Incorrect word timestamps: The word-level timestamps generated by the model are often inaccurate.
Word repetitions: The model also repeats words in the transcription output. Setting repetition_penalty to 1.2 reduced the repetitions but did not eliminate them.
Best regards
Who can help?
No response
Information
The official example scripts
My own modified scripts
Tasks
An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
My own task or dataset (give details below)
Reproduction
Load the Whisper-Large-v3-turbo model using the Transformers library (version 4.38.3).
Use the model to transcribe an audio file.
Observe the word timestamps and transcription output.
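The steps above could be reproduced with something like the following minimal sketch. It assumes the model was run through the Transformers automatic-speech-recognition pipeline and that the checkpoint is the Hub ID openai/whisper-large-v3-turbo; the issue states neither, so both are guesses. The function names and the file name sample.wav are hypothetical.

```python
def transcribe_with_word_timestamps(audio_path, repetition_penalty=1.2):
    """Transcribe one audio file and return the pipeline's raw output dict."""
    # Imported lazily so the formatting helper below works without torch installed.
    import torch
    from transformers import pipeline

    device = "cuda:0" if torch.cuda.is_available() else "cpu"
    asr = pipeline(
        "automatic-speech-recognition",
        model="openai/whisper-large-v3-turbo",  # assumed checkpoint ID
        device=device,
    )
    return asr(
        audio_path,  # decoding a file path requires ffmpeg on the system
        return_timestamps="word",  # request word-level timestamps
        generate_kwargs={"repetition_penalty": repetition_penalty},
    )


def format_word_chunks(result):
    """Turn the pipeline's 'chunks' list into 'start-end<TAB>word' lines."""

    def fmt(t):
        # A boundary can be None, e.g. when the audio ends mid-word.
        return "?" if t is None else f"{t:.2f}"

    lines = []
    for chunk in result.get("chunks", []):
        start, end = chunk["timestamp"]
        lines.append(f"{fmt(start)}-{fmt(end)}\t{chunk['text'].strip()}")
    return lines
```

Running format_word_chunks(transcribe_with_word_timestamps("sample.wav")) on a short clip and attaching the printed lines, together with the audio, would make the timestamp and repetition problems much easier for maintainers to reproduce.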
Expected behavior
Accurate word timestamps.
No word repetitions in the transcription output.
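Both expectations can be checked roughly in code. The sketch below assumes the word chunks have the shape returned by the Transformers ASR pipeline, a list of dicts with "text" and "timestamp" keys; the helper names are hypothetical. It only flags internally inconsistent timestamps and runs of identical consecutive words. It cannot tell whether a timestamp matches the audio, and it will also flag legitimate repetitions such as "had had".

```python
def find_timestamp_anomalies(chunks):
    """Return (index, reason) pairs for chunks whose timestamps look inconsistent."""
    problems = []
    prev_end = 0.0
    for i, chunk in enumerate(chunks):
        start, end = chunk["timestamp"]
        if start is None or end is None:
            problems.append((i, "missing boundary"))
            continue
        if end < start:
            problems.append((i, "end before start"))
        if start < prev_end:
            problems.append((i, "overlaps previous word"))
        prev_end = max(prev_end, end)
    return problems


def find_repeated_words(chunks, min_run=2):
    """Return (start_index, word, run_length) for runs of identical consecutive words."""
    words = [c["text"].strip().lower() for c in chunks]
    runs = []
    i = 0
    while i < len(words):
        j = i
        # Extend j while the next word equals the word that started the run.
        while j + 1 < len(words) and words[j + 1] == words[i]:
            j += 1
        run_length = j - i + 1
        if words[i] and run_length >= min_run:
            runs.append((i, words[i], run_length))
        i = j + 1
    return runs
```

Reporting the indices these helpers return, alongside the affected audio segment, narrows down where the output diverges from the expected behavior.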