feat: add transcription with experimental_transcribe
#5496
Pull Request Overview
This PR introduces the foundations for transcribing audio by adding a new experimental_generateTranscript function along with the necessary implementation, tests, documentation, and integration in the AI SDK.
- Added comprehensive audio input conversion utilities.
- Implemented and integrated the generateTranscript function with the transcription provider.
- Enhanced test coverage and documentation for the new transcription capabilities.
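To make the first bullet concrete, here is a minimal sketch of what normalizing several audio input shapes to raw bytes could look like. It is not the code in convert-audio-input.ts: the `AudioInput` type and the `toAudioBytes` name are hypothetical, and treating string inputs as base64 is an assumption.

```typescript
// Hypothetical sketch; not the implementation added by this PR.
type AudioInput = string | Uint8Array | ArrayBuffer;

// Normalizes the supported input shapes to a Uint8Array.
// Assumption: string inputs carry base64-encoded audio data.
function toAudioBytes(audio: AudioInput): Uint8Array {
  if (typeof audio === 'string') {
    // atob yields one character per decoded byte (char codes 0-255).
    return Uint8Array.from(atob(audio), (char) => char.charCodeAt(0));
  }
  if (audio instanceof ArrayBuffer) {
    return new Uint8Array(audio);
  }
  // Already a Uint8Array (this also covers subclasses such as Node's Buffer).
  return audio;
}
```

Funnelling every input through one function like this would let provider code deal with a single byte representation, whatever the caller passed in.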
Reviewed Changes
Copilot reviewed 27 out of 31 changed files in this pull request and generated no comments.
File | Description |
---|---|
packages/provider-utils/src/convert-audio-input.ts | Adds audio input conversion supporting multiple formats. |
packages/openai/src/openai-transcription-settings.ts | Defines types for transcription settings. |
packages/openai/src/openai-transcription-model.ts | Implements transcription model logic and API interaction. |
packages/openai/src/openai-transcription-model.test.ts | Provides tests for transcription model behavior. |
packages/openai/src/openai-provider.ts | Integrates transcription into the OpenAI provider. |
packages/ai/errors/no-transcript-generated-error.ts | Introduces a custom error for missing transcript cases. |
packages/ai/core/types/transcription-model.ts | Defines types for transcription models (documentation comment needs updating). |
packages/ai/core/generate-transcript/* | Implements the generateTranscript function, result types, and tests. |
examples/ai-core/src/generate-transcript/openai.ts | Adds an example script for using the transcription feature. |
Files not reviewed (4)
- content/docs/03-ai-sdk-core/65-transcription.mdx: Language not supported
- content/docs/07-reference/01-ai-sdk-core/11-generate-transcript.mdx: Language not supported
- content/providers/01-ai-sdk-providers/02-openai.mdx: Language not supported
- packages/ai/tsconfig.vitest-temp.json: Language not supported
Comments suppressed due to low confidence (2)
packages/ai/core/types/transcription-model.ts:7
- The comment incorrectly references 'Image model' instead of 'Transcription model'. Please update the comment to accurately describe the transcription model.
* Image model that is used by the AI SDK Core functions.
packages/openai/src/openai-transcription-model.ts:48
- [nitpick] Consider using a dynamic filename derived from the original audio input rather than the hard-coded 'audio.wav' to improve accuracy and flexibility in file handling.
formData.append('file', file, 'audio.wav');
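One way to act on the nitpick above is to derive the filename from the media type of the audio. The sketch below is illustrative only: `audioFilename`, `buildTranscriptionForm`, and the extension table are hypothetical names and values, not part of the PR, and it assumes the audio has already been wrapped in a Blob whose `type` is set.

```typescript
// Hypothetical mapping of common audio media types to file extensions.
const EXTENSION_BY_MEDIA_TYPE: Record<string, string> = {
  'audio/wav': 'wav',
  'audio/mpeg': 'mp3',
  'audio/mp4': 'm4a',
  'audio/webm': 'webm',
  'audio/ogg': 'ogg',
  'audio/flac': 'flac',
};

// Picks a filename from the media type and falls back to 'audio.wav'
// (the currently hard-coded name) when the type is missing or unknown.
function audioFilename(mediaType: string | undefined): string {
  // Drop parameters such as "; codecs=opus" before the lookup.
  const baseType = (mediaType ?? '').split(';')[0].trim().toLowerCase();
  const extension = EXTENSION_BY_MEDIA_TYPE[baseType] ?? 'wav';
  return `audio.${extension}`;
}

// Builds the multipart body, naming the file after the Blob's own type.
function buildTranscriptionForm(file: Blob): FormData {
  const formData = new FormData();
  formData.append('file', file, audioFilename(file.type));
  return formData;
}
```

The fallback keeps today's behaviour for inputs with no declared type, so the change would only affect callers that supply one.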
Co-authored-by: Nico Albanese <[email protected]>
This PR creates the foundations for migrating Orate into the AI SDK, starting with transcribing audio. The changes span across documentation, implementation, and testing to support the new transcribe function.

Documentation Updates:
- content/docs/03-ai-sdk-core/36-transcription.mdx: Added a new documentation page for the transcription feature, including usage examples, settings, and error handling.
- content/docs/03-ai-sdk-core/index.mdx: Updated the index to include a link to the new transcription documentation.
- content/docs/07-reference/01-ai-sdk-core/11-transcribe.mdx: Added an API reference page for the transcribe function, detailing parameters, return values, and examples.
- content/providers/01-ai-sdk-providers/02-openai.mdx: Documented the OpenAI transcription models and their capabilities.

New API Implementation:
- packages/ai/core/generate-transcript/generate-transcript-result.ts: Defined the TranscriptionResult interface to structure the transcription output.
- examples/ai-core/src/generate-transcript/openai.ts: Added an example script demonstrating how to use the transcription feature with OpenAI.

Test Coverage:
- packages/ai/core/generate-transcript/generate-transcript.test.ts: Implemented tests for the transcribe function, covering argument handling, warnings, transcript generation, and error scenarios.

These changes collectively add a robust transcription feature to the AI SDK, complete with detailed documentation and thorough testing.