| 1 | -# openai-realtime-api-nodejs-dashboard |
| 2 | - A NodeJS Frontend + Backend application that works as a chatbot with the new openai realtime api. |
| 1 | +# OpenAI Realtime API Node.js Dashboard |
| 2 | + |
| 3 | +OpenAI's documentation doesn't offer an easy way to test the new Realtime API with a working frontend, so I thought I would take a crack at making one. |
| 4 | + |
| 5 | +The app works as a chatbot that uses the new Realtime API's WebSocket interface to keep latency as low as possible. You can either type messages to the chatbot or enable 'Conversation Mode' to simulate a real-time conversation. Socket.IO handles communication between the server and the browser. |
| 6 | + |
| 7 | +## Feature Overview |
| 8 | + |
| 9 | +- Real-Time Conversation: Chat with an AI assistant in real time, and customise it to your liking. |
| 10 | +- Audio Transcription: Transcribe user input via voice. |
| 11 | +- Audio Playback: The assistant speaks responses using the integrated audio streaming. |
| 12 | +- Responsive User Interface: A simple, user-friendly chat interface. |
| 13 | +- Toggle Voice Mode: Enable/disable conversation mode to switch between text and voice input. |
| 14 | + |
| 15 | +## Demo |
| 16 | + |
| 17 | +Include a screenshot showing the app interface |
| 18 | + |
| 19 | +## Technologies Used |
| 20 | + |
| 21 | +- Node.js: Backend server for handling API calls and Socket.io connections. |
| 22 | +- Express.js: Web framework for serving static files and handling HTTP requests. |
| 23 | +- Socket.io: Real-time, bidirectional communication between the client and server. |
| 24 | +- OpenAI Realtime API: Provides AI capabilities for real-time conversation and speech synthesis. |
| 25 | +- JavaScript: Used for both server and client-side logic. |
| 26 | +- HTML/CSS: Frontend structure and styling (with EJS). |
| 27 | + |
| 28 | +## Getting Started |
| 29 | + |
| 30 | +### Prerequisites |
| 31 | + |
| 32 | +- Node.js (v14.x or later) |
| 33 | +- npm or yarn |
| 34 | +- An OpenAI API key (with access to the Realtime API) |
| 35 | + |
| 36 | +### Installation |
| 37 | + |
| 38 | +1. Clone the Repository |
| 39 | + |
| 40 | +```bash |
| 41 | +git clone https://github.com/yourusername/openai-realtime-api-nodejs-dashboard.git |
| 42 | +cd openai-realtime-api-nodejs-dashboard |
| 43 | +``` |
| 44 | + |
| 45 | +2. Install Dependencies |
| 46 | + |
| 47 | +```bash |
| 48 | +npm install |
| 49 | +``` |
| 50 | + |
| 51 | +3. Set Up Environment Variables |
| 52 | + Create a .env file in the root directory and add your OpenAI API key: |
| 53 | + |
| 54 | +```bash |
| 55 | +OPENAI_API_KEY=your_openai_api_key |
| 56 | +``` |
| 57 | + |
| 58 | +4. Run the Application (defaults to port 3000) |
| 59 | + |
| 60 | +```bash |
| 61 | +npm start |
| 62 | +``` |
| 63 | + |
| 64 | +Or for development with live reloading, use nodemon: |
| 65 | + |
| 66 | +```bash |
| 67 | +npm run dev |
| 68 | +``` |
| 69 | + |
| 70 | +5. Access the Application |
| 71 | + Open your browser and navigate to http://localhost:3000. |
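A missing or placeholder API key (step 3) is an easy mistake to make, so a server could check it before accepting connections. The following is a minimal sketch, not code from this repository: `requireApiKey` is a hypothetical helper, and it assumes the contents of `.env` have already been loaded into `process.env` (for example by a package such as dotenv, which this README does not confirm is used).

```javascript
// Hypothetical helper (not part of the repository): fail fast when the key
// from step 3 is absent or still set to the README's placeholder value.
function requireApiKey(env) {
  const key = (env.OPENAI_API_KEY || '').trim();
  if (!key || key === 'your_openai_api_key') {
    throw new Error(
      'OPENAI_API_KEY is missing or still set to the placeholder; check your .env file.'
    );
  }
  return key;
}

// Usage in a server entry point, assuming .env is already loaded:
// const apiKey = requireApiKey(process.env);
```

Failing at startup gives a clearer error than a rejected connection to the Realtime API later on.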
| 72 | + |
| 73 | +## File Structure |
| 74 | + |
| 75 | +```bash |
| 76 | +openai-realtime-api-nodejs-dashboard/ |
| 77 | +| |
| 78 | +|-- public/ # Frontend files |
| 79 | +| |-- wavtools/ # Assets for speech recognition/synthesis. |
| 80 | +| |-- dashboard.js # Client-side JavaScript |
| 81 | +| `-- style.css # Styling for the application |
| 82 | +| |
| 83 | +|-- views/ # Frontend HTML files |
| 84 | +| `-- index.ejs |
| 85 | +| |
| 86 | +|-- .env # Environment variables (not included in version control) |
| 87 | +|-- server.js # Main server file |
| 88 | +|-- package.json # Project metadata and dependencies |
| 89 | +`-- README.md # Project documentation |
| 90 | +``` |
| 91 | + |
| 92 | +## Usage |
| 93 | + |
| 94 | +- Chat Interaction: Type a message in the input field and press "Send" or use the microphone button |
| 95 | + to enable/disable voice conversation mode. |
| 96 | +- Audio Playback: The assistant speaks its responses when voice output is enabled. If you hear nothing, check the site's sound and autoplay permissions in your browser settings (for example, in Chrome). |
| 97 | +- Responsive Display: The conversation log updates in real-time, displaying both user input and |
| 98 | + assistant responses. |
| 99 | + |
| 100 | +## Customisation |
| 101 | + |
| 102 | +- Update Instructions: Modify server.js to customise the instructions given to the AI. Update any other settings to your preference. |
| 103 | +```js |
| 104 | +client.updateSession({ |
| 105 | + instructions: 'You are a helpful, English-speaking assistant.', |
| 106 | + voice: 'alloy', |
| 107 | + turn_detection: { type: 'server_vad', threshold: 0.3 }, |
| 108 | + output_audio_format: 'pcm16', // 16-bit PCM audio output |
| 109 | + input_audio_transcription: { model: 'whisper-1' }, |
| 110 | +}); |
| 111 | +``` |
| 112 | +- Styling: Change the style.css file to update the appearance of the chat interface. |
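If you change the session settings often, it can help to keep the defaults in one place and override only what differs. The sketch below is one possible approach, not the project's actual code: `DEFAULT_SESSION` and `buildSessionSettings` are hypothetical names, and the defaults mirror only a subset of the settings shown in the snippet above.

```javascript
// Hypothetical defaults based on the README's session settings; adjust to taste.
const DEFAULT_SESSION = {
  instructions: 'You are a helpful, English-speaking assistant.',
  voice: 'alloy',
  turn_detection: { type: 'server_vad', threshold: 0.3 },
  input_audio_transcription: { model: 'whisper-1' },
};

// Merge caller overrides onto the defaults. The nested turn_detection object
// is merged key by key so a partial override keeps the other defaults.
// An explicit null for turn_detection is passed through unchanged.
function buildSessionSettings(overrides = {}) {
  const turnDetection =
    overrides.turn_detection === null
      ? null
      : { ...DEFAULT_SESSION.turn_detection, ...(overrides.turn_detection || {}) };
  return { ...DEFAULT_SESSION, ...overrides, turn_detection: turnDetection };
}

// Usage, assuming `client` is the Realtime client created in server.js:
// client.updateSession(buildSessionSettings({ voice: 'echo' }));
```

The helper returns a new object each time, so the defaults are never mutated between sessions.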
| 113 | + |
| 114 | +## Troubleshooting |
| 115 | + |
| 116 | +- Server Errors: Ensure the OpenAI API key is valid and that your environment variables are |
| 117 | + correctly set. |
| 118 | +- Audio Issues: Verify that your browser allows audio playback and microphone access for http://localhost:3000. |
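A common cause of silent playback is the browser's autoplay policy, which can leave a Web Audio `AudioContext` suspended until the user interacts with the page. The sketch below shows one way to handle that. It assumes playback goes through an `AudioContext`, which this README does not confirm, and `unlockAudioOnGesture` is a hypothetical helper rather than part of the project.

```javascript
// Hypothetical helper: resume a suspended AudioContext on the first click.
// `target` is any object with addEventListener (e.g. document).
function unlockAudioOnGesture(target, audioContext) {
  const resume = () => {
    // Only call resume() when the context has not started yet.
    if (audioContext.state === 'suspended') {
      audioContext.resume();
    }
  };
  // { once: true } removes the listener after the first click.
  target.addEventListener('click', resume, { once: true });
}

// Usage in client-side code such as dashboard.js:
// unlockAudioOnGesture(document, new AudioContext());
```

Tying the resume call to a click keeps it inside a user gesture, which is what autoplay policies generally require.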
| 119 | + |
| 120 | +## Contributing |
| 121 | + |
| 122 | +Contributions are welcome! If you'd like to improve the code or add new features, please submit a pull request. |
| 123 | + |
| 124 | +## License |
| 125 | + |
| 126 | +This project is licensed under the MIT License. See the LICENSE file for more details. |