
[FR] Support OpenAI compatible API endpoint for local AI inference #7686

Open
rampa3 opened this issue Apr 4, 2025 · 0 comments

rampa3 commented Apr 4, 2025

Description

I suggest adding support for an OpenAI-compatible API endpoint for local AI inference. The main reasoning is that while Ollama support is useful for some users, the majority of the target audience of local AI users will likely already have a local AI inference API running for use with other apps, and unless they have been Ollama users from the beginning, that local API will be an OpenAI-compatible endpoint, as it is the most widespread AI API type. I think incompatibility with the most common AI API type is a major obstacle slowing down adoption of the newly added free local AI option.

Impact

  • current users of local inference APIs will be able to keep using their existing inference API without having to install a secondary API server and install models twice, as Ollama cannot share its model library with other inference server solutions
  • new users of local inference APIs will be able to pick more user-friendly server apps that serve the more widely used type of inference API, and will be able to reuse the API server used with AppFlowy with other apps

Additional Context

Links to related AppFlowy Discord conversations (plus comments on the messages):

Extra context:

  • Ollama itself added an OpenAI-compatible API to be compatible with software not written exclusively for it (see the sketch below):
    https://ollama.com/blog/openai-compatibility
    Quote from the linked blog post: "Ollama now has built-in compatibility with the OpenAI Chat Completions API, making it possible to use more tooling and applications with Ollama locally."
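
To make the request concrete, here is a minimal sketch of what "OpenAI-compatible" means in practice: the same standard client code can talk to LocalAI, Ollama's compatibility layer, or any other OpenAI-compatible server just by changing the base URL. The base URLs, the model name, and the API key value below are assumptions for illustration, not AppFlowy's actual configuration.

```python
# Minimal sketch: one client, any OpenAI-compatible local server.
# Assumed defaults (adjust to your setup): LocalAI at http://localhost:8080/v1,
# Ollama's OpenAI-compatible layer at http://localhost:11434/v1.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8080/v1",  # swap to http://localhost:11434/v1 for Ollama
    api_key="not-needed-locally",         # local servers typically ignore the key, but the client requires one
)

# Standard Chat Completions call; "llama3" is a placeholder for whatever
# model name the local server exposes.
response = client.chat.completions.create(
    model="llama3",
    messages=[{"role": "user", "content": "Summarize this note in one sentence."}],
)
print(response.choices[0].message.content)
```

Because only base_url (and possibly the model name) differs between servers, supporting a configurable OpenAI-compatible endpoint would cover LocalAI, Ollama, and other local inference servers with a single code path.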

Personal case to illustrate the storage constraints problem:

As a user of local AI models, I had been looking forward to AppFlowy's local AI implementation since its announcement, but its being limited to Ollama as the inference API server is sadly a dealbreaker: I already run a LocalAI instance and don't want to run two APIs side by side due to storage constraints. I have quite a large number of models installed in the LocalAI instance, which means its model folder is quite big. Having to install my text generation models and embedders a second time in Ollama would take up the space for these models twice, once in LocalAI for OpenAI-compatible tools and once in Ollama for AppFlowy. Doing this would mean having at minimum (counting only one text model plus an embedder) around 5 GiB of duplicate files in my storage, cutting into the free space for installing apps, storing documents, and system files.
