- Gen AI stuff
- A lightweight OpenAI-API-compatible server: the av_connect HTTP server in C++ (see the example request after this list)
- Text generation: llama.cpp
- Web UI: a simple web interface for exploring and experimenting (borrowed from the llama.cpp project)
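Because the server aims to be OpenAI-API compatible, it can be queried with a standard OpenAI-style request. The sketch below is only an illustration: it assumes the server listens on port 8080 (as in the Docker instructions later in this README) and exposes the usual /v1/chat/completions route; adjust the port, path, and payload to match your build.

```sh
# Minimal chat request, assuming an OpenAI-style /v1/chat/completions endpoint.
curl http://127.0.0.1:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "messages": [
          {"role": "user", "content": "Hello, what can you do?"}
        ]
      }'
```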
Obtain the latest container image from Docker Hub:

**Note: the Docker image is currently quite outdated.**

```sh
docker image pull harryavble/av_llm
```
Then access the web interface at http://127.0.0.1:8080.
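Once a container is running (see the docker run command further down), a quick way to confirm the server is reachable is to request the web UI root and check the HTTP status code:

```sh
# Expect a 200 status code if the server is up and serving the web UI.
curl -s -o /dev/null -w "%{http_code}\n" http://127.0.0.1:8080
```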
- LLaMA 1
- LLaMA 2
- LLaMA 3
- Mistral-7B
- Mixtral MoE
- DBRX
- Falcon
- Chinese-LLaMA-Alpaca

This application is built on top of llama.cpp, so it should work with any model that llama.cpp supports.
Run the container with your host model folder mounted:

```sh
docker run -p 8080:8080 -v $your_host_model_folder:/work/model av_llm ./av_llm -m /work/model/$your_model_file
```
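As a concrete illustration, assuming a GGUF file named llama-2-7b-chat.Q4_K_M.gguf has been downloaded into ~/models on the host (both names are placeholders, not files shipped with this project):

```sh
# Hypothetical invocation; substitute your own folder and model file names.
docker run -p 8080:8080 -v ~/models:/work/model av_llm ./av_llm -m /work/model/llama-2-7b-chat.Q4_K_M.gguf
```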
Or build and run from source:

```sh
$ cmake -B build && cmake --build build
$ build/av_llm -m <path to gguf file>
```
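If you do not yet have a GGUF file, one option is the Hugging Face CLI; the repository and file names below are only an example and are not endorsed or tested by this project:

```sh
# Download a quantized GGUF model (example repo/file names) and start the server with it.
pip install -U "huggingface_hub[cli]"
huggingface-cli download TheBloke/Llama-2-7B-Chat-GGUF llama-2-7b-chat.Q4_K_M.gguf --local-dir ./models
build/av_llm -m ./models/llama-2-7b-chat.Q4_K_M.gguf
```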
Once running, the web UI shown below should be available.
- Support more LLM models
- Support more of the OpenAI API
- Support more applications
This is a demonstration version; some error handling and validation are not yet complete.
Contact me at avble.harry dot gmail.com if you have any questions or issues.