# Windows llama.cpp

Some PowerShell automation to rebuild [llama.cpp](https://github.com/ggerganov/llama.cpp) for a Windows environment.

## Installation

### 1. Install Prerequisites

Download and install the latest versions of:

* [CMake](https://cmake.org/download/)
* [CUDA](https://developer.nvidia.com/cuda-downloads)
* [Git Large File Storage](https://git-lfs.com)
* [Git](https://git-scm.com/downloads)
* [Miniconda](https://conda.io/projects/conda/en/stable/user-guide/install)
* [Visual Studio 2022 - Community](https://visualstudio.microsoft.com/)
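
As a quick sanity check (not part of the original instructions; sketched for a POSIX-style shell such as the Git Bash that ships with Git for Windows), you can confirm that each command line tool is reachable on your `PATH`:

```shell
# Report which prerequisite tools are reachable on the PATH;
# "NOT FOUND" indicates an incomplete installation.
for tool in cmake nvcc git git-lfs conda; do
  if command -v "$tool" >/dev/null 2>&1; then
    echo "$tool: found"
  else
    echo "$tool: NOT FOUND"
  fi
done
```

In PowerShell the equivalent check is `Get-Command cmake` (and so on for each tool).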

### 2. Clone the repository from GitHub

Clone the repository to a nice place on your machine via:

```Shell
git clone --recurse-submodules git@github.com:countzero/windows_llama.cpp.git
```

### 3. Update the llama.cpp submodule to the latest version (optional)

This repository can reference an outdated version of the llama.cpp submodule. To update the submodule to the latest version, execute the following:

```Shell
git submodule update --remote --merge
```

Then add, commit and push the changes to make the update available for others:

```Shell
git add --all; git commit -am "Update llama.cpp submodule to latest commit"; git push
```

**Hint:** This step is optional because the build script always pulls the latest version.

### 4. Create a new Conda environment

Create a new Conda environment for this project with a specific version of Python:

```Shell
conda create --name llama.cpp python=3.10
```

### 5. Initialize Conda for shell interaction

To make Conda available in your current shell, execute the following:

```Shell
conda init
```

**Hint:** You can always revert this via `conda init --reverse`.

### 6. Execute the build script

To build llama.cpp binaries for a Windows environment with CUDA support, execute the script:

```PowerShell
./rebuild_llama.cpp.ps1
```

### 7. Download a large language model

Download a large language model (LLM) with weights in the GGML format into the `./vendor/llama.cpp/models` directory. For example, you can download the [open-llama-7b](https://huggingface.co/openlm-research/open_llama_7b) model in a quantized GGML format:

* https://huggingface.co/TheBloke/open-llama-7b-open-instruct-GGML/resolve/main/open-llama-7B-open-instruct.ggmlv3.q4_K_M.bin

**Hint:** See the [🤗 Open LLM Leaderboard](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard) for best-in-class open source LLMs.

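
The download step above can also be scripted. A minimal sketch for a POSIX-style shell (the URL and filename are the examples from this section; any GGML model URL works the same way):

```shell
# Derive the target path for a model download inside the llama.cpp submodule.
MODEL_URL="https://huggingface.co/TheBloke/open-llama-7b-open-instruct-GGML/resolve/main/open-llama-7B-open-instruct.ggmlv3.q4_K_M.bin"
MODEL_DIR="./vendor/llama.cpp/models"
MODEL_FILE="$MODEL_DIR/$(basename "$MODEL_URL")"

mkdir -p "$MODEL_DIR"
echo "Target file: $MODEL_FILE"

# Resume-capable download (uncomment to fetch the multi-gigabyte file):
# curl -L --continue-at - --output "$MODEL_FILE" "$MODEL_URL"
```

In PowerShell, `Invoke-WebRequest -Uri $ModelUrl -OutFile $ModelFile` serves the same purpose.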
## Usage

### Chat

You can now chat with the model:

```PowerShell
./vendor/llama.cpp/build/bin/Release/main `
    --model "./vendor/llama.cpp/models/open-llama-7B-open-instruct.ggmlv3.q4_K_M.bin" `
    --ctx-size 2048 `
    --n-predict 2048 `
    --threads 16 `
    --n-gpu-layers 10 `
    --reverse-prompt '[[USER_NAME]]:' `
    --file "./vendor/llama.cpp/prompts/chat-with-vicuna-v1.txt" `
    --color `
    --interactive
```

### Rebuild llama.cpp

Every time there is a new release of [llama.cpp](https://github.com/ggerganov/llama.cpp), you can simply execute the script to rebuild the binaries and update the Python dependencies:

```PowerShell
./rebuild_llama.cpp.ps1
```