diff --git a/README.md b/README.md
index d06102e..8a304ae 100644
--- a/README.md
+++ b/README.md
@@ -2,13 +2,6 @@
 
 A quick prototype to self-host [Open WebUI](https://docs.openwebui.com/) backed by [Ollama](https://ollama.com/) to run LLM inference locally.
 
-## Goals
-
-* Streamline deployment of a local LLM for experimentation purpose.
-* Deploy a ChatGPT Clone for daily use.
-* Deploy an OpenAI-like API for hacking on Generative AI using well-supported libraries.
-* Use docker to prepare for an eventual deployment on a container orchestration platform like Kubernetes.
-
 ## Getting started
 
 ### Prerequisites
@@ -21,18 +14,18 @@
 1. Make sure your drivers are up to date.
 2. Install the [NVIDIA Container Toolkit](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html).
 3. Clone the repo.
-4. Copy the NVIDIA compose spec to select it. `cp docker-compose.nvidia.yml docker.compose.yml`
+4. Symlink the NVIDIA compose spec to select it. `ln -s docker-compose.nvidia.yml docker.compose.yml`
 5. Run `docker compose up`. Wait for a few minutes for the model to be downloaded and served.
 6. Browse http://localhost:8080/
 7. Create an account and start chatting!
 
 ### Steps for AMD GPU
 
-**Warning: AMD was not tested on Windows.**
+**Warning: AMD GPUs aren't supported on Windows at the moment. Use Linux.**
 
 1. Make sure your drivers are up to date.
 2. Clone the repo.
-3. Copy the AMD compose spec to select it. `cp docker-compose.amd.yml docker.compose.yml`
+3. Symlink the AMD compose spec to select it. `ln -s docker-compose.amd.yml docker.compose.yml`
 4. Run `docker compose up`. Wait for a few minutes for the model to be downloaded and served.
 5. Browse http://localhost:8080/
 6. Create an account and start chatting!
@@ -43,39 +36,11 @@
 
 1. Make sure your drivers are up to date.
 2. Clone the repo.
-3. Copy the CPU compose spec to select it. `cp docker-compose.cpu.yml docker.compose.yml`
+3. Symlink the CPU compose spec to select it. `ln -s docker-compose.cpu.yml docker.compose.yml`
 4. Run `docker compose up`. Wait for a few minutes for the model to be downloaded and served.
 5. Browse http://localhost:8080/
 6. Create an account and start chatting!
 
-## Configuring additional models
-
-### Self-hosted (Ollama)
-
-Browse the [Ollama models library](https://ollama.ai/library) to find a model you wish to add. For this example we will add [gemma](https://ollama.com/library/gemma)
-
-#### Configuring via the command-line
-``` sh
-docker compose exec ollama ollama pull gemma
-```
-
-### External providers (OpenAI, Mistral, Anthropic, etc.)
-
-External providers can be configured through a [LiteLLM](https://github.com/BerriAI/litellm) instance embedded into open-webui. A full list of supported providers, and how to configure them, can be found in the [documentation](https://docs.litellm.ai/docs/providers).
-
-Let say we want to configure gpt-3.5-turbo with an OpenAI API key.
-
-#### Configuring via a config file
-1. Open the file *./litellm/config.yaml* in your editor.
-2. Add an entry under `model_list`:
-   ``` yaml
-   model_list:
-     - model_name: gpt-3.5-turbo
-       litellm_params:
-         model: gpt-3.5-turbo
-         api_key: <your OpenAI API key>
-   ```
-3. Run `docker compose restart open-webui` to restart Open WebUI.
 
 ## Using the API
 
@@ -121,6 +86,21 @@ curl -H "Authorization: Bearer <token>" http://localhost:808
 
 The JWT token can be used in place of the OpenAI API key for OpenAI-compatible libraries/applications.
 
+## Update
+
+Run `docker compose pull` followed by `docker compose up -d` to recreate the containers with the updated images.
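+
+A minimal sketch of the update sequence (plain Docker Compose commands, assuming the compose spec selected during setup is still in place):
+
+``` sh
+# Fetch the latest images referenced by the compose spec.
+docker compose pull
+
+# Recreate the containers so they actually run the new images;
+# a plain `docker compose restart` would keep the old ones.
+docker compose up -d
+```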
+
 ## Alternatives
 
 Check out [LM Studio](https://lmstudio.ai/) for a more integrated, but non web-based alternative!