update readme

update compose
2024-11-12 23:27:30 -05:00 · 2024-11-12 23:27:20 -05:00
7 changed files with 35 additions and 82 deletions
--- a/README.md
+++ b/README.md
@@ -2,13 +2,6 @@

 A quick prototype to self-host [Open WebUI](https://docs.openwebui.com/) backed by [Ollama](https://ollama.com/) to run LLM inference locally.

-## Goals
-
-* Streamline deployment of a local LLM for experimentation purpose.
-* Deploy a ChatGPT Clone for daily use.
-* Deploy an OpenAI-like API for hacking on Generative AI using well-supported libraries.
-* Use docker to prepare for an eventual deployment on a container orchestration platform like Kubernetes.
-
 ## Getting started

 ### Prerequisites
@@ -21,18 +14,18 @@ A quick prototype to self-host [Open WebUI](https://docs.openwebui.com/) backed
 1. Make sure your drivers are up to date.
 2. Install the [NVIDIA Container Toolkit](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html).
 3. Clone the repo.
-4. Copy the NVIDIA compose spec to select it. `cp docker-compose.nvidia.yml docker.compose.yml`
+4. Symlink the NVIDIA compose spec to select it. `ln -s docker-compose.nvidia.yml docker-compose.yml`
 5. Run `docker compose up`. Wait for a few minutes for the model to be downloaded and served.
 6. Browse http://localhost:8080/
 7. Create an account and start chatting!

 ### Steps for AMD GPU

-**Warning: AMD was not tested on Windows.**
+**Warning: AMD *doesn't* support Windows at the moment. Use Linux.**

 1. Make sure your drivers are up to date.
 2. Clone the repo.
-3. Copy the AMD compose spec to select it. `cp docker-compose.amd.yml docker.compose.yml`
+3. Symlink the AMD compose spec to select it. `ln -s docker-compose.amd.yml docker-compose.yml`
 4. Run `docker compose up`. Wait for a few minutes for the model to be downloaded and served.
 5. Browse http://localhost:8080/
 6. Create an account and start chatting!
@@ -43,39 +36,11 @@ A quick prototype to self-host [Open WebUI](https://docs.openwebui.com/) backed

 1. Make sure your drivers are up to date.
 2. Clone the repo.
-3. Copy the CPU compose spec to select it. `cp docker-compose.cpu.yml docker.compose.yml`
+3. Symlink the CPU compose spec to select it. `ln -s docker-compose.cpu.yml docker-compose.yml`
 4. Run `docker compose up`. Wait for a few minutes for the model to be downloaded and served.
 5. Browse http://localhost:8080/
 6. Create an account and start chatting!

-## Configuring additional models
-
-### Self-hosted (Ollama)
-
-Browse the [Ollama models library](https://ollama.ai/library) to find a model you wish to add. For this example we will add [gemma](https://ollama.com/library/gemma)
-
-#### Configuring via the command-line
-``` sh
-docker compose exec ollama ollama pull gemma
-```
-
-### External providers (OpenAI, Mistral, Anthropic, etc.)
-
-External providers can be configured through a [LiteLLM](https://github.com/BerriAI/litellm) instance embedded into open-webui. A full list of supported providers, and how to configure them, can be found in the [documentation](https://docs.litellm.ai/docs/providers).
-
-Let say we want to configure gpt-3.5-turbo with an OpenAI API key.
-
-#### Configuring via a config file
-1. Open the file *./litellm/config.yaml* in your editor.
-2. Add an entry under `model_list`:
-   ``` yaml
-   model_list:
-     - model_name: gpt-3.5-turbo
-       litellm_params:
-         model: gpt-3.5-turbo
-         api_key: <put your OpenAI API key here>
-   ```
-3. Run `docker compose restart open-webui` to restart Open WebUI.

 ## Using the API

@@ -121,6 +86,10 @@ curl -H "Authorization: Bearer <Paste your JWT token here>" http://localhost:808

 The JWT token can be used in place of the OpenAI API key for OpenAI-compatible libraries/applications.

+## Update
+
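+Run `docker compose pull` followed by `docker compose up -d`. A plain `docker compose restart` restarts the existing containers without recreating them from the newly pulled images.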
+
 ## Alternatives

 Check out [LM Studio](https://lmstudio.ai/) for a more integrated, but non web-based alternative!
--- a/docker-compose.amd.yml
+++ b/docker-compose.amd.yml
@@ -6,11 +6,9 @@ services:
  ollama:
    image: ollama/ollama:rocm
    restart: unless-stopped
-    entrypoint: /bootstrap.sh
-    command: mistral
-    network_mode: service:open-webui
-    environment:
-      OLLAMA_HOST: http://localhost:11434
+    ports: 
+      - 11434:11434
+
    # begin for AMD GPU support
    devices:
      - /dev/kfd
@@ -22,12 +20,13 @@ services:
      - SYS_PTRACE
    security_opt:
      - seccomp=unconfined
-    environment:
-      # https://github.com/ROCm/ROCm/issues/2625
-      GPU_MAX_HW_QUEUES: 1
-      # https://github.com/ROCm/ROCm/issues/2788#issuecomment-1915765846
-      # HSA_OVERRIDE_GFX_VERSION: 11.0.0
+    # environment:
+    #   # https://github.com/ROCm/ROCm/issues/2788#issuecomment-1915765846
+    #   HSA_OVERRIDE_GFX_VERSION: 11.0.0
    # end of section for AMD GPU support
+
    volumes:
-      - ./ollama/bootstrap.sh:/bootstrap.sh:ro
-      - ./ollama:/root/.ollama
+      - ollama_data:/root/.ollama
+
+volumes:
+  ollama_data:
--- a/docker-compose.base.yml
+++ b/docker-compose.base.yml
@@ -18,13 +18,12 @@ services:
    image: ghcr.io/open-webui/open-webui:main
    ports:
      - 8080:8080
-      - 11434:11434
    environment:
-      OLLAMA_BASE_URL: http://localhost:11434
+      OLLAMA_BASE_URL: http://ollama:11434
+      WEBUI_AUTH: "False"
    extra_hosts:
      - host.docker.internal:host-gateway
    volumes:
-      - ./litellm/config.yaml:/app/backend/data/litellm/config.yaml
      - open-webui_data:/app/backend/data

 volumes:
--- a/docker-compose.cpu.yml
+++ b/docker-compose.cpu.yml
@@ -6,11 +6,10 @@ services:
  ollama:
    image: ollama/ollama:latest
    restart: unless-stopped
-    entrypoint: /bootstrap.sh
-    command: mistral
-    network_mode: service:open-webui
-    environment:
-      OLLAMA_HOST: http://localhost:11434
+    ports: 
+      - 11434:11434
    volumes:
-      - ./ollama/bootstrap.sh:/bootstrap.sh:ro
-      - ./ollama:/root/.ollama
+      - ollama_data:/root/.ollama
+
+volumes:
+  ollama_data:
--- a/docker-compose.nvidia.yml
+++ b/docker-compose.nvidia.yml
@@ -6,11 +6,9 @@ services:
  ollama:
    image: ollama/ollama:latest
    restart: unless-stopped
-    entrypoint: /bootstrap.sh
-    command: mistral
-    network_mode: service:open-webui
-    environment:
-      OLLAMA_HOST: http://localhost:11434
+    ports:
+      - 11434:11434
+
    # begin for NVIDIA GPU support
    deploy:
      resources:
@@ -20,6 +18,9 @@ services:
              count: 1
              capabilities: [gpu]
    # end of section for NVIDIA GPU support
+
    volumes:
-      - ./ollama/bootstrap.sh:/bootstrap.sh:ro
-      - ./ollama:/root/.ollama
+      - ollama_data:/root/.ollama
+
+volumes:
+  ollama_data:
--- a/ollama/.gitignore
+++ b/ollama/.gitignore
@@ -1,3 +0,0 @@
-*
-!.gitignore
-!bootstrap.sh
--- a/ollama/bootstrap.sh
+++ b/ollama/bootstrap.sh
@@ -1,11 +0,0 @@
-#!/bin/bash -x
-
-ollama serve &
-
-sleep 1
-
-for model in ${@:-mistral}; do
-    ollama pull "$model"
-done
-
-wait
Author	SHA1	Message	Date
Massaki Archambault	e359def1cc	update readme	2024-11-12 23:27:30 -05:00
Massaki Archambault	5b8ec8fc3d	update compose	2024-11-12 23:27:20 -05:00
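
Note on setup after these commits: with `ollama/bootstrap.sh` removed, nothing in the compose files appears to pull a model at startup any more, so a model probably has to be pulled by hand once the stack is running. The sketch below wraps the two manual steps (selecting a compose spec by symlink, then pulling models). The helper names `select_compose` and `pull_models` and the `RUNNER` variable are hypothetical and not part of the repo. It also assumes the link should be named `docker-compose.yml`, one of the file names Docker Compose looks for by default.

```sh
#!/bin/sh
# Sketch only: select_compose, pull_models and RUNNER are hypothetical names.

# Point docker-compose.yml at one of the variant specs (nvidia, amd, cpu).
select_compose() {
    variant="$1"
    src="docker-compose.${variant}.yml"
    if [ ! -f "$src" ]; then
        echo "no such compose spec: $src" >&2
        return 1
    fi
    # -f replaces an existing link so the variant can be switched later.
    ln -sf "$src" docker-compose.yml
}

# Pull models into the running ollama service.
# Defaults to mistral, the model the removed bootstrap.sh pulled.
pull_models() {
    if [ "$#" -eq 0 ]; then
        set -- mistral
    fi
    for model in "$@"; do
        # Left unquoted on purpose so the runner splits into command + args.
        ${RUNNER:-docker compose exec ollama} ollama pull "$model" || return 1
    done
}
```

Possible usage, assuming the services are started in the background first: `select_compose nvidia && docker compose up -d && pull_models mistral`. Setting `RUNNER=echo` prints the commands instead of running them, which is handy for a dry run.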