
update readme

Massaki Archambault 2024-11-13 20:12:23 -05:00
parent 91d421848d
commit 0c64372a0b
1 changed file with 59 additions and 15 deletions


@@ -15,9 +15,9 @@ A quick prototype to self-host [Open WebUI](https://docs.openwebui.com/) backed
2. Install the [NVIDIA Container Toolkit](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html).
3. Clone the repo.
4. Symlink the NVIDIA compose spec to select it. `ln -s docker-compose.nvidia.yml docker.compose.yml`
5. Run `docker compose up`. Wait for a few minutes for the model to be downloaded and served.
5. Run `docker compose up`.
6. Browse http://localhost:8080/
7. Create an account and start chatting!
7. Add a model and start chatting!
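Taken together, the NVIDIA path is a short command sequence. A minimal sketch, assuming the repo URL placeholder below is replaced with the real one and that the NVIDIA Container Toolkit mounts `nvidia-smi` into the container (its default behaviour):
``` sh
# clone the repo (placeholder URL) and select the NVIDIA compose spec
git clone <this-repo-url> && cd <this-repo>
ln -s docker-compose.nvidia.yml docker.compose.yml

# start the stack in the background (drop -d to follow the logs in the foreground)
docker compose up -d

# optional sanity check: the GPU should be visible inside the ollama container
docker compose exec ollama nvidia-smi
```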
### Steps for AMD GPU
@@ -26,9 +26,9 @@ A quick prototype to self-host [Open WebUI](https://docs.openwebui.com/) backed
1. Make sure your drivers are up to date.
2. Clone the repo.
3. Symlink the AMD compose spec to select it. `ln -s docker-compose.amd.yml docker.compose.yml`
4. Run `docker compose up`. Wait for a few minutes for the model to be downloaded and served.
4. Run `docker compose up`.
5. Browse http://localhost:8080/
6. Create an account and start chatting!
6. Add a model and start chatting!
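As a quick sanity check on the AMD path, you can verify that the ROCm device nodes reach the *ollama* container. A hedged sketch, assuming the AMD compose spec passes through `/dev/kfd` and `/dev/dri` as ROCm images normally require:
``` sh
# the ROCm device nodes should be visible inside the ollama container
docker compose exec ollama ls -l /dev/kfd /dev/dri

# the ollama logs should report that an AMD GPU was detected
docker compose logs -f ollama
```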
### Steps for NO GPU (use CPU)
@@ -37,14 +37,56 @@ A quick prototype to self-host [Open WebUI](https://docs.openwebui.com/) backed
1. Make sure your drivers are up to date.
2. Clone the repo.
3. Symlink the CPU compose spec to select it. `ln -s docker-compose.cpu.yml docker.compose.yml`
4. Run `docker compose up`. Wait for a few minutes for the model to be downloaded and served.
4. Run `docker compose up`.
5. Browse http://localhost:8080/
6. Create an account and start chatting!
6. Add a model and start chatting!
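Whichever compose spec you picked, you can confirm the stack came up by checking the service status and hitting the Ollama API directly. A small sketch, assuming port 11434 is published as described in the API section below:
``` sh
# list the services of the compose project and their state
docker compose ps

# the Ollama API answers once the service is ready
curl http://localhost:11434/api/version
```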
## Adding models
Ollama makes it easy to download and start using new LLM models. Its structure is quite similar to `docker`'s, so using it should feel familiar if you have used Docker before. A list of available models can be found on [their site](https://ollama.com/search) (analogous to Docker Hub). You can also import models downloaded from other platforms like [HuggingFace](https://huggingface.co/) using a [Modelfile](https://github.com/ollama/ollama/blob/main/docs/modelfile.md) (analogous to a Dockerfile); a command-line sketch of this is shown further down.
### GUI
Open WebUI provides an easy-to-use frontend to manage your Ollama models. You can do so via the **Settings > Admin Settings > Models** page.
Open WebUI can also be used as a front-end for SaaS providers such as [OpenAI](https://openai.com/), [Anthropic](https://www.anthropic.com/), [Mistral](https://mistral.ai/), etc. Refer to the [documentation](https://docs.openwebui.com/).
### Command-line
If you prefer using the command line:
1. Ensure the docker-compose project is up and running.
2. Make sure your working directory is set to the folder where you cloned this repo.
Then you should be able to run the `ollama` CLI directly inside the *ollama* container.
Examples:
To download a model:
``` sh
docker compose exec ollama ollama pull gemma2
```
To list all downloaded models:
``` sh
docker compose exec ollama ollama list
```
To delete a model:
``` sh
docker compose exec ollama ollama rm gemma2
```
A full list of commands can be seen by running:
``` sh
docker compose exec ollama ollama help
```
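To import a custom model from a Modelfile, as mentioned above, you can write the Modelfile inside the *ollama* container and build it with `ollama create`. A minimal sketch — the derived model name, system prompt, and temporary file path are only illustrative, and it assumes the `gemma2` base model has already been pulled:
``` sh
# write a small Modelfile inside the container, then build a derived model from it
docker compose exec ollama sh -c 'cat > /tmp/Modelfile <<EOF
FROM gemma2
SYSTEM "You are a concise assistant."
PARAMETER temperature 0.7
EOF
ollama create gemma2-concise -f /tmp/Modelfile'

# the new model shows up alongside the others
docker compose exec ollama ollama list
```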
## Using the API
Open WebUI acts as a proxy to Ollama and LiteLLM. For both APIs, authentication is done through a JWT token which can be fetched in the **Settings > About** page in Open WebUI.
### Open WebUI
Open WebUI can act as a proxy to Ollama. Authentication is done through a JWT token which can be fetched in the **Settings > About** page in Open WebUI.
Open WebUI exposes the Ollama API at the URL http://localhost:8080/ollama/api.
Example usage:
@@ -52,12 +94,14 @@ Example usage:
curl -H "Authorization: Bearer <Paste your JWT token here>" http://localhost:8080/ollama/api/tags
```
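Beyond listing models, the other Ollama endpoints should be reachable through the same proxy path. A sketch of a one-off, non-streaming generation request — the `gemma2` model name is an example and must already be downloaded:
``` sh
curl -H "Authorization: Bearer <Paste your JWT token here>" \
  -H "Content-Type: application/json" \
  -d '{"model": "gemma2", "prompt": "Why is the sky blue?", "stream": false}' \
  http://localhost:8080/ollama/api/generate
```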
The Ollama API can also be queried directly on port 11434, without proxying through Open WebUI. In some cases, like when working locally, it may be easier to use it directly. In that case, there is no authentication.
The Ollama API can also be queried directly on port 11434, without proxying through Open WebUI. In some cases, like when working locally, it may be easier to use it directly. There is no authentication on this port.
Example usage:
``` sh
curl http://localhost:11434/api/tags
```
### Ollama
[Ollama also has some OpenAI-compatible APIs](https://ollama.com/blog/openai-compatibility). See the blog post for more detailed usage instructions.
Example usage:
``` sh
@@ -78,15 +122,15 @@ curl http://localhost:11434/v1/chat/completions \
}'
```
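For reference, a complete non-streaming request against the OpenAI-compatible endpoint might look like the sketch below; the model name is only an example and must already be downloaded:
``` sh
curl http://localhost:11434/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gemma2",
    "messages": [
      {"role": "user", "content": "Why is the sky blue?"}
    ]
  }'
```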
Open WebUI exposes the LiteLLM API (for external providers) at the URL http://localhost:8080/litellm/api/v1.
Example usage:
``` sh
curl -H "Authorization: Bearer <Paste your JWT token here>" http://localhost:8080/litellm/api/v1/models
```
### Example integrations
The JWT token can be used in place of the OpenAI API key for OpenAI-compatible libraries/applications.
Using the API, this deployment can be used as the basis for other applications which leverage LLM technology; a client sketch follows the examples below.
## Update
Examples:
* [continue.dev](https://continue.dev) ([Open WebUI documentation](https://docs.openwebui.com/tutorials/integrations/continue-dev))
* [aiac](https://github.com/gofireflyio/aiac)
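As a sketch of the pattern these integrations follow, any OpenAI-compatible client can be pointed at the deployment with the JWT token standing in for the API key. Here the LiteLLM base URL from above is used; the `/chat/completions` path under it and the model name are assumptions and must match what the `/models` call returns:
``` sh
curl -H "Authorization: Bearer <Paste your JWT token here>" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gemma2",
    "messages": [
      {"role": "user", "content": "Hello!"}
    ]
  }' \
  http://localhost:8080/litellm/api/v1/chat/completions
```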
## Updating
Simply run `docker compose pull` followed by `docker compose up -d` so the containers are recreated with the updated images.