feat: browser sat (#51)
# Description
Introduces a web browser voice satellite option with OpenWakeWord and
Silero VAD running in JavaScript, all served via FastAPI. Allows for
both text and voice commands to a Neon Diana deployment.

This new option may one day replace the Gradio webpage.

---------

Co-authored-by: Daniel McKnight <[email protected]>
mikejgray and NeonDaniel authored Dec 28, 2023
1 parent 063f6a1 commit 824a912
Showing 31 changed files with 1,389 additions and 28 deletions.
5 changes: 5 additions & 0 deletions .dockerignore
@@ -0,0 +1,5 @@
.git/
tests/
*.egg-info
build/
dist/
13 changes: 11 additions & 2 deletions .github/workflows/publish_release.yml
@@ -10,7 +10,16 @@ jobs:
build_and_publish_pypi_and_release:
uses: neongeckocom/.github/.github/workflows/publish_stable_release.yml@master
secrets: inherit
build_and_publish_docker:
build_and_publish_docker_gradio:
needs: build_and_publish_pypi_and_release
uses: neongeckocom/.github/.github/workflows/publish_docker.yml@master
secrets: inherit
secrets: inherit
with:
build_args: EXTRAS=gradio
build_and_publish_docker_websat:
needs: build_and_publish_pypi_and_release
uses: neongeckocom/.github/.github/workflows/publish_docker.yml@master
secrets: inherit
with:
build_args: EXTRAS=web_sat
image_name: ${{ github.repository }}-websat
13 changes: 11 additions & 2 deletions .github/workflows/publish_test_build.yml
@@ -16,7 +16,16 @@ jobs:
version_file: "neon_iris/version.py"
setup_py: "setup.py"
publish_prerelease: true
build_and_publish_docker:
build_and_publish_docker_gradio:
needs: publish_alpha_release
uses: neongeckocom/.github/.github/workflows/publish_docker.yml@master
secrets: inherit
secrets: inherit
with:
build_args: EXTRAS=gradio
build_and_publish_docker_websat:
needs: publish_alpha_release
uses: neongeckocom/.github/.github/workflows/publish_docker.yml@master
secrets: inherit
with:
build_args: EXTRAS=web_sat
image_name: ${{ github.repository }}-websat
54 changes: 44 additions & 10 deletions Dockerfile
@@ -1,21 +1,55 @@
# Stage 1: Use a base image to install ffmpeg
FROM jrottenberg/ffmpeg:4.1 as ffmpeg-base

# Stage 2: Build the final image
FROM python:3.8-slim

# Label for vendor
LABEL vendor=neon.ai \
ai.neon.name="neon-iris"

ENV OVOS_CONFIG_BASE_FOLDER neon
ENV OVOS_CONFIG_FILENAME neon.yaml
ENV XDG_CONFIG_HOME /config
# Build argument for specifying extras
ARG EXTRAS

RUN apt update && \
apt install -y ffmpeg
ENV OVOS_CONFIG_BASE_FOLDER=neon \
OVOS_CONFIG_FILENAME=neon.yaml \
XDG_CONFIG_HOME=/config

ADD . /neon_iris
WORKDIR /neon_iris
# Copy ffmpeg binaries from the ffmpeg-base stage
COPY --from=ffmpeg-base /usr/local/bin/ /usr/local/bin/
COPY --from=ffmpeg-base /usr/local/lib/ /usr/local/lib/

RUN pip install wheel && \
pip install .[gradio]
RUN mkdir -p /neon_iris/requirements
COPY ./requirements/* /neon_iris/requirements

RUN pip install wheel && pip install -r /neon_iris/requirements/requirements.txt
RUN if [ "$EXTRAS" = "gradio" ]; then \
        pip install -r /neon_iris/requirements/gradio.txt; \
    elif [ "$EXTRAS" = "web_sat" ]; then \
        pip install -r /neon_iris/requirements/web_sat.txt; \
    else \
        pip install -r /neon_iris/requirements/requirements.txt; \
    fi

WORKDIR /neon_iris
ADD . /neon_iris
RUN pip install .

COPY docker_overlay/ /

CMD ["iris", "start-gradio"]
# Expose port 8000 for websat
EXPOSE 8000

# Set the ARG value as an environment variable
ENV EXTRAS=${EXTRAS}

# Create a non-root user with a home directory and change ownership of necessary directories

RUN groupadd -r neon && useradd -r -m -g neon neon \
    && mkdir -p /config/neon \
    && chown -R neon:neon /neon_iris /usr/local/bin /config

# Use the non-root user to run the container
USER neon

ENTRYPOINT ["/neon_iris/entrypoint.sh"]
117 changes: 111 additions & 6 deletions README.md
@@ -1,20 +1,24 @@
# Neon Iris

Neon Iris (Interactive Relay for Intelligence Systems) provides tools for
interacting with Neon systems remotely, via [MQ](https://github.com/NeonGeckoCom/chat_api_mq_proxy).

Install the Iris Python package with: `pip install neon-iris`
The `iris` entrypoint is available to interact with a bus via CLI. Help is available via `iris --help`.

## Configuration
Configuration files can be specified via environment variables. By default,
`Iris` will read configuration from `~/.config/neon/diana.yaml` where

Configuration files can be specified via environment variables. By default,
`Iris` will read configuration from `~/.config/neon/diana.yaml` where
`XDG_CONFIG_HOME` is set to the default `~/.config`.
More information about configuration handling can be found
More information about configuration handling can be found
[in the docs](https://neongeckocom.github.io/neon-docs/quick_reference/configuration/).
> *Note:* The neon-iris Docker image uses `neon.yaml` by default because the

> _Note:_ The neon-iris Docker image uses `neon.yaml` by default because the
> `iris` web UI is often deployed with neon-core.
A default configuration might look like:

```yaml
MQ:
  server: neonaialpha.com
@@ -34,22 +38,123 @@ iris:
```
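The default lookup described at the start of this section can be sketched in Python. This is a minimal illustration, not the OVOS `Configuration` implementation: the helper name `default_config_path` and its parameters are hypothetical, and it assumes only what the text states, namely that the file is `neon/diana.yaml` under `XDG_CONFIG_HOME`, which falls back to `~/.config`.

```python
import os
from pathlib import Path


def default_config_path(env=None, base_folder="neon", filename="diana.yaml"):
    """Return the config file path under XDG_CONFIG_HOME (hypothetical helper)."""
    env = os.environ if env is None else env
    # Fall back to ~/.config when XDG_CONFIG_HOME is unset or empty
    config_home = env.get("XDG_CONFIG_HOME") or "~/.config"
    return Path(config_home).expanduser() / base_folder / filename


# The Docker image sets XDG_CONFIG_HOME=/config and uses neon.yaml
print(default_config_path({"XDG_CONFIG_HOME": "/config"}, filename="neon.yaml"))
```

Passing the environment as a plain mapping keeps the sketch easy to try without changing the real process environment.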
### Language Support
For Neon Core deployments that support language support queries via MQ, `languages`
may be removed and `enable_lang_api: True` added to configuration. This will use
the reported STT/TTS supported languages in place of any `iris` configuration.
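
The precedence described above might be sketched as follows. The helper name `effective_languages` is hypothetical, and the shape of the data reported over MQ is an assumption (a plain list of language codes); only the `languages` and `enable_lang_api` keys come from the configuration described here.

```python
def effective_languages(iris_config, reported_languages):
    """Pick the language list to offer (hypothetical helper).

    iris_config: dict holding the `iris` configuration block.
    reported_languages: languages reported by Neon Core over MQ, assumed
    here to be a plain list of language codes.
    """
    if iris_config.get("enable_lang_api"):
        # Prefer what the deployment reports over any local `languages` value
        return list(reported_languages)
    return list(iris_config.get("languages", []))


print(effective_languages({"enable_lang_api": True}, ["en-us", "uk-ua"]))
print(effective_languages({"languages": ["en-us"]}, ["en-us", "uk-ua"]))
```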

## Interfacing with a Diana installation

The `iris` CLI includes utilities for interacting with a `Diana` backend. Use
`iris --help` to get a current list of available commands.

### `iris start-listener`
This will start a local wake word recognizer and use a remote Neon

This will start a local wake word recognizer and use a remote Neon
instance connected to MQ for processing audio and providing responses.

### `iris start-gradio`

This will start a local webserver and serve a Gradio UI to interact with a Neon
instance connected to MQ.

### `iris start-client`
This starts a CLI client for typing inputs and receiving responses from a Neon

This starts a CLI client for typing inputs and receiving responses from a Neon
instance connected via MQ.

### `iris start-websat`

This starts a local webserver and serves a web UI for interacting with a Neon
instance connected to MQ.

## Docker

### Building

To build the Docker image, run:

```bash
docker build -t ghcr.io/neongeckocom/neon-iris:latest .
```

To build the Docker image with gradio extras, run:

```bash
docker build --build-arg EXTRAS=gradio -t ghcr.io/neongeckocom/neon-iris:latest .
```

To build the Docker image with websat extras, run:

```bash
docker build --build-arg EXTRAS=web_sat -t ghcr.io/neongeckocom/neon-iris:latest .
```

### Running

The Docker image built for this service runs the `iris` CLI with the `-h`
argument by default. To use the container to run a different service, you must
override the entrypoint. For example, to run the `start-websat` service:

```bash
docker run --rm -p 8000:8000 ghcr.io/neongeckocom/neon-iris:latest start-websat
```

Running the container without any arguments gives you a list of commands that
can be run. You can choose to run any of these commands by replacing `start-websat`
in the above command with the command you want to run.

## websat

### Configuration

The `websat` web UI is a simple interface for interacting with a Neon instance.
It accepts special configuration items prefixed with `webui_` to customize the UI.

| parameter | description | default |
| ----------------------- | -------------------------------------------------------------------------------------------------------------------------------------- | ---------------------- |
| webui_description | The header text for the web UI | Chat with Neon |
| webui_title | The title text for the web UI in the browser | Neon AI |
| webui_input_placeholder | The placeholder text for the input box | Ask me something |
| webui_ws_url | The websocket URL to connect to, which must be accessible from the browser you're running in. Note that the default will usually fail. | ws://localhost:8000/ws |

Iris uses the `Configuration()` class from OVOS to handle configuration. This
means that you can specify configuration in a `neon.yaml` file in the
`~/.config/neon` directory. When using a container, you can mount a volume to
`/home/neon/.config/neon` to provide a configuration file.

Example configuration block:

```yaml
iris:
  webui_title: Neon AI
  webui_description: Chat with Neon
  webui_input_placeholder: Ask me something
  webui_ws_url: wss://neonaialpha.com/ws
```
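
How the `webui_` items combine with their defaults can be sketched like this. It is an illustration under stated assumptions rather than the actual websat code: `resolve_webui_settings` is a hypothetical helper, and the default values are copied from the parameter table above.

```python
# Defaults copied from the parameter table in this section
WEBUI_DEFAULTS = {
    "webui_description": "Chat with Neon",
    "webui_title": "Neon AI",
    "webui_input_placeholder": "Ask me something",
    "webui_ws_url": "ws://localhost:8000/ws",
}


def resolve_webui_settings(iris_config):
    """Overlay user-provided `webui_` items on the defaults (hypothetical helper)."""
    settings = dict(WEBUI_DEFAULTS)
    for key, value in iris_config.items():
        # Ignore unrelated keys and unset values so defaults stay in place
        if key in WEBUI_DEFAULTS and value is not None:
            settings[key] = value
    return settings


settings = resolve_webui_settings({"webui_ws_url": "wss://neonaialpha.com/ws"})
print(settings["webui_title"], settings["webui_ws_url"])
```

Setting only `webui_ws_url` is likely the common case, since the table notes that the default will usually fail.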

### Customization

The websat web UI reads in the following items from `neon_iris/static/custom`:

- `error.mp3` - Used for error responses
- `wake.mp3` - Used for wake word responses
- `favicon.ico` - The favicon for the web UI
- `logo.webp` - The logo for the web UI

To customize these items, you can replace them in the `neon_iris/static/custom` folder and rebuild the image.

### Websocket endpoint

The websat web UI uses a websocket to communicate with OpenWakeWord, which can
load `.tflite` or `.onnx` models. The websocket endpoint is `/ws`. Because it
is served with FastAPI, it also supports `wss` for secure connections. To
use `wss`, you must provide a certificate and key file.
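
When choosing a value for `webui_ws_url`, the scheme typically has to match how the page itself is served, since browsers generally block `ws://` connections from pages loaded over HTTPS. A small sketch of that mapping follows; `websocket_url` is a hypothetical helper and not part of `neon_iris`.

```python
from urllib.parse import urlsplit, urlunsplit


def websocket_url(page_url, path="/ws"):
    """Map an http(s) page URL to the matching ws(s) URL (hypothetical helper)."""
    parts = urlsplit(page_url)
    if parts.scheme not in ("http", "https"):
        raise ValueError(f"Expected an http or https URL, got: {page_url!r}")
    # https pages need the secure websocket scheme
    scheme = "wss" if parts.scheme == "https" else "ws"
    return urlunsplit((scheme, parts.netloc, path, "", ""))


print(websocket_url("https://neonaialpha.com/"))
print(websocket_url("http://localhost:8000"))
```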

### Chat history

The websat web UI stores chat history in the browser's [local storage](https://developer.mozilla.org/en-US/docs/Web/API/Window/localStorage).
This allows chat history to persist between browser sessions. However, it also
means that if you clear your browser's local storage, you will lose your chat
history. This is a feature, not a bug.
3 changes: 2 additions & 1 deletion docker_overlay/etc/neon/neon.yaml
@@ -12,6 +12,7 @@ iris:
webui_chatbot_label: Chat History
webui_mic_label: Speak to Neon
webui_text_label: Text with Neon
webui_ws_url: ws://localhost:8000/ws # Override, as this needs to be reachable by the browser
server_address: "0.0.0.0"
server_port: 7860
default_lang: en-us
@@ -43,4 +44,4 @@ logs:
error:
- pika
warning:
- filelock
- filelock
37 changes: 37 additions & 0 deletions docker_overlay/neon_iris/entrypoint.sh
@@ -0,0 +1,37 @@
#!/bin/bash
# NEON AI (TM) SOFTWARE, Software Development Kit & Application Development System
# All trademark and other rights reserved by their respective owners
# Copyright 2008-2024 Neongecko.com Inc.
# BSD-3
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions are met:
# 1. Redistributions of source code must retain the above copyright notice,
# this list of conditions and the following disclaimer.
# 2. Redistributions in binary form must reproduce the above copyright notice,
# this list of conditions and the following disclaimer in the documentation
# and/or other materials provided with the distribution.
# 3. Neither the name of the copyright holder nor the names of its
# contributors may be used to endorse or promote products derived from this
# software without specific prior written permission.
# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
# AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO,
# THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
# PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR
# CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
# EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
# PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA,
# OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
# LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
# NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
# SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

set -e

if [ "$EXTRAS" = "gradio" ]; then
    exec iris start-gradio
elif [ "$EXTRAS" = "web_sat" ]; then
    exec iris start-websat
else
    echo "No extras specified, showing help. To execute a command, use 'docker run iris <command>'"
    exec iris -h
fi
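
The branch logic of this entrypoint can be mirrored in a few lines of Python for quick reasoning about which command a given `EXTRAS` value selects. This is only an illustration of the script above; `command_for_extras` is a hypothetical name. Note that the match is exact, so a value such as `websat` (without the underscore) falls through to the help output.

```python
def command_for_extras(extras):
    """Mirror the branch logic of entrypoint.sh (illustrative only)."""
    commands = {
        "gradio": ["iris", "start-gradio"],
        "web_sat": ["iris", "start-websat"],
    }
    # Any other value, including an unset EXTRAS, falls through to the help text
    return commands.get(extras, ["iris", "-h"])


for value in ("gradio", "web_sat", "websat", None):
    print(value, "->", " ".join(command_for_extras(value)))
```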
13 changes: 13 additions & 0 deletions neon_iris/cli.py
@@ -138,6 +138,19 @@ def start_gradio():
click.echo("Unable to connect to MQ server")


@neon_iris_cli.command(help="Create a Web Voice Satellite session")
@click.option("--port", "-p", default=8000, help="Port to run on, defaults to 8000")
@click.option("--host", default="0.0.0.0", help="Host to run on, defaults to 0.0.0.0")
def start_websat(port, host):
    from neon_iris.web_sat_client import app
    _print_config()
    try:
        import uvicorn
        uvicorn.run(app, host=host, port=port)
    except OSError:
        click.echo("Unable to connect to MQ server")


@neon_iris_cli.command(help="Query Neon Core for supported languages")
def get_languages():
from neon_iris.util import query_neon
2 changes: 1 addition & 1 deletion neon_iris/client.py
@@ -333,7 +333,7 @@ def _send_utterance(self, utterance: str, lang: str,
self._send_serialized_message(serialized)

def _send_audio(self, audio_file: str, lang: str,
username: str, user_profiles: list,
username: Optional[str], user_profiles: Optional[list],
context: Optional[dict] = None):
context = context or dict()
audio_data = encode_file_to_base64_string(audio_file)
27 changes: 27 additions & 0 deletions neon_iris/models/__init__.py
@@ -0,0 +1,27 @@
# NEON AI (TM) SOFTWARE, Software Development Kit & Application Development System
# All trademark and other rights reserved by their respective owners
# Copyright 2008-2024 Neongecko.com Inc.
# BSD-3
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions are met:
# 1. Redistributions of source code must retain the above copyright notice,
# this list of conditions and the following disclaimer.
# 2. Redistributions in binary form must reproduce the above copyright notice,
# this list of conditions and the following disclaimer in the documentation
# and/or other materials provided with the distribution.
# 3. Neither the name of the copyright holder nor the names of its
# contributors may be used to endorse or promote products derived from this
# software without specific prior written permission.
# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
# AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO,
# THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
# PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR
# CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
# EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
# PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA,
# OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
# LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
# NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
# SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

from .web_sat import UserInput, UserInputResponse # noqa