Feature/dockerize #1

Open
wants to merge 14 commits into
base: develop
2 changes: 0 additions & 2 deletions .gitattributes

This file was deleted.

2 changes: 2 additions & 0 deletions .gitignore
@@ -12,3 +12,5 @@ ud_japanese_pud_test.pickle
ud_japanese_bccwj_dev.pickle
vulcan/examples/ja_bccwj-ud-dev.conllu
output.txt

.venv
30 changes: 30 additions & 0 deletions Dockerfile
@@ -0,0 +1,30 @@
# syntax=docker/dockerfile:1
FROM python:3.10-bullseye

# Set environment variables
ENV PYTHONDONTWRITEBYTECODE=1
ENV PYTHONUNBUFFERED=1

# Install dependencies
RUN apt-get update \
    && apt-get install -y --no-install-recommends gettext \
    && rm -rf /var/lib/apt/lists/*
RUN pip install --no-cache-dir gunicorn

# Copy requirements file
COPY requirements.txt /
RUN pip install --no-cache-dir -r /requirements.txt

# Copy the application code and set the working directory
COPY app /app
WORKDIR /app

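ENV FLASK_APP=app.py

# Build-time arguments. Without an ARG declaration, $VULCAN_DEBUG and
# $VULCAN_PORT expand to empty strings during the build.
# The defaults below are assumptions; override them with --build-arg.
ARG VULCAN_DEBUG=0
ARG VULCAN_PORT=5000
ENV FLASK_DEBUG=$VULCAN_DEBUG
ENV VULCAN_PORT=$VULCAN_PORT
# Pass VULCAN_SECRET_KEY at runtime (e.g. `docker run -e VULCAN_SECRET_KEY=...`)
# rather than baking it into an image layer.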

# Expose the port
EXPOSE $VULCAN_PORT

# Run server
CMD ["flask", "run", "--host=0.0.0.0"]
82 changes: 80 additions & 2 deletions README.md
@@ -2,6 +2,84 @@

This is a fork of the original VULCAN, developed and maintained by [Jonas Groschwitz (jgroschwitz)](https://github.com/jgroschwitz). Please refer to the [original repo](https://github.com/jgroschwitz/vulcan) for the original documentation, including setup and usage instructions.

This fork adapts VULCAN to be used within the ParsePort project as a visualization tool for linguistic parses made by the Minimalist Parser developed by [Meaghan Fowlie (megodoonch)](https://github.com/megodoonch) at Utrecht University. More documentation on ParsePort, developed at the Centre For Digital Humanities at Utrecht University, can be found [here](https://github.com/CentreForDigitalHumanities/parseport).
This fork adapts VULCAN to be used within the ParsePort project as a visualization tool for syntactic parses made by the Minimalist Parser developed by [Meaghan Fowlie (megodoonch)](https://github.com/megodoonch) at Utrecht University. More documentation on ParsePort, developed at the Centre For Digital Humanities at Utrecht University, can be found [here](https://github.com/CentreForDigitalHumanities/parseport).

More information will be added in the future.
This project contains the code for a web-based visualization tool, run on a Flask webserver. The server accepts both regular HTTP requests and WebSocket connections. In addition, there is a small SQLite database to keep track of individual parse results.

## Aim and functionality

The server receives parse results from the Minimalist Parser and turns them into `Layout` objects that can be rendered in the browser. It is meant to run in a Docker container as part of the ParsePort container network.

The server's main endpoint is `/`. It accepts only POST requests whose body is a JSON object of the following form.

```json
{
  "uuid": "unique-identifier-for-parse",
  "parse": "base64-encoded-parse-result"
}
```

The UUID is a unique identifier for the parse result, and the parse is a base64-encoded string of the parse result (originally in binary/bytes). The server decodes the parse result, converts it into a `Layout` object, and stores it in the database together with the UUID.
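
The request body described above can be sketched in Python as follows. This is a minimal illustration using only the standard library, not the project's actual client or server code: the function names `build_parse_payload` and `read_parse_payload` are hypothetical, and the sample bytes are placeholders.

```python
import base64
import json
import uuid


def build_parse_payload(parse_bytes: bytes) -> str:
    """Serialize raw parse bytes into a JSON body of the form shown above."""
    payload = {
        # Hypothetical choice: a random UUID4 as the unique identifier.
        "uuid": str(uuid.uuid4()),
        # JSON cannot carry raw bytes, so they are base64-encoded first.
        "parse": base64.b64encode(parse_bytes).decode("ascii"),
    }
    return json.dumps(payload)


def read_parse_payload(body: str) -> tuple[str, bytes]:
    """Recover the UUID and the original bytes from a JSON request body."""
    data = json.loads(body)
    # validate=True rejects characters outside the base64 alphabet.
    parse_bytes = base64.b64decode(data["parse"], validate=True)
    return data["uuid"], parse_bytes


# Round trip: what the sender encodes is what the receiver decodes.
body = build_parse_payload(b"\x80\x04example parse bytes")
parse_id, parse_bytes = read_parse_payload(body)
```

The encoded string is safe to embed in JSON, and decoding it yields the original bytes unchanged.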

When a client connects to the server through WebSocket, the server sends the client the layout object corresponding to the UUID the client requested. The client then renders the layout object in the browser.

In addition to this main endpoint and the WebSocket events, the server has a `/status/` endpoint that returns `{"ok": true}` with status 200 if the server is running and ready to receive connections, and `{"ok": false}` with status 500 otherwise.
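
A readiness check against `/status/` might look like the sketch below. It assumes the server is reachable at `http://localhost:5000` and that the `ok` field is a JSON boolean, as returned by `app.py`; the helper names `is_ready` and `check_server` are hypothetical.

```python
import json
from urllib.request import urlopen


def is_ready(status_body: bytes) -> bool:
    """Return True only if the body is a JSON object with "ok" set to true."""
    data = json.loads(status_body)
    return isinstance(data, dict) and data.get("ok") is True


def check_server(base_url: str = "http://localhost:5000") -> bool:
    """Query /status/ and report whether the server says it is ready."""
    try:
        with urlopen(f"{base_url}/status/", timeout=5) as response:
            return is_ready(response.read())
    except (OSError, ValueError):
        # OSError covers connection failures and HTTP error statuses;
        # ValueError covers a response body that is not valid JSON.
        return False
```

Calling `check_server()` before opening a WebSocket connection avoids connecting to a server that has not finished building its layout.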


## Running a local development server

The server can be run in three different ways:

1. Locally, using Flask's built-in development server.
2. In a Docker container.
3. As part of the ParsePort container network, using Docker Compose.

To run the server locally, Python 3.12 or higher is recommended. Lower versions may work but have not been tested locally; note that the Docker image is built on `python:3.10-bullseye`.

### Running the server locally

1. Recommended: set up a virtual environment. You can do this by running the following commands in the root directory of the project:

```bash
python3 -m venv .venv
source .venv/bin/activate
```

On Windows, you can use the following commands:

```bash
python -m venv .venv
.venv\Scripts\activate
```

This will create a virtual environment in the `.venv` directory (which is listed in `.gitignore`) and activate it. You can deactivate the virtual environment by running `deactivate`.

2. Install the required dependencies. You can do this by running the following command:

```bash
pip install -r requirements.txt
```

3. Start the development server by running the following command in the `app` folder:

```bash
flask run --host 0.0.0.0
```

This will start the server on `http://localhost:5000`. Visit `http://localhost:5000/status/` to check if the server is running.


### Running the server in a Docker container

To run the server in a stand-alone container, run the following commands in the root directory of the project:

```bash
docker build -t vulcan-parseport .
docker run -d -p 5000:5000 --name vulcan-parseport vulcan-parseport
```

This builds the Docker image and starts a container from it. The server will be available at `http://localhost:5000`. To pick up code changes without rebuilding, mount the `app` directory by adding `-v "$(pwd)/app:/app:rw"` to the `docker run` command. Flask only auto-reloads when debug mode is on, so also pass `-e FLASK_DEBUG=1`.

### Running the server as part of the ParsePort container network

Please refer to the ParsePort documentation for instructions on how to run the server as part of the ParsePort container network.
103 changes: 103 additions & 0 deletions app/app.py
@@ -0,0 +1,103 @@
import os
from flask import Flask, request, session
from flask_socketio import SocketIO, emit
from logger import log
from vulcan.file_loader import create_layout_from_filepath
from server_methods import instance_requested

# from process_parse_data import process_parse_data

# TODO: Handle CORS properly.
socketio = SocketIO(cors_allowed_origins="*")

def create_app() -> Flask:
    log.info("Creating app...")

    log.info("Creating standard layout...")
    standard_layout = create_layout_from_filepath(
        # Petit Prince
        input_path="./little_prince_simple.pickle",
        # Test pickle from Meaghan, should be used if no input is provided.
        # input_path="./all.pickle",
        is_json_file=False,
        propbank_path=None,
    )
    log.info("Standard layout created.")

    app = Flask(
        __name__, template_folder="vulcan/client", static_folder="vulcan/client/static"
    )
    app.config["SECRET_KEY"] = os.environ.get("VULCAN_SECRET_KEY")
    app.config["SQLALCHEMY_DATABASE_URI"] = "sqlite:///db.sqlite"

    @app.route("/status/", methods=["GET"])
    def status():
        if standard_layout is not None:
            return {"ok": True}, 200
        return {"ok": False}, 500

    @app.route("/", methods=["POST"])
    def handle_parse_request():
        """
        Extracts parse data from the request and proceeds to create a layout
        from it.

        After validating the input, the input is stored in a SQLite database
        along with the UUID and a timestamp.
        """
        # try:
        #     # process_parse_data(request, db)
        # except Exception as e:
        #     log.exception(f"An exception occurred while parsing the data: {e}")
        #     return {"ok": False}, 500

        # TODO: see if we can convert the input to a layout and store that in the database instead (somehow).

        return {"ok": True}, 200

    @socketio.on("connect")
    def handle_connect():
        from vulcan.server.server import (
            make_layout_sendable,
            create_list_of_possible_search_filters,
        )

        print("Connected!")
        sid = request.sid

        # TODO: investigate if we can serialize the layout instead.

        if sid in session:
            layout = session[sid]
        else:
            layout = standard_layout

        show_node_names = session.get("show_node_names")
        print("Layout for session:", layout)
        print("Client connected with SID", sid)

        try:
            print(sid, "connected")
            emit("set_layout", make_layout_sendable(layout), to=sid)
            emit("set_corpus_length", layout.corpus_size, to=sid)
            emit("set_show_node_names", {"show_node_names": show_node_names}, to=sid)
            emit(
                "set_search_filters",
                create_list_of_possible_search_filters(layout),
                to=sid,
            )
            instance_requested(sid, layout, 0)
        except Exception as e:
            log.exception(e)
            emit("server_error", to=sid)

    @socketio.on("disconnect")
    def handle_disconnect():
        print("Client disconnected")

    @socketio.on("instance_requested")
    def handle_instance_requested(index):
        instance_requested(request.sid, standard_layout, index)

    socketio.init_app(app)
    return app
26 changes: 26 additions & 0 deletions app/create_layout.py
@@ -0,0 +1,26 @@
import json
from vulcan.server.basic_layout import BasicLayout
from vulcan.data_handling.data_corpus import from_dict_list

def create_layout_from_input(data: list[dict] | str | bytes) -> BasicLayout:
    """
    Create a layout from the given data.

    Simplified version of vulcan.file_loader.create_layout_from_filepath.
    """
    # json.load() expects a file object, so decode JSON text with json.loads()
    # and use already-parsed input as-is.
    input_dicts = json.loads(data) if isinstance(data, (str, bytes)) else data

    data_corpus = from_dict_list(
        data=input_dicts,
        propbank_frames_path=None,
        show_wikipedia=False,
    )

    layout = BasicLayout(
        slices=data_corpus.slices.values(),
        linkers=data_corpus.linkers,
        corpus_size=data_corpus.size,
    )

    return layout

File renamed without changes.
6 changes: 6 additions & 0 deletions app/logger.py
@@ -0,0 +1,6 @@
import logging

# TODO: Adjust this based on env!
logging.basicConfig(level=logging.DEBUG)

log = logging.getLogger("vulcan")
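
One possible way to address the TODO in `app/logger.py` is sketched below. It assumes the log level should follow the `FLASK_DEBUG` environment variable that the Dockerfile sets; the helper `resolve_log_level` and the set of accepted truthy strings are hypothetical, not part of this pull request.

```python
import logging
import os


def resolve_log_level(env=None) -> int:
    """Pick DEBUG when FLASK_DEBUG is truthy, INFO otherwise."""
    env = os.environ if env is None else env
    debug = str(env.get("FLASK_DEBUG", "")).strip().lower()
    # Assumed truthy spellings; anything else (including unset) means INFO.
    if debug in {"1", "true", "yes", "on"}:
        return logging.DEBUG
    return logging.INFO


logging.basicConfig(level=resolve_log_level())

log = logging.getLogger("vulcan")
```

Passing a plain dictionary as `env` makes the helper easy to exercise without touching the real environment.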