Technical architecture
Quentin Gérôme edited this page Apr 29, 2024 · 8 revisions
OpenHEXA is a data integration platform composed of a series of components:
- The OpenHEXA backend, usually called openhexa-app for historical reasons, a Python/Django application providing a GraphQL API, a data pipeline orchestration engine and user management capabilities
- The OpenHEXA frontend (openhexa-frontend), a TypeScript/React/Next.js application providing the OpenHEXA user interface on top of the backend
- The OpenHEXA notebooks environment (see openhexa-notebooks), a heavily customized JupyterHub/JupyterLab setup running the same image as the pipelines environment
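Since the backend exposes a GraphQL API, any HTTP client can talk to it. The following is a minimal sketch, not taken from the OpenHEXA codebase: the endpoint URL and the `build_graphql_request` helper are hypothetical, and authentication is left out. It only builds the request, using the `__typename` meta-field that the GraphQL specification guarantees on every schema, so it assumes nothing about OpenHEXA's own types.

```python
import json
import urllib.request

# Hypothetical endpoint: replace with the GraphQL URL of your own deployment.
GRAPHQL_URL = "https://openhexa.example.org/graphql/"


def build_graphql_request(query, variables=None, url=GRAPHQL_URL):
    """Build (but do not send) an HTTP POST request carrying a GraphQL query."""
    payload = {"query": query, "variables": variables or {}}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# `__typename` is defined by the GraphQL specification itself, so this query
# is valid against any schema and assumes nothing about OpenHEXA's types.
request = build_graphql_request("query { __typename }")

# Sending it would look like this (authentication omitted in this sketch):
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```

Building the request separately from sending it keeps the sketch runnable without a live backend.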
In terms of data storage, we have to make a distinction between:
- Application data storage, which resides in a PostgreSQL database
- Workspace storage or user storage (see User manual for more information about workspaces), which is stored either in PostgreSQL databases or in Object Storage buckets (Google Cloud Storage, AWS S3 or MinIO)
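To make the two kinds of workspace storage concrete, here is a small illustrative sketch that classifies a storage URI as a database or a bucket. The `SCHEME_TO_KIND` mapping and the `workspace_storage_kind` function are hypothetical names introduced for this example. The sketch also assumes that MinIO is addressed through its S3-compatible API; how OpenHEXA actually represents storage locations is not described on this page.

```python
from urllib.parse import urlparse

# Assumed mapping from URI scheme to storage kind (illustrative only).
SCHEME_TO_KIND = {
    "postgresql": "database",
    "postgres": "database",
    "gs": "bucket",  # Google Cloud Storage
    "s3": "bucket",  # AWS S3, or MinIO through its S3-compatible API
}


def workspace_storage_kind(uri):
    """Return "database" or "bucket" for a workspace storage URI."""
    scheme = urlparse(uri).scheme
    try:
        return SCHEME_TO_KIND[scheme]
    except KeyError:
        raise ValueError(f"unsupported storage URI scheme: {scheme!r}") from None


# Example URIs are made up for illustration.
print(workspace_storage_kind("postgresql://user@db.example.org:5432/workspace_db"))
print(workspace_storage_kind("gs://example-workspace-bucket/data/file.csv"))
```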
When running code using Jupyter notebooks or OpenHEXA data pipelines, technical users can leverage the OpenHEXA Python SDK to interact with the OpenHEXA backend (see openhexa-sdk-python).
Notebooks and data pipelines typically run in containers using one of our Docker images (see openhexa-docker-images) or a custom image configured per workspace.
The whole OpenHEXA stack is meant to be deployed in a Kubernetes cluster, so that notebooks and pipelines run in isolated environments and leverage the auto-scaling capabilities offered by Kubernetes.
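As a rough illustration of how a per-workspace image and Kubernetes isolation fit together, the sketch below builds a Pod manifest for a single pipeline run as a plain Python dictionary. It is written under stated assumptions and does not reproduce what OpenHEXA actually generates: the default image name, the resource figures and the `resolve_image` and `build_pod_manifest` helpers are all hypothetical. Only the manifest field names come from the Kubernetes Pod API.

```python
# Hypothetical default image name, used when a workspace sets no custom image.
DEFAULT_IMAGE = "example/openhexa-default:latest"


def resolve_image(workspace_image=None):
    """Prefer the workspace's custom image, else fall back to the default."""
    return workspace_image or DEFAULT_IMAGE


def build_pod_manifest(run_id, workspace_image=None):
    """Build a Kubernetes Pod manifest (as a dict) for one pipeline run."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": f"pipeline-run-{run_id}"},
        "spec": {
            # A pipeline run is a one-off job, so the pod is not restarted.
            "restartPolicy": "Never",
            "containers": [
                {
                    "name": "pipeline",
                    "image": resolve_image(workspace_image),
                    # Requests and limits inform scheduling and auto-scaling;
                    # the figures below are placeholders.
                    "resources": {
                        "requests": {"cpu": "500m", "memory": "512Mi"},
                        "limits": {"cpu": "1", "memory": "1Gi"},
                    },
                }
            ],
        },
    }


manifest = build_pod_manifest("42", workspace_image="example/custom-image:1.0")
print(manifest["spec"]["containers"][0]["image"])
```

Giving each run its own pod is what provides the isolation mentioned above, while declared resource requests give the cluster the information it needs to scale.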