What is Hopsworks?

Hopsworks is a modular data platform for ML with a Python-centric Feature Store and MLOps capabilities. You can use it as a standalone Feature Store, use it to manage, govern, and serve your models, or use it to develop and operate feature pipelines and training pipelines. Hopsworks supports collaboration across ML teams by providing a secure, governed platform for developing, managing, and sharing ML assets: features, models, training data, batch scoring data, logs, and more.

🚀 Quickstart

APP - Serverless (beta)

→ Go to app.hopsworks.ai

Hopsworks is available as a serverless app. Head to app.hopsworks.ai and register with your Gmail or GitHub account. You can then run a tutorial or access Hopsworks directly and try it yourself. This is the recommended way to first experience the platform before diving into more advanced uses and installation requirements.

Azure, AWS & GCP

Managed Hopsworks is our platform for running Hopsworks and the Feature Store in the cloud. It integrates directly with the customer's AWS, Azure, or GCP environment, and also with third-party platforms such as Databricks, SageMaker, and Kubeflow.

If you wish to run Hopsworks in your Azure, AWS, or GCP environment, follow the guide for your cloud provider in our documentation.

Installer - On-premise

It is possible to use Hopsworks on-premises, which means that companies can run their machine learning workloads on their own hardware and infrastructure, rather than relying on a cloud provider. This can provide greater flexibility, control, and cost savings, as well as enabling companies to meet specific compliance and security requirements.

Working on-premises with Hopsworks typically involves collaboration with the Hopsworks engineering teams, as each infrastructure is unique and requires a tailored approach to deployment and configuration. The process begins with an assessment of the company's existing infrastructure and requirements, including network topology, security policies, and hardware specifications.

For further details about on-premise installations: contact us.

Requirements

You need at least one server or virtual machine on which to install Hopsworks, with at least the following specification:

CentOS/RHEL 8.x or Ubuntu 22.04;
at least 32 GB RAM;
at least 8 CPUs;
100 GB of free hard-disk space;
a UNIX user account with sudo privileges.
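
As a rough illustration, the hardware minimums above can be checked before starting an installation. The sketch below is not part of the Hopsworks installer: the names HostSpec, probe_host, and unmet_requirements are hypothetical, and it covers only RAM, CPU count, and free disk space (not the operating system version or sudo access). It assumes a Linux/Unix host, since it reads physical memory through os.sysconf.

```python
from __future__ import annotations

import os
import shutil
from dataclasses import dataclass

GIB = 1024 ** 3  # bytes per gibibyte


@dataclass(frozen=True)
class HostSpec:
    """Hypothetical container for the three measurable hardware figures."""
    ram_gb: float
    cpus: int
    free_disk_gb: float


# Minimums copied from the requirements list above.
MINIMUM = HostSpec(ram_gb=32, cpus=8, free_disk_gb=100)


def probe_host(path: str = "/") -> HostSpec:
    """Measure the local machine (Linux/Unix only: relies on os.sysconf)."""
    ram_bytes = os.sysconf("SC_PHYS_PAGES") * os.sysconf("SC_PAGE_SIZE")
    return HostSpec(
        ram_gb=ram_bytes / GIB,
        cpus=os.cpu_count() or 0,
        free_disk_gb=shutil.disk_usage(path).free / GIB,
    )


def unmet_requirements(host: HostSpec, minimum: HostSpec = MINIMUM) -> list[str]:
    """Return one message per requirement the host fails; empty list if none."""
    problems = []
    if host.ram_gb < minimum.ram_gb:
        problems.append(f"RAM: {host.ram_gb:.1f} GB < {minimum.ram_gb} GB")
    if host.cpus < minimum.cpus:
        problems.append(f"CPUs: {host.cpus} < {minimum.cpus}")
    if host.free_disk_gb < minimum.free_disk_gb:
        problems.append(
            f"free disk: {host.free_disk_gb:.1f} GB < {minimum.free_disk_gb} GB"
        )
    return problems
```

Calling unmet_requirements(probe_host()) on the target machine returns the list of shortfalls. Note that a machine sold as "32 GB" usually reports slightly less usable physical memory, so a strict comparison like this one may flag it; loosen the threshold if that matters for your check.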

🎓 Documentation and API

Documentation

Hopsworks documentation includes user guides, feature store documentation, and an administration guide. We also include concept pages to help users navigate the abstractions and logic of feature stores and MLOps in general:

Feature Store: https://docs.hopsworks.ai/3.0/concepts/fs/
Projects: https://docs.hopsworks.ai/3.0/concepts/projects/governance/
MLOps: https://docs.hopsworks.ai/3.0/concepts/mlops/prediction_services/

APIs

Hopsworks API documentation is divided into three categories: the Hopsworks API covers project-level APIs, the Feature Store API covers feature groups, feature views, and connectors, and the MLOps API covers the Model Registry, serving, and deployment.

Hopsworks API - https://docs.hopsworks.ai/hopsworks-api/3.0.1/generated/api/connection/
Feature Store API - https://docs.hopsworks.ai/feature-store-api/3.0.0/generated/api/connection_api/
MLOps API - https://docs.hopsworks.ai/machine-learning-api/3.0.0/generated/connection_api/
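
To make the three categories concrete, here is a minimal sketch of how they might map onto a single project object in Python. The helper name open_api_handles is hypothetical. The sketch assumes that the project object returned by the hopsworks Python client exposes get_feature_store(), get_model_registry(), and get_model_serving(); verify these against the API documentation linked above for the version you use.

```python
from typing import Any, Dict


def open_api_handles(project: Any) -> Dict[str, Any]:
    """Collect one handle per API category from a Hopsworks project object.

    Assumption: `project` provides the three getter methods used below,
    as the object returned by the hopsworks client's login() is expected to.
    """
    return {
        # Feature Store API: feature groups, feature views, connectors.
        "feature_store": project.get_feature_store(),
        # MLOps API: model registry.
        "model_registry": project.get_model_registry(),
        # MLOps API: model serving and deployments.
        "model_serving": project.get_model_serving(),
    }
```

With the client installed and an account available, usage would look roughly like importing hopsworks, calling hopsworks.login() to obtain the project, and passing that project to open_api_handles. The helper only groups the handles; all real work still goes through the objects it returns.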

Tutorials

Most of the tutorials require an account on app.hopsworks.ai. You can explore the dedicated https://github.com/logicalclocks/hopsworks-tutorials repository containing our tutorials, or jump directly into one of the existing use cases:

Fraud (batch): https://github.com/logicalclocks/hopsworks-tutorials/tree/master/fraud_batch
Fraud (online): https://github.com/logicalclocks/hopsworks-tutorials/tree/master/fraud_online
Churn prediction: https://github.com/logicalclocks/hopsworks-tutorials/tree/master/churn

📦 Main Features

Project-based Multi-Tenancy and Team Collaboration

Hopsworks provides projects as a secure sandbox in which teams can collaborate and share ML assets. Hopsworks' unique multi-tenant project model even enables sensitive data to be stored in a shared cluster, while still providing fine-grained sharing capabilities for ML assets across project boundaries. Projects can be used to structure teams so that they have end-to-end responsibility from raw data to managed features and models. Projects can also be used to create development, staging, and production environments for data teams. All ML assets support versioning, lineage, and provenance, giving all Hopsworks users a complete view of the MLOps life cycle, from feature engineering through model serving.

Development and Operations

Hopsworks provides development tools for Data Science, including conda environments for Python, Jupyter notebooks, jobs, or even notebooks as jobs. You can build production pipelines with the bundled Airflow, and even run ML training pipelines with GPUs in notebooks on Airflow. You can train models on as many GPUs as are installed in a Hopsworks cluster and easily share them among users. You can also run Spark, Spark Streaming, or Flink programs on Hopsworks, with support for elastic workers in the cloud (add/remove workers dynamically).

Available on any Platform

Hopsworks is available both as a managed platform in the cloud on AWS, Azure, and GCP, and as software that can be installed on any Linux-based virtual machine (Ubuntu/Red Hat compatible), even in air-gapped data centers. Hopsworks is also available as a serverless platform that manages and serves both your features and models.

🧑‍🤝‍🧑 Community

Contribute

We are building the most complete and modular ML platform available on the market, and we count on your support to continuously improve Hopsworks. Feel free to give us suggestions, report bugs, and add features to our library at any time.

Join the community

Ask questions and give us feedback in the Hopsworks Community
Join our Public Slack Channel
Follow us on Twitter
Check out all our latest product releases

Open-Source

Hopsworks is available under the AGPL-V3 license. In plain English this means that you are free to use Hopsworks and even build paid services on it, but if you modify the source code, you should also release back your changes and any systems built around it as AGPL-V3.
