📚 update docs and unittests (#189)
* update workflow and s3task docs

* begin updating tutorials

* remove django executor

* update tutorials and example project

* run black
jacksund authored Jul 17, 2022
1 parent 66de06e commit 6113658
Showing 35 changed files with 706 additions and 899 deletions.
69 changes: 61 additions & 8 deletions .do/README.md
@@ -35,16 +35,14 @@ This button below will launch a new DigitalOcean app (server+database) using a t
Once open, you can provide your Prefect API key (optional) and a secret key for Django.




<!-- button that starts up DigitalOcean app -->
<a href="https://cloud.digitalocean.com/apps/new?repo=https://github.com/jacksund/simmate/tree/main&refcode=8aeef2ea807c">
<img src="https://www.deploytodo.com/do-btn-blue.svg" alt="Deploy to DO">
</a>

> :warning: Note to developers, if you fork this repository and want this button to work for your new repo, you must update the link for this button. For more information, see [here](https://docs.digitalocean.com/products/app-platform/how-to/add-deploy-do-button/)
Steps to use Deploy on DigitalOcean:
Steps to set up servers on DigitalOcean:

1. Make sure you have created a DigitalOcean account (you can use your GitHub account) and are signed in.
2. Select the "Deploy to DigitalOcean" button above. On this new page, you'll see "Python Detected".
@@ -64,14 +62,17 @@ simmate database reset

## Manual setup (stage 1): Setting up our PostgreSQL Database

> :warning: This section is only required if you do not wish to use the automatic setup described in the section above it -- or if you need a customized setup.
First, we need to set up our Cloud database, tell Simmate how to connect to it, and build our tables.

### creating the cloud database

1. On our DigitalOcean dashboard, click the green "Create" button in the top right and then select "Database". It should bring you to [this page](https://cloud.digitalocean.com/databases/new).
2. For "database engine", select the newest version of PostgreSQL (currently 14)
2. For "database engine", select the newest version of PostgreSQL (currently v14)
3. The remainder of the page's options can be left at their default values.
4. Select **Create a Database Cluster** when you're ready.
5. For the new homepage on your cluster, there is a "Get Started" button. We will go through this dialog in the next section.

Note, this is the database **cluster**, which can host multiple databases on it (each with all their own tables).

@@ -82,7 +83,7 @@ Before we set up our database on this cluster, we are first going to try con

1. On your new database's page, you'll see a "Getting Started" dialog -- select it!
2. "Restrict inbound connections" is completely optional, and beginners should skip it for now. If you know you'll be running calculations on a supercomputer/cluster, you would need to add all of the associated IP addresses for connections to work properly. That's a lot of IP addresses to grab and configure, so we leave this to advanced users.
3. "Connection details" is what we need to feed to django! Let's copy this information. As an example, here is what the details look like on DigitalOcean:
3. "Connection details" is what we need to give to Simmate/Django. Let's copy this information. As an example, here is what the details look like on DigitalOcean:
```
username = doadmin
password = asd87a9sd867fasd
@@ -119,41 +120,93 @@ Just like how we don't use the `(base)` environment in Anaconda, we don't want t
2. Create a new database using the "Add new database" button and name this `simmate-database-00`. We name it this way because you may want to make new/separate databases and numbering is a quick way to keep track of these.
3. In your connection settings (from the section above), switch the NAME from defaultdb to `simmate-database-00`. You will change this in your `my_env-database.yaml` file.

Additionally, we can use this approach to build a separate database for Prefect to use. Go through these steps again, this time configuring a database for Prefect:

1. On DigitalOcean with your Database Cluster page, select the "Users&Databases" tab.
2. Create a new database using the "Add new database" button and name this `prefect-database-00`.
3. We need to tell Prefect how to connect. You can follow the official [Prefect guides](https://orion-docs.prefect.io/concepts/database/). You can find the URL by selecting the "Connection String" format on DigitalOcean (note that we remove the `?sslmode=require` suffix). For example, you would set an environment variable like so:
``` bash
# added to bottom of ~/.bashrc for Ubuntu
export PREFECT_ORION_DATABASE_CONNECTION_URL="postgresql+asyncpg://doadmin:asd87a9sd867fasd@db-postgresql-nyc3-49797-do-user-8843535-0.b.db.ondigitalocean.com:5432/prefect-database-00"
```
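
If you would rather assemble this URL from the individual "Connection details" fields than copy the full connection string, the following is a minimal sketch of one way to do it. The helper name `build_prefect_url` is hypothetical (it is not part of Simmate or Prefect), and the values are the example credentials shown above.

```python
from urllib.parse import quote


def build_prefect_url(user, password, host, port, database):
    """Assemble an asyncpg-style PostgreSQL URL (hypothetical helper)."""
    # Percent-encode the credentials so that characters such as "@" or "/"
    # in a password are not mistaken for URL delimiters.
    safe_user = quote(user, safe="")
    safe_password = quote(password, safe="")
    return f"postgresql+asyncpg://{safe_user}:{safe_password}@{host}:{port}/{database}"


# Example values copied from the connection details shown in this guide
url = build_prefect_url(
    user="doadmin",
    password="asd87a9sd867fasd",
    host="db-postgresql-nyc3-49797-do-user-8843535-0.b.db.ondigitalocean.com",
    port=5432,
    database="prefect-database-00",
)
print(url)
```

The printed value is what you would assign to `PREFECT_ORION_DATABASE_CONNECTION_URL`. Encoding the credentials only matters when they contain reserved URL characters, but it is a cheap safeguard.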


<!--
TODO: allow prefect to be configured alongside the Simmate database:
or you can add a `prefect` entry to your `my_env-database.yaml` file.
Your final `my_env-database.yaml` file will look like this:
```
default:
ENGINE: django.db.backends.postgresql_psycopg2
HOST: db-postgresql-nyc3-49797-do-user-8843535-0.b.db.ondigitalocean.com
NAME: simmate-database-00 # WE WILL UPDATE THIS IN THE NEXT STEP
USER: doadmin
PASSWORD: asd87a9sd867fasd
PORT: 25060
OPTIONS:
sslmode: require
prefect:
ENGINE: django.db.backends.postgresql_psycopg2
HOST: db-postgresql-nyc3-49797-do-user-8843535-0.b.db.ondigitalocean.com
NAME: prefect-database-00 # WE WILL UPDATE THIS IN THE NEXT STEP
USER: doadmin
PASSWORD: asd87a9sd867fasd
PORT: 25060
OPTIONS:
sslmode: require
```
-->

### creating a connection pool

When we have a bunch of calculations running at once, we need to make sure our database can handle all of these connections. Therefore we make a connection pool which allows for thousands of connections! This "pool" works like a waitlist where the database handles each connection request in order.

1. Select the "Connection Pools" tab and then "Create a Connection Pool"
2. Name your pool `simmate-database-00-pool` and select `simmate-database-00` for the database
3. Select "Transaction" for our mode (the default) and set our pool size to **11** (or modify this value as you wish)
3. Select "Transaction" for our mode (the default) and set our pool size to **8** (or modify this value as you wish)
4. Create the pool when you're ready!
5. You'll have to update your `my_env-database.yaml` file to these connection settings. At this point your file will look similar to this (note, our NAME and PORT values have changed):
``` yaml
default:
ENGINE: django.db.backends.postgresql_psycopg2
HOST: db-postgresql-nyc3-49797-do-user-8843535-0.b.db.ondigitalocean.com
NAME: simmate-database-00-pool
NAME: simmate-database-00-pool # THIS LINE WAS UPDATED
USER: doadmin
PASSWORD: asd87a9sd867fasd
PORT: 25061
OPTIONS:
sslmode: require
```
6. Repeat these steps to create a `prefect-database-00-pool`.
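
To make the change in step 5 explicit, here is a small sketch that represents the settings as a plain Python dictionary and swaps in the pool's name and port. The helper `use_connection_pool` is hypothetical and only illustrates which two values differ between the direct connection and the pooled one; Simmate itself reads these values from `my_env-database.yaml`.

```python
import copy

# Settings for connecting directly to the database (example values from this guide)
direct_settings = {
    "default": {
        "ENGINE": "django.db.backends.postgresql_psycopg2",
        "HOST": "db-postgresql-nyc3-49797-do-user-8843535-0.b.db.ondigitalocean.com",
        "NAME": "simmate-database-00",
        "USER": "doadmin",
        "PASSWORD": "asd87a9sd867fasd",
        "PORT": 25060,
        "OPTIONS": {"sslmode": "require"},
    }
}


def use_connection_pool(settings, pool_name, pool_port):
    """Return a copy of the settings that connects through a pool (hypothetical helper)."""
    pooled = copy.deepcopy(settings)  # leave the original settings untouched
    pooled["default"]["NAME"] = pool_name  # the pool name replaces the database name
    pooled["default"]["PORT"] = pool_port  # the pool listens on its own port
    return pooled


pooled_settings = use_connection_pool(direct_settings, "simmate-database-00-pool", 25061)
print(pooled_settings["default"]["NAME"], pooled_settings["default"]["PORT"])
```

Everything else (host, user, password, SSL mode) stays the same, which is why only the NAME and PORT lines change in the YAML file.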


### making all of our database tables

Now that we have set up and connected to our database, we can make our Simmate database tables and start filling them with data! We do this the same way we did without a cloud database:

1. In your terminal, make sure you have your Simmate environment activated
2. Run the following command:
```
``` bash
simmate database reset
```
3. You're now ready to start using Simmate with your new database!
4. If you want to share this database with others, you simply need to have them copy your config file: `my_env-database.yaml`. They won't need to run `simmate database reset` because you did it for them.

We need to do the same with Prefect too.

1. In your terminal, make sure you have your Simmate environment activated
2. Run the following command:
``` bash
prefect orion database reset
```


<br/>


## Manual setup (stage 2): Setting up a Django Website Server

If you want to host your Simmate installation as a website just for your team, you can use DigitalOcean to host a Django Website server.
4 changes: 1 addition & 3 deletions CHANGELOG.md
@@ -24,12 +24,10 @@ There is one key exception to the rules above -- and that is with `MAJOR`=0 rele
- add full-run unittests that call workflows and vasp (without emulation)

**Refactors**
- remove experimental `workflow_engine.executor`
- move contents of `configuration.django.database` to `database.utilities`
- :warning: upgraded to Prefect v2 ("Orion"). This involved refactoring the entire `workflow_engine` module, and therefore the entire workflow library. Users should go back through the tutorials from the beginning to see everything that has changed. ([#185](https://github.com/jacksund/simmate/pull/185)).

> :warning: the majority of tutorials and documentation are still being updated to account for the new Prefect 2.0 version. This message will be removed when the main branch is fully updated and accurate.

> Prefect Orion is still in beta (v2.0b8), and the first stable release is expected in July 2022. However, this date is not definite, and there is a very good chance for delays. Until a stable release is made for Prefect, there will be no new Simmate release. You can stay up to date with Prefect's beta status on [the Prefect discourse page](https://discourse.prefect.io/tags/c/announcements/5/prefect-2-0). This message will be removed when a new release becomes available.
**Fixes**
32 changes: 21 additions & 11 deletions README.md
@@ -77,7 +77,7 @@ conda install -c conda-forge simmate
Once installed, running a local test server is as simple as...

``` bash
# on first-time setup, you must initialize an empty database
# On first-time setup, you must initialize an empty database.
simmate database reset

# then start the server!
@@ -98,23 +98,33 @@ Again, take a look at [our main website](https://simmate.org/) if you'd like to

<!-- This is an image of the Prefect UI -->
<p align="center" style="margin-bottom:40px;">
<img src="https://raw.githubusercontent.com/PrefectHQ/prefect/master/docs/.vuepress/public/orchestration/ui/dashboard-overview2.png" height=440 style="max-height: 440px;">
<img src="https://orion-docs.prefect.io/img/ui/orion-dashboard.png" height=440 style="max-height: 440px;">
</p>

``` bash
# The command line lets you quickly run a workflow
# from a structure file (CIF or POSCAR)
simmate workflows run relaxation/Matproj --structure NaCl.cif
# from a structure file (CIF or POSCAR).
simmate workflows run relaxation.vasp.matproj --structure NaCl.cif
```

``` yaml
# Workflows can also be run from YAML-based configuration
# files, such as the one shown here (named `example.yaml`).
# This would be submitted with the command:
# `simmate workflows run-yaml example.yaml`
workflow_name: relaxation.vasp.matproj
structure: NaCl.cif
command: mpirun -n 8 vasp_std > vasp.out
```
``` python
# Python lets you run workflows within scripts and
# it also enables advanced setting configurations.
# Simply load the workflow you'd like and run it!

from simmate.workflows.relaxation import Matproj_workflow
from simmate.workflows.relaxation import Relaxation__Vasp__Matproj as workflow

status = Matproj_workflow.run(structure="NaCl.cif")
state = workflow.run(structure="NaCl.cif")
result = state.result()
```
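
The examples above refer to the same workflow by two names: `relaxation.vasp.matproj` on the command line and in YAML, and `Relaxation__Vasp__Matproj` in Python. The sketch below illustrates the pattern visible in those two names. The function `workflow_class_name` is a hypothetical helper written for this illustration, not a Simmate API, and it assumes each dotted part is a single lowercase word.

```python
def workflow_class_name(dotted_name):
    """Convert a dotted workflow name into a class-style name (hypothetical helper)."""
    # Split "relaxation.vasp.matproj" into its three parts
    parts = dotted_name.split(".")
    # Capitalize each part and join the parts with double underscores
    return "__".join(part.capitalize() for part in parts)


print(workflow_class_name("relaxation.vasp.matproj"))
```

Names with several words in one part (for example, a custom settings name) would need extra handling that this sketch does not attempt.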


@@ -173,14 +183,14 @@ structure.add_oxidation_state_by_guess()
4. _**Ease of Scalability.**_ At the beginning of a project, you may want to write and run code on a single computer and single core. But as you run into some intense calculations, you may want to use all of your CPU and GPU to run calculations. At the extreme, some projects require thousands of computers across numerous locations, including university clusters (using SLURM or PBS) and cloud computing (using Kubernetes and Docker). Simmate can meet all of these needs thanks to integration with [Dask](https://github.com/dask/dask) and [Prefect](https://github.com/PrefectHQ/prefect):
```python
# To run the tasks of a single workflow in parallel, use Dask.
from prefect.executors import DaskExecutor
workflow.executor = DaskExecutor()
status = workflow.run(...)
from prefect.task_runners import DaskTaskRunner
workflow.task_runner = DaskTaskRunner()
state = workflow.run(...)

# To run many workflows in parallel, use Prefect.
# Once you configure Prefect, you simply switch
# from using "run" to "run_cloud"
status = workflow.run_cloud(...)
prefect_flow_run_id = workflow.run_cloud(...)

# You can use different combinations of these two parallelization strategies as well!
# Using Prefect and Dask, we can scale out across various computer resources
5 changes: 3 additions & 2 deletions pyproject.toml
@@ -18,7 +18,8 @@ markers = [
"blender: requires blender installed",
"pymatgen: runs a pymatgen-compatibility test",
"vasp: requires vasp installed",
"prefect_db: requires access to the prefect database"
"prefect_db: requires access to the prefect database",
"slow: test is slow (>30s) and unstable in the CI",
]

# By default, we only want to run unmarked tests. The simplest way to do this
@@ -27,7 +28,7 @@ markers = [
# migration folders.
# I manually remove -m when testing coverage, but am unsure if there's a better
# way to do this.
addopts = "--no-migrations -m 'not blender and not pymatgen and not vasp'"
addopts = "--no-migrations -m 'not blender and not pymatgen and not vasp and not slow'"

# There are a number of warnings that are expected when running our tests.
# We remove these from our output for clarity.
@@ -10,6 +10,7 @@
)


@pytest.mark.slow
@pytest.mark.prefect_db
@pytest.mark.django_db
def test_neb(sample_structures, tmpdir, mocker):
1 change: 0 additions & 1 deletion src/simmate/configuration/django/settings.py
@@ -163,7 +163,6 @@
"simmate.website.core_components.apps.CoreComponentsConfig",
"simmate.website.third_parties.apps.ThirdPartyConfig",
"simmate.website.workflows.apps.WorkflowsConfig",
"simmate.website.workflow_execution.apps.WorkflowExecutionConfig",
#
# These are built-in django apps that we use for extra features
"django.contrib.admin",
4 changes: 2 additions & 2 deletions src/simmate/configuration/example_project/README.md
@@ -18,10 +18,10 @@ your conda environment:
``` bash
# replace "my_new_project" with the name of your project
cd my_new_project
pip install -e my_new_project
pip install -e .
```

2. Make sure this install worked by running this line in python:
2. Make sure this install worked by running these lines in python:

``` bash
# You may need to restart your terminal/Spyder for this to work
10 changes: 6 additions & 4 deletions src/simmate/configuration/example_project/example_app/models.py
@@ -7,17 +7,19 @@
Whenever building new tables, be sure to start from the classes located in our
`simmate.database.base_data_types` module. These let you automatically add useful
columns and features -- rather than creating everything from scratch. Read
the [base_data_types documentation](https://jacksund.github.io/simmate/simmate/database/base_data_types.html)
for more information.
columns and features -- rather than creating everything from scratch.
For more information and advanced guides, be sure to read through our
[base_data_types documentation](https://jacksund.github.io/simmate/simmate/database/base_data_types.html)
"""

from simmate.database.base_data_types import DatabaseTable, Relaxation, table_column


# As an example here, we set up tables for a relaxation. There are two tables
# here: one for the relaxations and a second for storing each ionic step in the
# relaxation. Both of these tables
# relaxation. Both of these tables are created automatically with the create_subclasses
# method:
ExampleRelaxation, ExampleIonicStep = Relaxation.create_subclasses(
"Example",
module=__name__, # this line is required
@@ -6,8 +6,12 @@
can be incorporated into our workflows (in `workflows.py`).
Below is an example of a simple VaspTask, which is used to run a single VASP
calculation. For more complex settings, it's worth looking through our
library of other examples at `simmate.calculators.vasp.tasks`.
calculation.
For more information and advanced guides, be sure to read through our
[S3Task documentation](https://jacksund.github.io/simmate/simmate/workflow_engine/supervised_staged_shell_task.html)
as well as relevant subclasses like the
[VaspTask documentation](https://jacksund.github.io/simmate/simmate/calculators/vasp/tasks/base.html)
"""


@@ -1,9 +1,15 @@
# -*- coding: utf-8 -*-

"""
This file is important only when your project grows and starts getting users.
This file is important as your project grows and starts getting users.
Here, you need to make sure any changes you make don't break old features and
code.
Simmate does not have full guides on writing unittests yet, so we recommend
looking through other packages such as...
- [pytest](https://docs.pytest.org/en/7.1.x/)
- [pytest-django](https://pytest-django.readthedocs.io/en/latest/)
- [django test tutorials](https://docs.djangoproject.com/en/4.0/intro/tutorial05/)
"""

from django.test import TestCase
4 changes: 4 additions & 0 deletions src/simmate/configuration/example_project/example_app/urls.py
@@ -2,6 +2,10 @@
Want to make a website interface for your app? Then you'll need to fill this file
out! If you ever venture to this level, we strongly recommend you go through the
django tutorials first: https://docs.djangoproject.com/en/3.2/
COMING SOON: Simmate will automatically build out views for a workflow based
on what your database table looks like. Let us know if you are waiting on this
feature, so we can prioritize it!
"""

from django.urls import path
@@ -4,6 +4,10 @@
Want to make a website interface for your app? Then you'll need to fill this file
out! If you ever venture to this level, we strongly recommend you go through the
django tutorials first: https://docs.djangoproject.com/en/3.2/
COMING SOON: Simmate will automatically build out views for a workflow based
on what your database table looks like. Let us know if you are waiting on this
feature, so we can prioritize it!
"""

from django.shortcuts import render
34 changes: 12 additions & 22 deletions src/simmate/configuration/example_project/example_app/workflows.py
@@ -3,35 +3,25 @@
"""
Building a workflow in Simmate involves piecing together our Tasks and Database
tables. So before you start editing this file, make sure you have gone through
and edited the `models.py` and `tasks.py` files.
and edited the `models.py` and `tasks.py` files.
This is the most basic Workflow that you may build. For complex workflows that
require several calculations, other python logic, and parallelization, make
sure you read through the documentation for
[the Workflow class](https://jacksund.github.io/simmate/simmate/workflow_engine/workflow.html)
"""

# This function is what we use to automatically build simple one-task workflows
from simmate.workflow_engine import s3task_to_workflow
# To access advanced functionality, we should always inherit from the base Workflow
from simmate.workflow_engine import Workflow

# Import our tables and tasks from the other files.
# Note, the format `import <name> as <newname>` simply renames the class so
# Note, the format `from <place> import <name> as <newname>` simply renames the class so
# that the table/task classes don't have the same name in this file.
from .models import ExampleRelaxation as ExampleRelaxationTable
from .tasks import ExampleRelaxation as ExampleRelaxationTask

# Now build our workflow
example_workflow = s3task_to_workflow(
# The naming convention here follows how you would import this workflow.
# We would do this with `from example_app import example_workflow`, so
# this import corresponds to the following name:
name="example_app/example_workflow",
# This line is always the same and should be left unchanged
module=__name__,
# Set the name where you want this to show up in prefect cloud
project_name="Simmate-Relaxation",
class Relaxation__Vasp__MyCustomSettings(Workflow):
# Simply set these two variables to your Task and DatabaseTable classes!
s3task=ExampleRelaxationTable,
database_table=ExampleRelaxationTable,
# this sets what should be saved to the database BEFORE the calculation
# is actually started. This helps you search your database for calculations
# that have not completed yet.
register_kwargs=["structure", "source"],
# Quick description that will be used in the website interface
description_doc_short="This is my new fancy workflow!",
)
s3task = ExampleRelaxationTask
database_table = ExampleRelaxationTable