Updating docs.
rhysrevans3 committed Feb 27, 2024
1 parent 3399e22 commit b28c8e8
Showing 32 changed files with 667 additions and 897 deletions.
4 changes: 2 additions & 2 deletions docs/source/api/stac_generator/stac_generator.rst
@@ -4,10 +4,10 @@ STAC Generator

:fa:`github` `View on Github <https://github.com/cedadev/stac-generator>`_

.. automodule:: stac_generator.core.processor
.. automodule:: stac_generator.core.extraction_method
:members:

.. autoclass:: stac_generator.core.generator.BaseGenerator

.. automodule:: stac_generator.core.collection_describer
.. automodule:: stac_generator.core.baker
:members:
110 changes: 0 additions & 110 deletions docs/source/collection_descriptions/building_a_workflow.rst

This file was deleted.

130 changes: 0 additions & 130 deletions docs/source/collection_descriptions/collection_descriptions.rst

This file was deleted.

20 changes: 7 additions & 13 deletions docs/source/index.rst
@@ -10,12 +10,11 @@ change the source of the files, the output of the metadata and the processing ch
which extracts the metadata. The framework leverages a modular, plugin architecture
to allow users to modify the workflow to fit their needs.

The process expects a stream of "assets" (an asset being a file, zarr object, etc.).
The process expects a stream of "messages" for which the recipes can be run against.
The source of this stream is configured with `input plugins <stac_generator/inputs>`_
which could be as simple as listing directories on a file system or using message
queues as part of a complex ingest system. The `generators <generators>`_ operate on this stream and
pass to `output plugins <stac_generator/outputs>`_. The output is at the level
of an "asset" so higher level aggregated objects may require an aggregation step.
pass to `output plugins <stac_generator/outputs>`_.

These outputs are also configurable so could dump to the terminal (for debugging), file,
a data store (postgres, elasticsearch, etc.) or even a message queue for onward processing.
@@ -36,22 +35,17 @@ in a certain space and time.
Generators
==========

The different generators are designed to extract different levels of metadata to build the assets, items, and collections of the STAC Catalog.
The different generators are designed to extract different levels of metadata to build the items, and collections of the STAC Catalog.

.. list-table::
:header-rows: 1

* - Name
- Description
* - :ref:`Asset Generator <stac_generator/generators:asset>`
- Generates STAC Assets via extraction methods specified in the :ref:`colelction descriptions <collection_descriptions/collection_descriptions:collection descriptions>`
focusing on file metadata (name, location, size, etc.)
* - :ref:`Item Generator <item_generator/generators:item>`
- Generates STAC Items via extraction methods specified in the :ref:`colelction descriptions <collection_descriptions/collection_descriptions:collection descriptions>`
focusing on aggregation from asset metadata.
* - :ref:`Collection Generator <stac_generator/generators:collection>`
- Generates STAC Collections via extraction methods specified in the :ref:`colelction descriptions <collection_descriptions/collection_descriptions:collection descriptions>`
focusing on aggregation from item metadata.
* - :ref:`Item Generator <item_generator/plugins/generators/item>`
- Generates STAC Items via extraction methods specified in the :ref:`collection descriptions <recipe/recipes>`.
* - :ref:`Collection Generator <stac_generator/plugins/generators/collection>`
- Generates STAC Collections via extraction methods specified in the relevant :ref:`recipe <recipe/recipes>`.



81 changes: 81 additions & 0 deletions docs/source/recipes/building_a_workflow.rst
@@ -0,0 +1,81 @@
*****************
Building a Recipe
*****************

Building a STAC catalog workflow consists of 4 main steps:

1. Write a :ref:`recipe <recipes/recipes>` file for each STAC level to describe the workflow
2. Test the workflow on a subset of data
3. Index that subset of data to check it works as expected
4. Index the full dataset

Steps 1 and 2 will likely go round in a loop while you are developing the
recipe file, with several iterations until you are happy with the output.

1. Write a recipe
=================

A basic recipe consists of up to 5 sections:

1. ``paths``
2. ``type``
3. ``id``
4. ``extraction_methods``
5. ``member_of``

An example recipe can be found :ref:`here <recipes/recipes:Example Recipe>`.

The extraction methods section describes how the facets are extracted from the data.
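
As a rough illustration of the five sections listed above, the sketch below models a recipe as a plain Python dictionary and checks its top-level keys. Only the five section names come from this page. Every value, and the helper ``unknown_sections``, is a hypothetical placeholder and is not part of the stac-generator API. See the example recipe for the real layout.

```python
# The five top-level section names documented above.
RECIPE_SECTIONS = ("paths", "type", "id", "extraction_methods", "member_of")

# ASSUMPTION: all values here are illustrative placeholders, not the real
# recipe schema. Consult the example recipe for the actual structure.
example_recipe = {
    "paths": ["/badc/faam/data"],
    "type": "item",
    "id": "placeholder-id-definition",
    "extraction_methods": ["placeholder-extraction-method"],
    "member_of": ["placeholder-parent"],
}


def unknown_sections(recipe):
    """Return, sorted, any top-level keys that are not a documented section."""
    return sorted(set(recipe) - set(RECIPE_SECTIONS))


if __name__ == "__main__":
    # An empty list means every top-level key is one of the five sections.
    print(unknown_sections(example_recipe))
```

A check like this only catches misspelt section names early. It says nothing about whether the extraction methods themselves are configured correctly, which is what running the recipe is for.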

To check your recipe works as expected, you will need to run it.

2. Running the recipe on a subset of data
=========================================

To run your workflow, you will need to create a config file.
For testing, this config defines an input path and sends the output to standard out.

Example configuration
---------------------

.. include:: ../stac_generator/user_guide/example_config.rst

You should choose a filepath with a relatively small number of files to
make iteration quick and allow you to make tweaks.

You can then run your workflow using:

``stac_generator -c <path_to_config_file>``

.. program-output:: stac_generator -h

.. note::

    It is likely that this will be an iterative process to make sure that the correct
    facets are extracted and the output is as desired.

3. Indexing the data
====================

This step is as simple as changing your output plugin to point to the final destination.

Here is an example for the stac-fastapi output making use of additional kwargs:

.. code-block:: yaml

    - name: stac_fastapi
      api_url: <API_URL>
      verify: False

Once this works as expected...

4. Indexing the full dataset
============================

This is done by increasing the scope of the input plugin.
In the example we used the path ``/badc/faam/data/2005/b069-jan-05``. If our
recipe covers ``/badc/faam/data``, we can now expand our input to cover
``/badc/faam/data``.

.. note::

    The higher up the tree you put the input, the longer it will take. You might
    wish to consider splitting the run into smaller segments and running in parallel.
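
One way to split a large run into smaller segments is to treat each immediate subdirectory of the input root as a segment, then deal the segments into batches that can be run in parallel. The sketch below uses only the Python standard library. The helpers ``split_into_segments`` and ``chunk`` are hypothetical and not part of stac-generator. Each batch would still need its own config file whose input plugin points at those paths, which is not shown here.

```python
from pathlib import Path


def split_into_segments(root):
    """Return the immediate subdirectories of ``root``, sorted, one per segment."""
    return sorted(p for p in Path(root).iterdir() if p.is_dir())


def chunk(segments, n_workers):
    """Deal ``segments`` round-robin into ``n_workers`` batches."""
    return [segments[i::n_workers] for i in range(n_workers)]


if __name__ == "__main__":
    # ASSUMPTION: the current directory stands in for an input root such as
    # /badc/faam/data, and 4 is an arbitrary number of parallel runs.
    for batch in chunk(split_into_segments("."), 4):
        print([str(p) for p in batch])
```

Round-robin dealing keeps the batch sizes within one segment of each other, but it does not account for some directories holding far more files than others.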