From 8d163dca25b922ebdd0800fedb6e126534a1f842 Mon Sep 17 00:00:00 2001 From: Virginia Morales Date: Thu, 1 Aug 2024 16:24:25 +0200 Subject: [PATCH] Update documentation Signed-off-by: Virginia Morales --- docs/add_new_data.rst | 34 ++++++ docs/calculations_via_requests.rst | 100 +++++++++++++++++ docs/general_workflow.rst | 169 +++++++++++++++++++++++++++++ docs/index.rst | 37 ++++++- docs/user-guide.rst | 75 ++++++++++++- 5 files changed, 408 insertions(+), 7 deletions(-) create mode 100644 docs/add_new_data.rst create mode 100644 docs/calculations_via_requests.rst create mode 100644 docs/general_workflow.rst diff --git a/docs/add_new_data.rst b/docs/add_new_data.rst new file mode 100644 index 00000000..c9488851 --- /dev/null +++ b/docs/add_new_data.rst @@ -0,0 +1,34 @@ +HOW TO ADD NEW DATA +=================== + +Adding new hazards +------------------ + +In ``src/physrisk/kernel/hazards.py``, all hazards are cataloged, classified as ACUTE, CHRONIC, or UNKNOWN, and designated as either parameter-based or event-based. To add a new hazard, create a new class within this file and specify its type. + +Additionally, complete the onboarding process for the new hazard in the hazard program. This step ensures that the hazard and its data are collected in the bucket and included in the inventory used by PhysRisk to calculate impacts and risk measures. + +Adding new vulnerability models +------------------------------- + +In the ``src/physrisk/vulnerability_models`` folder, files correspond to each vulnerability model (e.g., ``chronic_heat_models``, ``power_generating_asset_models``, ``real_estate_models``, ``thermal_power_generation_models``). Each file contains classes for each type of hazard, with separate classes for default and stress_test calculations. 
The ``DictBasedVulnerabilityModelsFactory``, which inherits from the ``VulnerabilityModelsFactory`` class in ``src/physrisk/kernel/vulnerability_model.py``, is used to handle the different vulnerability models, whether they are default or stress_test models. +The ``DictBasedVulnerabilityModelsFactory`` class has a ``vulnerability_models`` method, which retrieves the corresponding vulnerability models using the functions ``get_default_vulnerability_models`` and ``get_stress_test_vulnerability_models``, implemented in ``src/physrisk/kernel/calculation.py``. + +To add a vulnerability model for a new asset type, create a new file with the corresponding vulnerability classes for each hazard type. Additionally, create a JSON file in the ``src/physrisk/data/static/vulnerability`` folder. This JSON file should include the vulnerability curves for these models, detailing ``impact_mean``, ``impact_std``, ``impact_type``, ``impact_units``, ``intensity``, ``intensity_units``, and ``location`` for each type of event (hazard) and asset. + +To add a vulnerability model for an existing asset, create a new class in the relevant file. All classes must inherit from either a class in ``src/physrisk/kernel/vulnerability_model.py`` or another class in the same file that inherits from a class in ``vulnerability_model.py``. These classes must include at least a constructor, a ``get_data_requests`` method, and a ``get_impact`` method. + +- The ``get_data_requests`` method returns a ``HazardDataRequest`` object, which stores all the information needed to request data from the bucket for calculating impacts and risk measures. +- The ``get_impact`` method returns an ``ImpactDistrib`` object, which holds the impact distribution specific to an asset and is used for calculating ``impact_bins_explicit``, ``mean_impact``, ``stddev_impact``, ``above_mean_stddev_impact`` and ``to_exceedance_curve``. 
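The three required members listed above can be sketched as follows. This is a minimal, self-contained illustration and not the Physrisk implementation: ``HazardDataRequest``, ``ImpactDistrib`` and ``Asset`` are simplified stand-ins defined here, ``ExampleHeatModel`` is a hypothetical class, and the ``get_impact`` signature (taking a single ``degree_days`` value) is an assumption made to keep the example short.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class HazardDataRequest:
    """Simplified stand-in: the information needed to request hazard data."""
    hazard_type: str
    longitude: float
    latitude: float
    scenario: str
    year: int
    indicator_id: str


@dataclass
class ImpactDistrib:
    """Simplified stand-in: impact bin edges and one probability per bin."""
    impact_bins: Sequence[float]
    prob: Sequence[float]

    def mean_impact(self) -> float:
        # Use the mid-point of each bin as its representative impact.
        mids = [(lo + hi) / 2 for lo, hi in zip(self.impact_bins[:-1], self.impact_bins[1:])]
        return sum(m * p for m, p in zip(mids, self.prob))


@dataclass
class Asset:
    """Simplified stand-in for an asset with a location."""
    longitude: float
    latitude: float


class ExampleHeatModel:
    """Hypothetical model with a constructor, get_data_requests and get_impact."""

    def __init__(self, indicator_id: str = "mean_degree_days/above/index") -> None:
        self.hazard_type = "ChronicHeat"
        self.indicator_id = indicator_id

    def get_data_requests(self, asset: Asset, *, scenario: str, year: int) -> HazardDataRequest:
        # Describe which hazard data is needed for this asset.
        return HazardDataRequest(
            self.hazard_type,
            asset.longitude,
            asset.latitude,
            scenario=scenario,
            year=year,
            indicator_id=self.indicator_id,
        )

    def get_impact(self, asset: Asset, degree_days: float) -> ImpactDistrib:
        # Toy response (assumption): loss grows linearly with degree days, capped at 1.
        loss = min(degree_days / 1000.0, 1.0)
        # All probability mass sits in one zero-width bin at the computed loss.
        return ImpactDistrib(impact_bins=[loss, loss], prob=[1.0])


model = ExampleHeatModel()
asset = Asset(longitude=2.35, latitude=48.85)
request = model.get_data_requests(asset, scenario="ssp585", year=2050)
impact = model.get_impact(asset, degree_days=250.0)
```

In the real code base the model would inherit from a class in ``vulnerability_model.py`` and receive hazard data responses rather than a bare number.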
+ +To include the new vulnerability model, update either ``get_default_vulnerability_models`` or ``get_stress_test_vulnerability_models`` as appropriate. If introducing a new calculation method, add a new method and integrate it into ``DictBasedVulnerabilityModelsFactory``. + + +Adding new risk models +---------------------- + +The classes in ``src/physrisk/risk_models/risk_models.py`` that implement risk models include ``RealEstateToyRiskMeasures`` (which calculates risk measures using exceedance curves) and ``ThermalPowerPlantsRiskMeasures`` (which calculates risk measures using mean intensity and percentiles). Additionally, in ``src/physrisk/risk_models/generic_risk_model.py``, the ``GenericScoreBasedRiskMeasures`` class showcases how different approaches can be combined for calculating risk scores, using vulnerability models for some hazards and direct hazard indicators for others. This generic implementation serves as an example, blending elements from both real estate and Jupiter exposure calculations, without a predefined use case for the measures. + +Moreover, similar to the vulnerability models, a factory class ``DefaultMeasuresFactory`` has been implemented in ``src/physrisk/kernel/calculation.py`` to select the appropriate risk model based on the ``use_case_id``. To do so, the class has a method called ``calculators``, which makes use of functions such as ``get_default_risk_measure_calculators``, also implemented in ``src/physrisk/kernel/calculation.py``. + +To add a new risk model, create a new class in the risk_models file that implements the calculations for the new model. 
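The selection by ``use_case_id`` described above can be pictured with a small dictionary-based factory. This is only a sketch under stated assumptions: the three risk-measure classes are empty stand-ins for the real ones, ``ExampleMeasuresFactory`` and its ``register`` method are hypothetical, and the ``use_case_id`` strings are illustrative values rather than the identifiers Physrisk uses.

```python
from typing import Dict, Type


class RealEstateToyRiskMeasures:
    """Empty stand-in for the exceedance-curve based risk model."""


class ThermalPowerPlantsRiskMeasures:
    """Empty stand-in for the mean-intensity and percentile based risk model."""


class GenericScoreBasedRiskMeasures:
    """Empty stand-in for the generic score-based risk model."""


class ExampleMeasuresFactory:
    """Hypothetical factory that picks a risk model from a use_case_id."""

    def __init__(self) -> None:
        # The keys are illustrative use_case_id values (assumption).
        self._calculators: Dict[str, Type] = {
            "DEFAULT": RealEstateToyRiskMeasures,
            "STRESS_TEST": ThermalPowerPlantsRiskMeasures,
            "GENERIC": GenericScoreBasedRiskMeasures,
        }

    def register(self, use_case_id: str, calculator: Type) -> None:
        # Adding a new risk model amounts to registering its class.
        self._calculators[use_case_id] = calculator

    def calculators(self, use_case_id: str = "DEFAULT") -> object:
        if use_case_id not in self._calculators:
            raise ValueError(f"unknown use_case_id: {use_case_id!r}")
        return self._calculators[use_case_id]()


class MyNewRiskMeasures:
    """A new, hypothetical risk model to be added."""


factory = ExampleMeasuresFactory()
factory.register("MY_USE_CASE", MyNewRiskMeasures)
calculator = factory.calculators("MY_USE_CASE")
```

Keeping the mapping in one place means a new risk model only needs its class and one registration, which mirrors the role the factory plays for vulnerability models.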
diff --git a/docs/calculations_via_requests.rst b/docs/calculations_via_requests.rst new file mode 100644 index 00000000..510794ab --- /dev/null +++ b/docs/calculations_via_requests.rst @@ -0,0 +1,100 @@ +CONTAINER AND REQUEST USAGE GUIDE +================================= + +In addition to invoking methods directly, as detailed in the ``general_workflow.rst`` guide, you can perform various actions or calculations through requests. For examples of this approach, see ``tests/risk_models/risk_models_AK_test.py`` and ``tests/risk_models/risk_models_test.py``. + +Process +------- + +1. Create a Container + + First, create a ``Container`` object and configure it by overriding its default providers with custom ones. Here’s an example: + + .. code-block:: python + + # Define custom factories for hazard and vulnerability models + class TestHazardModelFactory(HazardModelFactory): + def hazard_model(self, interpolation: str = "floor", provider_max_requests: Dict[str, int] = ...): + return ZarrHazardModel( + source_paths=get_default_source_paths(), reader=reader + ) + + class TestVulnerabilityModelFactory(VulnerabilityModelsFactory): + def vulnerability_models(self): + return DictBasedVulnerabilityModels( + { + ThermalPowerGeneratingAsset: [ + ThermalPowerGenerationAqueductWaterStressModel() + ] + } + ) + + # Register custom providers in the container + container.override_providers( + hazard_model_factory=providers.Factory(TestHazardModelFactory) + ) + container.override_providers( + config=providers.Configuration(default={"zarr_sources": ["embedded"]}) + ) + container.override_providers(inventory_reader=ZarrReader()) + container.override_providers(zarr_reader=ZarrReader()) + container.override_providers( + vulnerability_models_factory=providers.Factory(TestVulnerabilityModelFactory) + ) + + You can include any list of vulnerability models in the configuration. If none are provided, default models will be used. + +2. 
Create a Requester + + After setting up the container, call ``container.requester()`` to obtain an instance of ``Requester``. This object includes the following attributes configured from the container: + ``hazard_model_factory: HazardModelFactory, vulnerability_models_factory: VulnerabilityModelsFactory, inventory: Inventory, inventory_reader: InventoryReader, reader: ZarrReader, colormaps: Colormaps`` and a ``measures_factory: RiskMeasuresFactory`` + +3. Call the Method and Obtain a Response + + The ``Requester`` class has a main method that calls different methods based on the ``request_id`` provided. + + Here is an example of how to call a method using the ``get`` method: + + .. code-block:: python + + res = requester.get(request_id="get_asset_impact", request_dict=request_dict) + + You can assign the following values to ``request_id``: + + - ``get_hazard_data``: Returns intensity curves for the selected hazards, years, and scenarios. + - ``get_hazard_availability``: Returns the hazards stored in the inventory. + - ``get_hazard_description``: Returns the description assigned to a specific hazard. + - ``get_asset_exposure``: Calculates the exposure of a given asset for a hazard, exposure measure, scenario, and year. + - ``get_asset_impact``: Returns risk measures or impacts based on parameters provided in ``request_dict``. + - ``get_example_portfolios``: Returns a JSON with assets and their respective details such as class, type, location, latitude, and longitude. + + The structure of ``request_dict`` depends on the method you are calling. For example, for the ``get_asset_impact`` method, ``request_dict`` might look like this: + + .. 
code-block:: python + + def create_assets_json(assets: Sequence[ThermalPowerGeneratingAsset]): + assets_dict = { + "items": [ + { + "asset_class": type(asset).__name__, + "type": asset.type, + "location": asset.location, + "longitude": asset.longitude, + "latitude": asset.latitude, + } + for asset in assets + ], + } + return assets_dict + + request_dict = { + "assets": create_assets_json(assets=assets), + "include_asset_level": False, + "include_measures": True, + "include_calc_details": False, + "model_kind": ModelKind.STRESS_TEST, + "years": years, + "scenarios": scenarios, + } + + Finally, the ``get`` method calls the appropriate methods corresponding to the ``request_id`` with the necessary parameters and returns the response as a JSON object with the result. diff --git a/docs/general_workflow.rst b/docs/general_workflow.rst new file mode 100644 index 00000000..b1115d80 --- /dev/null +++ b/docs/general_workflow.rst @@ -0,0 +1,169 @@ +GENERAL WORKFLOW +================ + + +Impact calculation process +========================== + +To calculate impacts for a specific asset, scenario, year, vulnerability model, and hazard, use the following function: +``calculate_impacts(assets, hazard_model, vulnerability_model, scenario, year) -> Dict[ImpactKey, List[AssetImpactResult]]`` + + +First part: calculating intensity curves +---------------------------------------- +(calculate intensities and return_periods for acute hazards, parameters and definitions for chronic hazards) + +1. Request Hazard Data: + + For each asset, generate requests for hazard data and obtain responses: + + .. code-block:: python + + asset_requests, responses = _request_consolidated(hazard_model, model_asset, scenario, year) + + ``asset_requests`` include one or more data requests needed to compute ``VulnerabilityDistrib`` and ``HazardEventDistrib`` for the asset: + + .. 
code-block:: python + + HazardDataRequest(self.hazard_type, asset.longitude, asset.latitude, scenario=scenario, year=year, indicator_id=self.indicator_id) + + Responses are obtained from: + + .. code-block:: python + + hazard_model.get_hazard_events(requests) + + +2. Process Hazard Events: + + Responses are processed differently depending on whether the hazard is acute (events) or chronic (parameters): + Acute hazards: responses include periods, intensities, units, and paths. + Chronic hazards: responses include parameters, definitions, units, and paths. + +3. Retrieve Data: + + For Acute Hazards: + + .. code-block:: python + + hazard_data_provider = self.hazard_data_providers[hazard_type] + intensities, return_periods, units, path = hazard_data_provider.get_data(longitudes, latitudes, indicator_id=indicator_id, scenario=scenario, year=year, hint=hint, buffer=buffer) + + For Chronic Hazards: + + .. code-block:: python + + hazard_data_provider = self.hazard_data_providers[hazard_type] + parameters, definitions, units, path = hazard_data_provider.get_data(longitudes, latitudes, indicator_id=indicator_id, scenario=scenario, year=year, hint=hint, buffer=buffer) + + .. code-block:: python + + get_data(self, longitudes: List[float], latitudes: List[float], *, indicator_id: str, scenario: str, year: int, hint: Optional[HazardDataHint] = None, buffer: Optional[int] = None) + + The ``get_data`` method retrieves hazard data for the given coordinates. All arguments after ``latitudes`` are keyword-only. + +4. Determine Data Path: + + Build the path for data retrieval: + + .. code-block:: python + + path = self._get_source_path(indicator_id=indicator_id, scenario=scenario, year=year, hint=hint) + + ``get_source_path(SourcePath)`` provides the source path mappings. + +5. Retrieve Curves: + + If ``buffer`` is ``None``, use: + + .. code-block:: python + + values, indices, units = self._reader.get_curves(path, longitudes, latitudes, self._interpolation) + + If ``buffer`` is specified, the lookup covers an area of the given size around each point instead of the single point: + + .. 
code-block:: python + + values, indices, units = self._reader.get_max_curves( + path, + [ + ( + Point(longitude, latitude) + if buffer == 0 + else Point(longitude, latitude).buffer( + ZarrReader._get_equivalent_buffer_in_arc_degrees(latitude, buffer) + ) + ) + for longitude, latitude in zip(longitudes, latitudes) + ], + self._interpolation + ) + +6. Data Retrieval Functions: + + .. code-block:: python + + get_curves(self, set_id, longitudes, latitudes, interpolation="floor") + + Get Curves: Retrieves intensity curves for each coordinate pair. Returns intensity curves, return periods, and units. + + First, it constructs the path used to select the corresponding data in the bucket. From this data, it extracts the transformation matrix, coordinate system, data units, and return periods or indices (``index_values``). Next, it converts the geographic coordinates to image coordinates. Then, it interpolates the data based on the specified interpolation method. + + If the interpolation method is ``"floor"``, it converts ``image_coords`` to integer values using the floor function and adjusts coordinates for wrapping around the dataset dimensions. It retrieves the data values using ``z.get_coordinate_selection``, then reshapes and returns the data along with ``index_values`` and ``units``. + + For other interpolation methods (``"linear"``, ``"max"``, ``"min"``), it calls ``_linear_interp_frac_coordinates`` to perform the specified interpolation. Finally, it returns the interpolated results along with ``index_values`` and ``units``. + + .. code-block:: python + + get_max_curves(self, set_id, shapes, interpolation="floor") + + Get Max Curves: Retrieves the maximum intensity curves for given geometries. Returns maximal intensity curves, return periods, and units. + + First, it constructs the path used to locate the corresponding data in the bucket, similar to the ``get_curves`` method. 
From this data, it extracts the transformation matrix, coordinate system, data units, and index values (``index_values``). It then computes the inverse of the affine transformation matrix and applies it to the input geometries, transforming them into the coordinate system of the dataset. + + Next, it generates a ``MultiPoint`` for each shape by creating a grid of points within the shape's bounding box and intersecting these points with the shape to retain only those points that lie within the shape. If the intersection of a shape with the grid points is empty, it falls back to using the centroid of the shape as a single point. + + For the ``"floor"`` interpolation method, it converts the transformed coordinates to integer values using the floor method, retrieves the corresponding data values, and reshapes the data. For other interpolation methods (``"linear"``, ``"max"``, ``"min"``), it combines the transformed shapes with the multipoints and computes the fractional coordinates for interpolation. + + Finally, it calculates the maximum intensity values for each shape by grouping the points corresponding to each shape and finding the maximum value for each return period. The method then returns the maximum intensity curves, return periods, and units. + + +Second part: applying a vulnerability model to obtain impacts +------------------------------------------------------------- + +When applying a chronic-type vulnerability model, the impact is calculated using the model's ``get_impact`` method. This method will return an ``ImpactDistrib`` object, which includes ``impact_bins``, ``impact_type``, ``path``, and ``prob`` (i.e., it provides the impact distribution along with the hazard data used to infer it). This result is then stored in an ``AssetImpactResult`` object, together with the hazard_data (which consists of the intensity curves obtained previously). 
The ``AssetImpactResult`` is subsequently saved in the results dictionary, associated with an ``ImpactKey`` that comprises the ``asset``, ``hazard_type``, ``scenario``, and ``year``. + +On the other hand, for acute-type vulnerability models, the impact is calculated using the ``get_impact_details`` method of the model. This method returns an ``ImpactDistrib`` object, a ``VulnerabilityDistrib`` object (which includes ``impact_bins``, ``intensity_bins``, and ``prob_matrix``), and a ``HazardEventDistrib`` object (which contains ``intensity_bin_edges`` and ``prob``). In other words, it provides the impact distribution along with the vulnerability and hazard event distributions used to infer it. This information is stored in an ``AssetImpactResult`` object, which is then added to the results dictionary with an ``ImpactKey``. + + +Risk measures calculation process +================================= + +To calculate risk measures for a specific asset, scenario, year, vulnerability model, and hazard, use the following function: +``def calculate_risk_measures(self, assets: Sequence[Asset], prosp_scens: Sequence[str], years: Sequence[int]):`` + +1. Calculate all impacts + + First, using the ``_calculate_all_impacts`` method, the impacts for the specific hazard, asset, and vulnerability model are calculated for all the years and scenarios. This method uses ``_calculate_single_impact``, which calculates each impact using the ``calculate_impacts`` method previously described. + +2. Calculate risk measure + + For each asset, scenario, year, and hazard, the corresponding impact is used to determine the risk measures according to the selected calculation method. + + The impact of the historical scenario is chosen as the ``base impact``, and ``risk measures`` are calculated using the ``calc_measure`` function. 
+ + In the default use case, the ``calc_measure`` method defined in the ``RealEstateToyRiskMeasures`` class performs calculations differently depending on whether the hazard is chronic heat (``calc_measure_cooling``) or another type (``calc_measure_acute``). The difference between the two methods is that ``calc_measure_cooling`` uses ``mean impacts`` for calculations, while ``calc_measure_acute`` uses ``exceedance curves``. In both cases, a ``Measure`` object is returned, which contains a ``score`` (REDFLAG, HIGH, MEDIUM, LOW), ``measures_0`` (future_loss), and a ``definition``. + + For cooling hazards: it calculates the change in mean impact between historical and future scenarios. It assigns a risk score based on the future cooling levels and the change compared to predefined thresholds, returning a ``Measure`` object with the assigned score and future cooling value. + + For acute hazards: it calculates the potential loss based on a 100-year return period by comparing historical and future loss values derived from exceedance curves. It assigns a risk score based on future loss levels and the change in loss relative to predefined thresholds, returning a ``Measure`` object with the assigned score and future loss value. + + + For the stress_test use case, the ``calc_measure`` method of the ``ThermalPowerPlantsRiskMeasures`` class creates a ``StressTestImpact`` object to obtain the percentiles (norisk, p50, p75, p90), which are used to evaluate the impact based on its ``mean_intensity``. This method also returns a ``Measure`` object with a ``score`` (HIGH, MEDIUM, LOW, NORISK, NODATA), ``measures_0`` (mean_intensity), and a ``definition``. + + + For the generic use case, the ``calc_measure`` method of the ``GenericScoreBasedRiskMeasures`` class calculates risk scores differently depending on whether the impact distribution is needed or the underlying hazard data can be used instead. To generate the scores, bounds are defined for each hazard type. 
+ + When using hazard data: it compares hazard parameters to the predefined threshold bounds. It returns a score based on the severity of the hazard, or NODATA if the parameter is invalid. + + Otherwise: the method calculates two impact measures from historical and future data. It then determines the score category based on whether these measures fall within predefined ranges and returns a ``Measure`` object with the score and the first measure value. \ No newline at end of file diff --git a/docs/index.rst b/docs/index.rst index ebc910ca..b3fbb60a 100644 --- a/docs/index.rst +++ b/docs/index.rst @@ -62,6 +62,34 @@ This website contains the documentation for Physrisk, a calculation engine for P ^^^ API reference derived from code. +.. grid:: 2 + :gutter: 1 + + .. grid-item-card:: General Workflow + :link: general_workflow.html + :text-align: center + + :octicon:`workflow;5em;sd-text-info` + ^^^ + Overview of the general workflow in Physrisk. + + .. grid-item-card:: Add New Data + :link: add_new_data.html + :text-align: center + + :octicon:`database;5em;sd-text-info` + ^^^ + Instructions on how to add new data to the system. + + .. grid-item-card:: Calculations via Requests + :link: calculations_via_requests.html + :text-align: center + + :octicon:`inbox;5em;sd-text-info` + ^^^ + Guide to performing calculations through requests. 
+ + Physical Risk and Resilience ============================= @@ -98,9 +126,12 @@ Contents :maxdepth: 2 getting-started - methodology - user-guide - api/physrisk + methodology + user-guide + api/physrisk + general_workflow + add_new_data + calculations_via_requests Indices and tables ================== diff --git a/docs/user-guide.rst b/docs/user-guide.rst index 62820d72..af4d01d4 100644 --- a/docs/user-guide.rst +++ b/docs/user-guide.rst @@ -5,9 +5,76 @@ If you are looking for the methodology of Physrisk or information about the diff The following sections document the structures and conventions of Physrisk and assumes some familiarity with the calculation of Physical Climate Risk (see `methodology document `_ introduction). +Introduction to Physrisk +------------------------ +Physrisk comprises: -.. toctree:: - :maxdepth: 1 +* A :code:`HazardModel` that retrieves *hazard indicators* for different locations. +* :code:`VulnerabilityModels` that assess the vulnerability of assets to different climate hazards. :code:`VulnerabilityModels` use hazard indicators requested from the :code:`HazardModel` to calculate the *impact* of a hazard on a collection of assets. +* Financial models that use the impacts calculated by the :code:`VulnerabilityModels` to calculate risk measures and scores. - user-guide/introduction - user-guide/vulnerability_config +:code:`VulnerabilityModels` request hazard indicators using an :code:`indicator_id` (e.g. 'flood_depth' for inundation, 'max_speed' for wind). It is the responsibility of the :code:`HazardModel` to select the source of the hazard indicator data. + +Note that units of the quantity are provided to the :code:`VulnerabilityModel` by the :code:`HazardModel`. + +Hazard indicator data sets +-------------------------- +The :code:`HazardModel` retrieves hazard indicators in a number of ways and can be made composite in order to combine different ways of accessing the data. At the time of writing the common cases are that: + +1. 
Hazard indicator data is stored in `Zarr `_ format (in an arbitrary Zarr store, although S3 is a popular choice). +2. Hazard indicator data is retrieved via a call to an external API. This is mainly used when combining commercial data with public-domain data. + +In case 1, hazard indicators are stored as three-dimensional arrays. The array is ordered :math:`(z, y, x)` where :math:`y` is the spatial :math:`y` coordinate, :math:`x` is the spatial :math:`x` coordinate and :math:`z` is an *index* coordinate. The *index* takes on different meanings according to the type of data being stored. + +Indicators can be either: + +* Acute (A): the data comprises a set of hazard intensities for different return periods. In this case *index* refers to the different return periods. +* Parametric (P): the data comprises a set of parameters. Here *index* refers to the different parameters. The parameters may be single values, or *index* might refer to a set of thresholds. Parametric indicators are used for chronic hazards. + +As mentioned above, :code:`VulnerabilityModels` only specify the identifier of the hazard indicator that is required, as well as the climate scenario ID and the year of the future projection. This means that the hazard indicator ID uniquely defines the data. For example, a vulnerability model requesting 'flood depth' could have data returned from a variety of data sets, depending on how the :code:`HazardModel` is configured. But
+
++-----------------------+-------------------------------+---------------------------------------+
+| Hazard class          | Indicator ID (type)           | Description                           |
++=======================+===============================+=======================================+
+| CoastalInundation,    | flood_depth (A)               | Flood depth (m) for available         |
+| PluvialInundation,    |                               | return periods. This is unprotected   |
+| RiverineInundation    |                               | depth.                                |
+|                       +-------------------------------+---------------------------------------+
+|                       | sop (P)                       | Standard of protection                |
+|                       |                               | (as return period in years).          |
++-----------------------+-------------------------------+---------------------------------------+
+| Fire                  | fire_probability (P)          | Annual probability that location      |
+|                       |                               | is in a wildfire zone.                |
++-----------------------+-------------------------------+---------------------------------------+
+| Heat                  | mean_degree_days/above/index  | Mean mean-temperature degree days per |
+|                       | (P)                           | year above a set of temperature       |
+|                       |                               | threshold indices.                    |
++-----------------------+-------------------------------+---------------------------------------+
+| Drought               | months/spei/12m/below/index   | Mean months per year where the 12     |
+|                       | (P)                           | month SPEI index is below a set of    |
+|                       |                               | indices.                              |
++-----------------------+-------------------------------+---------------------------------------+
+| Wind                  | max_speed                     | Maximum 1 minute sustained wind speed |
+|                       | (A)                           | for available return periods.         |
++-----------------------+-------------------------------+---------------------------------------+
+| Subsidence            | susceptability (P)            | Score (1-5) based on soils’ clay      |
+|                       |                               | content.                              |
+|                       +-------------------------------+---------------------------------------+
+|                       | land_subsidence_rate (P)      | Land subsidence rate                  |
+|                       |                               | (millimetres/year).                   |
++-----------------------+-------------------------------+---------------------------------------+
+| Landslide             | susceptability (P)            | Score (1-5) based on characteristics  |
+|                       |                               | of the terrain combined with daily    |
+|                       |                               | maximum precipitation (per return     |
+|                       |                               | period).                              |
++-----------------------+-------------------------------+---------------------------------------+
+| WaterStress           | water_stress (P)              | Ratio of water demand and water       |
+|                       |                               | supply.                               |
++-----------------------+-------------------------------+---------------------------------------+
+| HighFire              | fwiX (P)                      | Daily probabilities of high forest    |
+|                       |                               | fire danger in Europe.                |
++-----------------------+-------------------------------+---------------------------------------+
+| ChronicWind           | windX (P)                     | Gridded annual probability of severe /|
+|                       |                               | extreme convective windstorms (defined|
+|                       |                               | as wind gusts \> X m/s).              |
++-----------------------+-------------------------------+---------------------------------------+