Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

docs(cloning): add brief, hld and development plan #551

Merged
merged 14 commits into from
Oct 19, 2021

Conversation

kgajowy
Copy link
Contributor

@kgajowy kgajowy commented Oct 1, 2021

Substitute this line for a meaningful title for your changes

Overview

Please write a description. If the PR is hard to understand, provide a quick explanation of the code.

Designs

Link to the related design prototypes (if applicable)

Testing instructions

Please explain how to test the PR: ID of a dataset, steps to reach the feature, etc.

Feature relevant tickets

Link to the related task manager tickets


Checklist before submitting

  • Meaningful commits and code rebased on develop.
  • If this PR adds feature that should be tested for regressions when
    deploying to staging/production, please add brief testing instructions
    to the deploy checklist (docs/deployment-checklist.md)
  • Update CHANGELOG file

@vercel
Copy link

vercel bot commented Oct 1, 2021

This pull request is being automatically deployed with Vercel (learn more).
To see the status of your deployments, click below or on the icon next to each commit.

marxan – ./app

🔍 Inspect: https://vercel.com/vizzuality1/marxan/BorWNx1bQC97nuAgr7kV5iF7fTF3
✅ Preview: https://marxan-git-docs-download-upload-close-vizzuality1.vercel.app

marxan-storybook – ./app

🔍 Inspect: https://vercel.com/vizzuality1/marxan-storybook/3dsGDCG2t7KJPZYw3Hz8WFmQWdYE
✅ Preview: https://marxan-storybook-git-docs-download-upload-close-vizzuality1.vercel.app

@kgajowy kgajowy changed the title docs(cloning): add brief and hld draft docs(cloning): add brief, hld and development plan Oct 7, 2021
docs/download-upload-clone/brief.md Outdated Show resolved Hide resolved
# Release 2

* project's:
* planning area: selected GADM & grid config settings
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

selected entire gadm or its gids?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

gids.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

* as the client agreed, pre-defined resources do not need to be described 
  with unique identifier shared across possible instances of MarxanCloud 
  (Databases) - if any shared resource with imported ID is not available, 
  import should be rejected

or event no GID is needed - just particular ID of given region specification from our DB.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, not sure if we could use the ID of a given region, GIDs of an area are the same across instances; ids aren't.

docs/download-upload-clone/development-plan.md Outdated Show resolved Hide resolved
# Release 3

* project's:
* custom grid, settings and generated PU with geometries
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

isn't generated pu with geometries already included in the above release?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in case of custom PU grids, all we may have is the source shapefile (if we decide to store it), its name (we would need to store it if we want to use it), and the PU geometries persisted to db from the shapefile we originally processed; there would be no other settings involved here, AFAICT.
the PU geometries in this case would not be generated, strictly speaking, but persisted from the shapefile input (after any cleanup such as st_makevalid() as for other custom uploads such as custom features)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the PU geometries in this case would not be generated, strictly speaking, but persisted from the shapefile input
we don't know yet but this may be true - as comparing diff (if shapefile changed) may be troublesome (but if we would even compare MD5, then using existing PU-persistence version could pay off on speed factor... but not really sure we shall focus on micro-optimizations now)

docs/download-upload-clone/development-plan.md Outdated Show resolved Hide resolved
Comment on lines +4 to +6
* as the client agreed, pre-defined resources do not need to be described
with unique identifier shared across possible instances of MarxanCloud
(Databases) - if any shared resource with imported ID is not available,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so, do they have the same id across instances? And it's unique in the instance, right?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is a bit tricky.
in the current implementation, we cannot rely on stable db-generated UUIDs as proxies unique ids of datasets that are imported platform-wide via ETL pipelines, because the UUIDs would different between instances, yet the data may be exactly the same (for example, same release of GADM data from upstream).

we would need to add and handle some basic metadata such as unique dataset name/id (can be set when creating new ETL pipelines if these are included in the core repo), and version (this could also be part of the unique id, since we won't need to care if source is using GADM dataset vN and target is using GADM dataset vM - we'll need to consider them as two fully different datasets)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actually, we may want to keep track of version separately, in case it may be ok to allow importing data where the same ETL-imported dataset is present as in the source instance, but in a later version. unlikely, but probably worth using an extra field for this.

# Release 4

* scenario's:
* IUCN categories (protected areas) & threshold
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we will need (in a different release) custom protected areas too

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added 0ecea36

Comment on lines +11 to +13
* existing data converted from shapefiles should be converted back to
shapefile format (currently the platform doesn't keep the uploaded files,
rather consumes them)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we need to confirm that we aren't dropping any data while consuming that may not be relevant for use in the instance but may be useful on import - none I can think of, but may need some cross checking.

* when declaring file to import, system should accept the options to:
* run scenarios with settings as per normal run once import is finished,
as long as the solutions are not included in the imported file
* export operation is asynchronous, should block the usage of project/scenarios
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also, we shouldn't start an export operation if there are any running jobs for the project or any of its scenarios.

likely also if there are failed jobs (garbage in, garbage out - or rather in case of exports or clones, garbage out, garbage in)...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hotzevzl this may be tricky. If user already uploaded (for example) different cost surface (which failed), effectively previous one MAY still be valid.

As checking this may not be trivial... not sure if we should include such requirement? May it be worth discussing with client?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

blocking export added: 95ef596

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

run scenarios with settings as per normal run once import is finished, 
    as long as the solutions are not included in the imported file

What is the need to run marxan if the outputs are not saved?
Jen mentioned that some users won't want to run their imported projects, and keep the original results. is this the case for that?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yep, that's correct, as you say - if users have solutions in the archive being imported and they want to preserve them, we need to make sure Marxan is not run after importing the archive

* run scenarios with settings as per normal run once import is finished,
as long as the solutions are not included in the imported file
* export operation is asynchronous, should block the usage of project/scenarios
* import operation is asynchronous
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

likewise, we should block any operations on a project or any of its scenarios until the whole project and its scenarios are fully imported.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

* existing shapes/data that came from `shapefile` - should it be converted
"back" by the system, or should be store the original uploaded
`shapefiles` and keep their reference?
* exported zip format/shape
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i should mention sqlite here 😉


* project's:
* custom grid, settings and generated PU with geometries
* custom grid's shapefile (restored, not original)
Copy link
Contributor

@Dyostiq Dyostiq Oct 8, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do we export shapefiles?
just because we already have a way to import them?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but then we need an ability to import geojson or shapefile, whereas we don't need to export shapefile – unless, we want to allow users to use exported data in other environment than ours

Copy link
Contributor Author

@kgajowy kgajowy Oct 8, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

most of our jobs base on shapefiles, so it should help a little.

May be the phrasing isn't best - the plan only meant to point which "feature" we should care in each release. How exactly import/export on each of those will work is a subject of each feature.

Anyway, as long as we want to allow users to work on those parts (for example, modify Project Area in qGIS), we shall serve shapefiles or geojsons.

@kgajowy
Copy link
Contributor Author

kgajowy commented Oct 13, 2021

Added note about public project cloning: #571 (comment)


# User actions

In all these use cases, the core operation being described is that of cloning a project or a scenario: this may be done within the platform itself by clicking on a “Clone project” or “Clone scenario” button, or (in the case of projects) as a two-step process, first downloading an archive file containing all the data needed to successively upload the same project to the same or another Marxan Cloud instance, and secondly actually importing the archive.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to be sure that we are aligned on this (from user perspective):

Scenarios:

  • Only cloning is available.
  • User will click on a "duplicate" button and the scenario will get duplicated on the same project without further interaction.

Project:

  • Three possible actions: Clone, Export, Import
  • Clone: User clicks on a button, the project gets duplicated in their account without further interaction.
  • Export: User clicks on a button and gets a file.
  • Import: User uploads a file to the platform and a project is created within their account.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd even add this text here verbatim for extra clarity of the process from the user point of view (it's described in the paragraph but I think this breakdown makes things clearer at a glance for non-technical readers... if any)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dhakelila sounds good to me, would you mind to add a commit with this clarification to brief?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added 56aed4f

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey! sorry, I'm seeing your message just now. :(
Thanks for adding it!

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it contemplated that the user can change the name of the cloned project/scenario in order to save it?

},
"planningAreaGeometry": {
"uuid": "uuid",
"[comment]": "may be shapefile or whatever version wants",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd use just a markdown wrapping an example, a JSON schema if you want to describe the schema, or json5 if you're going to make comments.

"description": "description",
"type": "marxan",
"hasResults": false,
"executeRunOnceImported": false
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

numberOfRuns and boundaryLengthModifier are fields defined in scenario entity and not present in marxanSettings.

* While Scenario implementation phases aim at delivering full flow with no
customizations first, Project includes all custom features first to avoid risk
of having issues with underlying scenarios later.
* All phases (not before each single) should be preceded with spikes related to relevant points from HLD document (like shapefiles handling)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Of course, because of spikes, we may get some changes in the plan, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thats right.


* scenario's:
* custom (uploaded) features
* allow to `execute run` when uploading if there were no previous results
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If there were previous results, remember that it might be important for the user to keep those.
This needs to be checked with the client, tho. But it shouldn't be a blocker to keep going with the plan.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, forget. I just realized that is in the previous line. :)

@kgajowy kgajowy merged commit 5229442 into develop Oct 19, 2021
@kgajowy kgajowy deleted the docs/download-upload-close branch October 19, 2021 11:20
hotzevzl added a commit that referenced this pull request Oct 19, 2021
* docs(cloning): add brief and hld draft

* docs(cloning): add license spike; fix typo

* Update docs/download-upload-clone/brief.md

Co-authored-by: Dyostiq <[email protected]>

* Update docs/download-upload-clone/development-plan.md

Co-authored-by: Dyostiq <[email protected]>

* docs(cloning): remove redundant release

* Update docs/download-upload-clone/brief.md

Co-authored-by: andrea rota <[email protected]>

* docs(cloning): add custom protected areas to release plan

* docs(cloning): export should be denied when something is in progress

* docs(cloning): project/scenario restrictions when import is ongoing

* docs(cloning): add note about cloning public project

* docs(cloning): export dataset version

* docs(cloning): add v1 archive content example

* docs(cloning): add archive type

* docs(cloning): explicit describe user actions

Co-authored-by: Dyostiq <[email protected]>
Co-authored-by: andrea rota <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants