263 move data contribution guide #264

Merged
merged 15 commits into from
Jul 16, 2024
12 changes: 12 additions & 0 deletions docs/data/modules/ROOT/nav.adoc
@@ -1,16 +1,28 @@
// Data management menu
* xref:index.adoc[Data Management]
** xref:data-description.adoc[Data Description]
** xref:data-types.adoc[Data Types]
*** xref:data-types.adoc#_data_formats[Data Formats]
*** xref:data-types.adoc#_feel_data[{feelpp} Data]
** xref:data-storage.adoc[Data Storage]

*** xref:data-storage.adoc#_girder[Girder]
// Contribute data to Girder menu
**** xref:girder/README.adoc#_file_access_options[File access options]
**** xref:girder/python_scripts.adoc[Python scripts examples]
***** xref:girder/python_scripts.adoc#_with_user_and_password[With user and password]
***** xref:girder/api_keys.adoc#_using_api_keys[Using API keys]

*** xref:data-storage.adoc#_mongodb[MongoDB]
*** xref:data-storage.adoc#_github[GitHub]
** xref:data-visualisation.adoc[Data Visualization]
*** xref:data-visualisation.adoc#_paraview[Paraview]
*** xref:data-visualisation.adoc#_paraview_web[Paraview-Web]
*** xref:data-visualisation.adoc#_vtk_js[Vtk.js]



// Data management plan menu
* xref:plan/index.adoc[Data Management Plan]
** xref:plan/basics.adoc[Basic Aspects]
** xref:plan/fair.adoc[FAIR Data]
1 change: 1 addition & 0 deletions docs/data/modules/ROOT/pages/girder/README.adoc
@@ -0,0 +1 @@
include::data:ROOT:partial$girder/README.adoc[]
1 change: 1 addition & 0 deletions docs/data/modules/ROOT/pages/girder/api_keys.adoc
@@ -0,0 +1 @@
include::data:ROOT:partial$girder/api_keys.adoc[]
1 change: 1 addition & 0 deletions docs/data/modules/ROOT/pages/girder/python_scripts.adoc
@@ -0,0 +1 @@
include::data:ROOT:partial$girder/python_scripts.adoc[]
69 changes: 69 additions & 0 deletions docs/data/modules/ROOT/partials/girder/README.adoc
@@ -0,0 +1,69 @@
== File access options

A major advantage of Girder is that files can be accessed either through a user
interface (in a web browser) or programmatically (via a CLI or programming
interfaces).
This holds for both upload and download.

=== Web browser UI

If you are new to Girder, start with the web user interface to discover the
service.
You are encouraged to read the
link:http://girder.readthedocs.io/[documentation], especially the
link:http://girder.readthedocs.io/en/latest/user-guide.html[user guide].
For example, using a web browser, you can reach
link:https://girder.math.unistra.fr/#collections[our server].

|===
| image:girder/girder_web_ui.png[Web user interface,100%]
|===

NOTE: The web interface is straightforward. It illustrates the main concepts
and makes the following sections easier to picture.

=== Python API

To download or upload a file using Python, we have two options:
a login (user name and password) or an API key.

- With a user name and password, the script provides the same credentials a
person would type in the web UI.
- With an API key, the script only needs the key.
This means we do not need to have an account on the Girder server.

In both cases, similar pieces of information are required:

- an *address* (URL): to reach the server,
- a *file ID*: to tell which file/directory we want to manipulate,
- either a *user/password* pair or an *API key*: to grant access to the
required file(s).
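
The pieces of information listed above can be gathered in one place before any request is made.
The following is only a minimal sketch, not part of the Girder client: the helper `read_girder_settings` and the environment variable names (`GIRDER_API_URL`, `GIRDER_FILE_ID`, `GIRDER_API_KEY`, `GIRDER_USER`) are hypothetical names chosen for illustration.

```python
import os


def read_girder_settings(env=None):
    """Collect the address, file ID and credentials from environment variables.

    All variable names below are hypothetical; adapt them to your setup.
    """
    env = os.environ if env is None else env
    settings = {
        "api_url": env.get("GIRDER_API_URL", ""),
        "file_id": env.get("GIRDER_FILE_ID", ""),
        "api_key": env.get("GIRDER_API_KEY"),
        "user": env.get("GIRDER_USER"),
    }
    # The address and the file ID are needed in both cases.
    if not settings["api_url"] or not settings["file_id"]:
        raise ValueError("both an address (URL) and a file ID are required")
    # Either an API key or a user name (the password is asked for later).
    if not settings["api_key"] and not settings["user"]:
        raise ValueError("provide either an API key or a user name")
    return settings
```

Keeping credentials out of the script itself also avoids committing them to a repository by accident.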

=== {feelpp} remotedata tool

The application `feelpp_remotedata` can also be used to upload/download data from Girder, both files and entire folders.
Some examples of its usage follow, and additional information can be found xref:user:using:tools/remotedata.adoc[here].

.Download one file into a specific directory
----
feelpp_remotedata --download "girder:{file:5b1f8707b0e9570499f66bd0}" --data $HOME/mydir
----
.Download one folder
----
feelpp_remotedata --download "girder:{folder:<folder-id>}"
----
.Upload one file/folder (requires authentication)
----
feelpp_remotedata --upload "girder:{folder:<a folder id>}" --data $HOME/mydata
----

A Python interface to these functionalities is also available.

.Download two files from Girder using Python
[source,python]
----
import feelpp as fpp
app = fpp.Environment(["myapp"],config=fpp.localRepository(""))
sm_csv_names = fpp.download( "girder:{file:[<file1-id>,<file2-id>]}", worldComm=app.worldCommPtr())
# It is possible to download an entire folder in zip format
# sm_csv_zipped_folder = fpp.download( "girder:{folder:<folder-id>}", worldComm=app.worldCommPtr())
----
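
The resource strings used in the examples above follow a regular pattern, so they can be built from IDs rather than typed by hand.
The helper below is a hypothetical convenience that only formats the string and is not part of `feelpp`. The list form is shown in this document for files only, so using it for folders is an assumption.

```python
def girder_resource(kind, ids):
    """Build a resource string for the 'girder:' scheme used above.

    `kind` is 'file' or 'folder'; `ids` is a single ID or a list of IDs.
    """
    if kind not in ("file", "folder"):
        raise ValueError("kind must be 'file' or 'folder'")
    if isinstance(ids, str):
        # Single ID, as in the command-line examples.
        return "girder:{%s:%s}" % (kind, ids)
    # Several IDs, written as a bracketed, comma-separated list.
    return "girder:{%s:[%s]}" % (kind, ",".join(ids))
```

The result can then be passed to `fpp.download` or to the `--download` option in place of a hand-written string.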
79 changes: 79 additions & 0 deletions docs/data/modules/ROOT/partials/girder/api_keys.adoc
@@ -0,0 +1,79 @@
= Using API keys

An API key grants access to a given set of files, with given permissions.
It is convenient for scripts because it is nothing more than a character
string: one could think of it as a long, pseudorandom, special-purpose password.

NOTE: To learn how to set API keys, read
link:http://girder.readthedocs.io/en/latest/user-guide.html#api-keys[this part]
of the documentation.

== Download

To use an API key to download a file, see this script:

[source, python]
----
#!/usr/bin/env python3


# We need the girder client module.
import girder_client

# First, we initiate the client with the girder server address.
gc = girder_client.GirderClient(apiUrl='https://girder.math.unistra.fr/api/v1')

# We authenticate using only the API key
gc.authenticate(apiKey='KEY') # <1>

# We download the file using its file ID. The path indicates where the file
# should be written (the full file name should be included at the end of the path)
gc.downloadFile(fileId='FILEID', path='PATH') # <2>

----

<1> *KEY* is the only information needed to authenticate.

<2> *FILEID* should be replaced by the actual Girder file ID and *PATH* should
be the path where to store the results, including the desired file name and
extension.
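
Since *PATH* must end with the file name and its extension, it can help to build it explicitly.
The following is a minimal sketch of that idea; `destination_path` is a hypothetical helper, not a function of the Girder client.

```python
import os


def destination_path(directory, filename):
    """Return directory/filename, refusing an empty or nested file name.

    The directory is created if needed, so the returned value can be
    passed as the `path` argument of downloadFile.
    """
    if not filename or os.path.basename(filename) != filename:
        raise ValueError("filename must be a plain file name with its extension")
    os.makedirs(directory, exist_ok=True)
    return os.path.join(directory, filename)
```

For instance, `gc.downloadFile(fileId='FILEID', path=destination_path('results', 'data.csv'))` would write the file as `data.csv` in the `results` directory, where both names are placeholders.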


== Upload

To upload using an API key:

[source, python]
----
#!/usr/bin/env python3


# We need the girder client module.
import girder_client

# First, we initiate the client with the girder server address.
gc = girder_client.GirderClient(apiUrl='https://girder.math.unistra.fr/api/v1')

# We authenticate using only the API key
gc.authenticate(apiKey='KEY') # <1>

# /!\ This is mandatory: we have to open the file in binary read mode before
# uploading it. The "with" block closes the file once the upload is done.
with open('PATH', 'rb') as f: # <2>
    # Now we can upload the file <3>
    gc.uploadFile(parentId='PID', stream=f, name="NAME", size=SIZE, parentType='TYPE')

----

<1> *KEY* is the only information needed to authenticate.

<2> *PATH* should be replaced by the full path to the file to read.
*rb* stands for "binary read mode", so the bytes sent match the file size.

<3> *PID* should be replaced by the parent directory ID (on the Girder server).
*f* is the read stream defined previously.
*NAME* should be replaced by the desired file name (on the Girder server).
*SIZE* should be replaced by the file size (in bytes).
*TYPE* is either *folder*, *user*, or *collection*.
134 changes: 134 additions & 0 deletions docs/data/modules/ROOT/partials/girder/python_scripts.adoc
@@ -0,0 +1,134 @@
= Python scripts

Here, we provide some Python scripts to access files on Girder.
To use them, first install the Girder client Python module (see
link:http://girder.readthedocs.io/en/latest/python-client.html[the Python client documentation]).

== With user and password

=== What not to do

WARNING: The following script is only meant as a first step to understand
how the module works.
Do not use it as is, because it requires your password to be written in clear
text in the script.
Use the interactive version instead.

[source, python]
----
#!/usr/bin/env python3


# We need the girder client module.
import girder_client

# First, we initiate the client with the girder server address.
gc = girder_client.GirderClient(apiUrl='https://girder.math.unistra.fr/api/v1')

# We authenticate using the username and the password.
# /!\ This is for learning purposes only.
# For security reasons, you should never put your password in the script.
gc.authenticate(username='USER', password='PASSWORD') # <1>

# We download the file using its file ID. The path indicates where the file
# should be written (the file name should be included at the end of the path)
gc.downloadFile(fileId='FILEID', path='PATH') # <2>

----
<1> *USER* should be replaced by the Girder user name and
*PASSWORD* by the corresponding password.
<2> *FILEID* should be replaced by the actual Girder file ID and *PATH* should
be the path where to store the results, including the desired file name and
extension.

WARNING: If you don't supply the file name, the system will not warn you: it
will *automatically generate one*, which could be confusing!

IMPORTANT: Remember not to use this script. Try the interactive one instead.


=== Interactive download

Here, we call the _authenticate_ function with the _interactive=True_ argument
to log in interactively.
This means the password is prompted for at run time.

IMPORTANT: This implies the script cannot be used in a fully automated way,
because each execution requires the user to be present to type the password.
For safe and automated access, use API keys.

[source, python]
----
#!/usr/bin/env python3


# We need the girder client module.
import girder_client

# First, we initiate the client with the girder server address.
gc = girder_client.GirderClient(apiUrl='https://girder.math.unistra.fr/api/v1')

# We authenticate using the username, the password will be typed at runtime
gc.authenticate(username='USER', interactive=True) # <1>

# We download the file using its file ID. The path indicates where the file
# should be written (the full file name should be included at the end of the path)
gc.downloadFile(fileId='FILEID', path='PATH') # <2>

----
<1> *USER* should be replaced by the Girder user name, and because of the
_interactive=True_ argument, the password will be prompted for at runtime.
<2> *FILEID* should be replaced by the actual Girder file ID and *PATH* should be the path where to store the results, including the desired file name and extension.

TIP: We can even prompt the user to type *both the user name and the password*
by providing *only* the _interactive=True_ argument!
This is a better solution when multiple users are likely to use the script only
once or a handful of times each.
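
The two interactive variants described above can be wrapped in one small function.
This is a minimal sketch: `interactive_login` is a hypothetical helper, and `client` is assumed to be a `girder_client.GirderClient` instance created as in the script above.

```python
def interactive_login(client, username=None):
    """Authenticate interactively, prompting for whatever is missing."""
    if username is None:
        # Only interactive=True: the user name and the password are both
        # prompted for at run time.
        client.authenticate(interactive=True)
    else:
        # The user name is fixed; only the password is prompted for.
        client.authenticate(username=username, interactive=True)
```

Calling `interactive_login(gc)` suits a script shared between several users, while `interactive_login(gc, 'USER')` suits a script used by one person.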

WARNING: If you don't supply the file name, the system will not warn you: it
will *automatically generate one*, which could be confusing!



=== Interactive upload

To upload a file, only a few changes are required.

[source, python]
----
#!/usr/bin/env python3


# We need the girder client module.
import girder_client

# First, we initiate the client with the girder server address.
gc = girder_client.GirderClient(apiUrl='https://girder.math.unistra.fr/api/v1')

# We authenticate using the username, the password will be typed at runtime
gc.authenticate(username='USER', interactive=True) # <1>


# /!\ This is mandatory: we have to open the file in binary read mode before
# uploading it. The "with" block closes the file once the upload is done.
with open('PATH', 'rb') as f: # <2>
    # Now we can upload the file <3>
    gc.uploadFile(parentId='PID', stream=f, name="NAME", size=SIZE, parentType='TYPE')

----

<1> *USER* should be replaced by the Girder user name, and because of the
_interactive=True_ argument, the password will be prompted for at runtime.

<2> *PATH* should be replaced by the full path to the file to read.
*rb* stands for "binary read mode", so the bytes sent match the file size.

<3> *PID* should be replaced by the parent directory ID (on the Girder server).
*f* is the read stream defined previously.
*NAME* should be replaced by the desired file name (on the Girder server).
*SIZE* should be replaced by the file size (in bytes).
*TYPE* is either *folder*, *user*, or *collection*.

NOTE: We should try and find a way to get the file size automatically.
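
One possible way to obtain the size automatically is to ask the file system with `os.path.getsize` before opening the stream.
The sketch below assumes `client` is an authenticated `girder_client.GirderClient`. The helper `upload_with_size` and its parameter names are hypothetical.

```python
import os


def upload_with_size(client, path, parent_id, parent_type="folder", name=None):
    """Upload `path`, computing its size and default name automatically."""
    size = os.path.getsize(path)            # file size in bytes
    name = name or os.path.basename(path)   # default: keep the local file name
    # Binary mode, so the number of bytes read matches `size`.
    with open(path, "rb") as stream:
        return client.uploadFile(parentId=parent_id, stream=stream, name=name,
                                 size=size, parentType=parent_type)
```

With this helper, the upload step of the script above would become `upload_with_size(gc, 'PATH', 'PID', parent_type='TYPE')`.
The Python client also appears to offer `uploadFileToFolder`, which takes a local file path directly. Check the client documentation for the installed version before relying on it.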