Docs fixes and improvements #102

Merged
merged 7 commits into from
Jun 27, 2020

Conversation

sappelhoff
Contributor

@sappelhoff sappelhoff commented Jun 22, 2020

related to #97.

I am collecting several smaller fixes while going through the commands @bpoldrack recommended.

Questions that come up regarding --mode export

These questions may be answered and then used to improve the tutorial.

  • what would I have to do to have my data exported in a folder with the name of the dataset (as in datalad create DATASET_NAME) ... as opposed to having all files being loaded directly into the OSF? --> currently not possible, but may be a feature in the future (see Docs fixes and improvements #102 (comment))

  • the .datalad directory seems to be exported as well, ... how can I suppress this? (I may want to suppress writing the .datalad directory if using BIDS, and not wanting to add a .bidsignore with a single line .datalad) --> probably just have to live with it ... CAN be solved (but only in hacky ways) Docs fixes and improvements #102 (comment)

  • Given the hypothetical case that I already HAVE an OSF repository ... how do I add an OSF remote to a local datalad dataset to then just do git annex export HEAD --to NAME_OF_REMOTE ... without the OSF repo creation step? --> it's possible, but not yet convenient (future feature) Docs fixes and improvements #102 (comment)

  • when I update my data locally and then repeat git annex export HEAD --to NAME_OF_REMOTE ... will it overwrite my data, creating new OSF versions? --> unknown as of now: Docs fixes and improvements #102 (comment)

Questions that come up regarding --mode annex

@bpoldrack
Member

what would I have to do to have my data exported in a folder with the name of the dataset (as in datalad create DATASET_NAME) ... as opposed to having all files being loaded directly into the OSF?

For that we need to (re-)introduce a path option for the special remote. We discussed this during the brainhack and ended up not having it. Ultimately, the argument was that you can have a subproject (component) instead of a folder within your project and connect the special remote to that component instead. I do think that's a bit cleaner, but I don't see why we shouldn't allow for more flexibility. There's no "technical" reason. Ping @mih.

the .datalad directory seems to be exported as well, ... how can I suppress this? (I may want to suppress writing the .datalad directory if using BIDS, and not wanting to add a .bidsignore with a single line .datalad)

I don't think that's currently possible (unless you create a dedicated commit with .datalad removed and export that). git annex export only allows specifying a subdirectory to restrict the export to (like: git annex export master:subdir --to myremote).

@sappelhoff
Contributor Author

Thanks @bpoldrack - I updated my OP (and added new questions).

Also stumbled over an issue while trying out using OSF credentials instead of an OSF token:

$ datalad create-sibling-osf sandbox --mode annex
[ERROR  ] 401 Client Error: Unauthorized for url: https://api.osf.io/v2//nodes/ [models.py:raise_for_status:941] (HTTPError) 

I defined OSF_USERNAME and OSF_PASSWORD through:

the password and username are both correct and I can log in with them. Yet I still get an error from datalad-osf as pasted above. Any clues why?

@bpoldrack
Member

Given the hypothetical case that I already HAVE an OSF repository ... how do I add an OSF remote to a local datalad dataset to then just do git annex export HEAD --to NAME_OF_REMOTE ... without the OSF repo creation step?

For now:
git annex initremote NAME_OF_REMOTE type=external externaltype=osf encryption=none autoenable=true project=EXISTING_PROJECT exporttree=yes

  • If you want an annex-type remote instead, just leave out the exporttree option
  • EXISTING_PROJECT takes the URL of the project. However, the only important thing is that it ends with the OSF ID, so you can also just pass the ID of your project (or component)

Eventually datalad create-sibling-osf should be enhanced to be able to do that more conveniently.
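
A minimal sketch of the two variants described above, assuming nothing beyond the parameters quoted in this comment. The helper name `osf_initremote_cmd` is hypothetical and not part of datalad-osf. It only prints the `git annex initremote` command, so it can be inspected before being run by hand.

```shell
# Hypothetical helper (not part of datalad-osf): print, but do not run,
# the initremote command for an already existing OSF project.
osf_initremote_cmd() {
    name=$1            # name git-annex will use for the special remote
    project=$2         # OSF project/component URL, or just its ID
    mode=${3:-export}  # "export" or "annex"

    cmd="git annex initremote $name type=external externaltype=osf encryption=none autoenable=true project=$project"
    # exporttree=yes gives an export-type remote; leaving it out gives an annex-type remote
    if [ "$mode" = "export" ]; then
        cmd="$cmd exporttree=yes"
    fi
    printf '%s\n' "$cmd"
}

osf_initremote_cmd NAME_OF_REMOTE EXISTING_PROJECT export
osf_initremote_cmd NAME_OF_REMOTE EXISTING_PROJECT annex
```

With an export-type remote in place, the `git annex export HEAD --to NAME_OF_REMOTE` step from the question applies unchanged.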

@bpoldrack
Member

when I update my data locally and then repeat git annex export HEAD --to NAME_OF_REMOTE ... will it overwrite my data, creating new OSF versions?

To be honest, I'm not sure whether this generates new versions from the point of view of OSF, but I suspect it does. We need to verify that this is indeed the case.

@bpoldrack
Member

when I pushed my datalad dataset to an OSF project via --mode annex, how can I clone or install my data from that OSF project (say, on a different machine, or after deleting locally)

Not yet. PR #100 will allow for that.
As of now, it's just an annex store. However, thanks to the autoenabling (see the autoenable option to initremote in the post above), any clone of the repo you pushed there will know about this data source and be able to get data from it. So, if you publish/update your dataset on GitHub (or anywhere else), whoever clones from there and has datalad-osf installed will have access (assuming public access to your project or proper credentials, of course).

@bpoldrack
Member

the password and username are both correct and I can log in with them. Yet I still get an error from datalad-osf as pasted above. Any clues why?

There was some trouble on some machines in that regard (including mine). Current master should work better. You could install/update via pip install git+http://github.com/datalad/datalad-osf and report whether that fixes the issue in your case.

@sappelhoff
Contributor Author

Thanks a lot for all of these answers @bpoldrack. I'll finish up this PR later and mark it "ready for review".

@sappelhoff
Contributor Author

You could install/update via pip install git+http://github.com/datalad/datalad-osf and report whether that fixes the issue in your case

I am on the dev version (most recent commit on master), and I still get this issue.

@sappelhoff sappelhoff marked this pull request as ready for review June 24, 2020 13:35
@sappelhoff sappelhoff changed the title from "[WIP] Docs fixes and improvements" to "Docs fixes and improvements" Jun 24, 2020
@sappelhoff
Contributor Author

hey @all-contributors please add @sappelhoff for doc userTesting

@allcontributors
Contributor

@sappelhoff

I've put up a pull request to add @sappelhoff! 🎉

@sappelhoff
Contributor Author

I finished my changes for now, ready to get a review and improve the changes, or just to merge 🙂

I also took the liberty to add myself to the contributors, following the contributing guide

'OSF_TOKEN', or both 'OSF_USERNAME' and 'OSF_PASSWORD'. If neither of these
is defined, the tool will fall back to the datalad credential manager and
inquire for credentials interactively.
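
The lookup described in this passage can be sketched roughly as follows. The helper `osf_auth_source` is hypothetical and only reports which source would apply. Checking the token before the username/password pair is an assumption; the text does not state a precedence between the two.

```shell
# Hypothetical helper: report which credential source the described lookup would use.
# Assumption: OSF_TOKEN is checked before OSF_USERNAME/OSF_PASSWORD.
osf_auth_source() {
    if [ -n "${OSF_TOKEN:-}" ]; then
        echo "token"
    elif [ -n "${OSF_USERNAME:-}" ] && [ -n "${OSF_PASSWORD:-}" ]; then
        # both variables must be set; one alone is not enough
        echo "username+password"
    else
        # neither variant is fully defined: fall back to the credential manager
        echo "credential-manager"
    fi
}

osf_auth_source
```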

Member

FTR: When used with DataLad, it supports queries of DataLad's credential management and makes the definition of environment variables unnecessary.

Member

After #95

Contributor Author

What does FTR mean? Should I replace my sentence with yours?

Note that the ``-s NAME_OF_REMOTE>`` flag is used to specify how ``git`` internally refers to your OSF project with the name `OSF_PROJECT_NAME`.
It would be completely fine to use `OSF_PROJECT_NAME` also as a value for the ``-s`` flag.

You can later on list your remotes from the command line using the ``git remote -v`` command.
Member

ATM this name refers to the special remote, not a Git remote, IIRC.

Member

Still listed by git remote -v. Just w/o any details.

Contributor Author

Is there a command to list special remotes that I should refer to instead?

Should I adjust the text to make it more clear that this is not a normal remote, but a special remote?
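
One rough way to act on the observation above (the special remote is listed by `git remote -v`, just without details) is to filter that output for entries that carry a name but no URL. This is only a sketch: the helper `remotes_without_url` is hypothetical, and it would also match any ordinary Git remote that has no URL configured.

```shell
# Hypothetical filter: keep entries from `git remote -v` output that have
# a name but no URL column, which is how a special remote shows up there.
remotes_without_url() {
    awk 'NF == 1 { print $1 }'
}

# Usage inside a repository (not executed here):
#   git remote -v | remotes_without_url
```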


.. code-block:: bash

export OSF_TOKEN=YOUR_TOKEN_FROM_OSF.IO

We are now going to use datalad to create a sibling dataset on OSF with name `osf` - this will create a new project called `OSF_PROJECT_NAME` on the OSF account associated with the OSF token in `$OSF_TOKEN`.
We are now going to use datalad to create a sibling dataset on OSF with name `OSF_PROJECT_NAME`.
This will create a new project called `OSF_PROJECT_NAME` on the OSF account associated with the OSF token in `$OSF_TOKEN`.
Member

What is the rationale for switching from osf to OSF_PROJECT_NAME in the example? Do you envision the need to have multiple different OSF projects for the same dataset as a common case?

Having something simple and uniform, such as osf makes a lot of sense. Especially in the case of a hierarchy of nested datasets, where one would want to be able to do a datalad push --to osf --recursive.

Contributor Author

Do you envision the need to have multiple different OSF projects for the same dataset as a common case?

no, I don't think so,

But I would prefer to let users know (also within the example) that it's up to their discretion which name they want to use for their remote and the OSF project. --> after all, it doesn't matter whether it's called osf or something else.

But reading this again, it seems like I mixed up some things and it rather should be something like:

We are now going to use DataLad to create a sibling dataset on OSF as a "special remote". Within git-annex, we will refer to the special remote with the name $NAME_OF_REMOTE, while the project that will be created on the OSF account associated with the $OSF_TOKEN will be called $OSF_PROJECT_NAME.
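
To make the distinction between the two names concrete, here is a small sketch that assembles the call from the pieces quoted in this conversation (a positional project name, the `-s` flag, and `--mode`). The helper `create_sibling_osf_cmd` is hypothetical, and the argument layout is inferred from this thread only, so it may not match other versions of the command.

```shell
# Hypothetical helper: print, but do not run, a create-sibling-osf call.
# Assumption: argument layout as quoted in this thread.
create_sibling_osf_cmd() {
    project_name=$1   # name of the project created on OSF
    remote_name=$2    # name git-annex uses for the special remote
    mode=$3           # "annex" or "export"
    printf 'datalad create-sibling-osf %s -s %s --mode %s\n' \
        "$project_name" "$remote_name" "$mode"
}

create_sibling_osf_cmd OSF_PROJECT_NAME NAME_OF_REMOTE annex
```

Using the same short value such as `osf` for the remote name across datasets keeps calls like `datalad push --to osf --recursive` uniform, as suggested earlier in this thread.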

@mih
Member

mih commented Jun 25, 2020

@sappelhoff thx for the PR. I made a bunch of comments. We also still seem to have an issue with testing of PRs from other forks.

@mih
Member

mih commented Jun 26, 2020

FTR: I am implementing git remote support in create-sibling-osf via #100. This requires API changes and we may want to investigate how the docs need to be bent in that light.

@mih
Member

mih commented Jun 26, 2020

#105 brings in a credential helper

@sappelhoff
Contributor Author

Super cool that development does not stop with the Brainhack 👍

re: my PR, I see two options

  1. get it merged ... and then improve docs further step by step as new features get integrated
  2. shut it down and wait with overhauling the docs until the main features have been integrated

given that I already invested some work, I am of course in favor of 1. - but I am interested to hear your opinions!

@mih
Member

mih commented Jun 27, 2020

@sappelhoff Will merge and go for (1)... and hoping that you will have another round of playing with it once #100 is completed (hopefully later today).

@mih mih merged commit a0ca891 into datalad:master Jun 27, 2020
@sappelhoff sappelhoff deleted the docs-fixes branch June 27, 2020 17:28
@sappelhoff
Contributor Author

hoping that you will have another round of playing with it once #100 is completed (hopefully later today).

sure, looking forward to it! Just ping me :-)
