Guide to using DEA Notebooks with git
To write your own analyses and contribute them back to the dea-notebooks repository, follow this guide to get started with git, a version-control tool designed to track changes to files and support collaboration between multiple users on a project. Using git is the recommended workflow for working with dea-notebooks, as it makes it easy to stay up to date with the latest versions of functions and code, and makes it much harder to lose your work.
The guide is currently split into two sections:
Note: Alternatively, the Github website can be used to upload and modify the DEA Notebooks repository directly. This can be a simple way to get started with git (see the Guide to using DEA Notebooks with the Github website here).
On launch, the Sandbox is pre-populated with notebooks from the master branch of the dea-notebooks repository. However, for development and review, you'll need your own copy of the dea-notebooks repository that won't be overwritten by the Sandbox's launch behavior.
Warning: If you do not create a new copy of the dea-notebooks repository, any changes you make to notebooks or code will be lost each time you launch the Sandbox.
To get your own copy of the repository:
- Open a terminal from the JupyterLab Launcher (click the "+" New Launcher button on the top-left of JupyterLab, then click "Other > Terminal")
- Make a new directory to work in by typing:
    mkdir dev
- You should see the dev directory appear in the file structure. Enter the new directory by typing:
    cd dev
- Clone the notebooks repository by typing:
    git clone https://github.com/GeoscienceAustralia/dea-notebooks.git
- Enter the notebooks directory by typing:
    cd dea-notebooks
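The setup steps above can be wrapped in a small helper so that re-running them never clones over an existing copy. This is only a minimal sketch: the function name clone_if_missing is hypothetical (it is not part of dea-notebooks), and it assumes git is available on the PATH.

```shell
# clone_if_missing URL DIR
# Hypothetical helper: clone URL into DIR unless DIR already holds a git repository.
clone_if_missing() {
    url="$1"
    dir="$2"
    if [ -d "$dir/.git" ]; then
        # An existing clone is left untouched
        echo "already cloned: $dir"
    else
        mkdir -p "$(dirname "$dir")"   # same role as "mkdir dev" above
        git clone "$url" "$dir"
    fi
}
```

For example, running clone_if_missing https://github.com/GeoscienceAustralia/dea-notebooks.git dev/dea-notebooks and then cd dev/dea-notebooks should leave you in the same place as the steps above.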
If you just cloned the repository, you should have an up-to-date set of notebooks. However, it is worth using git pull regularly to keep your repository up to date. First check out the develop branch (the default branch for recent updates to the repository), then pull in any recent changes:
git checkout develop
git pull
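To see whether a pull would bring in anything new before running it, you can count the commits your local branch is missing. The following is a minimal sketch, assuming the remote is named origin (the default after git clone); commits_behind is a hypothetical helper name, not an existing git command.

```shell
# commits_behind BRANCH
# Hypothetical helper: print how many commits origin/BRANCH has that the
# local BRANCH does not. Assumes the remote is called "origin".
commits_behind() {
    git fetch --quiet origin                  # refresh remote-tracking branches
    git rev-list --count "$1..origin/$1"      # commits reachable only from the remote
}
```

Run from inside the repository, commits_behind develop prints 0 when the local develop branch already contains everything on the remote.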
- Start a new branch (using develop as the base) by typing the command below (replace branch_to_work_on with a name of your choice):
    git checkout -b branch_to_work_on develop
- See which files you've changed by typing:
    git status
- See the changes to notebooks by clicking the git extension button in the notebook
- Add the files you want to commit by typing git add followed by the filenames you want to add, or --all to add everything
- Commit the files with a message by typing:
    git commit -m "Simple commit message"
- Push your changes by typing git push
  - If this is the first push from this branch, you'll need to type:
      git push --set-upstream origin branch_to_work_on
- Enter your GitHub username and, when prompted for a password, a personal access token to complete the push (GitHub no longer accepts account passwords for git operations over HTTPS). If you do not have a GitHub account, create one here.
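The branch, add, commit and push cycle above can be rehearsed safely before touching the real repository. The sketch below is an illustration under stated assumptions: origin.git is a local bare repository standing in for GitHub, example.ipynb is a made-up file, and the user name and email are placeholders.

```shell
# Rehearse branch -> edit -> add -> commit -> push in a scratch directory.
# Nothing here contacts GitHub: "origin.git" is a local stand-in remote.
set -e
scratch=$(mktemp -d)
cd "$scratch"

git init --quiet --bare origin.git
git clone --quiet origin.git work 2>/dev/null   # hides the "empty repository" warning
cd work
git config user.name "Example User"             # placeholder identity
git config user.email "user@example.com"

# Give the stand-in remote a develop branch to use as a base
git checkout --quiet -b develop
git commit --quiet --allow-empty -m "Initial commit"
git push --quiet --set-upstream origin develop

# The steps from the list above
git checkout --quiet -b branch_to_work_on develop
echo '{}' > example.ipynb
git status --short                  # the untracked file is listed with "??"
git add example.ipynb
git commit --quiet -m "Simple commit message"
git push --quiet --set-upstream origin branch_to_work_on

git log --oneline -1                # shows the commit that was just pushed
```

Because the remote here is a local directory, no username or token prompt appears; against GitHub the final push is where the prompt would occur.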
Troubleshooting: If this is your first time using dea-notebooks, you will probably receive a "remote: Permission to GeoscienceAustralia/dea-notebooks.git denied" error. To resolve this, an existing member of dea-notebooks will need to invite you to the repository (Settings > Collaborators and Teams > Search by username, full name or email address > Add Collaborator).
Troubleshooting: If you get the error message:
remote: Invalid username or password.
fatal: Authentication failed for 'https://github.com/GeoscienceAustralia/dea-notebooks.git/'
follow the guide to creating a personal access token here (only the "repo" tickbox/scope needs to be selected in order to push to dea-notebooks). Once you have generated a token (a string of letters and numbers), save it in a safe location (if you lose it, you can generate a new one). You can now run git push, and when git asks for your username and password in the Sandbox, enter your GitHub username (e.g. BexDunn) and, instead of your GitHub password, your token string (e.g. 3i4htrou3fgffgyy45tysiduhg6779yho87rtiouhihrego7wery):
git push
Username: your_username
Password: your_token
Work can be added to the develop branch using a "pull request". A pull request takes all the changes made on one branch and merges them into another branch (typically develop). Because we don't always want every change on a branch to be merged, we first create a temporary pull request branch containing only the changes we want.
- Avoid merge conflicts later on by getting the latest version of the develop branch:
    git checkout develop
    git pull
- Create a new temporary branch that is an exact copy of develop. This is where the files you want to publish will be placed (change temp_branch_name to a simple name of your choice that describes the changes you are making):
    git checkout -b temp_branch_name develop
- Copy the files (e.g. Scripts/dea_datahandling.py) you want to publish from your main branch (e.g. branch_to_work_on, which we created above) to this new temporary branch:
    git checkout branch_to_work_on -- Scripts/dea_datahandling.py
- The new file (e.g. Scripts/dea_datahandling.py) will be added to the staging area of temp_branch_name. Check this using:
    git status
- Commit the new file using:
    git commit -m 'your_message'
- Push the temporary branch up to GitHub so we can create a pull request:
    git push --set-upstream origin temp_branch_name
- When you've pushed your changes and are ready for feedback, visit the pull requests page on the DEA Notebooks repository:
    https://github.com/GeoscienceAustralia/dea-notebooks/pulls
- The name of the temporary branch you just pushed should appear in yellow. Click the green "Compare and make pull request" button to the right.
- This will take you to the 'Open pull request' page.
- Add a description of your changes, and select people you'd like to review it
To make changes/updates/edits to someone else's pull request, first check out the branch you want to edit (e.g.
pull_request_branch
):git checkout --track origin/pull_request_branch
-
Commit and push any changes you make, which will become part of the open pull request
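If the pull request branch was pushed after you last contacted the remote, your repository will not yet know about origin/pull_request_branch and the checkout will fail. A minimal sketch that fetches first, assuming the remote is named origin; track_remote_branch is a hypothetical helper name.

```shell
# track_remote_branch NAME
# Hypothetical helper: fetch from origin, then create and switch to a local
# branch NAME that tracks origin/NAME.
track_remote_branch() {
    git fetch --quiet origin
    git checkout --quiet --track "origin/$1"
}
```

After track_remote_branch pull_request_branch, a plain git push sends your commits to the same branch the pull request was opened from.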
- You can update your branch by checking it out, then typing git pull
  - You should regularly do this for the develop branch
- To get the latest updates from the develop branch back into your own branch, follow the steps below (this can also be useful to check for conflicts between your branch and develop):
    git checkout develop
    git pull
    git checkout branch_to_work_on
    git merge develop
- When the prompt to give the merge commit a message appears, it will be in the Vi editor. To accept the default message, press Esc on your keyboard, then type :wq and press Enter
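The merge steps can also be rehearsed in a scratch repository. The sketch below assumes nothing about dea-notebooks itself: the file names are made up, the identity is a placeholder, there is no remote (so git pull is omitted), and git merge --no-edit is used to accept the default merge message without opening the Vi editor.

```shell
# Bring new commits from develop into a working branch, in a scratch repository.
set -e
scratch=$(mktemp -d)
cd "$scratch"
git init --quiet
git config user.name "Example User"          # placeholder identity
git config user.email "user@example.com"

git checkout --quiet -b develop
echo "v1" > shared.txt
git add shared.txt
git commit --quiet -m "Initial commit"

# Your own work happens on a separate branch
git checkout --quiet -b branch_to_work_on develop
echo "mine" > mine.txt
git add mine.txt
git commit --quiet -m "My work"

# Meanwhile develop moves on
git checkout --quiet develop
echo "v2" > shared.txt
git commit --quiet -am "Update shared file"

# Merge the update into the working branch, keeping the default message
git checkout --quiet branch_to_work_on
git merge --quiet --no-edit develop

cat shared.txt                      # the working branch now has develop's version
```

If a merge reports conflicts that you are not ready to resolve, git merge --abort returns the branch to its state before the merge.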
The above instructions should work for setting up Git on both the DEA Sandbox and the NCI (via the Virtual Desktop Infrastructure or VDI). However, there are some important things to keep in mind:
- To get started with dea-notebooks using git on the NCI, the first step is to clone this repository to a suitable location. This will most likely be a location you can access on the VDI, so you can easily work with your notebooks. Note that this repo is likely to become quite large, so make sure you have enough space in the location you clone the repository to (i.e. probably not your home directory; a directory on /g/data/ should be perfect).
- If you haven't used Git on the VDI before, you will need to set up SSH keys before you can clone the repository. Follow the instructions here to generate an SSH key pair in your home directory on the VDI. You will need to generate the key, register it with your ssh agent, and then add the newly generated id_rsa.pub public key to your GitHub account.
Note: Some VDI users have encountered an issue with the Firefox browser freezing when attempting to launch JupyterLab. This is associated with having a large Git repository (e.g. DEA Notebooks) in JupyterLab's working directory, which leads to issues with the jupyterlab_git extension. A workaround is to disable this extension before launching JupyterLab: add the jupyter serverextension disable line shown below to the bash script you use to launch JupyterLab, e.g.:
module use /g/data/v10/public/modules/modulefiles
module load dea
jupyter serverextension disable jupyterlab_git
jupyter lab
Updating this wiki: If you notice anything incorrect or out of date in this wiki, please feel free to make an edit!
License: All code in this repository is licensed under the Apache License, Version 2.0. Digital Earth Australia data is licensed under the Creative Commons by Attribution 4.0 license.
Contact: If you need assistance with any of the Jupyter Notebooks or Python code in this repository, please post a question on the Open Data Cube Discord chat or on the GIS Stack Exchange using the open-data-cube tag (you can view previously asked questions here). If you would like to report an issue with any notebook, you can file one on GitHub.