Introduction to Keeping Confidential Information Safe on GitHub: GitHub secrets and .env files
About to share your Jupyter Notebook or Python script on GitHub? Hold up before you press that button!
If your script does not contain confidential information such as keys, tokens, your username, etc., go ahead and share your awesome work! However, if you do indeed have confidential information in your script, there are a few extra steps you should take to keep your information safe.
The 2 main methods I will share below are 1) GitHub Secrets and 2) using .env files.
Note: There may be other ways to do the same thing. These are just what I used when completing my Hack for LA project. Feel free to do your own research online!
How GitHub Secrets works is similar to adding a value to a variable. Once you assign a value to a variable using GitHub Secrets, no one will be able to see what the value of the variable is, even if they have access to the repository. From there, you can access the value in your GitHub Secret variable in your Python script by doing some setup in your .yaml workflow file and importing libraries. See below to get a better idea of how it works and how to set it up.
If the repository you are uploading your files to is not created by you, you may have to get additional permissions to access and edit settings.
- Ensure that the menu option "Settings" is available to you when you are in the repository:
- Click "Settings" and scroll down until you see the option "Secrets and variables" under "Security" in the left menu:
- Click the dropdown option for "Secrets and variables" and click "Actions"
- Click the green button "New repository secret" to add a variable (Name) and value (Secret):
- Type in variable name and secret value, click "Add secret", and voila, our newly created variable should now appear under "Repository Secrets".
FYI, here are the rules GitHub has for secret names (See source for more information)
- Names can only contain alphanumeric characters ([a-z], [A-Z], [0-9]) or underscores (_). Spaces are not allowed.
- Names must not start with the GITHUB_ prefix.
- Names must not start with a number.
- Names are case insensitive.
- Names must be unique at the level they are created at.
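- In the workflow's .yaml file, assign the secret to an environment variable by referencing the secret's name. For example, an env entry of the form API_KEY_GITHUB_PROJECTBOARD_DASHBOARD: ${{ secrets.API_KEY_GITHUB_PROJECTBOARD_DASHBOARD }} makes the secret available to the step that runs your script.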
Note: "GitHub Actions can only read a secret if you explicitly include the secret in a workflow." (Source: GitHub Secrets Documentation for more information)
- Import the os library in your Python script with:
import os
- In your Python script, create a variable and assign your secret value by referencing the name you assigned to the secret in the .yaml file. For example, if my GitHub Secret is named API_KEY_GITHUB_PROJECTBOARD_DASHBOARD, I used the same name in my .yaml file, and the variable I want to hold the secret's value is named GitHub_token, below is how I would create my GitHub_token variable:
GitHub_token = os.environ["API_KEY_GITHUB_PROJECTBOARD_DASHBOARD"]
I can now use the GitHub_token variable throughout my Python script wherever the API key is needed, without the key itself ever appearing in the code.
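One thing to watch for when reading a secret this way: if the secret was never mapped in the workflow file, or the name has a typo, os.environ[...] raises a bare KeyError (or the value may simply be empty). Below is a minimal sketch of a small helper that fails with a clearer message. The function name get_required_env is my own invention for illustration, not something GitHub or Python provides.

```python
import os


def get_required_env(name: str) -> str:
    """Return the value of environment variable `name`, or fail with a clear message.

    Hypothetical helper for illustration only.
    """
    value = os.environ.get(name)
    if not value:
        # An unset or empty variable usually means the secret was not mapped
        # in the workflow's env section, or the name was misspelled.
        raise RuntimeError(
            f"Environment variable {name!r} is not set or is empty. "
            "Check the secret name and the env mapping in the workflow file."
        )
    return value
```

With this helper, the line above could be written as GitHub_token = get_required_env("API_KEY_GITHUB_PROJECTBOARD_DASHBOARD").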
Sometimes, the confidential information that you want to add to GitHub Secrets may not just be a simple string or number. In my case, it was in a semi-structured format: JSON. To add the JSON key to GitHub Secrets, I had to first encode it into a string. This is how I went about it using the base64 library:
- Import the json and base64 libraries using:
import json
import base64
- Open the JSON file using
variable_name = open('nameoffile.json')
- Access the content of the JSON file using
content = json.load(variable_name)
- At this point, the value stored in the content variable is in the form of a dictionary.
- Convert the dictionary into a string using
string = json.dumps(content)
- From here, execute the following to obtain your base64 encoded key:
string_bytes = string.encode('ascii')
base64_bytes = base64.b64encode(string_bytes)
base64_key_string = base64_bytes.decode('ascii')
- Now you can output the value in base64_key_string and put it into GitHub Secrets.
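The encoding steps above can be gathered into a single function. This is only a sketch of the same steps, not a different method. The function name encode_json_file is made up for illustration, and it uses a with block so the file is closed automatically.

```python
import base64
import json


def encode_json_file(path: str) -> str:
    """Read a JSON file and return its content as a base64-encoded string.

    Hypothetical helper that bundles the steps described above.
    """
    # The with block closes the file once the content has been read.
    with open(path, encoding="utf-8") as json_file:
        content = json.load(json_file)

    # json.dumps escapes non-ASCII characters by default, so the resulting
    # string can safely be encoded as ASCII bytes.
    string_bytes = json.dumps(content).encode("ascii")

    # b64encode returns bytes; decode them to get a plain string to paste
    # into GitHub Secrets.
    return base64.b64encode(string_bytes).decode("ascii")
```

For example, print(encode_json_file('nameoffile.json')) would show the string to copy into the secret. Do this locally, and avoid leaving the printed value in a shared notebook output.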
If you need to reverse the base64 encoding to get the original content in the JSON file, you can do the following:
base64_bytes = base64_key_string.encode('ascii')
string_bytes = base64.b64decode(base64_bytes)
string = string_bytes.decode('ascii')
json_content = json.loads(string)
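In a workflow, the base64 string would normally arrive through an environment variable rather than a Python variable. Here is a minimal sketch of the decoding steps wrapped around os.environ. The function name load_json_secret and the environment variable name in the usage note are assumptions for illustration, and the sketch assumes the secret was encoded as shown earlier and that the JSON's top level is an object.

```python
import base64
import json
import os


def load_json_secret(env_name: str) -> dict:
    """Decode a base64-encoded JSON value stored in an environment variable.

    Hypothetical helper; assumes the value was produced by base64-encoding
    the output of json.dumps, as in the encoding steps.
    """
    base64_key_string = os.environ[env_name]

    # b64decode accepts the ASCII string directly and returns bytes.
    string_bytes = base64.b64decode(base64_key_string)

    # Turn the bytes back into text, then parse the text as JSON.
    return json.loads(string_bytes.decode("ascii"))
```

Usage would look like json_content = load_json_secret("MY_JSON_KEY"), where MY_JSON_KEY is whatever name you mapped the secret to in your workflow file.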
- Encoding and Decoding using Base64 (Text tutorial)
- Python Tips and Tricks: Base64 String Encoding and Decoding (Video tutorial - more details)
- Using JSON in your GitHub Actions when authenticating with GCP
- What you NEED to know about Base64 (Context)
Resources:
- Purpose of .env files: A Gentle Introduction to .env Files
.env files are basically text files that contain your confidential information in a format similar to the one below:
See example .env file template I created for people to create their own .env files to run my Jupyter Notebook code:
As you might have gathered by now, .env files are a way to keep your sensitive information separate from your code. A .env file is stored locally on your computer, and placing it in the same location as your Jupyter notebook (.ipynb file) lets the notebook run smoothly. This way, you can freely share your code with others without worrying about your confidential information being compromised. If you give people a template, they can create their own .env file to run the notebook you just shared with them.
Now that you have a general idea of what .env files are for and what they look like, let's get on with the steps to create such a file.
There are 2 methods, one using "Notepad" and the other using "Visual Studio Code".
Note: Since I only have a Windows computer, the steps below apply to Windows PCs and laptops. The general idea should be the same on a Mac, though.
- Click (Start button) and search for "Notepad"
Screenshot using Windows 11
- Click on the "Notepad" app to open it and create your keys and values. Something interesting to know here is that string values do not need quotation marks to indicate they are strings. Just type in the value as is.
Example:
animal = squirrel
key = lkmlmwefoeib234
- To save the text file as a .env file:
- Select File > Save As
- In the pop-up window, name your file ".env" and select file type as "Save as type: All files".
- Click "Save". If you check your file in the folder, you should see it is an env file now.
- Same as the Notepad method, click the "Start" button and search for "Visual Studio Code".
- Open up Visual Studio Code, go to File > Open Folder, and select the folder with the local copy of the repository you would like to add the .env file to. Alternatively, you can clone the repository in Visual Studio Code.
- Now, go to File > New Text File.
- Add in your information as described above for Notepad.
Example:
animal = squirrel
key = lkmlmwefoeib234
- Select File > Save As
- Name file as ".env" and "Save as type: All files"
- Click save and you should see the following file in your folder in the left panel:
You have successfully created a .env file.
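If you are not on Windows, or simply prefer code to clicking through menus, the same file can be created from Python. This is a minimal sketch, not one of the two methods I used for my project. The function name write_env_file is made up for illustration, and it refuses to overwrite an existing .env file so you cannot wipe out your keys by accident.

```python
from pathlib import Path


def write_env_file(folder: str, values: dict) -> Path:
    """Create a .env file in `folder` with one "key = value" line per entry.

    Hypothetical helper for illustration only.
    """
    env_path = Path(folder) / ".env"
    if env_path.exists():
        # Never silently replace a file that may already hold real keys.
        raise FileExistsError(f"{env_path} already exists; not overwriting it.")

    # Same format as the Notepad example: no quotation marks needed.
    lines = [f"{key} = {value}" for key, value in values.items()]
    env_path.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return env_path
```

For example, write_env_file(".", {"animal": "squirrel", "key": "lkmlmwefoeib234"}) creates the same file as the steps above in the current folder. However you create it, listing .env in the repository's .gitignore file keeps Git from committing it along with your code.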
If your .env file is not already saved in the folder from which you open your .ipynb file in Jupyter Notebook, please move it there. Once that is done, your notebook should be able to access the information in your .env file and run. This is similar to how you need to save dataset files in the same directory to open them with code in your notebook.
To access the information in your .env file in your Jupyter notebook Python script, you have to:
- Import the dotenv and os libraries and run load_dotenv():
import os
from dotenv import load_dotenv
load_dotenv()
- Create your variable and reference the relevant key in the .env file to assign the value to the variable.
Example:
secret_animal = os.getenv("animal")
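To get a feel for what happens when load_dotenv() runs, here is a simplified sketch using only the standard library. It is not how the dotenv library is actually implemented, and the real library handles many more cases. The function name load_simple_env is made up for illustration. The sketch only covers plain key = value lines like the ones in the example file, which also shows why string values do not need quotation marks.

```python
import os


def load_simple_env(path: str = ".env") -> dict:
    """Read simple "key = value" lines and copy them into os.environ.

    Simplified illustration only; not the dotenv library's implementation.
    """
    loaded = {}
    with open(path, encoding="utf-8") as env_file:
        for raw_line in env_file:
            line = raw_line.strip()
            # Skip blank lines, comments, and lines without an equals sign.
            if not line or line.startswith("#") or "=" not in line:
                continue

            # Split on the first "=" only, so values may contain "=".
            key, value = line.split("=", 1)
            key = key.strip()
            value = value.strip()

            # Remove one pair of matching quotes if the value has them.
            if len(value) >= 2 and value[0] == value[-1] and value[0] in "'\"":
                value = value[1:-1]

            loaded[key] = value
            # setdefault keeps any variable that is already set in the
            # environment instead of overriding it.
            os.environ.setdefault(key, value)
    return loaded
```

After calling load_simple_env(), os.getenv("animal") would return the value from the file, just as in the example above. Note that os.getenv returns None when the key is missing, so it is worth checking the result before using it.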
In this tutorial, you have learned how to create and use GitHub Secrets and .env files to keep sensitive information safe while still sharing your Python script in a GitHub repository, whether for automated workflows or with co-workers. Keys in JSON format can be converted into a base64-encoded string, stored in GitHub Secrets or a .env file, and then decoded back into the original JSON in your Python script.