1. Getting started

Caution

This wiki is still incomplete. It will become the official documentation for the new version of the script, to be released in 2024.

In this section, we will learn

How to install Python using the Anaconda or Miniconda distributions.
How to install GrainSizeTools.
How to use Python through Jupyter Notebook or JupyterLab.
How to interact with the script and import data.

Step 1. Install Python for data science

GrainSizeTools requires Python 3, the scientific Python libraries NumPy, SciPy, Pandas and Matplotlib, and JupyterLab. If you have no previous experience with Python, we recommend downloading and installing the Anaconda Python distribution, as it contains all the necessary scientific packages (it needs > 5 GB of disk space). There are versions for Windows, macOS and Linux.

https://docs.anaconda.com/free/anaconda/install/

Tip

If you have limited disk space, there is a minimal distribution called Miniconda that lets you install only the Python packages you need. If you prefer to go this route, click here for detailed instructions.

Once Anaconda is installed, launch the Anaconda Navigator. You will see that it comes with several scientifically oriented Integrated Development Environments (IDEs), including JupyterLab and the Jupyter Notebook. Clicking on either of these opens the corresponding application in your default web browser. The GrainSizeTools documentation assumes that you will be working with Jupyter Notebooks.
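Once the installation has finished, a quick way to confirm that the libraries listed above can be found is to run a short check in a notebook cell or a Python console. The following is a minimal sketch and not part of GrainSizeTools: the helper name `find_missing` is hypothetical, and the import names are assumed to match the package names.

```python
import importlib.util
import sys

# Import names of the libraries mentioned above (assumed to match
# the package names used by the distribution).
REQUIRED = ("numpy", "scipy", "pandas", "matplotlib", "jupyterlab")


def find_missing(packages=REQUIRED):
    """Return the names in `packages` that cannot be found for import."""
    return [name for name in packages if importlib.util.find_spec(name) is None]


print("Python version:", sys.version.split()[0])
missing = find_missing()
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages were found.")
```

If a package is reported as missing, it can be installed from the Anaconda Navigator or with the package manager of your distribution.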

Tip

Using a dedicated application to work with Jupyter notebooks

If you prefer to use a dedicated application instead of opening Jupyter Notebooks in your browser, there are several alternatives. Here we will mention two free alternatives:

JupyterLab desktop: https://github.com/jupyterlab/jupyterlab-desktop/releases

This is a cross-platform desktop application for JupyterLab. It is the same application that opens in the browser but in an encapsulated application. You can find the user guide at the following link https://github.com/jupyterlab/jupyterlab-desktop/blob/master/user-guide.md. If you are a beginner, this is the easy road.

Visual Studio Code (a.k.a. VS Code): https://code.visualstudio.com/

This is a free code editor that can be used with various programming languages, including Python, and that supports Jupyter Notebooks via extensions. As an advantage over vanilla JupyterLab, it has a handy variable browser. You can find more detailed instructions on how to use Jupyter Notebooks in VS Code at the following link https://code.visualstudio.com/docs/datascience/jupyter-notebooks

Note that both applications require Python to be installed on your operating system, i.e. they do not exempt you from installing Python using Anaconda or any other distribution.

Step 2. Download GrainSizeTools

Once Python is installed, the next step is to download GrainSizeTools. Click on the download link below (there is also a direct link on the GrainSizeTools website).

https://github.com/marcoalopez/GrainSizeTools/releases

and download the latest version of the script by clicking on the zip file as shown below

TODO-> figure

Unzip the file and save the GrainSizeTools folder to a location of your choice. The GrainSizeTools folder contains various Python files (.py), a folder named DATA with a CSV file inside, and various Jupyter Notebooks (.ipynb files) that are templates for doing different types of grain size data analysis.
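To get a quick overview of what the unzipped folder contains, you can group its files by extension. This is only an illustrative sketch: the folder location and the helper name `group_by_extension` are assumptions, not part of the script.

```python
from collections import defaultdict
from pathlib import Path


def group_by_extension(folder):
    """Map each file extension to the sorted file names found under `folder`."""
    groups = defaultdict(list)
    for path in Path(folder).rglob("*"):  # also looks inside subfolders such as DATA
        if path.is_file():
            groups[path.suffix.lower()].append(path.name)
    return {ext: sorted(names) for ext, names in groups.items()}


# Hypothetical location; replace it with wherever you saved the folder.
folder = Path("GrainSizeTools")
if folder.is_dir():
    for ext, names in sorted(group_by_extension(folder).items()):
        print(ext or "(no extension)", names)
```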

TODO -> figure

Step 3. Understanding Jupyter Notebooks

To improve the reproducibility of grain size studies, we suggest working with the Jupyter Notebook templates provided with the script, especially if you have no previous experience with the Python language. A Jupyter Notebook is a document that can contain executable code, equations, visualisations and narrative text. This makes it ideal for generating reports and including them as supplementary material in your publications, so that any researcher can reproduce your results.

TODO -> Figure showing a Jupyter Notebook

Figure X. An example of a Jupyter Notebook with code, equations (using LaTeX), visualisations and narrative text.
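Under the hood, a notebook file (.ipynb) is a JSON document whose "cells" list stores the text and code cells in order. The sketch below builds a tiny, made-up notebook in memory and counts its cell types. It is meant only to illustrate the file structure, and the helper name `count_cell_types` is hypothetical.

```python
import json
from collections import Counter

# A minimal, made-up notebook with one text cell and one code cell.
notebook_text = json.dumps({
    "nbformat": 4,
    "nbformat_minor": 5,
    "metadata": {},
    "cells": [
        {
            "cell_type": "markdown",
            "metadata": {},
            "source": ["# Grain size report\n", "Mean grain size: $\\bar{d}$"],
        },
        {
            "cell_type": "code",
            "metadata": {},
            "execution_count": None,
            "outputs": [],
            "source": ["print(1 + 1)"],
        },
    ],
})


def count_cell_types(text):
    """Count how many cells of each type a notebook (given as JSON text) has."""
    notebook = json.loads(text)
    return dict(Counter(cell["cell_type"] for cell in notebook["cells"]))


print(count_cell_types(notebook_text))
```

Because everything lives in a single file, sharing the .ipynb file shares the code, the text and the stored outputs together.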

JupyterLab and VS Code with the Jupyter Notebook extension are the next generation of the classic Jupyter Notebook application interface, providing an easy-to-use environment focused on data science.

Figure X. The JupyterLab interface...TODO. More info at https://jupyterlab.readthedocs.io/en/latest/user/interface.html#the-jupyterlab-interface

Note

Explaining how Jupyter Notebooks work in detail is beyond the scope of this wiki page. Fortunately, there are excellent tutorials available. To familiarize yourself with how Jupyter Notebooks work, we recommend the following tutorials:

https://www.youtube.com/watch?v=HW29067qVWk This is a video by Corey Schafer that explains in a clear, entertaining and concise way how to install and use Jupyter Notebook; the usage part starts at about 4:20. The tutorial uses the classic Jupyter Notebook application instead of JupyterLab, but both work similarly.
TODO

Step 4. Understanding the script structure and the workflow

The script consists of seven Python files, listed below, which must be in the same directory.

GrainSizeTools.py: This file imports all the modules needed for the script to work. This is the only executed Python file in the Jupyter Notebook templates.
averages.py: This module contains a set of functions for calculating different types of averages and margins of error.
plot.py: This module contains a set of functions for generating different types of ad hoc plots used by the script.
stereology.py: This module contains a set of functions to approximate true grain size distributions from sectional measurements using stereological methods.
piezometric_database.py: This file contains the database of piezometers the script uses.
template.py: This file contains the default (Matplotlib) parameters used by the script to generate plots.
get.py: This file contains the welcome message of the script.
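Since the seven files must sit in the same directory, it can be useful to check that none is missing before starting, for example after moving or copying the folder. The following is a minimal sketch under that assumption; the helper name `missing_script_files` is hypothetical and not part of the script.

```python
from pathlib import Path

# The seven files listed above.
SCRIPT_FILES = (
    "GrainSizeTools.py",
    "averages.py",
    "plot.py",
    "stereology.py",
    "piezometric_database.py",
    "template.py",
    "get.py",
)


def missing_script_files(folder):
    """Return the script files that are not present directly inside `folder`."""
    folder = Path(folder)
    return [name for name in SCRIPT_FILES if not (folder / name).is_file()]


# Check the current working directory; an empty list means nothing is missing.
print(missing_script_files("."))
```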

The GrainSizeTools folder also contains a folder called DATA, where you should place the grain size data you want to quantify, and several Jupyter Notebook templates. There are three notebooks, one for each type of study you may want to perform: (i) quantification of grain size distributions (grainsize_pop_template.ipynb), (ii) approximation of grain size distributions using stereological methods (stereology_template.ipynb), and (iii) paleopiezometry (paleopiezometry_template.ipynb). These notebooks contain all the necessary instructions on how to use them through practical examples. You only have to change some parameters to adapt them to your case study and delete everything that is not needed. Once this is done, all the analyses and figures will be contained in a single folder, and all the procedures in a single, fully reproducible document that can be exported to other formats (PDF, HTML, etc.) if needed. If you use this workflow, you should ideally copy the entire GrainSizeTools folder (<<1 MB), i.e. the code, the subfolder structure and the notebooks, for each grain size study you conduct.
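Copying the whole folder for each study can be done by hand in your file manager or with a few lines of Python. The sketch below uses `shutil.copytree` from the standard library; the helper name `start_new_study` and the folder names in the commented call are hypothetical.

```python
import shutil
from pathlib import Path


def start_new_study(template_folder, study_folder):
    """Copy the whole template folder to a new study folder and return its path."""
    template_folder = Path(template_folder)
    study_folder = Path(study_folder)
    # copytree raises FileExistsError if the destination already exists,
    # which protects an earlier study from being overwritten by accident.
    shutil.copytree(template_folder, study_folder)
    return study_folder


# Example call with hypothetical folder names:
# start_new_study("GrainSizeTools", "studies/quartzite_sample_01")
```

Keeping one self-contained copy per study means that the code version, the data and the notebook that produced the results always travel together.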

Important

The GrainSizeTools script is not intended to deal with microscopic images but to quantify and visualize grain size populations and estimate stresses via paleopiezometers. It is, therefore, necessary to measure the grain diameters or the cross-sectional areas of the grains in advance and store them in a text, CSV or Excel file. For this task, we strongly recommend using the ImageJ application or one of its different flavours (see here). ImageJ-type applications are public domain image processing programs that are widely used in scientific research and run on all operating systems. This wiki includes a short tutorial on how to measure the areas of the grain profiles with ImageJ. The combined use of ImageJ and the GrainSizeTools script is intended to ensure that all data processing steps are done through free and open-source programs and scripts that run under any operating system. If you are dealing with EBSD data instead, we encourage you to use the MTEX toolbox for grain reconstruction.
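If your measurements are cross-sectional areas, a common way to express them as diameters is the equivalent circular diameter, d = 2·sqrt(A/π), i.e. the diameter of a circle with the same area as the grain section. The sketch below illustrates this conversion with made-up areas. The function name is hypothetical, and this is not necessarily how the script performs the conversion internally.

```python
import math


def equivalent_circular_diameter(area):
    """Diameter of the circle that has the same area as the grain section."""
    if area < 0:
        raise ValueError("area must be non-negative")
    return 2.0 * math.sqrt(area / math.pi)


# Made-up sectional areas, e.g. in square microns.
areas = [78.5, 314.2, 1256.6]
diameters = [equivalent_circular_diameter(a) for a in areas]
print([round(d, 1) for d in diameters])  # [10.0, 20.0, 40.0]
```

The diameters come out in the linear unit that corresponds to the area unit (microns for square microns).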

Once you have everything installed, open JupyterLab or Jupyter Notebook and, in the file manager, navigate to the folder where the script and the templates are located. Open the template you want to use and follow the instructions it contains.

TODO

Importing data using the Pandas library

We propose importing the data to be analysed with Pandas, the de facto standard Python library for analysing and manipulating tabular data (including CSV, Excel and text files). The library includes several tools for reading files and handling missing data. Once the GrainSizeTools script has been run, Pandas is imported as pd, so its functions become available by typing pd. in a cell. TODO
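As a minimal sketch of what reading tabular data with Pandas looks like, the example below parses a small, made-up CSV table and handles one missing value. The column names and values are hypothetical. Pandas is imported explicitly here so that the example runs on its own; inside the templates this import is not needed once the script has been run.

```python
import io

import pandas as pd

# Made-up measurements standing in for a file exported from an image
# analysis program; the column names are hypothetical.
csv_text = """Grain,Area
1,157.25
2,2059.75
3,
4,560.5
"""

# read_csv accepts a file path or, as here, a file-like object.
dataset = pd.read_csv(io.StringIO(csv_text))

print(dataset.head())        # first rows of the table
print(dataset.isna().sum())  # number of missing values per column

# Drop the rows in which the area is missing.
clean = dataset.dropna(subset=["Area"])
print(len(clean), "valid measurements")
```

With a real file, the first argument of `pd.read_csv` would be the path to the file. Pandas also provides `pd.read_excel` for Excel files and `pd.read_table` for delimited text files.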

TODO

Tip

Although you can use the get_filepath() function to get a file path through a file selection window as follows

>>> filepath = get_filepath()

for reproducibility it is best to specify the file path explicitly in the notebook.
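One way to write the path explicitly is to build it relative to the notebook with `pathlib`, so that it keeps working when the whole study folder is copied or shared. This is only a sketch: the file name below is hypothetical, and DATA refers to the folder shipped with the script.

```python
from pathlib import Path

# Hypothetical file name inside the DATA folder shipped with the script.
filepath = Path("DATA") / "my_grain_data.csv"

print(filepath)           # the relative path that will be stored in the notebook
print(filepath.exists())  # quick check that the file is where you expect it
```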

TODO
