[Bug]: Information missing from all_stats file for some grains #1028

Marina1595 · 2024-11-27T16:10:24Z

Checklist

Re-run analysis with topostats process --core 1.
Describe the bug.
Include the configuration file.
Copy of the output.
The exact command that failed. This is what you typed at the command line, including any options.
TopoStats version, this is reported by topostats --version
Operating System and Python Version

Describe the bug

While processing the image files, some grains were masked but not traced, so their data is missing from the all_statistics file. For instance, image 002 contains two grains, but the all_stats file has data for only one of them. Similarly, image 003 contains three grains, but only two appear in the all_stats file. The issue is not consistent across images: in some cases every grain in an image was masked, traced, and included in the all_stats file. For example, image 011 has two grains, both of which were masked, traced, and recorded in the all_stats file.

The error shown in conda is:

Disordered tracing of grain 0 failed. Consider raising an issue on GitHub. Error:
Traceback (most recent call last): line 326, in _delete_pixel_subit1
    self.p7, self.p8, self.p9, self.p6, self.p2, self.p5, self.p4, self.p3 = self.get_local_pixels_binary(
ValueError: not enough values to unpack (expected 8, got 5)
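
One way an unpacking error of this shape can arise is sketched below. This is a guess at the mechanism, not a diagnosis of the TopoStats code: `local_neighbours` is a hypothetical helper that assumes the eight neighbours are obtained by slicing a 3x3 window around a pixel, flattening it, and dropping the centre. Under that assumption, a pixel on the last row of the array yields a truncated 2x3 window and therefore only five neighbours.

```python
import numpy as np


def local_neighbours(binary_map: np.ndarray, row: int, col: int) -> np.ndarray:
    """Hypothetical helper (not the TopoStats implementation).

    Flattens the 3x3 window centred on (row, col) and removes index 4,
    which is the centre pixel when the window is complete.
    """
    window = binary_map[row - 1 : row + 2, col - 1 : col + 2].flatten()
    return np.delete(window, 4)


mask = np.ones((5, 5), dtype=int)

# Interior pixel: full 3x3 window, so eight neighbours are returned.
interior = local_neighbours(mask, 2, 2)

# Pixel on the last row: the slice is clipped to 2x3, so only five values remain.
edge = local_neighbours(mask, 4, 2)

try:
    p1, p2, p3, p4, p5, p6, p7, p8 = edge
    message = ""
except ValueError as err:
    message = str(err)

print(len(interior), len(edge), message)
```

If this is the cause, it would suggest the skeletonised region reaches the border of the padded grain crop, but that would need checking against the sample file.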

Config file generated 2024.docx

Copy of the output

Printscreen of all_stats file with image 2 and 3

Printscreen of all_stats file with image 11

Include the configuration file

Config file generated 2024.docx

# Config file generated 2024-11-24 11:34:54
# # For more information on configuration and how to use it:
# https://afm-spm.github.io/TopoStats/main/configuration.html
base_dir: C:\Users\marin\TopoStats\Training_material\test_dpi # Directory in which to search for data files
output_dir: C:\Users\marin\TopoStats\Training_material\test_dpi # Directory to output results to
log_level: info # Verbosity of output. Options: warning, error, info, debug
cores: 2 # Number of CPU cores to utilise for processing multiple files simultaneously.
file_ext: .spm # File extension of the data files.
loading:
  channel: Height # Channel to pull data from in the data files.
filter:
  run: true # Options : true, false
  row_alignment_quantile: 0.5 # lower values may improve flattening of larger features
  threshold_method: std_dev # Options : otsu, std_dev, absolute
  otsu_threshold_multiplier: 1.0
  threshold_std_dev:
    below: 10.0 # Threshold for data below the image background
    above: 1.0 # Threshold for data above the image background
  threshold_absolute:
    below: -1.0 # Threshold for data below the image background
    above: 1.0 # Threshold for data above the image background
  gaussian_size: 1.0121397464510862 # Gaussian blur intensity in px
  gaussian_mode: nearest # Mode for Gaussian blurring. Options : nearest, reflect, constant, mirror, wrap
  # Scar removal parameters. Be careful with editing these as making the algorithm too sensitive may
  # result in ruining legitimate data.
  remove_scars:
    run: false
    removal_iterations: 2 # Number of times to run scar removal.
    threshold_low: 0.250 # lower values make scar removal more sensitive
    threshold_high: 0.666 # lower values make scar removal more sensitive
    max_scar_width: 4 # Maximum thickness of scars in pixels.
    min_scar_length: 16 # Minimum length of scars in pixels.
grains:
  run: true # Options : true, false
  # Thresholding by height
  threshold_method: std_dev # Options : std_dev, otsu, absolute, unet
  otsu_threshold_multiplier: 1.0
  threshold_std_dev:
    below: 10.0 # Threshold for grains below the image background
    above: 1.0 # Threshold for grains above the image background
  threshold_absolute:
    below: -1.0 # Threshold for grains below the image background
    above: 0.8 # Threshold for grains above the image background
  direction: above # Options: above, below, both (defines whether to look for grains above or below thresholds or both)
  # Thresholding by area
  smallest_grain_size_nm2: 200 # Size in nm^2 of tiny grains/blobs (noise) to remove, must be > 0.0
  absolute_area_threshold:
    above: [1300, 10500] # above surface [Low, High] in nm^2 (also takes null)
    below: [null, null] # below surface [Low, High] in nm^2 (also takes null)
  remove_edge_intersecting_grains: true # Whether or not to remove grains that touch the image border
  unet_config:
    model_path: null # Path to a trained U-Net model
    grain_crop_padding: 2 # Padding to apply to the grain crop bounding box
    upper_norm_bound: 5.0 # Upper bound for normalisation of input data. This should be slightly higher than the maximum desired / expected height of grains.
    lower_norm_bound: -1.0 # Lower bound for normalisation of input data. This should be slightly lower than the minimum desired / expected height of the background.
  vetting:
    class_conversion_size_thresholds: null # Class conversion size thresholds, list of tuples of 3 integers and 2 integers, ie list[tuple[tuple[int, int, int], tuple[int, int]]] eg [[[1, 2, 3], [5, 10]]] for each region of class 1 to convert to 2 if smaller than 5 nm^2 and to class 3 if larger than 10 nm^2.
    class_region_number_thresholds: null # Class region number thresholds, list of lists, ie [[class, low, high],] eg [[1, 2, 4], [2, 1, 1]] for class 1 to have 2-4 regions and class 2 to have 1 region. Can use None to not set an upper/lower bound.
    class_size_thresholds: null # Class size thresholds (nm^2), list of tuples of 3 integers, ie [[class, low, high],] eg [[1, 100, 1000], [2, 1000, None]] for class 1 to have 100-1000 nm^2 and class 2 to have 1000-any nm^2. Can use None to not set an upper/lower bound.
    nearby_conversion_classes_to_convert: null # Class conversion for nearby regions, list of tuples of two-integer tuples, eg [[[1, 2], [3, 4]]] to convert class 1 to 2 and 3 to 4 for small touching regions
    class_touching_threshold: 5 # Number of dilation steps to use for detecting touching regions
    keep_largest_labelled_regions_classes: null # Classes to keep the only largest regions for, list of integers eg [1, 2] to keep only the largest regions of class 1 and 2
    class_connection_point_thresholds: null # Class connection point thresholds, [[[class_1, class_2], [min, max]]] eg [[[1, 2], [1, 1]]] for class 1 to have 1 connection point with class 2
grainstats:
  run: true # Options : true, false
  edge_detection_method: binary_erosion # Options: canny, binary_erosion. Do not change this unless you are sure of what this will do.
  cropped_size: -1 # Length (in nm) of square cropped images (can take -1 for grain-sized box)
  extract_height_profile: true # Extract height profiles along maximum feret of molecules
disordered_tracing:
  run: true # Options : true, false
  min_skeleton_size: 10 # Minimum number of pixels in a skeleton for it to be retained.
  pad_width: 5 # Pixels to pad grains by when tracing
  mask_smoothing_params:
    gaussian_sigma: 4 # Gaussian smoothing parameter 'sigma' in pixels.
    dilation_iterations: 4 # Number of dilation iterations to use for grain smoothing.
    holearea_min_max: [0, null] # Range (min, max) of a hole area in nm to refill in the smoothed masks.
  skeletonisation_params:
    method: topostats # Options : zhang | lee | thin | topostats
    height_bias: 0.6 # Percentage of lowest pixels to remove each skeletonisation iteration. 1 equates to zhang.
  pruning_params:
    method: topostats # Method to clean branches of the skeleton. Options : topostats
    max_length: 10.0 # Maximum length in nm to remove a branch containing an endpoint.
    height_threshold: # The height to remove branches below.
    method_values: mid # The method to obtain a branch's height for pruning. Options : min | median | mid.
    method_outlier: mean_abs # The method to prune branches based on height. Options : abs | mean_abs | iqr.
nodestats:
  run: true # Options : true, false
  node_joining_length: 7.0 # The distance in nanometres over which to join nearby crossing points.
  node_extend_dist: 14.0 # The distance in nanometres over which to join nearby odd-branched nodes.
  branch_pairing_length: 20.0 # The length in nanometres from the crossing point to pair and trace, obtaining FWHM's.
  pair_odd_branches: false # Whether to try and pair odd-branched nodes. Options: true and false.
  pad_width: 5 # Pixels to pad grains by when tracing (should be the same as disordered_tracing).
ordered_tracing:
  run: true
  ordering_method: nodestats # The method of ordering the disordered traces.
  pad_width: 5 # Pixels to pad grains by when tracing (should be the same as disordered_tracing).
splining:
  run: true # Options : true, false
  method: "rolling_window" # Options : "spline", "rolling_window"
  rolling_window_size: 20.0e-9 # Size of the rolling window in metres (20.0e-9 m = 20 nm).
  spline_step_size: 7.0e-9 # The sampling rate of the spline in metres.
  spline_linear_smoothing: 5.0 # The amount of smoothing to apply to linear features.
  spline_circular_smoothing: 5.0 # The amount of smoothing to apply to circular features.
  spline_degree: 3 # The polynomial degree of the spline.
curvature:
  run: true # Options : true, false
  colourmap_normalisation_bounds: [-0.5, 0.5] # Radians per nm to normalise the colourmap to.
plotting:
  run: true # Options : true, false
  style: topostats.mplstyle # Options : topostats.mplstyle or path to a matplotlibrc params file
  savefig_format: null # Options : null, png, svg or pdf. tif is also available although no metadata will be saved. (defaults to png) See https://matplotlib.org/stable/api/_as_gen/matplotlib.pyplot.savefig.html
  savefig_dpi: 600 # Options : null (defaults to the value in topostats/plotting_dictionary.yaml), see https://afm-spm.github.io/TopoStats/main/configuration.html#further-customisation and https://matplotlib.org/stable/api/_as_gen/matplotlib.pyplot.savefig.html
  pixel_interpolation: null # Options : https://matplotlib.org/stable/gallery/images_contours_and_fields/interpolation_methods.html
  image_set: all # Options : all, core
  zrange: [-2, 3] # low and high height range for core images (can take [null, null]). low <= high
  colorbar: true # Options : true, false
  axes: true # Options : true, false (due to off being a bool when parsed)
  num_ticks: [null, null] # Number of ticks to have along the x and y axes. Options : null (auto) or integer > 1
  cmap: null # Colormap/colourmap to use (default is 'nanoscope' which is used if null, other options are 'afmhot', 'viridis' etc.)
  mask_cmap: blue_purple_green # Options : blu, jet_r and any in matplotlib
  histogram_log_axis: false # Options : true, false
summary_stats:
  run: true # Whether to make summary plots for output data
  config: null

To Reproduce

Reproduce using file and the above config file.
Force a failure for certain grains during the DNA tracing steps, e.g. if grain number == 5: assert False
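
The fault-injection idea above can be sketched in isolation as follows. Everything here is hypothetical: `trace_grain` and `trace_all` are stand-ins written for illustration and are not TopoStats functions. The sketch raises an explicit exception rather than using `assert False`, because assertions are skipped when Python runs with `-O`.

```python
def trace_grain(grain_number: int, fail_on: int = 5) -> dict:
    """Hypothetical stand-in for a tracing step that fails for one chosen grain."""
    if grain_number == fail_on:
        raise ValueError(f"forced failure for grain {grain_number}")
    # Dummy statistic so that successful grains return something to record.
    return {"grain_number": grain_number, "contour_length": 100.0 + grain_number}


def trace_all(grain_numbers):
    """Trace every grain, collecting results and the numbers of grains that failed."""
    results, failed = [], []
    for grain_number in grain_numbers:
        try:
            results.append(trace_grain(grain_number))
        except ValueError:
            failed.append(grain_number)
    return results, failed


results, failed = trace_all(range(8))
print(f"traced {len(results)} grains, failed: {failed}")
```

Comparing the list of failed grains with the rows present in the all_stats file shows whether a tracing failure is what removes a grain from the output.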

TopoStats Version

Git main branch

Python Version

3.10

Operating System

Windows

Python Packages

absl-py==2.1.0
AFMReader==0.0.1
anyio==4.6.0
argon2-cffi==23.1.0
argon2-cffi-bindings==21.2.0
arrow==1.3.0
art==6.3
asttokens==2.4.1
astunparse==1.6.3
async-lru==2.0.4
attrs==24.2.0
babel==2.16.0
beautifulsoup4==4.12.3
biopython==1.84
bleach==6.1.0
Bottleneck @ file:///C:/b/abs_816hr2khp1/croot/bottleneck_1731058648110/work
certifi==2024.8.30
cffi==1.17.1
charset-normalizer==3.3.2
cheap_repr==0.5.2
colorama==0.4.6
comm==0.2.2
contourpy==1.3.0
cycler==0.12.1
debugpy==1.8.6
decorator==5.1.1
defusedxml==0.7.1
et_xmlfile==2.0.0
exceptiongroup==1.2.2
executing==2.1.0
fastjsonschema==2.20.0
flatbuffers==24.3.25
fonttools==4.54.1
fqdn==1.5.1
gast==0.6.0
google-pasta==0.2.0
grpcio==1.66.2
h11==0.14.0
h5py==3.12.1
httpcore==1.0.6
httpx==0.27.2
idna==3.10
igor2==0.5.8
imageio==2.35.1
ipykernel==6.29.5
ipython==8.28.0
ipython-genutils==0.2.0
ipywidgets==8.1.5
isoduration==20.11.0
jedi==0.19.1
Jinja2==3.1.4
joblib==1.4.2
json5==0.9.25
jsonpointer==3.0.0
jsonschema==4.23.0
jsonschema-specifications==2023.12.1
jupyter-events==0.10.0
jupyter-highlight-selected-word==0.2.0
jupyter-lsp==2.2.5
jupyter_client==8.6.3
jupyter_contrib_core==0.4.2
jupyter_contrib_nbextensions==0.7.0
jupyter_core==5.7.2
jupyter_nbextensions_configurator==0.6.4
jupyter_server==2.14.2
jupyter_server_terminals==0.5.3
jupyterlab==4.2.5
jupyterlab_pygments==0.3.0
jupyterlab_server==2.27.3
jupyterlab_widgets==3.0.13
jupyterthemes==0.20.0
keras==3.6.0
kiwisolver==1.4.7
lazy_loader==0.4
lesscpy==0.15.1
libclang==18.1.1
llvmlite==0.43.0
loguru==0.7.2
lxml==5.3.0
Markdown==3.7
markdown-it-py==3.0.0
MarkupSafe==2.1.5
matplotlib==3.9.2
matplotlib-inline==0.1.7
mdurl==0.1.2
mistune==3.0.2
mkl-service==2.4.0
mkl_fft @ file:///C:/Users/dev-admin/mkl/mkl_fft_1730823082242/work
mkl_random @ file:///C:/Users/dev-admin/mkl/mkl_random_1730822522280/work
ml-dtypes==0.4.1
namex==0.0.8
nbclient==0.10.0
nbconvert==7.16.4
nbformat==5.10.4
nest-asyncio==1.6.0
networkx==3.3
notebook==7.2.2
notebook_shim==0.2.4
numba==0.60.0
numexpr @ file:///C:/b/abs_05o8p7bfml/croot/numexpr_1730215959182/work
numpy @ file:///C:/b/abs_c1ywpu18ar/croot/numpy_and_numpy_base_1708638681471/work/dist/numpy-1.26.4-cp310-cp310-win_amd64.whl#sha256=ebb5aa2b36d8afa5ec3231c19e5a1fc75b6d85e7db483f0fb9e77dad58469977
numpyencoder==0.3.0
openpyxl==3.1.5
opt_einsum==3.4.0
optree==0.13.0
overrides==7.7.0
packaging==24.1
pandas @ file:///C:/b/abs_9aotnvvz16/croot/pandas_1718308978393/work/dist/pandas-2.2.2-cp310-cp310-win_amd64.whl#sha256=2770820b1c01b08888f232dfafd5c214ffee1494d66958a979d587d1ec549abe
pandocfilters==1.5.1
parso==0.8.4
pillow==10.4.0
platformdirs==4.3.6
ply==3.11
prometheus_client==0.21.0
prompt_toolkit==3.0.48
protobuf==4.25.5
psutil==5.9.8
pure_eval==0.2.3
pycparser==2.22
pyfiglet==1.0.2
Pygments==2.18.0
pyparsing==3.1.4
pyspm==0.6.2
python-dateutil @ file:///C:/b/abs_3au_koqnbs/croot/python-dateutil_1716495777160/work
python-json-logger==2.0.7
pytz @ file:///C:/b/abs_6ap4tsz1ox/croot/pytz_1713974360290/work
pywin32==306
pywinpty==2.0.13
PyYAML==6.0.2
pyzmq==26.2.0
referencing==0.35.1
requests==2.32.3
rfc3339-validator==0.1.4
rfc3986-validator==0.1.1
rich==13.9.2
rpds-py==0.20.0
ruamel.yaml==0.18.6
ruamel.yaml.clib==0.2.8
schema==0.7.7
scikit-image==0.24.0
scikit-learn==1.5.2
scipy==1.14.1
seaborn==0.13.2
Send2Trash==1.8.3
six @ file:///tmp/build/80754af9/six_1644875935023/work
skan==0.11.1
sniffio==1.3.1
snoop==0.4.3
soupsieve==2.6
stack-data==0.6.3
tensorboard==2.17.1
tensorboard-data-server==0.7.2
tensorflow==2.17.0
tensorflow-intel==2.17.0
tensorflow-io-gcs-filesystem==0.31.0
termcolor==2.4.0
terminado==0.18.1
threadpoolctl==3.5.0
tifffile==2024.9.20
tinycss2==1.3.0
tomli==2.0.2
toolz==1.0.0
topoly==1.0.2
-e git+https://github.com/AFM-SPM/TopoStats.git@42a764ee7dde7440309bef5b146a2876f1e53457#egg=topostats
tornado==6.4.1
tqdm==4.66.5
traitlets==5.14.3
types-python-dateutil==2.9.0.20241003
typing_extensions==4.12.2
tzdata @ file:///croot/python-tzdata_1690578112552/work
uri-template==1.3.0
urllib3==2.2.3
wcwidth==0.2.13
webcolors==24.8.0
webencodings==0.5.1
websocket-client==1.8.0
Werkzeug==3.0.4
widgetsnbextension==4.0.13
win32-setctime==1.1.0
wrapt==1.16.0


ns-rse · 2024-12-11T10:43:19Z

This is likely caused by the various pd.merge() calls, which use the default inner join, so a grain that is missing from one of the data frames is dropped from the merged result.

We should switch this to how = "outer".
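
A minimal sketch of the difference between the two join types, using made-up data. The column names (`image`, `grain_number`, `area`, `contour_length`) and values are assumptions chosen for illustration and are not taken from the TopoStats code base.

```python
import pandas as pd

# Grain statistics exist for both grains in the image.
grainstats = pd.DataFrame(
    {
        "image": ["002", "002"],
        "grain_number": [0, 1],
        "area": [1500.0, 1800.0],
    }
)

# Tracing failed for grain 0, so only grain 1 has tracing statistics.
tracing = pd.DataFrame(
    {
        "image": ["002"],
        "grain_number": [1],
        "contour_length": [95.0],
    }
)

keys = ["image", "grain_number"]

# Default how="inner": grain 0 is dropped because it has no tracing row.
inner = grainstats.merge(tracing, on=keys)

# how="outer": grain 0 is kept and its tracing columns are filled with NaN.
outer = grainstats.merge(tracing, on=keys, how="outer")

print(inner)
print(outer)
```

With the outer join, the untraced grain keeps its grain statistics and has NaN in the tracing columns, instead of vanishing from the merged table.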

@MaxGamill-Sheffield to chase up @Marina1595 for a sample file and check that such changes solve the problem.

Marina1595 added the bug label Nov 27, 2024

ns-rse added the v2.3.0 label Dec 11, 2024

ns-rse added this to the v2.3.0 milestone Dec 11, 2024

ns-rse assigned MaxGamill-Sheffield Dec 11, 2024

MaxGamill-Sheffield mentioned this issue Dec 11, 2024

Returns grainstats rows when tracing for a grain fails #1047

Merged

MaxGamill-Sheffield closed this as completed in #1047 Dec 11, 2024
