
Best Practice


In QA's Best Practice, we describe how to use QA when annotating digital pathology data. This covers slide preparation, annotation tips, training tips, and more.


1. What magnification of images should be uploaded?

Theoretically, users can upload image slides at any magnification. However, to get the best segmentation results, we recommend dividing WSIs of different primitives into different magnifications. 40X is recommended for small-scale structures such as cell nuclei, lymphocytes, and glomeruli. 10X is recommended for medium-scale structures such as tubules. 5X is recommended for large-scale structures such as epithelium and stroma. In general, the larger the structure, the smaller the magnification. Improper magnification will confuse the classifier.

Different pyramid levels of a WSI have different magnifications, depending on the base magnification at which the image was scanned and how the scanner stores the pyramid.

Please check the OpenSlide Python documentation for more details: https://openslide.org/api/python/#
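The snippet below is a minimal sketch of how one might inspect a slide's pyramid levels and read a patch at a chosen magnification with OpenSlide Python. The file name, the fallback handling, and the 10X target (a 4x downsample of a 40X scan) are illustrative assumptions, not part of QA itself.

```python
# Sketch: inspect a WSI's pyramid and read a patch at a chosen level with OpenSlide.
import openslide

slide = openslide.OpenSlide("example_slide.svs")   # illustrative file name

# Base (level-0) objective power, e.g. "40" for a 40X scan; may be absent for some formats.
base_mag = float(slide.properties.get(openslide.PROPERTY_NAME_OBJECTIVE_POWER, "nan"))
print("level dimensions:", slide.level_dimensions)
print("level downsamples:", slide.level_downsamples)

# Effective magnification of each pyramid level = base magnification / downsample factor.
for level, ds in enumerate(slide.level_downsamples):
    print(f"level {level}: ~{base_mag / ds:.1f}X")

# Read a 256 x 256 patch at the level closest to a 4x downsample (~10X for a 40X scan).
level = slide.get_best_level_for_downsample(4.0)
patch = slide.read_region((0, 0), level, (256, 256)).convert("RGB")
patch.save("patch_10x.png")
```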

2. When should a user click the train button?

The classifier needs 2 annotated patches (1 training patch + 1 testing patch) to start. We recommend annotating 4 patches before training model 1. The user can then train the nth model after accumulating 2^(n+1) patches in the dataset. We do not recommend training a new model after every single annotated ROI, because when the dataset already contains many annotations, a model retrained with only one more ROI converges to nearly the same result. That is why we designed our selection rectangle around powers of 2, which reflects the logarithmic growth in the number of training rounds relative to the amount of annotation data. A small sketch of this schedule follows the note below.

Note:

  • We assume each patch is of size 256 x 256. Annotating 1 patch of 512 x 512 is equivalent to annotating 4 patches of 256 x 256.
  • When the digital slides are of excellent quality, the 2 in 2^(n+1) can be changed to a larger integer. In this case, the user can expect the classifier's behavior to converge faster.
  • The future version of QA will automate this process.
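For concreteness, here is a small sketch of the retraining schedule described above, counting a 512 x 512 patch as 4 patch-equivalents of 256 x 256. The example patch list is made up for illustration.

```python
# Sketch of the retraining schedule: model n is trained once the dataset contains
# 2**(n+1) patch-equivalents, where one 256 x 256 patch counts as 1 unit.
def patch_equivalents(patch_sizes, base=256):
    return sum((w // base) * (h // base) for w, h in patch_sizes)

def latest_trainable_model(num_units):
    n = 0
    while num_units >= 2 ** (n + 2):   # model n+1 requires 2**(n+2) patch-equivalents
        n += 1
    return n                            # 0 means not enough patches yet for model 1

annotated = [(256, 256)] * 3 + [(512, 512)]     # illustrative: 3 + 4 = 7 patch-equivalents
units = patch_equivalents(annotated)
print(units, "patch-equivalents -> latest trainable model:", latest_trainable_model(units))
```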

3. How should the user pick which regions to annotate?

QA provides an embedding plot to help select patches for annotation that are dispersed in the model's embedding space. Selecting such patches helps the classifier perform more robustly.

Another way to choose regions is to annotate regions where the predictions contain false positives or false negatives.
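As one illustration of what "dispersed in the model space" can mean, the sketch below uses greedy farthest-point sampling over patch embeddings to pick patches that are spread out from one another. This is an illustrative selection strategy, not necessarily how QA's embedding plot is implemented.

```python
# Sketch: greedy farthest-point sampling to pick k patches spread out in embedding space.
import numpy as np

def pick_dispersed(embeddings, k, seed=0):
    rng = np.random.default_rng(seed)
    chosen = [int(rng.integers(embeddings.shape[0]))]          # start from a random patch
    dists = np.linalg.norm(embeddings - embeddings[chosen[0]], axis=1)
    for _ in range(k - 1):
        nxt = int(np.argmax(dists))                            # farthest from the chosen set
        chosen.append(nxt)
        dists = np.minimum(dists, np.linalg.norm(embeddings - embeddings[nxt], axis=1))
    return chosen

# Example: 2-D embedding coordinates for 1000 patches, pick 8 to annotate next.
coords = np.random.rand(1000, 2)
print(pick_dispersed(coords, 8))
```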

4. How to set numepochs in the training process?

QA employs a U-Net model. The user can set numepochs in the config.ini file. QA also provides the option num_epochs_earlystop, the early-stop epoch count, so that the model stops training once the early-stop criterion is reached (e.g., the model has already converged).
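A minimal sketch of reading these options with Python's configparser is shown below. The section name "train_tl" is an assumption for illustration only; check your own config.ini for the section that actually holds numepochs and num_epochs_earlystop.

```python
# Sketch: reading the epoch settings from config.ini with configparser.
import configparser

config = configparser.ConfigParser()
config.read("config.ini")

section = "train_tl"                                   # assumed section name, adjust as needed
num_epochs = config.getint(section, "numepochs", fallback=1000)
early_stop = config.getint(section, "num_epochs_earlystop", fallback=100)

print(f"train for up to {num_epochs} epochs, "
      f"stop early after {early_stop} epochs without improvement")
```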

5. When should the user not use QA?

Quick Annotator gains its efficiency by reducing the interaction time per object. As such, in cases where very few structures of interest are present, such as delineating a handful of tumor regions on whole slide images, there is likely minimal value in employing QA. The efficiency improvement afforded by QA would not compensate for the time the user spends setting up the tool, importing the data, training, and applying the models.

(Figure: a user annotating large epithelium regions; there is minimal value in employing QA here.)

In cases like these, the slides could likely be annotated more efficiently by doing so manually with tools such as QuPath.

6. When should the user focus on making annotations rather than training the model?


This is a continuation of the previous question. Before answering it, we want to provide some background on why the question arises. Ideally, the user would like to train a deep learning model whose suggested annotations perfectly match manual results. Obtaining such a model within a reasonable amount of time is impossible due to the limited annotation data and computation capacity. However, QA is not designed to train a perfect histologic deep learning model. Its essential purpose is to help pathologists and doctors rapidly bootstrap annotations so that they can start post-annotation studies (e.g., biomarker analysis). At the same time, these annotations can serve as training data for future studies in the community.

As mentioned in the Workflow Chart, once the user has a sufficiently sophisticated model, they will largely accept the prediction results and only make minor edits. Therefore, the user needs to decide when to stop using QA to train the deep learning model and start to focus on the annotation process. It is important to note that the user is the final arbiter of an acceptable annotation and can always manually adjust any pixel they disagree with. So when should the user stop training and focus on annotation?

  1. This is the most frequent case in which the user should shift focus to annotations. The user should stop training when the model performs well enough that only minor modifications are needed before submission. Beyond this point, the user should not spend much time chasing marginal performance improvements: the efficiency gain from a new model would not compensate for the time spent training it. We provide a demo of a QA classifier that gives near-perfect suggestions, where the user only needs minimal manual corrections before accepting. In the example below, the user only needs to make minor modifications to the deep learning suggestions before submitting, whereas training a new model would take too much time for a minimal improvement in performance.

(Figure: a user annotating colon tubule image slides.)

  2. The user should focus on making annotations when the image is obviously an outlier. QA's deep learning model will be confused if the image is dissimilar to the images in the training data. The most likely cases are: the same histologic structure at a different stain level; the same stain level but a different histologic structure; an image containing a pen mark; etc. In these scenarios, the classifier may still give very good predictions, depending on how it treats such images (they may be similar enough for the computer even if not for human eyes), but it may also give a very poor prediction on this specific image tile. In that case, the user should focus on modification/annotation instead of training the model to improve performance. Remember, the essential goal of QA is to bootstrap good-quality annotations rapidly.

  3. The user should focus on annotations when the image quality is ill-conditioned. The user may get very poor prediction results even after making many accurate annotations. This can be due to the quality of the images in the dataset, for example, images that are out of focus or slides that are poorly stained. In this case, we recommend focusing on the annotations, because the potential model improvement is very limited by the quality of the dataset. If the user wants to explore this case further, they can use a verified dataset to get some insight into QA's efficiency improvement. Another approach worth attempting is to retrain the autoencoder; see details in the FAQ.
