diff --git a/doc/under_sampling.rst b/doc/under_sampling.rst
index 13798ad78..ff502aa44 100644
--- a/doc/under_sampling.rst
+++ b/doc/under_sampling.rst
@@ -125,9 +125,9 @@ It would also work with pandas dataframe::

    >>> df_resampled, y_resampled = rus.fit_resample(df_adult, y_adult)
    >>> df_resampled.head()  # doctest: +SKIP

-:class:`NearMiss` adds some heuristic rules to select samples
-:cite:`mani2003knn`. :class:`NearMiss` implements 3 different types of
-heuristic which can be selected with the parameter ``version``::
+:class:`NearMiss` undersamples data based on heuristic rules to select the
+observations :cite:`mani2003knn`. :class:`NearMiss` implements 3 different
+methods to undersample, which can be selected with the parameter ``version``::

    >>> from imblearn.under_sampling import NearMiss
    >>> nm1 = NearMiss(version=1)
@@ -135,12 +135,14 @@ heuristic which can be selected with the parameter ``version``::
    >>> print(sorted(Counter(y_resampled).items()))
    [(0, 64), (1, 64), (2, 64)]

-As later stated in the next section, :class:`NearMiss` heuristic rules are
-based on nearest neighbors algorithm. Therefore, the parameters ``n_neighbors``
-and ``n_neighbors_ver3`` accept classifier derived from ``KNeighborsMixin``
-from scikit-learn. The former parameter is used to compute the average distance
-to the neighbors while the latter is used for the pre-selection of the samples
-of interest.
+
+:class:`NearMiss` heuristic rules are based on the nearest neighbors algorithm.
+Therefore, the parameters ``n_neighbors`` and ``n_neighbors_ver3`` accept either
+integers giving the size of the neighbourhood to explore or a classifier derived
+from scikit-learn's ``KNeighborsMixin``. The parameter ``n_neighbors`` is used to
+compute the average distance to the neighbors, while ``n_neighbors_ver3`` is used
+for the pre-selection of the samples from the majority class and only applies to
+version 3. More details about NearMiss are given in the next section.

 Mathematical formulation
 ^^^^^^^^^^^^^^^^^^^^^^^^
@@ -175,19 +177,16 @@ is the largest.
    :scale: 60
    :align: center

-In the next example, the different :class:`NearMiss` variant are applied on the
-previous toy example. It can be seen that the decision functions obtained in
+In the next example, the different :class:`NearMiss` variants are applied to the
+previous toy example. We can see that the decision functions obtained in
 each case are different.

-When under-sampling a specific class, NearMiss-1 can be altered by the presence
-of noise. In fact, it will implied that samples of the targeted class will be
-selected around these samples as it is the case in the illustration below for
-the yellow class. However, in the normal case, samples next to the boundaries
-will be selected. NearMiss-2 will not have this effect since it does not focus
-on the nearest samples but rather on the farthest samples. We can imagine that
-the presence of noise can also altered the sampling mainly in the presence of
-marginal outliers. NearMiss-3 is probably the version which will be less
-affected by noise due to the first step sample selection.
+When under-sampling a specific class, NearMiss-1 can be affected by noise. In
+fact, samples of the targeted class located around observations from the minority
+class tend to be selected, as shown in the illustration below (see yellow class).
+NearMiss-2 might be less affected by noise as it does not focus on the nearest
+samples but rather on the farthest samples. NearMiss-3 is probably the version
+which will be less affected by noise due to the first step of sample selection.
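+
+As discussed above, the neighbourhood parameters accept either integers or
+scikit-learn nearest-neighbors estimators. A minimal sketch with version 3,
+assuming the toy dataset ``X``, ``y`` created earlier; the values used here are
+only illustrative::
+
+   >>> from sklearn.neighbors import NearestNeighbors
+   >>> nm3 = NearMiss(version=3, n_neighbors=3,
+   ...                n_neighbors_ver3=NearestNeighbors(n_neighbors=3))
+   >>> X_resampled, y_resampled = nm3.fit_resample(X, y)
+   >>> print(sorted(Counter(y_resampled).items()))  # doctest: +SKIP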

 .. image:: ./auto_examples/under-sampling/images/sphx_glr_plot_comparison_under_sampling_003.png
    :target: ./auto_examples/under-sampling/plot_comparison_under_sampling.html
    :scale: 60
    :align: center

@@ -198,7 +197,7 @@ Cleaning under-sampling techniques
 ----------------------------------

 Cleaning under-sampling techniques do not allow to specify the number of
-samples to have in each class. In fact, each algorithm implement an heuristic
+samples to have in each class. In fact, each algorithm implements a heuristic
 which will clean the dataset.

 .. _tomek_links:

@@ -214,20 +213,20 @@ defined such that for any sample :math:`z`:

    d(x, y) < d(x, z) \text{ and } d(x, y) < d(y, z)

-where :math:`d(.)` is the distance between the two samples. In some other
-words, a Tomek's link exist if the two samples are the nearest neighbors of
-each other. In the figure below, a Tomek's link is illustrated by highlighting
-the samples of interest in green.
+where :math:`d(.)` is the distance between the two samples. In other words,
+a Tomek's link exists if two samples are nearest neighbors of each other but
+belong to different classes. In the figure below, a Tomek's link is illustrated
+by highlighting the samples of interest in green.

 .. image:: ./auto_examples/under-sampling/images/sphx_glr_plot_illustration_tomek_links_001.png
    :target: ./auto_examples/under-sampling/plot_illustration_tomek_links.html
    :scale: 60
    :align: center

-The parameter ``sampling_strategy`` control which sample of the link will be
+The parameter ``sampling_strategy`` controls which sample of the Tomek link will be
 removed. For instance, the default (i.e., ``sampling_strategy='auto'``) will
-remove the sample from the majority class. Both samples from the majority and
-minority class can be removed by setting ``sampling_strategy`` to ``'all'``. The
+remove the sample from the majority class. However, samples from both the majority
+and minority classes can be removed by setting ``sampling_strategy`` to ``'all'``. The
 figure illustrates this behaviour.

 .. image:: ./auto_examples/under-sampling/images/sphx_glr_plot_illustration_tomek_links_002.png
    :target: ./auto_examples/under-sampling/plot_illustration_tomek_links.html
    :scale: 60
    :align: center

@@ -311,15 +310,19 @@ Condensed nearest neighbors and derived algorithms
 :class:`CondensedNearestNeighbour` uses a 1 nearest neighbor rule to
 iteratively decide if a sample should be removed or not
-:cite:`hart1968condensed`. The algorithm is running as followed:
+:cite:`hart1968condensed`. The algorithm runs as follows:

 1. Get all minority samples in a set :math:`C`.
 2. Add a sample from the targeted class (class to be under-sampled) in
    :math:`C` and all other samples of this class in a set :math:`S`.
-3. Go through the set :math:`S`, sample by sample, and classify each sample
-   using a 1 nearest neighbor rule.
-4. If the sample is misclassified, add it to :math:`C`, otherwise do nothing.
-5. Reiterate on :math:`S` until there is no samples to be added.
+3. Train a 1-KNN (1 nearest neighbor classifier) on :math:`C`.
+4. Go through the samples in set :math:`S`, sample by sample, and classify each
+   one using the 1-KNN trained in step 3.
+5. If the sample is misclassified, add it to :math:`C` and re-train the 1-KNN.
+6. Repeat steps 4 and 5 until all observations in :math:`S` have been examined.
+
+The final dataset is :math:`C`, containing all observations from the minority
+class, the sample(s) added at random from the targeted class, and those samples
+from the targeted class that were misclassified by the successive 1-KNN models.

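+A schematic sketch of this loop is shown below; it is not the actual library
+implementation, and it assumes that ``C_x``, ``C_y`` and ``S_x``, ``S_y`` are
+NumPy arrays holding the two sets described above::
+
+   import numpy as np
+   from sklearn.neighbors import KNeighborsClassifier
+
+   knn = KNeighborsClassifier(n_neighbors=1)
+   knn.fit(C_x, C_y)                                # step 3: train a 1-KNN on C
+   for x_s, y_s in zip(S_x, S_y):                   # step 4: go through S
+       if knn.predict(x_s.reshape(1, -1))[0] != y_s:  # misclassified by the 1-KNN
+           C_x = np.vstack([C_x, x_s])              # step 5: add the sample to C
+           C_y = np.append(C_y, y_s)
+           knn.fit(C_x, C_y)                        # re-train before continuing
+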
 The :class:`CondensedNearestNeighbour` can be used in the following manner::

    >>> cnn = CondensedNearestNeighbour(random_state=0)
    >>> X_resampled, y_resampled = cnn.fit_resample(X, y)
    >>> print(sorted(Counter(y_resampled).items()))
    [(0, 64), (1, 24), (2, 115)]

-However as illustrated in the figure below, :class:`CondensedNearestNeighbour`
-is sensitive to noise and will add noisy samples.
+However, as illustrated in the figure below, :class:`CondensedNearestNeighbour`
+is sensitive to noise and may select noisy samples.
+
+In an attempt to remove noisy observations, :class:`OneSidedSelection`
+will first find the observations that are hard to classify, and then will use
+:class:`TomekLinks` to remove noisy samples :cite:`hart1968condensed`.
+:class:`OneSidedSelection` runs as follows:
+
-In the contrary, :class:`OneSidedSelection` will use :class:`TomekLinks` to
-remove noisy samples :cite:`hart1968condensed`. In addition, the 1 nearest
-neighbor rule is applied to all samples and the one which are misclassified
-will be added to the set :math:`C`. No iteration on the set :math:`S` will take
-place. The class can be used as::
+1. Get all minority samples in a set :math:`C`.
+2. Add a sample from the targeted class (class to be under-sampled) in
+   :math:`C` and all other samples of this class in a set :math:`S`.
+3. Train a 1-KNN on :math:`C`.
+4. Using the 1-KNN trained in step 3, classify all samples in set :math:`S`.
+5. Add all misclassified samples to :math:`C`.
+6. Remove Tomek's links from :math:`C`.
+
+The final dataset is :math:`C`, containing all observations from the minority
+class, plus the observations from the majority class that were added at random,
+plus all those from the majority class that were misclassified by the 1-KNN.
+Note that, unlike :class:`CondensedNearestNeighbour`, :class:`OneSidedSelection`
+does not re-train the KNN after each misclassified sample. It uses a single KNN
+to classify all samples from the majority class in one pass. The class can be
+used as::

    >>> from imblearn.under_sampling import OneSidedSelection
    >>> oss = OneSidedSelection(random_state=0)
    >>> print(sorted(Counter(y_resampled).items()))
    [(0, 64), (1, 174), (2, 4404)]

-Our implementation offer to set the number of seeds to put in the set :math:`C`
-originally by setting the parameter ``n_seeds_S``.
+Our implementation offers the possibility to set the number of observations
+to put at random in the set :math:`C` through the parameter ``n_seeds_S``.
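+
+For instance, more observations from the targeted class can be seeded into
+:math:`C`; a minimal sketch, where the value 10 is only illustrative::
+
+   >>> oss10 = OneSidedSelection(random_state=0, n_seeds_S=10)
+   >>> X_resampled, y_resampled = oss10.fit_resample(X, y)
+   >>> print(sorted(Counter(y_resampled).items()))  # doctest: +SKIP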

 :class:`NeighbourhoodCleaningRule` will focus on cleaning the data than
 condensing them :cite:`laurikkala2001improving`. Therefore, it will used the
diff --git a/imblearn/under_sampling/_prototype_selection/_condensed_nearest_neighbour.py b/imblearn/under_sampling/_prototype_selection/_condensed_nearest_neighbour.py
index 738110cae..93302860a 100644
--- a/imblearn/under_sampling/_prototype_selection/_condensed_nearest_neighbour.py
+++ b/imblearn/under_sampling/_prototype_selection/_condensed_nearest_neighbour.py
@@ -47,7 +47,10 @@ class CondensedNearestNeighbour(BaseCleaningSampler):
         be used.

     n_seeds_S : int, default=1
-        Number of samples to extract in order to build the set S.
+        Number of samples from the majority class to add randomly to the set
+        with all minority observations before training the first KNN model.
+        In the original implementation this is 1, but more samples can be
+        added with this parameter.

     {n_jobs}

@@ -70,13 +73,13 @@ class CondensedNearestNeighbour(BaseCleaningSampler):
     -----
     The method is based on [1]_.

-    Supports multi-class resampling. A one-vs.-rest scheme is used when
+    Supports multi-class resampling. A one-vs.-one scheme is used when
     sampling a class as proposed in [1]_.

     References
     ----------
-    .. [1] P. Hart, "The condensed nearest neighbor rule,"
-       In Information Theory, IEEE Transactions on, vol. 14(3),
+    .. [1] P. Hart, "The condensed nearest neighbor rule",
+       in Information Theory, IEEE Transactions on, vol. 14(3),
        pp. 515-516, 1968.

     Examples
@@ -124,7 +127,7 @@ def _validate_estimator(self):
         else:
             raise ValueError(
                 f"`n_neighbors` has to be a int or an object"
-                f" inhereited from KNeighborsClassifier."
+                f" inherited from KNeighborsClassifier."
                 f" Got {type(self.n_neighbors)} instead."
             )

@@ -168,7 +171,8 @@ def _fit_resample(self, X, y):

             # Check each sample in S if we keep it or drop it
             for idx_sam, (x_sam, y_sam) in enumerate(zip(S_x, S_y)):
-                # Do not select sample which are already well classified
+                # Do not select samples which are already well classified
+                # (or were already randomly selected to be part of C)
                 if idx_sam in good_classif_label:
                     continue

@@ -177,7 +181,7 @@ def _fit_resample(self, X, y):
                 x_sam = x_sam.reshape(1, -1)
                 pred_y = self.estimator_.predict(x_sam)

-                # If the prediction do not agree with the true label
+                # If the prediction does not agree with the true label
                 # append it in C_x
                 if y_sam != pred_y:
                     # Keep the index for later
@@ -191,9 +195,9 @@ def _fit_resample(self, X, y):
                     # fit a knn on C
                     self.estimator_.fit(C_x, C_y)

-                    # This experimental to speed up the search
-                    # Classify all the element in S and avoid to test the
-                    # well classified elements
+                    # This is experimental to speed up the search
+                    # Classify all the elements in S and avoid testing the
+                    # correctly classified elements
                     pred_S_y = self.estimator_.predict(S_x)
                     good_classif_label = np.unique(
                         np.append(idx_maj_sample, np.flatnonzero(pred_S_y == S_y))
diff --git a/imblearn/under_sampling/_prototype_selection/_nearmiss.py b/imblearn/under_sampling/_prototype_selection/_nearmiss.py
index ec3f33cfe..0050c96df 100644
--- a/imblearn/under_sampling/_prototype_selection/_nearmiss.py
+++ b/imblearn/under_sampling/_prototype_selection/_nearmiss.py
@@ -36,20 +36,24 @@ class NearMiss(BaseUnderSampler):

     n_neighbors : int or estimator object, default=3
         If ``int``, size of the neighbourhood to consider to compute the
-        average distance to the minority point samples. If object, an
-        estimator that inherits from
-        :class:`~sklearn.neighbors.base.KNeighborsMixin` that will be used to
-        find the k_neighbors.
-        By default, it will be a 3-NN.
+        average distance to the minority samples. If object, an estimator
+        that inherits from :class:`~sklearn.neighbors.base.KNeighborsMixin`
+        that will be used to find the k_neighbors. By default, it considers
+        the 3 closest neighbours.

     n_neighbors_ver3 : int or estimator object, default=3
-        If ``int``, NearMiss-3 algorithm start by a phase of re-sampling. This
-        parameter correspond to the number of neighbours selected create the
-        subset in which the selection will be performed. If object, an
-        estimator that inherits from
+        NearMiss version 3 starts with a phase of under-sampling where it
+        selects those observations from the majority class that are the
+        closest neighbors to the minority class.
+
+        If ``int``, indicates the number of neighbours to be selected in the
+        first step; these neighbours form the subset in which the selection
+        will be performed.
+        If object, an estimator that inherits from
         :class:`~sklearn.neighbors.base.KNeighborsMixin` that will be used to
-        find the k_neighbors.
-        By default, it will be a 3-NN.
+        find the k_neighbors. By default, the 3 closest neighbours to the
+        minority observations will be selected.
+
+        Only used in version 3.

     {n_jobs}

@@ -75,7 +79,7 @@ class NearMiss(BaseUnderSampler):
     References
     ----------
     .. [1] I. Mani, I. Zhang. "kNN approach to unbalanced data distributions:
-       a case study involving information extraction," In Proceedings of
+       a case study involving information extraction", in Proceedings of
        workshop on learning from imbalanced datasets, 2003.

     Examples
@@ -125,7 +129,7 @@ def _selection_dist_based(
             Associated label to X.

         dist_vec : ndarray, shape (n_samples, )
-            The distance matrix to the nearest neigbour.
+            The distance to the nearest neighbour of each sample.

         num_samples: int
             The desired number of samples to select.
@@ -133,7 +137,7 @@
         key : str or int,
             The target class.

-        sel_strategy : str, optional (default='nearest')
+        sel_strategy : str, default='nearest'
             Strategy to select the samples. Either 'nearest' or 'farthest'

         Returns
@@ -169,13 +173,13 @@
             reverse=sort_way,
         )

-        # Throw a warning to tell the user that we did not have enough samples
-        # to select and that we just select everything
+        # Issue a warning to tell the user that there were not enough samples
+        # to select from and that all samples will be returned
        if len(sorted_idx) < num_samples:
            warnings.warn(
                "The number of the samples to be selected is larger"
                " than the number of samples available. The"
-                " balancing ratio cannot be ensure and all samples"
+                " balancing ratio cannot be ensured and all samples"
                " will be returned."
            )
diff --git a/imblearn/under_sampling/_prototype_selection/_one_sided_selection.py b/imblearn/under_sampling/_prototype_selection/_one_sided_selection.py
index 305abec0b..84daa6195 100644
--- a/imblearn/under_sampling/_prototype_selection/_one_sided_selection.py
+++ b/imblearn/under_sampling/_prototype_selection/_one_sided_selection.py
@@ -41,11 +41,14 @@ class OneSidedSelection(BaseCleaningSampler):
         nearest neighbors. If object, an estimator that inherits from
         :class:`~sklearn.neighbors.base.KNeighborsMixin` that will be used to
         find the nearest-neighbors. If `None`, a
-        :class:`~sklearn.neighbors.KNeighborsClassifier` with a 1-NN rules will
+        :class:`~sklearn.neighbors.KNeighborsClassifier` with a 1-NN rule will
         be used.

     n_seeds_S : int, default=1
-        Number of samples to extract in order to build the set S.
+        Number of samples from the majority class to add randomly to the set
+        with all minority observations before training the first KNN model.
+        In the original implementation this is 1, but more samples can be
+        added with this parameter.

     {n_jobs}

@@ -71,7 +74,7 @@ class OneSidedSelection(BaseCleaningSampler):
     References
     ----------
     .. [1] M. Kubat, S. Matwin, "Addressing the curse of imbalanced training
-       sets: one-sided selection," In ICML, vol. 97, pp. 179-186, 1997.
+       sets: one-sided selection", in ICML, vol. 97, pp. 179-186, 1997.
     Examples
     --------
@@ -150,8 +153,9 @@ def _fit_resample(self, X, y):
             C_x = _safe_indexing(X, C_indices)
             C_y = _safe_indexing(y, C_indices)

-            # create the set S with removing the seed from S
-            # since that it will be added anyway
+            # create the set S with all samples of the current class
+            # except the randomly selected seed samples, since those
+            # were already added to C_x
             idx_maj_extracted = np.delete(idx_maj, sel_idx_maj, axis=0)
             S_x = _safe_indexing(X, idx_maj_extracted)
             S_y = _safe_indexing(y, idx_maj_extracted)
diff --git a/imblearn/under_sampling/_prototype_selection/_tomek_links.py b/imblearn/under_sampling/_prototype_selection/_tomek_links.py
index c3d84b61a..4d9f05cbc 100644
--- a/imblearn/under_sampling/_prototype_selection/_tomek_links.py
+++ b/imblearn/under_sampling/_prototype_selection/_tomek_links.py
@@ -54,7 +54,7 @@ class TomekLinks(BaseCleaningSampler):

     References
     ----------
-    .. [1] I. Tomek, "Two modifications of CNN," In Systems, Man, and
+    .. [1] I. Tomek, "Two modifications of CNN", in Systems, Man, and
        Cybernetics, IEEE Transactions on, vol. 6, pp 769-772, 1976.

     Examples
@@ -91,10 +91,10 @@ def is_tomek(y, nn_index, class_type):
     ----------
     y : ndarray of shape (n_samples,)
         Target vector of the data set, necessary to keep track of whether a
-        sample belongs to minority or not.
+        sample belongs to the minority class or not.

     nn_index : ndarray of shape (len(y),)
-        The index of the closes nearest neighbour to a sample point.
+        Index of the nearest neighbour of each sample.

     class_type : int or str
         The label of the minority class.

@@ -102,21 +102,24 @@ def is_tomek(y, nn_index, class_type):
     Returns
     -------
     is_tomek : ndarray of shape (len(y), )
-        Boolean vector on len( # samples ), with True for majority samples
+        Boolean vector of len( # samples ), with True for majority samples
         that are Tomek links.
     """
     links = np.zeros(len(y), dtype=bool)

-    # find which class to not consider
+    # find which class not to consider
     class_excluded = [c for c in np.unique(y) if c not in class_type]

-    # there is a Tomek link between two samples if they are both nearest
-    # neighbors of each others.
+    # there is a Tomek link between two samples if they are nearest
+    # neighbors of each other and belong to different classes
     for index_sample, target_sample in enumerate(y):
         if target_sample in class_excluded:
             continue

         if y[nn_index[index_sample]] != target_sample:
+            # corroborate that they are neighbours of each other:
+            # (if A's closest neighbour is B, but B's closest neighbour
+            # is C, then A and B are not a Tomek link)
             if nn_index[nn_index[index_sample]] == index_sample:
                 links[index_sample] = True