Augmenting only some classes in instance segmentation dataset that was annotated using the COCO format #7

Closed
engmubarak48 opened this issue Aug 28, 2019 · 34 comments

Comments

@engmubarak48

engmubarak48 commented Aug 28, 2019

Hi,

Thank you very much for this amazing work/research/tool. It saves everyone a lot of work that would otherwise have taken a lot of time.

I wanted to try augmenting an instance segmentation dataset that was annotated in COCO format, but I want to augment only a couple of classes instead of all of them. Is such an option available in your tool?

Thanks.

@joheras
Owner

joheras commented Aug 28, 2019

Hi,

I am not sure if I understand what you are trying to do. Do you want to ignore images containing the other classes? What happens if an image contains classes that you are interested in and classes that you want to ignore?

Could you provide a small example showing what you want to do?

Best,

@engmubarak48
Author

engmubarak48 commented Aug 28, 2019

Hi, thanks for your reply.

Assume I have a dataset of images with 4 classes, where each class consists of a set of images (see the attached figure), and I want to augment the images of only three classes (in my case, classes 1, 2, and 4).

Also, assume that the data is annotated in COCO format (JSON-like). In COCO-formatted data, each class has a category with an ID and a name. So is there a way to exclude the images of one class/category from augmentation?

[Attached figure: download]
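
For context, a COCO annotation file is a JSON object with `images`, `annotations`, and `categories` lists; each annotation refers to its image through `image_id` and to its class through `category_id`. The request above can be sketched in plain Python as follows. This is only an illustration, not CLoDSA code: the function name `images_to_augment` and the sample data are made up.

```python
def images_to_augment(coco, ignored_category_ids):
    """Return the ids of images that contain no annotation of an ignored class."""
    ignored = set(ignored_category_ids)
    # Images that contain at least one annotation of an ignored class.
    excluded = {ann["image_id"] for ann in coco["annotations"]
                if ann["category_id"] in ignored}
    return [img["id"] for img in coco["images"] if img["id"] not in excluded]


# Tiny hand-written COCO-style dictionary (illustrative data only).
coco = {
    "images": [{"id": 1, "file_name": "a.png"},
               {"id": 2, "file_name": "b.png"},
               {"id": 3, "file_name": "c.png"}],
    "categories": [{"id": 1, "name": "class1"}, {"id": 3, "name": "class3"}],
    "annotations": [{"id": 10, "image_id": 1, "category_id": 1},
                    {"id": 11, "image_id": 2, "category_id": 3},
                    {"id": 12, "image_id": 3, "category_id": 1}],
}

print(images_to_augment(coco, ignored_category_ids=[3]))  # prints [1, 3]
```

With class 3 ignored, image 2 is left out and the other two images would be passed on to the augmentation step.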

@joheras
Owner

joheras commented Aug 28, 2019

Hi,

Just checking, your images containing class 4 do not contain class 3, do they?

In any case, that feature is not available in CLoDSA, so feel free to implement it and let me know if you need help. I can also implement it myself, but it may take me some time.

Best.

@engmubarak48
Author

@joheras thanks,

Yes, of course, I am assuming that images containing class 3 do not contain any other class.

Anyhow, thanks.

@joheras
Owner

joheras commented Aug 28, 2019

Ok, I think that I might have a new version of the library including such a feature by the end of the week.

@engmubarak48
Author

Okay thanks,

I will leave it to you then. The reason I brought this up is that it could help with the problem of unbalanced data, which causes models to overfit to over-represented classes. It can also solve a lot of industrial problems where it is difficult to acquire images of some classes, so it is worth implementing. Let me know if you need any assistance with this.

@joheras
Owner

joheras commented Aug 29, 2019

I think that the new functionality is ready; you can find it if you update clodsa to version 1.2.33.

There is a notebook explaining the new functionality in:
https://github.com/joheras/CLoDSA/blob/master/notebooks/CLODSA_Instance_Segmentation_Ignore_Classes.ipynb

You can also run it in Colab:
https://colab.research.google.com/github/joheras/CLoDSA/blob/master/notebooks/CLODSA_Instance_Segmentation_Ignore_Classes.ipynb

Let me know if that was what you were thinking about.

@engmubarak48
Author

Great work. I was also reading your code, specifically the part for COCO-format annotations, and I noticed that you are using "cv2.findContours". Sometimes an image has very small contours (as in my case, e.g. very small bacteria shapes), which "cv2.findContours" may fail to extract, and the code then throws errors. In such cases, skipping that image would be okay, and that is what I am doing now. When I finish, I will send a pull request with the modifications so that nobody else gets frustrated by such errors.
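
A minimal sketch of the skipping idea described above, assuming each contour has the nesting that `cv2.findContours` produces (a sequence of points, each wrapped as `[[x, y]]`). The helper name `contour_to_segmentation` and the three-point minimum are assumptions for illustration, not part of CLoDSA.

```python
def contour_to_segmentation(contour):
    """Convert one contour to a closed list of [x, y] points, or None.

    Returns None for a missing or degenerate contour so that the caller
    can skip the mask instead of raising an exception.
    """
    if contour is None or len(contour) < 3:
        # Fewer than three points cannot describe a polygon.
        return None
    points = [[int(p[0][0]), int(p[0][1])] for p in contour]
    points.append(points[0])  # close the polygon
    return points


# Plain lists stand in for the arrays that OpenCV would return.
triangle = [[[0, 0]], [[4, 0]], [[4, 3]]]
single_pixel = [[[7, 7]]]

print(contour_to_segmentation(triangle))
print(contour_to_segmentation(single_pixel))
```

A caller would then skip any mask for which the helper returns `None`.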

@joheras
Owner

joheras commented Aug 29, 2019

Great. Thanks.

@joheras joheras closed this as completed Aug 29, 2019
@engmubarak48
Author

I have tried the COCO augmentation, but I am facing a problem that I cannot figure out so far. When I try to ignore the masks that have "None" contours, the saved annotation file does not contain the annotations of the new augmented images. In other words, the images are augmented correctly, but the annotations array in the generated COCO file is empty. Please let me know if you have encountered such a problem.
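
One quick way to confirm this symptom is to count the annotations attached to each image in the generated file. The sketch below uses only the standard COCO keys (`images`, `annotations`, `image_id`, `file_name`); the function name and the sample dictionary are hypothetical, and a real file would be loaded with `json.load`.

```python
from collections import Counter


def annotation_counts(coco):
    """Map each image file name to the number of annotations that reference it."""
    per_image = Counter(ann["image_id"] for ann in coco.get("annotations", []))
    return {img["file_name"]: per_image.get(img["id"], 0) for img in coco["images"]}


# Stand-in for the loaded annotation file; the data is made up.
generated = {
    "images": [{"id": 0, "file_name": "0_a.jpg"},
               {"id": 1, "file_name": "1_a.jpg"}],
    "annotations": [{"id": 0, "image_id": 0, "category_id": 1}],
}

counts = annotation_counts(generated)
missing = [name for name, n in counts.items() if n == 0]
print(missing)  # prints ['1_a.jpg']
```

If every augmented image shows up in `missing`, the contours were probably dropped before the annotations were written.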

Thanks

@joheras
Owner

joheras commented Sep 2, 2019

I have not encountered such a problem. Could you send me a small dataset where I can reproduce the problem?

Best

@engmubarak48
Author

Here it is, @joheras:
test.zip

Thanks

@joheras
Owner

joheras commented Sep 2, 2019

In order to reproduce the issue, could you tell me which class you ignore and which augmentation techniques you are using? If you are using a Jupyter notebook for the augmentation process, it would be easier if you sent it to me.

@joheras joheras reopened this Sep 2, 2019
@engmubarak48
Author

engmubarak48 commented Sep 2, 2019

Hi @joheras

I modified "cocoLinearInstanceSegmentationAugmentor.py" so that it still runs when 'cnts' is empty. The small change that worked for me is below:

for (j, transformer) in enumerate(transformers):
    (newimage, newmasklabels) = transformer.transform(image, maskLabels)

    cv2.imwrite(outputPath + str(j) + "_" + name, newimage)
    newSegmentations = []
    for (mask, label) in newmasklabels:
        cnts = cv2.findContours(mask.copy(), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
        cnts = cnts[0] if imutils.is_cv2() else cnts[1]
        # if cnts is not None:
        # Skip any mask whose contour cannot be converted instead of failing.
        try:
            segmentation = [[x[0][0], x[0][1]] for x in cnts[0]]
            # Closing the polygon
            segmentation.append(segmentation[0])

            newSegmentations.append((label, cv2.boundingRect(cnts[0]), segmentation, cv2.contourArea(cnts[0])))
        except Exception:
            continue
    allNewImagesResult.append((str(j) + "_" + name, (w, h), newSegmentations))

output.zip

Also, I have attached a zip file containing the Jupyter notebook, the output images, and the annotations.

If you open the annotation file, you can see that the 'annotations' list at the end is empty.

In the modified script, I only added the try/except. Using "if cnts is not None:" produces another problem.

Anyhow, the whole problem comes from finding the contours.
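
A `try`/`except` that silently continues also hides why a mask was dropped, which makes an empty annotation list hard to diagnose. A hedged sketch of a variant that records the reason is shown below; `collect_segmentations` and the logger name are hypothetical, and plain Python lists stand in for the arrays that OpenCV returns.

```python
import logging

logger = logging.getLogger("augmentation")


def collect_segmentations(mask_contours):
    """Build closed polygons from (label, contour) pairs.

    Returns (kept, skipped): the converted polygons and the labels of the
    masks that could not be converted.
    """
    kept, skipped = [], []
    for label, contour in mask_contours:
        try:
            segmentation = [[p[0][0], p[0][1]] for p in contour]
            segmentation.append(segmentation[0])  # close the polygon
        except (IndexError, TypeError) as exc:
            # Record the reason instead of dropping the mask silently.
            logger.warning("skipping mask %r: %s", label, exc)
            skipped.append(label)
            continue
        kept.append((label, segmentation))
    return kept, skipped


pairs = [
    ("ok", [[[0, 0]], [[1, 0]], [[1, 1]]]),
    ("none", None),        # no contour at all
    ("empty", []),         # contour without points
    ("scalars", [1, 2]),   # wrong nesting, e.g. not a contour
]
kept, skipped = collect_segmentations(pairs)
print(skipped)  # prints ['none', 'empty', 'scalars']
```

If every label ends up in `skipped`, the input is probably not a list of contours at all, which is a different problem from a few masks being too small.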

@joheras
Owner

joheras commented Sep 2, 2019

You have not added the zip file.

@engmubarak48
Author

Can you refresh the page once more?

@joheras
Owner

joheras commented Sep 2, 2019

Now, I see it. I will check this issue tomorrow and let you know.
Best

@engmubarak48
Author

Great....

Thanks.

@joheras
Owner

joheras commented Sep 3, 2019

I think that I have fixed the problem. Could you check whether the attached file is the correct result?
output.zip

The fixed version is on GitHub and can also be installed via pip.

@engmubarak48
Author

Thanks let me check

@engmubarak48
Author

engmubarak48 commented Sep 3, 2019

_RemoteTraceback                          Traceback (most recent call last)
_RemoteTraceback:
"""
Traceback (most recent call last):
  File "/home/mubarak/miniconda3/envs/logo-detection/lib/python3.7/site-packages/joblib/externals/loky/process_executor.py", line 418, in _process_worker
    r = call_item()
  File "/home/mubarak/miniconda3/envs/logo-detection/lib/python3.7/site-packages/joblib/externals/loky/process_executor.py", line 272, in __call__
    return self.fn(*self.args, **self.kwargs)
  File "/home/mubarak/miniconda3/envs/logo-detection/lib/python3.7/site-packages/joblib/_parallel_backends.py", line 567, in __call__
    return self.func(*args, **kwargs)
  File "/home/mubarak/miniconda3/envs/logo-detection/lib/python3.7/site-packages/joblib/parallel.py", line 225, in __call__
    for func, args, kwargs in self.items]
  File "/home/mubarak/miniconda3/envs/logo-detection/lib/python3.7/site-packages/joblib/parallel.py", line 225, in
    for func, args, kwargs in self.items]
  File "/home/mubarak/miniconda3/envs/logo-detection/lib/python3.7/site-packages/clodsa/augmentors/cocoLinearInstanceSegmentationAugmentor.py", line 47, in readAndGenerateInstanceSegmentation
    if len(cnts)>0:
TypeError: object of type 'NoneType' has no len()
"""

The above exception was the direct cause of the following exception:

TypeError                                 Traceback (most recent call last)
in
----> 1 augmentor.applyAugmentation()

~/miniconda3/envs/logo-detection/lib/python3.7/site-packages/clodsa/augmentors/cocoLinearInstanceSegmentationAugmentor.py in applyAugmentation(self)
     87 (self.outputPath, self.transformers, self.imagesPath, self.dictImages[x],
     88 self.dictAnnotations[x],self.ignoreClasses)
---> 89 for x in self.dictImages.keys())
     90
     91 data = {}

~/miniconda3/envs/logo-detection/lib/python3.7/site-packages/joblib/parallel.py in __call__(self, iterable)
    932
    933 with self._backend.retrieval_context():
--> 934     self.retrieve()
    935 # Make sure that we get a last message telling us we are done
    936 elapsed_time = time.time() - self._start_time

~/miniconda3/envs/logo-detection/lib/python3.7/site-packages/joblib/parallel.py in retrieve(self)
    831 try:
    832     if getattr(self._backend, 'supports_timeout', False):
--> 833         self._output.extend(job.get(timeout=self.timeout))
    834     else:
    835         self._output.extend(job.get())

~/miniconda3/envs/logo-detection/lib/python3.7/site-packages/joblib/_parallel_backends.py in wrap_future_result(future, timeout)
    519 AsyncResults.get from multiprocessing."""
    520 try:
--> 521     return future.result(timeout=timeout)
    522 except LokyTimeoutError:
    523     raise TimeoutError()

~/miniconda3/envs/logo-detection/lib/python3.7/concurrent/futures/_base.py in result(self, timeout)
    430     raise CancelledError()
    431 elif self._state == FINISHED:
--> 432     return self.__get_result()
    433 else:
    434     raise TimeoutError()

~/miniconda3/envs/logo-detection/lib/python3.7/concurrent/futures/_base.py in __get_result(self)
    382 def __get_result(self):
    383     if self._exception:
--> 384         raise self._exception
    385     else:
    386         return self._result

TypeError: object of type 'NoneType' has no len()

I don't know what you have changed, but with the same data and Jupyter notebook I am getting the error above, whose cause is clear. I tried adding another "if" statement to skip the mask when cnts is 'None', but then I get the error below:

"""
Traceback (most recent call last):
  File "/home/mubarak/miniconda3/envs/logo-detection/lib/python3.7/site-packages/joblib/externals/loky/process_executor.py", line 418, in _process_worker
    r = call_item()
  File "/home/mubarak/miniconda3/envs/logo-detection/lib/python3.7/site-packages/joblib/externals/loky/process_executor.py", line 272, in __call__
    return self.fn(*self.args, **self.kwargs)
  File "/home/mubarak/miniconda3/envs/logo-detection/lib/python3.7/site-packages/joblib/_parallel_backends.py", line 567, in __call__
    return self.func(*args, **kwargs)
  File "/home/mubarak/miniconda3/envs/logo-detection/lib/python3.7/site-packages/joblib/parallel.py", line 225, in __call__
    for func, args, kwargs in self.items]
  File "/home/mubarak/miniconda3/envs/logo-detection/lib/python3.7/site-packages/joblib/parallel.py", line 225, in
    for func, args, kwargs in self.items]
  File "/home/mubarak/miniconda3/envs/logo-detection/lib/python3.7/site-packages/clodsa/augmentors/cocoLinearInstanceSegmentationAugmentor.py", line 48, in readAndGenerateInstanceSegmentation
    segmentation = [[x[0][0], x[0][1]] for x in cnts[0]]
  File "/home/mubarak/miniconda3/envs/logo-detection/lib/python3.7/site-packages/clodsa/augmentors/cocoLinearInstanceSegmentationAugmentor.py", line 48, in
    segmentation = [[x[0][0], x[0][1]] for x in cnts[0]]
IndexError: invalid index to scalar variable.
"""

@joheras
Owner

joheras commented Sep 3, 2019

Is it the same dataset that you sent me previously? I am not able to reproduce this error.

@engmubarak48
Author

Yes, the same data and the same Jupyter code I sent you yesterday. Can we check it together if you are available now?

@engmubarak48
Author

As far as I understand, you added this line of code, which I had already tried before:

if len(cnts)>0:

@joheras
Owner

joheras commented Sep 3, 2019 via email

@engmubarak48
Author

Of course, I have used the modified one, but the error persists.

@joheras
Owner

joheras commented Sep 4, 2019

Hi,
I have tried it in Colab and it also works:
https://colab.research.google.com/drive/1r-YQgrye7ynD0Ay4XW5VrqUJ9H_pANmF

Could it be an issue related to your version of OpenCV? It seems that it works fine with version 3.4 of OpenCV, but maybe you are using a different version.

@engmubarak48
Author

Hi,

Let me check the version of OpenCV

Thanks

@engmubarak48
Author

I am using OpenCV version '4.1.0'. Do you think this is the problem? I will downgrade and get back to you if the problem persists.

@joheras
Owner

joheras commented Sep 4, 2019 via email

@engmubarak48
Author

Nice, I was about to try. Good to hear that you figured out the problem.

@engmubarak48
Author

Very interesting, it is working now. I will check whether the annotation results are as expected.

So far there is no problem, but we should put some kind of exception or version check in the code, so that the contours give the expected results whichever version is installed.

Thanks.
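
A minimal sketch of such a version check, assuming the version is available as a dotted string (`cv2.__version__` has this form). It relies on the documented return values of `cv2.findContours`: OpenCV 3.x returns `(image, contours, hierarchy)`, while 2.x and 4.x return `(contours, hierarchy)`. The helper names are hypothetical.

```python
def opencv_major(version):
    """Return the major version number from a string such as '4.1.0'."""
    return int(version.split(".")[0])


def contours_index(version):
    """Position of the contour list in the tuple returned by cv2.findContours."""
    # OpenCV 3.x: (image, contours, hierarchy); 2.x and 4.x: (contours, hierarchy).
    return 1 if opencv_major(version) == 3 else 0


for version in ("2.4.13", "3.4.0", "4.1.0"):
    print(version, contours_index(version))
```

In real code the argument would be `cv2.__version__`, and the result would index the value returned by `cv2.findContours`.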

@joheras
Owner

joheras commented Sep 4, 2019

It is fixed now for version 4 of OpenCV.
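
The thread does not say how the fix was made. A plausible reading of the two errors above, not confirmed here, is that on OpenCV 4 the expression `cnts[1]` selects the hierarchy instead of the contours: the hierarchy is `None` when no contour is found, and an array of integers otherwise. One approach that needs no version check is to pick the contours by the length of the returned tuple; the `imutils` package offers a `grab_contours` helper along these lines. The function below is an illustrative re-implementation, not CLoDSA's actual code.

```python
def grab_contours(result):
    """Return the contour list from the value returned by cv2.findContours.

    Works for both return shapes: (contours, hierarchy) and
    (image, contours, hierarchy).
    """
    if len(result) == 2:
        return result[0]
    if len(result) == 3:
        return result[1]
    raise ValueError("unexpected findContours result of length %d" % len(result))


# Fake return values; strings stand in for the image and hierarchy arrays.
opencv4_style = (["contour_a", "contour_b"], "hierarchy")
opencv3_style = ("image", ["contour_a", "contour_b"], "hierarchy")

print(grab_contours(opencv4_style) == grab_contours(opencv3_style))  # prints True
```

An empty contour list still has to be handled by the caller, since a mask can legitimately contain no contour.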

@joheras joheras closed this as completed Sep 4, 2019
@engmubarak48
Author

engmubarak48 commented Sep 6, 2019

Hi @joheras

The results are very interesting. Even after updating to the latest version, as mentioned in issue #8, the bounding boxes are mislocated and sometimes my ground-truth annotations are completely gone after augmenting.

For example, look at the images below. The first is the ground-truth image and the second is the augmented image. I would say the difference is huge.

Could you please check the output for the images I shared with you earlier? Maybe we can reproduce the bug.

[Attached image: download1]

[Attached image: download2]

NB: this image belongs to the ignored classes and contains only one class (e.g. class 3). When ignoring some classes, I expected nothing to happen to the images of those classes.
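
One way to narrow down a report like this is to recompute each bounding box from its polygon and compare it with the stored `bbox`. The sketch below assumes polygon-style COCO segmentations (flat `[x1, y1, x2, y2, ...]` lists) and the COCO `[x, y, width, height]` box layout; the function names, the tolerance, and the sample data are made up for illustration.

```python
def bbox_from_segmentation(segmentation):
    """Compute [x, y, width, height] from a list of flat polygons."""
    xs = [v for poly in segmentation for v in poly[0::2]]
    ys = [v for poly in segmentation for v in poly[1::2]]
    x, y = min(xs), min(ys)
    return [x, y, max(xs) - x, max(ys) - y]


def mismatched_annotations(coco, tolerance=1.0):
    """Return ids of annotations whose bbox disagrees with their polygon."""
    bad = []
    for ann in coco["annotations"]:
        expected = bbox_from_segmentation(ann["segmentation"])
        # A small tolerance absorbs off-by-one differences between tools.
        if any(abs(e - b) > tolerance for e, b in zip(expected, ann["bbox"])):
            bad.append(ann["id"])
    return bad


square = [[10, 10, 20, 10, 20, 30, 10, 30]]
sample = {"annotations": [
    {"id": 1, "segmentation": square, "bbox": [10, 10, 10, 20]},
    {"id": 2, "segmentation": square, "bbox": [50, 50, 10, 20]},  # shifted box
]}

print(mismatched_annotations(sample))  # prints [2]
```

Running the same check on the original and on the augmented annotation file would show whether the boxes were already inconsistent before augmentation.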
