How to print the TP,FP,FN,TN in the terminal? #1251
Comments
Hello @ZwNSW, thank you for your interest in our work! Please visit our Custom Training Tutorial to get started, and see our Jupyter Notebook, Docker Image, and Google Cloud Quickstart Guide for example environments. If this is a bug report, please provide screenshots and minimum viable code to reproduce your issue; otherwise we cannot help you. If this is a custom model or data training question, please note Ultralytics does not provide free personal support. As a leader in vision ML and AI, we do offer professional consulting, from simple expert advice up to delivery of fully customized, end-to-end production solutions for our clients.

For more information please visit https://www.ultralytics.com.
@ZwNSW TP and FP vectors are computed here: Lines 250 to 319 in c8c5ef3
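For readers who cannot open the permalink: the referenced block builds, for every detection on the validation set, a boolean vector saying whether it is a true positive at each of ten IoU thresholds (0.50 to 0.95). A rough numpy sketch of that idea (simplified, not the exact repository code; all names here are illustrative):

```python
import numpy as np

def match_detections(iou, det_cls, gt_cls, thresholds=np.linspace(0.5, 0.95, 10)):
    """Toy matcher: flag each detection as a TP at each of 10 IoU thresholds.
    iou[i, j] is the IoU between detection i and ground-truth box j;
    det_cls / gt_cls are the corresponding class ids."""
    correct = np.zeros((iou.shape[0], len(thresholds)), dtype=bool)
    for t, thr in enumerate(thresholds):
        used = set()                       # a GT box may match at most one detection
        for i in np.argsort(-iou.max(1)):  # take the best-IoU detections first
            j = int(iou[i].argmax())
            if iou[i, j] >= thr and det_cls[i] == gt_cls[j] and j not in used:
                correct[i, t] = True       # detection i counts as a TP at this threshold
                used.add(j)
    return correct  # shape (num_detections, 10): the per-detection "TP vector"
```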
@glenn-jocher Thanks for your answer. When I run the code you gave, there was an error like this.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
@ZwNSW did you manage to find out the values of tp, fp and fn?
@rita9410 TP and FP vectors are computed here: Lines 250 to 319 in c8c5ef3
@glenn-jocher ok, but how can I print these variables for each class?
@rita9410 YOLOv5 TP and FP vectors are computed here: Lines 250 to 319 in c8c5ef3
They don't print out by default; you'd have to introduce some custom code to see them.
tpc.shape
Out[3]: (3444, 10)
fpc.shape
Out[4]: (3444, 10)
tpc[-1]
Out[5]: array([138, 124, 105, 91, 80, 66, 54, 38, 22, 9])
fpc[-1]
Out[6]: array([3306, 3320, 3339, 3353, 3364, 3378, 3390, 3406, 3422, 3435])
So at 0.5 IoU and 0.001 confidence threshold, for class 0, dataset inference results in 138 TPs and 3306 FPs.
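If you want these counts printed for every class, one option is a small helper like the sketch below, called from inside the per-class loop of ap_per_class() in utils/metrics.py right after the cumulative tpc and fpc arrays are built (the names follow this thread; adapt them to your local copy of the code):

```python
import numpy as np

def print_class_counts(c, tpc, fpc):
    """Hypothetical helper: print TP/FP totals for one class from the cumulative
    tpc/fpc arrays (shape: detections_of_class x 10 IoU thresholds, 0.50:0.95).
    Call it inside the per-class loop of ap_per_class() after tpc/fpc exist."""
    print(f'class {c}: TP@0.50={int(tpc[-1, 0])}  FP@0.50={int(fpc[-1, 0])}  '
          f'TP@0.95={int(tpc[-1, -1])}  FP@0.95={int(fpc[-1, -1])}')
```

The last row of each array corresponds to the lowest confidence kept, i.e. the --conf-thres you ran with.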
@glenn-jocher, is there a way to print the TPs and FPs at a certain confidence? I mean I'd like to know the number of TPs and FPs at 0.5 IoU and 0.60 confidence. Is that possible? And what does the 3444 mean in the tpc.shape output?
@dariogonle the example you copied already prints them out at 10 different IoUs from 0.5 to 0.95. Results are evaluated at the --conf you supply.
@glenn-jocher thank you for your response, but I would like the following: when I run a test I get the following Precision and Recall, and I'd like to know the number of TPs and FPs used to calculate that Precision and Recall. The precision and recall given are for a certain confidence (the one that maximizes the F1), 0.75 in this case. When I run this test (default conf-thres = 0.001) I get the following TPs and FPs. So the supposed precision, for IoU = 0.5, should be P = 262/(262+1984) = 0.11, but in the output the precision is 0.89. If I print the n_l variable in ap_per_class I get 284, so the recall should be R = 262/284 = 0.92, but the output recall is 0.78.

I was wondering how recall and precision are calculated. I guess that the TPs and FPs shown in the second image are for all confidences, while the precision and recall shown in the first image are for a certain confidence level (0.75 in this case, because it is the value that maximizes the F1). That's why I asked if there is a way to output the TPs and FPs for a certain confidence. I'm sure I'm missing something, but I can't figure it out. Thank you in advance.
@dariogonle see metrics.py for P and R computation. Lines 19 to 79 in 5c32bd3
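In simplified form (illustrative names, not the library API), the computation in metrics.py boils down to sorting detections by confidence, accumulating TPs and FPs, and then reading the precision/recall curves off at the confidence that maximizes F1:

```python
import numpy as np

def pr_at_max_f1(tp, conf, n_labels, eps=1e-16):
    """Simplified sketch of how a single reported P/R pair can come out of
    per-detection TP flags. tp: boolean TP flag per detection at one IoU
    threshold; conf: confidence per detection; n_labels: number of
    ground-truth objects for the class."""
    order = np.argsort(-conf)          # rank detections by confidence, high to low
    tpc = np.cumsum(tp[order])         # cumulative TP count as confidence drops
    fpc = np.cumsum(~tp[order])        # cumulative FP count
    recall = tpc / (n_labels + eps)
    precision = tpc / (tpc + fpc + eps)
    f1 = 2 * precision * recall / (precision + recall + eps)
    i = int(f1.argmax())               # operating point: confidence maximizing F1
    return precision[i], recall[i], conf[order][i]
```

Under this reading, tpc[-1] and fpc[-1] are the totals at the lowest confidence kept (e.g. --conf-thres 0.001), while the single printed P and R are read off the curve at the max-F1 confidence (0.75 in the example above), which is why 262/(262+1984) does not reproduce the reported 0.89.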
@glenn-jocher I understand how precision and recall are calculated, but I don't understand the TP and FP numbers that I get. If R = TP/Targets, then TP = R*Targets = 0.78 * 284 = 222. Then if P = TP / (TP + FP), then FP = 27. With these theoretical values for TP and FP it is possible to get a precision of 0.89 and a recall of 0.78, but the output of tpc[-1] and fpc[-1] does not match these values.
@dariogonle FPs and TPs are computed in the code I provided above, on L53 and L54.
@ZwNSW @dariogonle good news 😃! Your original issue may now be fixed ✅ in PR #5727. This PR explicitly computes TP and FP from the existing Labels, P, and R metrics:

TP = Recall * Labels
FP = TP / Precision - TP

These per-class TP and FP vectors are left in val.py for users to access if they want (Line 240 in 36d12a5); see the sketch after this comment for the idea.
To receive this update, update your local repository (for example with git pull, or by cloning https://github.com/ultralytics/yolov5 again).
Thank you for spotting this issue and informing us of the problem. Please let us know if this update resolves the issue for you, and feel free to inform us of any other issues you discover or feature requests that come to mind. Happy trainings with YOLOv5 🚀!
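To make the formulas above concrete, here is a minimal sketch of the back-computation (illustrative code, not necessarily the exact lines merged in the PR), using the numbers discussed earlier in the thread:

```python
import numpy as np

def tp_fp_from_pr(p, r, n_labels, eps=1e-16):
    """Recover per-class TP/FP counts from precision, recall and the number of
    ground-truth labels, all taken at the reported operating point."""
    tp = (r * n_labels).round()           # TP = Recall * Labels
    fp = (tp / (p + eps) - tp).round()    # FP = TP / Precision - TP
    return tp, fp

# numbers from earlier in the thread: P=0.89, R=0.78, 284 labels
print(tp_fp_from_pr(np.array([0.89]), np.array([0.78]), np.array([284])))
# -> (array([222.]), array([27.]))
```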
@glenn-jocher
@a227799770055 we're working on better results introspection tools to allow you to see the worst-performing images in a validation set. This isn't close to release yet, but it should be rolled out over the next few months. I'll add a note to allow for sorting by different metrics like FN, FP, mAP, etc.
@glenn-jocher is there any progress on this?
Hi, it looks like there has been no progress here, but I want to ask a question because I recently ran into this issue and I don't know how to solve it. I already print the TP and FP to the terminal, but the TP and FP numbers printed there don't add up with the labels in the prediction images. The TP and FP counts are lower than the labels present in the prediction images; there are even cases where one of my classes gets very many FPs, yet the FP number I get in the terminal is 0. Does anyone know what is going on? Thank you in advance.
@Rsphyxs hello! It's possible that there's a discrepancy between the TP and FP values you computed and the labels in the prediction images due to differences in how the metrics are computed. It's also possible that your implementation of printing TP and FP is not accounting for all cases. One way to investigate this further would be to manually compare the labels in the prediction images with the computed TP and FP to see if there are any mismatches. Additionally, you can try checking your implementation of printing TP and FP for any errors or bugs. Hope this helps! Let us know if you have any more questions.
Thank you so much for your reply @glenn-jocher. I already checked my implementation of printing the TP and FP, since I followed your guide for it, and I don't think there is an issue there. But I have a follow-up question. I'm a newbie in object detection, so there are several things I probably don't know, but I realized that the FP value is based on the TP, and the TP gets its value from recall * labels, right? The class I mentioned gets 0 TP and 0 FP even though there are predicted labels for it (both TP and FP labels), and when I checked, the corresponding class has 0 recall. How is that possible, since as far as I know recall is how many TPs the class got out of all the correct labels? Below is an image I attached to give more information about this issue. Thank you in advance again.

For additional information: I tested several images with different scenarios, and some of them gave the right result for TP and FP, but in several others the labels in the prediction images and in the results don't match up. Labeled image for validation (sorry, I had to censor the face because it is personal).
@Rsphyxs, it's possible for a class to have a 0 recall value even if there are labels and predictions for that class. Recall is the proportion of true positive predictions out of all the positive ground truth instances, which means that if a class has false negatives (i.e. missed detections), the recall value could be 0 even if there are true positive and false positive detections for that class. In your case, it appears that the class you mentioned has false negatives and no true positives, which is resulting in a 0 recall value. This could explain why the computed TP and FP values differ from the labels in some cases. To investigate this further, you could try comparing the labels and predictions for that class in the validation set to see if there are any missed true positive instances. Additionally, you can try tweaking the model hyperparameters or augmentation strategies to improve the detection performance for that class. Hope this helps! Let us know if you have any more questions or concerns.
@glenn-jocher, thanks for the reply again, but I think there is some mistake, since in the RAM class from the image I gave there is 1 TP (the one in the hand of the person, with a 0.6 confidence score), 1 FP and 1 FN, hence P and R should be 0.5, right? My first assumption for why the RAM isn't detected as a TP was that the IoU threshold is too large, so I tried lowering it, but I still got the same result. Am I missing something?
@Rsphyxs, thanks for bringing this to my attention. I apologize for any confusion caused earlier. You're correct that a class with 1 TP and 1 FP should have precision and recall values of 0.5. It's possible that other factors are affecting the TP values for this class, such as the object's size or the location of the bounding box. Additionally, it's possible that there are bugs or inaccuracies in the TP and FP calculation code that could be contributing to the mismatch between the predicted and computed values. To investigate this further, you could try manually examining the images and labels to see if there are any mismatches or inaccuracies in the detection results. Additionally, you can try tweaking the model's hyperparameters or adjusting the IoU threshold to see if this improves the detection performance for this class. Hope this helps! Let us know if you have any more questions or concerns.
Hi @glenn-jocher, can I ask a question about the Precision/Recall formulation? Can you explain this line of code? I don't really get it. Thank you in advance!
Hi @Rsphyxs, sure, I'd be happy to explain the line of code you're seeing. It is a thresholding operation that marks predicted bounding boxes as true positives (TPs) or false positives (FPs) based on their Intersection over Union (IoU) overlap with the ground-truth bounding boxes. Regarding the question you asked in the chat, yes, the result looks valid based on the values being printed. However, please note that the values depend on many factors, such as the chosen IoU threshold and the accuracy of the model's predictions. Hope this helps! Let me know if you have any more questions or if you need further clarification.
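A tiny self-contained illustration of that kind of check (all names and numbers hypothetical, not the exact line from the repository):

```python
import numpy as np

# A (detection, ground-truth) pair is a TP candidate only if its IoU clears the
# threshold AND the predicted and true classes agree.
iou = np.array([[0.62, 0.10],
                [0.40, 0.55]])        # iou[i, j]: detection i vs ground-truth box j
det_cls = np.array([0, 1])            # predicted class per detection
gt_cls = np.array([0, 0])             # true class per ground-truth box
candidates = (iou >= 0.5) & (det_cls[:, None] == gt_cls[None, :])
print(np.argwhere(candidates))        # -> [[0 0]]: only detection 0 matches GT 0
```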
@glenn-jocher that's really helpful, thank you so much for your answer. But I want to ask again, because I just ran into this problem when running a test with val.py. I wanted to test one of my classes, named RAM, on 10 images, with the result like this.
@Rsphyxs hello! I'm glad that my previous answer was helpful for you. Regarding your recent issue with the RAM class having 0 precision and recall when testing with a mixed image, it's possible that the calculation for precision and recall for the RAM class is being affected by the presence of other classes in the image; this could be due to a number of factors. To investigate this issue further, I recommend that you compare the ground truth labels and the predicted labels for each image, and try to identify any misclassifications or missed detections. Additionally, you can try running the test with only those images that contain RAM class objects and see how the precision and recall values change. If you're still having trouble, feel free to provide some additional information or share more details about your implementation. I'd be happy to assist you further.
Hi @glenn-jocher, thanks again for your response, but I have a follow-up question. If the Precision and Recall are really 0, shouldn't the mAP be 0 too, since the mAP value comes from the Precision and Recall? In the image I gave you before, where the Precision and Recall are 0, the mAP still has a non-zero value. So I compared the Precision-Confidence Curve graph with the one from a test where the Precision isn't 0, and the graphs look pretty similar, so I believe the Precision and Recall are not actually 0 but are just printed as 0. Is that possible?
@Rsphyxs hello! Regarding your question about the mAP value when the precision and recall are both 0, it is possible for the mAP to have a non-zero value even if the precision and recall for a class are 0. The mAP value is a composite metric that takes into account the precision and recall values for all classes, so the contribution of a single class could be outweighed by the contributions of other classes. Moreover, the precision-confidence graph you shared for the class where the precision is 0 looks similar to that where the precision is non-zero, which suggests that the precision and recall for the class are not actually zero but rather the values have been truncated to zero. This could be due to a problem with the way the TP and FP values are being calculated or how the precision and recall values are being printed. If you could provide more details about the implementation, including the code that computes the TP and FP values and how you are printing the precision and recall values, it would be helpful in figuring out the problem. Hope this helps! Let me know if you have any more questions or concerns.
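One reason mAP can stay non-zero even when the printed P and R for a class are 0 is that AP is computed from the whole precision-recall curve of that class (and then averaged over classes), not from the single max-F1 operating point used for the printed P/R values. A generic sketch of AP as the area under an interpolated precision-recall curve (similar in spirit to the usual interpolation, not claimed to be the exact library code):

```python
import numpy as np

def average_precision(recall, precision):
    """AP as the area under a precision-recall curve, using a monotonic
    precision envelope and 101 interpolated recall points."""
    r = np.concatenate(([0.0], recall, [1.0]))
    p = np.concatenate(([1.0], precision, [0.0]))
    p = np.flip(np.maximum.accumulate(np.flip(p)))   # make precision non-increasing
    x = np.linspace(0, 1, 101)                       # 101 evenly spaced recall points
    return np.trapz(np.interp(x, r, p), x)           # area under the curve
```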
Hi @glenn-jocher, the code is actually just like what YOLOv5 uses, because I'm using YOLOv7, which is based on YOLOv5. I already checked and it's pretty much the same.

[EDIT] Before adding a new image, the index taken for the RAM precision is 141, but after adding the new image the index changes from 141 to 818. Is there any reason why the overall F1 score from all classes is used and not the per-class one? And is it possible to change the code so it takes the index from each class's F1 maximum rather than from the mean of all F1 scores? And if I do that, are the results still valid?
Hi @Rsphyxs, I understand that you're facing an issue with the precision and recall values for a particular class in your YOLOv7 implementation, which is based on the YOLOv5 codebase. From the information you've provided, the max F1 score is being used to pick the confidence index at which the per-class precision and recall values are reported. You speculated that the 0 values in precision and recall for the RAM class might be due to the max-F1 index shifting after adding a new image; this can happen because the F1 curve used for that index is averaged over all classes rather than computed per class. Regarding your question about why the class-averaged max F1 is used instead of each class's own maximum: F1 is used because it balances precision and recall at a single operating point, and a single shared confidence keeps the reported metrics comparable across classes. This approach may not be appropriate in all cases, and you could pick a per-class index instead; the reported per-class P and R values would change, but mAP itself is computed from the full precision-recall curve, so it would not be affected by which single point is reported. Keep in mind that changing this implementation may make your numbers harder to compare with the standard ones. A sketch of the per-class variant follows this comment. Hope this helps! Let me know if you have any more questions or concerns.
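A sketch of the per-class variant discussed above (hypothetical function and argument names; the p/r/f1 curves are assumed to have shape (num_classes, num_confidence_points)):

```python
import numpy as np

def operating_points(p_curve, r_curve, f1_curve):
    """Return P/R read at one shared index (from the class-averaged F1 curve,
    mimicking the default behaviour) and at each class's own F1 maximum."""
    shared_i = int(f1_curve.mean(axis=0).argmax())   # one index for every class
    per_class_i = f1_curve.argmax(axis=1)            # one index per class
    rows = np.arange(p_curve.shape[0])
    shared = (p_curve[:, shared_i], r_curve[:, shared_i])
    per_class = (p_curve[rows, per_class_i], r_curve[rows, per_class_i])
    return shared, per_class
```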
Hello @glenn-jocher, that's very helpful, thank you. But is there a way I can get my TP and FP without the final Precision/Recall values, since I really need the TP and FP for each class? Can I get them from tpc and fpc?
Did you solve the problem? I have the same problem and there is an inconsistency between test results and confusion matrix results.
Sadly no; for now I'm just using mAP@0.5 as the main metric for my project.
@enesayan hello, we're sorry to hear that you are facing issues with the precision and recall values in your YOLOv5 implementation. We understand that this is an important metric for your project, and we would like to help you resolve this issue.

Regarding your question, you can obtain per-class TP and FP counts from the cumulative tpc and fpc arrays computed in ap_per_class, which contain the true-positive and false-positive counts per class, or from the confusion matrix. However, if there is an inconsistency between the test results and the confusion-matrix results, this could indicate a problem with how the TP and FP values are being computed. Common issues that cause such discrepancies include incorrect ground-truth labels, incorrect predicted labels, and inconsistent class naming.

To investigate this further, we recommend that you carefully review the ground-truth and predicted labels for each image and compare them with the entries in the confusion matrix. Additionally, you can try running more tests with different images to see whether the same issue persists. If you're still having trouble, please provide more information about your implementation and the specific issues you are facing so that we can assist you better. Thank you.
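For reference, one generic way to pull per-class counts out of a confusion matrix is sketched below; the row/column orientation and the handling of a background row/column are assumptions to verify against your own code:

```python
import numpy as np

def per_class_counts(matrix, rows_are_predictions=True):
    """Read per-class TP/FP/FN off a square confusion matrix, assuming
    rows = predicted class and columns = true class (flip the flag otherwise).
    If the matrix has an extra background row/column, ignore the last entry
    of each returned vector."""
    m = matrix if rows_are_predictions else matrix.T
    tp = np.diag(m)
    fp = m.sum(axis=1) - tp   # predicted as the class, but actually something else
    fn = m.sum(axis=0) - tp   # instances of the class that were not detected as it
    return tp, fp, fn
```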
❔Question
I want to get TP to analyze my own dataset, but I cannot output it after running test.py.