no accuracy metrics while training #1839
Comments
Try using custom_callbacks. Computing accuracy takes a long time, so don't run it every epoch. mean_average_precision_callback = modellib.MeanAveragePrecisionCallback(model, model.train(dataset_train, dataset_val, |
Also add this somewhere in model.py, for example at the end of the file:
|
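The advice above (run the expensive metric only every few epochs) can be sketched without any framework. This is not the MeanAveragePrecisionCallback from this thread: `PeriodicMetric`, `evaluate_fn`, `every_n_epochs`, and the metric name are hypothetical names chosen for illustration. A real Keras callback would subclass `keras.callbacks.Callback`, whose `on_epoch_end(epoch, logs)` returns nothing; the stand-in returns `logs` only to make the behaviour easy to inspect.

```python
class PeriodicMetric:
    """Framework-free stand-in for a Keras-style callback (hypothetical name)."""

    def __init__(self, evaluate_fn, every_n_epochs=3,
                 name="val_mean_average_precision"):
        if every_n_epochs < 1:
            raise ValueError("every_n_epochs must be >= 1")
        self.evaluate_fn = evaluate_fn        # expensive evaluation, e.g. mAP
        self.every_n_epochs = every_n_epochs
        self.name = name

    def on_epoch_end(self, epoch, logs=None):
        # Keras passes 0-based epoch indices, so epoch 2 is the third epoch.
        logs = logs if logs is not None else {}
        if (epoch + 1) % self.every_n_epochs == 0:
            logs[self.name] = self.evaluate_fn()
        return logs


calls = []


def fake_evaluate():
    # Placeholder for the real (slow) evaluation on the validation set.
    calls.append(1)
    return 0.5


cb = PeriodicMetric(fake_evaluate, every_n_epochs=3)
evaluated_epochs = [e for e in range(9) if cb.name in cb.on_epoch_end(e, {})]
```

With nine simulated epochs, the evaluation function runs only on every third one, so the cost is paid three times instead of nine.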
Hello guys, thanks for sharing the code to get accuracy metrics. I tried to implement the code but I get errors :( Then, in train.py I defined the variable like you did @VtlNmnk: But when I launch my training, I have this error:
What variable do you expect for model inference? Thanks for your time, |
I did not tell you to change anything in train.py ) |
Hi, I am sorry but I am trying to implement your code but the dataset: Dataset in the def init part when defining the callback comes up as "Dataset not defined". Thank you for any suggestions :) |
Hi @ben975, you can use the class below to calculate mAP, precision, and recall for each image
Usage : prepare data set:
Create object of config and load model
|
Wow, thanks for the really fast reply! |
Hi @mhtarora39 I'm sorry but when I run it I seem to get: File "", line 1, in File "", line 28, in evaluate_model NameError: name 'compute_ap' is not defined Thanks again for your help, I am new to Mask R-CNNs |
Sorry, I didn't mention it: please import compute_ap with "from mrcnn.utils import compute_ap". I am also updating the code above with the import. Let me know if you encounter any other issue and I will update the code accordingly |
Seems to be running well, thank you so much! |
Hi @VtlNmnk I have made the changes that were mentioned by you and was able to run the code without any errors, but when I start training mAP is not getting printed. |
Hi, @hardikmanek! Can you show the code where you initialize the callback? |
Hi @VtlNmnk Please find the attached screenshot. I am passing the training command as a command-line argument This is the initial code of the function copied in the last part of model.py By the way, the complete execution command for the program is: I got the mAP value after 9th Epoch which is 0.3, not sure what's wrong Thank you so much. |
Can anyone in this thread explain how to get the loss output logging that we see in hardikmanek's screenshot? I'm talking about the various losses logged to stdout - I'm only seeing the total loss and none of the rest (like rpn_class_loss, rpn_box_loss, etc.). |
Hi @VtlNmnk, I used your code but got an error. model.py:
my train.py: thanks a lot!! |
set verbose = 1 |
You posted your model.py code in a way that the formatting is not visible ) |
Sorry for the late reply. Well, the mAP was calculated, but not after the third epoch. Perhaps there were no saved models to compute it from earlier? Check how often your models are saved. |
I customized the "https://github.com/matterport/Mask_RCNN.git" repository to train on my own dataset for object detection, ignoring the mask segmentation part. Now I am evaluating my results: I can calculate the mAP, but I cannot calculate the F1 score. I have this function, compute_ap, from "https://github.com/matterport/Mask_RCNN/blob/master/mrcnn/utils.py", which returns "mAP, precisions, recalls, overlaps" for each image. The point is that I cannot apply the F1 score formula directly, because the variables "precisions" and "recalls" are lists.
|
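One way to get an F1 value out of list-valued precision and recall is to apply the formula F1 = 2PR / (P + R) point by point and then pick one point of the resulting curve. The sketch below assumes the two lists have equal length and describe the same precision-recall curve; `f1_curve` is a hypothetical helper, the sample numbers are made up, and taking the maximum is just one possible convention, not something the Mask_RCNN repository prescribes.

```python
def f1_curve(precisions, recalls):
    """Return the F1 score at every point of a precision-recall curve."""
    if len(precisions) != len(recalls):
        raise ValueError("precisions and recalls must have the same length")
    scores = []
    for p, r in zip(precisions, recalls):
        # Define F1 as 0 where both precision and recall are 0,
        # which avoids a division by zero.
        scores.append(0.0 if p + r == 0 else 2 * p * r / (p + r))
    return scores


# Made-up values for illustration only.
precisions = [1.0, 0.5, 0.0]
recalls = [0.5, 1.0, 0.0]

scores = f1_curve(precisions, recalls)
best_f1 = max(scores)  # one possible summary: the best point on the curve
```

If a single operating point is wanted instead (for example a fixed detection confidence threshold), the F1 at that index of `scores` may be the more meaningful number to report.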
I have tried all the solutions and I'm still facing the same issue. Any solution? |
Hello, I have tried this and I'm getting mAP value as 0 after every 3 epochs.... any suggestion why is the value of mAP zero???? |
@rupa1118 I couldn't get the loss outputs. At this point I'm suspecting it might have something to do with the tensorflow/keras version you're using - maybe some older version prints these values? However, I'm not going to change the version just for printing the loss. You could also try writing a custom callback to print these numbers, I guess (I haven't tried that - instead I put some tf.Print lines in the code to see the losses for debugging (which is ugly, but was fast)). |
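The "custom callback to print these numbers" idea could look roughly like the helper below, which formats whatever per-epoch values it is handed. It assumes the values arrive as a dict of name to number, the way Keras passes `logs` to `on_epoch_end`; whether the individual losses (rpn_class_loss and so on) are present in that dict depends on the Keras version, as discussed in this thread. `format_losses` is a hypothetical name and the numbers are made up.

```python
def format_losses(epoch, logs):
    """Build one line listing every value in a Keras-style logs dict."""
    parts = ["{}: {:.4f}".format(name, float(value))
             for name, value in sorted(logs.items())]
    # Epoch indices are 0-based in Keras callbacks; print them 1-based.
    return "epoch {} - ".format(epoch + 1) + " - ".join(parts)


# Made-up values; in a real callback this would be called from on_epoch_end.
line = format_losses(0, {"loss": 1.25, "rpn_class_loss": 0.5})
print(line)
```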
@rupa1118 Does the balloon example work for you? I would try to first add the mAP to a known working example, and add the changes one at a time. Perhaps you have something wrong with the masks, as your val_loss also looks strange. |
Hello, I'm trying to reproduce @VtlNmnk's code. Using TensorFlow backend. This is my model.py file:
############################################################
#  Custom Callbacks
############################################################
from keras.callbacks import Callback
class MeanAveragePrecisionCallback(Callback):
############################################################
############################################################
Any help will be appreciated |
Hi @yoya93, try checking the train function in model.py; if it has custom_callbacks = None, please remove that None, as this function will be accepting the callback as a parameter. |
Hello @VtlNmnk thank you very much for your response. I am currently facing another problem. Training throws me this error: Traceback (most recent call last):
|
Thank you for sharing this code, and I have a small question: I am wondering why is there only dataset_val in the mean_average_precision_callback, |
Hey @VtlNmnk, is it possible to get the accuracy during training? That would be even better :) |
Hello @VtlNmnk can you explain how the validation works during the training process?? and do we need to change the mean values of the image in the configuration file?? |
It’s hard to say without additional information. |
I found I had the same issue when running on Colab. I think it's due to the Keras version they use (2.3.1 at the time of writing this). It seems the newer versions have a model.add_metric() method for adding a tensor to the list of metrics:
Note that this is immediately after the
|
Hello, thank you very much for your code. Your code is useful to me. How can I save the mAP value from training? |
In model.py, replace self.keras_model.fit_generator(..) with h=self.keras_model.fit_generator(..) and return h (Keras history object) as a result of train method of model.py. Then in your script, after h=model.train(...), refer to h.history['loss'], h.history['mrcnn_class_loss'], etc. for plotting loss diagrams. |
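Once `train` returns the Keras history object as described above, its `history` attribute is a dict mapping each metric name to a list with one value per epoch, so it can be saved for later plotting. The sketch below writes such a dict as CSV; `history_to_csv` is a hypothetical helper and the numbers are made up. It assumes any mAP value logged by a custom callback shows up under its own key, possibly with fewer entries than the losses, which is why short columns are padded with empty cells.

```python
import csv
import io


def history_to_csv(history, stream):
    """Write a Keras-style history dict (name -> per-epoch list) as CSV."""
    names = sorted(history)
    n_epochs = max((len(history[n]) for n in names), default=0)
    writer = csv.writer(stream, lineterminator="\n")
    writer.writerow(["epoch"] + names)
    for i in range(n_epochs):
        row = [i + 1]  # 1-based epoch number
        for n in names:
            values = history[n]
            # Metrics logged only on some epochs may have fewer entries.
            row.append(values[i] if i < len(values) else "")
        writer.writerow(row)


# Made-up values standing in for h.history.
example = {"loss": [1.5, 1.0], "mrcnn_class_loss": [0.5, 0.25]}
buf = io.StringIO()
history_to_csv(example, buf)
csv_text = buf.getvalue()
```

In a training script the stream would typically be a file opened with `open(path, "w", newline="")` instead of the in-memory buffer used here.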
By default, matterport adds a TensorBoard callback during training. In your log directory you will get event files along with your weights files. You can point TensorBoard at that directory to see all the losses. TensorBoard command: tensorboard --logdir "path/to/logs" --port 8888 (or another open port) |
Hi @mhtarora39, I am trying to use your code to calculate mAP, precision, and recall but get the following error:
This is my implementation:
Any suggestions would be appreciated |
Can anyone give me a suggestion on how to solve this problem after the -------- ? I'm trying to run a Mask R-CNN code. train_data/labelme_json/DSCN46592_json/img.png
|
Hey I have the exact same issue. Did you find a solution to this error?? |
Hi @VtlNmnk , how did you get the save_each_n_epoch argument? I couldn't find it in model.py I managed to run the balloon training with the custom callback MeanAveragePrecisionCallback, however the mAP calculation was never printed out during the training. Did I initialize the callback correctly?
|
Many thanks for this! I was able to get this working straightaway; the only change I made was to calculate the mAP at the end of every epoch for my purposes. Just a word of caution. I don't know if it's just me, but when viewing the plot on Tensorboard (which considers the epoch numbers to be 0-based as indicated by the steps on the X axes), the plot for It looks like this is due to the sequence in which the callbacks are called. In Therefore, when the I changed the sequence of the callbacks, to make the Tensorboard callback the last one. That corrected the problem, and I now see the mAP values at the correct step in the plot.
# The callback to save the model checkpoint
callbacks = [keras.callbacks.ModelCheckpoint(self.checkpoint_path, verbose=0, save_weights_only=True)]
# Add custom callbacks to the list
if custom_callbacks:
    callbacks += custom_callbacks
# The tensorboard callback is last so that any metric logged by the custom callbacks would be picked up as part of the same epoch
callbacks.append(keras.callbacks.TensorBoard(log_dir=self.log_dir, histogram_freq=0, write_graph=True, write_images=False))
|
Hello @VtlNmnk, below is what I did. |
Hey! You need to use 2 configurations, created on the basis of the base one, which can be imported like this |
I wrote to you about this earlier: use a separate instance of the configuration with BATCH_SIZE = 1 for inference and another, separate instance, with a different name, for training. And of course, the training one will have BATCH_SIZE = 2 or whatever your GPU can handle. |
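The two-configuration setup can be sketched as below. To keep the example runnable on its own, `BaseConfig` is a stand-in for `mrcnn.config.Config`; it assumes, as in the Mask_RCNN repository as far as I know, that the batch size is derived as IMAGES_PER_GPU * GPU_COUNT rather than set directly. `TrainConfig`, `InferenceConfig`, and the NAME value are hypothetical names for illustration.

```python
class BaseConfig:
    """Stand-in for mrcnn.config.Config (assumption, see lead-in)."""
    NAME = None
    GPU_COUNT = 1
    IMAGES_PER_GPU = 2

    def __init__(self):
        # Effective batch size is derived from the two class attributes.
        self.BATCH_SIZE = self.IMAGES_PER_GPU * self.GPU_COUNT


class TrainConfig(BaseConfig):
    NAME = "custom"
    IMAGES_PER_GPU = 2   # as many as the GPU can handle


class InferenceConfig(TrainConfig):
    # One image at a time, so the callback can evaluate image by image.
    GPU_COUNT = 1
    IMAGES_PER_GPU = 1


train_config = TrainConfig()          # pass to the training-mode model
inference_config = InferenceConfig()  # pass to the inference-mode model
```

Deriving the inference configuration from the training one keeps every other setting identical, so only the batch size differs between the two models.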
I have the same issue. How did you solve it, please? |
model_inference = modellib.MaskRCNN(mode="inference", config=_InfConfig(), model_dir=MODEL_DIR) |
Thank you, I tried this:
But still the same error |
This is my costumTrain:
# The script is modified to train on custom Labelme data.
import os
# Root directory of the project
ROOT_DIR = os.path.dirname(os.path.abspath(os.path.abspath(__file__)))
# In case of configuration
# git clone the repo and append the path of the Mask_RCNN repo to use the custom configurations
# git clone https://github.com/matterport/Mask_RCNN
sys.path.append("path_to_mask_rcnn")
maskrrr=sys.path.append("/content/drive/MyDrive/business_card_extract_information/buisness_card_extract_information/maskrcnn/Mask-RCNN-Implementation")
# Download link
# https://github.com/matterport/Mask_RCNN/releases/download/v2.0/mask_rcnn_coco.h5
if not os.path.isfile("mask_rcnn_coco.h5"):
# Path to trained weights file
COCO_WEIGHTS_PATH = os.path.join(ROOT_DIR, "mask_rcnn_coco.h5")
# Directory to save logs and model checkpoints, if not provided
# through the command line argument --logs
DEFAULT_LOGS_DIR = os.path.join(ROOT_DIR, "logs")
# Change it for your dataset's name
source="Dataset2021"
############################################################
# My Model Configurations (which you should change for your own task)
############################################################
class ModelConfig(Config):
class InferenceConfig(ModelConfig):
############################################################
# Dataset (My labelme dataset loader)
############################################################
class LabelmeDataset(utils.Dataset):
def train(dataset_train, dataset_val, model):
def test(model, image_path = None, video_path=None, savedfile=None):
############################################################
# Training and Validating
############################################################
if __name__ == '__main__':
|
I incorporated mean average precision as a callback (as mentioned above) for my validation dataset, computed every 5 epochs during training, but I am getting nan values and a runtime warning as below. I cannot figure out what might be the cause, as my validation dataset has only images with the objects in them and ground truth files as shapefiles. Any help would be greatly appreciated. Thanks |
@sohinimallick @tomjordan and anyone else having this issue: I just had this same issue today, and was stuck on it for a while. Turns out, something was overriding the default value of
So just pass |
@VtlNmnk That's an awesome idea, but can you help me plot the accuracy and losses shown in the training output? I have a project covering these metrics to submit in 3 weeks. Thanks in advance! |
|
Also I get the feeling that part of the reason lies in that the compile function predefines the loss var as None on lines 2188-2190.... correct me if I'm wrong. I appreciate any insight moving forward |
Under model.py we see the Keras model compile:
And then the metrics for Mask R-CNN are added:
But when training the model, neither train_accuracy nor val_accuracy is reported. How can I add these metrics and report them with each epoch, alongside the mrcnn metrics?