How does model.fit() calculate loss and acc? Documentation will be helpful. #10426
Loss and accuracy are calculated as you train, according to the loss and metrics specified when compiling the model. Before you train, you must compile your model to configure the learning process; this is where you specify the optimizer, loss function, and metrics, which in turn tell fit() what loss function to use, what metrics to keep track of, and so on. The loss function documentation (e.g. binary cross-entropy) and the metrics documentation (e.g. accuracy) are in the Keras docs. In summary, in order for you to train (and for fit() to calculate loss and acc), you must compile the model first; a minimal sketch is below:
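Here is a small, hedged sketch (assuming the tf.keras API; the layer sizes and dummy data are made up for illustration) showing that the loss and metric named in compile() are exactly what fit() later reports per batch:

```python
import numpy as np
from tensorflow import keras

# A tiny binary classifier; the architecture is arbitrary.
model = keras.Sequential([
    keras.layers.Dense(16, activation="relu", input_shape=(8,)),
    keras.layers.Dense(1, activation="sigmoid"),
])

# compile() configures the learning process: optimizer, loss, and metrics.
model.compile(optimizer="adam",
              loss="binary_crossentropy",
              metrics=["accuracy"])

# Dummy data just to make the example runnable.
x = np.random.rand(100, 8).astype("float32")
y = np.random.randint(0, 2, size=(100, 1))

# fit() prints "loss" and "acc"/"accuracy" computed from the settings above.
model.fit(x, y, batch_size=20, epochs=1)
```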
Thanks, that is a little bit more information, but not exactly the answer to what I was asking. My question is: what exactly is the formula used for calculating the loss and accuracy per batch (i.e. the numbers that get displayed during model.fit())? If you know where exactly that is done in the code, the answer should be right there.
Thanks again; I have seen that code, by the way. Let me try again. The following is output from a model.fit() session:
5/125 [>.............................] - ETA: 4:35 - loss: 1.1342 - acc: 0.5250
Which exact variable holds the loss (e.g. 1.1342) and the accuracy (e.g. 0.5250)?
I believe that the loss is stored here; the accuracy is handled similarly.
Thanks again. Basically my question is what, mathematically, the loss and accuracy displayed during model.fit() are. Is it averaged over the batch, or is some other function used to calculate the loss/acc that gets displayed on the screen? The answer is still not there, hence the need to keep looking until it is found and documented.
For training loss, Keras does a running average over the batches. For validation loss, a conventional average over all the batches in the validation data is performed. The training accuracy is the average of the accuracy values for each batch of training data during training.
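A minimal NumPy-free sketch of that running average, as I understand it (this mimics the idea of the cumulative, sample-weighted average shown by the progress bar; it is not the actual Keras source, and the per-batch losses are made-up numbers):

```python
# Hypothetical per-batch mean losses and batch sizes for one epoch.
batch_losses = [1.40, 1.10, 0.95, 0.90, 0.88]
batch_sizes  = [32, 32, 32, 32, 32]

total, seen = 0.0, 0
for loss, size in zip(batch_losses, batch_sizes):
    total += loss * size          # weight each batch by its number of samples
    seen += size
    # This cumulative mean is what the progress bar would display after the batch.
    print(f"displayed loss so far: {total / seen:.4f}")
```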
Good. Is the running average over just the batch, or is there carryover from previous batches? It would be great to see the code for that.
The code is here.
You are probably close. If that code is indeed responsible for fit()'s per-batch loss/acc output, then the mystery is how 'v' got into the logs dict before entering the on_batch_end() function. This function primarily seems to be for keeping track of variables for the on_epoch_end() calculations.
'v' gets added into the batch logs here.
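For anyone who wants to inspect those batch logs directly, here is a hedged sketch (assuming the tf.keras callback API; older standalone Keras used on_batch_end instead of on_train_batch_end). It just prints what fit() hands to the callback after each batch, which is the same dictionary the progress bar averages and displays:

```python
from tensorflow import keras

class BatchLogPrinter(keras.callbacks.Callback):
    def on_train_batch_end(self, batch, logs=None):
        # logs looks like {'loss': 1.1342, 'accuracy': 0.525} for that batch
        print(f"batch {batch}: {logs}")

# usage: model.fit(x, y, callbacks=[BatchLogPrinter()], verbose=0)
```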
That is great. So it is still not clear whether the per-batch loss/acc is the average over just that batch or something else.
Sorry, how does Keras calculate loss (e.g. mse) when doing predict_on_batch? For example, if you have 10 batches, will the model normally return 10 losses?
@raymond-yuan When we choose "acc" to calculate accuracy, how does Keras choose the calculation function? I found that there are several accuracy functions.
@DayChan, check the answer on this page: https://stackoverflow.com/a/59637143/5536388.
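To make the difference between the accuracy variants concrete, here is a hedged NumPy sketch of two of the functions "acc" can be mapped to (which one Keras actually dispatches to depends on the loss and output shape, per the linked answer; the threshold of 0.5 and the sample values below are illustrative assumptions):

```python
import numpy as np

def binary_accuracy(y_true, y_pred, threshold=0.5):
    # Fraction of predictions falling on the correct side of the threshold.
    return np.mean((y_pred > threshold).astype(int) == y_true)

def categorical_accuracy(y_true, y_pred):
    # Fraction of samples whose argmax matches the one-hot label's argmax.
    return np.mean(np.argmax(y_pred, axis=-1) == np.argmax(y_true, axis=-1))

y_true_bin = np.array([1, 0, 1, 1])
y_pred_bin = np.array([0.9, 0.2, 0.4, 0.8])
print(binary_accuracy(y_true_bin, y_pred_bin))        # 0.75

y_true_cat = np.eye(3)[[0, 2, 1]]                     # one-hot labels
y_pred_cat = np.array([[0.7, 0.2, 0.1],
                       [0.1, 0.3, 0.6],
                       [0.5, 0.4, 0.1]])
print(categorical_accuracy(y_true_cat, y_pred_cat))   # ~0.667
```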
After some anxiety reading this, I believe the conclusion is:
Did I understand correctly?
All good except the last point, about training. I'll sum this up again, plus some extras:
I tried this line of code:
And got:
See how loss and acc are not the same. We have two values here, the 'computation' value (in this case …). Another option (which I think is less likely) is that the return value is …
@NEGU93, I followed the code again; here is how it works:
That's how the code works. You can check that what I say is true by running this code for 2 "batches"/examples: https://colab.research.google.com/drive/12gTrW-k0TntAawVjXv-tVLp4lmM5lFXx?usp=sharing. I created this in the past to simulate how metrics and loss functions work in NumPy. I'm not sure what causes the difference in your code.
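In the same spirit as that notebook (this is my own minimal sketch, not copied from it), the displayed accuracy can be thought of as a stateful mean that accumulates across batches; result() is the value the progress bar would show after each batch. The class name and the per-sample correctness values are made up for illustration:

```python
import numpy as np

class RunningMean:
    """Accumulates values across batches and reports their overall mean."""
    def __init__(self):
        self.total = 0.0
        self.count = 0

    def update_state(self, values):
        values = np.asarray(values, dtype=float)
        self.total += values.sum()
        self.count += values.size
        return self.result()

    def result(self):
        return self.total / max(self.count, 1)

acc_metric = RunningMean()
# Two "batches" of per-sample correctness (1 = correct prediction).
print(acc_metric.update_state([1, 0, 1, 1]))   # 0.75  after batch 1
print(acc_metric.update_state([1, 1, 1, 1]))   # 0.875 over both batches
```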
I think it uses functions like binary_crossentropy, binary_accuracy, etc. Is any smoothing used? Some description in the documentation would be good (pointers to the exact code would also help).
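For reference, here is a hedged NumPy sketch of per-sample binary cross-entropy averaged over a batch; the only "smoothing" assumed here is clipping predictions away from 0 and 1 for numerical stability, and the sample values are invented:

```python
import numpy as np

def binary_crossentropy(y_true, y_pred, eps=1e-7):
    y_pred = np.clip(y_pred, eps, 1 - eps)   # avoid log(0)
    per_sample = -(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))
    return per_sample.mean()                 # batch loss = mean over samples

y_true = np.array([1, 0, 1, 1], dtype=float)
y_pred = np.array([0.9, 0.2, 0.4, 0.8])
print(binary_crossentropy(y_true, y_pred))   # ~0.367
```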