Floating point exception #159

morusu · 2016-04-26T11:00:00Z

After thousands of iterations, Faster R-CNN throws the error "Floating point exception" at ./experiments/scripts/faster_rcnn_end2end.sh. Searching for the error suggests it is caused by an integer division or modulo by zero (i/0 or i%0). Has anyone encountered this?

wait1988 · 2016-04-28T07:08:20Z

I encountered a similar problem.

Solving...
I0428 15:05:27.513572 6443 solver.cpp:242] Iteration 0, loss = 4.65389
I0428 15:05:27.513619 6443 solver.cpp:258] Train net output #0: loss_bbox = 0.190101 (* 1 = 0.190101 loss)
I0428 15:05:27.513628 6443 solver.cpp:258] Train net output #1: loss_cls = 3.44897 (* 1 = 3.44897 loss)
I0428 15:05:27.513635 6443 solver.cpp:258] Train net output #2: rpn_cls_loss = 0.900724 (* 1 = 0.900724 loss)
I0428 15:05:27.513643 6443 solver.cpp:258] Train net output #3: rpn_loss_bbox = 0.119607 (* 1 = 0.119607 loss)
I0428 15:05:27.513656 6443 solver.cpp:571] Iteration 0, lr = 0.001
./experiments/scripts/faster_rcnn_end2end.sh: line 57: 6443 Floating point exception(core dumped) ./tools/train_net.py --gpu ${GPU_ID} --solver models/${PT_DIR}/${NET}/faster_rcnn_end2end/solver.prototxt --weights data/imagenet_models/${NET}.v2.caffemodel --imdb ${TRAIN_IMDB} --iters ${ITERS} --cfg experiments/cfgs/faster_rcnn_end2end.yml ${EXTRA_ARGS}

weichengkuo · 2016-05-04T21:24:47Z

I got the same error, and it turned out that I was feeding in an empty boxes array. Filtering the roidb properly fixed my problem.

wait1988 · 2016-05-04T23:17:31Z

What does "filtering out roidb properly" mean? Would you please give us more details?

smasoudn · 2016-05-09T19:18:51Z

I've got the same error. When I change the default RNG_SEED value, the error occurs at a different iteration. Have you guys found a solution yet? @weichengkuo, I would be thankful if you could elaborate a little more. Where should I filter the empty boxes? Thanks!

smichalowski · 2016-05-09T19:55:27Z

Take a look at #65.

weichengkuo · 2016-05-11T03:14:51Z

It's possible that some layer of your Faster R-CNN receives no boxes at some iteration. I ran into this error multiple times, and it was often due to empty boxes. Filtering the roidb means removing the roidb entries that could cause this problem.
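As a rough illustration of the filtering described above, here is a minimal sketch that drops roidb entries containing no boxes at all. It assumes each roidb entry is a dict with a 'boxes' array of shape (N, 4); the function name drop_empty_entries is hypothetical and not part of py-faster-rcnn.

```python
import numpy as np


def drop_empty_entries(roidb):
    """Return only the roidb entries that contain at least one box.

    Assumes each entry is a dict whose 'boxes' value is an (N, 4) array.
    """
    kept = [entry for entry in roidb if len(entry['boxes']) > 0]
    print('Dropped {} of {} entries with no boxes'.format(
        len(roidb) - len(kept), len(roidb)))
    return kept


# Tiny made-up roidb: one entry without boxes, one with a single box.
demo_roidb = [
    {'boxes': np.zeros((0, 4))},
    {'boxes': np.array([[0, 0, 10, 10]])},
]
demo_kept = drop_empty_entries(demo_roidb)
```

Running a check like this once before training also tells you how many images are affected, which helps decide whether the annotations themselves need fixing.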

daf11865 · 2016-05-12T11:01:28Z

How can this be solved, please?

morusu · 2016-06-02T04:01:57Z

Zero-padding the original images to a reasonable aspect ratio (600*1000) will solve this problem.

LiberiFatali · 2016-06-09T03:47:25Z

@morusu So where do we need to modify the code to 'pad 0s the original image'?

buaaliyi · 2016-06-20T04:22:26Z

How do we change the code to 'pad 0 the original image', or do we still need to preprocess the images first?
Can you give us an example? Thanks

morusu · 2016-06-22T09:49:56Z

@buaaliyi @LiberiFatali Preprocess the images first: padding zeros on the right or bottom side of each image to reach a reasonable aspect ratio will be fine.
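A minimal sketch of that preprocessing step, assuming images are NumPy arrays of shape (H, W) or (H, W, C). The function name and the default limit of 1000/600 (one reading of the "600*1000" mentioned above) are assumptions for illustration, not values taken from the py-faster-rcnn code.

```python
import numpy as np


def pad_to_max_aspect_ratio(image, max_ratio=1000.0 / 600.0):
    """Zero-pad an image on the right or bottom so that
    long_side / short_side <= max_ratio.

    The original pixels stay in the top-left corner, so existing
    box annotations remain valid without any coordinate shift.
    """
    height, width = image.shape[:2]
    new_height, new_width = height, width
    if width > max_ratio * height:
        # Too wide: add rows of zeros at the bottom.
        new_height = int(np.ceil(width / max_ratio))
    elif height > max_ratio * width:
        # Too tall: add columns of zeros on the right.
        new_width = int(np.ceil(height / max_ratio))
    padded = np.zeros((new_height, new_width) + image.shape[2:],
                      dtype=image.dtype)
    padded[:height, :width] = image
    return padded
```

Padding only on the right or bottom is what keeps the annotations untouched; padding on the left or top would require shifting every box.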

LiberiFatali · 2016-07-25T02:29:52Z

I got this error while using old code. This problem is solved for me by applying

def filter_roidb(roidb):
    """Remove roidb entries that have no usable RoIs."""

in https://github.com/rbgirshick/py-faster-rcnn/blob/d66cc2bff142ca07f521db06ca3e9e10dbc8df20/lib/fast_rcnn/train.py

vra · 2016-11-18T07:12:14Z

@LiberiFatali Thanks, your solution solved my problem!

fernandorovai · 2016-12-14T19:36:14Z

@vra Where did you apply the filter_roidb function? It is already called in the train_net() function (fast_rcnn/train.py). I am facing the same problem that @morusu described. Suddenly my loss goes to nan (overflow encountered in exp). I am using the PascalVoc dataset and have no clue what the problem is. Has anyone solved this issue? Thank you!

vra · 2016-12-15T05:25:56Z

Hi @fernandorovai,
Sorry, I should have been clearer. I am using RstarCNN, which includes rbg's fast-rcnn repo. In fast-rcnn there is no filter_roidb function. When I added this function to it, my problem was solved.
Did you try lowering your learning rate? As far as I know, the nan problem is always related to a large learning rate.
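The "overflow encountered in exp" warning mentioned above typically appears when predicted box-size deltas grow very large before they are exponentiated. Besides lowering the learning rate, one commonly used guard is to clamp the deltas before calling np.exp. The sketch below shows the idea; the function name and the cap of log(1000/16) are assumptions for illustration and are not part of the code discussed in this thread.

```python
import numpy as np

# Assumed cap: a delta above this would scale a box side by more than 1000/16.
MAX_DELTA = np.log(1000.0 / 16.0)


def safe_exp_scale(deltas, sizes, max_delta=MAX_DELTA):
    """Scale box sizes by exp(deltas), clamping the deltas first
    so that np.exp cannot overflow."""
    deltas = np.minimum(np.asarray(deltas, dtype=np.float64), max_delta)
    return np.exp(deltas) * np.asarray(sizes, dtype=np.float64)
```

A clamp like this only hides the symptom. If the deltas are exploding, the training run has probably already diverged, so it is still worth checking the learning rate and the input boxes.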

June-Jo · 2017-01-11T10:42:16Z

@vra Hello, did it work when you added filter_roidb to train.py? In my case the filter_roidb function is already there, but I still get the 'floating point exception'. I tried changing the learning rate and the RNG_SEED, but it did not help.

hanjf12 · 2017-04-11T06:38:32Z

@hyunjun-jo Hello, I have the same problem too. I tried changing the learning rate and the RNG_SEED, but that did not help either. Have you solved the problem? Thanks

zqdeepbluesky · 2018-03-12T15:28:54Z

@morusu @wait1988 @weichengkuo @smasoudn @smichalowski
Hi, when I train FPN on my own dataset, I get this error:

I0312 16:25:25.883342 2983 sgd_solver.cpp:106] Iteration 0, lr = 0.0005
/home/zq/FPN/tools/../lib/rpn/proposal_layer.py:175: RuntimeWarning: invalid value encountered in greater_equal
keep = np.where((ws >= min_size) & (hs >= min_size))[0]
Floating point exception (core dumped)

I tried changing lr from 0.001 to 0.0001, but it didn't work. I also changed RNG_SEED, and that didn't work either.
I don't know how to solve it. Please help me, thanks so much!
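The "invalid value encountered in greater_equal" warning in the log above suggests that the box widths and heights already contain NaN by the time they are compared with min_size. A minimal sketch of a size filter that also rejects non-finite boxes is shown below. The function name is hypothetical, and the "+ 1" assumes inclusive pixel coordinates.

```python
import numpy as np


def filter_boxes_finite(boxes, min_size):
    """Return indices of boxes that have finite coordinates and
    whose width and height are both >= min_size."""
    boxes = np.asarray(boxes, dtype=np.float64).reshape(-1, 4)
    finite = np.isfinite(boxes).all(axis=1)
    # NaN/inf arithmetic and comparisons can emit "invalid value"
    # warnings; silence them here because such rows are rejected
    # through the `finite` mask anyway.
    with np.errstate(invalid='ignore'):
        ws = boxes[:, 2] - boxes[:, 0] + 1
        hs = boxes[:, 3] - boxes[:, 1] + 1
        big_enough = (ws >= min_size) & (hs >= min_size)
    return np.where(finite & big_enough)[0]
```

If a guard like this ends up keeping zero proposals, the NaN values come from earlier in the network, and the cause is more likely bad input boxes or diverging weights than the filter itself.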

amlandas78 · 2018-05-10T10:24:53Z

Has anyone solved the problem? I get the same error at iteration 5800 with a learning rate of 0.001 and at iteration 18800 with 0.0001. If someone has solved the problem, please help me solve it.

st20080675 · 2019-10-17T08:43:39Z

I have solved my 'Floating point exception (core dumped)' problem by modifying the function 'is_valid' in function 'filter_roidb' in file da-faster-rcnn-master/lib/fast_rcnn/train.py:

def filter_roidb(roidb):
    """Remove roidb entries that have no usable RoIs."""

    def is_valid(entry):
        # Valid images have:
        #   (1) At least one foreground RoI OR
        #   (2) At least one background RoI
        overlaps = entry['max_overlaps']
        # added to handle empty boxes, see https://github.com/rbgirshick/py-faster-rcnn/issues/159
        not_empty = np.zeros(len(entry['max_overlaps']), dtype=bool)
        cur_boxes = entry['boxes']
        for i in range(len(not_empty)):
            if (cur_boxes[i][2] - cur_boxes[i][0] > 1 and cur_boxes[i][3] - cur_boxes[i][1] > 1):
                not_empty[i] = True

        # find boxes with sufficient overlap
        fg_inds = np.where(overlaps >= cfg.TRAIN.FG_THRESH)[0]
        # Select background RoIs as those within [BG_THRESH_LO, BG_THRESH_HI)
        bg_inds = np.where((overlaps < cfg.TRAIN.BG_THRESH_HI) &
                           (overlaps >= cfg.TRAIN.BG_THRESH_LO) & not_empty)[0]

        # image is only valid if such boxes exist
        valid = len(fg_inds) > 0 or len(bg_inds) > 0

        return valid
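The per-box loop above can also be written as a vectorized check. The sketch below is one way to do it, with a hypothetical function name. It converts the boxes to a signed floating-point type before subtracting, on the assumption that boxes may be stored in an unsigned dtype, where x2 < x1 would wrap around to a large positive width instead of a negative one.

```python
import numpy as np


def non_degenerate_mask(boxes, min_side=1):
    """Boolean mask of boxes whose width and height both exceed min_side.

    Boxes are (x1, y1, x2, y2) rows. Converting to float64 first avoids
    wrap-around when the input array uses an unsigned integer dtype.
    """
    boxes = np.asarray(boxes, dtype=np.float64).reshape(-1, 4)
    widths = boxes[:, 2] - boxes[:, 0]
    heights = boxes[:, 3] - boxes[:, 1]
    return (widths > min_side) & (heights > min_side)
```

The resulting mask can be used in place of not_empty, and it can equally be applied to the foreground selection if degenerate foreground boxes should be excluded as well.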

morusu closed this as completed Jun 2, 2016

eragonruan mentioned this issue Dec 14, 2017

how to improve the performance in detecting small words? eragonruan/text-detection-ctpn#52

Closed
