obj detection training with faster_rcnn_inception_v2_coco Argument must be a dense tensor: range(0, 3) - got shape [3], but wanted []. #3948

FalcoGer · 2018-04-11T09:04:17Z

System information

What is the top-level directory of the model you are using:
TensorFlowModels\research\object_detection
Have I written custom code (as opposed to using a stock example script provided in TensorFlow):
No, though PycocoAPI from here
The problem also appeared without those modifications I made:

diff --git a/research/object_detection/core/box_predictor.py b/research/object_detection/core/box_predictor.py
index 6a13970..298a1dc 100644
--- a/research/object_detection/core/box_predictor.py
+++ b/research/object_detection/core/box_predictor.py
@@ -392,7 +392,7 @@ class MaskRCNNBoxPredictor(BoxPredictor):
         the proposals.
     """
     spatial_averaged_image_features = tf.reduce_mean(image_features, [1, 2],
-                                                     keep_dims=True,
+                                                     keepdims=True,
                                                      name='AvgPool')
     flattened_image_features = slim.flatten(spatial_averaged_image_features)
     if self._use_dropout:
diff --git a/research/object_detection/core/losses.py b/research/object_detection/core/losses.py
index 8bc044c..70959b6 100644
--- a/research/object_detection/core/losses.py
+++ b/research/object_detection/core/losses.py
@@ -311,7 +311,7 @@ class WeightedSoftmaxClassificationLoss(Loss):
     num_classes = prediction_tensor.get_shape().as_list()[-1]
     prediction_tensor = tf.divide(
         prediction_tensor, self._logit_scale, name='scale_logit')
-    per_row_cross_ent = (tf.nn.softmax_cross_entropy_with_logits(
+    per_row_cross_ent = (tf.nn.softmax_cross_entropy_with_logits_v2(
         labels=tf.reshape(target_tensor, [-1, num_classes]),
         logits=tf.reshape(prediction_tensor, [-1, num_classes])))
     return tf.reshape(per_row_cross_ent, tf.shape(weights)) * weights
diff --git a/research/object_detection/trainer.py b/research/object_detection/trainer.py
index cf3429a..196af61 100644
--- a/research/object_detection/trainer.py
+++ b/research/object_detection/trainer.py
@@ -225,7 +225,7 @@ def train(create_tensor_dict_fn, create_model_fn, train_config, master, task,

     # Place the global step on the device storing the variables.
     with tf.device(deploy_config.variables_device()):
-      global_step = slim.create_global_step()
+      global_step = tf.train.create_global_step()

     with tf.device(deploy_config.inputs_device()):
       input_queue = create_input_queue(

OS Platform and Distribution (e.g., Linux Ubuntu 16.04):
Windows 7 x64
TensorFlow installed from (source or binary):
Installed with pip
TensorFlow version (use command below):
1.7.0
Bazel version (if compiling from source):
CUDA/cuDNN version:
CPU version
GPU model and memory:
CPU version
Exact command to reproduce:
python train.py --pipeline_config_path="PATH_TO_MODELS/Models/faster_rcnn_inception_v2_coco/pipeline.config" --train_dir="PATH_TO_MODELS/Models/faster_rcnn_inception_v2_coco/train"

Describe the problem

Trying to run the command above with TFRecords built in the form of

tf_example = tf.train.Example(features=tf.train.Features(feature={
        'image/height': dataset_util.int64_feature(height),
        'image/width': dataset_util.int64_feature(width),
        'image/filename': dataset_util.bytes_feature(filename),
        'image/source_id': dataset_util.bytes_feature(filename),
        'image/encoded': dataset_util.bytes_feature(encoded_jpg),
        'image/format': dataset_util.bytes_feature(image_format),
        'image/object/bbox/xmin': dataset_util.float_list_feature(xmins),
        'image/object/bbox/xmax': dataset_util.float_list_feature(xmaxs),
        'image/object/bbox/ymin': dataset_util.float_list_feature(ymins),
        'image/object/bbox/ymax': dataset_util.float_list_feature(ymaxs),
        'image/object/class/text': dataset_util.bytes_list_feature(classes_text),
        'image/object/class/label': dataset_util.int64_list_feature(classes),
    }))

and the faster_rcnn_inception_v2_coco model from 2018_01_28 creates the following errors:

Source code / logs

python : WARNING:tensorflow:From C:\Program Files\Python36\lib\site-packages\tensorflow\contrib\learn\python\learn\datasets\base.py:198: retry (from tensorflow.contrib.learn.python.learn.datasets.base) is deprecated and will be removed in a future version.
Instructions for updating:
Use the retry module or similar alternatives.
INFO:tensorflow:Scale of 0 disables regularizer.
INFO:tensorflow:Scale of 0 disables regularizer.
INFO:tensorflow:Scale of 0 disables regularizer.
INFO:tensorflow:Scale of 0 disables regularizer.
INFO:tensorflow:depth of additional conv before box predictor: 0
Traceback (most recent call last):
  File "C:\Program Files\Python36\lib\site-packages\tensorflow\python\framework\op_def_library.py", line 510, in _apply_op_helper
    preferred_dtype=default_dtype)
  File "C:\Program Files\Python36\lib\site-packages\tensorflow\python\framework\ops.py", line 1040, in internal_convert_to_tensor
    ret = conversion_func(value, dtype=dtype, name=name, as_ref=as_ref)
  File "C:\Program Files\Python36\lib\site-packages\tensorflow\python\framework\constant_op.py", line 235, in _constant_tensor_conversion_function
    return constant(v, dtype=dtype, name=name)
  File "C:\Program Files\Python36\lib\site-packages\tensorflow\python\framework\constant_op.py", line 214, in constant
    value, dtype=dtype, shape=shape, verify_shape=verify_shape))
  File "C:\Program Files\Python36\lib\site-packages\tensorflow\python\framework\tensor_util.py", line 442, in make_tensor_proto
    _GetDenseDimensions(values)))
ValueError: Argument must be a dense tensor: range(0, 3) - got shape [3], but wanted [].
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "C:\Program Files\Python36\lib\site-packages\tensorflow\python\framework\op_def_library.py", line 524, in _apply_op_helper
    values, as_ref=input_arg.is_ref).dtype.name
  File "C:\Program Files\Python36\lib\site-packages\tensorflow\python\framework\ops.py", line 1040, in internal_convert_to_tensor
    ret = conversion_func(value, dtype=dtype, name=name, as_ref=as_ref)
  File "C:\Program Files\Python36\lib\site-packages\tensorflow\python\framework\constant_op.py", line 235, in _constant_tensor_conversion_function
    return constant(v, dtype=dtype, name=name)
  File "C:\Program Files\Python36\lib\site-packages\tensorflow\python\framework\constant_op.py", line 214, in constant
    value, dtype=dtype, shape=shape, verify_shape=verify_shape))
  File "C:\Program Files\Python36\lib\site-packages\tensorflow\python\framework\tensor_util.py", line 442, in make_tensor_proto
    _GetDenseDimensions(values)))
ValueError: Argument must be a dense tensor: range(0, 3) - got shape [3], but wanted [].
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "train.py", line 167, in <module>
    tf.app.run()
  File "C:\Program Files\Python36\lib\site-packages\tensorflow\python\platform\app.py", line 126, in run
    _sys.exit(main(argv))
  File "train.py", line 163, in main
    worker_job_name, is_chief, FLAGS.train_dir)
  File "C:\temp\ObjDetection\TensorFlowModels\research\object_detection\trainer.py", line 255, in train
    train_config.optimizer)
  File "C:\temp\ObjDetection\TensorFlowModels\research\object_detection\builders\optimizer_builder.py", line 50, in build
    learning_rate = _create_learning_rate(config.learning_rate)
  File "C:\temp\ObjDetection\TensorFlowModels\research\object_detection\builders\optimizer_builder.py", line 109, in _create_learning_rate
    learning_rate_sequence, config.warmup)
  File "C:\temp\ObjDetection\TensorFlowModels\research\object_detection\utils\learning_schedules.py", line 169, in manual_stepping
    [0] * num_boundaries))
  File "C:\Program Files\Python36\lib\site-packages\tensorflow\python\ops\array_ops.py", line 2650, in where
    return gen_math_ops.select(condition=condition, x=x, y=y, name=name)
  File "C:\Program Files\Python36\lib\site-packages\tensorflow\python\ops\gen_math_ops.py", line 7112, in select
    "Select", condition=condition, t=x, e=y, name=name)
  File "C:\Program Files\Python36\lib\site-packages\tensorflow\python\framework\op_def_library.py", line 528, in _apply_op_helper
    (input_name, err))
ValueError: Tried to convert 't' to a tensor and failed. Error: Argument must be a dense tensor: range(0, 3) - got shape [3], but wanted [].

Note: I last used the detection API when TensorFlow 1.4 was current. Since updating, I can no longer run ssd_mobilenet_v1, because the pretrained model checkpoint seems to mismatch the configuration. So I decided to try faster_rcnn and got this issue instead.

my pipeline config:

model {
  faster_rcnn {
    num_classes: 1
    image_resizer {
      keep_aspect_ratio_resizer {
        min_dimension: 600
        max_dimension: 1024
      }
    }
    feature_extractor {
      type: "faster_rcnn_resnet101"
      first_stage_features_stride: 16
    }
    first_stage_anchor_generator {
      grid_anchor_generator {
        height_stride: 16
        width_stride: 16
        scales: 0.25
        scales: 0.5
        scales: 1.0
        scales: 2.0
        aspect_ratios: 0.5
        aspect_ratios: 1.0
        aspect_ratios: 2.0
      }
    }
    first_stage_box_predictor_conv_hyperparams {
      op: CONV
      regularizer {
        l2_regularizer {
          weight: 0.0
        }
      }
      initializer {
        truncated_normal_initializer {
          stddev: 0.01
        }
      }
    }
    first_stage_nms_score_threshold: 0.0
    first_stage_nms_iou_threshold: 0.70
    first_stage_max_proposals: 100
    first_stage_localization_loss_weight: 2.0
    first_stage_objectness_loss_weight: 1.0
    initial_crop_size: 14
    maxpool_kernel_size: 2
    maxpool_stride: 2
    second_stage_box_predictor {
      mask_rcnn_box_predictor {
        fc_hyperparams {
          op: FC
          regularizer {
            l2_regularizer {
              weight: 0.0
            }
          }
          initializer {
            variance_scaling_initializer {
              factor: 1.0
              uniform: true
              mode: FAN_AVG
            }
          }
        }
        use_dropout: false
        dropout_keep_probability: 1.0
      }
    }
    second_stage_post_processing {
      batch_non_max_suppression {
        score_threshold: 0.30
        iou_threshold: 0.60
        max_detections_per_class: 100
        max_total_detections: 100
      }
      score_converter: SOFTMAX
    }
    second_stage_localization_loss_weight: 2.0
    second_stage_classification_loss_weight: 1.0
  }
}
train_config {
  batch_size: 1
  data_augmentation_options {
    random_horizontal_flip {
    }
  }
  optimizer {
    momentum_optimizer {
      learning_rate {
        manual_step_learning_rate {
          initial_learning_rate: 0.0003
# ref: https://github.com/tensorflow/models/issues/3794
#          schedule {
#            step: 0
#            learning_rate: 0.0003
#          }
          schedule {
            step: 900000
            learning_rate: 3.0e-05
          }
          schedule {
            step: 1200000
            learning_rate: 3.0e-06
          }
        }
      }
      momentum_optimizer_value: 0.90
    }
    use_moving_average: false
  }
  gradient_clipping_by_norm: 10.0
  fine_tune_checkpoint: "PATH_TO_PROJECT/Models/faster_rcnn_resnet101_coco/train/model.ckpt"
  from_detection_checkpoint: true
}

# cd							cd C:/temp/ObjDetection/TensorFlowModels/research/object_detection
# train:						python train.py --pipeline_config_path="PATH_TO_PROJECT/Models/faster_rcnn_resnet101_coco/pipeline.config" --train_dir="PATH_TO_PROJECT/Models/faster_rcnn_resnet101_coco/train"
# eval:							python eval.py --pipeline_config_path="PATH_TO_PROJECT/Models/faster_rcnn_resnet101_coco/pipeline.config" --eval_dir="PATH_TO_PROJECT/Models/faster_rcnn_resnet101_coco/eval" --checkpoint_dir="PATH_TO_PROJECT/Models/faster_rcnn_resnet101_coco/train"
# tb:							tensorboard --logdir="PATH_TO_PROJECT/Models/faster_rcnn_resnet101_coco/"
# extract graph:				python export_inference_graph.py --input_type image_tensor --pipeline_config_path "PATH_TO_PROJECT/Models/faster_rcnn_resnet101_coco/pipeline.config" --output_directory "PATH_TO_PROJECT/ExportGraph" --trained_checkpoint_prefix "PATH_TO_PROJECT/Models/faster_rcnn_resnet101_coco/train/model.ckpt-REPLACETHIS"

train_input_reader {
  label_map_path: "PATH_TO_PROJECT/Data/labelMap.txt"
  shuffle: true
  tf_record_input_reader {
    input_path: "PATH_TO_PROJECT/Data/train.tfrecord"
  }
}
eval_config {
  num_examples: 8000
  max_evals: 10
  use_moving_averages: false
#  metrics_set: "coco_detection_metrics"
}
eval_input_reader {
  label_map_path: "PATH_TO_PROJECT/Data/labelMap.txt"
  shuffle: false
  num_readers: 1
  tf_record_input_reader {
    input_path: "PATH_TO_PROJECT/Data/val.tfrecord"
  }
}

PATH_TO_PROJECT is of course not what is actually written there. The real path would give away what I'm working on, and I signed an NDA, so I replaced it here.
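For readers following the manual_step_learning_rate block in the config above, here is a minimal pure-Python sketch of the piecewise-constant schedule it describes. The helper name learning_rate_at is hypothetical, and this is not how the Object Detection API implements it (the API builds the schedule out of TensorFlow ops in learning_schedules.py); the numbers are copied from the config.

```python
import bisect

# Values copied from the manual_step_learning_rate block in the config.
INITIAL_LEARNING_RATE = 0.0003
SCHEDULE = [(900000, 3.0e-05), (1200000, 3.0e-06)]  # (step, learning_rate)


def learning_rate_at(global_step,
                     initial_rate=INITIAL_LEARNING_RATE,
                     schedule=SCHEDULE):
    """Return the learning rate in effect at global_step (hypothetical helper)."""
    boundaries = [step for step, _ in schedule]
    rates = [initial_rate] + [rate for _, rate in schedule]
    # bisect_right counts how many boundaries are <= global_step,
    # which is the index of the rate that applies at that step.
    return rates[bisect.bisect_right(boundaries, global_step)]


for step in (0, 900000, 1200000):
    print(step, learning_rate_at(step))
```

With two schedule entries there are three rates in total, which lines up with the range(0, 3) seen in the error message if the schedule code indexes one entry per rate.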


FalcoGer · 2018-04-11T09:25:11Z

I just uninstalled TensorFlow 1.7.0 and installed 1.5.0, then reran the protoc compilation and the setup.py install of the TensorFlow models.
The problem persists.
The same problem is also present with faster_rcnn_resnet101_coco.
The same problem is also present with rfcn_resnet101_coco.

SSD runs fine apart from #3922, and it seems to train decently enough.

elkalash · 2018-04-12T02:13:28Z

I'm actually having the same issue here:
WARNING:tensorflow:From "path" is deprecated and will be removed in a future version.
Instructions for updating:
Use the retry module or similar alternatives.
Traceback (most recent call last):
File "src/run_webcap.py", line 32, in <module>

FalcoGer · 2018-04-12T14:19:27Z

@elkalash the deprecation warning is not the issue here.

robieta · 2018-04-12T16:43:33Z

Hi. Could you modify object_detection\utils\learning_schedules.py to print out global_step, boundaries, and num_boundaries and report the values?

Lanbig · 2018-04-12T20:39:23Z

Got the same issue.

ValueError: Tried to convert 't' to a tensor and failed. Error: Argument must be a dense tensor: range(0, 3) - got shape [3], but wanted [].

Found the solution : #3705 (comment)

robieta · 2018-04-12T21:22:03Z

That does indeed seem to be the case. Thanks a lot for finding and linking that.
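For anyone landing here without following the link: the traceback above ends in tf.where inside manual_stepping being handed range(0, 3). In Python 3, range is a lazy sequence type rather than a list, and the error text suggests TensorFlow could not convert it into a dense tensor. The following is a minimal sketch of the idea, assuming the fix amounts to materialising the range into a plain list before it reaches TensorFlow. The helper name as_dense_list is hypothetical and not part of the Object Detection API, and whether this matches the linked comment exactly is an assumption.

```python
def as_dense_list(values):
    """Materialise a range (or any other iterable) into a plain list."""
    return list(values)


num_boundaries = 3  # matches the range(0, 3) in the error message

lazy_indices = range(num_boundaries)
dense_indices = as_dense_list(lazy_indices)

# In Python 3 a range is its own type, not a list.
print(isinstance(lazy_indices, list))  # False
print(dense_indices)                   # [0, 1, 2]
```

In object_detection\utils\learning_schedules.py this would presumably mean passing list(range(num_boundaries)) instead of range(num_boundaries) to the tf.where call shown in the traceback, assuming that is the argument in question.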

Edward1900 · 2018-04-18T09:17:38Z

I got the same problem on Windows 10. How can I solve it?

FalcoGer · 2018-04-23T12:36:40Z

@yafengwa
#3705 (comment)

mikemwx · 2018-07-26T14:27:11Z

I got the same error when I tried to train faster_rcnn_resnet101_coco from the model zoo.
I solved it by wrapping the list/range that appears in (maybe)
spatial_averaged_image_features = tf.reduce_mean(image_features, [1, 2], keep_dims=True, name='AvgPool')
with
tf.convert_to_tensor(list("""your range or list"""), dtype=np.int64)

tensorflowbutler assigned robieta Apr 11, 2018

robieta closed this as completed Apr 12, 2018
