Local rendezvous is aborting with status: NOT_FOUND: TRTEngineCacheResource not yet created while converting a saved model to trt engine #336

Open
devvaibhav455 opened this issue Apr 1, 2024 · 0 comments

I am trying to convert a TensorFlow SavedModel to a TensorRT engine using the Python script below.

from tensorflow.python.compiler.tensorrt import trt_convert as trt

# Conversion Parameters 
conversion_params = trt.TrtConversionParams(precision_mode=trt.TrtPrecisionMode.FP16)

input_saved_model_dir = "/home/administrator/Documents/penguin_behavior_detection/tf_onnx_trt_stuff/seagate_exported_model/saved_model"
output_saved_model_dir = "/home/administrator/Documents/penguin_behavior_detection/tf_onnx_trt_stuff/"

converter = trt.TrtGraphConverterV2(input_saved_model_dir=input_saved_model_dir, conversion_params=conversion_params)

# Converter method used to partition and optimize TensorRT compatible segments
converter.convert()

converter.summary()

# Save the model to the disk 
converter.save(output_saved_model_dir)
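
The deprecation warning in the terminal output asks for individual converter parameters, and the converter also has a `build()` step that creates engines before `save()`. A minimal sketch of that variant follows; it has not been run against this model. The helper names (`make_converter_kwargs`, `sample_input_fn`, `convert_and_build`) are hypothetical, and the uint8 `[1, 640, 640, 3]` sample input is an assumption loosely based on the engine input shapes in the summary, not on the model's actual serving signature.

```python
def make_converter_kwargs(input_saved_model_dir, precision_mode="FP16"):
    """Build keyword arguments for trt.TrtGraphConverterV2.

    Parameters are passed individually instead of through the deprecated
    conversion_params object.
    """
    return {
        "input_saved_model_dir": input_saved_model_dir,
        "precision_mode": precision_mode,
    }


def sample_input_fn():
    """Yield one batch of sample inputs for converter.build().

    ASSUMPTION: the model takes a single uint8 image tensor of shape
    [1, 640, 640, 3]; adjust dtype and shape to the real serving signature.
    """
    import numpy as np  # imported lazily; only needed when building engines

    yield (np.zeros((1, 640, 640, 3), dtype=np.uint8),)


def convert_and_build(input_saved_model_dir, output_saved_model_dir,
                      input_fn=sample_input_fn):
    """Convert, pre-build engines, then save (hypothetical helper)."""
    # Imported lazily so the helpers above can be used without TensorFlow.
    from tensorflow.python.compiler.tensorrt import trt_convert as trt

    converter = trt.TrtGraphConverterV2(
        **make_converter_kwargs(input_saved_model_dir))
    converter.convert()
    # build() runs the converted function on the sample inputs so that
    # TensorRT engines exist before save() tries to serialize them.
    converter.build(input_fn=input_fn)
    converter.summary()
    converter.save(output_saved_model_dir)
```

Whether pre-building the engines removes the NOT_FOUND warnings in this setup is untested; the sketch only shows where `build()` would go.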

This is the structure of the seagate_exported_model directory:

.
├── checkpoint
│   ├── checkpoint
│   ├── ckpt-0.data-00000-of-00001
│   └── ckpt-0.index
├── pipeline.config
└── saved_model
    ├── assets
    ├── fingerprint.pb
    ├── saved_model.pb
    └── variables
        ├── variables.data-00000-of-00001
        └── variables.index
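
To rule out a problem with the input path before conversion, the layout above can be checked programmatically. This is a minimal sketch; `looks_like_saved_model` is a hypothetical helper, and it only tests for the two files shown in the tree (`saved_model.pb` and `variables/variables.index`), not for a loadable model.

```python
from pathlib import Path


def looks_like_saved_model(path):
    """Return True if `path` has the basic SavedModel layout."""
    root = Path(path)
    # The serialized graph lives directly under the SavedModel directory.
    has_graph = (root / "saved_model.pb").is_file()
    # The variables checkpoint index lives under variables/.
    has_variables = (root / "variables" / "variables.index").is_file()
    return has_graph and has_variables
```

It would be called on the `saved_model` subdirectory, not on `seagate_exported_model` itself.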

I get the following output on the terminal:

2024-04-01 20:36:22.993604: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX512F FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
WARNING:tensorflow:From /home/administrator/Documents/penguin_behavior_detection/tf_onnx_trt_stuff/convert_to_trt.py:23: calling TrtGraphConverterV2.__init__ (from tensorflow.python.compiler.tensorrt.trt_convert) with conversion_params is deprecated and will be removed in a future version.
Instructions for updating:
Use individual converter parameters instead
2024-04-01 20:36:25.663756: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1928] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 9528 MB memory:  -> device: 0, name: NVIDIA TITAN V, pci bus id: 0000:c1:00.0, compute capability: 7.0
2024-04-01 20:36:25.664265: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1928] Created device /job:localhost/replica:0/task:0/device:GPU:1 with 10431 MB memory:  -> device: 1, name: NVIDIA TITAN V, pci bus id: 0000:e1:00.0, compute capability: 7.0
2024-04-01 20:36:42.381116: I tensorflow/core/grappler/devices.cc:66] Number of eligible GPUs (core count >= 8, compute capability >= 0.0): 2
2024-04-01 20:36:42.381230: I tensorflow/core/grappler/clusters/single_machine.cc:361] Starting new session
2024-04-01 20:36:42.382733: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1928] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 9528 MB memory:  -> device: 0, name: NVIDIA TITAN V, pci bus id: 0000:c1:00.0, compute capability: 7.0
2024-04-01 20:36:42.382881: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1928] Created device /job:localhost/replica:0/task:0/device:GPU:1 with 10431 MB memory:  -> device: 1, name: NVIDIA TITAN V, pci bus id: 0000:e1:00.0, compute capability: 7.0
2024-04-01 20:36:47.126394: I tensorflow/core/grappler/devices.cc:66] Number of eligible GPUs (core count >= 8, compute capability >= 0.0): 2
2024-04-01 20:36:47.126499: I tensorflow/core/grappler/clusters/single_machine.cc:361] Starting new session
2024-04-01 20:36:47.127907: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1928] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 9528 MB memory:  -> device: 0, name: NVIDIA TITAN V, pci bus id: 0000:c1:00.0, compute capability: 7.0
2024-04-01 20:36:47.128046: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1928] Created device /job:localhost/replica:0/task:0/device:GPU:1 with 10431 MB memory:  -> device: 1, name: NVIDIA TITAN V, pci bus id: 0000:e1:00.0, compute capability: 7.0
2024-04-01 20:36:47.655835: W tensorflow/compiler/tf2tensorrt/convert/trt_optimization_pass.cc:186] Calibration with FP32 or FP16 is not implemented. Falling back to use_calibration = False.Note that the default value of use_calibration is True.
2024-04-01 20:36:47.730476: W tensorflow/compiler/tf2tensorrt/segment/segment.cc:970] 

################################################################################
TensorRT unsupported/non-converted OP Report:
        - GatherV2 -> 46x
        - StridedSlice -> 35x
        - Sub -> 30x
        - Shape -> 24x
        - Cast -> 22x
        - ConcatV2 -> 19x
        - Mul -> 19x
        - ExpandDims -> 18x
        - Pack -> 17x
        - Identity -> 17x
        - Select -> 16x
        - Fill -> 15x
        - Reshape -> 15x
        - Placeholder -> 14x
        - Less -> 10x
        - Unpack -> 10x
        - Greater -> 9x
        - AddV2 -> 8x
        - Pad -> 8x
        - Switch -> 8x
        - NonMaxSuppressionV5 -> 7x
        - Minimum -> 7x
        - Merge -> 7x
        - NextIteration -> 6x
        - Enter -> 6x
        - Split -> 5x
        - Slice -> 5x
        - RealDiv -> 4x
        - Maximum -> 4x
        - Round -> 4x
        - Transpose -> 3x
        - Range -> 3x
        - NoOp -> 3x
        - Reciprocal -> 2x
        - Squeeze -> 2x
        - ResizeBilinear -> 2x
        - Exit -> 2x
        - Exp -> 2x
        - TopKV2 -> 2x
        - Tile -> 2x
        - TensorListStack -> 2x
        - TensorListReserve -> 2x
        - TensorListSetItem -> 2x
        - Where -> 1x
        - TensorListGetItem -> 1x
        - TensorListFromTensor -> 1x
        - GreaterEqual -> 1x
        - Sum -> 1x
        - LogicalAnd -> 1x
        - LoopCond -> 1x
--------------------------------------------------------------------------------
        - Total nonconverted OPs: 451
        - Total nonconverted OP Types: 50
For more information see https://docs.nvidia.com/deeplearning/frameworks/tf-trt-user-guide/index.html#supported-ops.
################################################################################

2024-04-01 20:36:48.146815: W tensorflow/compiler/tf2tensorrt/segment/segment.cc:1298] The environment variable TF_TRT_MAX_ALLOWED_ENGINES=20 has no effect since there are only 10 TRT Engines with  at least minimum_segment_size=3 nodes.
2024-04-01 20:36:48.182719: I tensorflow/compiler/tf2tensorrt/convert/convert_graph.cc:799] Number of TensorRT candidate segments: 10
2024-04-01 20:36:48.224087: I tensorflow/compiler/tf2tensorrt/convert/convert_graph.cc:913] Replaced segment 0 consisting of 6 nodes by TRTEngineOp_000_000.
2024-04-01 20:36:48.224163: I tensorflow/compiler/tf2tensorrt/convert/convert_graph.cc:913] Replaced segment 1 consisting of 1993 nodes by TRTEngineOp_000_001.
2024-04-01 20:36:48.227138: W tensorflow/compiler/tf2tensorrt/convert/convert_graph.cc:916] TF-TRT Warning: Cannot replace segment 2 consisting of 6 nodes by TRTEngineOp_000_002 reason: Segment has no inputs (possible constfold failure) (keeping original segment).
2024-04-01 20:36:48.227395: I tensorflow/compiler/tf2tensorrt/convert/convert_graph.cc:913] Replaced segment 3 consisting of 5 nodes by TRTEngineOp_000_003.
2024-04-01 20:36:48.227442: I tensorflow/compiler/tf2tensorrt/convert/convert_graph.cc:913] Replaced segment 4 consisting of 4 nodes by TRTEngineOp_000_004.
2024-04-01 20:36:48.227482: I tensorflow/compiler/tf2tensorrt/convert/convert_graph.cc:913] Replaced segment 5 consisting of 4 nodes by TRTEngineOp_000_005.
2024-04-01 20:36:48.227535: I tensorflow/compiler/tf2tensorrt/convert/convert_graph.cc:913] Replaced segment 6 consisting of 25 nodes by TRTEngineOp_000_006.
2024-04-01 20:36:48.227592: I tensorflow/compiler/tf2tensorrt/convert/convert_graph.cc:913] Replaced segment 7 consisting of 4 nodes by TRTEngineOp_000_007.
2024-04-01 20:36:48.227635: I tensorflow/compiler/tf2tensorrt/convert/convert_graph.cc:913] Replaced segment 8 consisting of 4 nodes by TRTEngineOp_000_008.
2024-04-01 20:36:48.227671: I tensorflow/compiler/tf2tensorrt/convert/convert_graph.cc:913] Replaced segment 9 consisting of 3 nodes by TRTEngineOp_000_009.
TRTEngineOP Name                 Device        # Nodes # Inputs      # Outputs     Input DTypes       Output Dtypes      Input Shapes       Output Shapes     
================================================================================================================================================================

----------------------------------------

TRTEngineOp_000_000              device:GPU:0  6       1             1             ['float32']        ['float32']        [[1, -1, -1, 3]]   [[1, -1, -1, 3]]  

        - Const: 3x
        - Mul: 2x
        - Sub: 1x

----------------------------------------

TRTEngineOp_000_001              device:GPU:0  1931    1             2             ['float32']        ['float32', 'f ... [[1, 640, 640, 3]] [[1, 76725, 4] ...

        - AddV2: 15x
        - BatchMatMulV2: 32x
        - BiasAdd: 122x
        - ConcatV2: 2x
        - Const: 864x
        - Conv2D: 165x
        - DepthwiseConv2dNative: 94x
        - FusedBatchNormV3: 133x
        - MaxPool: 18x
        - Mean: 22x
        - Mul: 149x
        - Pack: 64x
        - Reshape: 69x
        - Sigmoid: 150x
        - Squeeze: 32x

----------------------------------------

TRTEngineOp_000_003              device:GPU:0  7       4             1             ['float32', 'f ... ['float32']        [[57600, 1], [ ... [[57600, 4]]      

        - ConcatV2: 1x
        - Const: 2x
        - Mul: 4x

----------------------------------------

TRTEngineOp_000_004              device:GPU:0  4       4             1             ['float32', 'f ... ['float32']        [[-1, 1], [-1, ... [[-1]]            

        - Mul: 1x
        - Squeeze: 1x
        - Sub: 2x

----------------------------------------

TRTEngineOp_000_005              device:GPU:0  4       4             1             ['float32', 'f ... ['float32']        [[-1, 1], [-1, ... [[-1]]            

        - Mul: 1x
        - Squeeze: 1x
        - Sub: 2x

----------------------------------------

TRTEngineOp_000_006              device:GPU:0  25      1             8             ['float32']        ['float32', 'f ... [[76725, 7]]       [[76725, 7], [ ...

        - Const: 10x
        - Reshape: 8x
        - Slice: 7x

----------------------------------------

TRTEngineOp_000_007              device:GPU:0  6       2             1             ['float32', 'f ... ['float32']        [[57600, 2], [ ... [[57600, 4]]      

        - AddV2: 1x
        - ConcatV2: 1x
        - Const: 2x
        - Mul: 1x
        - Sub: 1x

----------------------------------------

TRTEngineOp_000_008              device:GPU:0  3       1             1             ['float32']        ['float32']        [[76725, 1, 4]]    [[76725, 4]]      

        - Const: 1x
        - Reshape: 1x
        - Unpack: 1x

----------------------------------------

TRTEngineOp_000_009              device:GPU:0  3       1             2             ['float32']        ['float32', 'f ... [[1, 76725, 4]]    [[1, 76725, 1, ...

        - Const: 1x
        - ExpandDims: 1x
        - Squeeze: 1x

================================================================================================================================================================
[*] Total number of TensorRT engines: 9
[*] % of OPs Converted: 78.87% [1989/2522]

2024-04-01 20:36:49.217361: W tensorflow/core/framework/local_rendezvous.cc:404] Local rendezvous is aborting with status: NOT_FOUND: TRTEngineCacheResource not yet created
2024-04-01 20:36:49.217649: W tensorflow/core/framework/local_rendezvous.cc:404] Local rendezvous is aborting with status: NOT_FOUND: TRTEngineCacheResource not yet created
2024-04-01 20:36:49.217857: W tensorflow/core/framework/local_rendezvous.cc:404] Local rendezvous is aborting with status: NOT_FOUND: TRTEngineCacheResource not yet created
2024-04-01 20:36:49.218032: W tensorflow/core/framework/local_rendezvous.cc:404] Local rendezvous is aborting with status: NOT_FOUND: TRTEngineCacheResource not yet created
2024-04-01 20:36:49.218200: W tensorflow/core/framework/local_rendezvous.cc:404] Local rendezvous is aborting with status: NOT_FOUND: TRTEngineCacheResource not yet created
2024-04-01 20:36:49.218390: W tensorflow/core/framework/local_rendezvous.cc:404] Local rendezvous is aborting with status: NOT_FOUND: TRTEngineCacheResource not yet created
2024-04-01 20:36:49.218555: W tensorflow/core/framework/local_rendezvous.cc:404] Local rendezvous is aborting with status: NOT_FOUND: TRTEngineCacheResource not yet created
2024-04-01 20:36:49.218810: W tensorflow/core/framework/local_rendezvous.cc:404] Local rendezvous is aborting with status: NOT_FOUND: TRTEngineCacheResource not yet created
2024-04-01 20:36:49.218981: W tensorflow/core/framework/local_rendezvous.cc:404] Local rendezvous is aborting with status: NOT_FOUND: TRTEngineCacheResource not yet created
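
The output above reports 9 TensorRT engines and then prints the NOT_FOUND warning 9 times, which suggests one warning per engine when the model is saved. That pattern would be consistent with the engines not having been built before saving, though this is an assumption and not something the log states. A small sketch for checking the correspondence in a captured log follows; `summarize_trt_log` is a hypothetical helper, and the regular expressions assume the message formats shown above.

```python
import re


def summarize_trt_log(log_text):
    """Count engines and cache warnings in a pasted TF-TRT conversion log."""
    # Matches the summary line "[*] Total number of TensorRT engines: N".
    engines = re.search(r"Total number of TensorRT engines: (\d+)", log_text)
    # Each occurrence corresponds to one local_rendezvous warning line.
    warnings = re.findall(
        r"NOT_FOUND: TRTEngineCacheResource not yet created", log_text)
    return {
        "engines": int(engines.group(1)) if engines else None,
        "cache_warnings": len(warnings),
    }
```

If the two counts match, as they do here, each warning plausibly belongs to one TRTEngineOp.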

My environment is listed below:

Python: 3.10.13
TensorFlow: 2.16.1
OS: Ubuntu 20.04
TensorRT: 8.6.1
CUDA: 12.1
NVIDIA driver: 530.30.02
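
For anyone trying to reproduce this, the Python-side versions can be collected with a short script. This is a sketch that assumes TensorFlow and TensorRT were installed as pip distributions named `tensorflow` and `tensorrt`; `collect_versions` is a hypothetical helper, and it does not cover the CUDA toolkit or driver versions.

```python
import platform
from importlib import metadata


def collect_versions(packages=("tensorflow", "tensorrt")):
    """Return Python and package versions, or None for packages not installed."""
    versions = {"python": platform.python_version()}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Distribution not installed under this name (assumed name).
            versions[name] = None
    return versions
```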

Any help is highly appreciated. Thanks in advance!
