I converted a quantized model trained in TensorFlow from TFLite to ONNX, and when converting the ONNX model to TensorRT with trtexec I get the error `Assertion failed: K == scale.count()`. The full output is:
/usr/src/tensorrt/bin/trtexec --onnx=/home/aaeon/share/model/int8/tf2/model_int8_tflite_int32.onnx --saveEngine=/home/aaeon/share/model/int8/tf2/model_int8_tflite.engine
&&&& RUNNING TensorRT.trtexec # /usr/src/tensorrt/bin/trtexec --onnx=/home/aaeon/share/model/int8/tf2/model_int8_tflite_int32.onnx --saveEngine=/home/aaeon/share/model/int8/tf2/model_int8_tflite.engine
[08/09/2022-15:44:54] [I] === Model Options ===
[08/09/2022-15:44:54] [I] Format: ONNX
[08/09/2022-15:44:54] [I] Model: /home/aaeon/share/model/int8/tf2/model_int8_tflite_int32.onnx
[08/09/2022-15:44:54] [I] Output:
[08/09/2022-15:44:54] [I] === Build Options ===
[08/09/2022-15:44:54] [I] Max batch: 1
[08/09/2022-15:44:54] [I] Workspace: 16 MB
[08/09/2022-15:44:54] [I] minTiming: 1
[08/09/2022-15:44:54] [I] avgTiming: 8
[08/09/2022-15:44:54] [I] Precision: FP32
[08/09/2022-15:44:54] [I] Calibration:
[08/09/2022-15:44:54] [I] Safe mode: Disabled
[08/09/2022-15:44:54] [I] Save engine: /home/aaeon/share/model/int8/tf2/model_int8_tflite.engine
[08/09/2022-15:44:54] [I] Load engine:
[08/09/2022-15:44:54] [I] Builder Cache: Enabled
[08/09/2022-15:44:54] [I] NVTX verbosity: 0
[08/09/2022-15:44:54] [I] Inputs format: fp32:CHW
[08/09/2022-15:44:54] [I] Outputs format: fp32:CHW
[08/09/2022-15:44:54] [I] Input build shapes: model
[08/09/2022-15:44:54] [I] Input calibration shapes: model
[08/09/2022-15:44:54] [I] === System Options ===
[08/09/2022-15:44:54] [I] Device: 0
[08/09/2022-15:44:54] [I] DLACore:
[08/09/2022-15:44:54] [I] Plugins:
[08/09/2022-15:44:54] [I] === Inference Options ===
[08/09/2022-15:44:54] [I] Batch: 1
[08/09/2022-15:44:54] [I] Input inference shapes: model
[08/09/2022-15:44:54] [I] Iterations: 10
[08/09/2022-15:44:54] [I] Duration: 3s (+ 200ms warm up)
[08/09/2022-15:44:54] [I] Sleep time: 0ms
[08/09/2022-15:44:54] [I] Streams: 1
[08/09/2022-15:44:54] [I] ExposeDMA: Disabled
[08/09/2022-15:44:54] [I] Spin-wait: Disabled
[08/09/2022-15:44:54] [I] Multithreading: Disabled
[08/09/2022-15:44:54] [I] CUDA Graph: Disabled
[08/09/2022-15:44:54] [I] Skip inference: Disabled
[08/09/2022-15:44:54] [I] Inputs:
[08/09/2022-15:44:54] [I] === Reporting Options ===
[08/09/2022-15:44:54] [I] Verbose: Disabled
[08/09/2022-15:44:54] [I] Averages: 10 inferences
[08/09/2022-15:44:54] [I] Percentile: 99
[08/09/2022-15:44:54] [I] Dump output: Disabled
[08/09/2022-15:44:54] [I] Profile: Disabled
[08/09/2022-15:44:54] [I] Export timing to JSON file:
[08/09/2022-15:44:54] [I] Export output to JSON file:
[08/09/2022-15:44:54] [I] Export profile to JSON file:
[08/09/2022-15:44:54] [I]
Input filename: /home/aaeon/share/model/int8/tf2/model_int8_tflite_int32.onnx
ONNX IR version: 0.0.7
Opset version: 13
Producer name: onnx-typecast
Producer version:
Domain:
Model version: 0
Doc string:
[08/09/2022-15:44:56] [E] [TRT] unknown_10_dequant_dequantize_scale_node: at least 4 dimensions are required for input.
[08/09/2022-15:44:56] [E] [TRT] unknown_10_dequant_dequantize_scale_node: at least 4 dimensions are required for input.
ERROR: builtin_op_importers.cpp:840 In function importDequantizeLinear:
[6] Assertion failed: K == scale.count()
[08/09/2022-15:44:56] [E] Failed to parse onnx file
[08/09/2022-15:44:56] [E] Parsing model failed
[08/09/2022-15:44:56] [E] Engine creation failed
[08/09/2022-15:44:56] [E] Engine set up failed
What is causing this error?