Question about inference time improvement (TFMIC-56) #113

@ssh4

Checklist

  • Checked the issue tracker for similar issues to ensure this is not a duplicate.
  • Provided a clear description of your suggestion.
  • Included any relevant context or examples.

Issue or Suggestion Description

Hello,

I'm running an instance segmentation inference task on a XIAO ESP32S3 Sense.
The input for my model is 64x64x3. The model is quantized to integers, but the input and output data type is float32.
These are its input and output details:

--- Model Inputs ---
Input 0:
  Index: 0
  Name: serving_default_images:0
  Shape: [ 1 64 64  3] (batch_size, height, width, channels)
  Data Type: <class 'numpy.float32'>
  Quantization: (0.0, 0)

--- Model Outputs ---
Output 0:
  Index: 438
  Name: PartitionedCall:1
  Shape: [ 1 16 16 32]
  Data Type: <class 'numpy.float32'>
  Quantization: (0.0, 0)
Output 1:
  Index: 486
  Name: PartitionedCall:0
  Shape: [ 1 37 84]
  Data Type: <class 'numpy.float32'>
  Quantization: (0.0, 0)
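
As a sanity check on the shapes above, the following sketch computes the float32 buffer size of each tensor and compares the output dimensions with the usual YOLOv8 segmentation layout. The strides (8, 16, 32), the 32 mask coefficients, and the 4 box values are assumptions taken from the standard YOLOv8-seg head, not something the dump itself states.

```python
import math

# Tensor shapes as reported in the model dump above.
INPUT_SHAPE = (1, 64, 64, 3)
PROTO_SHAPE = (1, 16, 16, 32)   # Output 0: mask prototypes
DET_SHAPE = (1, 37, 84)         # Output 1: per-anchor predictions

BYTES_PER_FLOAT32 = 4

# Assumptions: standard YOLOv8-seg conventions, not confirmed by the dump.
STRIDES = (8, 16, 32)           # detection head strides
NUM_MASK_COEFFS = 32            # one coefficient per mask prototype
BOX_VALUES = 4                  # box coordinates per anchor


def tensor_bytes(shape):
    """Size in bytes of a float32 tensor with the given shape."""
    return math.prod(shape) * BYTES_PER_FLOAT32


height, width = INPUT_SHAPE[1], INPUT_SHAPE[2]

# One prediction per grid cell at each stride: 8*8 + 4*4 + 2*2.
num_anchors = sum((height // s) * (width // s) for s in STRIDES)

# Whatever remains after the box values and mask coefficients is class scores.
num_classes = DET_SHAPE[1] - BOX_VALUES - NUM_MASK_COEFFS

if __name__ == "__main__":
    for name, shape in (("input", INPUT_SHAPE),
                        ("output 0", PROTO_SHAPE),
                        ("output 1", DET_SHAPE)):
        print(f"{name}: {tensor_bytes(shape)} bytes")
    print(f"anchors: {num_anchors}, classes: {num_classes}")
```

Under these assumptions the 84 in Output 1 matches the number of grid cells for a 64x64 input, and the 37 rows correspond to a single-class model.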

The architecture is YOLOv8 (YOLOv11 is better in size and inference time, but some of its ops are not supported by the esp-tflite-micro framework).
The model is built from the following operations:
QUANTIZE, DEQUANTIZE, PAD, CONV_2D, LOGISTIC, MUL, STRIDED_SLICE, ADD, CONCATENATION, MAX_POOL_2D, RESIZE_NEAREST_NEIGHBOR, TRANSPOSE, RESHAPE, TRANSPOSE_CONV, SOFTMAX, SUB
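
The QUANTIZE and DEQUANTIZE entries in this list are presumably what allow an integer-quantized graph to keep float32 inputs and outputs: floats are converted to integers at the input and back to floats at the outputs. A minimal sketch of that affine mapping follows, assuming int8 tensors; the `scale` and `zero_point` values are hypothetical and not taken from this model.

```python
import math

INT8_MIN, INT8_MAX = -128, 127


def round_half_away_from_zero(value):
    """Round to the nearest integer, with ties going away from zero."""
    return int(math.copysign(math.floor(abs(value) + 0.5), value))


def quantize(x, scale, zero_point):
    """Map a real value to int8: q = round(x / scale) + zero_point, clamped."""
    q = round_half_away_from_zero(x / scale) + zero_point
    return max(INT8_MIN, min(INT8_MAX, q))


def dequantize(q, scale, zero_point):
    """Map an int8 value back to a real value: x = scale * (q - zero_point)."""
    return scale * (q - zero_point)


if __name__ == "__main__":
    # Hypothetical parameters, chosen only for illustration.
    scale, zero_point = 0.25, -128
    for x in (0.3, 1.0, 100.0):
        q = quantize(x, scale, zero_point)
        print(x, "->", q, "->", dequantize(q, scale, zero_point))
```

Values outside the representable range saturate at the int8 limits, and the round trip loses precision below one `scale` step.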

My first integration was Arduino-based, and the inference time was ~2 seconds.
To improve the timing, I ported my integration to ESP-IDF, building the project from scratch rather than from the examples. Unfortunately, the inference time stayed the same, around 2 seconds.

The Arduino implementation uses the "idf-release_v5.4-2f7dcd86-v1/esp32s3/include/espressif__esp-tflite-micro" version.
The ESP-IDF project is built with espressif/esp-nn:1.1.1 and espressif/esp-tflite-micro:1.3.3~1.

My question is: is it possible to improve the inference time on ESP-IDF somehow?
When I decided to move to ESP-IDF, I was really hoping that ESP-NN would improve the numbers and that some hardware utilization features would do the trick, or at least help somewhat.

These are a few config options from "idf.py menuconfig" that seemed important to me:

  • Optimization for NN functions - Optimized version
  • ESP PSRAM Mode - Octal
  • PSRAM clock speed - 80 MHz
  • ESP System Settings - CPU Frequency 240 MHz

Appreciate your help and advice!
