i.MX8MP Vision Processing

In this week’s video we cover the basics of the Basler camera integration into the BSP and how the imx8-isp is integrated into the video pipeline. Then we spend the rest of the video talking about how I tuned a generic Python OpenCV script to use the onboard NN accelerator, and tuned the pipeline so that it can process 1080p at 30 frames per second.

Links referenced in the video:

NXP Documentation

Python Script Gist and Web Page to download the tflite model used.

If you have any additional questions or comments feel free to drop them below.


Hey @jnettlet

I am using the zeus imx-5.4.47-2.2.0.xml manifest

I’ve added the following to my local.conf:
IMAGE_INSTALL_append += " packagegroup-imx-ml"
TOOLCHAIN_TARGET_TASK_append += "tensorflow-lite-staticdev tensorflow-lite-dev armnn-dev onnxruntime-dev"

Now I can’t seem to find vx_delegate.so in /usr/lib. I reckon I need to compile tensorflow-lite with -DLITE_XNNPACK and replace its revision with TM_VX or something?

Please help me understand how I can communicate with the NPU.

Btw, does it work without a special delegate?

Thanks for reaching out. Is it possible for you to use the hardknott branch? The NXP BSP for the machine learning features is much better in that release. That is the version I have used for all of my testing and videos.

Hey Jnettlet

I’m afraid this is not possible for us; we need to figure out how to get it running on zeus 5.4.

I could maybe use another delegate such as nnapi or armnn, but I can’t work out how to add those so that I end up with a delegate shared object.

Thanks for the quick response.

It looks like Zeus is using the older tflite integration, which does not use OpenVX and delegates. I recommend you read NXP’s documentation for doing machine learning with Zeus. This BSP is already more than two years old, and a lot has changed from the original software that was released for the iMX8MP.

I’ve managed to run a tensorflow-lite example that uses nnapi, and according to the benchmark I could see the NPU in action. But I noticed fallback to the CPU every time, and very high warm-up times. So I am willing to build a hardknott Yocto image for the sake of getting better results.

If you can, please tell me how you configured your local.conf/bblayers.conf on hardknott to use the OpenVX delegate with TensorFlow Lite.
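To put a number on the warm-up cost mentioned above, it can help to time the first inference separately from the ones that follow. The sketch below is only an illustration: `time_invocations` is a hypothetical helper, and `invoke` stands for whatever callable runs one inference (for example an interpreter's `invoke` method).

```python
import time


def time_invocations(invoke, runs=10):
    """Time invoke() once as warm-up, then average the following runs.

    Returns (warmup_seconds, average_seconds).
    """
    # The first call is measured on its own because it may include
    # one-off setup work that later calls do not repeat.
    start = time.perf_counter()
    invoke()
    warmup = time.perf_counter() - start

    # Steady-state cost: average over the remaining calls.
    start = time.perf_counter()
    for _ in range(runs):
        invoke()
    average = (time.perf_counter() - start) / runs
    return warmup, average
```

If the warm-up figure is much larger than the average, the steady-state number is the one to compare between delegates.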

You should just need to add IMAGE_INSTALL_append += " packagegroup-imx-ml" as you have done above. For the iMX8M that will pull in tensorflow-lite-vx-delegate.
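Once the package is installed, a Python script can hand the delegate to the TensorFlow Lite interpreter roughly as sketched below. This is a minimal sketch, not the script from the video: the library paths in `CANDIDATE_DELEGATES` and the helper names `pick_delegate` and `make_interpreter` are assumptions, so check where your image actually installs the delegate.

```python
import os

# Assumed install locations; the real file name and path depend on the BSP.
CANDIDATE_DELEGATES = ("/usr/lib/libvx_delegate.so", "/usr/lib/vx_delegate.so")


def pick_delegate(candidates=CANDIDATE_DELEGATES):
    """Return the first delegate library that exists on disk, or None."""
    for path in candidates:
        if os.path.isfile(path):
            return path
    return None


def make_interpreter(model_path, delegate_path=None):
    """Build a TFLite interpreter, using the delegate when one is given."""
    # Imported lazily so pick_delegate() works without tflite_runtime.
    from tflite_runtime.interpreter import Interpreter, load_delegate

    delegates = [load_delegate(delegate_path)] if delegate_path else []
    interpreter = Interpreter(model_path=model_path,
                              experimental_delegates=delegates)
    interpreter.allocate_tensors()
    return interpreter
```

Calling `make_interpreter("model.tflite", pick_delegate())` (the model name is a placeholder) falls back to the CPU when no delegate library is found, which also answers the earlier question: the model still runs without a delegate, just not on the NPU.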

Thanks Jnettlet, I will try ASAP.

I am trying to follow your advice about hardknott, but I am struggling to boot my SOM after flashing it with uuu using the hardknott img I created following this guide: GitHub - SolidRun/meta-solidrun-arm-imx8

I am building imx-image-core, and I’ve added image_install_append=packagegroup-imx-ml plus the parallel make and number of threads settings. That’s it.

I flash the SOM using uuu with the img and the EVK.

But whenever I boot it I get this troublesome output. I’ve attached an image so you can see what I’m referring to.

It looks like u-boot is incorrectly selecting the root device that is passed to extlinux.conf…or the extlinux.conf file is incorrectly being generated.

My generated extlinux.conf:

# Generic Distro Configuration file generated by OpenEmbedded

LABEL NXP i.MX Release Distro
KERNEL …/Image
APPEND root=/dev/mmcblk1p2 rootwait rw quiet console=${console} ${bootargs} rootwait rw console=${console},${baudrate}

Please help.

Okay, that is the issue there. It should be /dev/mmcblk2p1. I will have to see why NXP’s bitbake recipe is hard-coding that rather than letting u-boot determine it based on the distro-config.
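As a stopgap until the recipe is sorted out, the root= argument in the generated file can be rewritten by hand or with a small script. The sketch below is one way to do it; `set_root_device` is a hypothetical helper, and the device name to pass in is whatever is correct for your board.

```python
import re


def set_root_device(conf_text, new_root):
    """Replace the root= argument on every APPEND line of an extlinux.conf."""
    fixed = []
    for line in conf_text.splitlines(keepends=True):
        if line.lstrip().upper().startswith("APPEND"):
            # Only "root=<value>" matches; "rootwait" is left untouched.
            line = re.sub(r"\broot=\S+", lambda m: "root=" + new_root, line)
        fixed.append(line)
    return "".join(fixed)
```

Only APPEND lines are touched, so the LABEL and KERNEL entries pass through unchanged.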


So I successfully flashed hardknott and booted it up correctly. I tested the vx_delegate, but I can’t understand why the benchmark process from NXP is taking a single core at 100%. Shouldn’t the NPU remove the overhead from the CPU? I am expecting to see 0 CPU usage during interaction with the NPU. Am I wrong?

There is processing happening besides just running the model against the data on the NPU: image processing and other pipeline processing. The only thing that is offloaded to the NPU is the inferencing.
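One way to see where the CPU time goes is to time each stage of the pipeline separately. The sketch below is an illustration only: `StageTimer` is a hypothetical helper, and the stage names are whatever you choose to pass in. It records both wall-clock time and process CPU time, so a stage that mostly waits on the NPU shows up with a large wall time but little CPU time.

```python
import time
from collections import defaultdict


class StageTimer:
    """Accumulate wall-clock and CPU time per named pipeline stage."""

    def __init__(self):
        self.wall = defaultdict(float)
        self.cpu = defaultdict(float)

    def run(self, name, func, *args):
        """Call func(*args) and charge the elapsed time to stage `name`."""
        wall_start = time.perf_counter()
        cpu_start = time.process_time()
        result = func(*args)
        self.cpu[name] += time.process_time() - cpu_start
        self.wall[name] += time.perf_counter() - wall_start
        return result

    def report(self):
        """Return {stage: (wall_seconds, cpu_seconds)}."""
        return {name: (self.wall[name], self.cpu[name]) for name in self.wall}
```

Wrapping each step, for example `timer.run("preprocess", resize_frame, frame)` with your own functions, and printing `timer.report()` after a few hundred frames shows which stages keep the core busy.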