ST Edge AI Core for STM32 series
for STM32 target, based on ST Edge AI Core Technology 3.0.0
r1.2
Overview
This article describes the specifics of the command-line interface for the STM32 target.
ST Community channel/forum for “STM32 MCUs”
Comparison with the X-CUBE-AI UI plug-in features
The stedgeai application is used as the back end by the
X-CUBE-AI UI plug-in (refer to [UM]).
In comparison with the X-CUBE-AI UI plug-in, the following high-level features are not supported:
- Extra C-code wrapper to manage multiple models. The CLI manages only
one model at a time.
- Creation of a whole IDE project including the optimized
inference runtime library, the AI header files, and the C-files related
to the hardware settings. The CLI can only be used to generate the
specialized NN C-files. However, it can update an initial IDE
project, whether STM32CubeMX-based or a proprietary source tree (see the “Update an ioc-based project”
section).
- The check of whether a model fits, in terms of memory layout, into a selected STM32 memory device. The CLI reports (see the ‘analyze’ command) only the main system-level dimensioning metrics: ROM, RAM, and MACC (refer to [METRIC] for details).
- For the “Validation process on target”, a full STM32 project is expected, so it must be generated beforehand through the UI. Note that this project can be updated later (see the “Update an ioc-based project” section). The “Validation process on desktop” is fully supported through the CLI without restriction.
- Graphical visualization of the generated c-graph (including the usage of the RAM). The CLI provides only a textual representation (table form) of the c-graph, including a description of the tensors/operators (see the ‘analyze’ command).
Supported STM32 series
The STM32 family of 32-bit microcontrollers is based on the Arm Cortex®-M processor.
| Supported series | Description |
|---|---|
| stm32f4/stm32g4/stm32f3/stm32wb | All STM32F4xx/STM32G4xx/STM32F3xx/STM32WBxx devices with an Arm® Cortex®-M4 core and FPU support enabled (single precision). |
| stm32l4/stm32l4+ | All STM32L4xx/STM32L4Rxx devices with an Arm® Cortex®-M4 core and FPU support enabled (single precision). |
| stm32n6 | All STM32N6xx devices with an Arm® Cortex®-M55 core, with or without the Neural-ART Accelerator™. |
| stm32l5/stm32u5/stm32h5/stm32u3 | All STM32L5xx/STM32U5xx/STM32H5xx/STM32U3xx devices with an Arm® Cortex®-M33 core and FPU support enabled (single precision). |
| stm32f7 | All STM32F7xx devices with an Arm® Cortex®-M7 core and FPU support enabled (single precision). |
| stm32h7 | All STM32H7xx devices with an Arm® Cortex®-M7 core and FPU support enabled (double precision). |
| stm32l0/stm32g0/stm32c0 | All STM32L0xx/STM32G0xx/STM32C0xx devices with an Arm® Cortex®-M0+ core, w/o FPU support and w/o DSP extension. |
| stm32f0 | All STM32F0xx devices with an Arm® Cortex®-M0 core, w/o FPU support and w/o DSP extension. |
| stm32wl | All STM32WLxx devices with an Arm® Cortex®-M4 core, w/o FPU support and with DSP extension. |
Warning
Be aware that all inference runtime libraries provided for the
different STM32 series (excluding the STM32WL series) are compiled with
the FPU enabled and the hard-float EABI option, for
performance reasons.
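Consequently, the application objects linked against these libraries must be built with a matching float ABI. As a sketch for a Cortex-M4 target with the GNU Arm toolchain (the exact `-mfpu` value depends on the core; check your toolchain documentation):

```makefile
# Hard-float EABI build flags for a Cortex-M4 target (arm-none-eabi-gcc);
# mixing -mfloat-abi=hard and -mfloat-abi=soft objects fails at link time.
CFLAGS  += -mcpu=cortex-m4 -mthumb -mfloat-abi=hard -mfpu=fpv4-sp-d16
LDFLAGS += -mcpu=cortex-m4 -mthumb -mfloat-abi=hard -mfpu=fpv4-sp-d16
```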
STM32N6xx considerations
An STM32N6 device without the ST Neural-ART NPU is supported similarly to the classical STM32xx series. The optimized AI network runtime library is implemented to use the M-Profile Vector Extension (MVE). Consequently, the generated C-files cannot be executed on the host machine, and the “Validation on host” feature is not supported.
A specific ST Neural-ART Compiler generates the specialized C-files for an STM32N6 device with the ST Neural-ART NPU. No simulator or emulator is provided, so the “Validation on host” feature is not supported.
Generate command extension
Specific options
--binary
When this flag is defined, the code generator forces the
generation of a binary file named
'<name>_data.bin'. This binary file contains
only the raw data of the weights and bias tensors.
- Optional
Notes:
- The
'<name>_data.c'and'<name>_data.h'files are always generated, regardless of the'--binary'flag (see “Particular network data c-file” section). Metadata such as scale factors, zero-points, and other parameters are always included in the<name>.c/.hfiles. - This option applies only for STM32 devices without the Neural ART Accelerator™. For more information, refer to the article: “ST Neural-ART - How to deploy/manage the NPU memory initializers”.
--address/--copy-weights-at
With the --binary flag,
these helper options specify the address where the weights are
located or the destination address where the weights must be copied
during initialization. This is achieved using a specific generated
'<name>_data.c'
file (refer to the “Particular
network data c-file” section). - Optional
Note:
- This option applies only for STM32 devices without the Neural ART Accelerator™. For more information, refer to the article: “ST Neural-ART - How to deploy/manage the NPU memory initializers”.
--relocatable
Short syntax: -r/--reloc
Enables the generation of a runtime-loadable model (also called a relocatable model). This allows the model to be loaded and relocated at runtime rather than being fixed at compile/link time. - Optional
The generation and management of the runtime-loadable model depend on the underlying runtime and hardware.
| STM32 series | Refer to the article for detailed guidance |
|---|---|
| Pure SW solution | “STM32 Arm® Cortex® M - Relocatable binary (or runtime loadable) model support” |
| Hardware-assisted solution | “ST Neural-ART NPU - Runtime loadable model support” |
--ihex
When used together with the --relocatable and --address options, this option instructs the code generator to generate an additional output file in the Intel Hexadecimal Object File format. - Optional
Example
Generate only the network NN C-files; the weights/bias parameters are provided as a binary file/object:

```shell
$ stedgeai generate -m <model_file_path> --target stm32 -o <output-directory-path> -n <name> --binary
...
Generated files (8)
-----------------------------------------------------------
<output-directory-path>\<name>_config.h
<output-directory-path>\<name>.h
<output-directory-path>\<name>.c
<output-directory-path>\<name>_data.bin
<output-directory-path>\<name>_data.h
<output-directory-path>\<name>_data.c
<output-directory-path>\<name>_data_params.h
<output-directory-path>\<name>_data_params.c

Creating report file <output-directory-path>\<name>_generate_report.txt
...
```

Generate a full relocatable binary file for an STM32H7 series (refer to the “Relocatable binary model support” article):

```shell
$ stedgeai generate -m <model_file_path> --target stm32h7 -o <output-directory-path> --relocatable
...
Generated files (10)
-----------------------------------------------------------
<output-directory-path>\<name>_config.h
<output-directory-path>\<name>.h
<output-directory-path>\<name>.c
<output-directory-path>\<name>_data.h
<output-directory-path>\<name>_data.c
<output-directory-path>\<name>_data_params.h
<output-directory-path>\<name>_data_params.c
<output-directory-path>\<name>_rel.bin
<output-directory-path>\<name>_img_rel.c
<output-directory-path>\<name>_img_rel.h

Creating report file <output-directory-path>\network_generate_report.txt
...
```

Generate a relocatable binary file without the weights for an STM32F4 series; the weights/bias data are generated in a separate binary file (refer to the “Relocatable binary model support” article):

```shell
$ stedgeai generate -m <model_file_path> --target stm32f4 -o <output-directory-path> -n <name> --relocatable --binary
...
Generated files (11)
-----------------------------------------------------------
<output-directory-path>\<name>_config.h
<output-directory-path>\<name>.h
<output-directory-path>\<name>.c
<output-directory-path>\<name>_data.h
<output-directory-path>\<name>_data.c
<output-directory-path>\<name>_data_params.h
<output-directory-path>\<name>_data_params.c
<output-directory-path>\<name>_data.bin
<output-directory-path>\<name>_rel.bin
<output-directory-path>\<name>_img_rel.c
<output-directory-path>\<name>_img_rel.h

Creating report file <output-directory-path>\<name>_generate_report.txt
...
```
Particular network data c-file
The helper '--address' and
'--copy-weights-at' options are convenience options
to generate a specific ai_network_data_weights_get()
function. The returned address is passed to the
ai_<network>_init() function through the
ai_network_params structure (refer to
[[API]][X_CUBE_AI_API]). Note that this (including the copy
function) can be fully managed by the application code itself.
If the --binary (or --relocatable)
option is passed without the '--address' or
'--copy-weights-at' arguments, the following
network_data.c file is generated:
```c
#include "network_data.h"

ai_handle ai_network_data_weights_get(void)
{
  return AI_HANDLE_NULL;
}
```

Example of the generated network_data.c file with the --binary and --address 0x810000 options:
```c
#include "network_data.h"

#define AI_NETWORK_DATA_ADDR 0x810000

ai_handle ai_network_data_weights_get(void)
{
  return AI_HANDLE_PTR(AI_NETWORK_DATA_ADDR);
}
```

Example of the generated network_data.c file with the --binary, --address 0x810000, and --copy-weights-at 0xD0000000 options:
```c
#include <string.h>
#include "network_data.h"

#define AI_NETWORK_DATA_ADDR     0x810000
#define AI_NETWORK_DATA_DST_ADDR 0xD0000000

ai_handle ai_network_data_weights_get(void)
{
  memcpy((void *)AI_NETWORK_DATA_DST_ADDR, (const void *)AI_NETWORK_DATA_ADDR,
         AI_NETWORK_DATA_WEIGHTS_SIZE);
  return AI_HANDLE_PTR(AI_NETWORK_DATA_DST_ADDR);
}
```

Update an ioc-based project
For an X-CUBE-AI IDE project (ioc-based), only the generated
NN C-files can be updated. In this case,
the '--output' option is used to indicate the root
directory of the IDE project, that is, the location of the
'.ioc' file. The destination of the previous NN C-files
is automatically discovered in the source tree; otherwise, the output
directory is used.
```shell
$ stedgeai generate -m <model_path> --target stm32 -n <name> -c low -o <root_project_folder>
...
IOC file found in the output directory
...
Generated files (7)
-----------------------------------------------------------
<root_project_folder>\Inc\<name>_config.h
<root_project_folder>\Inc\<name>.h
<root_project_folder>\Src\<name>.c
<root_project_folder>\Inc\<name>_data_params.h
<root_project_folder>\Src\<name>_data_params.c
<root_project_folder>\Inc\<name>_data.h
<root_project_folder>\Src\<name>_data.c

Creating report file <root_project_folder>\<name>_generate_report.txt
```
or
```shell
$ stedgeai generate -m <model_path> --target stm32 --c-api st-ai -n <name> -c low -o <root_project_folder>
...
IOC file found in the output directory
...
Generated files (5)
-----------------------------------------------------------
<root_project_folder>\inc\<name>_details.h
<root_project_folder>\inc\<name>.h
<root_project_folder>\src\<name>.c
<root_project_folder>\inc\<name>_data.h
<root_project_folder>\src\<name>_data.c

Creating report file <root_project_folder>\<name>_generate_report.txt
...
```

For multiple-network support, the update mechanism for a
particular model is the same. Be careful to use the correct name
('--name my_name') to avoid overwriting or updating an
incorrect file, and to stay aligned with the multinetwork helper
functions ('app_x-cube-ai.c/.h' files), which are only generated by
the X-CUBE-AI UI. If the number of networks
changes, the X-CUBE-AI UI should be used to update the generated
c-models.
Update a proprietary source tree
The '--output' option is used to indicate the single
destination of the generated NN C-files. Note that an empty file
with the '.ioc' extension can be placed in the root
directory of the custom source tree to use the same discovery mechanism
as for the update of an
ioc-based project.
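As a sketch of this mechanism (the 'my_project' tree and the network name are placeholders), the marker file is simply an empty '.ioc' file created at the root of the custom source tree before running the generate command:

```shell
# Create a custom source tree with an empty .ioc marker file at its root
mkdir -p my_project/Inc my_project/Src
touch my_project/my_project.ioc

# The generated NN C-files are then dispatched as for an ioc-based
# project, e.g.:
#   stedgeai generate -m <model_path> --target stm32 -n network -o my_project
ls my_project
```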