Introduction to Edge AI Solutions with NXP's NPU-equipped i.MX/MCX

In recent years, Artificial Intelligence (AI) has rapidly expanded, becoming increasingly integral to all aspects of our lives and work. Applications of AI, such as autonomous driving and text/image generation, are also evolving rapidly, accompanied by a growing trend toward Edge AI, where AI is embedded within devices. This page provides an explanation of Edge AI, using real devices as examples.

What is Edge AI?

Edge AI is a technology in which data is processed directly on the device rather than being sent to the cloud. Removing the round trip to a remote server improves real-time performance and efficiency. Traditional cloud-based AI systems often incur delays from data transmission, which limits their effectiveness in applications that require immediate responses.
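
The latency argument above can be made concrete with a rough model. The following is a minimal sketch, not a measurement: the function names and all example values (payload size, uplink rate, round-trip time, inference times) are hypothetical and chosen only for illustration.

```python
def cloud_latency_ms(payload_bytes: int, uplink_bps: float,
                     round_trip_ms: float, cloud_inference_ms: float) -> float:
    """Estimate end-to-end latency when the input is uploaded to a cloud service.

    Simplified model: upload time + network round trip + inference time.
    Queuing, retries, and response size are ignored.
    """
    upload_ms = payload_bytes * 8 / uplink_bps * 1000.0
    return upload_ms + round_trip_ms + cloud_inference_ms


def edge_latency_ms(edge_inference_ms: float) -> float:
    """On-device inference has no network component in this model."""
    return edge_inference_ms


if __name__ == "__main__":
    # Hypothetical values: 100 kB image, 10 Mbit/s uplink, 50 ms round trip,
    # 5 ms inference on a cloud server, 30 ms inference on the edge device.
    cloud = cloud_latency_ms(100_000, 10e6, 50.0, 5.0)
    edge = edge_latency_ms(30.0)
    print(f"cloud: {cloud:.1f} ms, edge: {edge:.1f} ms")
```

Under these assumed values the network terms dominate the cloud path even though the cloud server itself is faster at inference, which is the effect described above. With a faster link or a smaller payload the balance can shift.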

Cloud-Based AI and Edge AI

Another significant advantage of Edge AI is its independence from network connectivity. Processing large amounts of data locally instead of transmitting it to the cloud reduces communication costs. It also improves data privacy, because sensitive information that stays on the device is exposed to fewer security risks. Edge AI is therefore particularly valuable in remote locations or areas with unreliable internet access, where local processing keeps the system operating dependably even without reliable cloud access.

Advantages of Edge AI compared to Cloud-Based AI

Edge AI is anticipated to find widespread application across numerous sectors. It is particularly valuable in applications demanding high-speed data processing and immediate responses, such as autonomous vehicles, smart homes, Industrial IoT, and medical devices. In autonomous vehicles, for example, the instantaneous detection of road conditions and obstacles is crucial, making the implementation of Edge AI essential. In this way, Edge AI presents a new paradigm for data processing and is poised to bring about innovative changes in numerous industries.

NXP's Edge AI Solutions

To achieve high-speed processing in the field of Edge AI, NXP offers processors and microcontrollers incorporating Neural Processing Units (NPUs). An NPU is a specialized coprocessor designed for AI inference, capable of rapidly handling extensive computations, thereby significantly enhancing efficiency and overall performance. Below is an introduction to NXP's i.MX 93 application processor and MCX N94x/54x microcontrollers, both of which feature integrated NPUs.

i.MX 93 Application Processor

The i.MX 93 application processor targets Edge AI applications by combining an Arm® Cortex®-A55 processor running at up to 1.7 GHz with an Arm® Ethos™-U65 microNPU.

MCIMX93-EVK

i.MX 93 Block Diagram

Multi-core Processing
  • Arm® Cortex®-A55 × 1 ⁄ × 2 @ 1.7 GHz
  • Arm® Cortex®-M33 @ 250 MHz
  • Arm® Ethos™-U65 microNPU
  • EdgeLock® Secure Enclave
Connectivity
  • USB 2.0 Type C x 2 (with PHY)
  • Gb Ethernet x 2: AVB and IEEE 1588 for synchronization, and EEE for low power (one with TSN support)
  • CAN-FD × 2
  • UART × 8, I2C × 8, SPI × 8, I3C × 2
  • 4-channel, 12-bit ADC × 1
  • 32-pin FlexIO interface (camera, bus, or serial I/O) x 2
External Memory
  • Up to 3.7 GT/s x16 LPDDR4 / LPDDR4X (with inline ECC support)
  • SD 3.0 ⁄ SDIO3.0 ⁄ eMMC5.1 × 3
  • Octal SPI x 1, supporting SPI NOR and SPI NAND memory
Display Interface
  • 1080p60 MIPI-DSI (4 lanes, 1.5 Gbps / lane) x 1, with PHY
  • 720p60 LVDS (4 lanes) x 1
  • 24-bit Parallel RGB
Audio
  • I2S TDM x 7 (32-bit @ 768 kHz), SPDIF Tx / Rx
  • 8-channel PDM microphone input
  • MQS: Medium-quality sound output (Sigma-Delta modulator)
Operating System
  • Linux® OS
  • FreeRTOS
  • INTEGRITY
  • QNX
  • VxWorks
Temperature Range
  • 0 °C to +95 °C (Commercial)
  • -40 °C to +105 °C (Industrial)
  • -40 °C to +125 °C (Automotive ⁄ Extended Industrial)
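
As a back-of-the-envelope check on two figures from the list above, the sketch below compares the active-pixel data rate of a 1080p60 stream with the aggregate MIPI-DSI link rate (4 lanes at 1.5 Gbps per lane), and derives the theoretical peak bandwidth of a x16 LPDDR4 interface at 3.7 GT/s. The function names are hypothetical, 24 bits per pixel is an assumption, and blanking intervals, protocol overhead, and memory efficiency are ignored, so real requirements and achievable rates will differ.

```python
def active_pixel_rate_gbps(width: int, height: int, fps: int,
                           bits_per_pixel: int) -> float:
    """Data rate of the visible pixels only, in Gbit/s (no blanking, no overhead)."""
    return width * height * fps * bits_per_pixel / 1e9


def link_rate_gbps(lanes: int, gbps_per_lane: float) -> float:
    """Aggregate raw link rate across all lanes, in Gbit/s."""
    return lanes * gbps_per_lane


def peak_memory_bandwidth_gbytes(gigatransfers_per_s: float,
                                 bus_width_bits: int) -> float:
    """Theoretical peak memory bandwidth in GB/s: transfers/s x bytes per transfer."""
    return gigatransfers_per_s * bus_width_bits / 8


if __name__ == "__main__":
    # Assumption: 24 bits per pixel (8 bits each for R, G, B).
    pixels = active_pixel_rate_gbps(1920, 1080, 60, 24)
    link = link_rate_gbps(4, 1.5)
    print(f"1080p60 active pixels: {pixels:.2f} Gbit/s of {link:.1f} Gbit/s link")
    print(f"LPDDR4 x16 @ 3.7 GT/s peak: {peak_memory_bandwidth_gbytes(3.7, 16):.1f} GB/s")
```

Under these assumptions the visible pixel data occupies roughly half of the raw link rate, which leaves room for the blanking and protocol overhead that this estimate leaves out.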

MCX N94x / 54x Microcontroller

The MCX N94x and N54x microcontrollers provide energy-efficient, high-performance Edge AI solutions by integrating an Arm® Cortex®-M33 processor running at up to 150 MHz with an eIQ® Neutron NPU.

FRDM-MCXN947

MCX N94x Block Diagram

Core Platform
  • Arm® Cortex®-M33 @ 150 MHz (Dual Core)
  • DSP accelerator (PowerQUAD, with coprocessor interface)
  • SmartDMA (coprocessor for applications such as parallel camera interface and keypad scan)
  • eIQ® Neutron N1-16 Neural Processing Unit
  • Power Line Communication (PLC) Controller
Memory
  • Up to 2 MB of on-chip flash memory (2 x 1 MB banks), with support for flash swapping and read-while-write (RWW)
  • Cache engine using 16 KB RAM
  • Up to 512 KB RAM, configurable up to 416 KB with ECC (supports 1-bit correction 2-bit detection)
  • Up to 4 × 8 KB of ECC RAM retained in VBAT mode
  • FlexSPI with 16 KB cache, supporting XIP and Octal/Quad SPI flash, HyperFlash, HyperRAM, and Xccela memory types
Peripherals (Analog)
  • 16-bit ADC x 4 (single-ended) or 16-bit ADC x 2 (differential)
  • Built-in temperature sensors for each ADC
  • 3 high-speed comparators with 17 input pins and an 8-bit DAC as internal references
  • 12-bit DAC x 2 (max sample rate 1.0 Msamples/sec)
  • 14-bit DAC x 1 (max sample rate 10 Msamples/sec)
  • Programmable Gain Amplifier
  • Differential Amplifier
  • Instrumentation Amplifier
  • Transconductance Amplifier
  • High-precision VREF: ±0.15%, 15 ppm/°C drift
Peripherals (Timer)
  • Features five 32-bit standard general-purpose asynchronous timers/counters, supporting up to 4 capture inputs and 4 compare outputs, PWM mode, and external count input. Specific timer events can be selected to generate DMA requests.
  • SCTimer ⁄ PWM
  • LPTimer
  • Frequency Measurement Timer
  • Multi-rate Timer
  • Window Watchdog Timer
  • RTC with calendar function
  • Micro Timer
  • OS Event Timer
Peripherals (Communication Interfaces)
  • USB High-Speed (Host / Device), with on-chip HS PHY
  • USB Full-Speed (Host / Device), with on-chip FS PHY, USB Device
  • uSDHC (MicroSD High-Speed Card Interface)
  • LP Flexcomm × 10 (each supports SPI, I2C, UART)
  • FlexCAN × 2 (with FD), I3C × 2, SAI × 2
  • Ethernet x 1 with QoS support
  • FlexIO x 1 (programmable as various serial and parallel interfaces, such as display driver or camera interface)
  • EMV Smart Card Interface x 2
  • Programmable Logic Unit (PLU)
Peripherals (Motor Control Subsystem)
  • eFlexPWM x 2 (each with 4 sub-modules, providing 12 PWM outputs)
  • Quadrature Encoder / Decoder (ENC) x 2
  • Event Generator (AND / OR / INVERT) module x 1, supporting up to 8 output triggers
  • SINC Filter module (3rd order, 5ch, break signal connection to PWM)
Security
  • EdgeLock® Secure Enclave, Core Profile
  • Encryption services (AES-256, SHA-2, ECC NIST P-256, TRNG, includes key generation/derivation)
  • Secure key store with key usage policies (protection for platform integrity, manufacturing, and application keys)
  • Device-unique ID based on Physically Unclonable Function (PUF)
  • Device authentication supporting Device Identifier Composition Engine (DICE)
  • Secure connection and TLS support
  • Over-the-air key management through pre-integration with NXP EdgeLock® 2GO
  • EdgeLock® Accelerator (Public Key Cryptography)
  • Immutable secure boot code in ROM
  • Dual secure boot modes (asymmetric mode and fast, post-quantum secure symmetric mode)
  • Support for secure firmware updates
  • Device lifecycle management including secure authenticated debug
  • High-performance on-the-fly memory encryption with additional authentication for external flash
  • Protected Flash Region (PFR)
  • Code Watchdog x 2
  • Intrusion and Tamper Response Controller (ITRC)
  • 8 active and passive tamper pin detections
  • Voltage, temperature, light, clock tamper detection
  • Voltage glitch detection
  • Secure manufacturing in untrusted facilities and protection against IP theft
  • Arm® TrustZone® for Cortex®-M

Comparison of AI Processing Speeds Between NPUs and Regular CPUs

Processors equipped with an NPU can run AI inference much faster on the NPU than on the CPU alone. As an example, we measured the speed of face recognition AI processing on the MCX N94x board, once using the NPU and once using only the CPU.

Results of Comparing Face Recognition AI Processing Speed Between NPU and CPU

This experiment compared the time required for face recognition AI processing using an NPU versus a CPU. The inference time was 24 milliseconds with the NPU and 869 milliseconds with the CPU, so the NPU performed the AI processing approximately 36 times faster than the CPU (869 / 24 ≈ 36). The exact difference in speed depends on the AI model and software configuration used, but these results suggest that offloading AI tasks to an NPU can substantially improve processing speed.
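
The ratio between the two measured inference times can be checked with a few lines of arithmetic. The sketch below uses only the 24 ms and 869 ms figures reported above; the function names are hypothetical, and the inferences-per-second values are derived from those two numbers under the assumption that inferences run back to back with no other work in between.

```python
# Inference times reported in the text, in milliseconds.
NPU_MS = 24
CPU_MS = 869


def speedup(baseline_ms: float, accelerated_ms: float) -> float:
    """How many times faster the accelerated path is than the baseline."""
    return baseline_ms / accelerated_ms


def inferences_per_second(latency_ms: float) -> float:
    """Upper bound on throughput if inferences run back to back."""
    return 1000.0 / latency_ms


if __name__ == "__main__":
    print(f"speedup: {speedup(CPU_MS, NPU_MS):.1f}x")
    print(f"NPU: {inferences_per_second(NPU_MS):.1f} inferences/s")
    print(f"CPU: {inferences_per_second(CPU_MS):.2f} inferences/s")
```

Expressed as throughput, the same measurements correspond to roughly 42 inferences per second on the NPU versus about one per second on the CPU, which is the difference between keeping up with a live camera feed and processing occasional snapshots.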

Summary

The use of AI is expanding beyond the cloud to edge devices, and the integration of NPUs into these devices is driving further gains in performance. This page has given an overview of this trend, introduced NXP devices that incorporate NPUs, and shown a measured example of the speedup an NPU can provide. Solutions built on NPU-equipped NXP devices offer fast and efficient AI processing. For details, please refer to the official NXP website.
