ESP32-S3 commercial preview

Ship more AI on the silicon you already use.

MicroQuant compiles production neural networks into target-specific firmware. On the validated ESP32-S3 workload (Speech Commands v2 DS-CNN at 240 MHz), it delivers 21.8% lower median latency and 91.0% less product flash than LiteRT Micro with ESP-NN.

Start an evaluation
See measured results

Built for ESP32-S3
LiteRT models
Production firmware

MicroQuant Engine MQ-09

KARQ: 14.29 ms / inference

21.8% lower latency than ESP-NN

Bit-exact board output

14.29 ms: KARQ P1 median latency

92.24%: W8 model accuracy

26.2 KB: KARQ P1 product flash

30.7 KB: KARQ P1 peak SRAM

Model coverage

Supported operators.
Customer-specific delivery.

The current compiler covers the building blocks behind dense networks and compact CNNs. Its architecture is extensible and adaptable to each customer's model, silicon, and production requirements.

Dense
Matrix multiplication
2D convolution
Depthwise convolution
Max pooling
Average pooling
Global average pooling
ReLU and clipping
Sigmoid and tanh
Flatten and reshape
Identity
Argmax and top-k

Built to adapt

Your model does not need to fit a generic runtime.

MicroQuant can be extended around customer-specific operators, graph patterns, memory budgets, and target hardware.

Discuss your model

Measured performance

Ahead of LiteRT Micro.
Even with ESP-NN enabled.

Speech Commands v2 DS-CNN on ESP32-S3 at 240 MHz. KARQ P1 delivers 21.8% lower median latency and uses 91.0% less product flash than the vendor-optimized reference path.

Best result: 1.28× ESP-NN throughput

Median latency by configuration (lower is better; percentages are relative to LiteRT Micro + ESP-NN)

Rank	Configuration	Execution path	Median latency	Label
REF	LiteRT Micro	Standard runtime	430.017 ms	Conventional
REF	LiteRT Micro + ESP-NN	Vendor optimized	18.281 ms	Conventional
01	KARQ P1	ESP32-S3 optimized	14.288 ms	−21.8%
02	W8	ESP32-S3 optimized	15.441 ms	−15.5%
03	PCQ4	ESP32-S3 optimized	15.734 ms	−13.9%
04	KARQ P4	ESP32-S3 optimized	17.007 ms	−7.0%
05	SPQ4 G8	ESP32-S3 optimized	37.646 ms	Optimized
06	SPQ4 G8	Portable runtime	99.128 ms	Portable
07	KARQ P1	Portable runtime	115.612 ms	Portable
08	KARQ P4	Portable runtime	130.364 ms	Portable
09	PCQ4	Portable runtime	232.199 ms	Portable
10	W8	Portable runtime	232.601 ms	Portable

Speech Commands v2 DS-CNN

Complete on-device comparison

Download results JSON

Configuration	Execution path	Accuracy	Median latency	p99 latency	Product flash	Peak SRAM
LiteRT Micro	Standard runtime	91.99%	430.017 ms	430.064 ms	262.9 KB	55.9 KB
LiteRT Micro + ESP-NN	Vendor optimized	91.99%	18.281 ms	18.361 ms	291.4 KB	56.4 KB
KARQ P1	ESP32-S3 optimized	91.40%	14.288 ms	14.291 ms	26.2 KB	30.7 KB
W8	ESP32-S3 optimized	92.24%	15.441 ms	15.446 ms	34.9 KB	30.7 KB
PCQ4	ESP32-S3 optimized	91.36%	15.734 ms	15.737 ms	24.8 KB	30.7 KB
KARQ P4	ESP32-S3 optimized	90.14%	17.007 ms	17.010 ms	35.0 KB	30.7 KB
SPQ4 G8	ESP32-S3 optimized	91.70%	37.646 ms	37.649 ms	34.3 KB	30.7 KB
SPQ4 G8	Portable runtime	91.70%	99.128 ms	99.130 ms	34.1 KB	30.7 KB
KARQ P1	Portable runtime	91.40%	115.612 ms	115.614 ms	26.0 KB	30.7 KB
KARQ P4	Portable runtime	90.14%	130.364 ms	130.366 ms	34.8 KB	30.7 KB
PCQ4	Portable runtime	91.36%	232.199 ms	232.202 ms	24.6 KB	30.7 KB
W8	Portable runtime	92.24%	232.601 ms	232.605 ms	34.7 KB	30.7 KB
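
The headline figures can be re-derived from the table. The short Python sketch below recomputes them from the LiteRT Micro + ESP-NN and KARQ P1 (ESP32-S3 optimized) rows; the helper name `reduction_pct` and the dictionary layout are illustrative choices, not part of any MicroQuant tooling.

```python
# Measured values copied from the comparison table above.
ESP_NN = {"median_ms": 18.281, "flash_kb": 291.4}   # LiteRT Micro + ESP-NN
KARQ_P1 = {"median_ms": 14.288, "flash_kb": 26.2}   # KARQ P1, ESP32-S3 optimized


def reduction_pct(reference: float, candidate: float) -> float:
    """Percentage by which `candidate` is lower than `reference`."""
    return (reference - candidate) / reference * 100.0


latency_reduction = reduction_pct(ESP_NN["median_ms"], KARQ_P1["median_ms"])
flash_reduction = reduction_pct(ESP_NN["flash_kb"], KARQ_P1["flash_kb"])
# Throughput is inversely proportional to latency for single inferences.
throughput_ratio = ESP_NN["median_ms"] / KARQ_P1["median_ms"]

print(f"latency reduction: {latency_reduction:.1f}%")   # 21.8%
print(f"flash reduction:   {flash_reduction:.1f}%")     # 91.0%
print(f"throughput ratio:  {throughput_ratio:.2f}x")    # 1.28x
```

The same helper applies to any other pair of rows, for example to compare peak SRAM between configurations.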

Bit-exact execution

Host and device outputs match exactly across every MicroQuant configuration.

Production memory discipline

Static execution arenas and zero runtime heap growth keep behavior predictable.

Reproducible delivery

Source, generated assets, build identity, and hardware results are bound into the checkpoint.
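
As an illustration of what a bit-exact check involves, the Python sketch below compares a host output buffer with a device output buffer byte for byte and derives a SHA-256 digest that could be stored next to the measured results. It is a minimal sketch under stated assumptions: the int8 output format, the sample values, and the function names are hypothetical and do not describe MicroQuant's actual verification flow.

```python
import hashlib
import struct


def pack_int8(values):
    """Pack a sequence of int8 values into raw bytes."""
    return struct.pack(f"{len(values)}b", *values)


def outputs_bit_exact(host: bytes, device: bytes) -> bool:
    """True only when both buffers have the same length and contents."""
    return host == device


def digest(buf: bytes) -> str:
    """SHA-256 hex digest, usable as a compact fingerprint of an output."""
    return hashlib.sha256(buf).hexdigest()


# Hypothetical output tensors; real buffers would come from the host
# reference run and from the board (for example, read back over serial).
host_out = pack_int8([-128, -3, 0, 17, 127])
device_out = pack_int8([-128, -3, 0, 17, 127])
off_by_one = pack_int8([-128, -3, 0, 18, 127])

print(outputs_bit_exact(host_out, device_out))   # True
print(outputs_bit_exact(host_out, off_by_one))   # False
print(digest(host_out) == digest(device_out))    # True
```

Comparing raw bytes instead of using a numeric tolerance is the point of the exercise: a single off-by-one quantized value counts as a mismatch.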

Commercial engagement

A short path from
model to measured pilot.

Start with the model you already have. We qualify the graph and target, build the best candidates, and return a board-tested integration with decision-ready evidence.

01 Model assessment
Share the model, target board, and product constraints. We confirm fit and define acceptance goals.

02 Optimization sprint
MicroQuant evaluates precision formats and compiles the strongest candidates for your target.

03 Hardware pilot
You receive integrated firmware, measured results, and a clear route to production licensing.
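
To make the idea of acceptance goals concrete, the Python sketch below filters measured configurations against two example thresholds and ranks the survivors by median latency. It reuses the ESP32-S3 optimized rows from the Speech Commands v2 results on this page as stand-in data. The thresholds, the `Result` type, and the `shortlist` function are hypothetical and are not a description of how MicroQuant selects candidates.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Result:
    name: str
    accuracy_pct: float
    median_ms: float
    flash_kb: float


# ESP32-S3 optimized rows from the on-device comparison table.
MEASURED = [
    Result("KARQ P1", 91.40, 14.288, 26.2),
    Result("W8", 92.24, 15.441, 34.9),
    Result("PCQ4", 91.36, 15.734, 24.8),
    Result("KARQ P4", 90.14, 17.007, 35.0),
    Result("SPQ4 G8", 91.70, 37.646, 34.3),
]


def shortlist(results, min_accuracy_pct, max_flash_kb):
    """Keep results that meet both goals, fastest first."""
    passing = [
        r for r in results
        if r.accuracy_pct >= min_accuracy_pct and r.flash_kb <= max_flash_kb
    ]
    return sorted(passing, key=lambda r: r.median_ms)


# Hypothetical acceptance goals: at least 91.0% accuracy, at most 30 KB flash.
picked = shortlist(MEASURED, min_accuracy_pct=91.0, max_flash_kb=30.0)
print([r.name for r in picked])   # ['KARQ P1', 'PCQ4']
```

With a looser flash budget and a stricter accuracy goal, the same function would instead surface W8, which has the highest measured accuracy in the table.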

Start a conversation

What could your model do with more room to run?

Tell us what you are building. We will reply with a focused technical assessment and the next practical step.

Typical first step: Model + target review

