TinyML Compiler Platform & Consulting
1. Model Ingestion
Upload a model file to compile, or configure synthetic weight simulation parameters below.
2. Interactive Weight Map
Hover over weights to inspect values. High-variance outlier rows are isolated into INT8 channels; the remaining channels are quantized into packed 4-bit blocks.
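As a rough illustration of the outlier split described above, the sketch below flags rows whose variance exceeds a threshold. The function name `flag_outlier_rows` and the fixed-threshold rule are assumptions made for this example; the compiler's actual selection criterion is not documented here.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: mark rows whose population variance exceeds `threshold`.
// Flagged rows would be kept in INT8; the rest would go to packed INT4 blocks.
std::vector<bool> flag_outlier_rows(const std::vector<std::vector<float>>& weights,
                                    float threshold) {
    assert(threshold >= 0.0f);
    std::vector<bool> is_outlier(weights.size(), false);
    for (std::size_t r = 0; r < weights.size(); ++r) {
        const std::vector<float>& row = weights[r];
        if (row.empty()) continue;

        float mean = 0.0f;
        for (float w : row) mean += w;
        mean /= static_cast<float>(row.size());

        float var = 0.0f;
        for (float w : row) var += (w - mean) * (w - mean);
        var /= static_cast<float>(row.size());

        is_outlier[r] = var > threshold;
    }
    return is_outlier;
}
```

A real pipeline might instead rank rows and keep a fixed budget of INT8 channels; the threshold form is just the simplest variant to show.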
Serialization Layer (2xINT4 → 1xByte)
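A minimal sketch of the 2xINT4 → 1xByte packing, assuming the first weight goes in the low nibble and the second in the high nibble. The nibble order and the function names are assumptions for illustration, not the platform's documented layout.

```cpp
#include <cassert>
#include <cstdint>

// Pack two signed 4-bit values (each in [-8, 7]) into one byte.
// Assumed layout: `lo` in bits 0-3, `hi` in bits 4-7.
inline std::uint8_t pack_int4_pair(std::int8_t lo, std::int8_t hi) {
    assert(lo >= -8 && lo <= 7 && hi >= -8 && hi <= 7);
    return static_cast<std::uint8_t>((static_cast<std::uint8_t>(lo) & 0x0F) |
                                     ((static_cast<std::uint8_t>(hi) & 0x0F) << 4));
}

// Sign-extend a 4-bit two's-complement nibble to int8_t.
inline std::int8_t sign_extend4(std::uint8_t nibble) {
    return static_cast<std::int8_t>(((nibble & 0x0F) ^ 0x08) - 0x08);
}

inline std::int8_t unpack_lo(std::uint8_t b) { return sign_extend4(b & 0x0F); }
inline std::int8_t unpack_hi(std::uint8_t b) { return sign_extend4(static_cast<std::uint8_t>(b >> 4)); }
```

For example, packing -3 and 5 gives the byte 0x5D, and unpacking that byte returns the same pair.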
3. Generated C++ Assets
Direct compilation output (`model_assets.h`), with arrays declared `alignas(4)` for bare-metal execute-in-place (XIP) use.
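For orientation, a generated header could look roughly like the excerpt below. Every identifier and byte value here (`kLayer0Weights`, `kLayer0BlockShift`, `check_asset_alignment`) is hypothetical; the real `model_assets.h` may be organized differently.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical excerpt of a generated header; names and values are illustrative.
constexpr std::size_t kBlockSize = 32;                // INT4 weights per block
constexpr std::size_t kPackedBytes = kBlockSize / 2;  // two weights per byte

// const data with 4-byte alignment, so a kernel reading straight from flash
// can use aligned 32-bit word loads.
alignas(4) const std::uint8_t kLayer0Weights[kPackedBytes] = {
    0x5D, 0x21, 0xF0, 0x37, 0x8F, 0x12, 0x00, 0x7F,
    0x43, 0xA1, 0x0E, 0x55, 0x90, 0x26, 0xC3, 0x18,
};

// One scale shift per 32-element block (a single block in this excerpt).
alignas(4) const std::int8_t kLayer0BlockShift[1] = {3};

static_assert(sizeof(kLayer0Weights) == kPackedBytes, "unexpected packed size");

// Debug-time sanity check a firmware image might run once at startup.
inline void check_asset_alignment() {
    assert(reinterpret_cast<std::uintptr_t>(kLayer0Weights) % 4 == 0);
    assert(reinterpret_cast<std::uintptr_t>(kLayer0BlockShift) % 4 == 0);
}
```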
Competitive Performance Benchmark
Quantitative analysis comparing MicroQuant vs. standard Google TensorFlow Lite for Microcontrollers (TFLu) across identical network operators.
Empirical Benchmark Table
| Metric | Google TFLite Micro | MicroQuant Profile A | Divergence / Savings |
|---|---|---|---|
| Static Flash Footprint | 8.00 KB | 5.46 KB | 31.69% reduction |
| Dynamic Heap Allocation | Variable Arena | 0.0 KB (Zero-Alloc) | No heap used |
| Latency (Inference Time) | 2.00 µs | 1.40 µs | 1.43x speedup |
| Accuracy | 94.2% | 93.9% | -0.3 pp (Outlier Shielded) |
MicroQuant uses `__SMUAD` intrinsics to execute two 16-bit multiplies and sum their products in a single clock cycle, which accounts for much of the latency gain.
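The dual multiply performed by `__SMUAD` can be modeled in portable C++ for host-side testing. The sketch below is a reference model only: `smuad_ref`, `pack16`, and `dot_q15` are names invented for this example, and on a Cortex-M target the CMSIS intrinsic would replace `smuad_ref`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Put two signed 16-bit values into one 32-bit word (lo in bits 0-15).
inline std::uint32_t pack16(std::int16_t lo, std::int16_t hi) {
    return static_cast<std::uint32_t>(static_cast<std::uint16_t>(lo)) |
           (static_cast<std::uint32_t>(static_cast<std::uint16_t>(hi)) << 16);
}

// Reference model of SMUAD: lo(x)*lo(y) + hi(x)*hi(y) on signed halfwords.
// The sum is formed in 64 bits and truncated to 32 to avoid signed overflow.
inline std::int32_t smuad_ref(std::uint32_t x, std::uint32_t y) {
    const std::int64_t x_lo = static_cast<std::int16_t>(x & 0xFFFFu);
    const std::int64_t x_hi = static_cast<std::int16_t>(x >> 16);
    const std::int64_t y_lo = static_cast<std::int16_t>(y & 0xFFFFu);
    const std::int64_t y_hi = static_cast<std::int16_t>(y >> 16);
    const std::int64_t sum = x_lo * y_lo + x_hi * y_hi;
    return static_cast<std::int32_t>(static_cast<std::uint32_t>(sum));
}

// Dot product that consumes two elements per step; n must be even.
// The caller is assumed to keep the 32-bit accumulator within range.
inline std::int32_t dot_q15(const std::int16_t* a, const std::int16_t* b, std::size_t n) {
    assert(n % 2 == 0);
    std::int32_t acc = 0;
    for (std::size_t i = 0; i < n; i += 2) {
        acc += smuad_ref(pack16(a[i], a[i + 1]), pack16(b[i], b[i + 1]));
    }
    return acc;
}
```

With `a = {1, -2, 3, 4}` and `b = {5, 6, -7, 8}`, `dot_q15` returns 4 after two iterations instead of four scalar multiply steps.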
Footprint & Execution Graphs
Static Flash Weight Size (KB)
Inference Speedup Factor (Higher is Better)
Micro-Architectural Highlights & Hardening
- Dynamic Block-Wise Radix Scaling: Selects a power-of-two scale shift n_b for each 32-element block, so outlier values are preserved without overflowing the accumulator.
- SWAR INT4 Unpacking: Uses SIMD Within A Register (SWAR) techniques to load and unpack adjacent INT4 parameters into registers in parallel.
- Cortex-M SIMD MACs: Compiles into optimized parallel multiply-accumulates utilizing __SMUAD instructions.
- Padding Hardening: Zero interior struct padding bytes, with buffers padded to 64-byte lines for cache coherency.
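The first two highlights can be sketched as follows. This is an illustrative host-side version under stated assumptions: the shift rule (smallest right shift that makes every INT8 value in a block fit the INT4 range) and the names `choose_block_shift` and `swar_unpack8` are not taken from the product, and accumulator-overflow analysis is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::size_t kBlockSize = 32;

// Assumed rule: pick the smallest right shift n_b such that every value in the
// block fits [-8, 7]. Relies on arithmetic right shift of negative values.
inline int choose_block_shift(const std::int8_t* block) {
    for (int n_b = 0; n_b < 4; ++n_b) {
        bool fits = true;
        for (std::size_t i = 0; i < kBlockSize; ++i) {
            const int v = block[i] >> n_b;
            if (v < -8 || v > 7) { fits = false; break; }
        }
        if (fits) return n_b;
    }
    return 4;  // any int8 value fits INT4 after a shift of 4
}

// Sign-extend four nibbles held in the low half of each byte, all at once.
// Multiplying the isolated sign bits by 0x1E sets bits 4-7 of the same byte.
inline std::uint32_t sign_extend_nibbles(std::uint32_t x) {
    const std::uint32_t sign = x & 0x08080808u;
    return x | (sign * 0x1Eu);
}

// Unpack eight INT4 values from one 32-bit word into two words of four
// signed bytes each: `even` gets the low nibbles, `odd` the high nibbles.
inline void swar_unpack8(std::uint32_t packed, std::uint32_t& even, std::uint32_t& odd) {
    even = sign_extend_nibbles(packed & 0x0F0F0F0Fu);
    odd = sign_extend_nibbles((packed >> 4) & 0x0F0F0F0Fu);
}
```

For the word 0x00008F5D, the low nibbles 0xD and 0xF unpack to the bytes 0xFD and 0xFF (-3 and -1), and the high nibbles 0x5 and 0x8 unpack to 0x05 and 0xF8 (5 and -8), with no per-element branching.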
HIL Cycle-Accurate Execution Trace
Open-Core Dual-Licensing Compliance Model
Understanding the interplay between Copyleft GNU AGPLv3 protection and Commercial enterprise exemptions.
Open-Source Copyleft Track
The open-source C++ runtime engine is licensed under the GNU Affero General Public License, version 3 (AGPLv3).
- Copyleft Reciprocity: Any production deployment or distribution of the runtime engine requires releasing the source code of your entire integrated product stack under the GNU AGPLv3.
- Network Interaction Trigger: Under AGPLv3 section 13, if you run a modified version of the engine and let users interact with it remotely over a network (e.g., via a cloud API or SaaS), you must offer those users the corresponding source code.
- Sandbox Evaluation Exemption: Reviewing the codebase, running local validation benchmarks, or evaluating feasibility for hiring and recruitment in non-production environments is strictly exempt from AGPLv3 obligations.
- No Commercial Guarantees: The community track is provided "as-is" without commercial warranties, intellectual property indemnification, or dedicated support SLAs.
Commercial Licensing & Exemptions
For enterprise engineering teams seeking proprietary safety, complete source isolation, and hardware-specific compilation.
- Commercial Copyleft Waiver: Removes all open-source source code disclosure requirements, allowing clean integration into closed-source commercial firmware and products.
- Intellectual Property Protection: Full commercial-grade IP indemnification and liability coverage, ensuring peace of mind for high-stakes enterprise deployments.
- Non-Production Evaluation Waiver: Academic verification, technical due diligence, and recruitment evaluations in non-production environments operate under a zero-liability sandbox waiver.
- Target Adaptation Support: Direct co-engineering support to optimize the compiler backend specifically for custom hardware targets, including ARM Helium/MVE, custom SIMD layouts, and strict RTOS structures.
Request Technical Architecture Evaluation
Need custom microcontroller optimizations, RISC-V/DSP assembly acceleration, or proprietary licensing? Schedule a deep-dive consulting review.
Enterprise Core Competencies
We convert edge-computing optimization from an engineering hurdle into a massive competitive advantage.
Assembly-Layer Custom Kernels
We write optimized vector kernels using ARM CMSIS, RISC-V RVV, and custom Tensilica DSP intrinsics to maximize throughput.
Advanced PTQ & Outlier Balancing
We customize quantization channels based on activation outlier distributions, securing high classification precision under extreme integer scaling.
EULA Dual-Licensing Architecture
Seamlessly integrate our zero-allocation compiler pipeline into your commercial products without triggering copyleft requirements.