Raptor
Raptor is a domain-specific MLIR compiler for neural networks in ONNX format, targeting in-memory computing / processing-in-memory (PIM) architectures. It extends ONNX-MLIR with a PIM accelerator and progressively lowers ONNX-MLIR IR through custom MLIR dialects to simulator artifacts.
The current target is the PIM simulator stack under backend-simulators/pim.
Raptor emits binary per-core .pim instruction files by default, plus
memory.bin, config.json, and weight binaries. It can also emit per-core JSON
instruction files with --pim-emit-json.
Overview
PIM architectures perform most computation directly in memory. The supported target models a chip with:
- shared host memory,
- multiple PIM cores,
- ReRAM crossbars for vector-matrix / matrix-vector work,
- explicit communication between cores,
- no hardware branch or loop support in emitted simulator code.
Because repeated work such as convolutions is eventually made explicit, emitted instruction counts can grow quickly. Most compiler work therefore focuses on lowering, scheduling, memory layout, and code-generation optimizations.
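To get a feel for how quickly unrolled work grows, the following back-of-the-envelope sketch counts crossbar matrix-vector operations for one convolution when every output position is emitted explicitly. The tiling model (weights flattened to a (C_in * K * K) x C_out matrix and split into crossbar-sized tiles) and the function name are assumptions made for illustration; this is not Raptor's actual lowering or cost model.

```python
import math


def unrolled_mvm_count(c_in, c_out, kernel, out_h, out_w, crossbar_size):
    """Estimate explicit matrix-vector ops for one convolution (illustrative model)."""
    rows = c_in * kernel * kernel  # flattened input patch length
    cols = c_out                   # one weight column per output channel
    # Assumed tiling: the weight matrix is cut into crossbar_size x crossbar_size tiles.
    tiles = math.ceil(rows / crossbar_size) * math.ceil(cols / crossbar_size)
    # Without loops, every output position needs its own emitted operations.
    return tiles * out_h * out_w


if __name__ == "__main__":
    # 3x3 convolution, 64 -> 128 channels, 56x56 output, 256x256 crossbars
    print(unrolled_mvm_count(64, 128, 3, 56, 56, 256))  # prints 9408
```

Even this small layer yields thousands of explicit operations under the assumed model, which is why lowering and scheduling choices have such a large effect on the size of the emitted code.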
Targets and simulators
- backend-simulators/pim/pim-simulator is the in-tree Rust functional simulator used by validation. It reads Raptor's pim/ artifact directory and compares simulator output against native ONNX-MLIR execution.
- backend-simulators/pim/pimsim-nn is the performance simulator submodule.
- The helper scripts in pimcomp_utils/ are for comparison with PIMCOMP-NN and contain local paths; treat them as local utilities, not portable workflows.
Compilation pipeline
The PIM sources live under src/PIM and tests under test/PIM. CMake exposes
them to ONNX-MLIR through generated shim directories under
onnx-mlir/src/Accelerators/PIM and onnx-mlir/test/accelerators/PIM.
High-level lowering flow:
ONNX-MLIR -> Spatial -> Pim (tensor) -> Pim (bufferized) -> PIM artifacts
- ONNX -> Spatial (src/PIM/Conversion/ONNXToSpatial). Lowers supported ONNX ops into the spat dialect (src/PIM/Dialect/Spatial). Conversion patterns are split by op family under Patterns/{Math,NN,Tensor} and currently cover Conv, Gemm, MatMul, elementwise Add/Mul/Div, ReduceMean, pooling, Relu, Sigmoid, Softmax, Concat, Gather, Reshape, Resize, and Split.
- Merge compute nodes (src/PIM/Dialect/Spatial/Transforms/MergeComputeNodes). Builds a compute graph, schedules it with the PEFT scheduler, and materializes the merge schedule into Spatial IR. Supporting scheduling code lives under MergeComputeNodes/Scheduling.
- Spatial -> Pim (src/PIM/Conversion/SpatialToPim). Lowers Spatial operations to the pim dialect (src/PIM/Dialect/Pim), including pim.core, pim.core_batch, communication, tensor packing, global tensor materialization, and return-path normalization.
- Bufferization (src/PIM/Dialect/Pim/Transforms/Bufferization). Converts tensor-semantics PIM IR into memref-semantics PIM IR using MLIR's bufferization interfaces.
- Static memory coalescing (src/PIM/Dialect/Pim/Transforms/StaticMemoryCoalescing). Reuses compatible local memref allocations inside PIM cores before codegen.
- PIM code generation (src/PIM/Pass/PimCodegen and src/PIM/Compiler). Folds host constants, materializes remaining host constants, verifies PIM IR, emits .pim core files, writes weights, and writes memory.bin / config.json.
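The idea behind the static memory coalescing step can be illustrated with a small greedy sketch: two local allocations may share one buffer when they have the same size and their live ranges do not overlap. The definition of "compatible" used here (equal size, disjoint lifetime), the tuple layout, and the function name are assumptions for illustration; the actual pass operates on memref allocations in PIM IR and may apply different rules.

```python
def coalesce(allocs):
    """Map each allocation to a buffer id, reusing buffers whose occupant is dead.

    allocs: list of (name, size, first_use, last_use) tuples (hypothetical layout).
    """
    buffers = []     # each entry: [size, last_use of the current occupant]
    assignment = {}
    for name, size, first, last in sorted(allocs, key=lambda a: a[2]):
        for idx, buf in enumerate(buffers):
            # Reuse only a same-sized buffer whose previous occupant is no longer live.
            if buf[0] == size and buf[1] < first:
                buf[1] = last
                assignment[name] = idx
                break
        else:
            # No compatible buffer is free: allocate a new one.
            buffers.append([size, last])
            assignment[name] = len(buffers) - 1
    return assignment


if __name__ == "__main__":
    allocs = [("a", 64, 0, 2), ("b", 64, 3, 5), ("c", 32, 1, 4), ("d", 64, 1, 6)]
    print(coalesce(allocs))  # "a" and "b" end up sharing buffer 0
```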
Supporting pieces:
- src/PIM/Common - shared IR, filesystem, diagnostics, reports, and utility helpers.
- src/PIM/Compiler - PIM compiler options, memory/address planning, binary instruction format, artifact writing, weight emission, and codegen entry points.
- src/PIM/Conversion/SpatialToGraphviz - optional Spatial graphviz conversion pass.
- src/PIM/Pass - pass registration and auxiliary passes.
- src/PIM/PimAccelerator.{cpp,hpp} - ONNX-MLIR accelerator entry point.
Key compiler options
Pass these to onnx-mlir when compiling for PIM:
- --maccel=PIM - select the PIM accelerator.
- --EmitSpatial, --EmitPim, --EmitPimBufferized, --EmitPimCodegen - stop the PIM pipeline at the requested stage. The PIM default is --EmitPimCodegen.
- --core-count=<N> - required positive core count for PIM compilation.
- --crossbar-size=<N> - crossbar width/height. Default in code is 2.
- --crossbar-count=<N> - crossbars per core. Default in code is 256.
- --pim-merge-scheduler=peft - merge scheduler. peft is the only accepted value in the current code.
- --pim-only-codegen - assume input is already bufferized PIM IR and only run the codegen tail.
- --pim-emit-json - also emit core_*.json instruction files alongside core_*.pim.
- --use-experimental-conv-impl - use the alternate convolution lowering.
- --ignore-concat-error - soft-fail a ConcatOp corner case.
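When driving the compiler from a script, it can help to assemble these options in one place. The sketch below builds an argument list using only the flags listed above; the helper name build_pim_command and its parameters are hypothetical and not part of Raptor.

```python
def build_pim_command(onnx_mlir, model, output, core_count,
                      crossbar_size=None, crossbar_count=None,
                      stage="EmitPimCodegen", emit_json=False):
    """Build an onnx-mlir argument list for PIM compilation (hypothetical helper)."""
    stages = ("EmitSpatial", "EmitPim", "EmitPimBufferized", "EmitPimCodegen")
    if stage not in stages:
        raise ValueError(f"unknown stage: {stage}")
    if core_count <= 0:
        raise ValueError("--core-count must be a positive integer")
    cmd = [onnx_mlir, model, "-o", output,
           "--maccel=PIM", f"--{stage}", f"--core-count={core_count}"]
    # Omitted options fall back to the compiler's own defaults.
    if crossbar_size is not None:
        cmd.append(f"--crossbar-size={crossbar_size}")
    if crossbar_count is not None:
        cmd.append(f"--crossbar-count={crossbar_count}")
    if emit_json:
        cmd.append("--pim-emit-json")
    return cmd


if __name__ == "__main__":
    print(" ".join(build_pim_command(
        "./build_release/Release/bin/onnx-mlir", "model.onnx",
        "/tmp/raptor/model", core_count=1000,
        crossbar_size=2048, crossbar_count=256)))
```

The returned list can be passed directly to subprocess.run without shell quoting.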
Example:
./build_release/Release/bin/onnx-mlir model.onnx -o /tmp/raptor/model \
--maccel=PIM --EmitPimCodegen \
--crossbar-size=2048 --crossbar-count=256 --core-count=1000
This writes PIM artifacts under /tmp/raptor/pim/.
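A quick way to check what a compile produced is to list the artifact names mentioned in this README. The sketch below is a hypothetical helper, not part of Raptor; it looks only for core_*.pim, core_*.json, memory.bin, and config.json, and deliberately skips the weight binaries because their naming is not described here.

```python
from pathlib import Path


def summarize_artifacts(pim_dir):
    """Collect the artifact names this README mentions from a pim/ directory."""
    root = Path(pim_dir)
    return {
        "core_pim": sorted(p.name for p in root.glob("core_*.pim")),
        "core_json": sorted(p.name for p in root.glob("core_*.json")),
        "has_memory_bin": (root / "memory.bin").is_file(),
        "has_config_json": (root / "config.json").is_file(),
    }


if __name__ == "__main__":
    print(summarize_artifacts("/tmp/raptor/pim"))
```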
Validation
Functional validation lives in validation/. It compiles ONNX models, builds a
native ONNX-MLIR reference runner, generates random inputs, runs Raptor, runs
the Rust PIM simulator, and compares outputs.
Python dependencies used by the validation scripts are numpy, onnx, and
colorama. The simulator requires the Rust toolchain.
Per-operation validation from the repository root:
python3 validation/validate.py \
--raptor-path build_release/Release/bin/onnx-mlir \
--onnx-include-dir onnx-mlir/include \
--core-count 1000
Validate one network or a subset by pointing --operations-dir at any directory
containing .onnx files:
python3 validation/validate.py \
--raptor-path build_release/Release/bin/onnx-mlir \
--onnx-include-dir onnx-mlir/include \
--operations-dir validation/networks/yolo11n/depth_04 \
--crossbar-size 2048 --crossbar-count 256 --core-count 1000
Useful validation options:
- --simulator-dir <path> - override the auto-detected backend-simulators/pim/pim-simulator path.
- --threshold <float> - maximum allowed per-element output difference.
- --seed <int> - RNG seed for generated inputs.
- --command-timeout-seconds <float> - timeout for compiler, runner, and simulator subprocesses.
- --verbose - print subprocess logs and average PIM pass timings.
- --clean - remove generated validation artifacts and exit.
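The --threshold option can be read as a bound on the largest per-element difference between reference and simulator outputs. The sketch below shows that check on plain Python lists; using the absolute difference and a less-than-or-equal comparison are assumptions, and the function names are hypothetical rather than taken from validate.py.

```python
def max_abs_diff(reference, simulated):
    """Largest absolute per-element difference between two equal-length sequences."""
    if len(reference) != len(simulated):
        raise ValueError("output lengths differ")
    return max((abs(r - s) for r, s in zip(reference, simulated)), default=0.0)


def outputs_match(reference, simulated, threshold):
    """True when no element differs by more than threshold (assumed semantics)."""
    return max_abs_diff(reference, simulated) <= threshold


if __name__ == "__main__":
    ref = [1.0, 2.0, 3.0]
    sim = [1.0, 2.0005, 3.0]
    print(outputs_match(ref, sim, 1e-3), outputs_match(ref, sim, 1e-4))
```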
Each validation run writes artifacts in the model workspace, for example under
validation/operations/gemm/small/:
- inputs/ - generated input CSV files.
- outputs/ - native ONNX-MLIR reference outputs.
- raptor/ - compiler artifacts, including *.onnx.mlir, dialect dumps under dialects/, reports under reports/, and final PIM artifacts under pim/.
- runner/ - generated reference runner source, build tree, and shared library.
- simulation/out.bin - raw simulator output used for comparison.
The compiler currently dumps dialect snapshots such as spatial0.mlir,
spatial1_dcp_merged.mlir, pim0.mlir, pim1_buff.mlir,
pim2_coalesced.mlir, pim3_folded.mlir, and
pim4_materialized.mlir when an output directory is available.
To rerun the simulator manually with tracing after validation has produced a
raptor/pim/ directory:
cd backend-simulators/pim/pim-simulator
cargo run --no-default-features --features tracing --release \
--package pim-simulator --bin pim-simulator -- \
-f /path/to/workspace/raptor/pim \
-o /path/to/workspace/simulation/out.bin \
-d <addr0>,<size0>,<addr1>,<size1>,...
With --features tracing, the simulator writes per-core traces as
TraceCore0, TraceCore1, ... next to out.bin. The validator normally
computes the -d ranges from raptor/pim/config.json and model output shapes.
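When running the simulator by hand, the -d argument has to be assembled manually. The sketch below formats a list of (address, shape) pairs into the addr,size,... form shown above. It assumes sizes are byte counts, elements are 4 bytes wide, and addresses are written in decimal; none of these are confirmed by this README, and how output addresses are read from config.json is not covered. The function name is hypothetical.

```python
import math


def dump_ranges_arg(outputs, elem_bytes=4):
    """Format (address, shape) pairs as 'addr0,size0,addr1,size1,...'.

    Assumes size = number of elements * elem_bytes (bytes); adjust if the
    simulator expects a different unit.
    """
    parts = []
    for addr, shape in outputs:
        size = math.prod(shape) * elem_bytes
        parts += [str(addr), str(size)]
    return ",".join(parts)


if __name__ == "__main__":
    # Two hypothetical outputs: a 1x10 tensor at address 0 and a 2x3 tensor at 4096.
    print(dump_ranges_arg([(0, (1, 10)), (4096, (2, 3))]))  # prints 0,40,4096,24
```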
Available validation networks under validation/networks/: vgg16,
yolo11n, yolo11nv2.
Available operation suites under validation/operations/: add, concat,
conv, div, gather, gemm, gemv, matmul, mul, pool,
reduce_mean, relu, reshape, resize, sigmoid, softmax, split.
Generated operation tests can be regenerated with:
python3 validation/operations/gen_tests.py
Build
Initialize submodules first:
git submodule update --init --recursive
The project follows ONNX-MLIR's build requirements. The CI workflow documents the currently used versions and setup:
- CMake 4.3.0 in CI,
- LLVM/MLIR checked out under onnx-mlir/llvm-project,
- Protobuf v34.0,
- Rust stable for pim-simulator,
- Python packages numpy, onnx, colorama for validation.
Protobuf
Install Protobuf if your system does not already provide a compatible version:
git clone --depth 1 --branch v34.0 https://github.com/protocolbuffers/protobuf
cmake -S protobuf -B protobuf/build -G Ninja \
-DCMAKE_BUILD_TYPE=Release \
-Dprotobuf_BUILD_TESTS=OFF
cmake --build protobuf/build
sudo cmake --install protobuf/build
You can then remove the temporary checkout:
rm -rf protobuf
MLIR
Follow the ONNX-MLIR instructions in
onnx-mlir/docs/BuildOnLinuxOSX.md to build LLVM/MLIR. The local Raptor build
expects MLIR_DIR to point at the MLIR CMake package, for example:
MLIR_DIR=$(pwd)/onnx-mlir/llvm-project/build_release/lib/cmake/mlir
If your LLVM build directory is named build instead of build_release, adjust
the path accordingly.
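If you are unsure which build directory name was used, a small probe can locate the MLIR CMake package by looking for MLIRConfig.cmake, the package file that MLIR installs under lib/cmake/mlir. The helper name and the list of candidate directory names are assumptions for illustration.

```python
from pathlib import Path


def find_mlir_dir(llvm_project, build_names=("build_release", "build")):
    """Return the first <build>/lib/cmake/mlir containing MLIRConfig.cmake, else None."""
    for name in build_names:
        candidate = Path(llvm_project) / name / "lib" / "cmake" / "mlir"
        if (candidate / "MLIRConfig.cmake").is_file():
            return candidate
    return None


if __name__ == "__main__":
    print(find_mlir_dir("onnx-mlir/llvm-project"))
```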
Raptor
Configure a release build:
MLIR_DIR=$(pwd)/onnx-mlir/llvm-project/build_release/lib/cmake/mlir
cmake -S . -B build_release -G Ninja \
-DCMAKE_BUILD_TYPE=Release \
-DONNX_MLIR_ACCELERATORS=PIM \
-DLLVM_ENABLE_ASSERTIONS=ON \
-DMLIR_DIR=${MLIR_DIR}
Configure a debug build similarly:
MLIR_DIR=$(pwd)/onnx-mlir/llvm-project/build_debug/lib/cmake/mlir
cmake -S . -B build_debug -G Ninja \
-DCMAKE_BUILD_TYPE=Debug \
-DONNX_MLIR_ACCELERATORS=PIM \
-DLLVM_ENABLE_ASSERTIONS=ON \
-DMLIR_DIR=${MLIR_DIR}
For debug development, using mold can reduce link time and memory use:
cmake -S . -B build_debug -G Ninja \
-DCMAKE_BUILD_TYPE=Debug \
-DONNX_MLIR_ACCELERATORS=PIM \
-DLLVM_ENABLE_ASSERTIONS=ON \
-DMLIR_DIR=${MLIR_DIR} \
-DCMAKE_EXE_LINKER_FLAGS="-fuse-ld=mold" \
-DCMAKE_SHARED_LINKER_FLAGS="-fuse-ld=mold" \
-DCMAKE_MODULE_LINKER_FLAGS="-fuse-ld=mold"
Build the compiler with CMake:
cmake --build ./build_release
cmake --build ./build_debug
Do not invoke ninja directly for this project; use cmake --build so CMake's
configuration and generated shims stay consistent.
If a build fails because Protobuf headers are missing fixed-width integer
definitions, patch the affected Protobuf-generated files by adding
#include <cstdint>.
Tests
The Rust simulator has its own tests:
cd backend-simulators/pim/pim-simulator
cargo test
Repository Layout
- src/PIM/ - PIM accelerator implementation.
- test/PIM/ - PIM C++ unit tests.
- validation/ - functional validation scripts, ONNX operation tests, network slices, and pimsim config generation.
- backend-simulators/pim/pim-simulator/ - in-tree Rust functional simulator.
- backend-simulators/pim/pimsim-nn/ - performance simulator submodule.
- pimcomp_utils/ - local comparison helpers for PIMCOMP-NN.
- .github/actions/ and .github/workflows/validate_operations.yml - CI setup for MLIR/Protobuf caching, building Raptor, and validation.