Raptor is a domain-specific MLIR compiler for neural networks in ONNX format, targeting in-memory computing / processing-in-memory (PIM) architectures. It extends ONNX-MLIR with a PIM accelerator and progressively lowers ONNX-MLIR IR through custom MLIR dialects to simulator artifacts.

The current target is the PIM simulator stack under backend-simulators/pim. Raptor emits binary per-core .pim instruction files by default, plus memory.bin, config.json, and weight binaries. It can also emit per-core JSON instruction files with --pim-emit-json.

Overview

PIM architectures perform most computation directly in memory. The supported target models a chip with:

shared host memory,
multiple PIM cores,
ReRAM crossbars for vector-matrix / matrix-vector work,
explicit communication between cores,
no hardware branch or loop support in emitted simulator code.

Because the emitted code has no branches or loops, repeated work such as convolutions must eventually be written out explicitly, so emitted instruction counts can grow quickly. Most compiler work therefore focuses on lowering, scheduling, memory layout, and code-generation optimizations.
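As a rough illustration of that growth, the sketch below estimates how many crossbar-sized tiles a weight matrix splits into, and how many tile-level matrix-vector operations a convolution needs when every output position is emitted explicitly. The im2col-style weight layout, the function names, and the cost model are assumptions made for illustration; they are not taken from Raptor's lowering.

```python
import math


def crossbar_tiles(rows: int, cols: int, crossbar_size: int) -> int:
    """Tiles of crossbar_size x crossbar_size needed to hold a rows x cols matrix."""
    return math.ceil(rows / crossbar_size) * math.ceil(cols / crossbar_size)


def unrolled_conv_mvms(in_ch, out_ch, kernel, out_h, out_w, crossbar_size):
    """Tile-level matrix-vector operations for a fully unrolled convolution.

    Assumption: weights are laid out as an (in_ch * kernel * kernel) x out_ch
    matrix, and each output position triggers one operation per tile.
    """
    tiles = crossbar_tiles(in_ch * kernel * kernel, out_ch, crossbar_size)
    return tiles * out_h * out_w


# Hypothetical 3x3 convolution, 64 -> 128 channels, 56x56 output, 256-wide crossbars.
tiles = crossbar_tiles(64 * 3 * 3, 128, 256)
total = unrolled_conv_mvms(64, 128, 3, 56, 56, 256)
print(tiles, total)  # 3 9408
```

Even this single hypothetical layer yields thousands of explicit operations before any data movement is counted, which is why scheduling and code-generation optimizations matter.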

Targets and simulators

backend-simulators/pim/pim-simulator is the in-tree Rust functional simulator used by validation. It reads Raptor's pim/ artifact directory and compares simulator output against native ONNX-MLIR execution.
backend-simulators/pim/pimsim-nn is the performance simulator submodule. The helper scripts in pimcomp_utils/ are for comparison with PIMCOMP-NN and contain hard-coded local paths; treat them as local utilities, not portable workflows.

Compilation pipeline

The PIM sources live under src/PIM and tests under test/PIM. CMake exposes them to ONNX-MLIR through generated shim directories under onnx-mlir/src/Accelerators/PIM and onnx-mlir/test/accelerators/PIM.

High-level lowering flow:

ONNX-MLIR -> Spatial -> Pim (tensor) -> Pim (bufferized) -> PIM artifacts

ONNX -> Spatial (src/PIM/Conversion/ONNXToSpatial). Lowers supported ONNX ops into the spat dialect (src/PIM/Dialect/Spatial). Conversion patterns are split by op family under Patterns/{Math,NN,Tensor} and currently cover Conv, Gemm, MatMul, elementwise Add/Mul/Div, ReduceMean, pooling, Relu, Sigmoid, Softmax, Concat, Gather, Reshape, Resize, and Split.
Merge compute nodes (src/PIM/Dialect/Spatial/Transforms/MergeComputeNodes). Builds a compute graph, schedules it with the PEFT scheduler, and materializes the merge schedule into Spatial IR. Supporting scheduling code lives under MergeComputeNodes/Scheduling.
Spatial -> Pim (src/PIM/Conversion/SpatialToPim). Lowers Spatial operations to the pim dialect (src/PIM/Dialect/Pim), including pim.core, pim.core_batch, communication, tensor packing, global tensor materialization, and return-path normalization.
Bufferization (src/PIM/Dialect/Pim/Transforms/Bufferization). Converts tensor-semantics PIM IR into memref-semantics PIM IR using MLIR's bufferization interfaces.
Static memory coalescing (src/PIM/Dialect/Pim/Transforms/StaticMemoryCoalescing). Reuses compatible local memref allocations inside PIM cores before codegen.
PIM code generation (src/PIM/Pass/PimCodegen and src/PIM/Compiler). Folds host constants, materializes remaining host constants, verifies PIM IR, emits .pim core files, writes weights, and writes memory.bin / config.json.
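The static memory coalescing step can be pictured as lifetime-based buffer reuse. The sketch below is a generic greedy version of that idea, not Raptor's implementation: the Alloc record, the equal-size notion of "compatible", and the first-use/last-use indices are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Alloc:
    name: str
    size: int       # size of the local buffer (hypothetical unit)
    first_use: int  # index of the first operation touching the buffer
    last_use: int   # index of the last operation touching the buffer


def coalesce(allocs):
    """Assign each allocation to a slot, reusing a slot when sizes match
    and the previous occupant's lifetime has already ended."""
    slots = []       # each entry: (size, last_use of the current occupant)
    assignment = {}  # allocation name -> slot index
    for a in sorted(allocs, key=lambda x: x.first_use):
        for i, (size, busy_until) in enumerate(slots):
            if size == a.size and busy_until < a.first_use:
                slots[i] = (size, a.last_use)
                assignment[a.name] = i
                break
        else:
            # No compatible free slot: open a new one.
            slots.append((a.size, a.last_use))
            assignment[a.name] = len(slots) - 1
    return assignment, len(slots)


allocs = [
    Alloc("a", 64, 0, 2),
    Alloc("b", 64, 1, 4),
    Alloc("c", 64, 3, 5),  # can reuse a's slot, since a is dead after step 2
    Alloc("d", 32, 5, 6),  # different size, so it gets its own slot
]
assignment, slot_count = coalesce(allocs)
print(assignment, slot_count)  # {'a': 0, 'b': 1, 'c': 0, 'd': 2} 3
```

Four allocations fit in three slots here; the real pass works on memref allocations inside PIM cores and may use different compatibility rules.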

Supporting pieces:

src/PIM/Common - shared IR, filesystem, diagnostics, reports, and utility helpers.
src/PIM/Compiler - PIM compiler options, memory/address planning, binary instruction format, artifact writing, weight emission, and codegen entry points.
src/PIM/Conversion/SpatialToGraphviz - optional Spatial graphviz conversion pass.
src/PIM/Pass - pass registration and auxiliary passes.
src/PIM/PimAccelerator.{cpp,hpp} - ONNX-MLIR accelerator entry point.

Key compiler options

Pass these to onnx-mlir when compiling for PIM:

--maccel=PIM - select the PIM accelerator.
--EmitSpatial, --EmitPim, --EmitPimBufferized, --EmitPimCodegen - stop the PIM pipeline at the requested stage. The PIM default is --EmitPimCodegen.
--core-count=<N> - required positive core count for PIM compilation.
--crossbar-size=<N> - crossbar width/height. Default in code is 2.
--crossbar-count=<N> - crossbars per core. Default in code is 256.
--pim-merge-scheduler=peft - merge scheduler. peft is the only accepted value in the current code.
--pim-only-codegen - assume input is already bufferized PIM IR and only run the codegen tail.
--pim-emit-json - also emit core_*.json instruction files alongside core_*.pim.
--use-experimental-conv-impl - use the alternate convolution lowering.
--ignore-concat-error - soft-fail a ConcatOp corner case.

Example:

./build_release/Release/bin/onnx-mlir model.onnx -o /tmp/raptor/model \
  --maccel=PIM --EmitPimCodegen \
  --crossbar-size=2048 --crossbar-count=256 --core-count=1000

This writes PIM artifacts under /tmp/raptor/pim/.
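When driving the compiler from a script, the invocation above can be assembled programmatically. The following is a minimal sketch that uses only the flags listed in this section; the helper name build_pim_command and its parameters are hypothetical and not part of Raptor.

```python
import shlex

# Stage flags documented above; the PIM default is EmitPimCodegen.
STAGES = ("EmitSpatial", "EmitPim", "EmitPimBufferized", "EmitPimCodegen")


def build_pim_command(compiler, model, output_base, core_count,
                      crossbar_size=None, crossbar_count=None,
                      stage="EmitPimCodegen", emit_json=False):
    """Return the onnx-mlir argument list for a PIM compilation."""
    if core_count <= 0:
        raise ValueError("--core-count must be a positive integer")
    if stage not in STAGES:
        raise ValueError(f"unknown stage: {stage}")
    cmd = [compiler, model, "-o", output_base,
           "--maccel=PIM", f"--{stage}", f"--core-count={core_count}"]
    # Optional flags are only added when set, so compiler defaults apply otherwise.
    if crossbar_size is not None:
        cmd.append(f"--crossbar-size={crossbar_size}")
    if crossbar_count is not None:
        cmd.append(f"--crossbar-count={crossbar_count}")
    if emit_json:
        cmd.append("--pim-emit-json")
    return cmd


cmd = build_pim_command("./build_release/Release/bin/onnx-mlir", "model.onnx",
                        "/tmp/raptor/model", core_count=1000,
                        crossbar_size=2048, crossbar_count=256)
print(shlex.join(cmd))
```

The returned list can be passed directly to subprocess.run(cmd, check=True), which avoids shell quoting issues.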

Validation

Functional validation lives in validation/. It compiles ONNX models, builds a native ONNX-MLIR reference runner, generates random inputs, runs Raptor, runs the Rust PIM simulator, and compares outputs.

Python dependencies used by the validation scripts are numpy, onnx, and colorama. The simulator requires the Rust toolchain.

Per-operation validation from the repository root:

python3 validation/validate.py \
  --raptor-path build_release/Release/bin/onnx-mlir \
  --onnx-include-dir onnx-mlir/include \
  --core-count 1000

Validate one network or a subset by pointing --operations-dir at any directory containing .onnx files:

python3 validation/validate.py \
  --raptor-path build_release/Release/bin/onnx-mlir \
  --onnx-include-dir onnx-mlir/include \
  --operations-dir validation/networks/yolo11n/depth_04 \
  --crossbar-size 2048 --crossbar-count 256 --core-count 1000

Useful validation options:

--simulator-dir <path> - override the auto-detected backend-simulators/pim/pim-simulator path.
--threshold <float> - maximum allowed per-element output difference.
--seed <int> - RNG seed for generated inputs.
--command-timeout-seconds <float> - timeout for compiler, runner, and simulator subprocesses.
--verbose - print subprocess logs and average PIM pass timings.
--clean - remove generated validation artifacts and exit.
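The --threshold check amounts to bounding the largest per-element difference between the simulator output and the reference output. A minimal sketch of such a comparison in plain Python follows. The function names are hypothetical, the difference is assumed to be absolute rather than relative, and the validator's actual comparison may differ.

```python
def max_abs_diff(reference, simulated):
    """Largest absolute per-element difference between two flat sequences."""
    if len(reference) != len(simulated):
        raise ValueError("output lengths differ")
    return max((abs(r - s) for r, s in zip(reference, simulated)), default=0.0)


def within_threshold(reference, simulated, threshold):
    """True when every element differs by at most threshold."""
    return max_abs_diff(reference, simulated) <= threshold


reference = [1.0, 2.0, 3.0]
simulated = [1.0, 2.05, 3.0]
loose = within_threshold(reference, simulated, 0.1)
strict = within_threshold(reference, simulated, 0.01)
print(loose, strict)  # True False
```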

Each validation run writes artifacts in the model workspace, for example under validation/operations/gemm/small/:

inputs/ - generated input CSV files.
outputs/ - native ONNX-MLIR reference outputs.
raptor/ - compiler artifacts, including *.onnx.mlir, dialect dumps under dialects/, reports under reports/, and final PIM artifacts under pim/.
runner/ - generated reference runner source, build tree, and shared library.
simulation/out.bin - raw simulator output used for comparison.
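A quick way to see how far a validation run got is to check which of these entries exist. The sketch below does that with pathlib; the helper name missing_artifacts is hypothetical, and the demo runs on a throwaway directory rather than a real workspace.

```python
import tempfile
from pathlib import Path

# Relative paths taken from the workspace layout listed above.
EXPECTED = ("inputs", "outputs", "raptor/pim", "runner", "simulation/out.bin")


def missing_artifacts(workspace):
    """Return the expected workspace entries that do not exist."""
    root = Path(workspace)
    return [rel for rel in EXPECTED if not (root / rel).exists()]


# Demo on a temporary directory that mimics a partially completed run.
with tempfile.TemporaryDirectory() as tmp:
    for rel in ("inputs", "outputs", "raptor/pim"):
        (Path(tmp) / rel).mkdir(parents=True)
    missing = missing_artifacts(tmp)

print(missing)  # ['runner', 'simulation/out.bin']
```

Pointing missing_artifacts at a real workspace such as validation/operations/gemm/small/ works the same way.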

The compiler currently dumps dialect snapshots such as spatial0.mlir, spatial1_dcp_merged.mlir, pim0.mlir, pim1_buff.mlir, pim2_coalesced.mlir, pim3_folded.mlir, and pim4_materialized.mlir when an output directory is available.

To rerun the simulator manually with tracing after validation has produced a raptor/pim/ directory:

cd backend-simulators/pim/pim-simulator
cargo run --no-default-features --features tracing --release \
  --package pim-simulator --bin pim-simulator -- \
  -f /path/to/workspace/raptor/pim \
  -o /path/to/workspace/simulation/out.bin \
  -d <addr0>,<size0>,<addr1>,<size1>,...

With --features tracing, the simulator writes per-core traces as TraceCore0, TraceCore1, ... next to out.bin. The validator normally computes the -d ranges from raptor/pim/config.json and model output shapes.
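For manual runs, the -d argument is a flat comma-separated list of address/size pairs. The sketch below formats such a list from already known ranges. The helper name is hypothetical, the values are assumed to be non-negative decimal integers, and deriving the ranges from config.json is left to the validator because that file's layout is not described here.

```python
def format_dump_ranges(ranges):
    """Format (address, size) pairs as '<addr0>,<size0>,<addr1>,<size1>,...'."""
    parts = []
    for addr, size in ranges:
        if addr < 0 or size <= 0:
            raise ValueError(f"invalid range: addr={addr}, size={size}")
        parts.extend([str(addr), str(size)])
    return ",".join(parts)


# Hypothetical output ranges, for illustration only.
dump_arg = format_dump_ranges([(0, 16), (64, 32)])
print(dump_arg)  # 0,16,64,32
```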

Available validation networks under validation/networks/: vgg16, yolo11n, yolo11nv2.

Available operation suites under validation/operations/: add, concat, conv, div, gather, gemm, gemv, matmul, mul, pool, reduce_mean, relu, reshape, resize, sigmoid, softmax, split.

Generated operation tests can be regenerated with:

python3 validation/operations/gen_tests.py

Build

Initialize submodules first:

git submodule update --init --recursive

The project follows ONNX-MLIR's build requirements. The CI workflow documents the currently used versions and setup:

CMake 4.3.0 in CI,
LLVM/MLIR checked out under onnx-mlir/llvm-project,
Protobuf v34.0,
Rust stable for pim-simulator,
Python packages numpy, onnx, colorama for validation.

Protobuf

Install Protobuf if your system does not already provide a compatible version:

git clone --depth 1 --branch v34.0 https://github.com/protocolbuffers/protobuf
cmake -S protobuf -B protobuf/build -G Ninja \
  -DCMAKE_BUILD_TYPE=Release \
  -Dprotobuf_BUILD_TESTS=OFF
cmake --build protobuf/build
sudo cmake --install protobuf/build

You can then remove the temporary checkout:

rm -rf protobuf

MLIR

Follow the ONNX-MLIR instructions in onnx-mlir/docs/BuildOnLinuxOSX.md to build LLVM/MLIR. The local Raptor build expects MLIR_DIR to point at the MLIR CMake package, for example:

MLIR_DIR=$(pwd)/onnx-mlir/llvm-project/build_release/lib/cmake/mlir

If your LLVM build directory is named build instead of build_release, adjust the path accordingly.

Raptor

Configure a release build:

MLIR_DIR=$(pwd)/onnx-mlir/llvm-project/build_release/lib/cmake/mlir
cmake -S . -B build_release -G Ninja \
  -DCMAKE_BUILD_TYPE=Release \
  -DONNX_MLIR_ACCELERATORS=PIM \
  -DLLVM_ENABLE_ASSERTIONS=ON \
  -DMLIR_DIR=${MLIR_DIR}

Configure a debug build similarly:

MLIR_DIR=$(pwd)/onnx-mlir/llvm-project/build_debug/lib/cmake/mlir
cmake -S . -B build_debug -G Ninja \
  -DCMAKE_BUILD_TYPE=Debug \
  -DONNX_MLIR_ACCELERATORS=PIM \
  -DLLVM_ENABLE_ASSERTIONS=ON \
  -DMLIR_DIR=${MLIR_DIR}

For debug development, using mold can reduce link time and memory use:

cmake -S . -B build_debug -G Ninja \
  -DCMAKE_BUILD_TYPE=Debug \
  -DONNX_MLIR_ACCELERATORS=PIM \
  -DLLVM_ENABLE_ASSERTIONS=ON \
  -DMLIR_DIR=${MLIR_DIR} \
  -DCMAKE_EXE_LINKER_FLAGS="-fuse-ld=mold" \
  -DCMAKE_SHARED_LINKER_FLAGS="-fuse-ld=mold" \
  -DCMAKE_MODULE_LINKER_FLAGS="-fuse-ld=mold"

Build the compiler with CMake, using whichever build directory you configured:

cmake --build ./build_release
cmake --build ./build_debug

Do not invoke ninja directly for this project; use cmake --build so CMake's configuration and generated shims stay consistent.

If a build fails because fixed-width integer types (for example uint32_t) are undefined in Protobuf headers or Protobuf-generated files, patch the affected files by adding #include <cstdint>.

Tests

The Rust simulator has its own tests:

cd backend-simulators/pim/pim-simulator
cargo test

Repository Layout

src/PIM/ - PIM accelerator implementation.
test/PIM/ - PIM C++ unit tests.
validation/ - functional validation scripts, ONNX operation tests, network slices, and pimsim config generation.
backend-simulators/pim/pim-simulator/ - in-tree Rust functional simulator.
backend-simulators/pim/pimsim-nn/ - performance simulator submodule.
pimcomp_utils/ - local comparison helpers for PIMCOMP-NN.
.github/actions/ and .github/workflows/validate_operations.yml - CI setup for MLIR/Protobuf caching, building Raptor, and validation.