diff --git a/.gitignore b/.gitignore
index e319701..05dd772 100644
--- a/.gitignore
+++ b/.gitignore
@@ -7,10 +7,7 @@
 
 CMakeUserPresets.json
 
-build
-build_release
-cmake-build-debug
-cmake-build-release
+build_*
 compile.sh
 
 **/__*
diff --git a/AGENTS.md b/AGENTS.md
index 59749e6..37021ff 100644
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -1,26 +1,31 @@
-- Always read the full README.md before building or running any commands.
-- Build command: `cmake --build /home/nico/raptor/raptor/cmake-build-release --target onnx-mlir -j 30`
-- Never use `ninja` directly — it bypasses cmake's configuration and invalidates the build cache.
+- Always read the full README.md before doing anything.
+- Build commands:
+    - `cmake --build ./build_release --target onnx-mlir -j 30`
+    - `cmake --build ./build_debug --target onnx-mlir -j 30`
+- Never use `ninja` directly: it bypasses cmake's configuration and invalidates the build cache.
 
 # Code changes
 
 - Keep changes minimal and localized to the relevant parts of the code.
 - Preserve the existing naming conventions and coding style used in the surrounding code.
-- Keep code easy to read, well organized, and suitable for future extensibility.
+- Keep code easy to read, well organized, and suitable for future extensibility. A function must not exceed
+  200-250 lines, to keep it readable and limit cognitive complexity.
 - Prefer clear naming and structure over comments. Add comments only when they materially improve clarity.
 - Do not rename symbols, move files, or restructure modules unless that is necessary for the requested change.
 
 # Working style
 
 - Infer style and conventions from the existing code before introducing new patterns.
-- When several implementation options are possible, prefer the simplest one that fits the current architecture and minimizes churn.
+- When several implementation options are possible, prefer the simplest one that fits the current architecture and
+  minimizes churn.
 - Avoid broad refactors unless I explicitly ask for them.
 
 # Responses
 
 - When showing code in chat, make it easy to copy-paste into the codebase.
 - Keep outputs focused on the changed parts.
-- At the end of the response, briefly list any bad practices, mistakes, or cleaner alternatives you noticed, separate from the main solution.
+- At the end of the response, briefly list any bad practices, mistakes, or cleaner alternatives you noticed, separate
+  from the main solution.
 
 # Guidelines
 
@@ -29,6 +34,7 @@
 **Don't assume. Don't hide confusion. Surface tradeoffs.**
 
 Before implementing:
+
 - State your assumptions explicitly. If uncertain, ask.
 - If multiple interpretations exist, present them - don't pick silently.
 - If a simpler approach exists, say so. Push back when warranted.
@@ -39,8 +45,6 @@ Before implementing:
 **Minimum code that solves the problem. Nothing speculative.**
 
 - No features beyond what was asked.
-- No abstractions for single-use code.
-- No "flexibility" or "configurability" that wasn't requested.
 - No error handling for impossible scenarios.
 - If you write 200 lines and it could be 50, rewrite it.
 
@@ -51,14 +55,16 @@ Ask yourself: "Would a senior engineer say this is overcomplicated?" If yes, sim
 **Touch only what you must. Clean up only your own mess.**
 
 When editing existing code:
+
 - Don't "improve" adjacent code, comments, or formatting.
 - Don't refactor things that aren't broken.
 - Match existing style, even if you'd do it differently.
 - If you notice unrelated dead code, mention it - don't delete it.
 
 When your changes create orphans:
+
 - Remove imports/variables/functions that YOUR changes made unused.
-- Don't remove pre-existing dead code unless asked.
+- Don't remove pre-existing dead code unless asked, but mention it.
 
 The test: Every changed line should trace directly to the user's request.
 
@@ -67,11 +73,13 @@ The test: Every changed line should trace directly to the user's request.
 **Define success criteria. Loop until verified.**
 
 Transform tasks into verifiable goals:
+
 - "Add validation" → "Write tests for invalid inputs, then make them pass"
 - "Fix the bug" → "Write a test that reproduces it, then make it pass"
 - "Refactor X" → "Ensure tests pass before and after"
 
 For multi-step tasks, state a brief plan:
+
 ```
 1. [Step] → verify: [check]
 2. [Step] → verify: [check]
@@ -81,5 +89,3 @@ For multi-step tasks, state a brief plan:
 Strong success criteria let you loop independently. Weak criteria ("make it work") require constant clarification.
 
 ---
-
-**These guidelines are working if:** fewer unnecessary changes in diffs, fewer rewrites due to overcomplication, and clarifying questions come before implementation rather than after mistakes.
diff --git a/README.md b/README.md
index e569cf6..5a52b70 100644
--- a/README.md
+++ b/README.md
@@ -43,7 +43,7 @@ framework-level details).
 High-level lowering flow:
 
 ```
-ONNX-MLIR ──► Spatial ──► Pim (tensor) ──► Pim (bufferized) ──► PIM JSON
+ONNX-MLIR ──► Spatial ──► Pim (tensor) ──► Pim (bufferized) ──► PIM code
 ```
 
 1. **ONNX → Spatial** (`src/PIM/Conversion/ONNXToSpatial`).
@@ -55,7 +55,7 @@ ONNX-MLIR ──► Spatial ──► Pim (tensor) ──► Pim (bufferized) 
    Conversion patterns are split by op family under
    `Conversion/ONNXToSpatial/Patterns/{Math,NN,Tensor}` (Conv, Gemm, MatMul,
    Elementwise, ReduceMean, Pool, Relu, Sigmoid, Softmax, Concat, Gather,
-   Reshape, Resize, Split).
+   Reshape, Resize, Split, etc.).
 
 2. **Spatial → Pim** (`src/PIM/Conversion/SpatialToPim`).
    Lowers Spatial to the `pim` dialect (`src/PIM/Dialect/Pim`), which
@@ -63,10 +63,7 @@ ONNX-MLIR ──► Spatial ──► Pim (tensor) ──► Pim (bufferized) 
    (`pim.send` / `pim.receive`), halts, and crossbar-level operations.
 
 3. **Merge compute nodes** (`src/PIM/Dialect/Spatial/Transforms/MergeComputeNodes`).
-   A DCP-inspired heuristic (Dynamic Critical Path — see the original
-   scheduling paper by Kwok & Ahmad,
-   [DCP-eScience2007](https://clouds.cis.unimelb.edu.au/papers/DCP-eScience2007.pdf))
-   that coarsens the virtual node graph and decides how to group compute
+   A PEFT heuristic that coarsens the virtual node graph and decides how to group compute
    nodes onto cores. Our implementation is only DCP-*inspired*: it is a
    heuristic with different assumptions from the paper (different cost
    model, constraints from crossbar capacity / core resources, and a