Ruffian

Computation as a Native Mode of Thought

A GPU-native virtual machine that runs inside language model inference

The Problem with Tool Use

When an LLM needs to compute something, the standard approach looks like this:

User: What is 7 * 6?

LLM thinks: "I should use my calculator tool"
LLM outputs: {"tool": "calculator", "expression": "7 * 6"}

--- network round trip to tool server ---

Tool returns: {"result": 42}

--- another forward pass ---

LLM outputs: "The answer is 42"

Multiple forward passes. Network latency. JSON parsing. CPU orchestration.

The LLM doesn't compute. It asks someone else to compute.

What if computation were native?

Imagine if the LLM could simply think the answer:

User: What is 7 * 6?

LLM outputs: "7 × 6 = [CALC:7*6] = VM[42]"
              ↑              ↑
              Thinks in math  GPU computes inline

One forward pass. Zero network calls. The computation happens
inside the same GPU cycles that generate the tokens.

This is what Ruffian does.

The Key Insight

Tokens are just bytes. There's nothing sacred about them being text.

A language model is a function: f(tokens) → next_token

We've trained that function on text. But the machinery doesn't care.
It's just matrix multiplications producing probability distributions.

What if some tokens meant "execute this computation"?
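
In code, the hook is almost nothing. A minimal sketch, using the token IDs
from the protocol appendix (the upper bound of the reserved range is an
assumption):

/* Sketch: does this sampled token belong to the VM, or is it plain text?
   Token IDs follow the ranges in the token protocol appendix; TOK_VM_LAST
   is an assumed upper bound. */
#include <stdbool.h>
#include <stdint.h>

#define TOK_VM_BEGIN  65    /* first reserved VM token          */
#define TOK_VM_LAST  127    /* last reserved VM token (assumed) */

static bool is_vm_token(int32_t token) {
    return token >= TOK_VM_BEGIN && token <= TOK_VM_LAST;
}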

Three Architectures

┌─────────────────────────────────────────────────────────────────────────────┐
│  1. STANDARD LLM                                                            │
│                                                                             │
│     ┌──────────┐    ┌──────────┐    ┌──────────┐    ┌──────────┐            │
│     │  Embed   │ →  │ Transform│ →  │  Logits  │ →  │  Sample  │ → token    │
│     │  (GPU)   │    │  (GPU)   │    │  (GPU)   │    │  (CPU)   │            │
│     └──────────┘    └──────────┘    └──────────┘    └──────────┘            │
│                                                                             │
│     The model generates text. That's all it can do.                         │
└─────────────────────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────────────────────┐
│  2. LLM + TOOL USE                                                          │
│                                                                             │
│     ┌──────────┐    ┌──────────┐    ┌──────────┐    ┌──────────┐            │
│     │  Embed   │ →  │ Transform│ →  │  Logits  │ →  │  Sample  │ → token    │
│     └──────────┘    └──────────┘    └──────────┘    └────┬─────┘            │
│                                                          │                  │
│                    ┌─────────────────────────────────────▼──────────────┐   │
│                    │  CPU: Parse JSON → Call API → Wait → Inject result │   │
│                    └────────────────────────────────────────────────────┘   │
│                                                                             │
│     Computation happens outside. CPU orchestrates. Latency accumulates.     │
└─────────────────────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────────────────────┐
│  3. RUFFIAN                                                                 │
│                                                                             │
│     ┌──────────┐    ┌──────────┐    ┌──────────┐    ┌──────────────────┐    │
│     │  Embed   │ →  │ Transform│ →  │  Logits  │ →  │ Sample + VM      │    │
│     │  (GPU)   │    │  (GPU)   │    │  (GPU)   │    │ (GPU)            │    │
│     └──────────┘    └──────────┘    └──────────┘    └──────────────────┘    │
│                                                          │                  │
│                                       ┌──────────────────▼──────────────┐   │
│                                       │ If VM token: execute on GPU     │   │
│                                       │ Inject result as new tokens     │   │
│                                       └─────────────────────────────────┘   │
│                                                                             │
│     Computation happens inside. Zero CPU in the hot path. Native.           │
└─────────────────────────────────────────────────────────────────────────────┘

How Token Sampling Works (Background)

To understand Ruffian, you need to understand the token generation loop:

┌─────────────────────────────────────────────────────────────────┐
│                    TOKEN GENERATION LOOP                        │
│                                                                 │
│  1. GPU: Run transformer on current tokens → logits[vocab_size] │
│                                                                 │
│  2. CPU: Sample from logits → next_token                        │
│          (temperature, top-p, etc.)                             │
│                                                                 │
│  3. CPU: Append next_token to sequence                          │
│                                                                 │
│  4. If not done: goto 1                                         │
└─────────────────────────────────────────────────────────────────┘

The bottleneck is the CPU-GPU round trip at step 2.

Ruffian intercepts step 2. If the sampled token triggers the VM,
computation happens before returning to step 1.
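
In sketch form (placeholder names, not the actual llama.cpp integration),
the modified loop looks like this:

/* Sketch of the modified generation loop. run_transformer(), sample(),
   and vm_step() are placeholder names, not real llama.cpp calls. */
#include <stdbool.h>
#include <stdint.h>

typedef struct llm_ctx    llm_ctx;     /* opaque model context   */
typedef struct vm_state_t vm_state_t;  /* per-sequence VM state  */

float  *run_transformer(llm_ctx *ctx, const int32_t *tokens, int n_tokens);
int32_t sample(const float *logits, int vocab_size);
bool    is_vm_token(int32_t token);
void    vm_step(vm_state_t *vm, int32_t tok, int32_t *tokens, int *n_tokens);

void generate(llm_ctx *ctx, vm_state_t *vm, int32_t *tokens,
              int n_tokens, int max_tokens, int vocab_size) {
    while (n_tokens < max_tokens) {
        float  *logits = run_transformer(ctx, tokens, n_tokens);  /* step 1 */
        int32_t tok    = sample(logits, vocab_size);              /* step 2 */

        if (is_vm_token(tok)) {
            /* Ruffian: hand the token to the VM before the next forward
               pass. It may buffer it, execute, or inject result tokens.  */
            vm_step(vm, tok, tokens, &n_tokens);
        } else {
            tokens[n_tokens++] = tok;                             /* step 3 */
        }
    }
}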

The VM State Machine

┌─────────────────────────────────────────────────────────────────┐
│                        VM MODES                                 │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│   NORMAL ──[VM_BEGIN]──► RECORDING ──[VM_END]──► EXECUTING      │
│     │                        │                        │         │
│     │◄───────────────────────┴────────────────────────┘         │
│     │                                                           │
│     └──[VM_READ]──► Inject result tokens into stream            │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘

NORMAL:    Pass tokens through unchanged.
RECORDING: Buffer tokens as a program.
EXECUTING: Run the VM on GPU. Store the result.
VM_READ:   Emit the stored result as digit tokens.
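
In C, the transitions are a small switch. A sketch (struct and function
names are illustrative; the real definitions live in ruffian-structures.h):

/* Sketch of the per-sequence mode transitions. Names are illustrative,
   not the real definitions in ruffian-structures.h.                   */
#include <stdint.h>

#define TOK_VM_BEGIN 65
#define TOK_VM_END   66
#define VM_PROG_MAX  64

enum vm_mode { VM_NORMAL, VM_RECORDING, VM_EXECUTING };

struct vm_seq {
    enum vm_mode mode;
    int32_t      program[VM_PROG_MAX];  /* buffered program tokens */
    int32_t      prog_len;
    int64_t      result;                /* last computed value     */
};

int64_t vm_execute(const int32_t *prog, int32_t len);  /* GPU kernel in the real path */

void vm_on_token(struct vm_seq *vm, int32_t tok) {
    switch (vm->mode) {
    case VM_NORMAL:
        if (tok == TOK_VM_BEGIN) { vm->prog_len = 0; vm->mode = VM_RECORDING; }
        /* TOK_VM_READ is handled by the sampler: it emits vm->result
           back into the stream as digit tokens.                        */
        break;
    case VM_RECORDING:
        if (tok == TOK_VM_END) {
            vm->mode   = VM_EXECUTING;
            vm->result = vm_execute(vm->program, vm->prog_len);
            vm->mode   = VM_NORMAL;
        } else if (vm->prog_len < VM_PROG_MAX) {
            vm->program[vm->prog_len++] = tok;
        }
        break;
    case VM_EXECUTING:
        break;  /* transient: execution completes before the next sample */
    }
}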

Example: What Actually Happens

User prompt: "Calculate 7 times 6"

LLM generates: "7 × 6 = "

LLM samples: [CALC:        ← NORMAL → RECORDING (buffer: empty)
LLM samples: 7             ← buffer: [7]
LLM samples: *             ← buffer: [7, *]
LLM samples: 6             ← buffer: [7, *, 6]
LLM samples: ]             ← RECORDING → EXECUTING
                              GPU parses "7*6", computes 42
                              Result stored: [4, 2]
                           ← EXECUTING → NORMAL

LLM samples: " "           ← normal token, pass through
LLM samples: =             ← normal token, pass through
LLM samples: " "           ← normal token, pass through
LLM samples: VM[           ← trigger VM_READ
                              Inject stored result: "42"

Final output: "7 × 6 = [CALC:7*6] = VM[42]"

Not Just Arithmetic

The VM isn't a calculator. It's a Lisp interpreter running on GPU.

[LISP:(define (fib n)
        (if (< n 2) n
          (+ (fib (- n 1))
             (fib (- n 2)))))]

[LISP:(fib 10)] = VM[55]

Full recursion. Garbage collection. Lambda calculus.
All executing on Metal shaders during token generation.

Currently working:

  • Arithmetic with proper precedence: 2 + 3 * 4 = 14
  • Parentheses: (2 + 3) * 4 = 20
  • Unary operators: -5, 3 + (-5) = -2
  • Nested Lisp: (+ 2 (* 3 4)) = 14

The Architecture

┌─────────────────────────────────────────────────────────────────┐
│                     GPU MEMORY LAYOUT                           │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│   ┌─────────────────┐   ┌─────────────────┐   ┌──────────────┐  │
│   │   LLM Weights   │   │    KV Cache     │   │   VM State   │  │
│   │   (read-only)   │   │    (per seq)    │   │  (per seq)   │  │
│   │                 │   │                 │   │              │  │
│   │   Billions of   │   │   Attention     │   │  Stack[32]   │  │
│   │   parameters    │   │   history       │   │  Regs[8]     │  │
│   │                 │   │                 │   │  Program[64] │  │
│   └─────────────────┘   └─────────────────┘   └──────────────┘  │
│                                                                 │
│   The VM state is tiny: ~1KB per sequence.                      │
│   It lives alongside the KV cache, which is already per-seq.    │
└─────────────────────────────────────────────────────────────────┘
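
A sketch of that per-sequence block, sized to match the diagram above (the
real layout is in ruffian-structures.h and may differ):

/* Sketch of the per-sequence VM state, sized to match the diagram above. */
#include <stdint.h>

typedef struct {
    int64_t stack[32];    /* evaluation stack              256 B */
    int64_t regs[8];      /* registers                      64 B */
    int32_t program[64];  /* buffered program tokens       256 B */
    int32_t prog_len;
    int32_t sp;           /* stack pointer                       */
    int32_t mode;         /* NORMAL / RECORDING / EXECUTING      */
    int64_t result;       /* last computed value                 */
} ruffian_seq_state;      /* ~600 bytes: well under 1 KB per sequence */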

Single-Threaded, But That's Fine

The VM executes single-threaded on one GPU core.

Why that's okay:

  • Token generation is already sequential (can't parallelize autoregression)
  • VM execution happens during the sampling step
  • Total VM time << transformer forward pass time
  • The parallelism that matters is batch parallelism

┌──────────────────────────────────────────────────────────────┐
│  Batch size = 8:                                             │
│                                                              │
│  Sequence 1: ─────[VM]───────────────────────────────────    │
│  Sequence 2: ────────────[VM]────────────────────────────    │
│  Sequence 3: ──────────────────[VM]──────────────────────    │
│  Sequence 4: ────────────────────────[VM]────────────────    │
│  ...                                                         │
│                                                              │
│  Each sequence has its own VM state. They don't block.       │
└──────────────────────────────────────────────────────────────┘
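
Stepping the VMs is trivially data-parallel across the batch. Continuing the
state-machine sketch above (struct vm_seq, vm_on_token), on GPU this is one
thread or threadgroup per sequence; the equivalent CPU loop used for testing:

/* Sketch: one VM state per sequence, stepped independently.
   struct vm_seq and vm_on_token are from the sketch above. */
void batch_vm_step(struct vm_seq *states,      /* [n_seqs]                */
                   const int32_t *sampled,     /* [n_seqs] sampled tokens */
                   int n_seqs) {
    for (int seq = 0; seq < n_seqs; ++seq) {
        vm_on_token(&states[seq], sampled[seq]);  /* no cross-sequence state */
    }
}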

Current Status

Platform: MacBook Air M1, 8GB RAM
Integration: llama.cpp with Metal backend
Test suite: 29/29 expression tests passing

✅ LISP Evaluator (10/10)
   • Arithmetic: (+ 3 4), (* 6 7), (mod 17 5)
   • Comparisons: (< 3 5), (> 7 2), (= 5 5)
   • Nested: (+ 2 (* 3 4)), (* (+ 2 3) (- 10 4))

✅ C Expressions (19/19)
   • Literals, operators, precedence
   • Parentheses: (2 + 3) * 4 = 20
   • Unary minus: -5, 3 + (-5) = -2

✅ Configurable VM Sizes (tested)
   • TINY:   1 KB   - original toy limits
   • SMALL:  1 MB   - quick tests
   • MEDIUM: 72 MB  - real programs
   • LARGE:  609 MB - operating systems

The Code is Simple

The entire VM is ~7,600 lines of Metal shader code, split into modules:

llama.cpp/common/
├── ruffian-types.h           (253 lines)  Constants, enums
├── ruffian-structures.h      (139 lines)  Data structures
├── ruffian-helpers.h          (89 lines)  Basic operations
├── ruffian-c-expr-parser.h   (410 lines)  Expression parsing
├── ruffian-c-stmt-parser.h   (332 lines)  Statements
├── ruffian-lisp-builtins.h   (838 lines)  Lisp operations
├── ruffian-vm-codegen.h    (1,802 lines)  VM + codegen
└── ruffian-gpu-kernels.h     (551 lines)  Entry points

Key constraint: No recursion in Metal. Everything is iterative
with explicit stacks. This actually makes the code cleaner.

Example Conversation

User: Is 1000003 prime?

LLM (with Ruffian):

  Let me check. A number is prime if it has no divisors other
  than 1 and itself.

  [C:int is_prime(int n) {
      if (n < 2) return 0;
      for (int i = 2; i * i <= n; i++) {
        if (n % i == 0) return 0;
      }
      return 1;
    }
    is_prime(1000003)] = VM[1]

  Yes, 1000003 is prime.

The LLM didn't guess. It didn't hallucinate. It computed.

Why This Matters

The Hallucination Problem

LLMs are famously unreliable at math. They pattern-match rather than compute.

Standard LLM: "2847 × 3921 = 11,163,087"  ← wrong (actual: 11,163,087)
              "2847 × 3922 = 11,166,234"  ← wrong (actual: 11,166,234)

With Ruffian: "2847 × 3921 = [CALC:2847*3921] = VM[11163087]"  ← verified

The result isn't predicted. It's computed. No hallucination possible.

Beyond Arithmetic: The Vision

With 600MB+ of VM memory, this isn't a calculator. It's a platform.

Phase 1: Calculator ✓

Basic math, verified computation.

Phase 2: Programming Languages (current)

Lisp interpreter. C compiler. JavaScript VM.
Running inside inference.

Phase 3: Persistent Operating System

[OS:SAVE state.bin]     // Persist VM state to model memory
[OS:LOAD state.bin]     // Resume where we left off
[OS:EXEC program.c]     // Compile and run
[OS:LS /]               // File system in the heap

An OS that lives inside the model. Persistent across sessions.

Phase 4: Self-Modification

The VM reads the model's weights. Modifies its own KV cache.
Writes code, runs it, observes the results, rewrites it.

A model that can experiment on itself.

Context Access: The Model Can See Itself

The VM has access to the LLM's internal state:

// Available to VM programs
CTX_POSITION      // Current sequence position
CTX_ATTN_MAX      // Peak attention score
CTX_ATTN_ENTROPY  // How "spread out" attention is
CTX_TOKEN_ENTROPY // Output uncertainty
CTX_PREV_TOKEN    // What was just generated

Imagine:

LLM: "My confidence in this answer is [VM:CTX_ATTN_MAX * 100]%"

The model can report on its own uncertainty, computed rather than guessed.
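
How those values reach the VM is not specified above. One plausible
arrangement, purely an assumption here: the sampler copies a few scalars
from the forward pass into a small read-only block next to the VM state
before each step.

/* Assumption, not the current implementation: a small read-only block,
   refreshed before each sampling step, that VM programs can load from. */
#include <stdint.h>

typedef struct {
    int32_t position;       /* CTX_POSITION      */
    float   attn_max;       /* CTX_ATTN_MAX      */
    float   attn_entropy;   /* CTX_ATTN_ENTROPY  */
    float   token_entropy;  /* CTX_TOKEN_ENTROPY */
    int32_t prev_token;     /* CTX_PREV_TOKEN    */
} ruffian_ctx_block;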

The Path to Self-Modification

If the VM can read the KV cache, it can eventually write to it.

┌─────────────────────────────────────────────────────────────────┐
│                     FUTURE: KV SURGERY                          │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  [VM:KV_WRITE layer=12 pos=47 value=...]                        │
│                                                                 │
│  The model could:                                               │
│  • Correct its own attention patterns                           │
│  • Inject computed facts into its context                       │
│  • Implement scratch memory that persists across tokens         │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘

This is speculative. But the architecture supports it.

Proof Search: The Real Prize

Formal verification is hard because proof search is exponential.
LLMs are good at generating plausible proofs but can't verify them.

Ruffian inverts this:

┌─────────────────────────────────────────────────────────────────┐
│                                                                 │
│   LLM generates candidate proof                                 │
│              ↓                                                  │
│   GPU VM verifies (or refutes) instantly                        │
│              ↓                                                  │
│   LLM sees verification result                                  │
│              ↓                                                  │
│   LLM refines proof based on feedback                           │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘

The LLM provides intuition. The VM provides rigor.
Best of both worlds.

The Trusting Trust Parallel

Ken Thompson's 1984 Turing Award lecture described a compiler
that could hide a backdoor in itself.

"You can't trust code that you did not totally create yourself."

Ruffian raises similar questions:

  • Can you trust computation that happens inside an opaque model?
  • What does "verified" mean when the verifier is part of the system?
  • How do you audit a VM running on GPU shaders?

The answer: you can inspect the VM code. It's just Metal.

Unlike the neural network weights, the VM is legible.

Design Constraints

Metal shaders have strict limitations that shaped the architecture:

No recursion. Recursive algorithms become loops over explicit stacks.
This forced a cleaner separation of concerns.

No dynamic allocation. All memory is pre-allocated in fixed buffers.
This makes state management predictable.

No function pointers. Dispatch must be through switch statements.
This makes the bytecode interpreter explicit.

┌─────────────────────────────────────────────────────────────────┐
│  Constraint          │  Result                                  │
├─────────────────────────────────────────────────────────────────┤
│  No recursion        │  Explicit call stack, iterative eval     │
│  No malloc           │  Pre-sized buffers, predictable memory   │
│  No function ptrs    │  Switch-based dispatch, visible control  │
└─────────────────────────────────────────────────────────────────┘

The constraints make the code auditable. You can trace every execution path.
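
Put together, the three constraints produce one recognizable shape: a loop,
a fixed-size stack, and a switch. A minimal sketch (opcodes follow the token
protocol appendix; passing operands in a parallel array is an assumption,
and the real interpreter in ruffian-vm-codegen.h is far more complete):

/* Minimal sketch of the style the constraints force: no recursion, no
   malloc, no function pointers. Operand encoding is an assumption.    */
#include <stdint.h>

#define TOK_VM_PUSH 68
#define TOK_VM_ADD  80
#define TOK_VM_MUL  82

int64_t vm_run(const int32_t *prog, const int64_t *operands, int len) {
    int64_t stack[32];
    int sp = 0;

    for (int pc = 0; pc < len; ++pc) {
        switch (prog[pc]) {                          /* switch-based dispatch */
        case TOK_VM_PUSH:
            if (sp < 32) stack[sp++] = operands[pc];
            break;
        case TOK_VM_ADD:
            if (sp >= 2) { stack[sp - 2] += stack[sp - 1]; --sp; }
            break;
        case TOK_VM_MUL:
            if (sp >= 2) { stack[sp - 2] *= stack[sp - 1]; --sp; }
            break;
        default:
            break;                                   /* unknown opcode: skip  */
        }
    }
    return sp > 0 ? stack[sp - 1] : 0;   /* explicit stack, no recursion */
}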

What I Don't Know

Honest uncertainties:

  1. Training: Will models learn to use VM tokens naturally?
    (Unknown. Needs fine-tuning experiments.)

  2. Performance: At scale, does VM overhead matter?
    (Probably not. Forward pass dominates.)

  3. Utility: Is inline computation actually better than tool use?
    (For some tasks, definitely. For others, unclear.)

  4. Safety: What happens when models can compute anything?
    (Open question. Needs careful thought.)

Try It Yourself

# Clone and build
git clone https://github.com/williamsharkey/ruffian
cd ruffian/llama.cpp
mkdir build-gpu && cd build-gpu
cmake .. -DLLAMA_METAL=ON
make -j4

# Run tests
cd ../tools/ruffian-test
./build-and-test.sh

# See it work
./test-runner-cpu

# Output:
# ✓ PASS: 2 + 3 * 4 = 14
# ✓ PASS: (2 + 3) * 4 = 20
# ✓ PASS: (+ 2 (* 3 4)) = 14
# ...

What Comes Next

The goal: Train models that think natively in computation.

┌─────────────────────────────────────────────────────────────────┐
│  Current: Prototype on MacBook Air                              │
│           ↓                                                     │
│  Next:    Fine-tune small models (3B-7B) with VM tokens         │
│           ↓                                                     │
│  Then:    Train from scratch with compute as first-class        │
│           ↓                                                     │
│  Goal:    Models that write code, verify it, and learn from it  │
└─────────────────────────────────────────────────────────────────┘

The architecture is proven. The VM scales. The path is clear.

Summary

What: A 600MB virtual machine running inside LLM inference.

Why: Computation that's native, not bolted on. An OS that
persists. A model that can inspect and modify itself.

How: Metal shaders on Apple Silicon. Unified memory.
Same code runs on CPU (for testing) and GPU (for inference).

Status: Working prototype. Configurable from 1KB to 2GB.
Running on a MacBook Air. Ready for real hardware.

Ruffian

Computation as a Native Mode of Thought


"The best interface is no interface."

The best tool integration is no tool.


github.com/williamsharkey/ruffian

Appendix: Token Protocol

// Token ranges (extending base vocabulary)
#define TOK_VM_BEGIN    65   // Start recording
#define TOK_VM_END      66   // Execute program
#define TOK_VM_READ     67   // Inject result

// Stack operations (68-79)
#define TOK_VM_PUSH     68
#define TOK_VM_DUP      69
#define TOK_VM_SWAP     70
#define TOK_VM_DROP     71

// Arithmetic (80-95)
#define TOK_VM_ADD      80
#define TOK_VM_SUB      81
#define TOK_VM_MUL      82
#define TOK_VM_DIV      83
// ...

// Memory (112-127)
#define TOK_VM_STORE    112  // + register offset
#define TOK_VM_LOAD     120  // + register offset

Appendix: VM Configurations

┌────────────────────────────────────────────────────────────────┐
│  CONFIG     CODE        STACK       HEAP        INSTRUCTIONS   │
├────────────────────────────────────────────────────────────────┤
│  TINY       4 KB        4 KB        64 KB       100 K          │
│  SMALL      256 KB      256 KB      512 KB      1 M            │
│  MEDIUM     4 MB        4 MB        56 MB       100 M          │
│  LARGE      64 MB       16 MB       400 MB      1 B            │
│  HUGE       256 MB      64 MB       1.5 GB      10 B           │
└────────────────────────────────────────────────────────────────┘

Select at compile time: -DRUFFIAN_CONFIG_LARGE
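
A sketch of how the selection might expand (only the RUFFIAN_CONFIG_LARGE
flag appears above; the size macro names here are illustrative, and the
real definitions live in ruffian-types.h):

/* Sketch of compile-time sizing. Values follow the table above. */
#if defined(RUFFIAN_CONFIG_LARGE)
  #define RUFFIAN_CODE_BYTES   ( 64u * 1024 * 1024)   /*  64 MB */
  #define RUFFIAN_STACK_BYTES  ( 16u * 1024 * 1024)   /*  16 MB */
  #define RUFFIAN_HEAP_BYTES   (400u * 1024 * 1024)   /* 400 MB */
  #define RUFFIAN_MAX_INSTR    1000000000ull          /*   1 B  */
#else /* default: TINY */
  #define RUFFIAN_CODE_BYTES   (  4u * 1024)          /*   4 KB */
  #define RUFFIAN_STACK_BYTES  (  4u * 1024)          /*   4 KB */
  #define RUFFIAN_HEAP_BYTES   ( 64u * 1024)          /*  64 KB */
  #define RUFFIAN_MAX_INSTR    100000ull              /* 100 K  */
#endif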

Appendix: Performance (Measured)

┌────────────────────────────────────────────────────────────────┐
│  CONFIG     INSTRUCTIONS        TIME        THROUGHPUT         │
├────────────────────────────────────────────────────────────────┤
│  TINY       100,000             0.4 ms      242 MIPS           │
│  SMALL      1,000,000           3.2 ms      312 MIPS           │
│  MEDIUM     100,000,000         345 ms      290 MIPS           │
│  LARGE      1,000,000,000       3.6 s       277 MIPS           │
└────────────────────────────────────────────────────────────────┘

No performance penalty for larger memory.
Throughput is consistent: ~280 MIPS regardless of config.

Acknowledgments

This project was developed with assistance from Claude Code,
Anthropic's AI pair programming assistant.

Claude contributed to:

  • Architecture design and debugging
  • Test harness development
  • Documentation and presentation
  • Code review and refactoring

"The best collaborator is one who can hold the entire codebase in context."

github.com/williamsharkey/ruffian