
> deploying neural network to blockchain_
We deployed a full GPT neural network as a Solana program. Every matrix multiplication, every attention head, every token — computed entirely on-chain. One sentence consumes 25% of Solana's total compute capacity for a full minute.
// STATUS: PENDING MAINNET DEPLOYMENT
The LLM is built and verified on localnet. Deploying to mainnet requires ~69 SOL for program deployment fees, account rent, and transaction costs. Buy the token below to fund the launch.
// MAINNET LAUNCH FUND (LIVE)
VERIFY ON-CHAIN → 89VB5UmvopuCFmp5Mf8YPX28fGvvqn79afCgouQuPyhY
// FUND THE EXPERIMENT
Buy the Token, Deploy the LLM
This token funds the mainnet launch: program deployment, account creation, and transaction fees. The token is largely unrelated to the research itself, but it's the final piece needed to go live and stress-test Solana for real.
CA: CLWeikxiw8pC9JEtZt14fqDzYfXF7uVwLuvnJPkrE7av
Paste the CA in Axiom's search bar after signing up
// ON-CHAIN OUTPUT — VERIFIED ON SOLANA LOCALNET
PROMPT:
"Once upon a time"
OUTPUT:
"Once upon a time..."
25% OF SOLANA'S COMPUTE consumed per sentence
~148.5M COMPUTE UNITS per generated token
~191 TRANSACTIONS per generated token (9 tokens of output)
BLOCKS CONSUMED: ~59 seconds of chain
// ARCHITECTURE
A full GPT-Neo transformer with 8 attention layers, 16 heads, and a 50,257-token vocabulary — deployed as a Solana BPF program. Every forward pass happens on-chain. No oracles. No off-chain compute. Pure blockchain inference.

FIG.01 — NEURAL NETWORK ON CHAIN
// NETWORK IMPACT ANALYSIS

FIG.02 — BLOCK SPACE CONSUMPTION
Solana processes 48 million compute units (CU) per block, with a block every 400 ms. Each token of our LLM output requires ~148.5 million CU, roughly 3 entire blocks' worth of compute. But the real constraint is the per-account write lock: only 12M CU per account per block.
CRITICAL FINDING
One user generating a 9-token sentence would consume approximately 25% of Solana's entire network compute capacity for one full minute. Multiple concurrent users running inference would create significant block space contention.
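The 25% figure lines up with the per-account cap: 12M of the 48M CU in each block. Below is a rough sketch of that arithmetic, using only the numbers quoted above. The function names are illustrative, and the assumption that every transaction write-locks the same per-user state account is made for this sketch, not something the page states outright.

```python
# Figures quoted in the text above.
BLOCK_CU = 48_000_000        # compute units per block
ACCOUNT_CU = 12_000_000      # per-account write-lock cap per block
BLOCK_TIME_S = 0.4           # one block every 400 ms
CU_PER_TOKEN = 148_500_000   # approximate cost of one generated token


def blocks_of_compute(cu: int) -> float:
    """How many full blocks' worth of compute a workload represents."""
    return cu / BLOCK_CU


def min_blocks_for_one_account(cu: int) -> int:
    """Fewest blocks needed if every transaction write-locks one account.

    Assumption: the workload can use at most ACCOUNT_CU per block.
    """
    return -(-cu // ACCOUNT_CU)  # ceiling division


def share_of_block(cu_per_block: int = ACCOUNT_CU) -> float:
    """Fraction of a block's compute used by one saturated account."""
    return cu_per_block / BLOCK_CU


if __name__ == "__main__":
    print(blocks_of_compute(CU_PER_TOKEN))            # 3.09375
    blocks = min_blocks_for_one_account(CU_PER_TOKEN)
    print(blocks)                                     # 13
    print(f"{blocks * BLOCK_TIME_S:.1f} s per token")  # 5.2 s per token
    print(share_of_block())                           # 0.25
```

Under these assumptions a single token needs at least 13 blocks even though it is only about 3 blocks' worth of raw compute, because the write lock spreads the work out. It also suggests that about four such users could run at full rate before they start competing for the same block.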
In 2023, I discovered that Clockwork's scheduling software allowed recursive transactions — a transaction that spawns another transaction in the same slot. With enough SOL to pay the fees, this created an infinite loop that overwhelmed validators.
The Clockwork team shrugged it off. A few days later, the entire Solana network went down when Clockwork came back online. They eventually shut down in October 2023, citing "limited commercial upside."
"I figured out that you could do recursive transactions. A transaction that calls another transaction in the same slot. If you have enough money to pay the Pied Piper, that's terrible for blockchains."

FIG.03 — RECURSIVE TRANSACTION EXPLOIT
// OPEN INVITATION
The program is deployed on localnet. The weights are on-chain there. The math checks out. Now we want to see what happens when multiple users run inference simultaneously on mainnet. How does the network handle it? What breaks first: the scheduler, the write locks, or the validators?
This isn't about breaking things for the sake of it. It's about understanding the real limits of on-chain computation. Every blockchain claims infinite scalability. Let's test that.
SCENARIO 1
Single User Inference
One wallet runs a full sentence generation. Observe how 191 transactions per token interact with the block scheduler and priority fee market.
SCENARIO 2
Concurrent Users
Multiple wallets generate simultaneously. Each user has independent state accounts, but they all compete for block space. The write-lock contention becomes the bottleneck.
SCENARIO 3
The Nash Equilibrium
What's the game-theoretic equilibrium? If inference costs ~0.001 SOL/token but consumes 3 blocks of compute, how does the priority fee market respond? At what point does it become economically irrational to continue?
// FOR THE TECHNICALLY CURIOUS
The biggest challenge wasn't deploying the model — it was fitting each operation within Solana's 1.4 million compute unit limit per transaction. A single 64x64 matrix multiplication in f32 costs ~700,000 CU in BPF bytecode. Each transformer layer requires Q, K, V, and O projections plus a 4x-wide FFN — that's 8 matrix multiplications per layer, per position.
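To see why the layer has to be split so finely, here is a small sketch of the budget arithmetic. It assumes the "64x64 matrix multiplication" is a 64x64 weight matrix applied to one 64-element vector (4,096 multiply-adds). That reading is an assumption, and the `overhead_cu` parameter is hypothetical.

```python
TX_CU_LIMIT = 1_400_000     # per-transaction compute limit quoted above
MATMUL_64X64_CU = 700_000   # approximate cost quoted above
DIM = 64


def cu_per_multiply_add(dim: int = DIM, cost: int = MATMUL_64X64_CU) -> float:
    """Average CU per multiply-add, assuming a dim x dim matrix-vector product."""
    return cost / (dim * dim)


def matmuls_per_tx(overhead_cu: int = 0) -> int:
    """How many 64x64 products fit in one transaction.

    overhead_cu is a hypothetical allowance for account loading,
    deserialization, and anything else the instruction does.
    """
    return (TX_CU_LIMIT - overhead_cu) // MATMUL_64X64_CU


if __name__ == "__main__":
    print(round(cu_per_multiply_add(), 1))  # 170.9
    print(matmuls_per_tx())                 # 2
    print(matmuls_per_tx(100_000))          # 1
```

With no overhead at all, exactly two such products fit in a transaction. Any realistic overhead drops that to one, which fits with most projections getting their own instruction.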
We split each layer into 15 separate instructions: LN1, Q_PROJ, K_PROJ, ATTN, V_O_PROJ, LN2, and 4 FFN chunks (UP_A, UP_B, DOWN_A, DOWN_B) plus GELU activation. For a 4-token prompt through 8 layers, that's 480 transactions just for the prefill phase.
// Per-token instruction breakdown
EMBED → 1 tx // token embedding lookup
LAYER x8 → 128 tx // 16 instructions per layer
OUTPUT_LN → 1 tx // final layer norm
COPY_HIDDEN → 1 tx // prepare for argmax
ARGMAX → 64 tx // 4 workers x 16 sub-chunks
MERGE → 1 tx // find best token
TOTAL: ~191 transactions per generated token
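The tally can be checked mechanically. The sketch below sums the rows exactly as listed and reproduces the prefill count from the prose. Note that the rows as written sum to 196, slightly above the quoted ~191, and the table uses 16 instructions per layer where the prose uses 15. The page does not reconcile these, so both figures are kept as parameters rather than corrected.

```python
# Rows copied from the per-token breakdown above.
PER_TOKEN_ROWS = {
    "EMBED": 1,
    "LAYER x8": 8 * 16,   # 16 instructions per layer, as listed in the table
    "OUTPUT_LN": 1,
    "COPY_HIDDEN": 1,
    "ARGMAX": 4 * 16,     # 4 workers x 16 sub-chunks
    "MERGE": 1,
}


def per_token_total(rows: dict = PER_TOKEN_ROWS) -> int:
    """Sum of the listed transaction counts for one generated token."""
    return sum(rows.values())


def prefill_transactions(prompt_tokens: int, layers: int = 8,
                         instructions_per_layer: int = 15) -> int:
    """Layer transactions for the prompt, using the prose's 15 per layer."""
    return prompt_tokens * layers * instructions_per_layer


if __name__ == "__main__":
    print(per_token_total())        # 196
    print(prefill_transactions(4))  # 480
```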
The quantization story is equally wild. INT8 quantization was too lossy for a 64-dimension model — cosine similarity between INT8 and f32 hidden states degraded to 0.287 by layer 8. We ended up with a hybrid approach: f16 embeddings, f32 transformer weights, and INT8 only for the output projection (lm_head). The model produces output that matches the HuggingFace reference token-for-token.
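For readers unfamiliar with the metric, here is a minimal sketch of symmetric INT8 quantization and the cosine-similarity check, in plain Python. The per-tensor scaling scheme is an assumption, since the page does not say which scheme was used. A single round trip like this stays very close to 1.0, so the 0.287 figure presumably reflects error compounding across layers, which this sketch does not model.

```python
import math
import random


def quantize_int8(values):
    """Symmetric per-tensor INT8: one scale, values rounded into [-127, 127]."""
    peak = max(abs(v) for v in values)
    scale = peak / 127 if peak else 1.0
    q = [max(-127, min(127, round(v / scale))) for v in values]
    return q, scale


def dequantize(q, scale):
    """Map INT8 codes back to floats."""
    return [x * scale for x in q]


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    rng = random.Random(0)
    hidden = [rng.gauss(0.0, 1.0) for _ in range(64)]  # stand-in hidden state
    q, scale = quantize_int8(hidden)
    print(cosine_similarity(hidden, dequantize(q, scale)))
```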
PER-USER STATE
141 KB
Each wallet gets its own state account with KV cache for context. Fully independent — your inference doesn't touch anyone else's.
RENT DEPOSIT
~1.01 SOL
Refundable when you close your session. The state account stores hidden states, KV cache across all 8 layers, and worker accounts.
CONTEXT LENGTH
32 TOKENS
Per-layer KV cache supports up to 32 positions. Within that limit, the model keeps your conversation context for multi-turn generation.
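These three numbers hang together. The sketch below estimates the KV cache size and the rent-exempt deposit from Solana's default rent parameters (3,480 lamports per byte-year, a 2-year exemption threshold, 128 bytes of per-account overhead). It assumes f32 cache entries, reads "141 KB" as 141 x 1024 bytes, and treats the state as one account. None of those three points is confirmed by the page.

```python
LAMPORTS_PER_SOL = 1_000_000_000
LAMPORTS_PER_BYTE_YEAR = 3_480   # Solana's default rent rate
EXEMPTION_YEARS = 2              # default rent-exemption threshold
ACCOUNT_OVERHEAD_BYTES = 128     # storage overhead counted for every account


def kv_cache_bytes(layers: int = 8, positions: int = 32, dim: int = 64,
                   bytes_per_value: int = 4) -> int:
    """Keys plus values for every layer and position (f32 is an assumption)."""
    return layers * positions * dim * 2 * bytes_per_value


def rent_exempt_sol(data_len: int) -> float:
    """Minimum balance, in SOL, for an account holding data_len bytes."""
    lamports = ((ACCOUNT_OVERHEAD_BYTES + data_len)
                * LAMPORTS_PER_BYTE_YEAR * EXEMPTION_YEARS)
    return lamports / LAMPORTS_PER_SOL


if __name__ == "__main__":
    print(kv_cache_bytes())             # 131072 bytes, i.e. 128 KiB
    print(rent_exempt_sol(141 * 1024))  # about 1.0058 SOL
```

Under these assumptions the KV cache accounts for 128 KiB of the 141 KB, with the rest left for hidden states and worker data, and the deposit comes out close to the quoted ~1.01 SOL.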
> READY TO STRESS TEST?_
Everything — the Solana program, the model conversion scripts, the generation client, the multi-user architecture — is available for anyone to deploy, test, and push to mainnet.
// DISCLAIMER
Honest disclaimer: I have no idea if this actually works on mainnet. It probably shouldn't. We're talking about running a neural network on a blockchain — that's not what blockchains are for. If it works, incredible. If it doesn't, thank you for participating in the experiment. This is not financial advice. The token exists solely to fund deployment costs. No promises, no guarantees, no refunds. Just vibes, math, and an unreasonable number of transactions.