ax0n documentation
ax0n is a permissionless AI inference mesh — a decentralized network of 127,000+ GPU nodes that serves and verifies AI inference requests at low latency. Every inference is attested by a zero-knowledge proof and settled on Ethereum.
This documentation covers the ax0n SDK, the node operator program, and the underlying protocol components: NeuralRoute, ProofCore, and MeshSync.
Architecture
ax0n is split into three independent, auditable layers:
- NeuralRoute — latency-aware routing engine that selects an optimal node cluster for each request within 2ms.
- ProofCore — custom Groth16 ZK circuit that attests to model integrity and output determinism. Generates a constant 288-byte proof per inference.
- MeshSync — Byzantine fault-tolerant consensus across 10,000+ validators. Achieves finality in 1.2s under normal conditions.
Installation
The ax0n SDK is available for Node.js and Python. Both packages expose identical APIs.
Node.js
Python
Requirements
- Node.js ≥ 18 or Python ≥ 3.10
- An Ethereum-compatible private key with ax0n balance
- Internet access to the ax0n mempool endpoint
Authentication
ax0n authenticates requests using an Ethereum private key. Your key is used to sign inference requests and authorize ax0n fee deductions — it is never sent to any server. All signing happens locally in the SDK.
The SDK reads the key from the AX0N_PRIVATE_KEY environment variable (process.env.AX0N_PRIVATE_KEY) by default.
Testnet
Set network: 'sepolia' to use the Sepolia testnet. Testnet ax0n is available from the faucet ↗. Proofs on testnet are still generated by real ProofCore circuits, but settlement is on Sepolia, not Ethereum mainnet.
Your first inference
Once you have the SDK installed and your key configured, submitting an inference takes three lines of code.
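A hedged sketch of that first call follows. The real Python client class and constructor are not shown in these docs, so a stub stands in for the live client to keep the example self-contained; the class name StubAxonClient and the result fields are assumptions, while the model ID deepseek-r2-67b comes from the changelog.

```python
# Hypothetical first-inference sketch. StubAxonClient is a stand-in for the
# real ax0n client so the example runs without network access; its name,
# constructor, and result fields are assumptions, not the documented API.
import os

class StubAxonClient:
    """Stand-in for the real client; returns a canned result."""
    def __init__(self, private_key: str, network: str = "mainnet"):
        self.private_key = private_key
        self.network = network

    def infer(self, model: str, prompt: str, maxFee: str = "0.01") -> dict:
        # The real client signs the request locally, submits it to the mesh,
        # and blocks until the output and its 288-byte proof are returned.
        return {"output": f"[{model}] response", "proof": b"\x00" * 288, "fee": maxFee}

# The real call is the three lines below: construct, infer, read the result.
client = StubAxonClient(os.environ.get("AX0N_PRIVATE_KEY", "0xabc"))
result = client.infer(model="deepseek-r2-67b", prompt="Explain ZK proofs in one line.")
proof_bytes = len(result["proof"])  # constant 288 bytes per inference
```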
The default maxFee is '0.01' ax0n. If the network fee exceeds this cap, the request is rejected with ERR_FEE_EXCEEDED. Lower the fee cap only if you are willing to accept rejections during congestion.
client.infer()
Submit a single inference request to the mesh. Blocks until the proof is generated and the result is returned. For streaming responses use client.stream().
InferOptions
| PARAM | TYPE | REQUIRED | DESCRIPTION |
|---|---|---|---|
| model | string | required | Model ID. See client.models() for available models. |
| prompt | string | required | The input prompt. Max 32,768 tokens. |
| systemPrompt | string | optional | System prompt prepended before the user prompt. Not all models support this. |
| maxFee | string | optional | Maximum ax0n fee as a decimal string. Default: '0.01'. Request rejects if network fee exceeds this. |
| maxLatency | number | optional | Maximum acceptable end-to-end latency in ms. Default: 5000. NeuralRoute only selects clusters within this budget. |
| temperature | number | optional | Sampling temperature 0–2. Default: 0.7. Higher values produce more varied output. |
| maxTokens | number | optional | Maximum tokens to generate. Default: model-specific (typically 2048). |
InferResult
client.stream() v0.2
Submit an inference request and receive output as an async iterable token stream. The ZK proof is generated once the model finishes and is available via stream.proof().
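The async-iterable shape can be sketched as follows. The stream object's internals are assumptions; a stub replaces the live client so the sketch is runnable, but the consumption pattern (async for over tokens, then proof()) follows the description above.

```python
# Hypothetical streaming sketch. StubStream stands in for the object returned
# by client.stream(); its internals are assumptions for illustration.
import asyncio

class StubStream:
    """Yields tokens as they arrive; the proof appears once the model finishes."""
    def __init__(self, tokens):
        self._tokens = tokens
        self._proof = None

    def __aiter__(self):
        return self._generate()

    async def _generate(self):
        for token in self._tokens:
            yield token
        self._proof = b"\x00" * 288  # proof generated after the final token

    def proof(self):
        return self._proof

async def main():
    stream = StubStream(["ax0n ", "is ", "a ", "mesh"])
    parts = []
    async for token in stream:
        parts.append(token)
    return "".join(parts), stream.proof()

text, proof = asyncio.run(main())
```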
Call stream.proof() after the loop to retrieve the attestation.
client.models()
Returns the list of models currently available on the mesh, along with per-model latency and fee estimates.
Tier-A models require RTX 4090-class hardware and represent less than 20% of the node pool but handle 70% of high-value requests via NeuralRoute's weighted dispatch.
client.balance()
Returns the ax0n token balances for the configured wallet address.
client.stake()
Stakes ax0n tokens to the ax0n staking contract. Staked ax0n is required to operate a node (minimum 1,000 ax0n) and earns a share of inference fees.
client.unstake()
Initiates an unstake request. There is a 7-day unbonding period before tokens are returned to your wallet.
Requirements
Any machine that meets the minimum hardware specifications can join the ax0n mesh and earn ax0n fees. There is no whitelist or application process — staking is the only gate.
| COMPONENT | MINIMUM | RECOMMENDED |
|---|---|---|
| GPU | NVIDIA RTX 3090 (24 GB VRAM) | RTX 4090 or A100 80 GB |
| System RAM | 64 GB DDR4 | 128 GB DDR5 |
| Storage | 2 TB NVMe SSD | 4 TB NVMe SSD (for multi-model) |
| Network | 1 Gbps symmetric | 10 Gbps symmetric |
| OS | Ubuntu 22.04 LTS | Ubuntu 22.04 LTS |
| ax0n staked | 1,000 ax0n | 10,000 ax0n (higher routing weight) |
Setup
Install the ax0n node runtime using the one-line installer, then configure your private key and start the daemon.
To run the node as a systemd service for automatic restarts:
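A hypothetical unit file is sketched below; the binary name (ax0n-node), install path, and service user are assumptions, not values confirmed by the installer.

```ini
# Illustrative sketch only — binary name, paths, and user are assumptions.
[Unit]
Description=ax0n node daemon
After=network-online.target
Wants=network-online.target

[Service]
ExecStart=/usr/local/bin/ax0n-node --config /home/axon/.ax0n/config.toml
Restart=on-failure
RestartSec=5
User=axon

[Install]
WantedBy=multi-user.target
```

With a unit like this installed at /etc/systemd/system/ax0n-node.service, the service would be enabled with systemctl enable --now ax0n-node.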
Configuration
The node runtime is configured via ~/.ax0n/config.toml. Edit this file before starting the daemon.
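The schema is not reproduced in this page; the fragment below is an illustrative sketch whose key names are assumptions, shown only to indicate the kind of settings involved (key source, region, metrics endpoint). Consult the generated default config for the authoritative schema.

```toml
# Illustrative sketch only — key names are assumptions.
[node]
private_key_env = "AX0N_PRIVATE_KEY"   # key is read from the environment, not stored here
region = "eu-west"

[metrics]
listen = "127.0.0.1:9001"              # Prometheus /metrics and JSON /status
```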
Monitoring
The node daemon exposes a Prometheus metrics endpoint at http://localhost:9001/metrics and a JSON status endpoint at http://localhost:9001/status.
Key metrics to watch: axon_proof_latency_p95, axon_inference_errors_total, and axon_slash_events_total. A rising p95 proof latency indicates GPU thermal throttling or resource contention.
NeuralRoute Protocol
NeuralRoute is ax0n's request routing layer. For each incoming inference request it selects an optimal node cluster in under 2ms by scoring every registered node against a weighted multi-factor index.
Scoring algorithm
Each node receives a real-time composite score built from weighted factors, including measured latency, staked ax0n, and recent reliability.
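The protocol's exact factors and weights are not published here; the following is a minimal sketch of a weighted multi-factor score, with made-up weights and normalizations chosen only to show the shape of the computation.

```python
# Hedged sketch of a weighted multi-factor node score. The factor set,
# weights, and normalizations below are assumptions, not protocol values.
def composite_score(latency_ms: float, stake: float, reliability: float,
                    max_stake: float = 10_000.0) -> float:
    """Combine normalized factors into a single score in [0, 1]."""
    latency_term = 1.0 / (1.0 + latency_ms / 100.0)   # lower latency -> higher score
    stake_term = min(stake / max_stake, 1.0)          # capped stake weight
    return 0.5 * latency_term + 0.2 * stake_term + 0.3 * reliability

fast_node = composite_score(latency_ms=20, stake=10_000, reliability=0.99)
slow_node = composite_score(latency_ms=400, stake=1_000, reliability=0.90)
```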
Scores are stored in a Bloom-trie index that supports O(log n) range queries filtered by model availability, geographic region, and current capacity. The index is rebuilt every 16 blocks (~3.2s).
Cluster selection
NeuralRoute selects the top-4 scoring nodes that can collectively serve the requested model within the client's maxLatency budget. Requests are dispatched to the highest-scoring node; the remaining three act as hot standby in case of primary failure.
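The selection step above can be sketched as a filter-then-rank over scored nodes. The node records and scores below are made-up inputs; only the top-4-within-budget logic reflects the text.

```python
# Sketch of top-4 cluster selection under a latency budget. Input data is
# fabricated for illustration; the real index is queried, not scanned.
def select_cluster(nodes: list, max_latency_ms: float, k: int = 4) -> list:
    """Return up to k eligible nodes, best-scoring first (primary, then standby)."""
    eligible = [n for n in nodes if n["latency_ms"] <= max_latency_ms]
    return sorted(eligible, key=lambda n: n["score"], reverse=True)[:k]

nodes = [
    {"id": "a", "score": 0.91, "latency_ms": 30},
    {"id": "b", "score": 0.88, "latency_ms": 45},
    {"id": "c", "score": 0.95, "latency_ms": 900},   # too slow for this budget
    {"id": "d", "score": 0.80, "latency_ms": 60},
    {"id": "e", "score": 0.75, "latency_ms": 50},
]
cluster = select_cluster(nodes, max_latency_ms=100)
primary, standby = cluster[0], cluster[1:]
```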
ProofCore ZK Attestation
ProofCore is ax0n's zero-knowledge attestation layer. After a node runs inference, it passes the output through a custom Groth16 circuit that produces a 288-byte proof attesting to two properties:
- Model integrity — the model weights used match the committed hash for the declared model ID.
- Output determinism — given the same input and seed, the output is reproducible (no hallucinated randomness).
Proof generation time
ProofCore targets 120ms proof generation on RTX 4090-class hardware, regardless of model parameter count. The constant generation time is achieved by representing model inference as a fixed-depth arithmetic circuit with a model-size-independent witness. Nodes that exceed the 200ms threshold are automatically flagged and down-weighted by NeuralRoute.
MeshSync Consensus
MeshSync is ax0n's validator layer. After a node generates a proof, a randomly-sampled quorum of 256 validators verifies it using their local verifier keys and co-signs the result.
Quorum sampling
The 256-validator quorum is sampled from the full validator set (currently 10,247 active) using the parent block hash as a randomness seed. This makes quorum membership unpredictable and resistant to targeted bribery attacks.
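The sampling above can be sketched as deterministic index derivation from the block hash. The expansion used below (SHA-256 in counter mode) is an assumption for illustration; the protocol's exact derivation is not specified on this page.

```python
# Sketch of deterministic quorum sampling seeded by the parent block hash.
# SHA-256 counter-mode expansion is an assumed derivation, not the protocol's.
import hashlib

def sample_quorum(block_hash: bytes, validator_count: int, quorum_size: int = 256) -> list:
    """Derive quorum_size distinct validator indices from the block hash."""
    chosen, seen, counter = [], set(), 0
    while len(chosen) < quorum_size:
        digest = hashlib.sha256(block_hash + counter.to_bytes(8, "big")).digest()
        idx = int.from_bytes(digest[:8], "big") % validator_count
        if idx not in seen:          # skip collisions to keep members distinct
            seen.add(idx)
            chosen.append(idx)
        counter += 1
    return chosen

quorum = sample_quorum(b"\x11" * 32, validator_count=10_247)
```

Because the only input is the parent block hash, every honest node derives the same quorum, yet membership cannot be predicted before the block is produced.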
A v0.3 upgrade will replace block-hash randomness with a VRF-based sampling scheme for stronger security guarantees. See the governance proposal #14 ↗.
Finality
- Optimistic path (normal): 1.2s — quorum signs, aggregate signature posted to settlement batch.
- Dispute path: 6s — ZK fallback triggered if any validator submits a counter-proof; a secondary quorum adjudicates.
Settlement
Proven inference results are bundled into settlement batches and settled on Ethereum every 512 blocks (~1.7 hours). Batches include up to 4,096 proof/quorum-signature pairs, compressed using recursive SNARKs to keep calldata costs constant regardless of batch size.
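The stated cadence follows from Ethereum's ~12-second post-merge block time:

```python
# Quick check of the settlement cadence: 512 Ethereum blocks at ~12 s each.
ETH_BLOCK_TIME_S = 12
blocks_per_batch = 512
batch_interval_h = blocks_per_batch * ETH_BLOCK_TIME_S / 3600
print(round(batch_interval_h, 1))  # 1.7 — matching the ~1.7 hours above
```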
Fee distribution
In the same settlement transaction, ax0n fees are distributed:
- 78% → node operator (the serving node)
- 12% → validator quorum (split evenly across 256 validators)
- 10% → protocol treasury (governed by ax0n holders)
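The split above can be computed exactly with decimal arithmetic; the per-validator share divides the 12% quorum cut evenly across the 256 co-signers.

```python
# Fee distribution per the percentages above (78/12/10), using Decimal to
# avoid binary floating-point rounding on token amounts.
from decimal import Decimal

def split_fee(fee: Decimal):
    operator = fee * Decimal("0.78")       # serving node
    quorum_total = fee * Decimal("0.12")   # validator quorum, split evenly
    treasury = fee * Decimal("0.10")       # protocol treasury
    per_validator = quorum_total / 256
    return operator, quorum_total, treasury, per_validator

op, quorum_cut, treasury, per_validator = split_fee(Decimal("0.01"))
```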
Proposals
ax0n protocol parameters are governed by ax0n token holders via the on-chain Governor contract. Any address holding ≥ 10,000 ax0n, directly or via delegation, can submit a proposal.
Proposals can modify any parameter in the protocol config — fee rates, quorum size, epoch length, slash window — or upgrade the underlying contracts.
Voting
Proposals enter a 3-day voting window. Passing requires ≥51% of participating ax0n weight voting in favour, with a minimum quorum of 5% of total circulating supply. Passed proposals are queued behind a 7-day timelock before execution.
| PHASE | DURATION | DESCRIPTION |
|---|---|---|
| Pending | 1 day | Proposal submitted, delegates can align votes |
| Active | 3 days | Voting window open for all ax0n holders |
| Queued | 7 days | Timelock — community can review before execution |
| Executed | — | Parameter change applied on-chain |
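The pass conditions from the Voting section can be sketched as two checks: participation of at least 5% of circulating supply, and ≥51% of participating weight in favour. The vote totals below are made-up, and the sketch assumes "participating" weight means for-plus-against votes.

```python
# Sketch of the proposal pass conditions: 5% participation quorum and a
# 51% majority of participating weight. Assumes participation = for + against.
def proposal_passes(for_votes: float, against_votes: float,
                    circulating_supply: float) -> bool:
    participating = for_votes + against_votes
    meets_quorum = participating >= 0.05 * circulating_supply
    majority = participating > 0 and for_votes / participating >= 0.51
    return meets_quorum and majority

# 6% participation, 60% in favour -> passes
passed = proposal_passes(3_600_000, 2_400_000, 100_000_000)
# 2% participation -> fails quorum even with unanimous support
failed_quorum = proposal_passes(2_000_000, 0, 100_000_000)
```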
Error reference
| CODE | DESCRIPTION |
|---|---|
| ERR_FEE_EXCEEDED | The network fee for this request exceeded your maxFee cap. Increase maxFee or try again during lower congestion. |
| ERR_LATENCY_EXCEEDED | No node cluster could serve the request within your maxLatency budget. Increase maxLatency or choose a smaller model. |
| ERR_PROOF_TIMEOUT | ProofCore did not generate the proof within 500ms. The node was de-weighted; your request was not charged. |
| ERR_QUORUM_FAILED | The validator quorum did not reach consensus within the dispute window. Request not settled; no fee charged. |
| ERR_MODEL_UNAVAILABLE | No node in the mesh is currently serving the requested model. Call client.models() to check availability. |
| ERR_INSUFFICIENT_BALANCE | Wallet ax0n balance is below the estimated fee. Top up your balance and retry. |
| ERR_INVALID_KEY | The private key is malformed or does not match an Ethereum address. Check your AX0N_PRIVATE_KEY environment variable. |
Changelog
v0.2.3 — 2025-10-22
- Fix: NeuralRoute validator score cache now correctly invalidated on epoch boundary
- Fix: client.unstake() no longer throws on partial unstake with rounding error
- Perf: ProofCore proof generation 11% faster on Ampere-class GPUs (CUDA kernel rewrite)
- Add: axon_slash_events_total Prometheus metric
v0.2.1 — 2025-09-08
- Fix: ProofCore bounds check on proof_size header (Trail of Bits LOW-01)
- Add: client.stream() — streaming inference API
- Add: deepseek-r2-67b model support
- Change: default maxFee reduced from '0.05' to '0.01' ax0n
v0.2.0 — 2025-08-01 — mainnet launch
- Initial mainnet deployment on Ethereum
- NeuralRoute v2.1 with Bloom-trie index and multi-factor scoring
- ProofCore v2 — Groth16 circuit with constant 288-byte proof size
- MeshSync with 256-node quorum sampling via block hash
- Node operator program open — 127,000+ nodes registered at launch