ax0n documentation

ax0n is a permissionless AI inference mesh — a decentralized network of 127,000+ GPU nodes that serves and verifies AI inference requests with sub-millisecond latency. Every inference is attested by a zero-knowledge proof and settled on Ethereum.

This documentation covers the ax0n SDK, the node operator program, and the underlying protocol components: NeuralRoute, ProofCore, and MeshSync.

Architecture

ax0n is split into three independent, auditable layers:

  • NeuralRoute — latency-aware routing engine that selects an optimal node cluster for each request within 2ms.
  • ProofCore — custom Groth16 ZK circuit that attests to model integrity and output determinism. Generates a constant 288-byte proof per inference.
  • MeshSync — Byzantine fault-tolerant consensus across 10,000+ validators. Achieves finality in 1.2s under normal conditions.

Currently on v0.2 mainnet. Some features marked BETA are available but may change before v1.0.

Installation

The ax0n SDK is available for Node.js and Python. Both packages expose identical APIs.

Node.js

BASH
npm install @ax0n/sdk
# or
pnpm add @ax0n/sdk
yarn add @ax0n/sdk

Python

BASH
pip install ax0n-sdk
# or with Poetry
poetry add ax0n-sdk

Requirements

  • Node.js ≥ 18 or Python ≥ 3.10
  • An Ethereum-compatible private key with ax0n balance
  • Internet access to the ax0n mempool endpoint

Authentication

ax0n authenticates requests using an Ethereum private key. Your key is used to sign inference requests and authorize ax0n fee deductions — it is never sent to any server. All signing happens locally in the SDK.

Never commit your private key. Use environment variables or a secrets manager. The ax0n SDK reads from process.env.AX0N_PRIVATE_KEY by default.
TYPESCRIPT
import { Ax0n } from '@ax0n/sdk';

// Option A — env var (recommended)
const client = new Ax0n(); // reads AX0N_PRIVATE_KEY from environment automatically

// Option B — explicit key
const clientExplicit = new Ax0n({
  privateKey: process.env.MY_KEY,
  network: 'mainnet', // or 'sepolia' for testnet
});

Testnet

Set network: 'sepolia' to use the Sepolia testnet. Testnet ax0n is available from the faucet. Proofs on testnet are still generated by real ProofCore circuits but settlement is on Sepolia, not Ethereum mainnet.

Your first inference

Once you have the SDK installed and your key configured, submitting an inference takes three lines of code.

TYPESCRIPT
import { Ax0n } from '@ax0n/sdk';

const client = new Ax0n();
const result = await client.infer({
  model: 'llama-3-70b',
  prompt: 'Explain the Riemann hypothesis in one paragraph.',
});

console.log(result.output);
// → "The Riemann hypothesis states that all non-trivial zeros..."
console.log(result.latencyMs, result.fee);
// → 0.7 "0.0023"
The default maxFee is '0.01' ax0n. If the network fee exceeds this cap, the request is rejected with ERR_FEE_EXCEEDED. Keep the cap low only if you are willing to accept rejections during congestion; raise maxFee if requests must go through.
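If you would rather retry with a progressively higher cap than fail outright, a pattern like the following works. This is a sketch: the helper name, the doubling policy, and the assumption that the thrown error carries a `code` property equal to 'ERR_FEE_EXCEEDED' are all illustrative, not a documented part of the SDK.

```typescript
// Illustrative retry wrapper around any infer-like function. Assumes (not
// guaranteed by the docs) that fee rejections throw an error whose `code`
// field is 'ERR_FEE_EXCEEDED'.
type InferFn = (opts: { maxFee: string }) => Promise<{ output: string }>;

async function inferWithFeeRetry(
  infer: InferFn,
  startFee = 0.01, // default cap from the docs
  ceiling = 0.05,  // the highest fee you are willing to pay
): Promise<{ output: string }> {
  let fee = startFee;
  while (true) {
    try {
      return await infer({ maxFee: fee.toFixed(4) });
    } catch (err: any) {
      // rethrow anything that is not a fee rejection, or if the ceiling is hit
      if (err?.code !== 'ERR_FEE_EXCEEDED' || fee >= ceiling) throw err;
      fee = Math.min(fee * 2, ceiling); // double the cap and retry
    }
  }
}
```

The ceiling bounds your worst-case spend per request; pick it the same way you would pick maxFee directly.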

client.infer()

Submit a single inference request to the mesh. Blocks until the proof is generated and the result is returned. For streaming responses use client.stream().

SIGNATURE
async infer(options: InferOptions): Promise<InferResult>

InferOptions

PARAM         TYPE    REQUIRED  DESCRIPTION
model         string  required  Model ID. See client.models() for available models.
prompt        string  required  The input prompt. Max 32,768 tokens.
systemPrompt  string  optional  System prompt prepended before the user prompt. Not all models support this.
maxFee        string  optional  Maximum ax0n fee as a decimal string. Default: '0.01'. Request rejects if network fee exceeds this.
maxLatency    number  optional  Maximum acceptable end-to-end latency in ms. Default: 5000. NeuralRoute only selects clusters within this budget.
temperature   number  optional  Sampling temperature 0–2. Default: 0.7. Higher values produce more varied output.
maxTokens     number  optional  Maximum tokens to generate. Default: model-specific (typically 2048).

InferResult

TYPESCRIPT
interface InferResult {
  output: string;    // model response text
  proof: string;     // 288-byte Groth16 SNARK, hex-encoded
  txHash: string;    // Ethereum settlement tx hash
  latencyMs: number; // end-to-end wall-clock latency
  fee: string;       // ax0n deducted (decimal string)
  nodeId: string;    // serving node identifier (hex)
  batchId: number;   // settlement batch number containing this proof
  tokens: number;    // output token count
}

client.stream() v0.2

Submit an inference request and receive output as an async iterable token stream. The ZK proof is generated once the model finishes and is available via stream.proof().

Streaming and proof generation are sequential. The proof is not available until the stream is fully consumed. If you need the proof, await stream.proof() after the loop.
TYPESCRIPT
const stream = client.stream({
  model: 'llama-3-70b',
  prompt: 'Write a short poem about entropy.',
});

for await (const chunk of stream) {
  process.stdout.write(chunk.delta);
}

// proof available after stream is consumed
const proof = await stream.proof();
console.log(proof.txHash);

client.models()

Returns the list of models currently available on the mesh, along with per-model latency and fee estimates.

MODEL ID         PARAMS  TIER  BASE FEE     AVG LATENCY
llama-3-70b      70B     A     0.0023 ax0n  0.7ms
qwen2-72b        72B     A     0.0031 ax0n  1.1ms
mistral-7b       7B      B     0.0008 ax0n  0.3ms
phi-3-mini       3.8B    B     0.0004 ax0n  0.2ms
deepseek-r2-67b  67B     A     0.0029 ax0n  0.9ms

Tier-A models require RTX 4090-class hardware and represent less than 20% of the node pool but handle 70% of high-value requests via NeuralRoute's weighted dispatch.

client.balance()

Returns the ax0n token balances for the configured wallet address.

TYPESCRIPT
const bal = await client.balance();
// → { wallet: "4821.33", staked: "1000.00", pending: "0.00" }

client.stake()

Stakes ax0n tokens in the staking contract. Staked ax0n is required to operate a node (minimum 1,000 ax0n) and earns a share of inference fees.

TYPESCRIPT
const tx = await client.stake('1000'); // amount in ax0n
console.log(tx.hash); // Ethereum tx hash
Staked ax0n is subject to a 256-block slash window. If your node produces an invalid proof, up to 10% of your stake can be slashed by the MeshSync validator quorum.

client.unstake()

Initiates an unstake request. There is a 7-day unbonding period before tokens are returned to your wallet.

TYPESCRIPT
const tx = await client.unstake('500');
// Tokens claimable after: block #5,447,392 (~7 days)

Requirements

Any machine that meets the minimum hardware specifications can join the ax0n mesh and earn ax0n fees. There is no whitelist or application process — staking is the only gate.

COMPONENT    MINIMUM                       RECOMMENDED
GPU          NVIDIA RTX 3090 (24 GB VRAM)  RTX 4090 or A100 80 GB
System RAM   64 GB DDR4                    128 GB DDR5
Storage      2 TB NVMe SSD                 4 TB NVMe SSD (for multi-model)
Network      1 Gbps symmetric              10 Gbps symmetric
OS           Ubuntu 22.04 LTS              Ubuntu 22.04 LTS
ax0n staked  1,000 ax0n                    10,000 ax0n (higher routing weight)
Uptime matters. Nodes with <95% uptime in an epoch are de-weighted by NeuralRoute and earn proportionally fewer fees. Nodes below 80% uptime for two consecutive epochs are automatically deregistered.

Setup

Install the ax0n node runtime using the one-line installer, then configure your private key and start the daemon.

BASH
# 1. install runtime
curl -sSL https://get.ax0n.fun | bash

# 2. configure
axon-node init --key $AX0N_PRIVATE_KEY

# 3. start
axon-node start
# → registering to mesh...
# → assigned cluster: eu-west-7
# → node online. earning ax0n per inference.

To run the node as a systemd service for automatic restarts:

BASH
axon-node install-service
systemctl enable ax0n-node && systemctl start ax0n-node

Configuration

The node runtime is configured via ~/.ax0n/config.toml. Edit this file before starting the daemon.

TOML — ~/.ax0n/config.toml
[node]
private_key = "0x..."  # required
cluster = "auto"       # or e.g. "eu-west-7"
max_concurrent = 4     # parallel inference slots
models = ["llama-3-70b", "mistral-7b"]

[hardware]
gpu_device = 0             # CUDA device index
gpu_memory_limit = "22GB"
cpu_threads = 8

[network]
listen_addr = "0.0.0.0:9000"
max_connections = 256
announce_ip = "auto"       # or your public IP

[proofcore]
proof_workers = 2   # parallel proof generation threads
cache_keys = true   # cache verifier keys on disk

Monitoring

The node daemon exposes a Prometheus metrics endpoint at http://localhost:9001/metrics and a JSON status endpoint at http://localhost:9001/status.

BASH
curl http://localhost:9001/status | jq
# → {
#     "node_id": "0x9a77…33ef",
#     "cluster": "eu-west-7",
#     "status": "online",
#     "uptime_pct": 99.8,
#     "inferences": 841293,
#     "axn_earned": "1934.22",
#     "slash_risk": false,
#     "stake": "1000.00"
#   }

Key metrics to watch: axon_proof_latency_p95, axon_inference_errors_total, and axon_slash_events_total. A rising p95 proof latency indicates GPU thermal throttling or resource contention.
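The /status payload lends itself to a simple operator-side health check. The sketch below assumes the JSON fields shown above; the thresholds mirror the uptime policy stated in Requirements, and healthAlerts is an illustrative helper, not part of the node runtime.

```typescript
// Illustrative health check over the node's /status JSON. Thresholds follow
// the documented uptime policy: <95% is de-weighted, <80% risks deregistration.
interface NodeStatus {
  status: string;
  uptime_pct: number;
  slash_risk: boolean;
}

function healthAlerts(s: NodeStatus): string[] {
  const alerts: string[] = [];
  if (s.status !== 'online') alerts.push('node offline');
  if (s.slash_risk) alerts.push('slash risk flagged');
  if (s.uptime_pct < 80) alerts.push('uptime below 80%: deregistration risk');
  else if (s.uptime_pct < 95) alerts.push('uptime below 95%: de-weighted');
  return alerts;
}
```

Run a check like this from your own monitoring loop and page on any non-empty result.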

NeuralRoute Protocol

NeuralRoute is ax0n's request routing layer. For each incoming inference request it selects an optimal node cluster in under 2ms by scoring every registered node against a weighted multi-factor index.

Scoring algorithm

Each node receives a real-time composite score:

PSEUDOCODE
score(node) = (1 - norm_latency_p95(node)) * 0.45
            + norm_stake_weight(node)      * 0.30
            + norm_uptime_30d(node)        * 0.15
            + norm_hardware_tier(node)     * 0.10
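Written out as code, the composite score might look like this. It is a sketch assuming each norm_* factor has already been normalized to [0, 1]; the field and function names are illustrative, not the production NeuralRoute implementation.

```typescript
// Illustrative version of the weighted scoring formula. All inputs are
// assumed pre-normalized to [0, 1]; lower normLatencyP95 is better.
interface NodeFactors {
  normLatencyP95: number;  // 0 = fastest observed, 1 = slowest
  normStakeWeight: number;
  normUptime30d: number;
  normHardwareTier: number;
}

function score(n: NodeFactors): number {
  return (
    (1 - n.normLatencyP95) * 0.45 +
    n.normStakeWeight * 0.30 +
    n.normUptime30d * 0.15 +
    n.normHardwareTier * 0.10
  );
}
```

The weights sum to 1, so a node that is best on every factor scores exactly 1.0 and a node that is worst on every factor scores 0.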

Scores are stored in a Bloom-trie index that supports O(log n) range queries filtered by model availability, geographic region, and current capacity. The index is rebuilt every 16 blocks (~3.2s).

Cluster selection

NeuralRoute selects the top-4 scoring nodes that can collectively serve the requested model within the client's maxLatency budget. Requests are dispatched to the highest-scoring node; the remaining three act as hot standby in case of primary failure.
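The selection step can be sketched as a pure function over scored candidates; Scored and selectCluster are illustrative names, not protocol APIs.

```typescript
// Illustrative top-4 dispatch: send to the best-scoring node, keep the
// next three as hot standbys for failover.
interface Scored {
  nodeId: string;
  score: number;
}

function selectCluster(candidates: Scored[]): { primary: Scored; standby: Scored[] } {
  const top = [...candidates].sort((a, b) => b.score - a.score).slice(0, 4);
  if (top.length === 0) throw new Error('ERR_MODEL_UNAVAILABLE');
  return { primary: top[0], standby: top.slice(1) };
}
```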

ProofCore ZK Attestation

ProofCore is ax0n's zero-knowledge attestation layer. After a node runs inference, it passes the output through a custom Groth16 circuit that produces a 288-byte proof attesting to two properties:

  • Model integrity — the model weights used match the committed hash for the declared model ID.
  • Output determinism — given the same input and seed, the output is reproducible (no hallucinated randomness).
Weights are never revealed. The Groth16 circuit uses a commitment scheme: the node commits to its weights during registration and proves knowledge of the preimage. The verifier can check the proof without seeing the weights.

Proof generation time

ProofCore targets 120ms proof generation on RTX 4090-class hardware, regardless of model parameter count. The constant generation time is achieved by representing model inference as a fixed-depth arithmetic circuit with model-size-independent witness size. Nodes that exceed the 200ms threshold are automatically flagged and down-weighted by NeuralRoute.

MeshSync Consensus

MeshSync is ax0n's validator layer. After a node generates a proof, a randomly-sampled quorum of 256 validators verifies it using their local verifier keys and co-signs the result.

Quorum sampling

The 256-validator quorum is sampled from the full validator set (currently 10,247 active) using the parent block hash as a randomness seed. This makes quorum membership unpredictable and resistant to targeted bribery attacks.
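Hash-seeded sampling of this kind can be sketched as follows. The splitmix-style PRNG below is an illustrative stand-in for whatever MeshSync actually derives from the block hash; the point the sketch demonstrates is that the draw is deterministic in the seed and taken without replacement.

```typescript
// Illustrative deterministic quorum sampling seeded by a block hash.
// PRNG: splitmix64-style mixing over a 64-bit state (an assumption, not
// the actual MeshSync construction).
function seededRng(seedHex: string): () => number {
  let s = BigInt('0x' + seedHex.slice(0, 16));
  return () => {
    s = (s + 0x9e3779b97f4a7c15n) & 0xffffffffffffffffn;
    let z = s;
    z = ((z ^ (z >> 30n)) * 0xbf58476d1ce4e5b9n) & 0xffffffffffffffffn;
    z = ((z ^ (z >> 27n)) * 0x94d049bb133111ebn) & 0xffffffffffffffffn;
    z = z ^ (z >> 31n);
    return Number(z & 0xfffffffn) / 0x10000000; // uniform in [0, 1)
  };
}

function sampleQuorum(validators: string[], blockHash: string, size = 256): string[] {
  const rng = seededRng(blockHash.replace(/^0x/, ''));
  const pool = [...validators];
  const quorum: string[] = [];
  for (let i = 0; i < Math.min(size, pool.length); i++) {
    // Fisher-Yates prefix: pick uniformly from the unsampled suffix
    const j = i + Math.floor(rng() * (pool.length - i));
    [pool[i], pool[j]] = [pool[j], pool[i]];
    quorum.push(pool[i]);
  }
  return quorum;
}
```

Because every validator can recompute the same sample from the parent block hash, quorum membership is verifiable without coordination.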

A v0.3 upgrade will replace block-hash randomness with a VRF-based sampling scheme for stronger security guarantees. See governance proposal #14.

Finality

  • Optimistic path (normal): 1.2s — quorum signs, aggregate signature posted to settlement batch.
  • Dispute path: 6s — ZK fallback triggered if any validator submits a counter-proof; a secondary quorum adjudicates.

Settlement

Proven inference results are bundled into settlement batches and settled on Ethereum every 512 blocks (~1.7 hours). Batches include up to 4,096 proof/quorum-signature pairs, compressed using recursive SNARKs to keep calldata costs constant regardless of batch size.
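As a sanity check on the numbers quoted above, assuming Ethereum's roughly 12-second slot time (an assumption; the document states the cadence in blocks):

```typescript
// 512 blocks at ~12 s/block is ~1.7 hours between settlement batches.
const SLOT_SECONDS = 12;
const BATCH_BLOCKS = 512;
const MAX_PROOFS_PER_BATCH = 4096;

const batchIntervalHours = (BATCH_BLOCKS * SLOT_SECONDS) / 3600;
console.log(batchIntervalHours.toFixed(2)); // → 1.71

// upper bound on settled inferences per day at this cadence
const maxProofsPerDay = Math.round((24 / batchIntervalHours) * MAX_PROOFS_PER_BATCH);
console.log(maxProofsPerDay); // → 57600
```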

Fee distribution

In the same settlement transaction, ax0n fees are distributed:

  • 78% → node operator (the serving node)
  • 12% → validator quorum (split evenly across 256 validators)
  • 10% → protocol treasury (governed by ax0n holders)
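The split can be sketched with integer arithmetic so the three shares always sum exactly to the fee; splitFee is an illustrative helper, and the choice to absorb rounding dust into the treasury is an assumption, not documented behavior.

```typescript
// Illustrative 78/12/10 fee split over integer "smallest-unit" amounts.
// The treasury takes the remainder so no units are lost to rounding.
function splitFee(feeUnits: bigint): { operator: bigint; quorum: bigint; treasury: bigint } {
  const operator = (feeUnits * 78n) / 100n;
  const quorum = (feeUnits * 12n) / 100n;
  const treasury = feeUnits - operator - quorum; // remainder absorbs rounding
  return { operator, quorum, treasury };
}
```

Each of the 256 quorum validators then receives an even share of the quorum portion.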

Proposals

ax0n protocol parameters are governed by ax0n token holders via the on-chain Governor contract. Any address holding ≥ 10,000 ax0n (or delegate) can submit a proposal.

Proposals can modify any parameter in the protocol config — fee rates, quorum size, epoch length, slash window — or upgrade the underlying contracts.

TYPESCRIPT
const gov = client.governance();

const proposal = await gov.propose({
  title: 'Reduce min_node_stake to 500 ax0n',
  description: 'Lower barrier to entry for smaller operators...',
  calldata: governorABI.encodeFunctionData('setParam', [
    'min_node_stake',
    parseAX0N('500'),
  ]),
});

Voting

Proposals enter a 3-day voting window. Passing requires ≥51% of participating ax0n weight voting in favour, with a minimum quorum of 5% of total circulating supply. Passed proposals are queued behind a 7-day timelock before execution.

PHASE     DURATION  DESCRIPTION
Pending   1 day     Proposal submitted, delegates can align votes
Active    3 days    Voting window open for all ax0n holders
Queued    7 days    Timelock — community can review before execution
Executed  —         Parameter change applied on-chain
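The tally rules can be sketched as a predicate; Tally and proposalPasses are illustrative names, and weights are plain numbers here rather than on-chain integer amounts.

```typescript
// Illustrative pass check: ≥51% of participating weight in favour, with
// participation of at least 5% of circulating supply.
interface Tally {
  forWeight: number;
  againstWeight: number;
  circulatingSupply: number;
}

function proposalPasses(t: Tally): boolean {
  const participating = t.forWeight + t.againstWeight;
  const quorumMet = participating >= 0.05 * t.circulatingSupply;
  const majorityMet = participating > 0 && t.forWeight / participating >= 0.51;
  return quorumMet && majorityMet;
}
```

Note that both conditions are independent: a proposal with overwhelming support still fails if turnout is below the 5% quorum.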

Error reference

CODE                      DESCRIPTION
ERR_FEE_EXCEEDED          The network fee for this request exceeded your maxFee cap. Increase maxFee or try again during lower congestion.
ERR_LATENCY_EXCEEDED      No node cluster could serve the request within your maxLatency budget. Increase maxLatency or choose a smaller model.
ERR_PROOF_TIMEOUT         ProofCore did not generate the proof within 500ms. The node was de-weighted; your request was not charged.
ERR_QUORUM_FAILED         The validator quorum did not reach consensus within the dispute window. Request not settled; no fee charged.
ERR_MODEL_UNAVAILABLE     No node in the mesh is currently serving the requested model. Call client.models() to check availability.
ERR_INSUFFICIENT_BALANCE  Wallet ax0n balance is below the estimated fee. Top up your balance and retry.
ERR_INVALID_KEY           The private key is malformed or does not match an Ethereum address. Check your AX0N_PRIVATE_KEY environment variable.

Changelog

v0.2.3 — 2025-10-22

  • Fix: NeuralRoute validator score cache now correctly invalidated on epoch boundary
  • Fix: client.unstake() no longer throws on partial unstake with rounding error
  • Perf: ProofCore proof generation 11% faster on Ampere-class GPUs (CUDA kernel rewrite)
  • Add: axon_slash_events_total Prometheus metric

v0.2.1 — 2025-09-08

  • Fix: ProofCore bounds check on proof_size header (Trail of Bits LOW-01)
  • Add: client.stream() — streaming inference API
  • Add: deepseek-r2-67b model support
  • Change: Default maxFee reduced from '0.05' to '0.01' ax0n

v0.2.0 — 2025-08-01 — mainnet launch

  • Initial mainnet deployment on Ethereum
  • NeuralRoute v2.1 with Bloom-trie index and multi-factor scoring
  • ProofCore v2 — Groth16 circuit with constant 288-byte proof size
  • MeshSync with 256-node quorum sampling via block hash
  • Node operator program open — 127,000+ nodes registered at launch