ax0n documentation

ax0n is a permissionless AI inference mesh — a decentralized network of 127,000+ GPU nodes that serves and verifies AI inference requests with sub-millisecond latency. Every inference is attested by a zero-knowledge proof and settled on Ethereum.

This documentation covers the ax0n SDK, the node operator program, and the underlying protocol components: NeuralRoute, ProofCore, and MeshSync.

Architecture

ax0n is split into three independent, auditable layers:

  • NeuralRoute — latency-aware routing engine that selects an optimal node cluster for each request within 2ms.
  • ProofCore — custom Groth16 ZK circuit that attests to model integrity and output determinism. Generates a constant 288-byte proof per inference.
  • MeshSync — Byzantine fault-tolerant consensus across 10,000+ validators. Achieves finality in 1.2s under normal conditions.

Currently on v0.2 mainnet. Some features marked BETA are available but may change before v1.0.

Installation

The ax0n SDK is available for Node.js and Python. Both packages expose identical APIs.

Node.js

BASH
npm install @ax0n/sdk
# or
pnpm add @ax0n/sdk
yarn add @ax0n/sdk

Python

BASH
pip install ax0n-sdk
# or with Poetry
poetry add ax0n-sdk

Requirements

  • Node.js ≥ 18 or Python ≥ 3.10
  • An Ethereum-compatible private key with ax0n balance
  • Internet access to the ax0n mempool endpoint

Authentication

ax0n authenticates requests using an Ethereum private key. Your key is used to sign inference requests and authorize ax0n fee deductions — it is never sent to any server. All signing happens locally in the SDK.

Never commit your private key. Use environment variables or a secrets manager. The ax0n SDK reads from process.env.AX0N_PRIVATE_KEY by default.
TYPESCRIPT
import { Ax0n } from '@ax0n/sdk';

// Option A — env var (recommended)
const client = new Ax0n(); // reads AX0N_PRIVATE_KEY from environment automatically

// Option B — explicit key
const clientExplicit = new Ax0n({
  privateKey: process.env.MY_KEY,
  network: 'mainnet', // or 'sepolia' for testnet
});

Testnet

Set network: 'sepolia' to use the Sepolia testnet. Testnet ax0n is available from the faucet. Proofs on testnet are still generated by real ProofCore circuits but settlement is on Sepolia, not Ethereum mainnet.

Your first inference

Once you have the SDK installed and your key configured, submitting an inference takes three lines of code.

TYPESCRIPT
import { Ax0n } from '@ax0n/sdk';

const client = new Ax0n();
const result = await client.infer({
  model: 'llama-3-70b',
  prompt: 'Explain the Riemann hypothesis in one paragraph.',
});

console.log(result.output);
// → "The Riemann hypothesis states that all non-trivial zeros..."
console.log(result.latencyMs, result.fee);
// → 0.7 "0.0023"
The default maxFee is '0.01' ax0n. If the network fee exceeds this cap, the request is rejected with ERR_FEE_EXCEEDED. Keep the cap low only if you are willing to accept rejections during congestion; raise maxFee if requests must go through.
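If you would rather retry with a progressively higher cap than fail outright, a pattern like the following works. This is a sketch: the helper name, the doubling policy, and the assumption that the thrown error carries a `code` property equal to 'ERR_FEE_EXCEEDED' are all illustrative, not a documented part of the SDK.

```typescript
// Illustrative retry wrapper around any infer-like function. Assumes (not
// guaranteed by the docs) that fee rejections throw an error whose `code`
// field is 'ERR_FEE_EXCEEDED'.
type InferFn = (opts: { maxFee: string }) => Promise<{ output: string }>;

async function inferWithFeeRetry(
  infer: InferFn,
  startFee = 0.01, // default cap from the docs
  ceiling = 0.05,  // the highest fee you are willing to pay
): Promise<{ output: string }> {
  let fee = startFee;
  while (true) {
    try {
      return await infer({ maxFee: fee.toFixed(4) });
    } catch (err: any) {
      // rethrow anything that is not a fee rejection, or if the ceiling is hit
      if (err?.code !== 'ERR_FEE_EXCEEDED' || fee >= ceiling) throw err;
      fee = Math.min(fee * 2, ceiling); // double the cap and retry
    }
  }
}
```

The ceiling bounds your worst-case spend per request; pick it the same way you would pick maxFee directly.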

client.infer()

Submit a single inference request to the mesh. Blocks until the proof is generated and the result is returned. For streaming responses use client.stream().

SIGNATURE
async infer(options: InferOptions): Promise<InferResult>

InferOptions

PARAM         TYPE    REQUIRED  DESCRIPTION
model         string  required  Model ID. See client.models() for available models.
prompt        string  required  The input prompt. Max 32,768 tokens.
systemPrompt  string  optional  System prompt prepended before the user prompt. Not all models support this.
maxFee        string  optional  Maximum ax0n fee as a decimal string. Default: '0.01'. Request rejects if network fee exceeds this.
maxLatency    number  optional  Maximum acceptable end-to-end latency in ms. Default: 5000. NeuralRoute only selects clusters within this budget.
temperature   number  optional  Sampling temperature 0–2. Default: 0.7. Higher values produce more varied output.
maxTokens     number  optional  Maximum tokens to generate. Default: model-specific (typically 2048).

InferResult

TYPESCRIPT
interface InferResult {
  output: string;    // model response text
  proof: string;     // 288-byte Groth16 SNARK, hex-encoded
  txHash: string;    // Ethereum settlement tx hash
  latencyMs: number; // end-to-end wall-clock latency
  fee: string;       // ax0n deducted (decimal string)
  nodeId: string;    // serving node identifier (hex)
  batchId: number;   // settlement batch number containing this proof
  tokens: number;    // output token count
}

client.stream() v0.2

Submit an inference request and receive output as an async iterable token stream. The ZK proof is generated once the model finishes and is available via stream.proof().

Streaming and proof generation are sequential. The proof is not available until the stream is fully consumed. If you need the proof, await stream.proof() after the loop.
TYPESCRIPT
const stream = client.stream({
  model: 'llama-3-70b',
  prompt: 'Write a short poem about entropy.',
});

for await (const chunk of stream) {
  process.stdout.write(chunk.delta);
}

// proof available after stream is consumed
const proof = await stream.proof();
console.log(proof.txHash);

client.models()

Returns the list of models currently available on the mesh, along with per-model latency and fee estimates.

MODEL ID         PARAMS  TIER  BASE FEE     AVG LATENCY
llama-3-70b      70B     A     0.0023 ax0n  0.7ms
qwen2-72b        72B     A     0.0031 ax0n  1.1ms
mistral-7b       7B      B     0.0008 ax0n  0.3ms
phi-3-mini       3.8B    B     0.0004 ax0n  0.2ms
deepseek-r2-67b  67B     A     0.0029 ax0n  0.9ms

Tier-A models require RTX 4090-class hardware and represent less than 20% of the node pool but handle 70% of high-value requests via NeuralRoute's weighted dispatch.

client.balance()

Returns the ax0n token balances for the configured wallet address.

TYPESCRIPT
const bal = await client.balance();
// → { wallet: "4821.33", staked: "1000.00", pending: "0.00" }

client.stake()

Stakes ax0n tokens in the staking contract. Staked ax0n is required to operate a node (minimum 1,000 ax0n) and earns a share of inference fees.

TYPESCRIPT
const tx = await client.stake('1000'); // amount in ax0n
console.log(tx.hash); // Ethereum tx hash
Staked ax0n is subject to a 256-block slash window. If your node produces an invalid proof, up to 10% of your stake can be slashed by the MeshSync validator quorum.

client.unstake()

Initiates an unstake request. There is a 7-day unbonding period before tokens are returned to your wallet.

TYPESCRIPT
const tx = await client.unstake('500');
// Tokens claimable after: block #5,447,392 (~7 days)

Requirements

Any machine that meets the minimum hardware specifications can join the ax0n mesh and earn ax0n fees. There is no whitelist or application process — staking is the only gate.

COMPONENT    MINIMUM                       RECOMMENDED
GPU          NVIDIA RTX 3090 (24 GB VRAM)  RTX 4090 or A100 80 GB
System RAM   64 GB DDR4                    128 GB DDR5
Storage      2 TB NVMe SSD                 4 TB NVMe SSD (for multi-model)
Network      1 Gbps symmetric              10 Gbps symmetric
OS           Ubuntu 22.04 LTS              Ubuntu 22.04 LTS
ax0n staked  1,000 ax0n                    10,000 ax0n (higher routing weight)
Uptime matters. Nodes with <95% uptime in an epoch are de-weighted by NeuralRoute and earn proportionally fewer fees. Nodes below 80% uptime for two consecutive epochs are automatically deregistered.

Setup

Install the ax0n node runtime using the one-line installer, then configure your private key and start the daemon.

BASH
# 1. install runtime
curl -sSL https://get.ax0n.fun | bash

# 2. configure
axon-node init --key $AX0N_PRIVATE_KEY

# 3. start
axon-node start
# → registering to mesh...
# → assigned cluster: eu-west-7
# → node online. earning ax0n per inference.

To run the node as a systemd service for automatic restarts:

BASH
axon-node install-service
systemctl enable ax0n-node && systemctl start ax0n-node

Configuration

The node runtime is configured via ~/.ax0n/config.toml. Edit this file before starting the daemon.

TOML — ~/.ax0n/config.toml
[node]
private_key = "0x..."  # required
cluster = "auto"       # or e.g. "eu-west-7"
max_concurrent = 4     # parallel inference slots
models = ["llama-3-70b", "mistral-7b"]

[hardware]
gpu_device = 0             # CUDA device index
gpu_memory_limit = "22GB"
cpu_threads = 8

[network]
listen_addr = "0.0.0.0:9000"
max_connections = 256
announce_ip = "auto"       # or your public IP

[proofcore]
proof_workers = 2   # parallel proof generation threads
cache_keys = true   # cache verifier keys on disk

Monitoring

The node daemon exposes a Prometheus metrics endpoint at http://localhost:9001/metrics and a JSON status endpoint at http://localhost:9001/status.

BASH
curl http://localhost:9001/status | jq
# → {
#     "node_id": "0x9a77…33ef",
#     "cluster": "eu-west-7",
#     "status": "online",
#     "uptime_pct": 99.8,
#     "inferences": 841293,
#     "axn_earned": "1934.22",
#     "slash_risk": false,
#     "stake": "1000.00"
#   }

Key metrics to watch: axon_proof_latency_p95, axon_inference_errors_total, and axon_slash_events_total. A rising p95 proof latency indicates GPU thermal throttling or resource contention.
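The /status payload lends itself to a simple operator-side health check. The sketch below assumes the JSON fields shown above; the thresholds mirror the uptime policy stated in Requirements, and healthAlerts is an illustrative helper, not part of the node runtime.

```typescript
// Illustrative health check over the node's /status JSON. Thresholds follow
// the documented uptime policy: <95% is de-weighted, <80% risks deregistration.
interface NodeStatus {
  status: string;
  uptime_pct: number;
  slash_risk: boolean;
}

function healthAlerts(s: NodeStatus): string[] {
  const alerts: string[] = [];
  if (s.status !== 'online') alerts.push('node offline');
  if (s.slash_risk) alerts.push('slash risk flagged');
  if (s.uptime_pct < 80) alerts.push('uptime below 80%: deregistration risk');
  else if (s.uptime_pct < 95) alerts.push('uptime below 95%: de-weighted');
  return alerts;
}
```

Run a check like this from your own monitoring loop and page on any non-empty result.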

NeuralRoute Protocol

NeuralRoute is ax0n's request routing layer. For each incoming inference request it selects an optimal node cluster in under 2ms by scoring every registered node against a weighted multi-factor index.

Scoring algorithm

Each node receives a real-time composite score:

PSEUDOCODE
score(node) = (1 - norm_latency_p95(node)) * 0.45
            + norm_stake_weight(node)      * 0.30
            + norm_uptime_30d(node)        * 0.15
            + norm_hardware_tier(node)     * 0.10
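Written out as code, the composite score might look like this. It is a sketch assuming each norm_* factor has already been normalized to [0, 1]; the field and function names are illustrative, not the production NeuralRoute implementation.

```typescript
// Illustrative version of the weighted scoring formula. All inputs are
// assumed pre-normalized to [0, 1]; lower normLatencyP95 is better.
interface NodeFactors {
  normLatencyP95: number;  // 0 = fastest observed, 1 = slowest
  normStakeWeight: number;
  normUptime30d: number;
  normHardwareTier: number;
}

function score(n: NodeFactors): number {
  return (
    (1 - n.normLatencyP95) * 0.45 +
    n.normStakeWeight * 0.30 +
    n.normUptime30d * 0.15 +
    n.normHardwareTier * 0.10
  );
}
```

The weights sum to 1, so a node that is best on every factor scores exactly 1.0 and a node that is worst on every factor scores 0.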

Scores are stored in a Bloom-trie index that supports O(log n) range queries filtered by model availability, geographic region, and current capacity. The index is rebuilt every 16 blocks (~3.2s).

Cluster selection

NeuralRoute selects the top-4 scoring nodes that can collectively serve the requested model within the client's maxLatency budget. Requests are dispatched to the highest-scoring node; the remaining three act as hot standby in case of primary failure.
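The selection step can be sketched as a pure function over scored candidates; Scored and selectCluster are illustrative names, not protocol APIs.

```typescript
// Illustrative top-4 dispatch: send to the best-scoring node, keep the
// next three as hot standbys for failover.
interface Scored {
  nodeId: string;
  score: number;
}

function selectCluster(candidates: Scored[]): { primary: Scored; standby: Scored[] } {
  const top = [...candidates].sort((a, b) => b.score - a.score).slice(0, 4);
  if (top.length === 0) throw new Error('ERR_MODEL_UNAVAILABLE');
  return { primary: top[0], standby: top.slice(1) };
}
```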

ProofCore ZK Attestation

ProofCore is ax0n's zero-knowledge attestation layer. After a node runs inference, it passes the output through a custom Groth16 circuit that produces a 288-byte proof attesting to two properties:

  • Model integrity — the model weights used match the committed hash for the declared model ID.
  • Output determinism — given the same input and seed, the output is reproducible (no hallucinated randomness).
Weights are never revealed. The Groth16 circuit uses a commitment scheme: the node commits to its weights during registration and proves knowledge of the preimage. The verifier can check the proof without seeing the weights.

Proof generation time

ProofCore targets 120ms proof generation on RTX 4090-class hardware, regardless of model parameter count. The constant generation time is achieved by representing model inference as a fixed-depth arithmetic circuit with model-size-independent witness size. Nodes that exceed the 200ms threshold are automatically flagged and down-weighted by NeuralRoute.

MeshSync Consensus

MeshSync is ax0n's validator layer. After a node generates a proof, a randomly-sampled quorum of 256 validators verifies it using their local verifier keys and co-signs the result.

Quorum sampling

The 256-validator quorum is sampled from the full validator set (currently 10,247 active) using the parent block hash as a randomness seed. This makes quorum membership unpredictable and resistant to targeted bribery attacks.
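Hash-seeded sampling of this kind can be sketched as follows. The splitmix-style PRNG below is an illustrative stand-in for whatever MeshSync actually derives from the block hash; the point the sketch demonstrates is that the draw is deterministic in the seed and taken without replacement.

```typescript
// Illustrative deterministic quorum sampling seeded by a block hash.
// PRNG: splitmix64-style mixing over a 64-bit state (an assumption, not
// the actual MeshSync construction).
function seededRng(seedHex: string): () => number {
  let s = BigInt('0x' + seedHex.slice(0, 16));
  return () => {
    s = (s + 0x9e3779b97f4a7c15n) & 0xffffffffffffffffn;
    let z = s;
    z = ((z ^ (z >> 30n)) * 0xbf58476d1ce4e5b9n) & 0xffffffffffffffffn;
    z = ((z ^ (z >> 27n)) * 0x94d049bb133111ebn) & 0xffffffffffffffffn;
    z = z ^ (z >> 31n);
    return Number(z & 0xfffffffn) / 0x10000000; // uniform in [0, 1)
  };
}

function sampleQuorum(validators: string[], blockHash: string, size = 256): string[] {
  const rng = seededRng(blockHash.replace(/^0x/, ''));
  const pool = [...validators];
  const quorum: string[] = [];
  for (let i = 0; i < Math.min(size, pool.length); i++) {
    // Fisher-Yates prefix: pick uniformly from the unsampled suffix
    const j = i + Math.floor(rng() * (pool.length - i));
    [pool[i], pool[j]] = [pool[j], pool[i]];
    quorum.push(pool[i]);
  }
  return quorum;
}
```

Because every validator can recompute the same sample from the parent block hash, quorum membership is verifiable without coordination.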

A v0.3 upgrade will replace block-hash randomness with a VRF-based sampling scheme for stronger security guarantees. See governance proposal #14.

Finality

  • Optimistic path (normal): 1.2s — quorum signs, aggregate signature posted to settlement batch.
  • Dispute path: 6s — ZK fallback triggered if any validator submits a counter-proof; a secondary quorum adjudicates.

Settlement

Proven inference results are bundled into settlement batches and settled on Ethereum every 512 blocks (~1.7 hours). Batches include up to 4,096 proof/quorum-signature pairs, compressed using recursive SNARKs to keep calldata costs constant regardless of batch size.
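As a sanity check on the numbers quoted above, assuming Ethereum's roughly 12-second slot time (an assumption; the document states the cadence in blocks):

```typescript
// 512 blocks at ~12 s/block is ~1.7 hours between settlement batches.
const SLOT_SECONDS = 12;
const BATCH_BLOCKS = 512;
const MAX_PROOFS_PER_BATCH = 4096;

const batchIntervalHours = (BATCH_BLOCKS * SLOT_SECONDS) / 3600;
console.log(batchIntervalHours.toFixed(2)); // → 1.71

// upper bound on settled inferences per day at this cadence
const maxProofsPerDay = Math.round((24 / batchIntervalHours) * MAX_PROOFS_PER_BATCH);
console.log(maxProofsPerDay); // → 57600
```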

Fee distribution

In the same settlement transaction, ax0n fees are distributed:

  • 78% → node operator (the serving node)
  • 12% → validator quorum (split evenly across 256 validators)
  • 10% → protocol treasury (governed by ax0n holders)
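The split can be sketched with integer arithmetic so the three shares always sum exactly to the fee; splitFee is an illustrative helper, and the choice to absorb rounding dust into the treasury is an assumption, not documented behavior.

```typescript
// Illustrative 78/12/10 fee split over integer "smallest-unit" amounts.
// The treasury takes the remainder so no units are lost to rounding.
function splitFee(feeUnits: bigint): { operator: bigint; quorum: bigint; treasury: bigint } {
  const operator = (feeUnits * 78n) / 100n;
  const quorum = (feeUnits * 12n) / 100n;
  const treasury = feeUnits - operator - quorum; // remainder absorbs rounding
  return { operator, quorum, treasury };
}
```

Each of the 256 quorum validators then receives an even share of the quorum portion.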

Proposals

ax0n protocol parameters are governed by ax0n token holders via the on-chain Governor contract. Any address holding ≥ 10,000 ax0n (or delegate) can submit a proposal.

Proposals can modify any parameter in the protocol config — fee rates, quorum size, epoch length, slash window — or upgrade the underlying contracts.

TYPESCRIPT
const gov = client.governance();

const proposal = await gov.propose({
  title: 'Reduce min_node_stake to 500 ax0n',
  description: 'Lower barrier to entry for smaller operators...',
  calldata: governorABI.encodeFunctionData('setParam', [
    'min_node_stake',
    parseAX0N('500'),
  ]),
});

Voting

Proposals enter a 3-day voting window. Passing requires ≥51% of participating ax0n weight voting in favour, with a minimum quorum of 5% of total circulating supply. Passed proposals are queued behind a 7-day timelock before execution.

PHASE     DURATION  DESCRIPTION
Pending   1 day     Proposal submitted, delegates can align votes
Active    3 days    Voting window open for all ax0n holders
Queued    7 days    Timelock — community can review before execution
Executed  —         Parameter change applied on-chain
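The tally rules can be sketched as a predicate; Tally and proposalPasses are illustrative names, and weights are plain numbers here rather than on-chain integer amounts.

```typescript
// Illustrative pass check: ≥51% of participating weight in favour, with
// participation of at least 5% of circulating supply.
interface Tally {
  forWeight: number;
  againstWeight: number;
  circulatingSupply: number;
}

function proposalPasses(t: Tally): boolean {
  const participating = t.forWeight + t.againstWeight;
  const quorumMet = participating >= 0.05 * t.circulatingSupply;
  const majorityMet = participating > 0 && t.forWeight / participating >= 0.51;
  return quorumMet && majorityMet;
}
```

Note that both conditions are independent: a proposal with overwhelming support still fails if turnout is below the 5% quorum.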

Error reference

CODE                      DESCRIPTION
ERR_FEE_EXCEEDED          The network fee for this request exceeded your maxFee cap. Increase maxFee or try again during lower congestion.
ERR_LATENCY_EXCEEDED      No node cluster could serve the request within your maxLatency budget. Increase maxLatency or choose a smaller model.
ERR_PROOF_TIMEOUT         ProofCore did not generate the proof within 500ms. The node was de-weighted; your request was not charged.
ERR_QUORUM_FAILED         The validator quorum did not reach consensus within the dispute window. Request not settled; no fee charged.
ERR_MODEL_UNAVAILABLE     No node in the mesh is currently serving the requested model. Call client.models() to check availability.
ERR_INSUFFICIENT_BALANCE  Wallet ax0n balance is below the estimated fee. Top up your balance and retry.
ERR_INVALID_KEY           The private key is malformed or does not match an Ethereum address. Check your AX0N_PRIVATE_KEY environment variable.

Changelog

v0.2.3 — 2025-10-22

  • Fix: NeuralRoute validator score cache now correctly invalidated on epoch boundary
  • Fix: client.unstake() no longer throws on partial unstake with rounding error
  • Perf: ProofCore proof generation 11% faster on Ampere-class GPUs (CUDA kernel rewrite)
  • Add: axon_slash_events_total Prometheus metric

v0.2.1 — 2025-09-08

  • Fix: ProofCore bounds check on proof_size header (Trail of Bits LOW-01)
  • Add: client.stream() — streaming inference API
  • Add: deepseek-r2-67b model support
  • Change: Default maxFee reduced from '0.05' to '0.01' ax0n

v0.2.0 — 2025-08-01 — mainnet launch

  • Initial mainnet deployment on Ethereum
  • NeuralRoute v2.1 with Bloom-trie index and multi-factor scoring
  • ProofCore v2 — Groth16 circuit with constant 288-byte proof size
  • MeshSync with 256-node quorum sampling via block hash
  • Node operator program open — 127,000+ nodes registered at launch