An AI agent autonomously characterized a quantum processor and used what it learned to improve circuit performance.
An AI Mapped an Unknown Quantum Processor and Improved Its Own Circuits
Claude autonomously discovered Tuna-9's topology, characterized its noise, and achieved 33% lower error rates through hardware-aware routing
In our previous experiment, Claude ran a pre-designed Bell state experiment on quantum hardware. That was a tech demo. This time we asked a harder question: can a general-purpose AI autonomously characterize an unknown quantum processor and exploit what it learns?
Why This Matters
Quantum processors are temperamental. Each qubit has different error rates. Physical connections between qubits vary in quality. The noise isn't uniform — some qubits suffer from energy decay (T1), others from phase randomization (T2), and the pattern can change after recalibration.
Today, this characterization is done by specialized teams using purpose-built tools like Q-CTRL's Fire Opal or IBM's Qiskit Runtime error suppression. The question is: can a general-purpose AI, with no prior knowledge of the hardware, perform this characterization from scratch?
If it can, it would mean any team with hardware access could get optimized results without specialized quantum engineering expertise — lowering the barrier to useful quantum computing.
The Setup
We gave Claude (Opus 4.6) access to QuTech's Tuna-9 superconducting transmon processor through MCP tool calls. The AI was told only that the processor has 9 qubits numbered 0–8. No topology map. No calibration data. No error rates. It had to discover everything through experiments.
The AI designed a three-phase research plan:
- Discovery: Which qubits work? Which pairs are connected? What's the noise like?
- Exploitation: Use the hardware model to route circuits optimally
- Verification: Compare optimized vs. naive circuits to measure the improvement
Phase 1: Discovery — Mapping an Unknown Processor
Step 1: Single-Qubit Viability (4 jobs, 4,096 shots)
Claude's first move: apply an X gate (bit-flip) to every qubit and measure. If a qubit works, it should read |1⟩. The error rate reveals single-qubit quality.
| Qubit | Error Rate | Assessment |
|---|---|---|
| q[2] | 1.6% | Best |
| q[5] | 1.6% | Best |
| q[4] | 1.9% | Excellent |
| q[6] | 2.7% | Good |
| q[8] | 3.5% | Good |
| q[1] | 3.7% | Good |
| q[7] | 4.5% | Fair |
| q[3] | 5.2% | Fair |
| q[0] | 12.3% | Poor |
Finding: All 9 qubits are functional, but q[0] is dramatically worse than the rest, with more than seven times the error rate of the best qubits.
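The arithmetic behind the table is simple counting: any shot that does not come back |1⟩ after the bit-flip counts as an error. A minimal sketch (the counts below are illustrative round numbers consistent with the table, not the actual Tuna-9 data):

```python
def x_probe_error(counts: dict) -> float:
    """Fraction of shots that did NOT read |1> after an X gate."""
    shots = sum(counts.values())
    return 1.0 - counts.get("1", 0) / shots

# Illustrative 4,096-shot outcomes matching the table's error rates
counts_q2 = {"1": 4030, "0": 66}    # ~1.6% error
counts_q0 = {"1": 3592, "0": 504}   # ~12.3% error

print(f"q[2]: {x_probe_error(counts_q2):.1%}")  # q[2]: 1.6%
print(f"q[0]: {x_probe_error(counts_q0):.1%}")  # q[0]: 12.3%
```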
Step 2: Connectivity Mapping (20 jobs, 20,480 shots)
Next, Claude submitted Bell state circuits (H + CNOT) for 20 qubit pairs. On Tuna-9, the hardware rejects CNOT operations between non-connected qubits with a FAILED status. Claude exploited this: failure itself is topology information.
| Connected Pair | Bell Fidelity | Failed Pair |
|---|---|---|
| q[4]↔q[6] | 93.5% | q[1]↔q[2] |
| q[2]↔q[4] | 92.3% | q[3]↔q[4] |
| q[2]↔q[5] | 91.4% | q[4]↔q[5] |
| q[1]↔q[3] | 91.3% | q[5]↔q[6] |
| q[6]↔q[8] | 91.3% | q[6]↔q[7] |
| q[1]↔q[4] | 89.8% | q[0]↔q[3] |
| q[7]↔q[8] | 88.3% | q[0]↔q[5] |
| q[3]↔q[6] | 87.1% | q[3]↔q[5] |
| q[0]↔q[1] | 87.0% | q[5]↔q[8] |
| q[0]↔q[2] | 85.8% | q[0]↔q[8] |
Surprise discovery: Our previous experiments (from a few days earlier) had reported that qubits 6–8 had no two-qubit connectivity. Claude's fresh probes found four previously unknown connections: q[3]↔q[6], q[4]↔q[6], q[6]↔q[8], and q[7]↔q[8]. The full processor is a connected graph across all 9 qubits with 10 edges.
The AI had discovered that the hardware had been recalibrated since our last characterization — something we didn't know.
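The probe loop itself is straightforward. In the sketch below, `submit_bell_job` is a hypothetical stand-in for the MCP tool call (not the real API): it runs H + CNOT on a pair and returns a job status plus Z-basis counts, so a FAILED status doubles as topology data:

```python
from itertools import combinations

def map_topology(qubits, submit_bell_job):
    """Probe every pair with a Bell circuit (H + CNOT); treat a FAILED
    status as 'no physical coupling' and score the rest by fidelity."""
    edges = {}
    for a, b in combinations(qubits, 2):
        status, counts = submit_bell_job(a, b)
        if status == "FAILED":          # rejection is topology information
            continue
        shots = sum(counts.values())
        # Z-basis Bell signal: fraction of correlated 00/11 outcomes
        edges[(a, b)] = (counts.get("00", 0) + counts.get("11", 0)) / shots
    return edges
```

Passing the submitter in as a callable keeps the sweep logic testable without hardware access.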
Step 3: Noise Characterization (9 jobs, 9,216 shots)
Claude performed Bell state tomography (measuring in Z, X, and Y bases) on the best, medium, and worst connections to identify the noise type:
| Connection | ⟨ZZ⟩ | ⟨XX⟩ | ⟨YY⟩ | Fidelity | Noise Type |
|---|---|---|---|---|---|
| q[4]↔q[6] | +0.945 | +0.902 | −0.896 | 93.6% | Dephasing (T2) |
| q[2]↔q[4] | +0.914 | +0.926 | −0.912 | 93.8% | Depolarizing |
| q[0]↔q[2] | +0.773 | +0.762 | −0.791 | 83.2% | Asymmetric |
The noise fingerprints are distinct: the best connection (q[4]↔q[6]) shows dephasing — Z correlations are preserved while X and Y decay. The mid-range connection (q[2]↔q[4]) shows pure depolarizing noise — all correlators degrade equally. The worst connection (q[0]↔q[2]) shows asymmetric error dominated by q[0]'s T1 relaxation, where excited states decay to ground.
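The fidelity column follows from the three correlators via the standard overlap with |Φ⁺⟩ = (|00⟩ + |11⟩)/√2, namely F = (1 + ⟨ZZ⟩ + ⟨XX⟩ − ⟨YY⟩)/4, which reproduces the table's numbers. A sketch of the arithmetic:

```python
def correlator(counts):
    """Two-qubit parity correlator <AB> from counts measured in basis A⊗B:
    P(even parity) minus P(odd parity)."""
    shots = sum(counts.values())
    even = counts.get("00", 0) + counts.get("11", 0)
    return (2 * even - shots) / shots

def bell_fidelity(zz, xx, yy):
    """Overlap with |Phi+> = (|00> + |11>)/sqrt(2)."""
    return (1 + zz + xx - yy) / 4

# q[4]-q[6] correlators from the table above
print(f"{bell_fidelity(0.945, 0.902, -0.896):.1%}")  # 93.6%
```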
Phase 2: Exploitation — Hardware-Aware Circuit Design
Armed with its hardware model, Claude designed two versions of the same quantum circuit — a GHZ entangled state — with different qubit assignments:
3-Qubit GHZ State
Naive routing (q[0,1,2]): Uses q[0] as the hub qubit controlling both CNOT gates. This is the worst possible choice — q[0] has 12.3% single-qubit error and its connections are the two weakest on the chip.
Hardware-aware routing (q[2,4,6]): Uses q[4] as the hub, connecting to q[2] and q[6] through the two best connections on the chip (92.3% and 93.5%).
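Both routings are the same hub-and-spoke circuit, just with different qubit labels. A sketch that emits the circuit as cQASM, the language Quantum Inspire accepts (the exact dialect shown, cQASM 1.0, is an assumption; the real submission went through the MCP server):

```python
def ghz_cqasm(hub, spokes, n_qubits=9):
    """Hub-and-spoke GHZ: Hadamard on the hub, one CNOT per spoke."""
    lines = ["version 1.0", f"qubits {n_qubits}", f"H q[{hub}]"]
    lines += [f"CNOT q[{hub}], q[{s}]" for s in spokes]
    lines.append("measure_all")
    return "\n".join(lines)

naive = ghz_cqasm(hub=0, spokes=[1, 2])    # worst hub on the chip
optimal = ghz_cqasm(hub=4, spokes=[2, 6])  # best two couplings
print(optimal)
```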
| Routing | Qubits | \|000⟩ counts | \|111⟩ counts | GHZ Fidelity |
|---|---|---|---|---|
| Naive | q[0,1,2] | 1,908 | 1,495 | 83.1% |
| Optimal | q[2,4,6] | 1,925 | 1,717 | 88.9% |
Improvement: +5.8 percentage points. The naive circuit's dominant error is q[0] T1 decay: 292 out of 4,096 shots (7.1%) measured |011⟩ instead of |111⟩, meaning q[0] collapsed from |1⟩ to |0⟩. The optimized circuit's errors are balanced across all three qubits at 0.6–1.4% each.
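The GHZ fidelities quoted here are population estimates: the fraction of shots landing in the two target bit-strings. (A full GHZ fidelity would also need the coherence terms, so treat this as a proxy.) A sketch, using the naive 3-qubit counts from the table; only the |011⟩ count is from the post, the remaining error split is illustrative:

```python
def ghz_population_fidelity(counts):
    """Fraction of shots in |00...0> or |11...1>."""
    shots = sum(counts.values())
    n = len(next(iter(counts)))           # bit-string length
    hits = counts.get("0" * n, 0) + counts.get("1" * n, 0)
    return hits / shots

# 1,908 + 1,495 target shots; residual split is illustrative filler
naive = {"000": 1908, "111": 1495, "011": 292, "001": 401}
print(f"{ghz_population_fidelity(naive):.1%}")  # 83.1%
```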
5-Qubit GHZ State
To test whether the gap widens with more qubits:
| Routing | Qubits | \|00000⟩ counts | \|11111⟩ counts | GHZ Fidelity |
|---|---|---|---|---|
| Naive | q[0,1,2,3,4] | 1,833 | 1,360 | 78.0% |
| Optimal | q[2,4,5,6,8] | 1,828 | 1,603 | 83.8% |
Same improvement: +5.8 percentage points. The consistency is notable: the absolute gain from hardware-aware routing held steady as the circuit grew from 3 to 5 qubits, at least for this circuit family.
What the AI Actually Did
In total, Claude executed 33 hardware jobs with ~42,000 measurement shots across three phases. The entire process — from "I know nothing about this processor" to "here's an optimized circuit with 33% lower error" — was autonomous. No human selected qubits, analyzed results, or designed circuits.
The AI's decision-making process:
- Probe broadly: X gates on all qubits simultaneously to identify candidates
- Isolate and characterize: Individual qubit probes to measure crosstalk
- Map connectivity: Bell circuits on all plausible pairs, using hardware failures as topology data
- Fingerprint noise: Multi-basis tomography on representative connections
- Exploit knowledge: Route circuits through best qubits, avoid worst ones
The Result in Context
A 33% reduction in per-qubit error rate from routing alone is meaningful. For context:
- IBM's error suppression (dynamical decoupling, Pauli twirling) typically provides 2–5x improvement
- Quantum error mitigation techniques like ZNE typically provide 1.5–3x improvement
- Our improvement comes from zero additional circuit complexity — just choosing better qubits
The AI also discovered that the processor's connectivity had changed since our last characterization (days earlier), catching a recalibration that we had missed. This is exactly the kind of "check your assumptions" step that automation enables — a human might reuse old calibration data, but the AI started fresh.
Limitations
This is a proof of concept, not a production system:
- Only tested on GHZ-type circuits (entanglement benchmarks, not algorithms)
- Tuna-9 is a small processor (9 qubits) with limited routing freedom
- The AI used exhaustive probing (20 pair probes) — a smarter strategy could reduce characterization overhead
- Hardware conditions change over time; this is a snapshot
- No comparison to specialized tools (Q-CTRL, Qiskit transpiler optimization)
What's Next
The natural follow-up: can the AI learn to do this efficiently? Instead of probing all 20 pairs, can it use early results to predict which pairs are worth characterizing? Can it adapt its characterization strategy based on what it finds? And can it optimize circuits beyond simple qubit routing — adding dynamical decoupling sequences, pulse-level optimization, or error mitigation?
The broader question: if a general-purpose AI can characterize hardware in 33 jobs, what happens when it can run 33,000 — like Ginkgo's GPT-5 running 36,000 protein experiments? The gap between "useful characterization" and "automated discovery" may be smaller than we think.
All measurements were taken on February 10, 2026, on QuTech's Tuna-9 superconducting transmon processor. The complete raw data — all measurement counts, job IDs, correlators, and analysis — is stored at experiments/results/autonomous-characterization-full.json.
Hardware job IDs: Single-qubit probes (415259–415262), connectivity mapping (415273–415293), noise tomography (415323–415332), GHZ comparison (415373, 415374, 415385, 415387).
Editorial: What a Transpiler Baseline Revealed
Added February 10, 2026. After publishing this post, we ran the experiment that the Limitations section admitted was missing: a comparison to Qiskit's built-in transpiler. The results require an honest correction.
We gave Qiskit's transpiler (optimization_level=3) the same topology and error data that the AI had discovered, then asked it to route the same GHZ circuits. The results:
| Routing | Qubits | 5q GHZ Fidelity | Valid Circuit? |
|---|---|---|---|
| Naive | [0,1,2,3,4] | 80.6% | Yes |
| AI | [2,4,5,6,8] | 86.7% | No — used q[4]↔q[5], q[5]↔q[6] (not connected) |
| Qiskit opt_level=3 | [5,2,4,6,8] | 86.0% | Yes |
Three findings that change the story:
- The AI's 5-qubit circuit was invalid. It picked the right qubits but generated a CNOT chain through pairs that aren't physically connected (q[4]↔q[5], q[5]↔q[6]). The original submission happened to succeed — likely because the hardware auto-routed the invalid gates — but when we re-submitted the same circuit, it failed. The 83.8% result was not reproducible.
- The Qiskit transpiler matches the AI's performance. Given the same error data, Qiskit deterministically chose [5,2,4,6,8] with a valid CNOT chain and achieved 86.0% fidelity. For 3-qubit GHZ, the AI and Qiskit chose the identical routing: [2,4,6].
- The improvement over naive routing is real, but it's not an AI advantage. The +5.8pp gain comes from basic qubit selection — avoid q[0], prefer high-fidelity connections. Any transpiler with calibration data does this automatically.
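That qubit-selection step is cheap to reproduce once the Phase 1 data exists. With the edge fidelities from the connectivity table in hand, even a brute-force search over CNOT chains lands on the same routings the transpiler chose. A sketch (edge values copied from the table; exhaustive search is fine at 9 qubits but doesn't scale):

```python
from itertools import permutations

# Edge fidelities discovered in Phase 1 (undirected pairs)
FID = {(0, 1): 0.870, (0, 2): 0.858, (1, 3): 0.913, (1, 4): 0.898,
       (2, 4): 0.923, (2, 5): 0.914, (3, 6): 0.871, (4, 6): 0.935,
       (6, 8): 0.913, (7, 8): 0.883}

def chain_score(path):
    """Product of edge fidelities along a CNOT chain; 0 if a hop is missing."""
    score = 1.0
    for a, b in zip(path, path[1:]):
        f = FID.get((min(a, b), max(a, b)))
        if f is None:
            return 0.0
        score *= f
    return score

def best_chain(k, n_qubits=9):
    """Exhaustively score every k-qubit chain and keep the best."""
    return max(permutations(range(n_qubits), k), key=chain_score)

print(best_chain(5))  # (5, 2, 4, 6, 8) -- the routing Qiskit chose
```

The same search at k=3 returns the qubits {2, 4, 6}, matching both the AI's and Qiskit's 3-qubit choice.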
So what is the AI's actual contribution? Not the routing — the characterization. The AI discovered the topology, measured error rates, and identified noise types from scratch. A transpiler needs this data as input; the AI generated it. The genuine value is in Phase 1 (discovery), not Phase 2 (exploitation). We should have framed it that way from the start.
We're leaving the original post intact above as a record of what we initially claimed, and this correction as a record of what we found when we checked our work. This is what honest research looks like — you run the baseline, and sometimes it humbles you.
Transpiler baseline data: qiskit-transpiler-baseline-ghz5.json. Hardware jobs: Qiskit routing (415434, COMPLETED), AI re-submission (415436, FAILED).
Sources & References
- [Full experiment data (JSON)](https://github.com/JDerekLomas/quantuminspire/blob/main/experiments/results/autonomous-characterization-full.json)
- [Previous experiment: AI runs quantum hardware](/blog/ai-runs-quantum-experiment)
- [Model Context Protocol (MCP)](https://modelcontextprotocol.io/)
- [QI Circuits MCP server (GitHub)](https://github.com/JDerekLomas/quantuminspire/blob/main/mcp-servers/qi-circuits/qi_server.py)
- [Quantum Inspire - Tuna-9](https://www.quantum-inspire.com/)