License: arXiv.org perpetual non-exclusive license
arXiv:2501.09079v1 [quant-ph] 15 Jan 2025
These authors contributed equally.

Demonstrating quantum error mitigation on logical qubits

Aosai Zhang School of Physics, ZJU-Hangzhou Global Scientific and Technological Innovation Center,
and Zhejiang Key Laboratory of Micro-nano Quantum Chips and Quantum Control, Zhejiang University, Hangzhou, China
   Haipeng Xie Graduate School of China Academy of Engineering Physics, Beijing 100193, China    Yu Gao School of Physics, ZJU-Hangzhou Global Scientific and Technological Innovation Center,
and Zhejiang Key Laboratory of Micro-nano Quantum Chips and Quantum Control, Zhejiang University, Hangzhou, China
   Jia-Nan Yang School of Physics, ZJU-Hangzhou Global Scientific and Technological Innovation Center,
and Zhejiang Key Laboratory of Micro-nano Quantum Chips and Quantum Control, Zhejiang University, Hangzhou, China
   Zehang Bao School of Physics, ZJU-Hangzhou Global Scientific and Technological Innovation Center,
and Zhejiang Key Laboratory of Micro-nano Quantum Chips and Quantum Control, Zhejiang University, Hangzhou, China
   Zitian Zhu School of Physics, ZJU-Hangzhou Global Scientific and Technological Innovation Center,
and Zhejiang Key Laboratory of Micro-nano Quantum Chips and Quantum Control, Zhejiang University, Hangzhou, China
   Jiachen Chen School of Physics, ZJU-Hangzhou Global Scientific and Technological Innovation Center,
and Zhejiang Key Laboratory of Micro-nano Quantum Chips and Quantum Control, Zhejiang University, Hangzhou, China
   Ning Wang School of Physics, ZJU-Hangzhou Global Scientific and Technological Innovation Center,
and Zhejiang Key Laboratory of Micro-nano Quantum Chips and Quantum Control, Zhejiang University, Hangzhou, China
   Chuanyu Zhang School of Physics, ZJU-Hangzhou Global Scientific and Technological Innovation Center,
and Zhejiang Key Laboratory of Micro-nano Quantum Chips and Quantum Control, Zhejiang University, Hangzhou, China
   Jiarun Zhong School of Physics, ZJU-Hangzhou Global Scientific and Technological Innovation Center,
and Zhejiang Key Laboratory of Micro-nano Quantum Chips and Quantum Control, Zhejiang University, Hangzhou, China
   Shibo Xu School of Physics, ZJU-Hangzhou Global Scientific and Technological Innovation Center,
and Zhejiang Key Laboratory of Micro-nano Quantum Chips and Quantum Control, Zhejiang University, Hangzhou, China
   Ke Wang School of Physics, ZJU-Hangzhou Global Scientific and Technological Innovation Center,
and Zhejiang Key Laboratory of Micro-nano Quantum Chips and Quantum Control, Zhejiang University, Hangzhou, China
   Yaozu Wu School of Physics, ZJU-Hangzhou Global Scientific and Technological Innovation Center,
and Zhejiang Key Laboratory of Micro-nano Quantum Chips and Quantum Control, Zhejiang University, Hangzhou, China
   Feitong Jin School of Physics, ZJU-Hangzhou Global Scientific and Technological Innovation Center,
and Zhejiang Key Laboratory of Micro-nano Quantum Chips and Quantum Control, Zhejiang University, Hangzhou, China
   Xuhao Zhu School of Physics, ZJU-Hangzhou Global Scientific and Technological Innovation Center,
and Zhejiang Key Laboratory of Micro-nano Quantum Chips and Quantum Control, Zhejiang University, Hangzhou, China
   Yiren Zou School of Physics, ZJU-Hangzhou Global Scientific and Technological Innovation Center,
and Zhejiang Key Laboratory of Micro-nano Quantum Chips and Quantum Control, Zhejiang University, Hangzhou, China
   Ziqi Tan School of Physics, ZJU-Hangzhou Global Scientific and Technological Innovation Center,
and Zhejiang Key Laboratory of Micro-nano Quantum Chips and Quantum Control, Zhejiang University, Hangzhou, China
   Zhengyi Cui School of Physics, ZJU-Hangzhou Global Scientific and Technological Innovation Center,
and Zhejiang Key Laboratory of Micro-nano Quantum Chips and Quantum Control, Zhejiang University, Hangzhou, China
   Fanhao Shen School of Physics, ZJU-Hangzhou Global Scientific and Technological Innovation Center,
and Zhejiang Key Laboratory of Micro-nano Quantum Chips and Quantum Control, Zhejiang University, Hangzhou, China
   Tingting Li School of Physics, ZJU-Hangzhou Global Scientific and Technological Innovation Center,
and Zhejiang Key Laboratory of Micro-nano Quantum Chips and Quantum Control, Zhejiang University, Hangzhou, China
   Yihang Han School of Physics, ZJU-Hangzhou Global Scientific and Technological Innovation Center,
and Zhejiang Key Laboratory of Micro-nano Quantum Chips and Quantum Control, Zhejiang University, Hangzhou, China
   Yiyang He School of Physics, ZJU-Hangzhou Global Scientific and Technological Innovation Center,
and Zhejiang Key Laboratory of Micro-nano Quantum Chips and Quantum Control, Zhejiang University, Hangzhou, China
   Gongyu Liu School of Physics, ZJU-Hangzhou Global Scientific and Technological Innovation Center,
and Zhejiang Key Laboratory of Micro-nano Quantum Chips and Quantum Control, Zhejiang University, Hangzhou, China
   Jiayuan Shen School of Physics, ZJU-Hangzhou Global Scientific and Technological Innovation Center,
and Zhejiang Key Laboratory of Micro-nano Quantum Chips and Quantum Control, Zhejiang University, Hangzhou, China
   Han Wang School of Physics, ZJU-Hangzhou Global Scientific and Technological Innovation Center,
and Zhejiang Key Laboratory of Micro-nano Quantum Chips and Quantum Control, Zhejiang University, Hangzhou, China
   Yanzhe Wang School of Physics, ZJU-Hangzhou Global Scientific and Technological Innovation Center,
and Zhejiang Key Laboratory of Micro-nano Quantum Chips and Quantum Control, Zhejiang University, Hangzhou, China
   Hang Dong School of Physics, ZJU-Hangzhou Global Scientific and Technological Innovation Center,
and Zhejiang Key Laboratory of Micro-nano Quantum Chips and Quantum Control, Zhejiang University, Hangzhou, China
   Jinfeng Deng School of Physics, ZJU-Hangzhou Global Scientific and Technological Innovation Center,
and Zhejiang Key Laboratory of Micro-nano Quantum Chips and Quantum Control, Zhejiang University, Hangzhou, China
State Key Laboratory of Extreme Photonics and Instrumentation,
College of Optical Science and Engineering, Zhejiang University, Hangzhou 310027, China
   Hekang Li School of Physics, ZJU-Hangzhou Global Scientific and Technological Innovation Center,
and Zhejiang Key Laboratory of Micro-nano Quantum Chips and Quantum Control, Zhejiang University, Hangzhou, China
   Zhen Wang School of Physics, ZJU-Hangzhou Global Scientific and Technological Innovation Center,
and Zhejiang Key Laboratory of Micro-nano Quantum Chips and Quantum Control, Zhejiang University, Hangzhou, China
Hefei National Laboratory, Hefei 230088, China
   Chao Song School of Physics, ZJU-Hangzhou Global Scientific and Technological Innovation Center,
and Zhejiang Key Laboratory of Micro-nano Quantum Chips and Quantum Control, Zhejiang University, Hangzhou, China
Hefei National Laboratory, Hefei 230088, China
   Qiujiang Guo School of Physics, ZJU-Hangzhou Global Scientific and Technological Innovation Center,
and Zhejiang Key Laboratory of Micro-nano Quantum Chips and Quantum Control, Zhejiang University, Hangzhou, China
Hefei National Laboratory, Hefei 230088, China
   Pengfei Zhang pfzhang@zju.edu.cn School of Physics, ZJU-Hangzhou Global Scientific and Technological Innovation Center,
and Zhejiang Key Laboratory of Micro-nano Quantum Chips and Quantum Control, Zhejiang University, Hangzhou, China
   Ying Li yli@gscaep.ac.cn Graduate School of China Academy of Engineering Physics, Beijing 100193, China    H. Wang School of Physics, ZJU-Hangzhou Global Scientific and Technological Innovation Center,
and Zhejiang Key Laboratory of Micro-nano Quantum Chips and Quantum Control, Zhejiang University, Hangzhou, China
State Key Laboratory of Extreme Photonics and Instrumentation,
College of Optical Science and Engineering, Zhejiang University, Hangzhou 310027, China
Hefei National Laboratory, Hefei 230088, China
Abstract

A long-standing challenge in quantum computing is developing technologies to overcome the inevitable noise in qubits. To enable meaningful applications in the early stages of fault-tolerant quantum computing, devising methods to suppress post-correction logical failures is becoming increasingly crucial. In this work, we propose and experimentally demonstrate the application of zero-noise extrapolation, a practical quantum error mitigation technique, to error correction circuits on state-of-the-art superconducting processors. By amplifying the noise on physical qubits, the circuits yield outcomes that exhibit a predictable dependence on noise strength, following a polynomial function determined by the code distance. This property enables the effective application of polynomial extrapolation to mitigate logical errors. Our experiments demonstrate a universal reduction in logical errors across various quantum circuits, including fault-tolerant circuits of repetition and surface codes. We observe a favorable performance in multi-round error correction circuits, indicating that this method remains effective when the circuit depth increases. These results advance the frontier of quantum error suppression technologies, opening a practical way to achieve reliable quantum computing in the early fault-tolerant era.

I Introduction

Suppressing errors is a problem that lies at the center of quantum computing technologies. Quantum error correction and mitigation are the two generic methods of error suppression. Error correction promises to reach an arbitrarily high fidelity given sufficient qubit resources. Recently, experiments have demonstrated a positive gain in surface code error correction when scaling the code distance up to 7 [1]. However, achieving negligible infidelity through error correction is still viewed as a long-term goal: it could require millions of qubits or more and poses a challenge to the scalability of experimental technologies [2]. Error mitigation spends a different resource, computing time, to suppress errors, and was originally designed for the regime in which error correction is unavailable [3]. The earliest applications of quantum computing will most likely be carried out under strict technological constraints, mainly on the qubit number. In this scenario, error correction can only attain a limited fidelity, and a practical approach that takes advantage of both error suppression methods is necessary [4, 5].

Zero-noise extrapolation (ZNE) is one of the most practical error mitigation techniques, universally applicable to quantum algorithms that evaluate expectation values [6, 7]. The central idea behind ZNE is to amplify the noise in a quantum circuit by a controllable factor $r$, and then extrapolate the results back to $r=0$, thereby inferring the behavior of a noiseless circuit. In a recent experiment, ZNE demonstrated substantial error suppression on a superconducting system with more than a hundred qubits [8], making it the only error mitigation technique successfully applied at this scale to date. Given these promising results, we consider ZNE the leading candidate for mitigating errors in quantum error correction circuits. Specifically, we apply ZNE to reduce post-correction logical errors by amplifying noise on physical qubits.

The implementation of ZNE on logical qubits faces two primary experimental challenges. First, it requires a quantum processor with high-fidelity gates and sufficient qubits, which is crucial for executing quantum error correction. Recent progress in qubit fabrication and control technologies has made superconducting qubits a promising platform for investigating quantum error correction and mitigation technologies [8, 9, 10, 11, 12, 13, 1]. In this work, we utilize two superconducting quantum processors to experimentally assess residual errors and costs in quantum error correction and mitigation, employing both the repetition code and surface code [14, 15, 16, 17]. Second, accurate error mitigation relies on measuring error rates per operation [3, 18]. While measuring logical error rates is feasible for small codes, it becomes increasingly time-consuming for larger codes due to the small logical error rates [19, 20]. Additionally, for high-encoding-rate quantum error correction codes, such as certain qLDPC codes, logical errors may involve multi-qubit correlations, making the measurement impractical [21, 22] (see Supplementary Section S2). To overcome this issue, we adopt a strategy of amplifying noise on physical qubits instead of logical qubits. By integrating error mitigation techniques with error correction, this study demonstrates a practical pathway to bridge the gap between the noisy intermediate-scale quantum (NISQ) era and the fault-tolerant quantum computing (FTQC) era, advancing the pursuit of practical quantum computing technologies.

II Results

We shall first justify the application of ZNE to quantum error correction circuits. The essential assumption made in ZNE is that the expected value of an observable is a function $\langle O\rangle(r)$ of the noise strength $r$, and this function can be fitted well by simple functions such as polynomials. This assumption holds in NISQ circuits, which has been illustrated in many theoretical works and experiments [6, 7, 3, 23, 24, 25, 12]. In this paper, one of the primary goals is to demonstrate that the assumption is also valid in quantum error correction circuits.

Our first evidence is a theoretical result based on the stochastic error model: each operation in the circuit is either error-free or erroneous with a certain probability $p$, and we increase the probability to $rp$ (with $r>1$) when amplifying the noise (Fig. 1). We apply this model to general quantum circuits with all the components necessary for quantum error correction, including mid-circuit state preparation and measurement, feedback and post-selection; these components are beyond NISQ circuits [26]. Even with these components, the expected value with errors is in the form $\langle O\rangle(r)=\langle O\rangle_{\rm ideal}+\sum_{k=1}^{N}a_{k}r^{k}$, where $N$ is the number of operations, and $a_{k}$ is the deviation caused by $k$ errors in the circuit (see Supplementary Section S1). This polynomial expression holds for all quantum circuits and suggests the use of a polynomial fitting function.
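
As a toy illustration of this polynomial form (not taken from the paper; the single-qubit setting and the choice of the observable $Z$ are assumptions made here for simplicity), consider a qubit prepared in $\ket{0}$ that passes through $N$ operations, each followed by a depolarizing error of probability $rp$, with $X$, $Y$, and $Z$ each applied with probability $rp/3$. Each such error channel rescales $\langle Z\rangle$ by the factor $1-4rp/3$, so a binomial expansion gives a degree-$N$ polynomial in $r$ whose constant term is the ideal value:

```latex
\begin{align}
\langle Z\rangle(r) &= \left(1-\frac{4rp}{3}\right)^{N} \\
&= 1+\sum_{k=1}^{N}\binom{N}{k}\left(-\frac{4p}{3}\right)^{k}r^{k}.
\end{align}
```

In this toy case $\langle Z\rangle_{\rm ideal}=1$ and the coefficient of $r^{k}$ is $\binom{N}{k}(-4p/3)^{k}$, which scales as $p^{k}$, so for small $p$ the low-order terms dominate.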

Figure 1: Schematic diagram of ZNE on logical qubits. For an arbitrary quantum circuit with or without error correction, we can amplify the noise in either all physical operations or those causing most errors to mitigate corresponding imperfections. When using the circuit to measure the expectation of an observable $O$, the value changes with the noise strength $r$, depicted by a function $\langle O\rangle(r)$ for each case (lines). To implement a $K$th-order ZNE, we amplify the noise in the circuit and measure the observable expectation $\langle O\rangle$ at $K+1$ different noise strengths $r_{0},r_{1},\ldots,r_{K}$ ($K=1$ in the figure). We always take $r_{0}=1$ corresponding to the raw noise without amplification. With the expectation values $\langle O\rangle(r_{0}),\langle O\rangle(r_{1}),\ldots,\langle O\rangle(r_{K})$ (black solid circles), we fit the function and infer the noise-free result $\langle O\rangle_{\rm ideal}$ using a $(K+1)$-term polynomial (shadows, where the width indicates the variance in inference). We choose different polynomials depending on whether error correction is utilized and the code distance. The combination of error correction and ZNE results in a smaller bias and variance of observable expectations.

We use different polynomial fitting functions in cases with and without error correction. Without error correction, the leading contribution of errors is $a_{1}$ due to only one error in the circuit. Therefore, we always include the linear term $a_{1}r$ when choosing a polynomial fitting function, then we introduce $r^{2},r^{3},\ldots$ terms sequentially for higher-order fitting; we use such polynomials in conventional ZNE. With error correction, the leading order is different. The error correction capability of a circuit is characterized by an effective code distance $d$: when the number of errors is smaller than $\lceil d/2\rceil$, we can always successfully detect and correct the errors. For this reason, the coefficients $a_{k}$ are all zero for $k=1,2,\ldots,\lceil d/2\rceil-1$, and the leading contribution becomes $a_{\lceil d/2\rceil}$; this coincides with numerical results on logical error rates of surface codes and bivariate bicycle codes [27, 22]. Therefore, with error correction, we choose a fitting function in the form $\langle O\rangle(r)=\langle O\rangle_{\rm em}+\sum_{k=\lceil d/2\rceil}^{\lceil d/2\rceil+K-1}a_{k}^{\prime}r^{k}$, which has $K$ non-constant terms starting with the $r^{\lceil d/2\rceil}$ term.
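
The fitting step can be sketched in a few lines of Python. This is a minimal illustration, not the authors' analysis code: the function name `zne_extrapolate` is hypothetical, the fit is done by solving the $(K+1)\times(K+1)$ linear system exactly rather than by a weighted least-squares procedure, and the convention that `d=1` recovers the conventional polynomial (leading term $r$) is an assumption made here for convenience.

```python
import math
import numpy as np

def zne_extrapolate(r_values, expectations, d=1):
    """Return <O>_em from K+1 points (r_i, <O>(r_i)).

    Fits <O>(r) = <O>_em + sum_{k=k0}^{k0+K-1} a'_k r^k with k0 = ceil(d/2).
    Passing d=1 gives k0 = 1, i.e. the conventional ZNE polynomial.
    (Hypothetical helper; exact solve instead of least squares.)
    """
    r = np.asarray(r_values, dtype=float)
    y = np.asarray(expectations, dtype=float)
    K = len(r) - 1                      # ZNE order
    k0 = math.ceil(d / 2)               # leading power set by the code distance
    powers = [0] + list(range(k0, k0 + K))
    # Column j of the design matrix holds r**powers[j].
    design = np.stack([r**k for k in powers], axis=1)
    coeffs = np.linalg.solve(design, y)
    return float(coeffs[0])             # constant term = extrapolated value
```

For example, with two points at $r=1$ and $r=3$ (so $K=1$), `zne_extrapolate([1, 3], [y1, y3], d=3)` fits a constant plus an $r^{2}$ term, whereas `d=1` fits a constant plus a linear term.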

Figure 2: ZNE on an example circuit with the feedback error correction. a, Circuit schematic. Each data qubit $Q_{j}$ is initialized into a superposition state parametrized by the rotation angle $\theta_{j}$. Either an idling gate or one of the Pauli gates is injected as an error during the operational stage, effectively yielding 4 types of circuit instances. All qubits are measured simultaneously, returning bit strings for uncorrected data. Based on the bit strings of all circuit instances, post-selection on $Q_{3}$ and the feedback $X$ gate are done numerically for corrected data. b, Uncorrected (left) and corrected (right) expectation values of the $Z_{0}$ observable as a function of rotation angle $\theta_{0}$ and noise scaling factor $r$, where the unit error probability is chosen as $p=8.8\%$. Lines and markers are colored according to the measured values at $\theta_{0}=0$. c, ZNE for $\theta_{0}=0$ and $-0.4\pi$. Dashed lines are numerical simulation results using a model with gate and measurement errors. Empty symbols indicate the extrapolated results of ZNE using two data points at $r=1$ and $r=3$.

We carry out experimental demonstrations on two superconducting quantum processors to certify that ZNE can be naturally integrated into the FTQC circuits. The processors used here, each of which contains a lattice of tens of frequency-tunable qubits featuring adjustable nearest-neighbor couplings, have performance similar to those dedicated to quantum error correction experiments in the literature [13, 12, 1]. The qubits selected for the experiments are highly coherent, with the median $T_{1}$ time being above $100$ $\mu$s (see parameters in Supplementary Table S1), and the controlled $\pi$-phase (CZ) gates between nearest-neighbor qubits have a median fidelity around 0.995. Since FTQC also relies heavily on measurement quality, we can differentiate between the $\ket{0}$ and $\ket{1}$ states of each qubit, achieving median readout fidelities of $0.995$ and $0.991$ for Processor I (with Purcell filters, $0.5$-$\mu$s measurement time) and Processor II (without Purcell filters, $2.5$-$\mu$s measurement time), respectively. This is accomplished by pumping the $\ket{1}$ state of each qubit to its next-higher level. Furthermore, even without the pumping technique which is a prerequisite for feedback operations in FTQC circuits, we can still achieve a median readout fidelity of $0.991$ on Processor I within 0.5 $\mu$s, as demonstrated in our repetition code experiment by allowing repetitive measurements on the syndrome qubits up to $M=4$ rounds (see next).

As the first experimental demonstration, we show that ZNE works on an example circuit with a feedback $X$ control to eliminate the bit-flip error on $Q_{0}$ (Fig. 2a), where the nominal data qubits ($Q_{j}$ for $j=0$, 2, and 4) are each initialized into a superposition state given by $\ket{\psi_{j}}=\cos\dfrac{\theta_{j}}{2}\ket{0}-i\sin\dfrac{\theta_{j}}{2}\ket{1}$. The first 4 CNOT gates in the sequence diagram are used to encode the parity of the data qubits onto the syndrome qubits ($Q_{1}$ and $Q_{3}$), followed by operations for algorithmic purpose, and the next 4 CNOT gates serve to decode and identify the bit-flip type of errors that may occur during the operational stage. To implement ZNE, one needs to be able to controllably amplify the errors, which can be achieved using schemes such as pulse stretching [28, 25] or subcircuit repetition [29]. 
Here, to simulate depolarizing errors occurring at a probability of $rp$ with $r$ being the scaling factor, we run $4^{3}$ types of circuit instances which correspond to all combinations of inserting one operation drawn from the list of $\left\{I,X,Y,Z\right\}$ at the operational stage for each of the 3 data qubits, based on which we construct a weighted list of circuit instances so that the insertion probabilities of $X$, $Y$, and $Z$ all equal $rp/3$ for each data qubit (see Supplementary Section S4 for more details). Finally, all qubits are measured in the $Z$ basis to yield a 5-bit binary string, $\mu_{0}\dots\mu_{4}$, for each circuit instance, and the no-correction data are directly calculated using the $\mu_{0}$ values from all circuit instances. 
In the case of error correction, since a bit-flip error due to the insertion of $X$ or $Y$ on the pair of data qubits $Q_{j}$ and $Q_{j+2}$, $j\in\{0,2\}$, flips the state of $Q_{j+1}$, the 5-bit binary strings from all instances are selected only when $\mu_{3}=1$ to ensure no error on $Q_{2}$, following which $\mu_{0}$ is numerically reversed due to the feedback $X$ gate only if $\mu_{1}=-1$ to eliminate the bit-flip error on $Q_{0}$. The newly processed strings are then used to calculate the data with correction.
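
The weighting of circuit instances and the numerical post-selection and feedback described in this paragraph can be sketched as follows. This is an illustrative reading of the text, not the procedure of Supplementary Section S4: the function names are hypothetical, outcomes are assumed to be stored as eigenvalues $\pm 1$ (matching the conditions $\mu_{3}=1$ and $\mu_{1}=-1$ above), each data qubit is assumed to receive its Pauli insertion independently, and every instance is assumed to be run with a nonempty set of shots.

```python
import itertools

PAULIS = "IXYZ"

def instance_weights(r, p):
    """Weight of each of the 4**3 circuit instances at noise scaling r.

    Each data qubit independently gets X, Y, or Z with probability r*p/3
    and the idle operation I otherwise (assumed product form).
    """
    q = r * p
    single = {"I": 1.0 - q, "X": q / 3, "Y": q / 3, "Z": q / 3}
    weights = {}
    for combo in itertools.product(PAULIS, repeat=3):
        w = 1.0
        for pauli in combo:
            w *= single[pauli]
        weights[combo] = w
    return weights

def process_shot(mu):
    """Apply post-selection on mu_3 and the numerical feedback X on mu_0.

    mu is a tuple (mu_0, ..., mu_4) of +1/-1 outcomes. Returns the corrected
    mu_0, or None when the shot is discarded (mu_3 != 1).
    """
    if mu[3] != 1:
        return None
    return -mu[0] if mu[1] == -1 else mu[0]

def corrected_expectation(shots_by_instance, weights):
    """Weighted estimate of <Z_0> over accepted shots of all instances."""
    num = 0.0  # weighted sum of corrected mu_0 over accepted shots
    den = 0.0  # weighted fraction of accepted shots
    for combo, shots in shots_by_instance.items():
        kept = [v for v in map(process_shot, shots) if v is not None]
        num += weights[combo] * sum(kept) / len(shots)
        den += weights[combo] * len(kept) / len(shots)
    return num / den
```

Because the weights depend on $r$ only through `instance_weights`, the same recorded shots can be reweighted to obtain the expectation value at any noise scaling factor.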

For the case of no correction, Fig. 2b shows that the expectation values of $Z_{0}$ for $Q_{0}$, for different input states parametrized by $\theta_{0}$, all gradually approach zero as the noise strength $r$ increases. In comparison, the data using processed bit strings with error correction indicate that the impact of noise is reduced, resulting in a slower decline in $\langle Z_{0}\rangle$. Experimental results closely match numerical simulations of circuits with gate and measurement errors (Fig. 2c). Furthermore, when we use ZNE to extrapolate the values at $r=0$, the results show excellent agreement with numerical predictions, which demonstrates that ZNE effectively works on the circuit.

In the above-mentioned example circuit, small errors occurring at the CNOT gates and measurement cannot be corrected, so that the post-mitigation residual error at $r=0$ is comparable to the case without error correction. Next, we resort to fault-tolerant circuits of repetition and surface codes, where the logical error rates can be suppressed to an arbitrarily low level by increasing the code distance once physical error rates are below the threshold. On fault-tolerant circuits, we find that the post-mitigation residual error is much smaller than the case without error correction.

We implement repetition codes with distances $d=3,5,7$ using up to 13 qubits on Processor I, as shown in Fig. 3a. Repetition codes protect the encoded logical qubit from bit-flip errors and have been widely used as a prototype for demonstrating error correction across various quantum platforms, including ion traps [30, 31], nuclear magnetic resonance [32, 33], and superconducting qubits [34, 1]. Here, we adopt the repetition code for our first demonstration of error mitigation on fault-tolerant circuits.

Figure 3: Performance of ZNE in the repetition code. a, Layout of the repetition code on Processor I, with the data (golden) and syndrome (blue) qubits depicted as circles. b, Schematic of the experimental circuit, where error injection and parity measurement are repeatedly performed for $M$ rounds, followed by the final round of error injection and data qubit measurement. c, Measured expectation values of $Z_{\rm L}$ with respect to the noise scaling factor $r$ in circuits with one round of mid-circuit parity measurements. We present the uncorrected results (averaged over all data qubits) exclusively for the distance-7 repetition code, as the results for different code distances are nearly identical (Supplementary Fig. S5). Dashed lines: numerical simulation results. d, Scatter plot showing the bias $\delta$ and sampling overhead $\eta$ for all possible choices of $r_{1},\ldots,r_{K}$ in ZNE. Generally, the ZNE results of $K=2$ (square) exhibit reduced bias yet introduce higher sampling overhead compared to $K=1$ (triangle). Dashed lines: biases without ZNE, i.e., $\delta_{0}=\left|\langle Z_{\rm L}\rangle_{\rm ideal}-\langle Z_{\rm L}\rangle\left(r=1\right)\right|$. e, f, Similar to panels c and d, but utilizing a multi-round repetition code with $d=7$ and performing ZNE with $K=1$. The lines in panel f serve as guides to illustrate the respective trends of ZNE performance.

The circuit of a repetition code, schematically depicted in Fig. 3b, consists of $d$ data qubits and $d-1$ syndrome qubits arranged alternately in a linear chain. The data qubits are initialized in the state $\ket{0}$, preparing the logical qubit in the logical state $\ket{0_{\rm L}}$. Next, we apply error injections and perform parity-check measurements for $M$ successive rounds. The circuit terminates with an additional round of error injection and transversal measurement on data qubits. The transversal measurement facilitates a final round of parity checks and readout of the logical operator $Z_{\rm L}$. The parity checks are designed to detect all possible bit-flip errors, whether injected deliberately or introduced by computational operations such as gates, state preparation, and measurements. Each experimental shot produces a bit string of length $M(d-1)+d$, recording the measurement outcomes. This bit string is then processed by a minimum-weight perfect matching (MWPM) decoder [35, 36], which outputs the correction gates. These corrections ensure faithful reconstruction of the logical state, provided that the number of bit-flip errors does not exceed $\lceil d/2\rceil-1$; therefore, the circuit is fault-tolerant with respect to bit-flip errors.
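
To make the structure of the recorded bit string concrete, the sketch below splits one shot into its $M$ syndrome rounds and the final data readout and computes the detection events that a matching decoder would typically take as input. It does not implement MWPM itself. The function name, the ordering of the bit string (syndrome rounds first, data bits last), and the assumption that each syndrome bit directly reports the parity of its two neighboring data qubits are all hypothetical choices made for illustration.

```python
import numpy as np

def detection_events(bits, d, M):
    """Split a shot of length M*(d-1)+d and return (events, data).

    events has shape (M+1, d-1): entry (t, i) is 1 when parity check i
    changed between round t-1 and round t. Round -1 is the all-zero
    reference (data start in |0>), and the last round is the parity
    recomputed from the final data-qubit readout.
    """
    bits = np.asarray(bits, dtype=int)
    if bits.size != M * (d - 1) + d:
        raise ValueError("bit string length must be M*(d-1)+d")
    syndromes = bits[: M * (d - 1)].reshape(M, d - 1)   # assumed ordering
    data = bits[M * (d - 1):]
    final = data[:-1] ^ data[1:]          # parity of neighboring data bits
    reference = np.zeros((1, d - 1), dtype=int)
    rounds = np.vstack([reference, syndromes, final[None, :]])
    return rounds[1:] ^ rounds[:-1], data
```

With this convention, a bit flip on a data qubit produces a pair of events on adjacent checks in the same round, while a faulty syndrome readout produces a pair of events on the same check in consecutive rounds.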

We evaluate the performance of error suppression methods using the expectation value of the logical operator $Z_{\rm L}$. In an ideal noise-free circuit, the expectation value is $\langle Z_{\rm L}\rangle_{\rm ideal}=1$; the presence of noise reduces this value. For illustrative purposes, we experimentally inject Pauli errors during the operational stage with the unit probability set at $p=3.6\%$, as done in the previous example. Figure 3c shows the experimental results for circuits with one round of parity-check measurements ($M=1$): the expectation value decreases gradually with increasing noise strength, while error correction slows the rate of decline. As the code distance $d$ increases, the rate of decline approaches zero asymptotically, indicating a below-threshold error regime.

To further suppress errors beyond error correction, we apply ZNE to the results. Specifically, we select $K+1$ data points corresponding to noise strengths $r=r_0,r_1,\cdots,r_K$ and perform polynomial extrapolation to obtain the error-mitigated expectation value of the logical operator, denoted as $\langle Z_{\rm L}\rangle_{\rm em}$. To evaluate the performance of ZNE, we define two metrics: the bias $\delta$ and the sampling overhead $\eta$. The bias $\delta$ quantifies the deviation of the error-mitigated value from the ideal expectation and is expressed as

$$\delta=\left|\langle Z_{\rm L}\rangle_{\rm em}-\langle Z_{\rm L}\rangle_{\rm ideal}\right|. \tag{1}$$

The sampling overhead $\eta$ measures the relative increase in sampling cost required to achieve the same variance for the error-mitigated value as for the uncorrected value, and it is defined as

$$\eta=\dfrac{{\rm Var}\left[\langle Z_{\rm L}\rangle_{\rm em}\right]}{{\rm Var}\left[\langle Z_{\rm L}\rangle\left(r_0\right)\right]}. \tag{2}$$

Unless otherwise specified, we take $r_0=1$ in what follows and iterate over all other data points for $r_1,\cdots,r_K$.
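The two metrics can be made concrete with a short sketch. It assumes that the polynomial extrapolation is the degree-$K$ interpolating polynomial through the $K+1$ points evaluated at $r=0$ (Richardson extrapolation), and that the estimates at different noise strengths are statistically independent; the extrapolation function and variance estimator actually used in the experiment may differ. The function names are hypothetical.

```python
def richardson_weights(rs):
    """Lagrange weights c_k that evaluate the interpolating polynomial at r = 0."""
    weights = []
    for k, rk in enumerate(rs):
        c = 1.0
        for j, rj in enumerate(rs):
            if j != k:
                c *= rj / (rj - rk)
        weights.append(c)
    return weights


def zne_estimate(rs, values):
    """Error-mitigated value: polynomial through K + 1 points, read off at r = 0."""
    return sum(c * v for c, v in zip(richardson_weights(rs), values))


def bias(z_em, z_ideal=1.0):
    """Eq. (1): absolute deviation from the ideal expectation value."""
    return abs(z_em - z_ideal)


def sampling_overhead(rs, variances):
    """Eq. (2), assuming independent estimates at each noise strength.

    variances[k] is the variance of the estimate at rs[k]; rs[0] plays the role of r_0.
    """
    var_em = sum(c * c * v for c, v in zip(richardson_weights(rs), variances))
    return var_em / variances[0]
```

Under these assumptions and with equal variances at every point, the choice $(r_0,r_1)=(1,2)$ has weights $(2,-1)$ and hence $\eta=5$, while $(r_0,r_1,r_2)=(1,2,3)$ has weights $(3,-3,1)$ and $\eta=19$. This is the arithmetic behind the trade-off between lower bias and higher sampling cost for larger $K$.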

The $\delta$-$\eta$ scatter plots in Fig. 3d show that ZNE consistently provides more accurate estimates than the uncorrected value at $r_0=1$ (indicated by dashed lines) across all cases. The plots also highlight a trade-off between the bias $\delta$ and sampling overhead $\eta$: while ZNE reduces bias, it increases the sampling overhead. If the noise-boosted data points at $r_1,\cdots,r_K$ are chosen closer to $r_0$ or if a higher-order extrapolation (larger $K$) is used, the bias generally decreases, but the sampling overhead tends to grow. Figures 3c-d further highlight the advantage of combining ZNE with error correction. When error correction is applied, the $\delta$-$\eta$ scatter points progressively migrate toward the lower-left corner as the code distance $d$ increases, indicating simultaneous improvements in precision and efficiency. Notably, for $d=7$, the residual error $\delta$ is reduced to approximately $1\times10^{-4}$, with a modest sampling overhead of just $5$.

It is important to note that error mitigation is scalable on error correction circuits as the code distance and circuit complexity increase [5]. A fundamental limitation of error mitigation methods is that they become inefficient when the product of the error rate per gate and the number of gates becomes large. This limitation is well-studied in probabilistic error cancellation, a bias-free method, where variance increases exponentially with the gate number [37, 38, 39]. In contrast, the polynomial-function ZNE has a bias that grows with the gate number [7]. Although these issues also arise in error mitigation applied to error correction circuits, the key factors become the logical error rate and the number of logical gates. Therefore, as long as the logical error rate remains sufficiently low (that is, with a sufficiently large code distance), error mitigation can be applied to circuits of arbitrary gate complexity. For instance, using the surface code with a physical error rate of 1‰ per gate and a code distance of eleven, we can achieve a logical error rate of approximately $2\times10^{-10}$ [27]. This allows a circuit with $5\times10^{7}$ logical operations to run at one logical error in one hundred circuit shots. In this case, our protocol can reduce the logical error by a factor of about $0.025$ with an overhead cost of $\eta\simeq136$ according to our estimate (see Supplementary Section S7). Moreover, our experiments show that as the code distance and parity-check rounds increase, the performance of ZNE remains nearly unchanged; note that multi-round parity checks are necessary in practical fault-tolerant quantum computing. This insight can be demonstrated in two complementary ways. First, when the error rate per parity-check round is fixed, increasing the number of parity-check rounds results in a similar relative bias and sampling overhead for ZNE, despite the growth in unmitigated bias (see Supplementary Fig. S6). Second, for a fixed total error rate in all rounds, partitioning the circuit and employing multi-round QEC significantly improves both the unmitigated and mitigated results (Fig. 3ef). Remarkably, these enhancements are achieved while the sampling overhead remains nearly unchanged.
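The surface-code estimate quoted above can be checked to first order. Writing $\epsilon_{\rm L}$ for the logical error rate per operation and $N$ for the number of logical operations (both symbols are introduced here for illustration only), and assuming that $\epsilon_{\rm L}N\ll 1$ so that failures simply add up, the probability of at least one logical error per shot is

$$P_{\rm fail}\approx\epsilon_{\rm L}N=\left(2\times10^{-10}\right)\times\left(5\times10^{7}\right)=1\times10^{-2},$$

which corresponds to roughly one logical error in one hundred circuit shots.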

Figure 4: Correction and mitigation of both $X$- and $Z$-type errors in a distance-3 surface code. a, Illustration of the experimental circuit implemented on Processor II. The qubit layout (left) comprises 9 data qubits (yellow), 4 $Z$-type syndrome qubits (blue), and 4 $X$-type syndrome qubits (violet). The circuit consists of two error injection layers, interleaved with multiple layers of Hadamard gates (H) and CNOT gates (black) designed to perform parity checks, and terminates with simultaneous readout on all data and syndrome qubits. The system is initialized in an arbitrary logical state of the surface code, with the detailed protocol provided in Supplementary Section S5. b, Logical states on the $X_{\rm L}$-$Z_{\rm L}$ plane of the Bloch sphere, constructed from the measured expectation values of $X_{\rm L}$ and $Z_{\rm L}$. The first plot shows various logical states measured without error injection, while the subsequent plots show the effect of different noise scaling factors $r$ for three initial states: $|0_{\rm L}\rangle$, $|+_{\rm L}\rangle$ and $|\psi_{\rm L}\rangle=\cos\frac{\pi}{6}|0_{\rm L}\rangle+\sin\frac{\pi}{6}|1_{\rm L}\rangle$. c, Expectation values of the $Z_{\rm L}$ observable for the logical state $|\psi_{\rm L}\rangle$ as a function of noise scaling factor $r$, including the raw data (blue points), corrected data (crimson points), and numerical simulation results (dashed lines). d, Scatter plot of the bias $\delta$ and sampling overhead $\eta$. The bias $\delta$ derived from the corrected data approaches the limit set by the imperfect initial state, which is close to $5\times10^{-2}$.

As a final experiment, we extend our investigation to the rotated surface code, a quantum error correction code capable of correcting both bit-flip and phase-flip errors. We implement the distance-3 rotated surface code on our Processor II, as illustrated in Fig. 4a. The encoded logical qubit is defined by two anti-commuting logical operators, $X_{\rm L}=X_{D1}X_{D4}X_{D7}$ and $Z_{\rm L}=Z_{D3}Z_{D4}Z_{D5}$, and the system can be initialized by preparing the logical states using digital quantum circuits that consist of single- and two-qubit gates [40, 13].
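The anti-commutation of the two logical operators follows from their supports: they overlap on exactly one data qubit, D4, where one acts as $X$ and the other as $Z$. The sketch below checks this with a dictionary representation of Pauli strings; the helper names are hypothetical, and only the qubit labels are taken from the text.

```python
def pauli_string(kind: str, qubits: list[str]) -> dict[str, str]:
    """Represent a product of identical single-qubit Paulis as {qubit: 'X' | 'Y' | 'Z'}."""
    return {q: kind for q in qubits}


def anticommute(a: dict[str, str], b: dict[str, str]) -> bool:
    """Two Pauli strings anticommute iff they differ on an odd number of shared qubits."""
    clashes = sum(1 for q in a if q in b and a[q] != b[q])
    return clashes % 2 == 1


x_logical = pauli_string("X", ["D1", "D4", "D7"])
z_logical = pauli_string("Z", ["D3", "D4", "D5"])
```

Here `anticommute(x_logical, z_logical)` evaluates to `True` because the only shared qubit is D4. Two strings that clash on an even number of qubits, such as $X_{D1}X_{D4}$ and $Z_{D1}Z_{D4}$, commute.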

We demonstrate the effectiveness of ZNE following the sequence diagram shown in Fig. 4a, with three representative initial states: $\ket{0_{\rm L}}$, $\ket{+_{\rm L}}$, and $\ket{\psi_{\rm L}}=\cos\frac{\pi}{6}\ket{0_{\rm L}}+\sin\frac{\pi}{6}\ket{1_{\rm L}}$ (Fig. 4b). In the absence of error injections, the initial states reconstructed from the corrected expectation values of the $Z_{\rm L}$ and $X_{\rm L}$ observables lie significantly closer to the surface of the Bloch sphere than those reconstructed from the uncorrected values (Fig. 4b). This demonstrates reductions in both bit-flip and phase-flip errors that arise during the initial state preparation and parity measurement processes. We present numerical simulations based on the single-qubit depolarizing model, where the depolarizing rate is calibrated by matching the fidelity of the prepared logical state, $\ket{0_{\rm L}}$. As shown in Figs. 4b-c, this noise model agrees well with the experimental data, regardless of the initial state. Detailed results for both uncorrected and corrected expectation values of the $Z_{\rm L}$ observable, with the initial state set to $\ket{\psi_{\rm L}}$, are presented in Fig. 4c; see Supplementary Fig. S8 for more experimental results. To implement ZNE, we insert random Pauli gates on data qubits before and after the parity-check measurements to amplify the noise. In surface codes, quantum computation is driven by parity-check measurements applied on a lattice that deforms with time, in protocols such as braiding transformation [15] and lattice surgery [41]. ZNE can be applied to such circuits through the same noise amplification strategy as in our experiment. By incorporating ZNE, the residual errors in $\langle Z_{\rm L}\rangle$ and $\langle X_{\rm L}\rangle$ are further reduced beyond error correction (see Fig. 4d and Supplementary Fig. S8).
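A rough sketch of one Pauli injection layer is given below. It assumes that each data qubit independently receives a uniformly random non-identity Pauli with probability `q`; how `q` relates to the noise scaling factor $r$ and the unit probability $p$ is not specified in this section, so `q` is a hypothetical parameter, as are the function names.

```python
import random


def sample_pauli_layer(n_qubits: int, q: float, rng: random.Random) -> list[str]:
    """Draw one injection layer: each qubit gets X, Y or Z with total probability q, else I."""
    layer = []
    for _ in range(n_qubits):
        if rng.random() < q:
            layer.append(rng.choice("XYZ"))
        else:
            layer.append("I")
    return layer


def flips_z(pauli: str) -> bool:
    """X and Y anticommute with Z, so they flip a Z-basis outcome; I and Z do not."""
    return pauli in ("X", "Y")


def flips_x(pauli: str) -> bool:
    """Y and Z anticommute with X, so they flip an X-basis outcome; I and X do not."""
    return pauli in ("Y", "Z")
```

For the nine data qubits of the distance-3 code, a layer could be drawn as `sample_pauli_layer(9, q, random.Random(seed))`. Under this model, each injected layer acts as a bit flip with probability $2q/3$ and as a phase flip with probability $2q/3$ on every data qubit, so both error types are amplified together.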

III Discussion

Our results demonstrate for the first time the application of quantum error mitigation on error correction circuits, effectively minimizing the impact of error correction failures. The specific method that we choose is ZNE: we amplify noise on physical qubits and utilize a selected polynomial extrapolation function, thereby adapting the method to be compatible with fault-tolerant quantum circuits. Amplifying physical error rates, a well-established and experimentally validated technique on NISQ circuits, ensures the feasibility of our approach. Moreover, the success of error mitigation methods such as ZNE and probabilistic error cancellation fundamentally depends on precise noise modeling. Our experimental results exhibit strong consistency with numerical simulations of noise models from calibration, offering a robust foundation for further performance optimization. As the main result, our approach achieves a reduction in error-induced bias, outperforming both standalone error correction and error mitigation. This bias reduction illustrates the potential to achieve reliable quantum computing on hardware that only permits limited error correction capabilities, marking a pivotal step toward large-scale quantum computing on noisy devices.


Acknowledgements

The device was fabricated at the Micro-Nano Fabrication Center of Zhejiang University. We acknowledge the support from the National Natural Science Foundation of China (Grant Nos. 92365301, 92065204, 12404574, 12274368, 12274367, 12174342, 12322414, 12404570, U20A2076, 12225507 and 12088101), the Innovation Program for Quantum Science and Technology (Grant No. 2021ZD0300200), the Zhejiang Provincial Natural Science Foundation of China (Grant Nos. LR24A040002 and LDQ23A040001), the National Key Research and Development Program of China (Grant No. 2023YFB4502600), and the NSAF (Grant No. U1930403).


Author contributions

H.X. and Y.L. proposed the ideas and conducted the theoretical analysis; A.Z., Y.G. and J.Y. carried out the experiments and analyzed the experimental data under the supervision of P.Z. and H.W.; H.L. and J.C. fabricated the device, supervised by H.W. All authors contributed to the experimental setup, analysis of data, discussions of the results and writing of the manuscript.

References

  • Acharya et al. [2024] R. Acharya, L. Aghababaie-Beni, I. Aleiner, T. I. Andersen, M. Ansmann, F. Arute, K. Arya, A. Asfaw, N. Astrakhantsev, J. Atalaya, R. Babbush, D. Bacon, B. Ballard, J. C. Bardin, J. Bausch, A. Bengtsson, A. Bilmes, S. Blackwell, S. Boixo, G. Bortoli, A. Bourassa, J. Bovaird, L. Brill, M. Broughton, D. A. Browne, B. Buchea, B. B. Buckley, D. A. Buell, T. Burger, B. Burkett, N. Bushnell, A. Cabrera, J. Campero, H.-S. Chang, Y. Chen, Z. Chen, B. Chiaro, D. Chik, C. Chou, J. Claes, A. Y. Cleland, J. Cogan, R. Collins, P. Conner, W. Courtney, A. L. Crook, B. Curtin, S. Das, A. Davies, L. D. Lorenzo, D. M. Debroy, S. Demura, M. Devoret, A. D. Paolo, P. Donohoe, I. Drozdov, A. Dunsworth, C. Earle, T. Edlich, A. Eickbusch, A. M. Elbag, M. Elzouka, C. Erickson, L. Faoro, E. Farhi, V. S. Ferreira, L. F. Burgos, E. Forati, A. G. Fowler, B. Foxen, S. Ganjam, G. Garcia, R. Gasca, Élie Genois, W. Giang, C. Gidney, D. Gilboa, R. Gosula, A. G. Dau, D. Graumann, A. Greene, J. A. Gross, S. Habegger, J. Hall, M. C. Hamilton, M. Hansen, M. P. Harrigan, S. D. Harrington, F. J. H. Heras, S. Heslin, P. Heu, O. Higgott, G. Hill, J. Hilton, G. Holland, S. Hong, H.-Y. Huang, A. Huff, W. J. Huggins, L. B. Ioffe, S. V. Isakov, J. Iveland, E. Jeffrey, Z. Jiang, C. Jones, S. Jordan, C. Joshi, P. Juhas, D. Kafri, H. Kang, A. H. Karamlou, K. Kechedzhi, J. Kelly, T. Khaire, T. Khattar, M. Khezri, S. Kim, P. V. Klimov, A. R. Klots, B. Kobrin, P. Kohli, A. N. Korotkov, F. Kostritsa, R. Kothari, B. Kozlovskii, J. M. Kreikebaum, V. D. Kurilovich, N. Lacroix, D. Landhuis, T. Lange-Dei, B. W. Langley, P. Laptev, K.-M. Lau, L. L. Guevel, J. Ledford, K. Lee, Y. D. Lensky, S. Leon, B. J. Lester, W. Y. Li, Y. Li, A. T. Lill, W. Liu, W. P. Livingston, A. Locharla, E. Lucero, D. Lundahl, A. Lunt, S. Madhuk, F. D. Malone, A. Maloney, S. Mandrá, L. S. Martin, S. Martin, O. Martin, C. Maxfield, J. R. McClean, M. McEwen, S. Meeks, A. Megrant, X. Mi, K. C. Miao, A. Mieszala, R. Molavi, S. 
Molina, S. Montazeri, A. Morvan, R. Movassagh, W. Mruczkiewicz, O. Naaman, M. Neeley, C. Neill, A. Nersisyan, H. Neven, M. Newman, J. H. Ng, A. Nguyen, M. Nguyen, C.-H. Ni, T. E. O’Brien, W. D. Oliver, A. Opremcak, K. Ottosson, A. Petukhov, A. Pizzuto, J. Platt, R. Potter, O. Pritchard, L. P. Pryadko, C. Quintana, G. Ramachandran, M. J. Reagor, D. M. Rhodes, G. Roberts, E. Rosenberg, E. Rosenfeld, P. Roushan, N. C. Rubin, N. Saei, D. Sank, K. Sankaragomathi, K. J. Satzinger, H. F. Schurkus, C. Schuster, A. W. Senior, M. J. Shearn, A. Shorter, N. Shutty, V. Shvarts, S. Singh, V. Sivak, J. Skruzny, S. Small, V. Smelyanskiy, W. C. Smith, R. D. Somma, S. Springer, G. Sterling, D. Strain, J. Suchard, A. Szasz, A. Sztein, D. Thor, A. Torres, M. M. Torunbalci, A. Vaishnav, J. Vargas, S. Vdovichev, G. Vidal, B. Villalonga, C. V. Heidweiller, S. Waltman, S. X. Wang, B. Ware, K. Weber, T. White, K. Wong, B. W. K. Woo, C. Xing, Z. J. Yao, P. Yeh, B. Ying, J. Yoo, N. Yosri, G. Young, A. Zalcman, Y. Zhang, N. Zhu, and N. Zobrist, Quantum error correction below the surface code threshold (2024), arXiv:2408.13687 [quant-ph] .
  • Preskill [2018] J. Preskill, Quantum Computing in the NISQ era and beyond, Quantum 2, 79 (2018).
  • Cai et al. [2023] Z. Cai, R. Babbush, S. C. Benjamin, S. Endo, W. J. Huggins, Y. Li, J. R. McClean, and T. E. O’Brien, Quantum error mitigation, Rev. Mod. Phys. 95, 045005 (2023).
  • Piveteau et al. [2021] C. Piveteau, D. Sutter, S. Bravyi, J. M. Gambetta, and K. Temme, Error mitigation for universal gates on encoded qubits, Phys. Rev. Lett. 127, 200505 (2021).
  • Suzuki et al. [2022] Y. Suzuki, S. Endo, K. Fujii, and Y. Tokunaga, Quantum error mitigation as a universal error reduction technique: Applications from the nisq to the fault-tolerant quantum computing eras, PRX Quantum 3, 010345 (2022).
  • Li and Benjamin [2017] Y. Li and S. C. Benjamin, Efficient variational quantum simulator incorporating active error minimization, Phys. Rev. X 7, 021050 (2017).
  • Temme et al. [2017a] K. Temme, S. Bravyi, and J. M. Gambetta, Error mitigation for short-depth quantum circuits, Phys. Rev. Lett. 119, 180509 (2017a).
  • Kim et al. [2023] Y. Kim, A. Eddins, S. Anand, K. X. Wei, E. van den Berg, S. Rosenblatt, H. Nayfeh, Y. Wu, M. Zaletel, K. Temme, and A. Kandala, Evidence for the utility of quantum computing before fault tolerance, Nature 618, 500 (2023).
  • O’Brien et al. [2023] T. E. O’Brien, G. Anselmetti, F. Gkritsis, V. E. Elfving, S. Polla, W. J. Huggins, O. Oumarou, K. Kechedzhi, D. Abanin, R. Acharya, I. Aleiner, R. Allen, T. I. Andersen, K. Anderson, M. Ansmann, F. Arute, K. Arya, A. Asfaw, J. Atalaya, J. C. Bardin, A. Bengtsson, G. Bortoli, A. Bourassa, J. Bovaird, L. Brill, M. Broughton, B. Buckley, D. A. Buell, T. Burger, B. Burkett, N. Bushnell, J. Campero, Z. Chen, B. Chiaro, D. Chik, J. Cogan, R. Collins, P. Conner, W. Courtney, A. L. Crook, B. Curtin, D. M. Debroy, S. Demura, I. Drozdov, A. Dunsworth, C. Erickson, L. Faoro, E. Farhi, R. Fatemi, V. S. Ferreira, L. Flores Burgos, E. Forati, A. G. Fowler, B. Foxen, W. Giang, C. Gidney, D. Gilboa, M. Giustina, R. Gosula, A. Grajales Dau, J. A. Gross, S. Habegger, M. C. Hamilton, M. Hansen, M. P. Harrigan, S. D. Harrington, P. Heu, M. R. Hoffmann, S. Hong, T. Huang, A. Huff, L. B. Ioffe, S. V. Isakov, J. Iveland, E. Jeffrey, Z. Jiang, C. Jones, P. Juhas, D. Kafri, T. Khattar, M. Khezri, M. Kieferová, S. Kim, P. V. Klimov, A. R. Klots, A. N. Korotkov, F. Kostritsa, J. M. Kreikebaum, D. Landhuis, P. Laptev, K.-M. Lau, L. Laws, J. Lee, K. Lee, B. J. Lester, A. T. Lill, W. Liu, W. P. Livingston, A. Locharla, F. D. Malone, S. Mandrà, O. Martin, S. Martin, J. R. McClean, T. McCourt, M. McEwen, X. Mi, A. Mieszala, K. C. Miao, M. Mohseni, S. Montazeri, A. Morvan, R. Movassagh, W. Mruczkiewicz, O. Naaman, M. Neeley, C. Neill, A. Nersisyan, M. Newman, J. H. Ng, A. Nguyen, M. Nguyen, M. Y. Niu, S. Omonije, A. Opremcak, A. Petukhov, R. Potter, L. P. Pryadko, C. Quintana, C. Rocque, P. Roushan, N. Saei, D. Sank, K. Sankaragomathi, K. J. Satzinger, H. F. Schurkus, C. Schuster, M. J. Shearn, A. Shorter, N. Shutty, V. Shvarts, J. Skruzny, W. C. Smith, R. D. Somma, G. Sterling, D. Strain, M. Szalay, D. Thor, A. Torres, G. Vidal, B. Villalonga, C. Vollgraff Heidweiller, T. White, B. W. K. Woo, C. Xing, Z. J. Yao, P. Yeh, J. Yoo, G. Young, A. Zalcman, Y. Zhang, N. Zhu, N. 
Zobrist, D. Bacon, S. Boixo, Y. Chen, J. Hilton, J. Kelly, E. Lucero, A. Megrant, H. Neven, V. Smelyanskiy, C. Gogolin, R. Babbush, and N. C. Rubin, Purification-based quantum error mitigation of pair-correlated electron simulations, Nat. Phys. 19, 1787 (2023).
  • van den Berg et al. [2023a] E. van den Berg, Z. K. Minev, A. Kandala, and K. Temme, Probabilistic error cancellation with sparse Pauli–Lindblad models on noisy quantum processors, Nat. Phys. 19, 1116 (2023a).
  • Xu et al. [2023] S. Xu, Z.-Z. Sun, K. Wang, L. Xiang, Z. Bao, Z. Zhu, F. Shen, Z. Song, P. Zhang, W. Ren, X. Zhang, H. Dong, J. Deng, J. Chen, Y. Wu, Z. Tan, Y. Gao, F. Jin, X. Zhu, C. Zhang, N. Wang, Y. Zou, J. Zhong, A. Zhang, W. Li, W. Jiang, L.-W. Yu, Y. Yao, Z. Wang, H. Li, Q. Guo, C. Song, H. Wang, and D.-L. Deng, Digital simulation of projective non-abelian anyons with 68 superconducting qubits, Chin. Phys. Lett. 40, 060301 (2023).
  • Krinner et al. [2022] S. Krinner, N. Lacroix, A. Remm, A. Di Paolo, E. Genois, C. Leroux, C. Hellings, S. Lazar, F. Swiadek, J. Herrmann, G. J. Norris, C. K. Andersen, M. Müller, A. Blais, C. Eichler, and A. Wallraff, Realizing repeated quantum error correction in a distance-three surface code, Nature 605, 669 (2022).
  • Zhao et al. [2022] Y. Zhao, Y. Ye, H.-L. Huang, Y. Zhang, D. Wu, H. Guan, Q. Zhu, Z. Wei, T. He, S. Cao, F. Chen, T.-H. Chung, H. Deng, D. Fan, M. Gong, C. Guo, S. Guo, L. Han, N. Li, S. Li, Y. Li, F. Liang, J. Lin, H. Qian, H. Rong, H. Su, L. Sun, S. Wang, Y. Wu, Y. Xu, C. Ying, J. Yu, C. Zha, K. Zhang, Y.-H. Huo, C.-Y. Lu, C.-Z. Peng, X. Zhu, and J.-W. Pan, Realization of an error-correcting surface code with superconducting qubits, Phys. Rev. Lett. 129, 030501 (2022).
  • Dennis et al. [2002] E. Dennis, A. Kitaev, A. Landahl, and J. Preskill, Topological quantum memory, J. Math. Phys. 43, 4452 (2002).
  • Fowler et al. [2012] A. G. Fowler, M. Mariantoni, J. M. Martinis, and A. N. Cleland, Surface codes: Towards practical large-scale quantum computation, Phys. Rev. A 86, 032324 (2012).
  • Devitt et al. [2013] S. J. Devitt, W. J. Munro, and K. Nemoto, Quantum error correction for beginners, Rep. Prog. Phys. 76, 076001 (2013).
  • Terhal [2015] B. M. Terhal, Quantum error correction for quantum memories, Rev. Mod. Phys. 87, 307 (2015).
  • van den Berg et al. [2023b] E. van den Berg, Z. K. Minev, A. Kandala, and K. Temme, Probabilistic error cancellation with sparse Pauli–Lindblad models on noisy quantum processors, Nat. Phys. 19, 1116 (2023b).
  • Eisert et al. [2020] J. Eisert, D. Hangleiter, N. Walk, I. Roth, D. Markham, R. Parekh, U. Chabaud, and E. Kashefi, Quantum certification and benchmarking, Nat. Rev. Phys. 2, 382 (2020).
  • Chen et al. [2022] S. Chen, S. Zhou, A. Seif, and L. Jiang, Quantum advantages for pauli channel estimation, Phys. Rev. A 105, 032435 (2022).
  • Breuckmann and Eberhardt [2021] N. P. Breuckmann and J. N. Eberhardt, Quantum low-density parity-check codes, PRX Quantum 2, 040101 (2021).
  • Bravyi et al. [2024] S. Bravyi, A. W. Cross, J. M. Gambetta, D. Maslov, P. Rall, and T. J. Yoder, High-threshold and low-overhead fault-tolerant quantum memory, Nature 627, 778 (2024).
  • Czarnik et al. [2021] P. Czarnik, A. Arrasmith, P. J. Coles, and L. Cincio, Error mitigation with Clifford quantum-circuit data, Quantum 5, 592 (2021).
  • Strikis et al. [2021] A. Strikis, D. Qin, Y. Chen, S. C. Benjamin, and Y. Li, Learning-based quantum error mitigation, PRX Quantum 2, 040330 (2021).
  • Kandala et al. [2019] A. Kandala, K. Temme, A. D. Córcoles, A. Mezzacapo, J. M. Chow, and J. M. Gambetta, Error mitigation extends the computational reach of a noisy quantum processor, Nature 567, 491 (2019).
  • Chen et al. [2023] S. Chen, J. Cotler, H.-Y. Huang, and J. Li, The complexity of NISQ, Nat. Commun. 14, 6001 (2023).
  • Bravyi and Vargo [2013] S. Bravyi and A. Vargo, Simulation of rare events in quantum error correction, Phys. Rev. A 88, 062308 (2013).
  • Temme et al. [2017b] K. Temme, S. Bravyi, and J. M. Gambetta, Error mitigation for short-depth quantum circuits, Phys. Rev. Lett. 119, 180509 (2017b).
  • He et al. [2020] A. He, B. Nachman, W. A. de Jong, and C. W. Bauer, Zero-noise extrapolation for quantum-gate error mitigation with identity insertions, Phys. Rev. A 102, 012426 (2020).
  • Chiaverini et al. [2004] J. Chiaverini, D. Leibfried, T. Schaetz, M. D. Barrett, R. B. Blakestad, J. Britton, W. M. Itano, J. D. Jost, E. Knill, C. Langer, R. Ozeri, and D. J. Wineland, Realization of quantum error correction, Nature 432, 602 (2004).
  • Moses et al. [2023] S. A. Moses, C. H. Baldwin, M. S. Allman, R. Ancona, L. Ascarrunz, C. Barnes, J. Bartolotta, B. Bjork, P. Blanchard, M. Bohn, J. G. Bohnet, N. C. Brown, N. Q. Burdick, W. C. Burton, S. L. Campbell, J. P. Campora, C. Carron, J. Chambers, J. W. Chan, Y. H. Chen, A. Chernoguzov, E. Chertkov, J. Colina, J. P. Curtis, R. Daniel, M. DeCross, D. Deen, C. Delaney, J. M. Dreiling, C. T. Ertsgaard, J. Esposito, B. Estey, M. Fabrikant, C. Figgatt, C. Foltz, M. Foss-Feig, D. Francois, J. P. Gaebler, T. M. Gatterman, C. N. Gilbreth, J. Giles, E. Glynn, A. Hall, A. M. Hankin, A. Hansen, D. Hayes, B. Higashi, I. M. Hoffman, B. Horning, J. J. Hout, R. Jacobs, J. Johansen, L. Jones, J. Karcz, T. Klein, P. Lauria, P. Lee, D. Liefer, S. T. Lu, D. Lucchetti, C. Lytle, A. Malm, M. Matheny, B. Mathewson, K. Mayer, D. B. Miller, M. Mills, B. Neyenhuis, L. Nugent, S. Olson, J. Parks, G. N. Price, Z. Price, M. Pugh, A. Ransford, A. P. Reed, C. Roman, M. Rowe, C. Ryan-Anderson, S. Sanders, J. Sedlacek, P. Shevchuk, P. Siegfried, T. Skripka, B. Spaun, R. T. Sprenkle, R. P. Stutz, M. Swallows, R. I. Tobey, A. Tran, T. Tran, E. Vogt, C. Volin, J. Walker, A. M. Zolot, and J. M. Pino, A race-track trapped-ion quantum processor, Phys. Rev. X 13, 041052 (2023).
  • Cory et al. [1998] D. G. Cory, M. D. Price, W. Maas, E. Knill, R. Laflamme, W. H. Zurek, T. F. Havel, and S. S. Somaroo, Experimental quantum error correction, Phys. Rev. Lett. 81, 2152 (1998).
  • Moussa et al. [2011] O. Moussa, J. Baugh, C. A. Ryan, and R. Laflamme, Demonstration of sufficient control for two rounds of quantum error correction in a solid state ensemble quantum information processor, Phys. Rev. Lett. 107, 160501 (2011).
  • Reed et al. [2012] M. D. Reed, L. DiCarlo, S. E. Nigg, L. Sun, L. Frunzio, S. M. Girvin, and R. J. Schoelkopf, Realization of three-qubit quantum error correction with superconducting circuits, Nature 482, 382 (2012).
  • Fowler [2015] A. G. Fowler, Minimum weight perfect matching of fault-tolerant topological quantum error correction in average o(1) parallel time, Quantum Info. Comput. 15, 145–158 (2015).
  • Gidney [2021] C. Gidney, Stim: a fast stabilizer circuit simulator, Quantum 5, 497 (2021).
  • Takagi et al. [2022] R. Takagi, S. Endo, S. Minagawa, and M. Gu, Fundamental limits of quantum error mitigation, npj Quantum Inf. 8, 114 (2022).
  • Quek et al. [2024] Y. Quek, D. Stilck França, S. Khatri, J. J. Meyer, and J. Eisert, Exponentially tighter bounds on limitations of quantum error mitigation, Nat. Phys. 20, 1648 (2024).
  • Tsubouchi et al. [2023] K. Tsubouchi, T. Sagawa, and N. Yoshioka, Universal cost bound of quantum error mitigation based on quantum estimation theory, Phys. Rev. Lett. 131, 210601 (2023).
  • Satzinger et al. [2021] K. J. Satzinger, Y.-J. Liu, A. Smith, C. Knapp, M. Newman, C. Jones, Z. Chen, C. Quintana, X. Mi, A. Dunsworth, C. Gidney, I. Aleiner, F. Arute, K. Arya, J. Atalaya, R. Babbush, J. C. Bardin, R. Barends, J. Basso, A. Bengtsson, A. Bilmes, M. Broughton, B. B. Buckley, D. A. Buell, B. Burkett, N. Bushnell, B. Chiaro, R. Collins, W. Courtney, S. Demura, A. R. Derk, D. Eppens, C. Erickson, L. Faoro, E. Farhi, A. G. Fowler, B. Foxen, M. Giustina, A. Greene, J. A. Gross, M. P. Harrigan, S. D. Harrington, J. Hilton, S. Hong, T. Huang, W. J. Huggins, L. B. Ioffe, S. V. Isakov, E. Jeffrey, Z. Jiang, D. Kafri, K. Kechedzhi, T. Khattar, S. Kim, P. V. Klimov, A. N. Korotkov, F. Kostritsa, D. Landhuis, P. Laptev, A. Locharla, E. Lucero, O. Martin, J. R. McClean, M. McEwen, K. C. Miao, M. Mohseni, S. Montazeri, W. Mruczkiewicz, J. Mutus, O. Naaman, M. Neeley, C. Neill, M. Y. Niu, T. E. O’Brien, A. Opremcak, B. Pató, A. Petukhov, N. C. Rubin, D. Sank, V. Shvarts, D. Strain, M. Szalay, B. Villalonga, T. C. White, Z. Yao, P. Yeh, J. Yoo, A. Zalcman, H. Neven, S. Boixo, A. Megrant, Y. Chen, J. Kelly, V. Smelyanskiy, A. Kitaev, M. Knap, F. Pollmann, and P. Roushan, Realizing topologically ordered states on a quantum processor, Science 374, 1237 (2021).
  • Horsman et al. [2012] D. Horsman, A. G. Fowler, S. Devitt, and R. V. Meter, Surface code quantum computing by lattice surgery, New J. Phys. 14, 123011 (2012).
  • Javadi-Abhari et al. [2024] A. Javadi-Abhari, M. Treinish, K. Krsulich, C. J. Wood, J. Lishman, J. Gacon, S. Martiel, P. D. Nation, L. S. Bishop, A. W. Cross, B. R. Johnson, and J. M. Gambetta, Quantum computing with Qiskit (2024), arXiv:2405.08810 [quant-ph] .
  • Fowler [2013] A. G. Fowler, Analytic asymptotic performance of topological codes, Phys. Rev. A 87, 040301 (2013).
  • Watson and Barrett [2014] F. H. E. Watson and S. D. Barrett, Logical error rate scaling of the toric code, New J. Phys. 16, 093045 (2014).
  • Bravyi and Kitaev [2005] S. Bravyi and A. Kitaev, Universal quantum computation with ideal Clifford gates and noisy ancillas, Phys. Rev. A 71, 022316 (2005).
  • Raussendorf et al. [2007] R. Raussendorf, J. Harrington, and K. Goyal, Topological fault-tolerance in cluster state quantum computation, New J. Phys. 9, 199 (2007).
  • Endo et al. [2018] S. Endo, S. C. Benjamin, and Y. Li, Practical quantum error mitigation for near-future applications, Phys. Rev. X 8, 031027 (2018).

Supplementary Information for “Demonstrating quantum error mitigation on logical qubits”

S1 Zero-noise extrapolation formulas

First, we introduce some notation and briefly review NISQ and FTQC circuits. Here, FTQC circuits refer to quantum circuits with mid-circuit state preparation and measurement, feedback, and post-selection. Then, we show with an example how to express an FTQC circuit as a composition of completely positive maps. Finally, we generalize the expression to arbitrary FTQC circuits and use it to derive the formula for ZNE.

Notations. We use $X_i, Y_i, Z_i$ to denote Pauli operators on qubit-$i$, and we use $\openone$ to denote the identity operator. Given an operator $V$, we use $[V]$ to denote the completely positive map $[V]\bullet = V \bullet V^{\dagger}$.

We use three types of primitive operations to implement quantum computing: gate, state preparation and measurement. We can denote them with completely positive maps. An ideal gate is denoted by $[U]$, where $U$ is a unitary operator. An ideal state preparation is denoted by $\mathcal{A}_{P,i} = [(\openone + P_i)/2] + [P'_i][(\openone - P_i)/2]$, which prepares qubit-$i$ in the eigenstate of $P = X, Y, Z$ with the eigenvalue $+1$. Here, $P'$ is an arbitrary Pauli operator different from $P$, and $(\openone \pm P_i)/2$ is the projection operator onto eigenstates of $P_i$ with the eigenvalue $\pm 1$. An ideal measurement is denoted by $\mathcal{B}_{P,i}(\mu) = [(\openone + \mu P_i)/2]$, which is a measurement on qubit-$i$ in the $P = X, Y, Z$ basis. Here, $\mu = \pm 1$ is the measurement outcome. The maps $[U]$ and $\mathcal{A}_{P,i}$ are trace-preserving; $\mathcal{B}_{P,i}(\mu)$ is not, but $\mathcal{B}_{P,i}(+1) + \mathcal{B}_{P,i}(-1)$ is trace-preserving.
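The three primitive maps can be checked numerically on a single qubit. The following is a minimal sketch in plain Python, assuming $P = Z$ and the choice $P' = X$ for the state preparation; all helper names (`conj_map`, `projector`, `prepare`, `measure`) are illustrative and do not come from the paper.

```python
# Single-qubit sketch of the maps [V], A_P and B_P(mu) with plain 2x2 matrices.
# Helper names and the test state rho are illustrative assumptions.

I2 = [[1, 0], [0, 1]]
X = [[0, 1], [1, 0]]
Z = [[1, 0], [0, -1]]

def matmul(a, b):
    return [[sum(a[r][k] * b[k][c] for k in range(2)) for c in range(2)] for r in range(2)]

def dagger(a):
    return [[complex(a[c][r]).conjugate() for c in range(2)] for r in range(2)]

def add(a, b):
    return [[a[r][c] + b[r][c] for c in range(2)] for r in range(2)]

def trace(a):
    return a[0][0] + a[1][1]

def conj_map(v, rho):
    """The map [V] applied to rho: V rho V^dagger."""
    return matmul(matmul(v, rho), dagger(v))

def projector(p, sign):
    """(1 + sign * P) / 2, the projector onto the eigenspace of P with eigenvalue sign."""
    return [[0.5 * (I2[r][c] + sign * p[r][c]) for c in range(2)] for r in range(2)]

def prepare(p, p_prime, rho):
    """A_P = [(1 + P)/2] + [P'][(1 - P)/2], with P' a Pauli different from P."""
    kept = conj_map(projector(p, +1), rho)
    flipped = conj_map(p_prime, conj_map(projector(p, -1), rho))
    return add(kept, flipped)

def measure(p, mu, rho):
    """B_P(mu) = [(1 + mu P)/2]; returns the unnormalised post-measurement state."""
    return conj_map(projector(p, mu), rho)

rho = [[0.3, 0.2 - 0.1j], [0.2 + 0.1j, 0.7]]      # arbitrary valid input state
prepared = prepare(Z, X, rho)                      # ends in |0><0| whatever rho is
prob_plus = trace(measure(Z, +1, rho)).real        # probability of outcome +1
prob_minus = trace(measure(Z, -1, rho)).real       # probability of outcome -1
```

The trace of `prepared` equals the trace of `rho`, while each measurement branch alone has trace below one; only the sum of the two branches restores it, as stated in the text.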

When errors occur stochastically in an operation, the actual operation is in the form $\mathcal{M} = (1-p)\mathcal{M}^I + p\mathcal{M}^E$, where $\mathcal{M}^I = [U], \mathcal{A}_{P,i}, \mathcal{B}_{P,i}(\mu)$ is the ideal operation, $p$ is the probability of errors, and $\mathcal{M}^E$ is a completely positive map denoting the operation with errors.

Now, we take the Pauli error model as an example, which is a practical model of errors in quantum computing. In the Pauli error model, the process generating Pauli errors is a map in the form $\mathcal{N} = (1-p)[\openone] + p\mathcal{E}$, where $p$ is the error probability, and $\mathcal{E}$ is a trace-preserving completely positive map denoting errors. If we neglect crosstalk, errors only happen on the qubit subset that an operation acts on, called the support. For a single-qubit operation on qubit-$i$, the noise map is $\mathcal{N}_1 = (1-p)[\openone] + (p_X[X_i] + p_Y[Y_i] + p_Z[Z_i])$, where $p_X, p_Y, p_Z$ are probabilities of the corresponding Pauli errors, and $p = p_X + p_Y + p_Z$ is the total error probability. Similarly, for a two-qubit gate on qubits $i$ and $j$, the noise map is $\mathcal{N}_2 = (1-p)[\openone] + \sum_{P=X,Y,Z}(p_{PI}[P_i] + p_{IP}[P_j]) + \sum_{P,P'=X,Y,Z} p_{PP'}[P_i P'_j]$, where $p = \sum_{P=X,Y,Z}(p_{PI} + p_{IP}) + \sum_{P,P'=X,Y,Z} p_{PP'}$. To include crosstalk, the general noise map is in the form $\mathcal{N} = (1-p)[\openone] + \sum_P p_P[P]$, where $P$ is a Pauli operator, $p_P$ is the probability of the Pauli error $P$, and $p = \sum_P p_P$. Here, the Pauli operator $P$ is either a single-qubit Pauli operator or an arbitrary product of single-qubit operators, and it may act on qubits outside the operation's support. In this general Pauli-error noise map, the erroneous map is $\mathcal{E} = p^{-1}(\sum_P p_P[P])$.
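As a concrete instance of the single-qubit noise map $\mathcal{N}_1$, the sketch below applies it to the state $|+\rangle\langle+|$ and checks the decomposition $\mathcal{N} = (1-p)[\openone] + p\mathcal{E}$. The probabilities and all function names are assumptions chosen for illustration only.

```python
# Single-qubit Pauli noise map N_1 = (1-p)[1] + p_X[X] + p_Y[Y] + p_Z[Z].
# The numerical probabilities and helper names are illustrative assumptions.

PAULIS = {
    "X": [[0, 1], [1, 0]],
    "Y": [[0, -1j], [1j, 0]],
    "Z": [[1, 0], [0, -1]],
}

def matmul(a, b):
    return [[sum(a[r][k] * b[k][c] for k in range(2)) for c in range(2)] for r in range(2)]

def dagger(a):
    return [[complex(a[c][r]).conjugate() for c in range(2)] for r in range(2)]

def conj_map(v, rho):
    """The map [V] applied to rho: V rho V^dagger."""
    return matmul(matmul(v, rho), dagger(v))

def pauli_terms(rho, probs):
    """sum_P p_P [P] rho, the unnormalised error part of the noise map."""
    total = [[0, 0], [0, 0]]
    for name, prob in probs.items():
        term = conj_map(PAULIS[name], rho)
        total = [[total[r][c] + prob * term[r][c] for c in range(2)] for r in range(2)]
    return total

def noise_map(rho, probs):
    """N_1 applied to rho."""
    p = sum(probs.values())
    errors = pauli_terms(rho, probs)
    return [[(1 - p) * rho[r][c] + errors[r][c] for c in range(2)] for r in range(2)]

def error_map(rho, probs):
    """E = p^{-1} sum_P p_P [P], which is trace-preserving."""
    p = sum(probs.values())
    errors = pauli_terms(rho, probs)
    return [[errors[r][c] / p for c in range(2)] for r in range(2)]

probs = {"X": 0.01, "Y": 0.002, "Z": 0.03}   # hypothetical error rates
plus = [[0.5, 0.5], [0.5, 0.5]]              # the state |+><+|
noisy = noise_map(plus, probs)
erroneous = error_map(plus, probs)
```

For this input only the $Y$ and $Z$ errors act non-trivially, so the off-diagonal element shrinks from $0.5$ to $0.5\,(1 - 2p_Y - 2p_Z)$ while the trace is unchanged.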

In the Pauli error model, the actual operation of a gate is in the form $\mathcal{M} = \mathcal{N}\mathcal{M}^I$, where $\mathcal{M}^I = [U]$ is the ideal gate. For a state preparation operation, the actual operation is also in the form $\mathcal{M} = \mathcal{N}\mathcal{M}^I$, where $\mathcal{M}^I = \mathcal{A}_{P,i}$ is the ideal state preparation. For a measurement, the actual operation is in the form $\mathcal{M}(\mu) = \mathcal{M}^I(\mu)\mathcal{N}$ (the noise occurs before the ideal operation this time), where $\mathcal{M}^I(\mu) = \mathcal{B}_{P,i}(\mu)$ is the ideal measurement. Substituting the expression of the noise map $\mathcal{N}$, we find that actual operations are in the form $\mathcal{M} = (1-p)\mathcal{M}^I + p\mathcal{M}^E$, where $\mathcal{M}^E = \mathcal{E}\mathcal{M}^I$ for gates and state preparation, and $\mathcal{M}^E = \mathcal{M}^I\mathcal{E}$ for measurement.
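Written out, the substitution uses only $\mathcal{N} = (1-p)[\openone] + p\mathcal{E}$ and the linearity of map composition. For gates and state preparation (first line) and for measurements (second line):

\begin{align*}
\mathcal{M} &= \mathcal{N}\mathcal{M}^I = \big((1-p)[\openone] + p\mathcal{E}\big)\mathcal{M}^I = (1-p)\mathcal{M}^I + p\,\mathcal{E}\mathcal{M}^I, \\
\mathcal{M}(\mu) &= \mathcal{M}^I(\mu)\mathcal{N} = \mathcal{M}^I(\mu)\big((1-p)[\openone] + p\mathcal{E}\big) = (1-p)\mathcal{M}^I(\mu) + p\,\mathcal{M}^I(\mu)\mathcal{E}.
\end{align*}

As an illustrative case (not taken from the paper), consider a noisy $Z$ measurement on a qubit in the $+1$ eigenstate of $Z$ with single-qubit Pauli noise. Only $X$ and $Y$ errors flip the state before the ideal measurement, so the outcome $\mu = -1$ occurs with probability $p_X + p_Y$.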

NISQ circuits and FTQC circuits. In an NISQ circuit, qubits are prepared at the beginning and measured at the end of the circuit, and gates are applied between the state preparation and measurement. Such circuits are insufficient for the usual protocols of FTQC.

In error correction, we usually detect errors using measurements. Typically, we repeatedly measure a set of carefully chosen Pauli operators, stabilizer operators of the error correction code. Then, we send the measurement outcomes to a classical algorithm called a decoder. With the algorithm, we compute a set of Pauli gates. By applying the Pauli gates, we can correct errors on qubits. Therefore, the error correction is a typical feedback process.
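The feedback loop described above (stabilizer measurement, decoder, Pauli correction) can be sketched classically. The example below assumes a three-qubit bit-flip repetition code with a lookup-table decoder; this code and decoder are chosen purely for illustration and are not claimed to be those used in the experiment.

```python
# Toy feedback loop: stabilizer measurement -> decoder -> Pauli correction.
# The 3-qubit bit-flip repetition code and the lookup-table decoder are
# illustrative assumptions, not the code or decoder used in the paper.

def measure_stabilizers(bits):
    """Outcomes (+1 or -1) of Z1Z2 and Z2Z3 on a computational-basis state."""
    return tuple(+1 if bits[a] == bits[b] else -1 for a, b in ((0, 1), (1, 2)))

# Decoder: maps each syndrome to the qubit that receives an X correction (or None).
DECODER = {(+1, +1): None, (-1, +1): 0, (-1, -1): 1, (+1, -1): 2}

def correct(bits):
    """One round of error correction with feedback."""
    flip = DECODER[measure_stabilizers(bits)]
    corrected = list(bits)
    if flip is not None:
        corrected[flip] ^= 1  # apply the Pauli X gate chosen by the decoder
    return corrected

# Every single bit-flip error on the codeword 000 is undone.
recovered = [correct(error) for error in ([1, 0, 0], [0, 1, 0], [0, 0, 1])]
```

The gate applied in `correct` depends on earlier measurement outcomes, which is exactly the kind of dependence that a NISQ circuit cannot express.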

We need post-selection when the magic state is used in quantum computing. Because of the error correction code, we cannot directly apply all necessary gates on logical qubits. The magic state represents a promising protocol to complete the universal gate set. In the magic-state protocol, we prepare some resource states on logical qubits and then improve their fidelities in certain distillation circuits. With the resource states distilled, we can use them to implement gates that cannot be implemented directly. The distillation circuits improve the fidelity through post-selection: the resource state is kept or discarded depending on measurements in the distillation circuit; if discarded, we need to re-prepare the state and try distillation for another round. We remark that the preparation, distillation and utilization of magic states also require feedback operations.

Figure S1: An example circuit with feedback and post-selection operations. The circuit succeeds when $\mu_4 = +1$.

An example FTQC circuit. To illustrate the expression of FTQC circuits using completely positive maps, we take the circuit in Fig. S1 as an example, which includes feedback and post-selection operations. There are three state preparation operations and three measurements in the circuit. Depending on the measurement outcome $\mu_7$, the number of gates is either two or three.

Using completely positive maps, we can express the ideal circuit in the following form

\begin{align}
\mathcal{M}^I(\mu_4,\mu_7,\mu_9) ={}& \delta_{\mu_4,+1}\mathcal{M}_9^I(\mu_9)\mathcal{M}_8^I(\mu_7)\mathcal{M}_7^I(\mu_7)\mathcal{M}_6^I \notag \\
& \times \mathcal{M}_5^I\mathcal{M}_4^I(\mu_4)\mathcal{M}_3^I\mathcal{M}_2^I\mathcal{M}_1^I. \tag{S1}
\end{align}

Here, $\mathcal{M}_1^I = \mathcal{A}_{Z,1}$ and $\mathcal{M}_2^I = \mathcal{M}_5^I = \mathcal{A}_{Z,2}$ are state preparation operations, $\mathcal{M}_3^I = \mathcal{M}_6^I = [U]$ are controlled-NOT gates, $\mathcal{M}_4^I(\mu_4) = \mathcal{B}_{Z,2}(\mu_4)$, $\mathcal{M}_7^I(\mu_7) = \mathcal{B}_{Z,2}(\mu_7)$ and $\mathcal{M}_9^I(\mu_9) = \mathcal{B}_{Z,1}(\mu_9)$ are measurements, $\mathcal{M}_8^I(\mu_7) = \delta_{\mu_7,+1}[\openone] + \delta_{\mu_7,-1}[X_1]$ is the measurement-dependent gate in the feedback operation, and $\delta_{\mu_4,+1}$ denotes the post-selection operation.

With noise, the map of the circuit becomes

\begin{align}
\mathcal{M}(\mu_4,\mu_7,\mu_9) ={}& \delta_{\mu_4,+1}\mathcal{M}_9(\mu_9)\mathcal{M}_8(\mu_7)\mathcal{M}_7(\mu_7)\mathcal{M}_6 \notag \\
& \times \mathcal{M}_5\mathcal{M}_4(\mu_4)\mathcal{M}_3\mathcal{M}_2\mathcal{M}_1, \tag{S2}
\end{align}

where operations are in the form $\mathcal{M}_j = (1-p_j)\mathcal{M}_j^I + p_j\mathcal{M}_j^E$. Because the eighth operation is measurement-dependent, it has a measurement-dependent error probability $p_8(\mu_7)$ and erroneous operation $\mathcal{M}_8^E(\mu_7)$: $p_8(+1)$ and $\mathcal{M}_8^E(+1)$ [$p_8(-1)$ and $\mathcal{M}_8^E(-1)$] are the error probability and erroneous operation, respectively, of the gate $\openone$ ($X_1$).
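The measurement-dependent eighth operation can be sketched on qubit 1 alone. In the example below, the error probabilities in `P8` and the choice of a $Z$ error as the erroneous map are hypothetical values picked for illustration; the paper does not specify them here.

```python
# Noisy feedback gate M_8(mu_7) = (1 - p_8(mu_7)) M_8^I(mu_7) + p_8(mu_7) M_8^E(mu_7),
# restricted to qubit 1. The values in P8 and the assumption that the error is
# a Z flip after the gate are hypothetical, chosen only for illustration.

X = [[0, 1], [1, 0]]
Z = [[1, 0], [0, -1]]
P8 = {+1: 0.0, -1: 0.02}  # an idle qubit is assumed error-free, the X gate is not

def matmul(a, b):
    return [[sum(a[r][k] * b[k][c] for k in range(2)) for c in range(2)] for r in range(2)]

def dagger(a):
    return [[complex(a[c][r]).conjugate() for c in range(2)] for r in range(2)]

def conj_map(v, rho):
    """The map [V] applied to rho: V rho V^dagger."""
    return matmul(matmul(v, rho), dagger(v))

def ideal_feedback(mu7, rho):
    """M_8^I(mu_7): identity for mu_7 = +1, X gate for mu_7 = -1."""
    return rho if mu7 == +1 else conj_map(X, rho)

def noisy_feedback(mu7, rho):
    """Mixture of the ideal gate and the erroneous gate, weighted by p_8(mu_7)."""
    ideal = ideal_feedback(mu7, rho)
    erroneous = conj_map(Z, ideal)  # M_8^E = E M_8^I with E = [Z] (assumed)
    p = P8[mu7]
    return [[(1 - p) * ideal[r][c] + p * erroneous[r][c] for c in range(2)] for r in range(2)]

plus = [[0.5, 0.5], [0.5, 0.5]]                      # the state |+><+|
coherence_no_gate = noisy_feedback(+1, plus)[0][1]   # branch mu_7 = +1
coherence_x_gate = noisy_feedback(-1, plus)[0][1]    # branch mu_7 = -1
```

The two branches apply different gates and therefore carry different error probabilities, which is why $p_8$ must be treated as a function of $\mu_7$.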

General FTQC circuits. For a general FTQC circuit, we can express the ideal circuit consisting of $N$ operations in the form

\begin{align}
\mathcal{M}^I(\tilde{\mu}_N) ={}& g(\tilde{\mu}_N)\mathcal{M}_N^I(\tilde{\mu}_{N-1},\mu_N)\cdots \notag \\
& \times \mathcal{M}_j^I(\tilde{\mu}_{j-1},\mu_j)\cdots\mathcal{M}_2^I(\tilde{\mu}_1,\mu_2)\mathcal{M}_1^I(\mu_1). \tag{S3}
\end{align}

Here, $\mathcal{M}_j^I$ is the completely positive map denoting the $j$th operation. We use $\mu_j$ to represent the measurement outcome of the $j$th operation: $\mu_j = \pm 1$ ($\mu_j = 0$) if the $j$th operation is (not) a measurement. If the $j$th operation is a measurement, the map $\mathcal{M}_j^I$ is a function of the measurement outcome $\mu_j$; even if the $j$th operation is not a measurement, we can still express it as a function of $\mu_j$ because $\mu_j$ takes only one value anyway. We use $\tilde{\mu}_j$ to represent the tuple of measurement outcomes from the first to the $j$th operation, and we define it formally with the recursion formula $\tilde{\mu}_j = (\tilde{\mu}_{j-1}, \mu_j)$ and the initial value $\tilde{\mu}_0 = ()$. Because of feedback, the $j$th operation may depend on the measurement outcomes of previous operations. Therefore, the map $\mathcal{M}_j^I$ is also a function of $\tilde{\mu}_{j-1}$. If the operation is not a feedback operation, the dependence is trivial, i.e. $\mathcal{M}_j^I(\tilde{\mu}_{j-1}, \mu_j) = \mathcal{M}_j^I(\mu_j)$, as for the operations with $j \neq 8$ in the example circuit; the eighth operation is the only one with a non-trivial dependence on previous measurement outcomes. The function $g$ describes the post-selection,

\begin{equation}
g(\tilde{\mu}_N) = \begin{cases} 1, & \tilde{\mu}_N \in S; \\ 0, & \tilde{\mu}_N \notin S, \end{cases} \tag{S4}
\end{equation}

where $S$ denotes the set of measurement outcomes indicating success.
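The structure of Eqs. (S3) and (S4) can be sketched as a small interpreter in which each operation receives the tuple of earlier outcomes and its own outcome. The single-qubit circuit below (prepare $|+\rangle$, measure $Z$, apply $X$ if the outcome was $-1$, measure $Z$ again), the success set, and all names are assumptions made for illustration; this is not the circuit of Fig. S1.

```python
from itertools import product

# Toy instance of Eqs. (S3)-(S4) on one qubit. The circuit, the success set S
# and all helper names are illustrative assumptions, not taken from the paper.

I2 = [[1, 0], [0, 1]]
X = [[0, 1], [1, 0]]
Z = [[1, 0], [0, -1]]

def matmul(a, b):
    return [[sum(a[r][k] * b[k][c] for k in range(2)) for c in range(2)] for r in range(2)]

def dagger(a):
    return [[complex(a[c][r]).conjugate() for c in range(2)] for r in range(2)]

def conj_map(v, rho):
    """The map [V] applied to rho: V rho V^dagger."""
    return matmul(matmul(v, rho), dagger(v))

def projector(p, sign):
    return [[0.5 * (I2[r][c] + sign * p[r][c]) for c in range(2)] for r in range(2)]

def prepare_plus(history, mu, rho):
    """A_X with P' = Z; not a measurement, so mu is always 0."""
    kept = conj_map(projector(X, +1), rho)
    flipped = conj_map(Z, conj_map(projector(X, -1), rho))
    return [[kept[r][c] + flipped[r][c] for c in range(2)] for r in range(2)]

def measure_z(history, mu, rho):
    """B_Z(mu); independent of the history."""
    return conj_map(projector(Z, mu), rho)

def feedback_x(history, mu, rho):
    """Feedback gate: X is applied only if the latest earlier outcome was -1."""
    return conj_map(X, rho) if history[-1] == -1 else rho

# Each entry: (allowed values of mu_j, map M_j^I(history, mu_j)).
CIRCUIT = [((0,), prepare_plus), ((+1, -1), measure_z),
           ((0,), feedback_x), ((+1, -1), measure_z)]

def g(outcomes):
    """Post-selection: S contains every tuple whose final outcome is +1."""
    return 1 if outcomes[-1] == +1 else 0

def circuit_map(outcomes, rho):
    """g(mu) M_N^I ... M_1^I applied to rho for one tuple of outcomes."""
    for j, (_, op) in enumerate(CIRCUIT):
        rho = op(outcomes[:j], outcomes[j], rho)
    return [[g(outcomes) * rho[r][c] for c in range(2)] for r in range(2)]

rho_in = [[1, 0], [0, 0]]
probabilities = {}
for outcomes in product(*(allowed for allowed, _ in CIRCUIT)):
    out = circuit_map(outcomes, rho_in)
    probabilities[outcomes] = (out[0][0] + out[1][1]).real  # trace = joint probability
```

In this ideal toy circuit the feedback always returns the qubit to $|0\rangle$, so the two successful outcome tuples each occur with probability $1/2$ and post-selection discards nothing. Non-measurement operations carry the placeholder outcome $0$, following the convention introduced above.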

For the noisy circuit, its map is in the form

\begin{align}
\mathcal{M}(\tilde{\mu}_N) ={}& g(\tilde{\mu}_N)\mathcal{M}_N(\tilde{\mu}_{N-1},\mu_N)\cdots \notag \\
& \times \mathcal{M}_j(\tilde{\mu}_{j-1},\mu_j)\cdots\mathcal{M}_2(\tilde{\mu}_1,\mu_2)\mathcal{M}_1(\mu_1), \tag{S5}
\end{align}

where operations are in the form

\begin{align}
\mathcal{M}_j(\tilde{\mu}_{j-1},\mu_j) ={}& [1 - p_j(\tilde{\mu}_{j-1})]\mathcal{M}_j^I(\tilde{\mu}_{j-1},\mu_j) \notag \\
& + p_j(\tilde{\mu}_{j-1})\mathcal{M}_j^E(\tilde{\mu}_{j-1},\mu_j). \tag{S6}
\end{align}

If the $j$th operation is a feedback operation that depends on previous measurement outcomes, such as the eighth operation in the example circuit, the error probability $p_{j}$ depends on $\tilde{\mu}_{j-1}$; otherwise, $p_{j}$ is a constant function. The same applies to the erroneous operation $\mathcal{M}_{j}^{E}$.

The expression in Eq. (S5) makes the temporal order of operations explicit: an operation can depend only on the measurement outcomes of previous operations. This temporal order is unimportant for error mitigation, and neglecting it yields a simplified expression for a noisy circuit. We use $\bm{\mu}=\tilde{\mu}_{N}$ to denote all measurement outcomes in the circuit; then

$$
\mathcal{M}(\bm{\mu}) = g(\bm{\mu})\,\mathcal{M}_{N}(\bm{\mu})\cdots\mathcal{M}_{j}(\bm{\mu})\cdots\mathcal{M}_{2}(\bm{\mu})\,\mathcal{M}_{1}(\bm{\mu}), \tag{S7}
$$

where the operations are of the form

$$
\mathcal{M}_{j}(\bm{\mu}) = [1-p_{j}(\bm{\mu})]\,\mathcal{M}_{j}^{I}(\bm{\mu})+p_{j}(\bm{\mu})\,\mathcal{M}_{j}^{E}(\bm{\mu}). \tag{S8}
$$

Expansion formula of the observable with errors. With a circuit, we can evaluate observables that are functions of the measurement outcomes, $f(\bm{\mu})$. For example, to evaluate the observable $Z_{1}$ in the example circuit, we take $f(\bm{\mu})=\mu_{9}$. In the case of error correction, this function takes into account the last-round error correction on the measurement outcomes.

The ideal expected value of the observable is

$$
\langle O\rangle_{\rm ideal} = \sum_{\bm{\mu}}f(\bm{\mu})\operatorname{Tr}\left[\mathcal{M}^{I}(\bm{\mu})\rho\right], \tag{S9}
$$

where $\rho$ is the initial state, and $\operatorname{Tr}\left[\mathcal{M}^{I}(\bm{\mu})\rho\right]$ is the probability of the measurement outcome $\bm{\mu}$ in the ideal circuit.

Because of the post-selection, the distribution may not be normalized in general. The expected value in the normalized distribution is

$$
\langle O\rangle_{\rm ideal} = \frac{\langle O\rangle_{\rm ideal}}{\langle\mathbb{1}\rangle_{\rm ideal}}, \tag{S10}
$$

and the expression for $\langle\mathbb{1}\rangle_{\rm ideal}$ is obtained by substituting $f=1$ into Eq. (S9). In the following, we focus on $\langle O\rangle$; the result applies equally to $\langle\mathbb{1}\rangle$.
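The normalization in Eq. (S10) can be illustrated with sampled data. The following is a minimal sketch, assuming that the post-selection factor $g$ acts as an accept/reject indicator for each shot; the function name `normalized_expectation` and the sample data are hypothetical and not taken from the experiment.

```python
def normalized_expectation(values, accepted):
    """Return (<O>, <1>, <O>/<1>) estimated from per-shot data.

    values   -- f(mu) evaluated on each shot (hypothetical sample data)
    accepted -- True if the shot passes post-selection (assumed indicator g)
    """
    n_shots = len(values)
    if n_shots == 0 or len(accepted) != n_shots:
        raise ValueError("values and accepted must be non-empty and equally long")
    # Unnormalized averages: rejected shots contribute zero weight.
    o_unnormalized = sum(v for v, ok in zip(values, accepted) if ok) / n_shots
    norm = sum(1 for ok in accepted if ok) / n_shots
    if norm == 0:
        raise ValueError("no shot passed post-selection")
    return o_unnormalized, norm, o_unnormalized / norm


# Four shots, one of which is rejected by post-selection.
o_raw, norm, o_normalized = normalized_expectation(
    [1, -1, 1, 1], [True, True, False, True]
)
```

The ratio equals the plain average of $f$ over the accepted shots, which is the expected value in the normalized distribution.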

With a noisy circuit, the expected value of the observable is

$$
\langle O\rangle = \sum_{\bm{\mu}}f(\bm{\mu})\operatorname{Tr}\left[\mathcal{M}(\bm{\mu})\rho\right], \tag{S11}
$$

where $\operatorname{Tr}\left[\mathcal{M}(\bm{\mu})\rho\right]$ is the probability of the measurement outcome $\bm{\mu}$ in the noisy circuit. To explicitly express the observable as a function of error probabilities, we rewrite each actual operation in the circuit as

$$
\mathcal{M}_{j} = \mathcal{M}_{j}^{(0)}+p_{j}\mathcal{M}_{j}^{(1)} = \sum_{b_{j}=0,1}p_{j}^{b_{j}}\mathcal{M}_{j}^{(b_{j})}, \tag{S12}
$$

where $\mathcal{M}_{j}^{(0)}=\mathcal{M}_{j}^{I}$ is the ideal operation, and

$$
\mathcal{M}_{j}^{(1)} = \mathcal{M}_{j}^{E}-\mathcal{M}_{j}^{I} \tag{S13}
$$

is the deviation from the ideal operation. Substituting this expression for $\mathcal{M}_{j}$, we have

$$
\begin{aligned}
\langle O\rangle ={}& \sum_{\bm{\mu}}\sum_{b_{1},b_{2},\ldots,b_{N}=0,1}\left[\prod_{j=1}^{N}p_{j}^{b_{j}}(\bm{\mu})\right] \\
&\times a(b_{1},b_{2},\ldots,b_{N};\bm{\mu}),
\end{aligned}
\tag{S14}
$$

where

$$
\begin{aligned}
&a(b_{1},b_{2},\ldots,b_{N};\bm{\mu}) \\
&\quad= f(\bm{\mu})\,g(\bm{\mu})\operatorname{Tr}\left[\mathcal{M}_{N}^{(b_{N})}(\bm{\mu})\cdots\mathcal{M}_{1}^{(b_{1})}(\bm{\mu})\rho\right].
\end{aligned}
\tag{S15}
$$

Eventually, we can rewrite the expression as

$$
\langle O\rangle = \langle O\rangle_{\rm ideal}+\sum_{k=1}^{N}a_{k}, \tag{S16}
$$

where

$$
\begin{aligned}
a_{k} ={}& \sum_{\bm{\mu}}\sum_{j_{1}<j_{2}<\cdots<j_{k}} \\
&p_{j_{1}}(\bm{\mu})\,p_{j_{2}}(\bm{\mu})\cdots p_{j_{k}}(\bm{\mu})\,a(j_{1},j_{2},\ldots,j_{k};\bm{\mu}),
\end{aligned}
\tag{S17}
$$

and

$$
\begin{aligned}
&a(j_{1},j_{2},\ldots,j_{k};\bm{\mu}) \\
&\quad= a(b_{1},b_{2},\ldots,b_{N};\bm{\mu})\big|_{b_{j}=1\text{ iff }j\in\{j_{1},j_{2},\ldots,j_{k}\}}.
\end{aligned}
\tag{S18}
$$

Here, we have used $\sum_{\bm{\mu}}a(0,0,\ldots,0;\bm{\mu})=\langle O\rangle_{\rm ideal}$.
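As a concrete instance of the expansion in Eqs. (S14)-(S18), consider the smallest nontrivial case of $N=2$ operations. The sum over $b_{1},b_{2}$ then has four terms, which can be grouped by the number of erroneous operations. In the index notation of Eq. (S18), $a(1,0;\bm{\mu})$, $a(0,1;\bm{\mu})$ and $a(1,1;\bm{\mu})$ correspond to the index sets $\{1\}$, $\{2\}$ and $\{1,2\}$, respectively.

```latex
\begin{align}
\langle O\rangle &= \sum_{\bm{\mu}}\Big[a(0,0;\bm{\mu})
  + p_{1}(\bm{\mu})\,a(1,0;\bm{\mu})
  + p_{2}(\bm{\mu})\,a(0,1;\bm{\mu})
  + p_{1}(\bm{\mu})\,p_{2}(\bm{\mu})\,a(1,1;\bm{\mu})\Big], \\
a_{1} &= \sum_{\bm{\mu}}\Big[p_{1}(\bm{\mu})\,a(1,0;\bm{\mu}) + p_{2}(\bm{\mu})\,a(0,1;\bm{\mu})\Big], \\
a_{2} &= \sum_{\bm{\mu}}p_{1}(\bm{\mu})\,p_{2}(\bm{\mu})\,a(1,1;\bm{\mu}).
\end{align}
```

The first term in the bracket sums to $\langle O\rangle_{\rm ideal}$, so that $\langle O\rangle=\langle O\rangle_{\rm ideal}+a_{1}+a_{2}$.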

In ZNE, we mitigate errors by boosting the error probability. If we boost the errors in each operation by the same factor $r$ (i.e., $p_{j}\rightarrow rp_{j}$ for all $j$), the computed result becomes a polynomial function of $r$, $\langle O\rangle(r)=\langle O\rangle_{\rm ideal}+\sum_{k=1}^{N}a_{k}r^{k}$. This formula holds for both NISQ and FTQC circuits.
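The polynomial dependence on $r$ is what makes extrapolation to $r=0$ possible. The following is a minimal sketch using a toy model that is an assumption made here for illustration, not the experimental circuit: each of `n_ops` operations flips the sign of $Z$ with probability $rp$, so that $\langle Z\rangle(r)=(1-2rp)^{N}$ is a degree-$N$ polynomial with ideal value 1. The function names and noise parameters are hypothetical.

```python
import numpy as np


def toy_expectation(r, p=0.02, n_ops=4):
    """Toy noisy expectation value: each of n_ops operations flips Z with probability r*p."""
    return (1.0 - 2.0 * r * p) ** n_ops


def extrapolate_to_zero(r_values, o_values, degree):
    """Fit a polynomial of the given degree in r and evaluate it at r = 0."""
    coeffs = np.polyfit(r_values, o_values, degree)
    return float(np.polyval(coeffs, 0.0))


# Noise scaling factors at which the (toy) observable is evaluated.
r_values = np.array([1.0, 1.5, 2.0, 2.5, 3.0])
o_values = toy_expectation(r_values)

# A low-order fit reduces the bias; a degree-N fit recovers the ideal value
# exactly here because the toy data are noiseless.
linear_estimate = extrapolate_to_zero(r_values, o_values, degree=1)
full_estimate = extrapolate_to_zero(r_values, o_values, degree=4)
```

With shot noise, a high-degree fit amplifies statistical fluctuations, so the polynomial order trades residual bias against sampling overhead.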

S2 Correlations in logical errors of qLDPC codes

Unlike the surface code, many qLDPC codes encode a large number of logical qubits in a single block of physical qubits. Errors in these logical qubits can be significantly correlated, tending to occur simultaneously on multiple logical qubits, as illustrated in Fig. S2. In the $[[72,12,6]]$ code [22], the probability that 4 to 7 logical qubits fail simultaneously is higher than in other cases, indicating significant correlations. Measuring the error rates of such correlated errors is extremely challenging. Even assuming Pauli errors, obtaining a model of correlated errors requires measuring $4^{k}-1$ error rates, where $k$ is the number of logical qubits in a block.
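To give a sense of scale, the count $4^{k}-1$ can be evaluated for the $k=12$ logical qubits of the $[[72,12,6]]$ code. The helper name below is hypothetical.

```python
def num_pauli_error_rates(k):
    """Number of non-identity Pauli operators on k qubits: 4**k - 1."""
    if k < 1:
        raise ValueError("k must be a positive integer")
    return 4 ** k - 1


# One surface-code-like block (k = 1) versus the [[72,12,6]] code (k = 12).
single_qubit_count = num_pauli_error_rates(1)   # 3
block_count = num_pauli_error_rates(12)         # 16777215
```

A single block of this code therefore already requires on the order of $10^{7}$ error rates for a full correlated Pauli model.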

Figure S2: Numerical results of multi-qubit correlations in a bivariate bicycle code. We take the $[[72,12,6]]$ code in the bivariate bicycle family and a physical error rate of $p=0.002$ per operation as an example. Using Monte Carlo simulation, we evaluate the number of logical qubits $N_{bit}$ that are affected by logical errors simultaneously in each trial. The probability of each qubit number is plotted.

S3 Device performance

Two quantum processors, named Processor I and Processor II, are used in this work for the repetition code and surface code experiments, respectively. Processor I consists of a $6\times 6$ transmon qubit array, from which 13 qubits are utilized in the experiment. Processor II has the same architecture as Processor I, but with an $11\times 11$ qubit array. In the distance-3 surface code experiments, 17 qubits are used. The median $T_{1}$ and $T_{2}^{\rm SE}$ times for the qubits used in Processor I (Processor II) are 124 μs and 11 μs (128 μs and 16 μs), respectively. In this work, we adopt two different readout schemes: (1) conventional measurement for multi-round experiments; (2) using the state $|2\rangle$ to reduce measurement errors for one-round experiments. The gate and measurement errors of the two processors are visualized in Fig. S3.

Figure S3: Cumulative distributions of operation error rates measured on Processor I and Processor II. Red: Pauli errors for simultaneous single-qubit gates. Black: Pauli errors for simultaneous CZ gates. Blue: average identification errors for measurement where additional microwave pulses are applied to excite each qubit from the state $|1\rangle$ to the state $|2\rangle$ before performing the dispersive readout (Method I). Green: average identification errors for conventional measurement (Method II). Dashed lines: median values.
Table S1: Qubit parameters, coherence properties and gate performance for Processor I and Processor II.

| Parameter | Processor I, Median | Processor I, Mean | Processor I, Stdev. | Processor II, Median | Processor II, Mean | Processor II, Stdev. |
|---|---|---|---|---|---|---|
| Qubit idle frequency, $\omega_{\rm idle}/2\pi$ (GHz) | 4.000 | 3.962 | 0.128 | 4.123 | 4.120 | 0.131 |
| Qubit anharmonicity, $\alpha/2\pi$ (MHz) | -197.3 | -197.1 | 1.6 | -214.6 | -214.7 | 2.3 |
| Readout frequency, $\omega_{\rm r}/2\pi$ (GHz) | 6.316 | 6.299 | 0.095 | 6.416 | 6.416 | 0.086 |
| Energy relaxation time, $T_{1}$ (μs) | 124.36 | 111.82 | 33.22 | 127.89 | 132.44 | 24.32 |
| Spin-echo dephasing time, $T_{2}^{\rm SE}$ (μs) | 11.34 | 11.72 | 2.35 | 15.55 | 18.18 | 7.81 |
| Readout error, $e_{\rm r}$, using Method I / Method II (%) | 0.47 / 0.87 | 0.52 / 0.82 | 0.16 / 0.18 | 0.87 / - | 0.94 / - | 0.30 / - |
| 1Q XEB Pauli error, $e_{1}$ (%) | 0.085 | 0.079 | 0.021 | 0.055 | 0.052 | 0.019 |
| 2Q XEB Pauli error, $e_{2}$ (%) | 0.56 | 0.56 | 0.14 | 0.37 | 0.40 | 0.17 |

S4 Generating circuit instances and calculating the expectations

To controllably amplify existing errors, we insert an operation drawn from the list $\{I,X,Y,Z\}$ during the operational stage for each qubit. The most intuitive way to generate adequate circuit instances for the estimation of $\langle Z_{\rm L}\rangle$ (or $\langle X_{\rm L}\rangle$), namely randomly selecting the operation with assigned probabilities for each circuit instance, can be inefficient. For example, when using a distance-3 (7) repetition code, around $80\%$ ($60\%$) of circuit instances are devoid of any error injection when $p=3.6\%$ and $r=1$. In addition, circuit instances with multiple injected errors, which can potentially lead to the failure of quantum error correction and contribute to the infidelity of the estimation, are unlikely to be chosen. Alternatively, we can iterate over all possible circuit instances and numerically construct the desired circuit list by calculating the weighted average. For example, in the experiment related to Fig. 2 of the main text, all $4^{3}$ circuit instances were experimentally implemented for each value of $r$, and the resulting data were processed to estimate the expectations of $Z_{0}$. However, in repetition and surface codes, the large number of circuit instances makes full implementation impractical. To address this challenge, we employ a method to select the most likely circuit instances as follows.
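The exhaustive approach for a small circuit can be sketched as follows. This is an illustration under stated assumptions rather than the procedure used in the experiment: the text does not specify the assigned probabilities, so the sketch assumes that each of the three injection sites receives $I$ with probability $1-rp$ and each of $X$, $Y$, $Z$ with probability $rp/3$. The names `enumerate_instances` and `instance_weight` are hypothetical.

```python
import itertools

PAULIS = ("I", "X", "Y", "Z")


def instance_weight(instance, q):
    """Probability of one circuit instance, assuming I with 1 - q and X, Y, Z with q / 3 each."""
    weight = 1.0
    for op in instance:
        weight *= (1.0 - q) if op == "I" else q / 3.0
    return weight


def enumerate_instances(n_sites, q):
    """Map every tuple of inserted operations to its weight (4**n_sites entries)."""
    return {
        instance: instance_weight(instance, q)
        for instance in itertools.product(PAULIS, repeat=n_sites)
    }


# Three injection sites with boosted error probability q = r * p.
r, p = 2.0, 0.036
weights = enumerate_instances(n_sites=3, q=r * p)
```

The weighted average of the per-instance measurement results, with these weights, then reproduces the expectation value that random sampling would estimate.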

First, we determine the total number of circuit instances, which increases with the code distance to ensure more accurate estimations (see Table S2). Second, we consider the case of injecting $k$ errors into the circuit and calculate the required number of circuit instances based on the error-injection probability $rp$. Third, circuit instances are randomly selected without replacement until the desired number is reached or all possible instances have been selected. The number of remaining circuit instances to be determined is updated accordingly. Finally, the second and third steps are iteratively repeated, starting with $k=0$ and continuing until the total number is reached.
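The steps above can be sketched as follows. This is a hypothetical reading of the procedure, not the authors' code: it assumes `n_sites` independent injection sites, a binomial weight for $k$ injected errors with probability $q=rp$ per site, three non-identity choices per injected error, and an allocation in which the remaining budget is shared according to the weight of $k$ relative to all not-yet-processed error counts. The threshold modification discussed in the text is not included.

```python
import math
import random


def select_instances(n_sites, q, total, seed=0):
    """Pick circuit instances grouped by the number k of injected errors.

    Returns a dict k -> list of instances, where an instance is a tuple of
    (site, pauli) pairs. All names and the allocation rule are assumptions.
    """
    rng = random.Random(seed)
    weights = [
        math.comb(n_sites, k) * q**k * (1.0 - q) ** (n_sites - k)
        for k in range(n_sites + 1)
    ]
    selected = {}
    remaining = total
    tail = 1.0  # total weight of the error counts not yet processed
    for k in range(n_sites + 1):
        if remaining == 0:
            break
        n_possible = math.comb(n_sites, k) * 3**k
        share = weights[k] / tail if tail > 0 else 1.0
        # Required number for this k, capped by what exists and what is left.
        target = min(remaining, n_possible, math.ceil(remaining * share))
        picked = set()
        while len(picked) < target:  # rejection sampling without replacement
            sites = sorted(rng.sample(range(n_sites), k))
            ops = [rng.choice("XYZ") for _ in range(k)]
            picked.add(tuple(zip(sites, ops)))
        selected[k] = sorted(picked)
        remaining -= target  # update the number still to be determined
        tail -= weights[k]
    return selected


instances = select_instances(n_sites=6, q=0.036, total=100)
counts = {k: len(v) for k, v in instances.items()}
```

In this example the single error-free instance and all 18 single-error instances are exhausted, so the rest of the budget moves on to instances with two or more injected errors.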

The above procedure assumes perfect operations and measurements. In practice, however, when the number of possible circuit instances for injecting $k$ errors is small, such as in the case of $k=0$, the statistical error due to finite measurement shots can affect the overall estimation accuracy in proportion to the assigned weight. To reduce this effect, we introduce a key modification to the procedure: the circuit number for each $k$ is required to exceed a predefined threshold, such as $1\%$ of the total circuit number. This modification is equivalent to increasing the number of measurement shots, thereby reducing statistical errors and improving the accuracy of the estimation.

Table S2: Number of circuit instances used to estimate the expectation values.

| Code | 1-round | 2-round | 3-round | 4-round |
|---|---|---|---|---|
| distance-3 repetition code | 1000 | 1000 | 1500 | 2000 |
| distance-5 repetition code | 3500 | 5000 | 5000 | 6000 |
| distance-7 repetition code | 6000 | 6000 | 6000 | 7000 |
| distance-3 surface code | 4000 | - | - | - |

We estimate the expectation values, such as $\langle Z_{\rm L}\rangle$ and $\langle X_{\rm L}\rangle$, by averaging the measured results across all generated circuit instances with the respective weights. Specifically, we denote by $O_{k,c,s}$ the outcome obtained when $k$ errors are injected into the circuit, with $c$ the circuit index and $s$ the index of the measurement shot. The expectation value is then calculated as

$$
\overline{O} = \frac{1}{S}\sum_{k,c,s}P(k)\frac{O_{k,c,s}}{C(k)}, \tag{S19}
$$

where $S$ represents the number of measurement shots for each circuit instance, $C(k)$ denotes the number of chosen circuits with $k$ errors injected, and $P(k)$ is the respective weight. For both the repetition code and surface code experiments, $S=150$.
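Eq. (S19) amounts to a weighted sum over $k$ of the plain average over circuits and shots. A minimal sketch, assuming the outcomes are stored per error count $k$ as a table of shape $(C(k), S)$; the data layout, the function name and the sample values are hypothetical.

```python
import numpy as np


def weighted_expectation(outcomes, weights):
    """Evaluate Eq. (S19): sum_k P(k) * mean over circuits c and shots s.

    outcomes -- dict k -> array-like of shape (C(k), S) with measured values
    weights  -- dict k -> P(k)
    """
    total = 0.0
    for k, table in outcomes.items():
        table = np.asarray(table, dtype=float)
        # table.mean() equals (1 / (C(k) * S)) * sum over c and s.
        total += weights[k] * table.mean()
    return total


# Hypothetical data: one circuit with k = 0, two circuits with k = 1, S = 4 shots.
outcomes = {0: [[1, 1, 1, -1]], 1: [[1, -1, 1, 1], [-1, 1, 1, 1]]}
weights = {0: 0.8, 1: 0.2}
o_bar = weighted_expectation(outcomes, weights)
```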

S5 More experimental details

Standard error of the measured expectations. As mentioned in Section S4, $S$ measurement shots are taken for each circuit. This allows us to directly calculate the standard deviation of $\overline{O}$ [see Eq. (S19)] without requiring additional experimental effort. First, we calculate the average expectation value for each shot index $s$ as

$$
\overline{O}_{s} = \sum_{k,c}P(k)\frac{O_{k,c,s}}{C(k)}, \tag{S20}
$$

where the sum is taken over the error count $k$ and circuit instance $c$. Next, the unbiased standard deviation of $\overline{O}$ can be obtained by

$$
\sigma\left[\overline{O}\right] = \sqrt{\frac{1}{S(S-1)}\sum_{s}\left(\overline{O}_{s}-\overline{O}\right)^{2}}. \tag{S21}
$$
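Eqs. (S20) and (S21) can be evaluated together, as in the following minimal sketch. It assumes the same hypothetical data layout as a table of shape $(C(k), S)$ per error count $k$; the function name and sample values are illustrative only.

```python
import numpy as np


def mean_and_standard_error(outcomes, weights):
    """Return (O_bar, sigma[O_bar]) following Eqs. (S19)-(S21).

    outcomes -- dict k -> array-like of shape (C(k), S)
    weights  -- dict k -> P(k)
    """
    tables = {k: np.asarray(v, dtype=float) for k, v in outcomes.items()}
    n_shots = next(iter(tables.values())).shape[1]
    per_shot = np.zeros(n_shots)
    for k, table in tables.items():
        # Eq. (S20): for each shot s, average over circuits c and weight by P(k).
        per_shot += weights[k] * table.mean(axis=0)
    o_bar = per_shot.mean()  # identical to Eq. (S19)
    # Eq. (S21): standard error of the mean over the S per-shot averages.
    sigma = np.sqrt(np.sum((per_shot - o_bar) ** 2) / (n_shots * (n_shots - 1)))
    return float(o_bar), float(sigma)


outcomes = {0: [[1, 1, 1, -1]], 1: [[1, -1, 1, 1], [-1, 1, 1, 1]]}
weights = {0: 0.8, 1: 0.2}
o_bar, sigma = mean_and_standard_error(outcomes, weights)
```

The result agrees with the sample standard deviation of the per-shot averages (with the $S-1$ normalization) divided by $\sqrt{S}$.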

We note that the standard errors are relatively small for both the repetition code and the surface code, making them difficult to visualize clearly in the main figures. See Fig. S4 for an enlarged view of the standard errors for the one-round repetition code.

Figure S4: Standard error of the measured $\langle Z_{\rm L}\rangle$. All insets have a spread of 0.004 along the $y$-axis, with the data points positioned at the center of the figures.

Uncorrected expectation values for the repetition code. In Fig. S5, we observe that the uncorrected expectation values for different repetition code distances nearly overlap. The slight differences observed for large noise scaling factors are mainly due to the randomness of the circuit instances, which also influences the simulation results.

Figure S5: Uncorrected results of the one-round repetition code. For different repetition code distances, the uncorrected expectation values of $Z_{\rm L}$ remain nearly the same. Dashed lines: noisy numerical simulations.

Investigating the scalability of ZNE with a fixed error rate per parity-check round. Unlike in the main text, where the unit error probability for each $M$ is adjusted to maintain a fixed total error rate (see Table S3), in this study the unit error probability in each parity-check round is fixed at $p=3.6\%$ and the performance of ZNE is then evaluated. As depicted in Fig. S6, the relative bias, represented by $\delta/\delta_{0}$, consistently remains below 1, confirming the effectiveness of ZNE. The ZNE results demonstrate consistent trends for different code distances and parity-check round numbers, with a larger code distance resulting in a smaller sampling overhead. Additionally, the sampling overhead remains nearly unchanged for different numbers of parity-check rounds. These findings demonstrate the scalability of ZNE for error correction circuits, even with increasing circuit complexity.

Table S3: Unit error probabilities. For each $M$, the unit error probability is chosen so that the total error rate is fixed.

| Round, $M$ | 1 | 2 | 3 | 4 |
|---|---|---|---|---|
| Unit error probability, $p$ (%) | 13.6 | 9.4 | 7.2 | 5.7 |
Figure S6: Relative bias and sampling overhead of the ZNE. For the implementation of ZNE, we use $K=1$. Note that $\delta/\delta_{0}$ is presented on a linear scale, whereas $\eta$ is shown on a logarithmic scale. Inset: biases without ZNE.

Preparation of an arbitrary logical state in the distance-3 surface code. In this work, an arbitrary logical state in the distance-3 surface code is prepared using the circuit illustrated in Fig. S7. For specific cases, such as the initialization of the states $|0_{\rm L}\rangle$ and $|+_{\rm L}\rangle$, the circuit can be further simplified, as described in Ref. [13]. Additionally, the implementation of state initialization in a configuration where two-qubit gates can be performed between nearest-neighbor data qubits is detailed in Ref. [40].

Figure S7: Circuit for the initialization of an arbitrary logical state. The preparation of an arbitrary logical state $|\psi_{\rm L}\rangle=\alpha|0_{\rm L}\rangle+\beta|1_{\rm L}\rangle$ proceeds as follows. First, a single-qubit gate initializes data qubit $D4$ in the state $|\psi\rangle=\alpha|0\rangle+\beta|1\rangle$, with Hadamard gates applied to four representative qubits. Next, six CNOT gates (highlighted in red) are used to create a GHZ-like state, $\alpha|000\rangle+\beta|111\rangle$, on the vertically aligned data qubits $D1$, $D4$, and $D7$. Finally, multiple layers of CNOT gates are sequentially applied to construct the desired logical state. In particular, for the preparation of the state $|0_{\rm L}\rangle$, the six CNOT gates highlighted in red are unnecessary and can be omitted.
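The GHZ-like intermediate state can be checked with a small state-vector calculation. The sketch below is a simplification and not the hardware circuit: it applies two direct CNOT gates from the middle qubit to its two neighbors, whereas the circuit in Fig. S7 uses six CNOT gates. The qubit ordering (D1, D4, D7), the helper `apply_cnot` and the choice $\alpha=\cos\frac{\pi}{6}$, $\beta=\sin\frac{\pi}{6}$ are assumptions made for illustration.

```python
import numpy as np


def apply_cnot(state, control, target, n_qubits):
    """Apply a CNOT to a state vector; qubit 0 is the leftmost bit of the basis label."""
    new_state = np.zeros_like(state)
    for index in range(2**n_qubits):
        bits = [(index >> (n_qubits - 1 - q)) & 1 for q in range(n_qubits)]
        if bits[control]:
            bits[target] ^= 1
        new_index = sum(b << (n_qubits - 1 - q) for q, b in enumerate(bits))
        new_state[new_index] = state[index]
    return new_state


alpha, beta = np.cos(np.pi / 6), np.sin(np.pi / 6)

# Qubit order (D1, D4, D7): start from |0> (alpha|0> + beta|1>) |0>.
state = np.zeros(8)
state[0b000] = alpha
state[0b010] = beta

# Spread the amplitude from the middle qubit to both neighbors.
ghz_like = apply_cnot(state, control=1, target=0, n_qubits=3)
ghz_like = apply_cnot(ghz_like, control=1, target=2, n_qubits=3)
```

After the two gates, the only nonzero amplitudes are $\alpha$ on $|000\rangle$ and $\beta$ on $|111\rangle$.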

More ZNE results in the distance-3 surface code. The expectation values of the $Z_{\rm L}$ and $X_{\rm L}$ observables are measured as a function of the noise scaling factor $r$ for the states $|0_{\rm L}\rangle$ and $|+_{\rm L}\rangle$, respectively (Fig. S8). These results are consistent with those obtained for $|\psi_{\rm L}\rangle$ in the main text but specifically address either bit-flip or phase-flip errors. The measured values of the $X_{\rm L}$ observable, with the initial state set to $|\psi_{\rm L}\rangle=\cos\frac{\pi}{6}|0_{\rm L}\rangle+\sin\frac{\pi}{6}|1_{\rm L}\rangle$, are also shown in the figure. All $\delta$-$\eta$ scatter plots demonstrate the effectiveness of ZNE in the surface code.

Figure S8: Experimental results of ZNE for the $\ket{0_{\rm L}}$ and $\ket{+_{\rm L}}$ states in the distance-3 surface code. a, b, Measured expectation values of the $Z_{\rm L}$ observable for the logical state $\ket{0_{\rm L}}$ and corresponding scatter plots of the bias $\delta$ and sampling overhead $\eta$. c-f, Similar to panels a and b, but with the observable being $X_{\rm L}$ and the initial states being $\ket{+_{\rm L}}$ (panels c and d) and $|\psi_{\rm L}\rangle=\cos\frac{\pi}{6}|0_{\rm L}\rangle+\sin\frac{\pi}{6}|1_{\rm L}\rangle$ (panels e and f).

Numerical simulations as benchmark. We conduct numerical simulations using two popular open-source frameworks: Stim [36] and Qiskit [42]. For simulations of the repetition code, where the initial preparation of the logical state ($\ket{0_{\rm L}}=\ket{0\cdots 0}$) is nearly perfect, errors introduced by the gate and measurement imperfections are then modeled using a depolarizing channel and a bit-flip channel, respectively. The error rates for these models are experimentally calibrated (see Fig. S3 and Table S1). In contrast, for simulations of the surface code, where initial states are imperfectly generated via multiple layers of single-qubit and two-qubit gates, the introduced imperfection is simplified by a single-qubit depolarizing channel uniformly applied to each qubit. The depolarizing rate is calibrated by matching the expectation value of an observable with error injection disabled in the circuit. In this work, $\ket{0_{\rm L}}$ is used as the initial state, and $Z_{\rm L}$ as the observable, resulting in a calibrated depolarizing rate of $0.075$. As shown in Fig. 4 of the main text, this noise model closely aligns with the experimental data.

S6 Protocol

To implement the $K$th-order ZNE, we use $K+1$ data points from experiments, $\{(r_{0},\langle O\rangle(r_{0})),(r_{1},\langle O\rangle(r_{1})),\ldots,(r_{K},\langle O\rangle(r_{K}))\}$. Here, $\langle O\rangle(r)$ denotes the expected value of the observable when error probabilities are amplified by a factor of $r$, and $r_{0}=1$ corresponds to the case without noise amplification. Using these data points and the fitting formula $\langle O\rangle(r)=\langle O\rangle_{\rm em}+\sum_{k=\lceil d/2\rceil}^{\lceil d/2\rceil+K-1}a_{k}^{\prime}r^{k}$, we can compute the error-mitigated value $\langle O\rangle_{\rm em}$ as

$$\langle O\rangle_{\rm em}=\sum_{k=0}^{K}b_{k}\langle O\rangle(r_{k}), \tag{S22}$$

where $b_{k}=\mathrm{adj}(V)_{0k}$, and $V$ is a Vandermonde-like matrix with elements

$$V_{ij}=\begin{cases}1,&j=0,\\ r_{i}^{\lceil d/2\rceil+j-1},&j>0.\end{cases} \tag{S23}$$
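
The construction of the coefficients in Eqs. (S22) and (S23) can be sketched numerically as follows. This is a minimal illustration, not the authors' code: the function names `zne_coefficients` and `zne_estimate` are hypothetical, and the sketch assumes that $b_k$ is the $k$th entry of the first row of $V^{-1}$ (the adjugate entries normalized by $\det V$), which is what solving the fitting formula for its constant term yields.

```python
import math
import numpy as np


def zne_coefficients(r, d):
    """Extrapolation coefficients b_k for scaling factors r (length K+1).

    Assumption: b is the first row of V^{-1}, with V as in Eq. (S23).
    """
    r = np.asarray(r, dtype=float)
    m = math.ceil(d / 2)          # lowest power of r in the fitting formula
    K = len(r) - 1
    V = np.ones((K + 1, K + 1))   # column j = 0 is all ones
    for j in range(1, K + 1):
        V[:, j] = r ** (m + j - 1)
    # First row of V^{-1}: b^T V = e_0^T, i.e. V^T b = e_0.
    e0 = np.zeros(K + 1)
    e0[0] = 1.0
    return np.linalg.solve(V.T, e0)


def zne_estimate(r, values, d):
    """Error-mitigated value: sum_k b_k <O>(r_k), as in Eq. (S22)."""
    b = zne_coefficients(r, d)
    return float(np.dot(b, np.asarray(values, dtype=float)))
```

With this normalization the coefficients sum to one, so a noise-independent expectation value is returned unchanged.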

We characterize the performance of ZNE using two metrics: the bias $\delta=|\langle O\rangle_{\rm em}-\langle O\rangle_{\rm ideal}|$ characterizes the accuracy, and the sampling overhead $\eta$ characterizes the cost. Next, we focus on the overhead $\eta$. Suppose $N_{tot}$ is the total number of circuit shots used to acquire the $K+1$ data points. To reduce the variance in ZNE, we allocate the shots among the data points according to the importance sampling principle. Specifically, the number of shots used to measure $\langle O\rangle(r_{k})$ is $N_{k}=\frac{|b_{k}|}{\sum_{k=0}^{K}|b_{k}|}N_{tot}$.
The variance in the measurement of $\langle O\rangle(r_{k})$ is then given by $\sigma^{2}_{k}=\frac{1-\langle O\rangle^{2}(r_{k})}{N_{k}}$, and the variance of $\langle O\rangle_{\rm em}$ is thus $\sigma^{2}_{\rm em}=\sum_{k=0}^{K}|b_{k}|^{2}\sigma^{2}_{k}$. Here we have assumed that the observable $O$ is a Pauli operator.
Without ZNE, if the $N_{tot}$ shots are used to measure $\langle O\rangle(r_{0})$, the variance is $\sigma^{2}_{\rm raw}=\frac{1-\langle O\rangle^{2}(r_{0})}{N_{tot}}$. The sampling overhead reads $\eta=\frac{\sigma^{2}_{\rm em}}{\sigma^{2}_{\rm raw}}$, which is the ratio of the two variances.
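
The overhead defined above can be evaluated directly from the coefficients and the measured expectation values. The sketch below is illustrative only: `sampling_overhead` is a hypothetical helper, it assumes a Pauli observable and that `values[0]` is the unamplified point $r_0=1$, and it uses the fact that $N_{tot}$ cancels in the ratio.

```python
import numpy as np


def sampling_overhead(b, values):
    """Sampling overhead eta = sigma_em^2 / sigma_raw^2.

    b      : extrapolation coefficients b_k
    values : expectation values <O>(r_k), with values[0] taken at r_0 = 1
    """
    b = np.asarray(b, dtype=float)
    values = np.asarray(values, dtype=float)
    s = np.abs(b).sum()
    # With N_k = |b_k| N_tot / s, each term |b_k|^2 sigma_k^2 becomes
    # s * |b_k| * (1 - <O>(r_k)^2) / N_tot; N_tot cancels in the ratio.
    var_em = s * np.sum(np.abs(b) * (1.0 - values ** 2))
    var_raw = 1.0 - values[0] ** 2   # undefined ratio if <O>(r_0) = +-1
    return float(var_em / var_raw)
```

When all $\langle O\rangle(r_k)$ are equal, this expression reduces to $(\sum_k |b_k|)^2$.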

S7 Application to large-scale logical circuits

We take the surface code as an example to illustrate that our method can be applied to large-scale logical circuits. Assume we have a target logical circuit consisting of $N$ logical gates. Similar to physical operations (see Sec. S1), the $j$th logical gate reads

$$\mathcal{M}_{j}=(1-P_{j})\mathcal{M}_{j}^{I}+P_{j}\mathcal{M}_{j}^{E}, \tag{S24}$$

where $P_{j}$ represents the logical error rate of the gate. We remark that the logical error rate depends on the physical error rate and, therefore, is a function of the noise amplification factor $r$.

The expected value of the observable $O$ computed with the circuit is

$$\begin{aligned}\langle O\rangle &= \operatorname{Tr}(O\mathcal{M}_{N}\cdots\mathcal{M}_{2}\mathcal{M}_{1}\rho)\\ &= \Big(1-\sum_{j=1}^{N}P_{j}\Big)\langle O\rangle_{\rm ideal}+\sum_{j=1}^{N}P_{j}\langle O\rangle_{j}+R_{1},\end{aligned} \tag{S25}$$

where

$$\langle O\rangle_{\rm ideal}=\operatorname{Tr}(O\mathcal{M}_{N}^{I}\cdots\mathcal{M}_{2}^{I}\mathcal{M}_{1}^{I}\rho) \tag{S26}$$

is the ideal result,

$$\langle O\rangle_{j}=\operatorname{Tr}(O\mathcal{M}_{N}^{I}\cdots\mathcal{M}_{j}^{E}\cdots\mathcal{M}_{2}^{I}\mathcal{M}_{1}^{I}\rho) \tag{S27}$$

is the result when the $j$th logical gate has errors, and the remainder term has the upper bound

$$\begin{aligned}R_{1}&\leq\|O\|\left[\prod_{j=1}^{N}(1+2P_{j})-\Big(1+2\sum_{j=1}^{N}P_{j}\Big)\right]\\ &\leq\|O\|\left[e^{2P_{tot}}-(1+2P_{tot})\right],\end{aligned} \tag{S28}$$

where $P_{tot}=\sum_{j=1}^{N}P_{j}$ is the total error rate.
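
As a quick numerical sanity check of the two bounds in Eq. (S28), the following sketch evaluates both for a given list of logical error rates. The helper name `remainder_bounds` and the example rates are illustrative assumptions, not values from the experiment.

```python
import math


def remainder_bounds(P, op_norm=1.0):
    """Return (product bound, exponential bound) on R_1 from Eq. (S28).

    P       : iterable of per-gate logical error rates P_j
    op_norm : operator norm ||O|| (1 for a Pauli observable)
    """
    P = list(P)
    P_tot = sum(P)
    product = math.prod(1.0 + 2.0 * p for p in P)
    bound_product = op_norm * (product - (1.0 + 2.0 * P_tot))
    bound_exp = op_norm * (math.exp(2.0 * P_tot) - (1.0 + 2.0 * P_tot))
    return bound_product, bound_exp


# Hypothetical example: 100 gates, each with logical error rate 1e-3.
bound_product, bound_exp = remainder_bounds([1e-3] * 100)
```

Because $1+x\leq e^{x}$ for each factor, the product bound never exceeds the exponential one, and both are close to $2P_{tot}^{2}$ when $P_{tot}\ll 1$.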

Now, we approximate each logical error rate with a polynomial, i.e.

$$P_{j}(r)=\sum_{k=\lceil d/2\rceil}^{\lceil d/2\rceil+K-1}a_{j,k}r^{k}+\Delta_{j}(r), \tag{S29}$$

where $\Delta_{j}(r)$ denotes the error in the approximation. With this approximation, we can re-express the expected value as

$$\langle O\rangle(r)=\langle O\rangle_{\rm ideal}+\sum_{k=\lceil d/2\rceil}^{\lceil d/2\rceil+K-1}a_{k}r^{k}+R_{1}(r)+R_{2}(r), \tag{S30}$$

where

$$a_{k}=\sum_{j=1}^{N}(\langle O\rangle_{j}-\langle O\rangle_{\rm ideal})a_{j,k} \tag{S31}$$

and

$$R_{2}(r)=\sum_{j=1}^{N}(\langle O\rangle_{j}-\langle O\rangle_{\rm ideal})\Delta_{j}(r). \tag{S32}$$

Substituting the expression of $\langle O\rangle(r)$ into the $K$th-order extrapolation formula Eq. (S22), we obtain the bias

$$\delta=\left|\sum_{k=0}^{K}b_{k}[R_{1}(r_{k})+R_{2}(r_{k})]\right|\leq\tilde{\delta}_{1}+\tilde{\delta}_{2}, \tag{S33}$$

where

$$\tilde{\delta}_{1}=\|O\|\sum_{k=0}^{K}|b_{k}|\left[e^{2P_{tot}(r_{k})}-\left(1+2P_{tot}(r_{k})\right)\right] \tag{S34}$$

and

$$\tilde{\delta}_{2}=2N\|O\|\left|\sum_{k=0}^{K}b_{k}\Delta_{j}(r_{k})\right| \tag{S35}$$

are upper bounds on the contributions of the $R_{1}$ and $R_{2}$ terms, respectively. The first term $\tilde{\delta}_{1}$ is the error due to the second-order contribution of logical error rates, i.e.

$$\tilde{\delta}_{1}\simeq 2\|O\|\sum_{k=0}^{K}|b_{k}|P_{tot}(r_{k})^{2}. \tag{S36}$$

If we choose a sufficiently large code distance such that $P_{tot}(r_{k})$ is much smaller than one, the first term is much smaller than the raw bias before error mitigation, which has the upper bound

$$\tilde{\delta}_{0}=\|O\|\left[e^{2P_{tot}(1)}-1\right]\simeq 2\|O\|P_{tot}(1). \tag{S37}$$
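
To make the comparison between Eqs. (S34), (S36) and (S37) concrete, the sketch below evaluates the three quantities for given coefficients $b_k$ and scaling factors $r_k$. It is an illustration under a stated assumption: the total logical error rate is taken to follow the leading-order scaling $P_{tot}(r)=P_{tot}(1)\,r^{m}$ with $m=\lceil d/2\rceil$, and the function name `bias_bounds` is hypothetical.

```python
import math


def bias_bounds(P_tot_1, r, b, m, op_norm=1.0):
    """Return (delta1, delta1_approx, delta0) for Eqs. (S34), (S36), (S37).

    Assumption: P_tot(r) = P_tot(1) * r**m (leading-order scaling only).
    """
    P = [P_tot_1 * rk ** m for rk in r]
    delta1 = op_norm * sum(
        abs(bk) * (math.exp(2.0 * Pk) - (1.0 + 2.0 * Pk)) for bk, Pk in zip(b, P)
    )
    delta1_approx = 2.0 * op_norm * sum(abs(bk) * Pk ** 2 for bk, Pk in zip(b, P))
    delta0 = op_norm * (math.exp(2.0 * P_tot_1) - 1.0)
    return delta1, delta1_approx, delta0


# Hypothetical numbers: P_tot(1) = 0.01, d = 3 (m = 2), K = 1 with b = (2, -1).
delta1, delta1_approx, delta0 = bias_bounds(0.01, [1.0, math.sqrt(2.0)], [2.0, -1.0], m=2)
```

For these illustrative numbers the mitigated bound $\tilde{\delta}_{1}$ is more than an order of magnitude below the raw bound $\tilde{\delta}_{0}$, in line with the quadratic versus linear dependence on $P_{tot}$.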

The second term $\tilde{\delta}_{2}$ is the error due to approximating logical error rates $P_{j}(r)$ with polynomials. In the limit that the physical error rate $p$ approaches zero, the logical error rate behaves as $P_{j}(r)\propto p^{\lceil d/2\rceil}$ [43, 44], suggesting that the $K=1$ extrapolation is sufficient in the low physical error rate regime. When $p$ is finite, the logical error rate deviates from $P_{j}(r)\propto p^{\lceil d/2\rceil}$, and we have to adapt the extrapolation function accordingly. Next, we use numerical calculations to estimate the bias in the polynomial extrapolation.
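
As a worked illustration of the $K=1$ case, the coefficients of Eq. (S22) can be written out explicitly. This is a sketch obtained by inverting the $2\times 2$ matrix of Eq. (S23), assuming $r_{0}=1$ and that $b_{k}$ is the first row of $V^{-1}$; the shorthand $m=\lceil d/2\rceil$ is introduced here for brevity.

```latex
V=\begin{pmatrix}1 & 1\\ 1 & r_{1}^{m}\end{pmatrix},\qquad
b_{0}=\frac{r_{1}^{m}}{r_{1}^{m}-1},\qquad
b_{1}=-\frac{1}{r_{1}^{m}-1},\qquad
\langle O\rangle_{\rm em}=\frac{r_{1}^{m}\,\langle O\rangle(1)-\langle O\rangle(r_{1})}{r_{1}^{m}-1}.
```

If $\langle O\rangle(r)=c+a\,r^{m}$ exactly, this expression returns $c$. For instance, choosing $r_{1}=2^{1/m}$ gives $\langle O\rangle_{\rm em}=2\langle O\rangle(1)-\langle O\rangle(r_{1})$.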

Figure S9: Numerical results of ZNE on a circuit of $N$ surface-code idle gates. We take the physical error rate $p=10^{-3}$, which is amplified to $r_{k}p$ in ZNE; we choose the amplification factors $r_{k}=k^{1/\lceil d/2\rceil}$, where $k=1,2,\ldots,K+1$. The bias before ZNE is $\delta_{0}=|\langle O\rangle(1)-\langle O\rangle(0)|$, where $\langle O\rangle(0)$ is the ideal value.

To estimate the bias and cost numerically, we consider a quantum memory circuit consisting of applying $N$ idle gates on a logical qubit and take the logical error rate per gate reported in Ref. [27]. We choose the observable as $Y$, such that the expectation value is affected by both logical $X$ and $Z$ errors. Let $P_{L}(r)$ be the sum of logical $X$ and $Z$ error rates. The observable expectation value is taken as $\langle O\rangle(r)=[1-2P_{L}(r)]^{N}$. Then, we apply ZNE to $\langle O\rangle(r)$, and the results are shown in Fig. S9.
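
The numerical procedure described above can be sketched as follows. This is not the calculation behind Fig. S9: the logical error rate of Ref. [27] is replaced by a toy power law $P_{L}(r)=A\,(rp/p_{\rm th})^{\lceil d/2\rceil}$, and the prefactor `A`, the threshold `p_th`, the distance `d=5`, and all function names are hypothetical placeholders chosen only to make the example runnable. The coefficients are again assumed to be the first row of $V^{-1}$.

```python
import math
import numpy as np


def first_row_inverse(r, m):
    """First row of V^{-1} for the Vandermonde-like matrix of Eq. (S23)."""
    K = len(r) - 1
    V = np.ones((K + 1, K + 1))
    for j in range(1, K + 1):
        V[:, j] = r ** (m + j - 1)
    e0 = np.zeros(K + 1)
    e0[0] = 1.0
    return np.linalg.solve(V.T, e0)


def toy_logical_error_rate(r, p, d, A=0.1, p_th=0.01):
    """Toy model (assumption): P_L(r) = A * (r p / p_th)^ceil(d/2)."""
    return A * (r * p / p_th) ** math.ceil(d / 2)


def memory_expectation(r, N, p, d):
    """<O>(r) = [1 - 2 P_L(r)]^N for N idle gates."""
    return (1.0 - 2.0 * toy_logical_error_rate(r, p, d)) ** N


def zne_bias(N, K, p=1e-3, d=5):
    """Return (bias before ZNE, bias after Kth-order ZNE); ideal value is 1."""
    m = math.ceil(d / 2)
    r = np.arange(1, K + 2, dtype=float) ** (1.0 / m)   # r_k = k^(1/m)
    values = memory_expectation(r, N, p, d)
    b = first_row_inverse(r, m)
    raw_bias = abs(values[0] - 1.0)
    mitigated_bias = abs(float(b @ values) - 1.0)
    return raw_bias, mitigated_bias


raw_bias, mitigated_bias = zne_bias(N=100, K=1)
```

Under this toy model the residual bias after first-order extrapolation comes entirely from the second-order term $R_{1}$, since the assumed $P_{L}(r)$ is an exact power of $r$.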

Although the above numerical results are obtained from quantum memory circuits, we conjecture that general computing circuits yield similar results. Memory circuits are composed of repeatedly applying parity-check measurements on a fixed lattice. Practical protocols of quantum computing in surface codes include, for instance, braiding transformation [15] and lattice surgery [41], in which circuits are also composed of repeatedly applying parity-check measurements, albeit on a lattice that deforms between certain parity-check cycles. Because the dominant operations are the same in memory circuits and such computing circuits, they should yield similar behavior of logical error rates (as functions of physical error rates and the code distance). In addition to braiding transformation and lattice surgery, universal quantum computing in surface codes also requires magic state injection and distillation [45, 46, 15]. Magic state errors decrease with the level of distillation. As long as the distillation level is adequately high, the post-distillation magic state errors are dominated by logical errors in lattice surgery operations, and the contribution of raw magic state errors (errors in injected magic states before distillation) is negligible. Otherwise, we may need to modify the extrapolation function to include the impact of raw magic state errors: if the distilled magic state error rate scales as $p^{L}$, where $L$ is an integer depending on the distillation level, we can add the term $p^{L}$ to the extrapolation function.

So far, we have only considered polynomial extrapolation functions. The use of other extrapolation functions could improve the performance. In ZNE on NISQ circuits, it has been shown experimentally that exponential extrapolation outperforms polynomial extrapolation [47, 8]. For example, the logical error rate function reported in Ref. [27] is a candidate worth exploring. Given such candidate functions, we can benchmark and verify them through Clifford circuits [23, 24]: we can simulate the Clifford circuits on a classical computer to obtain the ideal result $\langle O\rangle_{\rm ideal}$, such that we can evaluate the bias as well as the cost. In this way, we can compare the candidate functions and identify the suitable ones.