This post is kindly contributed by Sean Barrett, commenting on my recent paper "Error tolerance and tradeoffs in loss- and failure-tolerant quantum computing schemes". Warning: this is a technical post.
I think you make some interesting points in this paper, which are well worth bearing in mind for anyone developing error tolerant schemes for quantum computation.
I just have a few comments.
You use the somewhat pejorative term "exponential blow out" to describe how the dephasing noise is amplified in schemes where the gates are non-deterministic but fail in a heralded way. In particular, you argue that this is a result of the extra measurements that must be performed to remove redundant qubits from the cluster. You say that the effective error teleported to the remaining qubit is
peff = 1 - (1-perr)^n
where peff is the effective error rate on the remaining unmeasured qubit, perr is the physical depolarising error per qubit, and n is the number of redundant qubits that must be removed.
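To make the formula concrete, here's a minimal numerical sketch (Python; the values of perr and n below are illustrative choices of mine, not numbers from the paper):

```python
def p_eff(p_err, n):
    """Effective error after removing n redundant qubits, each of
    which independently 'teleports' an error with probability p_err."""
    return 1.0 - (1.0 - p_err) ** n

# Illustrative values: a per-qubit error rate of 1e-3, and
# increasing numbers of redundant qubits to remove.
for n in (1, 10, 100, 1000):
    print(n, p_eff(1e-3, n))
```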
There are a couple of things to say about this equation. Firstly, it's not immediately clear to me that this applies to all graph states, or just to GHZ-like graphs in which one qubit is connected to all the others. It would be nice to see a more explicit argument for how errors are "teleported" in linear (1-d) cluster states, for example. One other point in this regard: in the paper Phys. Rev. A 73, 012304 (2006), we give a procedure for building 2-d cluster states which is quite different from the one you analyse (I think the difference ultimately stems from the fact that Duan and Raussendorf make different physical assumptions about what happens when their gates fail). I would be interested to know how the errors accumulate using this method (which is also quite similar to the one given by Browne and Rudolph in Phys. Rev. Lett. 95, 010501 (2005)).
Secondly, your claim of an "exponential blow out" is based on the fact that the "overhead cost" n appears in the exponent of this expression. Later, you point out that n is proportional to 1/pgate, where pgate is the success probability of performing a single 2-qubit gate. What I want to point out is that your expression for peff is exactly what one would expect for the total probability of obtaining at least one error after n Bernoulli trials of a process where the error probability is *fixed* for each trial. This is a very common situation: for example, if I buy a lottery ticket each week for n weeks, the total probability of my winning the jackpot at least once will be given by your expression if I interpret perr as the probability of my winning it on any given week. In this situation, is it reasonable to say that there is an "exponential blowout" in the probability of my winning the lottery jackpot at least once? Only if I keep buying tickets for millions of weeks. For any shorter timescale, most people would reasonably describe my total chances as increasing *linearly* with n.
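The lottery analogy can be checked directly: the probability of at least one success in n Bernoulli trials is exactly 1 - (1-p)^n, and for n*p << 1 it is well approximated by n*p. A quick Monte Carlo sketch (the parameters here are illustrative, not from the paper):

```python
import random

def at_least_once_mc(p, n, trials=200_000, seed=1):
    """Monte Carlo estimate of P(at least one success in n Bernoulli
    trials, each with success probability p)."""
    rng = random.Random(seed)
    hits = sum(any(rng.random() < p for _ in range(n)) for _ in range(trials))
    return hits / trials

p, n = 0.01, 20
exact = 1.0 - (1.0 - p) ** n      # closed form
print("exact:", exact)
print("monte carlo:", at_least_once_mc(p, n))
print("linear approx n*p:", n * p)
```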
A little bit of manipulation of your expression shows that (assuming perr << 1) we can write it as

peff = 1 - exp(-perr*n)
Now if we take n = A/pgate (which you argue is the scaling for the particular cluster-growing scheme you discuss), where A is some constant factor, we get peff = 1 - exp(-A*perr/pgate). It's clear that the error really grows linearly with 1/pgate, and the exponential behaviour only kicks in when A*perr/pgate ~= 1, i.e. when pgate ~= A*perr. If A is a constant of order 1, what this says is that we only have a significant problem if the success probability for the gate is the same order of magnitude as the depolarizing error probability.
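To see where the crossover sits, here's a short sketch (Python; the values of A and perr are assumed order-of-magnitude figures of mine, not taken from the paper):

```python
import math

A = 1.0        # assumed O(1) overhead constant
p_err = 1e-4   # assumed physical error rate (illustrative)

def p_eff(p_gate):
    """peff = 1 - exp(-perr * n), with n = A / pgate redundant qubits."""
    return 1.0 - math.exp(-p_err * A / p_gate)

for p_gate in (1e-1, 1e-2, 1e-3, 1e-4):
    linear = p_err * A / p_gate  # small-argument (linear) approximation
    print(f"pgate={p_gate:g}  peff={p_eff(p_gate):.4g}  linear~{linear:g}")
```

The linear approximation tracks peff closely until pgate approaches A*perr, where the exponential saturation takes over.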
For larger values of pgate, what we have is linear scaling in 1/pgate. This is not so surprising, although it is still perhaps an important consideration in a particular implementation using non-deterministic gates. But it's not really a problem that's specific to these sorts of schemes. Indeed, in most implementations that have been proposed, that I am aware of, there is some kind of time or space overhead relative to the idealised circuit model quantum computer. For instance, many implementations are restricted to nearest-neighbour qubit interactions. In this case, to perform a gate between an arbitrary pair of qubits in the computer, they must be brought close together, either using SWAP operations on fixed qubits, or, as in the case of ion trap experiments, physically moving the qubit-carrying entities around. Clearly, there's a time overhead in doing either of these that's proportional to the physical size of the computer. While we're waiting for the qubits to be moved around, they will be accumulating depolarizing errors due to the interaction with the environment. One could also argue that this overhead ultimately leads to an "exponential blow out" in the depolarizing noise, although I would prefer not to be so pessimistic!
I also wanted to make a couple of points about the distributed QC schemes I've been involved with, e.g. ref  of your paper, and Phys. Rev. A 73, 012304 (2006), which are perhaps relevant in the light of your work. In these schemes we have a distributed architecture, in which qubits consist of trapped atoms, ions, quantum dots, or some other solid state gadget. Each qubit sits a long way from the others, so that it can be easily controlled by local control fields, and so that the decoherence processes can be well characterised and understood. We introduce entanglement between pairs of qubits using a single- (or few-) photon interference effect, which has the property that it does not always work (either because of photon loss, or, in the case of our first paper, because the success probability is inherently less than 1/2). Now, of course this non-deterministic entangling scheme introduces overheads into our computing scheme, relative to one in which the gate operations are deterministic, and therefore may lead to the sort of error amplification you describe. But the hope is that there are other aspects of these schemes that more than compensate for this problem.
For instance, because we can engineer each qubit in isolation from the others, one can imagine that this makes the process of characterising decoherence and eliminating error processes easier (e.g. process tomography on a single isolated qubit is much easier than on a register containing thousands of qubits which may have residual interactions with one another). So we might be able to make perr very small indeed.
Furthermore, since we are not limited to nearest neighbour interactions, we don't have a particular problem with the overhead involved in entangling distant qubits (although of course we do, in a sense, transmit entanglement via photons, so ultimately the finite speed of light will become a problem...).
Finally, because we are computing with graph/cluster states, there are some tricks we can do to reduce the "hidden overheads" inherent in many algorithms and protocols - e.g. we can potentially eliminate all the Clifford group parts of the algorithm by directly preparing appropriate graph states.
See the comments section below for follow-up discussion on this.