Anonymity in blockchain part 2: zk-snarks

By |2018-07-09T08:21:06+00:00July 6th, 2018|

In my previous article, we explored the concept of anonymity in blockchains and why it is generally a difficult problem. If you haven’t yet read it, then I suggest you do before continuing.  But, if you’re too lazy to refer to it, here’s a quick recap:

Remember that blockchains are inherently public and transparent, which are concepts that certainly go against the grain of anonymity. If we take the image below as an example (found on the Bitcoin blockchain explorer), we can see both which public addresses are involved in the transaction and the transaction amount. Just like this Bitcoin transaction, we (and everyone else, hence the “public” descriptor) can see the details of every transaction that has ever taken place since the inception of Bitcoin, now a decade ago.

Although these public addresses give us a cloak of pseudo-anonymity to hide behind, we showed that Bitcoin is a lot less anonymous than you think. Transaction graph analysis and other techniques have unearthed deep meaning as to who actually sits behind these public addresses and the patterns of their operation. Whether you are purchasing Justin Bieber’s latest album, or worried about Mark Zuckerberg spying on you again, there are a myriad of valid reasons why people are interested in anonymous transactions. Thus, we introduced the concept of a zero-knowledge proof in the last article. In this article however, we’re going to dig into more of the details of how one could ‘implement’ a system of anonymous blockchain transactions using something termed ‘zk-snarks’ (zero-knowledge non-interactive succinct arguments of knowledge). Which is exactly what Zcash implemented (outlined in my previous article).

The entire premise of the zero-knowledge proof is to prove something whilst conveying zero-knowledge in process. How does this help with anonymous blockchain transactions? We want to prove that a transaction is valid, whilst conveying ‘zero’ knowledge regarding the transaction i.e. the sending/receiving public address and the amount.

In order to figure out the construction of the zk-snark for anonymous transactions, we first need to further explore the function used to transfer funds between addresses. Below we have the ‘transferFrom’ function for the standard ERC20 token implemented in solidity.

Function taken from the StandardToken.sol contract implemented by Zeppelin Solidity

The function takes in 3 public parameters, namely; the address to transfer the funds from, the address to transfer the funds to, and the value of the transaction. It then returns a boolean, true if the transaction was successful, otherwise false. The first 3 lines of the function are important ‘require’ statements that ensure 3 essential things:

  1. In the first require statement, we check the user isn’t sending funds to the null address 0 by mistake (this is nice because remember there is no bank to phone up and ask for transactions to be reversed).
  2. The second require statement checks to see that the value being sent is less than or equal to the value that is held in that public address – this is important as we obviously don’t want people to be able to send more funds than they have.
  3. The final require statement uses ‘msg.sender’ which is essentially used to check that the transferFrom function is being called by the same person who is transferring the funds. This means that you’re not able to just call this function and initiate a transfer of funds from some random address to yours.

Once these checks have taken place (i.e. the transaction has been deemed valid), the value in question is subtracted from the source address and this value is added to the destination address.

Now that we understand how the transactions take place in the general case, we again understand the difficulty of anonymous transactions. In the above function, we need to explicitly see the address the transaction is coming from, as well as the value, so that we can check the address has enough funds for the transaction in question. So how could we implement the above transaction scheme without anyone finding out these details?

Zcash have implemented their system in such a manner that there are four possible cases that can occur:

  1. Transactions are public as seen in, say, Bitcoin.
  2. Transactions are completely anonymous (from address, to address & value are all shielded)
  3. Input is shielded (from address is shielded, to address and value are transparent)
  4. Outputs are shielded (to address and value are shielded, from address is transparent)

This scheme of shielded inputs/outputs and transparent inputs/outputs are nice for a variety of use-cases. In our code example of explaining zk-snarks below, we choose to showcase the third variant above, where the from address is shielded, and the to address and value are transparent. This scenario would make sense, say, if someone wanted to make an anonymous donation to a charity, but would like the proof of the donation and the amount to be publicly visible. Sit tight, here is a zk-snark.

Generator (C ‘circuit’, λ ‘toxic waste’){
(pk, vk) = G( λ, C)}

Prover ( _from ‘secret input’, _to ‘public input’, _value ‘public input’) {
ℿ= P(pk, _from, _to, _value)}

Verifier {
V(vk, _to, _value, ℿ) == (∃  _from  s.t  C(_from, _to,_value)}

Okay, let’s walk through this step by step. The first function is the generator function. It’s a once-off function and used to initialize the whole setup. Firstly, what is C? C, the circuit, is essentially the program that we would like to run. In this case, C is equal to our transferFrom function that we introduced earlier in the article. Why do we use the term circuit? Well, as Vitalik Buterin points out, zk-snarks cannot be applied to any computational problem directly; rather, you have to convert the problem into the right “form” for the problem to operate on. That means our current transferFrom function won’t do in its current form. Instead we need to follow a non-trivial process of converting it to a something called a ‘quadratic arithmetic program’ (QAP).

(If you ever dabbled in some computational complexity theory, then you may be interested to know that QAP are NP-complete. As you know, this allows polynomial time Karp reductions from any NP problem to that NP-complete problem. You can find the juicy details here.)

The pipeline of this process can be seen in the diagram below, and here you can now see whence we ascertain the term “circuit”. If you want to get the dirty details on how this transformation actually works, head over to Vitalik’s blog post.

Okay now that we have dealt with the first part, let’s move onto λ, the ‘toxic waste’. First off, it is termed toxic waste, as it something that needs to be disposed of very, very carefully, and it presents an extreme danger if it is leaked. λ is the initial parameters, along with the circuit C, used to generate pk, vk, the public proving key and public verifying key (we’ll get to this later). These initial parameters, λ, need to be disposed of, because if an adversary knew this parameter they could essentially spoof arbitrary proofs (basically, send money from whoever to themselves, yikes!). This is why this step is often called the ‘once-off trusted set-up’. We need to trust that whoever performed this step, properly disposed of λ.

How did Zcash do this? They did a multi-party computation ceremony. If you have a couple of days to kill, read the technical white paper on how this was done here. Otherwise, the excerpt below should suffice in explaining their aim.

“In order to ensure the toxic waste does not come into existence, our team (Zcash) designed a multi-party computation (MPC) protocol which allows multiple independent parties to collaboratively construct the parameters. This protocol has the property that, in order to compromise the final parameters, all of the participants would have to be compromised or dishonest.”

They even posted pictures of them angle grinding the hardware used in the ceremony afterward to further augment the validity of the process. Call me naive, but this has me pretty convinced that the toxic waste is gone for good.

The outcome of step 1 is that we now have pk, vk, the public proving key and public verifying key. These will be needed in the proving and verifying functions which we get onto next.

Let’s now turn our attention toward the prover function.

Prover ( _from ‘secret input’, _to ‘public input’, _value ‘public input’) {
ℿ= P(pk, _from, _to, _value)}

As we described earlier, in this case, the sender doesn’t want anyone to know what his address is, thus this is a secret input. The value and to address are both public inputs. The user would use these three parameters along with the public proving key, pk, generated in the previous step in order to produce a proof, ℿ. This proof generation is generally the computationally expensive part of the process. For a single input and output address, it would generally take around 45 seconds to produce the proof. In the original whitepaper, the benchmark was 2 minutes for 2 input and 2 output addresses. Right now this seems a smidge problematic but nothing too disastrous.

We are going to treat the actual mathematics behind the prover function as a black box, otherwise, this article will never end. We are also going to do the same for the Verifier function below. If you want to unpack these black boxes, this is the best resource I’ve come across so far.

Let’s now move onto the verifier function.

Verifier {
V(vk, _to, _value, ℿ) == (∃  _from  s.t  C(_from, _to,_value)}

This function uses the public verification key along with the public inputs, the to address and value, and finally the proof generated in the previous step. Although proof creation is computationally expensive, verification is extremely fast – which is critical. This means that miners can quickly determine the validity of transactions before including them in the next block.

That’s it! Above are the basics behind zk-snarks and how it is used in order to keep your transactions anonymous. What next? Right now Zcash have already shown that anonymous blockchain transactions are reality. What about integrating zk-snarks with the likes of Ethereum? At this stage, it is still infeasible. Since zk-snarks requires some pretty computationally intense mathematics (pairing functions & elliptic curve operations), there is currently not enough gas in a single block to fulfill these operations. Great strides are being made toward solving the problem. In Byzantinum, the first Metropolis hard fork, various changes enabled pre-compiled contracts that allow for operations used in zk-snark verification (pairing functions & elliptic curve operations). Another solution is ZoKrates, a proof-of-concept implementation, which has not been tested for production. It is able to generate a non-interactive zero-knowledge proof and a Smart Contract that does the on-chain verification.

I hope you enjoyed the reading and managed to learn something interesting from this!

About the Author:

JonJon
JonJon is a blockchain analyst at Newtown Partners. He assists with token economy design, feasibility studies and blockchain research. He has worked on multiple projects including Civic and the LGBT token. He holds a Bachelor of Business Science degree in Computer Science, with undergraduate majors in Mathematics, Finance, and Computer Science. JonJon is currently completing his Masters in FinTech at UCT. JonJon has a passion for yoga, coffee, and CrytpoKitties.