a16z: Why Encrypted Mempools Are Not a Panacea for MEV

Odaily Planet Daily, July 16, 2025, 12:01

Original Authors: Pranav Garimidi, Joseph Bonneau, Lioba Heimbach, a16z

Original Translation: Saoirse, Foresight News

In blockchains, the maximum value that can be earned by deciding which transactions are included in a block, which are excluded, and in what order they appear is called 'Maximum Extractable Value', or MEV for short. MEV is prevalent in most blockchains and has been widely studied and debated in the industry.

Note: This article assumes that readers already have a basic understanding of MEV. Readers new to the topic may want to read our MEV explainer first.

Many researchers observing MEV have asked an obvious question: can cryptography solve this problem? One approach is an encrypted mempool: users broadcast encrypted transactions, which are revealed only after their order is finalized. The consensus protocol must then order transactions 'blindly', which seems to prevent MEV opportunities from being exploited during the ordering phase.

Unfortunately, from both practical and theoretical perspectives, encrypted mempools cannot provide a general solution to the MEV problem. This article will explain the difficulties and explore feasible design directions for encrypted mempools.

How Encrypted Mempools Work

There have been many proposals for encrypted mempools, but the general framework is as follows:

1. Users broadcast encrypted transactions.
2. Encrypted transactions are submitted to the chain (in some proposals, transactions first undergo a verifiable random shuffle).
3. When the block containing these transactions is finalized, the transactions are decrypted.
4. Finally, these transactions are executed.

It is important to note that step 3 (transaction decryption) raises a critical question: who is responsible for decryption, and what happens if decryption never occurs? A simple idea is to let users decrypt their own transactions (in which case encryption is unnecessary; a hiding commitment suffices). But this approach is vulnerable: attackers can engage in speculative MEV.

In speculative MEV, attackers guess that a certain encrypted transaction contains MEV opportunities, then encrypt their own transactions and try to insert them into advantageous positions (such as before or after the target transaction). If the transactions are arranged in the expected order, the attacker will decrypt and extract MEV through their own transactions; if not, they will refuse to decrypt, and their transactions will not be included in the final blockchain.
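The withholding strategy described above can be sketched with a plain hash commitment. This is a minimal illustration, not any specific proposal's protocol: the functions `commit`, `verify_reveal`, and `settle`, and the transaction strings, are all hypothetical names chosen for this example.

```python
import hashlib
import secrets


def commit(tx: bytes) -> tuple[bytes, bytes]:
    """Return (commitment, nonce); the commitment hides tx until the nonce is revealed."""
    nonce = secrets.token_bytes(32)
    return hashlib.sha256(nonce + tx).digest(), nonce


def verify_reveal(commitment: bytes, nonce: bytes, tx: bytes) -> bool:
    """Check that a revealed (nonce, tx) pair matches the earlier commitment."""
    return hashlib.sha256(nonce + tx).digest() == commitment


def settle(ordered_commitments, reveals):
    """Execute, in committed order, only the transactions whose owners revealed them."""
    executed = []
    for c in ordered_commitments:
        opened = reveals.get(c)  # (nonce, tx) if the owner chose to reveal
        if opened is not None and verify_reveal(c, *opened):
            executed.append(opened[1])
    return executed


victim_tx = b"victim: buy TOKEN"
attack_tx = b"attacker: buy TOKEN first"
victim_c, victim_n = commit(victim_tx)
attack_c, attack_n = commit(attack_tx)

# Favourable order: the attacker landed ahead of the victim, so the attacker reveals.
good = settle([attack_c, victim_c],
              {victim_c: (victim_n, victim_tx), attack_c: (attack_n, attack_tx)})

# Unfavourable order: the attacker landed behind and simply never reveals.
bad = settle([victim_c, attack_c], {victim_c: (victim_n, victim_tx)})
```

In the second case the attacker's commitment is silently dropped, so the failed attempt costs nothing beyond whatever it took to submit the commitment. That free option is what makes self-decryption exploitable.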

Perhaps penalties can be imposed on users who fail to decrypt, but implementing this mechanism is extremely difficult. The reason is: the penalty for all encrypted transactions must be uniform (after all, encrypted transactions cannot be distinguished), and the penalty must be severe enough to deter speculative MEV even in the face of high-value targets. This would result in a large amount of funds being locked up, and these funds need to remain anonymous (to avoid revealing the association between transactions and users). More troublesome is that if real users cannot decrypt normally due to program bugs or network failures, they will also suffer losses.
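To see why the bond would have to be large, consider a toy model. It is an assumption made here for illustration and does not come from any specific proposal: a speculative transaction lands in a profitable position with probability p and earns V; otherwise the attacker withholds decryption and forfeits a bond B. The attacker's expected profit is p * V - (1 - p) * B, so the attack breaks even at B = p * V / (1 - p). The function name and the numbers below are hypothetical.

```python
def break_even_bond(success_prob: float, mev_value: float) -> float:
    """Smallest bond at which withholding stops being profitable in expectation.

    Toy model (assumption): expected profit = p * V - (1 - p) * B.
    """
    if not 0.0 <= success_prob < 1.0:
        raise ValueError("success_prob must be in [0, 1)")
    return success_prob * mev_value / (1.0 - success_prob)


# Hypothetical numbers. Because ciphertexts are indistinguishable, every
# transaction would have to post the bond sized for the most valuable target.
small_target = break_even_bond(0.5, 100.0)
large_target = break_even_bond(0.75, 1_000_000.0)
```

Under these assumed numbers, a user making a small trade would need to lock up the same bond as one that deters an attack on a million-unit opportunity, which illustrates the capital lock-up problem described above.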

Therefore, most schemes suggest that when encrypting transactions, it must be ensured that they can be decrypted at a certain future time, even if the transaction initiator is offline or refuses to cooperate. This goal can be achieved in the following ways:

Trusted Execution Environments (TEEs): Users can encrypt transactions to a key held inside the secure enclave of a Trusted Execution Environment (TEE). In basic versions, the TEE is only used to decrypt transactions after a specific point in time (which requires the TEE to have a trustworthy notion of time). More complex schemes have the TEE decrypt transactions and build blocks, ordering transactions by arrival time, fees, and other criteria. Compared with other encrypted mempool schemes, the advantage of TEEs is that they can operate directly on plaintext transactions, reducing redundant data on chain by filtering out transactions that would revert. The drawback of this method is its reliance on trusting the hardware.

Secret-sharing and threshold encryption: In this scheme, users encrypt transactions to a certain key, which is jointly held by a specific committee (usually a subset of validators). Decryption requires meeting a certain threshold condition (for example, two-thirds of the committee members agree).

With threshold decryption, trust shifts from hardware to the committee. Supporters argue that since most protocols already assume an honest majority of validators for consensus, we can make a similar assumption that a majority of validators will remain honest and will not decrypt transactions early.

However, it is important to note a key difference here: these two trust assumptions are not the same concept. Consensus failures such as blockchain forks are publicly visible (belonging to 'weak trust assumptions'), while a malicious committee privately decrypting transactions in advance leaves no public evidence, making such attacks undetectable and unpunishable (belonging to 'strong trust assumptions'). Therefore, although on the surface, the security assumptions of the consensus mechanism and the encryption committee seem consistent, in practice, the assumption that 'the committee will not collude' is much less credible.
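The threshold idea can be illustrated with Shamir secret sharing over a prime field. This is a toy sketch: the field size, the function names `split` and `reconstruct`, and the 3-of-5 committee are assumptions for the example. Real threshold encryption schemes typically let members produce decryption shares without ever reassembling the key; here the key is reconstructed directly to keep the code short.

```python
import secrets

PRIME = 2**127 - 1  # a Mersenne prime, used as the field modulus in this toy example


def split(secret: int, threshold: int, n: int) -> list[tuple[int, int]]:
    """Split `secret` into n shares; any `threshold` of them recover it."""
    # Random polynomial of degree threshold - 1 with the secret as constant term.
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(threshold - 1)]
    shares = []
    for x in range(1, n + 1):
        y = 0
        for c in reversed(coeffs):  # Horner evaluation of the polynomial at x
            y = (y * x + c) % PRIME
        shares.append((x, y))
    return shares


def reconstruct(shares: list[tuple[int, int]]) -> int:
    """Lagrange-interpolate the polynomial at x = 0 to recover the secret."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret


key = 123456789
shares = split(key, threshold=3, n=5)
recovered_by_first_three = reconstruct(shares[:3])
recovered_by_other_three = reconstruct(shares[1:4])
```

Note that `reconstruct` runs entirely offline: any three colluding members can recover the key early without producing anything the rest of the network can observe, which is the undetectability problem described above.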

Time-lock and delay encryption: As an alternative to threshold encryption, delay encryption works as follows: users encrypt transactions to a public key whose corresponding private key is hidden in a time-lock puzzle. A time-lock puzzle is a cryptographic puzzle that encapsulates a secret that can be revealed only after a preset amount of time; more precisely, opening it requires performing a long sequence of non-parallelizable computations. Under this mechanism, anyone can solve the puzzle to obtain the key and decrypt the transactions, but only after completing a slow, inherently sequential computation, designed to take long enough that transactions cannot be decrypted before finality. The strongest form of this primitive, delay encryption, generates such puzzles publicly; the same effect can be approximated by a trusted committee using time-lock encryption, although at that point its advantages over threshold encryption are debatable.

Whether using delay encryption or having a trusted committee perform the computation, such schemes face several practical challenges. First, because the delay is determined by how long a computation takes, it is difficult to control the decryption time precisely. Second, these schemes rely on some entity running high-performance hardware to solve the puzzles efficiently; anyone can take on this role, but how to incentivize participation remains unclear. Finally, in such designs every broadcast transaction is eventually decrypted, including those never written into a block. Schemes based on thresholds (or witness encryption) can decrypt only the transactions that were successfully included.
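A classic way to build a time-lock puzzle is repeated squaring modulo a composite number: whoever knows the factorization can create the puzzle quickly, while everyone else must perform the squarings one after another. The sketch below uses that construction as a stand-in, with deliberately tiny, well-known primes; the function names and parameters are assumptions for illustration and the code is not secure as written.

```python
# Toy primes (both are Mersenne primes); far too small and too structured for real use.
P = 2**31 - 1
Q = 2**61 - 1


def make_puzzle(secret: int, squarings: int) -> tuple[int, int, int]:
    """Lock `secret` behind `squarings` sequential squarings modulo n = P * Q."""
    n = P * Q
    phi = (P - 1) * (Q - 1)
    # Shortcut available only with the factorization: reduce the exponent 2^t mod phi(n).
    exponent = pow(2, squarings, phi)
    mask = pow(2, exponent, n)  # equals 2^(2^t) mod n
    return n, squarings, (secret + mask) % n


def solve_puzzle(n: int, squarings: int, locked: int) -> int:
    """Recover the secret without the factorization: t squarings, strictly in sequence."""
    value = 2
    for _ in range(squarings):
        value = value * value % n
    return (locked - value) % n


n, t, locked = make_puzzle(secret=424242, squarings=20_000)
recovered = solve_puzzle(n, t, locked)
```

The wall-clock time of `solve_puzzle` depends entirely on how fast the solver's hardware can square, which is why the decryption time is hard to pin down and why someone has to be motivated to run the fastest solver.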

Witness encryption: The last and most advanced cryptographic approach is 'witness encryption'. In theory, witness encryption works as follows: once information is encrypted, only someone who knows a 'witness' for a specific NP relation can decrypt it. For example, information can be encrypted so that only someone who can solve a particular Sudoku puzzle, or who can provide the preimage of a particular hash value, can decrypt it.

(Note: An NP relation is the correspondence between a 'problem' and an 'answer that can be quickly verified'.)
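The hash-preimage example can be written down as a relation that is quick to check. The sketch below shows only the verification side, meaning what counts as a valid witness; it does not implement witness encryption itself. The function name and sample values are hypothetical.

```python
import hashlib


def in_relation(statement: bytes, witness: bytes) -> bool:
    """NP relation for hash preimages: witness is valid if SHA-256(witness) == statement."""
    return hashlib.sha256(witness).digest() == statement


# The statement is public; finding a witness is assumed hard, checking one is fast.
statement = hashlib.sha256(b"example preimage").digest()
valid = in_relation(statement, b"example preimage")
invalid = in_relation(statement, b"wrong guess")
```

A witness encryption scheme would tie decryption to possession of an input for which `in_relation` returns true, without the encryptor needing to know any such input in advance.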

For any NP relation, similar logic can be implemented through SNARKs. One can say that witness encryption essentially encrypts data so that it can be decrypted only by a party who can prove, via a SNARK, that specific conditions are met. In the context of encrypted mempools, a typical condition is that transactions can be decrypted only after the block has been finalized.

This is a highly promising theoretical primitive. In fact, it is a universal scheme, with committee-based and delay-based methods being just specific forms of its application. Unfortunately, we currently do not have any practical witness-based encryption schemes. Moreover, even if such schemes exist, it is hard to say that they would have more advantages than committee-based methods in proof-of-stake chains. Even if witness encryption is set to 'only decryptable after transactions are ordered in a finally confirmed block', a malicious committee can still privately simulate the consensus protocol to forge the final confirmation status of transactions, then use this private chain as a 'witness' to decrypt transactions. At this point, having the same committee use threshold decryption can achieve the same level of security with much simpler operations.

However, in proof-of-work consensus protocols, the advantages of witness encryption are more significant. Because even if the committee is completely malicious, it cannot privately mine multiple new blocks at the current blockchain head to forge the final confirmation status.

Technical Challenges Faced by Encrypted Mempools

Several practical challenges limit the ability of encrypted mempools to prevent MEV. In general, keeping information confidential is itself a hard problem. While encryption has not been widely used in Web3, decades of practice deploying it on the web (TLS/HTTPS) and in private communication (from PGP to modern encrypted messaging platforms such as Signal and WhatsApp) have thoroughly exposed the difficulties: encryption is a tool for protecting confidentiality, but it cannot provide absolute guarantees.

First, certain entities may directly access the plaintext of user transactions. In typical scenarios, users usually do not encrypt transactions themselves but delegate this task to wallet service providers. As a result, wallet service providers can access the plaintext of transactions and may even use or sell this information to extract MEV. The security of encryption always depends on all entities that have access to the keys. The scope of key control is the boundary of security.

Beyond this, the biggest problem is metadata, the unencrypted data surrounding the encrypted payload (transactions). Searchers can use this metadata to infer transaction intentions and then implement speculative MEV. It is important to note that searchers do not need to fully understand the content of transactions or guess correctly every time. For example, as long as they can reasonably judge that a certain transaction is a buy order from a specific decentralized exchange (DEX), it is enough to launch an attack.

Metadata falls into two broad categories: classic problems inherent in any use of encryption, and problems unique to encrypted mempools.

Transaction size: Encryption itself cannot hide the size of the plaintext (it is worth noting that the formal definition of semantic security explicitly excludes hiding the size of the plaintext). This is a common attack vector in encrypted communications. A typical case is that even after encryption, eavesdroppers can still determine the content being played on Netflix in real time by the size of each data packet in the video stream. In encrypted mempools, certain types of transactions may have unique sizes, thereby leaking information.
Broadcast time: Encryption also cannot hide timing information (another classic attack vector). In Web3 scenarios, certain senders (for example, in structured selling) may submit transactions at fixed intervals. Transaction times may also correlate with other information, such as activity on external exchanges or news events. A subtler use of timing is arbitrage between centralized exchanges (CEX) and decentralized exchanges (DEX): a sequencer can insert a transaction created as late as possible to take advantage of the latest CEX price information, while excluding all other transactions broadcast after a certain point in time (even if encrypted), ensuring that its own transaction is the only one to benefit from the latest price.
Source IP address: Searchers can monitor peer-to-peer networks and track source IP addresses to infer the identity of transaction senders. This problem was discovered in the early days of Bitcoin (more than a decade ago). If specific senders have fixed behavior patterns, this is extremely valuable to searchers. For example, knowing the sender's identity can associate encrypted transactions with decrypted historical transactions.
Transaction sender and fee/gas information: Transaction fees are a type of metadata unique to encrypted mempools. In Ethereum, a traditional transaction includes the on-chain sender address (used to pay fees), the maximum gas budget, and the price per unit of gas the sender is willing to pay. Like source network addresses, sender addresses can be used to link multiple transactions to each other and to real-world entities; gas budgets can hint at what a transaction does. For example, interacting with a specific DEX may require a recognizable, fixed amount of gas.

Sophisticated searchers may combine several types of metadata to predict transaction content.

In theory, this information can be hidden, but at the cost of performance and complexity. For example, padding transactions to a standard length can hide size but wastes bandwidth and on-chain space; adding delay before sending can hide time but increases latency; submitting transactions through anonymous networks like Tor can hide IP addresses, but this brings new challenges.
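The size-padding trade-off can be made concrete with a short sketch. It assumes a hypothetical scheme in which each transaction is length-prefixed and padded to a multiple of a fixed bucket size before encryption; the 256-byte bucket and the function names are illustrative choices, not values from any proposal.

```python
BUCKET = 256  # hypothetical bucket size in bytes


def pad(tx: bytes, bucket: int = BUCKET) -> bytes:
    """Length-prefix tx and zero-pad it to the next multiple of `bucket` bytes."""
    body = len(tx).to_bytes(4, "big") + tx
    padded_len = -(-len(body) // bucket) * bucket  # ceiling division
    return body + b"\x00" * (padded_len - len(body))


def unpad(padded: bytes) -> bytes:
    """Read the 4-byte length prefix and strip the padding."""
    length = int.from_bytes(padded[:4], "big")
    return padded[4:4 + length]


transfer = b"t" * 110   # stand-in for a small transaction
swap = b"s" * 200       # stand-in for a larger, recognizable transaction
sizes = (len(pad(transfer)), len(pad(swap)))
wasted_bytes = len(pad(transfer)) - len(transfer)
```

With these stand-in sizes, both payloads pad to the same 256 bytes, so an observer can no longer tell them apart by length, but more than half of the smaller transaction's footprint is now padding that still has to be broadcast and stored.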

The most difficult metadata to hide is transaction fee information. Encrypting fee data creates a series of problems for block builders. The first is spam: if fee data is encrypted, anyone can broadcast malformed encrypted transactions, which will be ordered but cannot pay fees, and once they are decrypted and fail to execute, no one can be held accountable. This might be solved with SNARKs proving that the transaction is well-formed and sufficiently funded, but that would significantly increase overhead.

The second is the efficiency of block building and fee auctions. Builders rely on fee information to create profit-maximizing blocks and to determine the current market price of on-chain resources. Encrypting fee data would disrupt this process. One solution is to set a fixed fee for each block, but this is economically inefficient and may give rise to a secondary market for transaction inclusion, defeating the purpose of an encrypted mempool. Another solution is to run fee auctions through secure multi-party computation or trusted hardware, but both are extremely costly.

Finally, secure encrypted mempools would increase system overhead in many ways: encryption would increase chain latency, computational load, and bandwidth consumption; how to combine with important future goals such as sharding or parallel execution is still unclear; it may also introduce new failure points for liveness (such as decryption committees in threshold schemes, delay function solvers); at the same time, design and implementation complexity would also significantly increase.

Many problems of encrypted mempools are similar to the challenges faced by blockchains aimed at ensuring transaction privacy (such as Zcash and Monero). If there is any positive significance, it is that solving all the challenges of encryption technology in MEV mitigation will incidentally clear the obstacles for transaction privacy.

Economic Challenges Faced by Encrypted Mempools

Finally, encrypted mempools also face economic challenges. Unlike technical challenges, which can be gradually mitigated with sufficient engineering effort, these economic challenges are fundamental limitations and are extremely difficult to solve.

The core problem of MEV stems from information asymmetry between those who create transactions (users) and those who extract MEV opportunities (searchers and block builders). Users usually do not know how much extractable value their transactions contain, so even with a perfect encrypted mempool, they may still be induced to disclose decryption keys in exchange for a reward lower than the actual MEV value, a phenomenon that can be called 'incentivized decryption'.

This scenario is not hard to imagine, as similar mechanisms like MEV Share already exist in reality. MEV Share is an order flow auction mechanism that allows users to selectively submit transaction information to a pool, where searchers compete for the right to exploit the MEV opportunities of that transaction. The winning bidder extracts MEV and then returns part of the profit (the bid amount or a certain proportion of it) to the user.

This model can be directly adapted to encrypted mempools: users need to disclose decryption keys (or partial information) to participate. But most users are unaware of the opportunity cost of participating in such mechanisms. They only see the immediate returns and are happy to disclose information. There are similar cases in traditional finance: for example, the zero-commission trading platform Robinhood, whose profit model is precisely to sell user order flow to third parties through 'payment-for-order-flow'.

Another possible scenario is that large builders force users to disclose transaction contents (or information about them) for censorship purposes. Censorship resistance is an important and controversial topic in Web3, but if large validators or builders are legally bound (for example, by the rules of the U.S. Office of Foreign Assets Control, OFAC) to enforce censorship lists, they may refuse to process any encrypted transactions. Technically, users might be able to use zero-knowledge proofs to show that their encrypted transactions comply with censorship requirements, but this would add cost and complexity. Even if the blockchain has strong censorship resistance (guaranteeing that encrypted transactions are included), builders may still place known plaintext transactions at the front of the block and encrypted transactions at the end. Transactions that need execution priority may therefore end up being forced to disclose their contents to builders.

Other Efficiency Challenges

Encrypted mempools would increase system overhead in many obvious ways. Users need to encrypt transactions, and the system also needs to decrypt them in some way, which would increase computational costs and may also increase transaction volume. As mentioned earlier, handling metadata would further exacerbate these overheads. However, there are also some efficiency costs that are not so obvious. In the financial field, if prices reflect all available information, the market is considered efficient; delays and information asymmetry lead to market inefficiency. This is the inevitable result brought by encrypted mempools.

This inefficiency would lead to a direct consequence: increased price uncertainty, which is the direct product of the additional delay introduced by encrypted mempools. Therefore, transactions that fail due to exceeding price slippage tolerance may increase, thereby wasting on-chain space.
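The effect of delay on slippage can be sketched with a constant-product pool (x * y = k). The pool model, the absence of swap fees, the 0.5% tolerance, and all numbers are assumptions chosen for illustration.

```python
def swap_out(reserve_in: float, reserve_out: float, amount_in: float) -> float:
    """Output of a fee-less constant-product swap (x * y = k)."""
    return reserve_out * amount_in / (reserve_in + amount_in)


def execute(reserve_in, reserve_out, amount_in, min_out):
    """Return the output amount, or None if the swap reverts on slippage."""
    out = swap_out(reserve_in, reserve_out, amount_in)
    return out if out >= min_out else None


# The user quotes against the pool state seen at signing time (hypothetical reserves).
reserve_in, reserve_out = 1_000.0, 1_000_000.0
quote = swap_out(reserve_in, reserve_out, 10.0)
min_out = quote * 0.995  # 0.5% slippage tolerance

# Executed immediately against the quoted state: succeeds.
immediate = execute(reserve_in, reserve_out, 10.0, min_out)

# Executed after a delay during which another same-sized swap moved the price: reverts.
moved_in = reserve_in + 10.0
moved_out = reserve_out - quote
delayed = execute(moved_in, moved_out, 10.0, min_out)
```

In this example a single intervening trade of the same size is enough to push the delayed swap past its tolerance. The transaction still occupies block space even though it accomplishes nothing.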

Similarly, this price uncertainty may also give rise to speculative MEV transactions, which attempt to profit from on-chain arbitrage. It is worth noting that encrypted mempools may make such opportunities more common: due to execution delays, the current state of decentralized exchanges (DEX) becomes more obscure, which is likely to lead to decreased market efficiency and price differences between different trading platforms. Such speculative MEV transactions would also waste block space, as they often terminate execution if no arbitrage opportunities are found.

Conclusion

This article set out to lay out the challenges facing encrypted mempools so that attention can turn to researching and developing other solutions. Even so, encrypted mempools may still become part of the answer to MEV mitigation.

A feasible idea is a hybrid design: some transactions are ordered 'blindly' through an encrypted mempool, while others use different ordering schemes. For specific types of transactions (such as buy and sell orders from large market participants who are able to carefully encrypt or pad transactions and are willing to pay higher costs to avoid MEV), a hybrid design may be a suitable choice. For highly sensitive transactions (such as fixes for contracts with security vulnerabilities), this design also has practical value.

However, due to technical limitations, high engineering complexity, and performance overhead, encrypted mempools are unlikely to become the 'universal solution for MEV' that people hope for. The community needs to develop other solutions, including MEV auctions, application-layer defense mechanisms, and shorter finality times. MEV will remain a challenge for some time to come, and in-depth research is needed to find the right balance among these solutions to mitigate its negative effects.