Anoma

Anoma is an intent-centric, privacy-preserving protocol for decentralized counterparty discovery, solving, and multi-chain atomic settlement. To learn more about Anoma's vision, take a look at the Anoma Vision Paper.

These documents describe the motivation, design rationale, architecture, and protocol details of Anoma. They are intended to be both minimal, in that no more is said than necessary, and complete, in that enough is said to define precisely what Anoma is. Anoma is a protocol, not an implementation. Documentation for the implementation of the Anoma protocol by Heliax can be found here.

Anoma is free, in both the senses of "free speech" and "free beer". The source for these documents is on GitHub, permissively licensed, and they can be forked or edited as you like. At present, this particular repository is stewarded by the Anoma Foundation. Contributions are welcome.

This specification is designed to be readable in both breadth-first and depth-first manners. If you want to understand the broad motivation and architecture of Anoma, read the motivation and architecture overviews. If you want to dive into the protocol details of a particular aspect of Anoma, such as the state architecture, or release line, such as V1, you can start directly with the document in question; overview documents are self-contained and/or crosslinked where necessary, and contain pointers to subprotocols and details thereof. A sitewide search function is available at the upper left.

NOTE: These documents are not yet complete. If you're reading this page right now, there's a high chance you might be interested in the V1 release line, slated for use in the Namada instance.

Happy reading!

Motivation

For now, see the Anoma vision paper.

Releases

In order to facilitate progressive deployment and iteration, the Anoma protocols are organised into a series of releases. Releases combine a selection of subcomponents of the Anoma protocols (which are themselves independently versioned) into a unified, compatible whole designed both to be architecturally self-contained and to provide a coherent product proposition.

Major release lines are defined by their product propositions - for example, once V1 is released, Heliax will continue to support it, improve performance, and add features relevant to multi-asset shielded transfers, but features providing for a substantially different product proposition will be slated for other release lines. This method of organisation is not a position on how particular instances of the Anoma protocols should evolve or upgrade; it is just a choice to cleanly separate different protocol version lines. Note, however, that the architectural capabilities of subsequent major version releases subsume previous ones - everything V1 can do, V2 can do as well. The version lines are a temporally scoped mode of organisation; Anoma is designed to converge to a singular suite of modular protocols.

At present, there are three major releases planned:

  • V1: V1 provides for multi-asset shielded transfers, with assets from any connected chain sharing the same anonymity set, on top of a basic proof-of-stake Tendermint BFT stack.
  • V2: V2 provides for programmable private bartering, counterparty discovery, and settlement, all on top of a bespoke heterogeneous consensus system.
  • V3: V3 provides for explicit information flow control and multiparty private state transitions.

V1

What is Anoma v1?

Namada is the first release protocol version and the first fractal instance of the Anoma protocol. Namada is a sovereign proof-of-stake blockchain, using Tendermint BFT consensus, that enables multi-asset private transfers for any native or non-native asset using a multi-asset shielded pool derived from the Sapling circuit. Namada features full IBC protocol support, a natively integrated Ethereum bridge, a modern proof-of-stake system with automatic reward compounding and cubic slashing, and a stake-weighted governance signalling mechanism. Users of shielded transfers are rewarded for their contributions to the privacy set in the form of native protocol tokens. A multi-asset shielded transfer wallet is provided in order to facilitate safe and private user interaction with the protocol.

How does Namada relate to Anoma?

Namada is two things:

  • The first major release version of the Anoma protocol.
  • The first fractal instance launched as part of the Anoma network.

The Anoma protocol is designed to facilitate the operation of networked fractal instances, which intercommunicate but can utilise varied state machines and security models. Different fractal instances may specialise in different tasks and serve different communities. The Namada instance will be the first such fractal instance, and it will be focused exclusively on the use-case of private asset transfers.

Raison d'être

Safe and user-friendly multi-asset privacy doesn't yet exist in the blockchain ecosystem. Up until now, you had the choice of building a sovereign chain that reissues assets (e.g. Zcash) or building a privacy-preserving solution on an existing chain (e.g. Tornado Cash on Ethereum). Both have large trade-offs: in the former case, users don't have assets that they actually want to use; in the latter case, the restrictions of existing platforms mean that users leak a ton of metadata and the protocols are expensive and clunky to use.

Namada can support any asset on an IBC-compatible blockchain and assets (such as ERC20 tokens) sent over a custom Ethereum bridge that reduces transfer costs and streamlines UX as much as possible. Once assets are on Namada, shielded transfers are cheap and all assets contribute to the same anonymity set.

Namada is also a helpful stepping stone to finalise, test, and launch a protocol version that is simpler than the full Anoma protocol but still encapsulates a unified and useful set of features. There are reasons to expect that it may make sense for a fractal instance focused exclusively on shielded transfers to exist in the long-term, as it can provide throughput and user-friendliness guarantees which are more difficult to provide with a more general platform. Namada is designed so that it could evolve into such an instance.

Features

Base Ledger

Cryptography

Interoperability

Economics

User interfaces

Later components

  • Mobile clients (Android/iOS)
  • More IBC adapters to different chains & ecosystems
  • Light clients, indexers with proofs
  • Private bridges

V2

V2 includes:

  • Counterparty discovery, matchmaking, and settlement
  • Transparent validity predicates
  • Ferveo for front-running prevention
  • Taiga
  • J-group
  • Typhon

Product: self-sovereign community economic infrastructure

  • Private transfers
  • Private bartering
  • Local identity
  • Local currency issuance
  • Local UBI
  • Public goods funding

V3

V3 includes:

  • FHE/MPC multi-party private state transitions

Product:

  • Limited economic statistics
  • Private auctions
  • Private multi-party stateful contracts

Architecture

Architectural subcomponents:

Clients

Web Wallet UI and Features

The application is divided into four main sections:

  • LockScreen
  • AccountOverview
  • Staking, Governance, Public Goods Funding
  • Settings

These are further divided into individual screens or flows (comprising several screens) grouping activities that belong together. The user stories below are associated with visual representations of the views, which are either wireframes or placeholder designs. When a view is represented with a wireframe, there is likely a ready design for the view. When there is no ready design, the developers can build the features using the placeholder designs.

Example: The wireframe for staking view would look like this:

And the placeholder design for the same feature looks like this:

Here are example user stories for that view:

User can:

  • see an overview of own balances (staked, available, ...)
  • see own active staking positions
  • see listing and be able to search all validators
  • easily be able to filter validators by state (active, inactive, ...)

Views

LockScreen

When the user accesses the wallet for the first time, there is a need to create a new account. This screen gives the user the possibility to do so, or to unlock the wallet using an existing account.

LockScreen

Wireframe

User can:

  • unlock the wallet by entering the master password
  • start a flow to create a new account

AccountOverview

This is the most important part of the application and the part where the user spends the most time. Here the user performs the most common tasks, such as creating transactions. Only one account is selected at a time, and the selected account is indicated here.

AccountOverview

Wireframe

User can:

  • see the aggregated balance in fiat currency
  • see the currently selected account address
  • navigate to Settings/Accounts to change the account
  • see a listing of all held tokens with their logos, balances, and names

AccountOverview/TokenDetails

Wireframe

User can:

  • see the balance of the token in native and fiat currency
  • navigate to AccountOverview/TokenDetails/Receive for receiving tokens
  • navigate to AccountOverview/TokenDetails/Send for sending tokens
  • see a listing of past transactions of the current account and selected token

AccountOverview/TokenDetails/Receive

Wireframe

User can:

  • see a QR code of the address
  • see the address as a string and copy it by clicking a button

AccountOverview/TokenDetails/Send

Wireframe 1 Wireframe 2 Wireframe 3

User can:

view 1:

  • see the balance of the token in the current account
  • enter details: transfer amount, recipient address, memo
  • select whether to perform the transaction as shielded

view 2:

  • see a summary of the transaction details
  • see a clear indication of whether the transaction is transparent or shielded
  • select a gas fee
  • see an option in gas fees that is specific to shielded transactions
  • see a transaction summary including gas fee

view 3:

  • see a confirmation once the transaction is confirmed
  • be able to navigate to see the new transaction in the block explorer
  • be able to navigate back to AccountOverview/TokenDetails

StakingGovernancePgf

Aside from AccountOverview, this is the part of the application the user is likely to visit most frequently. When the user clicks the main menu item Staking & Governance, a submenu with three options (Staking, Governance, Public Goods Funding) opens. Staking is selected by default.

Staking/Overview

designs

User can:

  • see an overview of own balances (staked, available, ...)
  • see own active staking positions
  • see the state of all staking positions (pending, staked, unbonding with remaining time)
  • see listing and be able to search all validators
  • easily be able to filter validators by state (active, inactive, ...)

Staking/ValidatorDetails

designs

User can:

  • see all on-chain information about the validator
  • see a logo of the validator
  • see and click a link to the validator's website
  • see all staking positions with the current validator
  • see the state of all staking positions (pending, staked, unbonding with remaining time)
  • see all unclaimed rewards with the current validator
  • open a modal to manage new staking, unstake, and claim rewards

Governance/Proposals

designs

User can:

  • see a listing of the latest proposals and their statuses
  • filter by proposal status
  • search by proposal title
  • navigate to the details of any proposal
  • navigate to a view to create a new proposal

Governance/ProposalDetails

designs

User can:

  • see all the details of the proposal
  • vote on the proposal if voting is open and the user has not voted yet
  • see all voting details of the proposal
  • see the full description

Governance/AddProposal

designs

User can:

  • enter the details (TBD) of the proposal
  • see a summary of the proposal
  • submit the proposal
  • be prompted for a payment by the wallet

PublicGoodsFunding/Overview

designs

User can:

  • see a list of current council members
  • see a list of the latest continuous funding
  • see a list of the latest retrospective funding
  • navigate to see current and past council members
  • navigate to see all continuous funding
  • navigate to see all retrospective funding

PublicGoodsFunding/Council

designs

User can:

  • see the details of the councils, including their funding, budget, members, ...
  • see the current council displayed by default
  • select a "Past" tab and see all past councils
  • select any past council in the table and see its details
  • navigate to the governance vote for the council
  • navigate to the details of the council's continuous and retrospective funding
  • navigate to the council member view to see details about the council members

PublicGoodsFunding/ContinuousFunding

designs

User can:

  • see all the funding
  • filter by: all, active, past
  • navigate to the council details that approved this funding
  • navigate to block explorer to see the transaction for the payments

PublicGoodsFunding/RetrospectiveFunding

designs

User can:

  • see all the funding
  • filter by: all, upcoming
  • navigate to the council details that approved this funding
  • navigate to block explorer to see the transaction for the payments

Settings

This is a part of the application that is visited less often. This is where the user can change settings or select the active account.

Settings

Wireframe

User can:

  • navigate to Settings/Accounts
  • navigate to Settings/WalletSettings

Settings/WalletSettings

Wireframe

User can:

  • see and change the fiat currency to display in the various locations in the app where amounts are shown in fiat currency (the default fiat currency is USD)

Settings/Accounts

Wireframe

User can:

  • select an account by clicking it, upon which it becomes visibly selected
  • navigate to Settings/AccountSettings to change the settings of a certain account
  • navigate to Settings/Accounts/NewAccount/Start to add a new account to the wallet

Settings/Accounts/NewAccount

Wireframe 1 Wireframe 2 Wireframe 3 Wireframe 4 Wireframe 5

User can:

view 1:

  • see a welcome screen that explains the flow

view 2:

  • enter an alias for the account
  • enter and confirm a password
  • select the length of the seed phrase (12 or 24 words)

view 3:

  • see the seed phrase that was generated
  • copy the seed phrase to clipboard

view 4:

  • enter a randomly requested word from the set of words. ("please enter word #5")

view 5:

  • see a confirmation that the account was created
  • navigate to AccountOverview, where the newly created account becomes the selected account

Settings/AccountSettings

Wireframe

User can:

  • rename the selected account
  • display the seed phrase (the user is prompted for a password)
  • delete the account (the user is prompted to input a security text to prevent accidental deletion)
  • select the network

IBC Protocol

The web wallet must be able to transfer token amounts to other chains via the Inter-Blockchain Communication Protocol (IBC).

We need to be able to support the following:

  • Fungible token transfer (ICS-20) from Namada to other Anoma chains
  • Fungible token transfer (ICS-20) from Namada to Cosmos

What the UI will need to display to the user:

  • Select a chain (chain ID) as destination
  • Enter a channel ID for destination (e.g., channel-0)
  • Specify a receiver address
  • Specify a token
  • Specify an amount to transfer

The web wallet will need to construct a MsgTransfer struct, which will get wrapped in a normal, signed transaction and broadcasted to the source ledger (this struct is passed into the Tx data):

#![allow(unused)]
fn main() {
MsgTransfer {
	source_port: String,
	source_channel: String,
	token: Option<Coin>,
	sender: Signer,
	receiver: Signer,
	timeout_height: Height,
	timeout_timestamp: Timestamp
}
}

A populated MsgTransfer with a disabled block-height timeout (using a timestamp timeout instead) may look like the following:

#![allow(unused)]
fn main() {
MsgTransfer {
	source_port: PortId("transfer"),
	source_channel: ChannelId("channel-0"),
	token: Some(Coin {
		denom: "atest1v4ehgw36x3prswzxggunzv6pxqmnvdj9xvcyzvpsggeyvs3cg9qnywf589qnwvfsg5erg3fkl09rg5",
		amount: "1.23456"
	}),
	sender: Signer( "atest1v4ehgw36xvmrgdfsg9rrwdzxgfprq32yxvensdjxgcurxwpeg5mrxdpjxfp5gdp3xqu5gs2xd8k4aj"
	),
	receiver: Signer( "atest1d9khqw36xu6njwp4x5eyz334g4zrjvz9gyungv6p8yurys3jxymrxvzy89pyzv2pxaprzsfedvglv2"
	),
	timeout_height: Height {
		revision: 0,
		height: 0
	},
	timeout_timestamp: Timestamp {
		time: Some(Time(PrimitiveDateTime {
			date: Date {
				year: 2022,
				ordinal: 124
			},
			time: Time {
				hour: 14,
				minute: 15,
				second: 33,
				nanosecond: 0
			}
		}))
	}
}
}

NOTE Unlike with tx_transfer, the amount we pass with the Token is not submitted in micro-units, but as a regular f32 value. No conversion is needed in the web wallet.

Once this transaction is unwrapped and validated, apply_tx will invoke IBC.dispatch() (see: https://github.com/anoma/anoma/blob/master/wasm/wasm_source/src/tx_ibc.rs).

When this is executed on the source chain, the balance will be deducted from the source account, so we need to reflect this in the interface. If the transaction succeeds, query the balance for that token and display it to the user.

Testing

Instructions for setting up local Namada chains, along with the Hermes relayer (ibc-rs), can be found here:

https://hackmd.io/@heliax/BJ5Gmyxrq

The wallet UI will need to be configured to connect to the source chain from which you want to transfer tokens. The user will have to enter a valid channel ID in the interface, in addition to an established address on the destination chain (the receiver).

Configuration

The wallet web app should accept a per-environment configuration containing not only the default network, but also the possible destination networks to which the user can transfer tokens (a sketch follows the list below). We need the following information for each, at a minimum:

  • A user-friendly alias naming the network
  • Destination URL
  • Destination Port
  • A non-default portId, if necessary, though in most cases, the default of transfer would likely be used.
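
A sketch of what such a configuration might look like in the wallet's TypeScript; the shape and field names here are illustrative assumptions, not a finalized schema:

// Sketch only: field names are assumptions, not a finalized schema.
interface DestinationNetwork {
  alias: string;    // user-friendly network name
  url: string;      // destination URL
  port: number;     // destination port
  portId?: string;  // non-default portId; "transfer" assumed when omitted
}

const destinationNetworks: DestinationNetwork[] = [
  {
    alias: "Local Namada",
    url: "http://localhost",
    port: 26657,
    // portId omitted: the default of "transfer" applies
  },
];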

Resources

Cosmos relayers:

Client Application

React Web Application

  • Built with TypeScript
  • State-management with Redux Toolkit (@reduxjs/toolkit)
  • CRA (create-react-app) scripts v5 with Craco to enable yarn workspaces (monorepo package management)
  • wasm-react-scripts - enabling WebAssembly files into the Webpack pipeline
  • Styled-Components for all application/component styling

WebAssembly Library

Much of the core functionality of the web app requires either direct interfacing with types from the Anoma codebase, or other Rust libraries providing encryption, key-management, mnemonic-generation, etc., that are more easily and robustly handled in the Rust ecosystem than in that of TypeScript.

The primary functionality that we currently pull from anoma involves constructing transactions. The web wallet interface should be able to serialize the data broadcast to the ledger for different transactions, and this requires items to be serialized within the WebAssembly code. We created anoma-lib, which houses wrapped Anoma types (wrapped when some work is needed to make them play well with wasm), along with the logic needed to interface with them from TypeScript.

The Rust source code of anoma-lib is structured as follows:

.
├── types
│   ├── address.rs
│   ├── keypair.rs
│   ├── mod.rs
│   ├── transaction.rs
│   ├── tx.rs
│   └── wrapper.rs
├── account.rs
├── lib.rs
├── transfer.rs
├── utils.rs

Here, we have several types that are essentially built on top of anoma types, allowing us to interface easily from the client app, such as address, keypair, tx, and wrapper, as well as a generic transaction type that handles the logic common to all transactions. Essentially, we want these types to handle any serialization that the anoma types require entirely within the wasm, then later translate the results into something the client can understand.

Outside of types, we have an account.rs file that allows us to call account functions, such as initialize (to construct an "init-account" transaction), from the client app. transfer.rs is similar, in that it provides the bridge for the client to issue a transfer transaction. Additional transactions can easily be created in this way, with any specific differences handled in a top-level Rust source file, the common logic of transactions handled by types/transaction, and any types that need extra work to be useful to the client added to types as well.

Interfacing between the Client and WebAssembly

When compiling the wasm using wasm-pack, we get the associated JavaScript source to interact with the WebAssembly output, as well as a TypeScript type definition file. When we set the wasm-pack target to web, we get an additional exported init function, which returns a promise that resolves when the wasm is fully loaded, exposing the memory variable. In most cases we shouldn't need to interact directly with the memory of the wasm, but by awaiting the init() call, we can immediately execute any of the wasm methods.

In the case of anoma-lib, there is a corresponding class that initializes and exposes the features of the wasm in anoma-wallet, called AnomaClient. (NOTE: This is one use case for wasm, but we may have any number of wasm projects that the wallet can utilize). Exposing the features through a TypeScript class is a good opportunity to move from Rust-style "snake-casing" to camel-casing (most common in TypeScript), and any additional type definitions we can add at this level as well.

The goal of bridging wasm and the client TypeScript application should be to make its usage as straightforward as any TypeScript class. It should also be fairly easy for the developer to add new features to the Rust source and quickly bring that into the client app.
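
As a minimal sketch of this bridging, assuming a hypothetical wasm export named transfer (the module path and export names here are illustrative, not the actual anoma-lib API):

// Sketch: module path and exported names are illustrative assumptions.
import init, { transfer } from "./lib/anoma/anoma";

export class AnomaClient {
  // Await the wasm module; resolves once its memory is initialized.
  async load(): Promise<this> {
    await init();
    return this;
  }

  // Re-expose a Rust snake_case export under a camelCase name,
  // adding TypeScript types at this level.
  makeTransfer(txCode: Uint8Array, data: Uint8Array): Uint8Array {
    return transfer(txCode, data);
  }
}

// Usage: const client = await new AnomaClient().load();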

Dealing with Rust types in TypeScript

One of the challenges of working with WebAssembly is how we might go about handling types from Rust code. We are limited to what JavaScript can handle, and often when serializing output from the wasm, we'll choose a simple type like string or number, or send the data as a byte array (very common, especially when dealing with numbers larger than JavaScript can handle by default). Sending raw data to the client is often a decent solution: any encoding we prefer can then be applied on the client side (hexadecimal, base58, base64, etc.), and choosing a Rust type like Vec<u8> makes this straightforward. (More to come on this topic in the future)
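
For instance, a Vec<u8> surfaced to the client arrives as a Uint8Array, and the preferred encoding can be applied client-side; a small sketch:

// Sketch: encode raw bytes returned from the wasm on the client side.
const toHex = (bytes: Uint8Array): string =>
  Array.from(bytes)
    .map((b) => b.toString(16).padStart(2, "0"))
    .join("");

const toBase64 = (bytes: Uint8Array): string =>
  btoa(String.fromCharCode(...bytes));

// e.g. toHex(new Uint8Array([222, 173, 190, 239])) === "deadbeef"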

There is much more nuance to handling types from Rust wasm in TypeScript when working with wasm-bindgen, and more information can be found at the following URL:

https://rustwasm.github.io/wasm-bindgen/reference/types.html

Testing with WebAssembly

The wallet interface should be able to run within the Jest testing framework. This is made possible by switching our wasm-pack target and rebuilding before the tests are run, as tests run within NodeJS. So, instead of the following:

wasm-pack build ../anoma-lib/ --out-dir ../anoma-wallet/src/lib/anoma --out-name anoma --target web
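
presumably the same command is rebuilt with the NodeJS target (this exact invocation is an assumption based on the available wasm-pack targets):

wasm-pack build ../anoma-lib/ --out-dir ../anoma-wallet/src/lib/anoma --out-name anoma --target nodejs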

Key Derivation (transparent addresses)

Given a master seed (a 12 or 24 word bip39 mnemonic), the user should be able to derive additional accounts deterministically.

The wallet currently implements functionality to derive bip32 addresses following bip44 paths for slip-0044 registered coin types, using hardened addresses.

The bulk of this functionality resides in anoma-apps/anoma-lib/lib/src/wallet.rs (https://github.com/heliaxdev/anoma-apps/blob/main/packages/anoma-lib/lib/src/wallet.rs). Creating a new Wallet struct with a provided mnemonic generates a seed byte vector and establishes a root extended key. Calling the derive method on that Wallet, providing a derivation path, will give us the following struct:

#![allow(unused)]
fn main() {
pub struct DerivedAccount {
    address: String,          // p2pkh address
    wif: String,              // Address in Wallet Import Format (WIF)
    private_key: Vec<u8>,     // Extended Private key
    public_key: Vec<u8>,      // Extended Public key
    secret: Vec<u8>,          // ed25519 secret key
    public: Vec<u8>,          // ed25519 public key
}
}

The ed25519 keys can then be used to initialize an account on the ledger to receive an Established Address.
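
From the client, usage might look like the following sketch; the wasm-bound constructor and method names mirror the Rust Wallet described above, but the binding surface shown here is an assumption:

// Sketch: names mirror the Rust Wallet type; the binding is assumed.
import { Wallet } from "./lib/anoma/anoma";

const mnemonic = "abandon abandon ..."; // a 12- or 24-word bip39 phrase
const wallet = new Wallet(mnemonic);    // seed byte vector + root extended key

// bip44 path: m / purpose' / coin_type' / account' / change' / index'
// (hardened, with the coin_type per the slip-0044 registration in use)
const account = wallet.derive("m/44'/0'/0'/0'/0'");
// account.public / account.secret: ed25519 keys for initializing
// an account on the ledger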

Deriving Shielded Addresses

TBD

Resources


Persistence of User Wallet

The state of the user's wallet, consisting of their master seed, along with any accounts derived from that seed, should be stored locally in a safe manner. As this requires the use of localStorage, all data should be encrypted.

Presently, this challenge is being addressed by using the user's password (specified when creating their master seed) to encrypt/decrypt the mnemonic seed, as well as to unlock the state of their wallet. The accounts in the state are persisted via redux-persist, with an encryption transform that handles the encrypting and decrypting of all data stored in localStorage.

The mnemonic is stored separately from the accounts data. In the Mnemonic implementation in anoma-apps/packages/anoma-lib/lib/types/mnemonic.rs, we provide the ability to specify a password, allowing us to retrieve a storage value of the mnemonic, which is encrypted before being saved to localStorage. When the wallet is locked, the user must provide a password, which is validated by attempting to decrypt the stored mnemonic. If successful, the password is used either to generate an encrypted Redux persistence layer, or to decrypt the existing one, restoring the user's wallet state.

redux-persist gives us the ability to specify which sub-sections of the state should be persisted. Presently, this is only enabled for any derived account data. From the persisted store, we can establish a persistor, which can be passed into a PersistGate component that will only display its children once the state is retrieved and decrypted from storage.
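
A sketch of how these pieces can fit together, using redux-persist with the encryptTransform from redux-persist-transform-encrypt; the reducer and slice names are illustrative stand-ins:

import { combineReducers, configureStore } from "@reduxjs/toolkit";
import { persistReducer, persistStore } from "redux-persist";
import storage from "redux-persist/lib/storage"; // localStorage
import { encryptTransform } from "redux-persist-transform-encrypt";

// Placeholder slice standing in for the derived-accounts reducer.
type AccountsState = Record<string, unknown>;
const accountsReducer = (state: AccountsState = {}, _action: { type: string }): AccountsState => state;

const makeStore = (secretKey: string, preloadedState?: { accounts: AccountsState }) => {
  const persistedReducer = persistReducer(
    {
      key: "anoma-wallet",
      storage,
      whitelist: ["accounts"], // persist only derived account data
      transforms: [encryptTransform({ secretKey })],
    },
    combineReducers({ accounts: accountsReducer })
  );
  const store = configureStore({
    reducer: persistedReducer,
    preloadedState,
    // redux-persist actions carry non-serializable payloads:
    middleware: (getDefault) => getDefault({ serializableCheck: false }),
  });
  return { store, persistor: persistStore(store) };
};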

If we wanted to export the state of the user's accounts, this would be trivial: simply a matter of exporting a JSON file containing the JSON.stringify-ed version of their accounts state. Some work would need to be done to restore the data into Redux, however.

The localStorage state is stored in one of three places, depending on your environment:

  • persist:anoma-wallet - Production
  • persist:anoma-wallet-dev - Devnet
  • persist:anoma-wallet-local - Local ledger

This allows us to keep our wallet state in sync with multiple ledgers while testing.

Restoring the accounts state from file

The user should have the ability to save the state of their accounts in their wallet to a JSON file. It is relatively trivial to take a snapshot of the accounts state once the user is authenticated.

Technically, this will likely involve a process by which, following the upload of the file and successful parsing, the existing persist:anoma-wallet storage is cleared; when the store is initialized, we pass the parsed accounts state into configureStore by way of the preloadedState parameter. This will only happen once; on subsequent calls to the makeStore function, it should hydrate from the encrypted value in local storage.
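
A sketch of that restore flow, reusing the makeStore shape sketched earlier (file parsing and storage-key handling here are illustrative):

// Sketch: restore an exported accounts snapshot into a fresh store.
const restoreAccountsFromFile = async (file: File, secretKey: string) => {
  const accounts = JSON.parse(await file.text());

  // Clear the existing persisted state for this environment.
  localStorage.removeItem("persist:anoma-wallet");

  // Seed the store once via preloadedState; subsequent calls to
  // makeStore hydrate from the encrypted value in localStorage.
  return makeStore(secretKey, { accounts });
};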

Refer to the following to see how our present makeStore Redux store factory functions:

https://github.com/heliaxdev/anoma-apps/blob/9551d9d0f20b291214357bc7f4a5ddc46bdc8ee0/packages/anoma-wallet/src/store/store.ts#L18-L50

This method currently accepts a secretKey as required by the encryptTransform, and checks the environment variables REACT_APP_LOCAL and NODE_ENV to determine where the store gets saved in localStorage. This is mostly useful for local testing where you may want to switch between connecting to a local ledger or a testnet, and want to keep your local stores in sync with both.

Challenges

As a secret is required to unlock the persisted store, this store must be instantiated dynamically once a password is entered and validated. In the current implementation of the wallet, any routes that will make use of the Redux store are loaded asynchronously. When they are loaded, the store is initialized with the user's password (which is passed in through the Context API in React, separate from the Redux state).

Resources

Using JSON RPC to Communicate with Ledger

To query values from the ledger, the web-wallet must issue JSON RPC calls to the Tendermint abci_query endpoint over HTTP which, if running the ledger locally, would look like:

http://localhost:26657/abci_query/

To handle this in the wallet, we can make use of existing functionality from cosmjs, namely, the RpcClient and WebsocketClient.

RPC HTTP Client

Over HTTP, using the abci_query endpoint, we can query the ledger by providing a path to the storage value we wish to query. Here are some examples:

  • Query balance: value/#{token_address}/balance/#{owner_address}
  • Query epoch: epoch
  • Is known address?: has_key/#{address}/?
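
As an illustration, a balance query over HTTP might look like the following sketch (response decoding is elided; the helper name is ours):

// Sketch: read a storage value via Tendermint's abci_query endpoint.
const queryStorage = async (rpcUrl: string, path: string): Promise<string> => {
  // Tendermint's URI interface expects string parameters quoted.
  const url = `${rpcUrl}/abci_query?path="${encodeURIComponent(path)}"`;
  const { result } = await (await fetch(url)).json();
  return result.response.value; // base64-encoded storage value
};

// queryStorage("http://localhost:26657", `value/${tokenAddress}/balance/${ownerAddress}`);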

There are many other types of queries in addition to abci_query that can be issued to Tendermint. See https://docs.tendermint.com/master/rpc/ for more information.

WebSocket Client

The most interesting type of interaction with the ledger thus far is via WebSockets. The goal of the implementation in anoma-wallet is to allow us to provide listeners so that we can update the React app according to activity on the ledger. The core functionality of the implementation on the client is as follows:

public async broadcastTx(
  hash: string,
  tx: Uint8Array,
  { onBroadcast, onNext, onError, onComplete }: SubscriptionParams
): Promise<SocketClient> {
  if (!this._client) {
    this.connect();
  }

  try {
    const queries = [`tm.event='NewBlock'`, `applied.hash='${hash}'`];
    this.client
      ?.execute(
        createJsonRpcRequest("broadcast_tx_sync", { tx: toBase64(tx) })
      )
      .then(onBroadcast)
      .catch(onError);

    this.client
      ?.listen(
        createJsonRpcRequest("subscribe", {
          query: queries.join(" AND "),
        })
      )
      .addListener({
        next: onNext,
        error: onError,
        complete: onComplete,
      });

    return Promise.resolve(this);
  } catch (e) {
    return Promise.reject(e);
  }
}

There are a few key things happening here. Once we have constructed a transaction, we receive a transaction hash and a Uint8Array containing the bytes of the wrapped and signed transaction. We first execute the request to broadcast_tx_sync, which can take an onBroadcast callback from the client to listen to the initial response from the ledger. We provide the tx data in base64 format as an argument.

Following that, we subscribe to events on the ledger using a query containing tm.event='NewBlock' AND applied.hash='transaction_hash_value', then register the following listeners so that we may trigger activity in the front-end app:

  • onNext - called when we receive a NewBlock event that matches our hash
  • onError - called in the event of an error
  • onComplete - called when the websocket closes

The way this library in anoma-wallet/src/lib/ is implemented, we can also determine when we want to disconnect the WebSocket. For instance, if for some reason we wanted to issue a series of transactions in succession, we could feasibly leave the connection open, then close it after the final transaction completes. Alternatively, and in most cases, we would simply close the connection when we are finished with a single transaction, which would then trigger the onComplete callback.
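
Usage from the client might look like the following sketch, assuming client is an instance of the class implementing broadcastTx above, and hash/txBytes come from transaction construction; the disconnect call and callback bodies are illustrative:

// Sketch: broadcast a signed tx and react to ledger events.
const socket = await client.broadcastTx(hash, txBytes, {
  onBroadcast: (response) => console.log("broadcast accepted:", response),
  onNext: () => {
    // A NewBlock event matched our hash: the tx was applied.
    socket.disconnect(); // single transaction: close, triggering onComplete
  },
  onError: (e) => console.error(e),
  onComplete: () => console.log("websocket closed"),
});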

See Transparent Transactions for more information on how the transactions are initially constructed.

Transparent Transactions


Constructing Transparent Transactions

The web-wallet will need to support many transactions. As the data that gets submitted to the ledger is most easily constructed from anoma types, we perform the assembly of the transaction within WebAssembly using Rust so that we may natively interact with anoma. The role of wasm in this scenario is to provide two pieces of data to the client (which handles the broadcasting of the transaction):

  1. hash - the hash of the transaction
  2. data - A byte array of the final wrapped and signed transaction

The following outlines how we can construct these transactions before returning them to the client.

Part 1 - Token Transfer Transactions

There are a few steps involved in creating and signing a transaction:

  1. Create an anoma::proto::Tx struct and sign it with a keypair
  2. Wrap Tx with a anoma::types::transaction::WrapperTx struct which encrypts the transaction
  3. Create a new anoma::proto::Tx with the new WrapperTx as data, and sign it with a keypair (this will be broadcast to the ledger)

1.1 - Creating the anoma::proto::Tx struct

The requirements for creating this struct are as follows:

  • A pre-built wasm in the form of a byte array (this is loaded in the client as a Uint8Array type to pass to the wasm)
  • A serialized anoma::types::token::Transfer object which contains the following:
    • source - source address derived from keypair
    • target - target address
    • token - token address
    • amount - amount to transfer
  • A UTC timestamp. NOTE: this is normally created when calling proto::Tx::new(); however, that is incompatible with the wasm runtime (time is undefined there). Therefore, we need to get a valid timestamp from js_sys:
#![allow(unused)]
fn main() {
// anoma-lib/src/util.rs

pub fn get_timestamp() -> DateTimeUtc {
    let now = js_sys::Date::new_0();

    let year = now.get_utc_full_year() as i32;
    let month: u32 = now.get_utc_month() + 1;
    let day: u32 = now.get_utc_date();
    let hour: u32 = now.get_utc_hours();
    let min: u32 = now.get_utc_minutes();
    let sec: u32 = now.get_utc_seconds();

    let utc = Utc.ymd(year, month, day).and_hms(hour, min, sec);
    DateTimeUtc(utc)
}
}

Creating the types::token::Transfer struct to pass in as data:

In wasm:

#![allow(unused)]
fn main() {
// anoma-lib/src/transfer.rs

let transfer = token::Transfer {
    source: source.0,
    target: target.0,
    token: token.0.clone(),
    amount,
};

// The data we pass to proto::Tx::new
let data = transfer
    .try_to_vec()
    .expect("Encoding unsigned transfer shouldn't fail");
}

In Anoma CLI: https://github.com/anoma/anoma/blob/f6e78278608aaef253617885bb7ef95a50057268/apps/src/lib/client/tx.rs#L406-L411

Creating and signing the proto::Tx struct

In wasm:

#![allow(unused)]
fn main() {
// anoma-lib/src/types/tx.rs

impl Tx {
    pub fn new(tx_code: Vec<u8>, data: Vec<u8>) -> proto::Tx {
        proto::Tx {
            code: tx_code,
            data: Some(data),
            timestamp: utils::get_timestamp(),
        }
    }
}
}

NOTE Here we provide a workaround for an issue with proto::Tx::new() in wasm: instead of calling the method directly on Tx, we create a new implementation that returns a proto::Tx, with the timestamp set using js_sys in order to make this wasm-compatible.

In Anoma CLI: https://github.com/anoma/anoma/blob/f6e78278608aaef253617885bb7ef95a50057268/apps/src/lib/client/tx.rs#L417-L419

1.2 - Creating the anoma::types::transaction::WrapperTx struct

The requirements for creating this struct are as follows:

  • A transaction::Fee type, which contains:
    • amount - the Fee amount
    • token - the address of the token
  • epoch - the ID of the epoch, obtained from a query
  • gas_limit - This contains a u64 value representing the gas limit
  • tx - the proto::Tx type we created earlier.

In wasm:

#![allow(unused)]
fn main() {
// anoma-lib/src/types/wrapper.rs

transaction::WrapperTx::new(
    transaction::Fee {
        amount,
        token: token.0,
    },
    &keypair,
    storage::Epoch(u64::from(epoch)),
    transaction::GasLimit::from(gas_limit),
    tx,
)
}

NOTE Here we can directly invoke WrapperTx::new, so we only need to concern ourselves with converting the JavaScript-provided values into the appropriate types.

In Anoma CLI: https://github.com/anoma/anoma/blob/f6e78278608aaef253617885bb7ef95a50057268/apps/src/lib/client/tx.rs#L687-L696

1.3 - Create a new Tx with WrapperTx and sign it

Here we create a WrapperTx type, then create a new Tx type (our wrapped Tx type) with the WrapperTx as the data, an empty vec![] for code, and a new timestamp, and then we sign it.

In wasm:

#![allow(unused)]
fn main() {
// anoma-lib/src/types/wrapper.rs -> sign()

(Tx::new(
    vec![],
    transaction::TxType::Wrapper(wrapper_tx)
        .clone()
        .try_to_vec().expect("Could not serialize WrapperTx")
)).sign(&keypair)
}

We can summarize a high-level overview of the entire process from the anoma-lib/src/types/transaction.rs implementation:

#![allow(unused)]
fn main() {
let source_keypair = Keypair::deserialize(serialized_keypair)?;
let keypair = key::ed25519::Keypair::from_bytes(&source_keypair.to_bytes())
    .expect("Could not create keypair from bytes");

let tx = Tx::new(
    tx_code,
    data,
).sign(&keypair);

let wrapper_tx = WrapperTx::new(
    token,
    fee_amount,
    &keypair,
    epoch,
    gas_limit,
    tx,
);

let hash = wrapper_tx.tx_hash.to_string();
let wrapper_tx = WrapperTx::sign(wrapper_tx, &keypair);
let bytes = wrapper_tx.to_bytes();

// Return serialized wrapped & signed transaction as bytes with hash
// in a tuple:
Ok(Transaction {
    hash,
    bytes,
})
}

In Anoma CLI: https://github.com/anoma/anoma/blob/f6e78278608aaef253617885bb7ef95a50057268/apps/src/lib/client/tx.rs#L810-L814

Part 2 - Initialize Account Transaction

Constructing an Initialize Account transaction follows a similar process to a transfer; however, in addition to providing a tx_init_account wasm, we need to provide the vp_user wasm as well, as this is required when constructing the transaction:

#![allow(unused)]
fn main() {
// anoma-lib/src/account.rs

let vp_code: Vec<u8> = vp_code.to_vec();
let keypair = &Keypair::deserialize(serialized_keypair.clone())
    .expect("Keypair could not be deserialized");
let public_key = PublicKey::from(keypair.0.public.clone());

let data = InitAccount {
    public_key,
    vp_code: vp_code.clone(),
};
}

Following this, we pass data into our new transaction as before, along with tx_code and the required values for WrapperTx, returning the final result in a JsValue containing the transaction hash and the returned byte array.

Submitting Transparent Transactions

See RPC for more information on HTTP and WebSocket RPC interaction with ledger.

Shielded Transfers In Web Client

Shielded transfers are based on MASP and allow users of Anoma to perform transactions where only the recipient, the sender, and a holder of a viewing key can see the transaction details. They are based on the specifications defined at Shielded execution.

Codebase

The code for interacting with shielded transfers is split across two places:

  • anoma-wallet (TypeScript)
    • capturing the user interactions
    • providing user feedback
    • fetching the existing MASP transactions from the ledger
  • masp-web (Rust)
    • generating shielded transfers
    • encrypting/decrypting data
    • utilising the MASP crate

packages
│   ├── masp-web                # MASP-specific Rust code
│   ├── anoma-wallet            # anoma web wallet

High level data flow in the client

In the current implementation, whenever a user starts to perform a new action relating to shielded transfers, such as creating a new transfer or retrieving the shielded balance, the client fetches all existing shielded transfers from the ledger. In its current form this is done in a non-optimal way: already-fetched past shielded transactions are not persisted in the client. They are fetched every time and live only briefly in memory, as raw byte arrays in the form they come in from the ledger. Their lifetime in the client spans from being fetched in the TypeScript code to being passed to, and scanned/decrypted by, the MASP protocol in the Rust code.

This process can be further optimized:

  • Anoma CLI already caches fetched transfers, so that logic can be re-used by providing a virtual filesystem (for example memfs) implementation to Rust.
  • Scanning could likely start in parallel while fetching is still running; if a sufficient number of notes is found during scanning, the fetching could be terminated.

Relation to MASP/Anoma CLI

The feature set and logic of the CLI and the web client should be the same. There are, however, a few differences in how they work:

  • When optimizing the shielded interaction, we need to fetch and persist the existing shielded transfers in the client. For this the CLI uses the operating system's file system, while the web client will either have to store that data directly in the persistence mechanism of the browser (localStorage or IndexedDB) or go through a virtual filesystem compliant with the WASI interface.
  • In the current state, network calls have to happen from the TypeScript code, outside of Rust and WASM. So any calls to the shielded-transfer-related code in Rust must accept arrays of byte arrays containing the newly fetched shielded transfers.
  • There are limitations to the system calls available for querying the CPU core count in the web client, so the sub-dependencies of MASP that use Rayon will be limited.

The API

The consumer should use the npm package @anoma/masp-web that lives next to the other packages in the anoma-apps monorepo. It exposes the following:

getMaspWeb

  • A util that returns an instance of MaspWeb and ensures it is initiated. If it was retrieved and initiated earlier, the existing instance is returned.
async (): Promise<MaspWeb>

MaspWeb

  • contains the methods to perform the shielded-transaction-related activities.
  • a utility method getMaspWeb() is exported that returns an instance of MaspWeb and ensures it is instantiated.

The class exposes the following methods:

generateShieldedTransaction

generateShieldedTransaction = async (
    nodesWithNextId: NodeWithNextId[],
    amount: bigint,
    inputAddress: string | undefined,
    outputAddress: string,
    transactionConfiguration: TransactionConfiguration
  ): Promise<Uint8Array>

getShieldedBalance

getShieldedBalance = async (
    nodesWithNextId: NodeWithNextId[],
    inputAddress: string,
    transactionConfiguration: TransactionConfiguration
  ): Promise<string>

createShieldedMasterAccount

  • the return type still needs to be added, to reflect ShieldedAccount derived from the Rust code in packages/masp-web/lib/src/shielded_account/mod.rs
createShieldedMasterAccount = (
    alias: string,
    seedPhrase: string,
    password?: string
): any

// the return type is from Rust code
// packages/masp-web/lib/src/shielded_account/mod.rs:ShieldedAccount
//
// pub struct ShieldedAccount {
//     viewing_key: String,
//     spending_key: String,
//     payment_address: String,
// }

decodeTransactionWithNextTxId

  • Utility that decodes the fetched shielded transactions from the ledger and returns a format that contains the shielded transaction and the id for fetching the next one.
decodeTransactionWithNextTxId = (byteArray: Uint8Array): NodeWithNextId

type NodeWithNextId = {
  node: Uint8Array;
  nextTransactionId: string;
};

The above wraps the Rust API described below, which is not intended to be used independently at the moment.

Underlying Rust code

Currently masp-web exposes the following API:

create_master_shielded_account

#![allow(unused)]
fn main() {
// Creates a shielded master account
// Takes an optional password
pub fn create_master_shielded_account(
    alias: String,
    seed_phrase: String,
    password: Option<String>,
) -> JsValue
}

get_shielded_balance

  • returns the shielded balance for a (spending_key_as_string, token_address) pair
  • requires the past transfers as an input
#![allow(unused)]
fn main() {
pub fn get_shielded_balance(
    shielded_transactions: JsValue,
    spending_key_as_string: String,
    token_address: String,
) -> Option<u64>
}

create_shielded_transfer

  • returns a shielded transfer, based on the passed in data
  • requires the past transfers as an input
#![allow(unused)]
fn main() {
pub fn create_shielded_transfer(
    shielded_transactions: JsValue,
    spending_key_as_string: Option<String>,
    payment_address_as_string: String,
    token_address: String,
    amount: u64,
    spend_param_bytes: &[u8],
    output_param_bytes: &[u8],
) -> Option<Vec<u8>>
}

NodeWithNextId

  • This is a utility type used when the TypeScript code fetches the existing shielded transfers and extracts the id of the next shielded transfer to be fetched. The data returned from the ledger is turned into this type so that the TypeScript code can read the id of the next transfer and fetch it.
#![allow(unused)]
fn main() {
pub struct NodeWithNextId {
    pub(crate) node: Option<Vec<u8>>,
    pub(crate) next_transaction_id: Option<String>,
}
}

NodeWithNextId::decode_transaction_with_next_tx_id

  • accepts the raw byte array returned from the ledger when fetching shielded transfers, and returns NodeWithNextId as a JsValue
#![allow(unused)]
fn main() {
pub fn decode_transaction_with_next_tx_id(transfer_as_byte_array: &[u8]) -> JsValue
}

Web explorer interface

  • Block explorer
    • Display PoS state
    • Display governance state
    • Display transparent transfers
    • Display transfers in and out of the MASP
    • Display total values for the MASP
    • Allows tx hashes of shielded transfers to be looked up for confirmation

Consensus

Ferveo

Typhon

Summary

Typhon stores, orders, and executes transactions on Anoma blockchains. It is intended as a replacement for Tendermint. We have a brief overview presentation of some of the features of Typhon here.

Typhon can be broken down into (roughly) 3 layers:

  • a mempool, which stores received transactions
  • a consensus, which orders transactions stored by the mempool, and
  • an execution engine, which executes the transactions on the state machine.

We expect each Anoma participant (validator) will run processes in all three layers.

[Layer diagram]

Above, we use "client" to refer to matchmakers, Ferveo, or anyone else who generates transactions to be ordered. The "critical path" is shown in thicker arrows, with other crucial messages shown in narrower arrows.

Mempool

Validators receive transactions from clients, store them, and make them available for the execution engine to read. The mempool protocol, which is based on Narwhal, also produces a DAG of headers, which reference batches of transactions (via hash) and prove that those transactions are available to the execution engine. These headers are ultimately what the consensus decides on, in order to establish a total order of transactions. Read more here.

Consensus

Our consensus is based on Heterogeneous Paxos. Validators choose a totally ordered sequence of headers from the mempool DAG. This establishes a total order of transactions for the execution engine to execute. Read more here.

Execution Engine

Given a total order of transactions, the execution engine updates and stores the "current" state of the virtual machine, using as much concurrency as possible. Proofs from the execution engine allow light clients to read the current state. When the execution engine has finished with a transaction, it communicates to the mempool that the transaction can be garbage-collected from storage. Read more here.

Mempool

Summary

Validators run the mempool protocol. They receive transactions from clients, store them, and make them available for the execution engine to read. The mempool protocol, which is based on Narwhal, also produces a DAG of headers, which reference batches of transactions (via hash) and prove that those transactions are available to the execution engine. These headers are ultimately what the consensus decides on, in order to establish a total order of transactions.

Heterogeneous Narwhal

The core idea here is that we run an instance of Narwhal for each learner. For chimera chains, an "atomic batch" of transactions can be stored in any involved learner's Narwhal.

We also make 2 key changes:

  • The availability proofs must show that any transaction is sufficiently available for all learners. This should not be a problem, since in Heterogeneous Paxos, for any connected learner graph, any learner's quorum is a weak quorum for all learners.
  • Whenever a validator's Narwhal primary produces a batch, it must link in that batch not only to a quorum of that learner's block headers from the prior round, but also to the most recent batch this validator has produced for any learner. This ensures that, within a finite number of rounds (3, I think), any transaction batch referenced by a weak quorum of batches in its own Narwhal will be (transitively) referenced by all batches in all Narwhals for entangled learners.

Overview

Like Narwhal, Heterogeneous Narwhal validators have multiple concurrent processes (which can even run on separate machines). Each validator has one primary process and many worker processes. When a client submits a transaction, they first send it to a worker process.

Workers

Worker processes ensure transactions are available. Transactions are batched and erasure-coded (possibly simply replicated) across workers on a set of validators forming a weak quorum for every learner, and only signed hashes of those batches are sent to primaries. This separates the high-bandwidth work of replicating transactions from the ordering work of the primaries.

Primaries

Primary processes establish a partial order of transaction batches (and by extension transactions), in the form of a structured DAG. The DAG proceeds in rounds for each learner: each primary produces at most one block for each (correct) learner in each round. That block references blocks from prior rounds.

Primaries assemble headers (both their own and for other primaries) from collections of worker hashes, and references to prior blocks. They then sign votes, stating that they will not vote for conflicting headers, and (optionally) that their workers have indeed stored the referenced transactions. Primaries collect votes concerning their own headers, producing blocks: aggregated signatures showing a header is unique.

More formally, we present the Heterogeneous Narwhal protocol as the composition of two crucial pieces: the Heterogeneous Narwhal Availability protocol, and the Heterogeneous Narwhal Integrity protocol.

Vocabulary

  • Learners dictate trust decisions: just like in Heterogeneous Paxos, we use a Learner Graph. In diagrams, we usually represent learners with colors (red and blue).
  • Quorum: a set of validators sufficient for a Learner to make blocks. Each Learner has a set of quorums.
  • Intact Learner: any 2 quorums for an Intact Learner have a correct validator in their intersection. Most of our guarantees apply only to Intact Learners.
  • Entangled Learners: a pair of learners A and B are entangled if, for any quorum Qa of A and any quorum Qb of B, the intersection of Qa and Qb contains a correct validator. Some guarantees apply pairwise to Entangled Learners: they are, in a sense, guaranteed to agree on stuff.
  • Weak Quorum: a set of validators that intersects every quorum. Weak Quorums are Learner-specific, so when we say weak quorum for every learner we mean a set of validators that intersects every quorum of every Learner (see the sketch after this list).
  • Transaction: data from clients to be ordered. We do not specify how it's formatted.
  • Batch: a set of transactions collected by a Worker.
  • Erasure Share: data transmitted to a weak quorum of listening workers, such that any Quorum of validators can reconstruct the original data (Transaction or Batch of Transactions).
  • Worker Hash: a signed digest of a batch of transactions collected (and signed) by a worker.
  • Headers have:
    • an associated Primary (who "created" this header)
    • a set of Worker Hashes (from workers on the same validator as this primary)
    • an Availability Certificate for the previous Header issued by this primary
    • at most one Signed Quorum for each Learner
  • Availability Certificate: an aggregation of signatures from a Weak Quorum attesting that everything referenced by a particular Header is available. Must include a signature from the Header's primary.
  • Block: an aggregation of Header signatures from a quorum of a specific learner attesting that they will not attest to any conflicting header. Also includes an Availability Certificate. Should include all signatures a primary has gathered for that header at the time (signatures in the Availability Certificate count).
  • Signed Quorum: a quorum of blocks with the same learner and round, signed by a primary. These are referenced in headers.
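
To make the weak-quorum vocabulary concrete, here is a small illustrative sketch (in TypeScript, with sets of validator ids standing in for the real structures) of checking that a validator set is a weak quorum for every learner, i.e. that it intersects every quorum of every learner:

// Sketch: learners map to their quorums (sets of validator ids).
type Validator = string;
type Quorum = Set<Validator>;
type LearnerGraph = Map<string, Quorum[]>;

const intersects = (a: Set<Validator>, b: Set<Validator>): boolean =>
  [...a].some((v) => b.has(v));

// A weak quorum for every learner intersects every quorum of every learner.
const isWeakQuorumForEveryLearner = (
  validators: Set<Validator>,
  learners: LearnerGraph
): boolean =>
  [...learners.values()].every((quorums) =>
    quorums.every((quorum) => intersects(validators, quorum))
  );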

Data Structure

Heterogeneous Narwhal Availability Protocol

[Availability Protocol time-space diagram; the giant curly brace represents a Weak Quorum of validators]

Batches and Worker Hashes

When a worker has collected a batch of transactions, it transmits erasure shares (possibly full copies) of those transactions to other workers on a set of validators forming a weak quorum for every learner. What's important about this erasure coding is that any Quorum of any Learner can reconstruct every transaction. Furthermore, workers must be able to verify that they are in fact storing the correct Erasure Share of the data referenced in the Worker Hash. One way to accomplish this is to transmit a complete copy of all the data to an entire Weak Quorum for every Learner.

In fact, rather than wait until a batch is complete to start transmitting, workers can stream erasure shares as they receive transactions. When it has completed a batch, a worker also transmits a signed Worker Hash to those other workers, and its own primary. We do not specify when workers should complete batches, but perhaps it should be after some timeout, or perhaps primaries should signal workers to complete batches. Batches should not be empty.

Signed Quorums and Headers

Primaries ultimately produce blocks for each round, for each Learner, and send those blocks to other Primaries. When a primary for validator V has received blocks for learner L and round R from an entire quorum of validators for learner L, it signs that collection, producing a Signed Quorum object, which identifies the validator V, the learner L, and the round R. The Signed Quorum is then broadcast (or erasure-coded) to primaries on a set of validators forming a weak quorum for every learner. Much like batches, it is important that any Quorum for any Learner can reconstruct the entire Signed Quorum.

Periodically, each primary P produces Headers. Each Header contains:

  • a set of signed Worker Hashes, all signed by P's validator
  • a hash referencing at most one Signed Quorum per Learner, all signed by P
  • an Availability Certificate (we'll get to how those are made shortly) for the previous Header P issued.

Headers should be relatively small. Each primary then sends the header to all the other primaries.

When a Primary receives a Header, it can produce an Availability Vote (which is a digital signature) iff:

  • the primary has stored its share of all Signed Quorums referenced, and
  • the primary has received messages from its workers indicating that they have stored their shares of all the Batches referenced.

The Availability Votes are then transmitted to the Header's Primary.

When a primary receives Availability Votes for a Header from a weak quorum for every learner, it can aggregate those signatures to produce an Availability Certificate, which proves that the Header (and its contents) are available to any Quorum. Availability Certificates should be small. Note that, if primaries broadcast Availability Certificates as soon as they produce them, other primaries may have all the components necessary to "receive" a Header even before the Header's Primary actually sends it. Specifically, they may have:

  • Signed Batch Headers from their listening Workers
  • Signed Quorum shares received earlier from the Primary
  • Availability Certificate received earlier from the Primary

Heterogeneous Narwhal Integrity Protocol

So far, only Signed Quorums have been Learner-specific: everything else requires a weak quorum for every learner. However, in the Integrity Protocol, almost everything is Learner-specific. Furthermore, Workers are not involved in the Integrity Protocol: only Primaries.

[Integrity Protocol time-space diagram]

Each Header H features a predecessor H': the availability certificate in H references the header H'. When a Primary receives a Header H, it can produce an Integrity Vote iff it has not produced an Integrity Vote for any other Header with the same predecessor as H. In essence, this means that each correct Primary signs, for each other (even incorrect) Primary, a unique chain of Headers. This will ensure that no primary can produce conflicting blocks for entangled Learners. Integrity Votes are transmitted back to the Primary associated with the Header. In practice, Integrity and Availability votes may be combined for Primaries who can cast both.

For each Header it produces, a Primary can calculate its Learner Vector: this represents, for each Learner, the highest round number of any quorum referenced in this Header or its ancestors (its predecessor, its predecessor's predecessor, etc.). If, for some Learner L, a header H has a greater round number R in its Learner Vector for L than H's predecessor did, then the Primary can produce a Block for learner L and round R. Intuitively, a Primary produces a block whenever it obtains a quorum for a Learner in a new latest round.
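
A sketch of this learner-vector bookkeeping; the map representation and function names are assumptions for illustration, not part of the protocol.

use std::collections::HashMap;

type LearnerId = u64;
type Round = u64;

/// For each learner, the highest round of any quorum referenced by a
/// header or its ancestors.
type LearnerVector = HashMap<LearnerId, Round>;

/// A header's learner vector: its predecessor's vector, raised by the
/// (learner, round) of each Signed Quorum the new header references.
fn learner_vector(
    predecessor: &LearnerVector,
    new_quorums: &[(LearnerId, Round)],
) -> LearnerVector {
    let mut lv = predecessor.clone();
    for &(learner, round) in new_quorums {
        let entry = lv.entry(learner).or_insert(0);
        if round > *entry {
            *entry = round;
        }
    }
    lv
}

/// The primary produces a block for learner L and round R whenever the
/// new header's entry for L exceeds its predecessor's.
fn blocks_to_produce(
    predecessor: &LearnerVector,
    current: &LearnerVector,
) -> Vec<(LearnerId, Round)> {
    current
        .iter()
        .filter(|&(l, r)| predecessor.get(l).copied().unwrap_or(0) < *r)
        .map(|(&l, &r)| (l, r))
        .collect()
}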

A block for learner L includes an Availability Certificate, as well as an aggregated signature formed from the Integrity Votes of (at least) a quorum (for learner L) for the same Header. Blocks are transmitted to all other Primaries, who use them to form Signed Quorums.

If a Primary uses the same Header to make blocks for multiple Learners, each block it produces must use a superset of signatures as the previous. This ensures that if the Primary produces a block for Learner A and then a block for learner B, the Block for learner B effectively includes the block for learner A. We can use this when we later establish a total ordering: any reference to the learner B block also effectively references the learner A block.

Here is an example timeline of a Primary producing headers, availability certificates, and blocks. Blocks are color-coded by learner and include a round number; headers display Learner Vectors.

[Figure: single Primary timeline]

DAG Properties

Independently, the blocks for each Learner form a DAG with the same properties as in the original Narwhal:

[Figure: blue learner DAG]

(In these diagrams, blocks reference prior blocks from the same Primary; I just didn't draw those arrows.)

Note that blocks reference a quorum of blocks from the previous round. This does not require that the same primary produced a block for the previous round.
In round 5, Primary 3 can produce a block if it has received a quorum of round 4 blocks from other Primaries.

Of course, primaries do not necessarily produce blocks for the same round at the same literal time. Here we see primaries producing blocks for round 3 for the red learner at different times, depending on when they finish batches, receive a round-2 quorum, or collect enough votes:

[Figure: DAG with blocks produced at different times]

In Heterogeneous Narwhal, these two DAGs are created simultaneously (using the same sequence of Headers from each Primary, and many of the same Votes):

[Figure: blue and red learner DAGs]

Note that round numbers for different learners do not have to be close to each other. Red round 3 blocks are produced after blue round 5 blocks, and that's ok.

Furthermore, rounds of different learners are not totally ordered. Red round 3 cannot really be said to happen before, or after, blue round 4.

Fair Broadcast

In Homogeneous Narwhal, any block which is referenced by a weak quorum in the following round will be (transitively) referenced by all blocks thereafter. Heterogeneous Narwhal has analogous guarantees:

Any block for learner A referenced by a weak quorum for learner A will, after 3 rounds, be (transitively) referenced by all future blocks of learners entangled with A.

Specifically, such a block X in round R will be (transitively) referenced by all A-blocks in round R+2.

Consider the first round for learner B that uses at least a quorum of headers which are either used in A's round R+2 or issued after their primaries' headers for A's round R+2. Given that learner B is entangled with A, any B-quorum for this round will be a descendant of an A-block from round R+2, and therefore of X.

Consensus

Leader Path

In order to establish a total order of transactions, we use Heterogeneous Paxos to decide on an ever-growing path through the DAG (for each Learner). Heterogeneous Paxos guarantees that, if two Learners are entangled, they will decide on the same path. In order to guarantee liveness (and fairness) for each Learner's transactions, we require that:

For any accurate learner L, if one of L's quorums remains live, and an entire quorum of L issues blocks for round R, consensus will eventually append one of L's round-R blocks, or one of its descendants, to L's path.

Crucially, if two learners are not entangled, and their blocks never reference each other, consensus should not forever choose blocks exclusively from one learner. This does require a minimal amount of fairness from consensus itself: as long as blocks for learner L keep getting proposed (indefinitely), consensus should eventually append one of them to the path.

Choosing a total order

Given a consensus-defined path, we can impose a total order on all transactions which are ancestors of any block in the path. We require only that, given some block B in the path, all transactions which are ancestors of B are ordered before all transactions which are not ancestors of B. Among the transactions which are ancestors of B but not of its predecessor in the path, a total order can be imposed by some arbitrary deterministic function.
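
A minimal sketch of this rule, assuming each path entry is given as the block's full set of ancestor transaction hashes, and using the hash itself as the arbitrary deterministic tie-breaker (both assumptions for illustration):

use std::collections::HashSet;

type TxHash = [u8; 32];

/// Given, for each block on the consensus path (in path order), the set
/// of transaction hashes that are ancestors of that block: emit a total
/// order in which each block's new ancestors precede everything that
/// only later blocks introduce.
fn total_order(path: &[Vec<TxHash>]) -> Vec<TxHash> {
    let mut seen: HashSet<TxHash> = HashSet::new();
    let mut ordered = Vec::new();
    for block_ancestors in path {
        // Ancestors of this block not already ordered by an earlier block.
        let mut fresh: Vec<TxHash> = block_ancestors
            .iter()
            .filter(|tx| !seen.contains(*tx))
            .copied()
            .collect();
        // Any deterministic function works here; we sort by hash.
        fresh.sort();
        for tx in fresh {
            seen.insert(tx);
            ordered.push(tx);
        }
    }
    ordered
}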

Heterogeneous Paxos

Summary

This specification intends to describe how Heterogeneous Paxos can be realized in the blockchain context. Given well-defined quorums, the protocol allows a known set of chains to construct and carry out atomic transactions on a so-called chimera chain, subject to certain conditions. The chimera chain guarantees safety and liveness subject to the usual assumption: at most a third of the stake belongs to Byzantine participants.

The protocol involves three kinds of agents: proposers, acceptors and learners. A node may play the role of any combination of the three kinds of agents.

Blocks are agreed upon in rounds.

Proposers initiate a round by proposing a block. Each block contains atomic batches of transactions (or even a single transaction). An atomic batch of transactions means that either all transactions in the batch are executed or none of them are.

Acceptors are agents who agree on the proposed blocks; an agent may be an acceptor for more than one chain. No correct acceptor acts on any invalid block. This requires checking the validity of blocks against the state transition function.

Learners are sets of agents, where each set is interested in what the voting process decides for a particular chain (or combination of chains). Learners are defined in terms of quorums and fixed by the protocol: agents are not free to choose their own quorum setups. Acceptors need to be aware of the definitions of the different learners in order to know what correct behavior is. The sets of agents corresponding to learners may be empty or may overlap; the acceptors have to follow the definitions regardless.

We briefly describe how the communication for a consensus round works. Suppose we have two learners L_A and L_B, which refer to agents that are interested in blockchain A and blockchain B respectively. Proposers propose a chimera block by sending 1a messages, each carrying a value and a unique ballot number (round identifier), to acceptors. All acceptors in all involved chains (A and B) send 1b messages to each other to communicate that they've received a 1a message. When an acceptor receives 1b messages for the highest ballot number it has seen from a learner L's quorum of acceptors, it sends a 2a message labeled with L and that ballot number. There is one exception: once a safe acceptor sends a 2a message for a learner L_A, it never sends a 2a message with a different value for a learner L_B, unless one of the following is true:

  • It knows that a quorum of acceptors has seen 1b messages with learner L_B and a ballot number higher than that of its 2a message.
  • It has seen Byzantine behavior that proves L_A and L_B do not have to agree.

A learner decides on a block when it receives 2a messages with the same proposed block and ballot number from one of its quorums of acceptors.

Preliminaries

  • Base chains are two or more independent chains between which we would like to carry out atomic transactions. To be compatible, the chains must adhere to some specific set of protocol requirements, for example IBC support.
  • A chimera chain is a chain that allows atomic transactions to be carried out on objects from the base chains. It carries an additional consensus mechanism, that is dependent on the consensus of the base chains.

  • A learner is a client that is interested in the value decided by the voting process. Learners might be full nodes, light client nodes, or nodes from other chains.

  • An acceptor is an agent participating in the voting process of the consensus protocol. Non-byzantine acceptors are called real acceptors.

  • A quorum is a subset of acceptors sufficient to make a learner decide on a value. For a learner to maintain consistency (avoid deciding on two contradictory values), any two of its quorums must have a common real acceptor. Most chains achieve this in practice by making the intersection of the quorums large enough, i.e. the acceptors in the intersection are backed by more than 1/3 of the stake of each chain under a <1/3 Byzantine assumption. For example, suppose (see the sketch after this list):

    • Base chain A has a total stake of S_A.
    • Base chain B has a total stake of S_B.
    • Any set of acceptors backed by more than 2/3 of S_A is a quorum of chain A. This means more than 1/3 of S_A would have to back unsafe acceptors for chain A to fork.
    • Any set of acceptors backed by more than 2/3 of S_B is a quorum of chain B. This means more than 1/3 of S_B would have to back unsafe acceptors for chain B to fork.
    • Suppose that, for every quorum Q_A of chain A and every quorum Q_B of chain B, the acceptors in the intersection of Q_A and Q_B are backed by more than 1/3 of S_A and more than 1/3 of S_B. This would mean that, in order for atomic batches on the chimera chain to lose atomicity, more than 1/3 of S_A and more than 1/3 of S_B would have to back unsafe acceptors.
      • When a batch loses atomicity, the transactions on one state machine (say, A) are executed, but not the transactions on the other state machine (say, B). However, each state machine itself remains consistent: neither A nor B forks.
      • This means some chimera chains offer atomicity with lower (yet well-defined) levels of integrity than their base chains' no-fork guarantees.
  • A proposer is an acceptor that may propose a new block according to the rules of the blockchain. For Typhon, the "blocks" of the consensus protocol are the headers produced by the mempool. A potential proposer would need (a) data availability, (b) the ability to sign messages, (c) something at stake (to prevent spam), and (d) the ability to communicate with the acceptors. Acceptors that are in the overlaps of quorums may be especially well suited to be proposers, but other acceptors (or even other machines) might be proposers as well. The Heterogeneous Paxos technical report effectively uses weighted voting for proposer selection, but perhaps there are interesting tricks with VRFs that would improve efficiency.
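
A minimal sketch of checking the overlap condition from the example above, assuming quorums are given explicitly as lists of acceptor identities with a per-chain stake map (all names here are illustrative):

use std::collections::HashMap;

type AcceptorId = u64;

/// Total stake (per the given stake map) backing the acceptors common
/// to both quorums.
fn overlap_stake(
    quorum_a: &[AcceptorId],
    quorum_b: &[AcceptorId],
    stake: &HashMap<AcceptorId, u64>,
) -> u64 {
    quorum_a
        .iter()
        .filter(|&a| quorum_b.contains(a))
        .map(|a| stake.get(a).copied().unwrap_or(0))
        .sum()
}

/// Atomicity with base-chain-level integrity requires every pairwise
/// overlap to be backed by more than 1/3 of each chain's total stake.
fn overlaps_sufficient(
    quorums_a: &[Vec<AcceptorId>],
    quorums_b: &[Vec<AcceptorId>],
    stake_a: &HashMap<AcceptorId, u64>,
    total_a: u64,
    stake_b: &HashMap<AcceptorId, u64>,
    total_b: u64,
) -> bool {
    quorums_a.iter().all(|qa| {
        quorums_b.iter().all(|qb| {
            3 * overlap_stake(qa, qb, stake_a) > total_a
                && 3 * overlap_stake(qa, qb, stake_b) > total_b
        })
    })
}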

Assumptions

  1. Acceptors are staked: An acceptor has a certain amount of stake backing them. This stake is either fully their own stake or is partly delegated to them by other token holders.
  2. Quorums are known: Quorums are defined implicitly by the internal logic of each chain. For most proof-of-stake chains, a quorum is a subset of acceptors that have more than 2/3 of the stake backing them.
  3. Large Overlap of Quorums: A practical way to ensure safety is to have safe acceptors in the overlap between quorums. To guarantee atomicity with the same integrity as the base chains, each quorum overlap must be backed by more than 1/3 of the stake of each base chain.
  4. Connectivity: All acceptors in a chimera chain communicate with each other (even if that means talking to acceptors who are not on the same base chain). This communication can be indirect (i.e. gossiping through other acceptors), so long as all pairs of honest acceptors can always communicate.
  5. Only one learner per main chain: We model the learner graph as one learner per main chain; full nodes can then instantiate their base chain's learner to make decisions. The reason is that in Heterogeneous Paxos the learner graph is a closed graph with known nodes, while in the blockchain context we have an open network where we don't know who all the learners are.

Chimera Chain

In this section we describe how chimera chains operate.

Upon wanting to include an atomic batch of transactions from the transaction pool in a block, a block proposer either creates a genesis block (if there is no existing chimera chain) or builds on top of an existing chimera chain.

Genesis block

In order to be safe, the guarantee we want here is that future quorum updates on the base chains are received before voting happens on the chimera chain.

  1. Create a genesis block that claims to be a "chimera chain" and allocate some unique name or index.
  2. Register (in a transaction) that genesis block on all base chains, so that they know to involve it in any future quorum updates.
  3. The first block append on the chimera chain requires IBC messages from all base chains explaining that they have registered this genesis block.

Producing Blocks

A block consists of transactions from both chains, or from just one of the chains. Transactions can be bundled together to make sure they are carried out atomically.

It is possible to decouple block production from finalization, so that more than one block may be finalized at a time, as in GRANDPA (from the Polkadot project). In this case, Heterogeneous Paxos would serve the role of a "finality gadget" in GRANDPA's terms, but blocks could be produced at any rate.

Moving Objects

Suppose state in each state machine is expressed in terms of objects (think object-oriented programming) with state, methods, and a location (each object is located on a chain). Our ideas about "movable objects" should be applicable to any program entities which feature:

  • a unique identifier
  • a permanent set of public methods (an API) callable by users, or other methods on other objects in the same state machine, which can compute and return values
  • a set of private methods callable only by other methods on this object
  • mutable state (which only methods can mutate)

These are much like "smart contracts," which also carry internal mutable state and public APIs much like methods. An example of an "object" is any sort of account, such as a multisignature account or a single-user account:

  • accounts usually have a unique identifier
  • public methods include "send money," which takes as input some signed authorization and deducts money from the account.
  • mutable state includes some value representing the amount of money in the account.

One might want to move an object to another state machine. For simplicity, let's suppose it's a state machine with the same quorums (on a different, possibly chimera, chain). This is not so different from various chains trying to get a lock on accounts (objects) and then performing atomic operations on them.

This "move" operation should require an IBC send, and then deleting the original object. Or instead of deleting the original object the original object can be made a pointer to the object which now lives primarily on the destination chain. On the destination state machine, an IBC receive should be followed by creating a new copy of the object. Any "object identifiers" found in pointers from other objects will have to be preserved, which could be tricky.

Permissions for moving Objects

We have to consider who is allowed to move which objects where. One way to do this is to make "move" simply a "private method" of any object: the programmer has to specifically program in any methods that transactions or other objects can call to move the object. The most straightforward private-method definition would allow anyone who controls the object (e.g., its owners) to give permission for moving it.

We may want to allow chains to specify limits on what objects can move onto that chain. For example, they may require that objects have a method that can move them back to a base chain.

Epoch Change

If new quorums for each base chain do not overlap on a real acceptor, atomicity cannot be guaranteed. We can no longer meaningfully commit atomic batches to the chimera chain. However, extensions to the chimera chain are consistent from the point of view of each base chain (but different base chains may "perceive" different extensions). This allows us to, for example, move objects to base chains even after the chimera chain loses atomic guarantees.

Kill Chains

Likewise, it's probably useful to kill chimera chains no one is using anymore. If we don't kill them, then whenever quorums change, all chimera chains need to get an update, and that can become infeasible. One possibility is to allow chains with no objects on them (everything has moved out) to execute a special "kill" transaction that prohibits any future transactions and sends an IBC message to all parent chains. On receipt, parent chains could remove that chimera chain from their registry.

Protocol Description

This section describes how a chimera block is appended to the existing chimera chain assuming all the setup has taken place.

For simplicity, this protocol is written as if we're deciding on one thing: the specific block that belongs on a specific blockchain at a specific height. All state and references are specific to this blockchain / height pair. We do not care about messages before this height. This height may be defined using the last finalized block in the blockchain.

Learner Graph

For each learner L, we call its set of quorums Q_L.

For each pair of learners L_A and L_B, there is a specific set of conditions under which they require agreement. We enumerate these as Safe(L_A, L_B), a set of "safe sets" of acceptors: whenever at least one set in Safe(L_A, L_B) is composed entirely of safe (non-Byzantine, but possibly crashed) acceptors, L_A and L_B must not decide different things.

We designate the set of acceptors who are actually safe as S. This is of course determined only at runtime, and unknown to anyone.

Messages

The protocol is carried out by sending and receiving messages. The table below describes the structure of a typical Heterogeneous Paxos message.

Field     | Type                       | Description
----------|----------------------------|------------
chainId   | Id                         | Identifies this chimera chain.
height    | uint64                     | Current height of the chimera chain.
pktType   | enum PktType {1A, 1B, 2A}  | The kind of message: 1a, 1b, or 2a.
ballot    | Ballot = (Timestamp, Hash) | The hash of the proposed block and the timestamp of the initiated ballot. There is a total lexicographical order on Ballots.
learnerId | Id                         | Identifies the learner this message is labeled with.
sig       | Sig                        | Unforgeably identifies the sender, e.g. the message is signed and the sender has a known PK in some PKI.
refs      | Vec<Hash>                  | The list of hashes of messages received previously.
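
Rendered as a struct, in the style of the storage definitions later in this document; the Rust types and variant names below are illustrative assumptions (Rust identifiers cannot begin with digits, so the packet kinds are spelled out):

type Id = u64;
type Hash = [u8; 32];
type Sig = Vec<u8>;
type Timestamp = u64;

/// The three packet kinds (1a, 1b, 2a), spelled out as identifiers.
enum PktType { OneA, OneB, TwoA }

/// Ballots are ordered lexicographically: first by timestamp, then by
/// the hash of the proposed block.
type Ballot = (Timestamp, Hash);

/// A typical Heterogeneous Paxos message, following the table above.
struct Message {
    chain_id: Id,      // identifies this chimera chain
    height: u64,       // current height of the chimera chain
    pkt_type: PktType, // 1a, 1b, or 2a
    ballot: Ballot,    // hash of the proposed block + ballot timestamp
    learner_id: Id,    // the learner this message is labeled with
    sig: Sig,          // unforgeable signature identifying the sender
    refs: Vec<Hash>,   // hashes of messages previously sent/received
}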

In general, acceptors relay all sent or received messages to all learners and other acceptors. This ensures that any message received by a real acceptor is eventually received by all real acceptors and learners.

Definition: Message Signer

The signer Sig(m) of a message m is the acceptor or proposer whose signature is carried in m's sig field.

Definition: Message Set Signers

We can define Sig over sets of messages, to mean the set of signers of those messages: Sig(S) = { Sig(m) | m in S }.

Messages also contain a field refs, which includes chained hashes of every message the sender has sent or received since (and including) the last message it sent. As a result, we can define the transitive references of a message, which should include every message the sender has ever sent or received:

Definition: Transitive References

The transitive references Tran(m) of a message m consist of m itself together with the transitive references of every message in m's refs field: Tran(m) = {m} ∪ (the union of Tran(m') for each m' referenced in m.refs).

To ensure that acceptors and learners fully understand each message they receive, they delay doing any computation on it (sometimes called delivery) until they have received all the messages in its refs. As a result, acceptors and learners always process messages from any given sender in the order they were sent, and likewise the messages that sender received, recursively.
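
A sketch of this buffering rule, assuming messages are identified by hash (the details here are illustrative):

use std::collections::{HashMap, HashSet};

type Hash = [u8; 32];

struct Inbox {
    /// Messages already delivered (all their refs were delivered first).
    delivered: HashSet<Hash>,
    /// Messages received but not yet deliverable, keyed by hash,
    /// holding the refs they are still waiting on.
    pending: HashMap<Hash, Vec<Hash>>,
}

impl Inbox {
    /// Record a received message; deliver it, and anything it unblocks,
    /// once every referenced message has been delivered.
    fn receive(&mut self, hash: Hash, refs: Vec<Hash>) -> Vec<Hash> {
        self.pending.insert(hash, refs);
        let mut newly_delivered = Vec::new();
        loop {
            let delivered = &self.delivered;
            let ready: Vec<Hash> = self
                .pending
                .iter()
                .filter(|(_, rs)| rs.iter().all(|r| delivered.contains(r)))
                .map(|(h, _)| *h)
                .collect();
            if ready.is_empty() {
                return newly_delivered;
            }
            for h in ready {
                self.pending.remove(&h);
                self.delivered.insert(h);
                newly_delivered.push(h);
            }
        }
    }
}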

Consensus Round: Ballot

Next, we briefly describe how the communication for a consensus round works. Consensus is reached in four steps: proposing the chimera block, acknowledging receipt of proposals, establishing consensus, and termination.

Suppose we have two learners L_A and L_B from two different blockchains.

1a message: proposing blocks

A proposer proposes a new block by sending a 1a message to all acceptors, which includes

  • a value (the atomic batch of transactions, for example)
  • a unique ballot number (round identifier), consisting of the hash of the proposed block along with a timestamp of the ballot initiation

If there is an existing chimera chain, the proposer can build upon that chain; if the proposer is starting a new chimera chain, it needs to lock some funds for that.

Also, acceptors need to check validity as soon as possible: they should not even "receive" an invalid proposal (or at least should not send a 1b message in response).

1b message: acknowledging receipt of proposals

On receipt of a 1a message, an acceptor sends an acknowledgement of its receipt to all other acceptors and learners in the form of a 1b message.

2a message: establishing consensus

When an acceptor receives 1b messages for the highest ballot number it has seen from a learner L's quorum of acceptors, it sends a 2a message labeled with L and that ballot number.

There is one exception: once a safe acceptor sends a 2a message for a learner L_A, it never sends a 2a message with a different value for a learner L_B, unless one of the following is true:

  • It knows that a quorum of acceptors has seen 1b messages with learner L_B and a ballot number higher than that of its 2a message.
  • It has seen Byzantine behavior that proves L_A and L_B do not have to agree.

Specifics of establishing Consensus

In order to make a learner decide, we need to show that another, entangled learner could not already have decided a different value.

Definition: Entangled

In an execution, two learners are entangled if their failure assumptions match the failures that actually happen: learners L_A and L_B are entangled when at least one safe set in Safe(L_A, L_B) consists entirely of acceptors that are actually safe.

If some learner L_A does not agree with some other learner L_B, then a third learner L_C cannot possibly agree with both L_A and L_B.

Definition: Heterogeneous Agreement
  • Within an execution, two learners have agreement if all decisions for either learner have the same value.
  • A heterogeneous consensus protocol has agreement if, for all possible executions of that protocol, all entangled pairs of learners have agreement.
Definition: Accurate Learner

An accurate learner is entangled with itself: at least one safe set in Safe(L, L) consists entirely of acceptors that are actually safe.

A learner whose quorums contain too many failures is inaccurate. This is the equivalent of a chain that can fork.

In order to prevent entangled disagreement, we must define the conditions that will ultimately make learners decide:

Definition: Get1a

It is useful to refer to the 1a that started the ballot of a message x: Get1a(x) is the 1a with the highest ballot number among x's transitive references.

Definition: Ballot Numbers

The ballot number of a 1a is part of the message, and the ballot number of anything else is the highest ballot number among the 1as it (transitively) references.

Definition: Value of a Message

The value of a 1a is part of the message, and the value of anything else is the value of the 1a with the highest ballot among the messages it (transitively) references.

Terminate: Finalizing blocks

A learner decides when it receives 2a messages with the same ballot number from one of its quorums of acceptors.

If no decision can be reached within a certain time, proposers must begin a new round (with a higher timestamp, and thus a higher Ballot). Proposers can start a new round by proposing a new block or by trying to finalize the same block again (in case there was no consensus).

Definition: Decision

A learner decides when it has observed a set of 2a messages with the same ballot, sent by one of its quorums of acceptors. This allows the learner to decide on the value of the messages in the set. We call such a set a decision.
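
A sketch of this decision check, assuming 2a messages expose their sender and ballot, and that a quorum is given as a set of acceptor identities (all names are illustrative):

use std::collections::HashSet;

type AcceptorId = u64;
type Ballot = (u64, [u8; 32]); // (timestamp, block hash)

struct TwoA {
    sender: AcceptorId,
    ballot: Ballot,
}

/// A learner decides once every acceptor in one of its quorums has sent
/// a 2a for the same ballot (and therefore the same value).
fn is_decision(
    messages: &[TwoA],
    ballot: Ballot,
    quorum: &HashSet<AcceptorId>,
) -> bool {
    let signers: HashSet<AcceptorId> = messages
        .iter()
        .filter(|m| m.ballot == ballot)
        .map(|m| m.sender)
        .collect();
    quorum.is_subset(&signers)
}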

Now we are ready to discuss what makes a message well-formed. This requires considering whether two learners might be entangled, and (unless we can prove they are not entangled) whether one of them might have already decided:

Definition: Caught

Some behavior can create proof that an acceptor is Byzantine. Unlike Byzantine Paxos, our acceptors and learners must adapt to Byzantine behavior. We say that an acceptor a is Caught in a message if the transitive references of the message include evidence such as two messages, x and y, both signed by a, in which neither is featured in the other's transitive references (safe acceptors transitively reference all of their prior messages).

Slashing: Caught proofs can be used for slashing.

Definition: Connected

When some acceptors are proved Byzantine, clearly some learners need not agree, meaning the edge between them in the CLG no longer holds: at least one acceptor in each safe set in the edge is proven Byzantine. Such learners are no longer connected. Homogeneous learners are always connected, unless there are so many failures that no consensus is required.

Definition: Buried

A 2a message can become irrelevant if, after a time, an entire quorum of acceptors has seen 2as with different values, the same learner, and higher ballot numbers. We call such a 2a buried (in the context of some later message m).

Definition: Connected 2a Messages

Entangled learners must agree, but learners that are not connected are not entangled, so they need not agree. Intuitively, a 1b message references a 2a message to demonstrate that some learner may have decided some value. For a learner L, it can be useful to find the set of 2a messages from the same sender as a 1b message (and sent earlier) which are still unburied and are for learners connected to L. The 1b cannot be used to make any new 2a messages for learner L with values different from those 2a messages.

Definition: Fresh

Acceptors send a 1b message whenever they receive a 1a message with a ballot number higher than any they have yet seen. However, this does not mean that the 1b's value (which is the same as the 1a's) agrees with that of the 2a messages the acceptor has already sent. We call a 1b message fresh (with respect to a learner) when its value agrees with that of the unburied 2a messages the acceptor has sent.

Definition: Quorums in Messages

2a messages reference quorums of 1b messages with the same value and ballot. A 2a's quorums are formed from fresh 1b messages with the same ballot and value.

Definition: Well-Formed

We define what it means for a message to be well-formed.

An acceptor who has received a 1a sends a 2a for every learner for which it can produce a well-formed 2a.

Before processing a received message, acceptors and learners check whether the message is well-formed. Messages that are not well-formed are rejected.

Incentive Model

Goal:

  • Incentivizing token holders to put down their stake for security
  • Disincentivizing Byzantine behavior

Rewards:

  • Participating in consensus based on backing stake: this includes validating and voting
  • Producing blocks

Slashing: there are a number of offenses

  • Invalid blocks
  • Equivocation (caught)

Fees

The first question is: once there is no demand for atomic batches of transactions, do we keep the chimera chain alive?

We need to figure out whether killing the chimera chain (and potentially requiring a new genesis later) is too expensive.

The second question is: do we propagate quorum changes whenever they happen, or only when there is a transaction?

The first option is expensive and can lead to attacks if not handled well. We need to figure out whether the latter option is safe.

  • If the answer to the first question is yes, and we pick the first option for the second question:

Since all chimera chains need to be updated when the quorums on the base chains change, we need to figure out who pays for these updates. For example, a chimera chain might have had its last block produced a week ago, but since then the quorums may have been updated 200 times; that is 200 blocks that do not contain any transactions paying transaction fees. If this is not paid for by anyone, it becomes a burden for acceptors and an attack vector for the adversary.

We may need to add a "locked" fee for making new chimera chains. In particular, we don't want an attacker to make a lot of chains, forcing base chains to update all of them with each quorum change.

Alternatively, each "chimera chain" could keep an account on each base chain that is funded by a portion of transaction fees, from which a small fee is extracted with each validator update. When the account runs dry, the (parent)base chains are allowed to kill the chimera chain (even if there are still objects on there). We could allow anyone to contribute to these accounts. We could even prohibit anyone who did not contribute from sending IBC messages between chimera and parent chains.

Security Discussion

Note that the chimera chain cannot guarantee atomicity under the same adversary assumptions as the base chains. For example, if we assume the adversary controls less than 1/3 of the stake to assure safety on the base chains, atomicity is not guaranteed against such an adversary, but only against a weaker one. This is important for users, so that they can decide whether chimera chains are secure enough for their transactions.

By setting the quorums of each learner to be the same as the quorums of the corresponding base chain, we ensure that each learner's view is as consistent as the base chain's. Specifically, two instantiations of the learner for some base chain should decide on the same blocks in any chimera chain, unless the adversary is powerful enough to fork that base chain.

"Accurate" (or "self-engangled") learners (defined above) correspond to base chains where the adversary is in fact powerful enough to fork the chain. Proving that a learner is accurate is equivalent to proving that its base chain cannot fork.

Two learners L_A and L_B corresponding to different base chains will decide on the same blocks (which is what makes atomic batches useful) so long as one of the safe sets in Safe(L_A, L_B) is composed entirely of safe acceptors. The stake backing these safe sets represents the "assurance" that atomic batches remain atomic, as well as the maximum slashing punishment should they lose atomicity.

Loss of atomicity is a bit like a "trusted bridge" that turns out not to be trustworthy: each state machine within the chimera chain has as much integrity as its base chain, but the atomicity of multi-state-machine atomic batches has a lesser, though still well-defined, guarantee. Loss of atomicity allows double spending on the chimera chain. And while misbehavior has happened in such an attack, it is not trivial to slash the misbehaving acceptors, since according to each chain everything has been carried out correctly.

Open Challenges

Programming Model

Atomic Batches

We'll need a way to specify that a batch of transactions should be committed together or not at all. Ideally, this should communicate to the programmer how reliable this atomicity is (see "practical considerations" below). On a chimera chain, batches can include transactions from any of its "main chains". If we want to have match-making, transactions will need to be able to say something like "if I'm in an atomic batch that matches these criteria, then do this...".

Each atomic batch should be scheduled within one block.

(We encode transactions with Protobuf and Borsh.) We need to define structures such that transactions can be bundled and cannot be carried out separately.

Atomic Communication

Transactions within an atomic batch need to be able to send information to each other (and back). For example, in a market with a fluctuating exchange rate, a market transaction could send a message to an account, which transfers money, and sends a message to another account, which transfers goods. We need a language in which this communication takes place with minimal constraints on the state machines involved, so we should probably adapt the IBC communication language.

We need to figure out how inter-chain communication works for transactions communicating with each other within an atomic batch.

Can we have synchronous (in terms of blocks) IBC?

Yes: when the quorums involved in two chains are the same, we can guarantee that (unless there are sufficient failures to fork the chain) an IBC message sent on one chain will arrive at the other chain (in a transaction) within a specific number (perhaps as few as 1) of blocks. This is because some of the machines in the quorum that approved the "send" transaction must be in the quorum that decides the next block, so they can refuse to decide on blocks that lack the "receive" transaction.

Note that we may need to rate-limit sends to ensure that all sends can be received in time (availability).

Note also that blinding transactions makes this hard.

Match Making

We can in principle do cross-chain match-making. If we want an on-chain market, a chimera chain might be a good place for it. However, full nodes in the gossip layer might be able to gather sets of transactions that match the transactions' "If I'm in an atomic batch ..." criteria, bundle them into atomic batches, and then send those off to be committed.

We may want to incorporate some incentive scheme for good match-making. Matchmakers include nodes who are full nodes on both chains, and in principle could include any node who sees the request transactions.

Changing Quorums?

2 Phase Commit

We could require that any quorum-changing transaction has to be two-phase committed. Essentially, the "main chain" announces that it won't progress beyond a certain block until everyone has appended a new block that sets the (same) new quorums and sends a response by IBC. It can then progress only once it has IBC responses from all the other chains that use these quorums.

Synchronous IBC

We may be able to leverage our "synchronous" IBC idea above for faster quorum changes. The difficulty is that it allows a finite number of blocks to be appended to the chimera chains before they receive the quorum change message. These chains can be arbitrarily slow, so that could take arbitrarily long.

We also need to figure out inter-machine communication for acceptors, since they might run many machines.


Discussion Questions / Practical Considerations

  • Optimizing messaging: pipelining (from HotStuff), Kauri, BFTree
  • Replicating state machines
  • Problems Tendermint has:
    • Doesn't allow many validators
    • Light client design
  • Optimizing recovery from slow finalization: separating block production from finalization, finalizing more than one block
  • ABCI++? Another version of it?
  • Look into other papers by Dalia Malkhi / Fast-HotStuff?

Execution Engine

Summary

Given a total order (from the consensus) of transactions (from the mempool), the execution engine updates and stores the "current" state of the virtual machine, using as much concurrency as possible. Proofs from the execution engine allow light clients to read the current state. When the execution engine has finished with a transaction, it communicates to the mempool that the transaction can be garbage-collected from storage.

Vocabulary

  • Shards are processes that store and update state.

    • Different shards may be on different machines. Redistributing state between shards is called Re-Sharding.
    • Each Shard is specific to 1 learner. However, as an optimization, an implementation could conceivably use 1 process to do the work of 2 shards with different learners so long as those shards are identical, and fork that process if / when the learners diverge.
  • Executors are processes that actually run the VM and compute updates. Executors should probably be co-located with shards.

Either:

  • We assume the Mempool is using the Heterogeneous Narwhal setup,
    • in which case Consensus picks leaders in the DAG; or
  • the Mempool is treated as some kind of black-box set of processes that can each transmit transactions to Shards,
    • in which case Consensus produces more detailed ordering information. Perhaps we should have some general notion of Timestamp on transactions?

The VM is largely a black box: we assume we can give it a set of input key-value pairs and a transaction, and get output key-value pairs.

State

State is stored as mutable Values (unlimited size blobs of binary), each of which is identified with an immutable Key. If you want to mutate a Key associated with a specific Value, that's equivalent to deleting the Value associated with the old Key, and writing it to the new Key. Keys that have never had a Value written to them are mapped to an empty value.

For each Learner, all processes can map Transactions to a set of Shards whose state they read, and a set of shards whose state they write. This makes Re-Sharding challenging.

One way to implement this is to partition the space of Keys across Shards, and label each Transaction with the sub-space of Keys it touches. One possible key-space layout would arrange Keys in some kind of tree configuration.
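
A sketch of such a partition, assuming each shard owns a contiguous range of the key space (the boundary representation here is an assumption for illustration):

type Key = Vec<u8>;
type ShardId = u32;

/// Partition of the key space into contiguous ranges, one per shard:
/// keys below `bounds[0]` belong to shard 0, keys in the range
/// `bounds[i-1] .. bounds[i]` belong to shard i, and so on.
struct KeyPartition {
    bounds: Vec<Key>, // sorted ascending
}

impl KeyPartition {
    /// The shard owning `key` is the number of boundaries at or below it.
    fn shard_for(&self, key: &[u8]) -> ShardId {
        self.bounds
            .iter()
            .take_while(|b| b.as_slice() <= key)
            .count() as ShardId
    }
}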

Mempool Interface

We assume, for each Learner, that each transaction has a unique Executor. It would be more efficient if the Executor is co-located with one of the Shards whose state the transaction reads or writes. As an optimization, we can have one process do the work of multiple learners' executors, so long as those learners are identical.

We assume that each transaction carries a timestamp. These timestamps have a total order that is not known in advance, and Consensus and the Mempool can update Shards' knowledge of this total order (see the Consensus Interface below).

The Mempool transmits each transaction to its Executor as soon as possible, using network primitives. Then, for each transaction that reads or writes state on a Shard, the Mempool also transmits that transaction's Summary (its timestamp and the keys it can read and write) to that Shard.

We assume that each Shard maintains timestamp bounds below which it will no longer receive new transactions: a bound t_R below which it will no longer receive new transactions that read from its state, and a bound t_W below which it will no longer receive new transactions that write to its state. Generally, we expect that t_R <= t_W, but I don't know that we require this to be true. The Shard should update these bounds based on information from the Mempool. For example, it could maintain partial bounds from each mempool worker (updated whenever that mempool worker sends the Shard a message), and implement t_R and t_W as the greatest lower bound of all the partial bounds.
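
A sketch of maintaining such a bound as the greatest lower bound over per-worker watermarks; the names t_R / t_W follow the text above, while the watermark representation is an assumption:

use std::collections::HashMap;

type WorkerId = u64;
type Timestamp = u64;

/// Per-mempool-worker partial bounds: each worker promises not to send
/// this shard any new transaction with a timestamp below its watermark.
struct Watermarks {
    per_worker: HashMap<WorkerId, Timestamp>,
}

impl Watermarks {
    /// Update a worker's watermark (watermarks only ever advance).
    fn update(&mut self, worker: WorkerId, watermark: Timestamp) {
        let w = self.per_worker.entry(worker).or_insert(0);
        *w = (*w).max(watermark);
    }

    /// The shard's bound (t_R or t_W): the greatest lower bound of all
    /// partial bounds, i.e. the minimum watermark over all workers.
    fn bound(&self) -> Timestamp {
        self.per_worker.values().copied().min().unwrap_or(0)
    }
}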

Consensus Interface

Consensus needs to update each Shard's knowledge of the total order of timestamps. In particular, we assume that Consensus informs shards of an ever-growing prefix of this total order.

  • One way to accomplish this is simply to have each timestamp be the hash of the transaction, and have consensus stream a totally ordered list of all hashes included in the chain to all Shards. This may not be very efficient.
  • We could instead consider one timestamp to be definitely after another if it is a descendent in the Narwhal DAG. Narwhal workers could transmit DAG information to Shards, and shards would learn some partial ordering information before hearing from Consensus. Consensus could then transmit only the Narwhal blocks it decides on, and shards could determine a total ordering from there.

Execution

For each Learner, for each transaction t, t's Executor waits to receive values for all keys that t can read, then computes the transaction, and transmits to each relevant Shard the values written to any keys stored on that Shard.

Generally, transactions do not have side effects outside of state writes. However, we could in principle encode client reads as read-only transactions whose side-effect is sending a message, or allow for VMs with other side effects.

Executors can save themselves some communication if they're co-located with Shards. As an optimization, we can save on communication by combining messages for multiple learners if their content is identical and their shards are co-located. Likewise, we can save on computation by using one process to execute for multiple learners so long as they are identical.

State Updates

For each key in its state, each Shard needs to establish a total order of all writes between t_R and t_W. Reads to each key need to be ordered with respect to writes to that key.

To accomplish this, each Shard maintains a Dependency Multi-Graph of all the transaction Summaries it has received, where Summary a depends on Summary b if the Shard does not yet know that a is ordered before b, and a can read from a key to which b can write. Specifically, if the Shard does not know that a is ordered before b, then for each key k that b can write to and a can read or write, create an edge between a and b labeled with k. There can be cycles in the dependency multi-graph, but these will resolve as the Shard learns more about the total order from consensus.

Concurrently, for any Summary s that no longer depends on any other Summary:

  • transmit to s's Executor the values written most recently before s's timestamp for any keys on this Shard that s can read
  • upon receiving from s's Executor the values written for any keys on this Shard:
    • record that each value is written to its key at s's timestamp.
    • delete the edges labeled with those keys from the dependency graph.
    • As an optimization, we may want a compact "don't change this value" message.
  • When every value in s's write set has been updated, delete s from the dependency graph.

Note that read-only transactions can arrive with timestamps before t_W. These need to be added to the dependency graph and processed just like all other transactions.

Garbage Collection

Each Shard can delete all but the most recent value written to each key before t_R.

Once all of a transaction's Executors (for all learners) have executed the transaction, we can garbage collect it. We no longer need to store that transaction anywhere.

Client Reads

Read-only transactions can, in principle, bypass the Mempool and Consensus altogether: they only need to arrive at each of the relevant Shards, and to have a timestamp greater than t_R. They could also be executed with a side effect, like sending a message to a client.

We can use these read-only transactions to construct checkpoints: Merkle roots of portions of state, building up to a Merkle root of the entire state.

Light client reads only need some kind of signed message produced by an Executor from each of a weak quorum of validators. They do not, technically, need a Merkle root of the entire state at all. However, it may be more efficient to get a single signed message with a Merkle root of state, so that only one validator needs to do the read-only transaction. To support this kind of thing, we may want t_R to lag well behind t_W, so we can do reads on recent checkpoints.

Consensus

Namada uses Tendermint Go through the tendermint-rs bindings.

Namada uses the P2P layer built into Tendermint Go.

Economics

Proof of Stake (PoS)

This section of the specification describes the proof-of-stake mechanism of Namada, which is largely modeled after Cosmos bonded proof-of-stake, but makes significant changes to bond storage representation, validator set change handling, reward distribution, and slashing, with the general aims of increased precision in reasoning about security, validator decentralisation, and avoiding unnecessary proof-of-stake-related transactions.

This section is split into three subcomponents: the bonding mechanism, reward distribution, and cubic slashing.

Introduction

Blockchain systems rely on economic security to prevent abuse and to get actors to behave according to the protocol. The aim is that economic incentives promote correct and long-term operation of the system, while economic punishments discourage deviating from correct protocol execution, whether by mistake or with the intent of carrying out attacks. Many PoS blockchains rely on the 1/3 Byzantine rule, where they make the assumption that the adversary controls less than 1/3 of the total stake or 1/3 of the actors.

Goals of Rewards and Slashing: Liveness and Security

  • Security: Delegation and Slashing: we want to make sure validators are backed by enough funds that misbehaviour is very expensive. Security is achieved by punishing (slashing) validators if they misbehave. Slashing locked funds (stake) is intended to disincentivize deviating from correct execution of the protocol, which in this case is voting to finalize valid blocks.
  • Liveness: Paying Rewards. For continued operation of Namada we want to incentivize participation in consensus and delegation, which helps security.

Security

In blockchain systems we do not rely on altruistic behavior but rather on economic security. We expect validators to execute the protocol correctly; they get rewarded for doing so and punished otherwise. Each validator has some self-stake and some stake that is delegated to it by other token holders. The validator and its delegators share the rewards and the risk of the slashing impact with each other.

The total stake behind consensus should be taken into account when value is transferred via a transaction: the total value transferred cannot exceed 2/3 of the total stake. For example, if we have 1 billion tokens, we aim for 300 million of these tokens to be backing validators. This means that users should not transfer more than 200 million of these tokens within a block.

Bonding mechanism

Epoch

An epoch is a range of blocks or time that is defined by the base ledger and made available to the PoS system. This document assumes that epochs are identified by consecutive natural numbers. All the data relevant to PoS are associated with epochs.

Epoched data

Epoched data are data associated with a specific epoch that are set in advance. The data relevant to the PoS system in the ledger's state are epoched, and each datum can be uniquely identified. These include the system parameters, each validator's consensus key and state, validators' total bonded tokens, bonds, unbonds, the validator sets, and the total voting power.

Changes to the epoched data do not take effect immediately. Instead, changes made in epoch n are queued to take effect in epoch n + pipeline_length in most cases, and in epoch n + unbonding_length for unbonding actions. Should the same validator's data or the same bonds (i.e. with the same identity) be updated more than once in the same epoch, the later update overrides the previously queued-up update. For bonds, the token amounts are added up. Once epoch n has ended, the queued-up updates for epoch n + pipeline_length are final and the values become immutable.

Entities

  • Validator: An account with a public consensus key, which may participate in producing blocks and governance activities. A validator may not also be a delegator.
  • Delegator: An account that delegates some tokens to a validator. A delegator may not also be a validator.

Additionally, any account may submit evidence for a slashable misbehaviour.

Validator

A validator must have a public consensus key. Additionally, it may also specify optional metadata fields (TBA).

A validator may be in one of the following states:

  • inactive: A validator is not being considered for block creation and cannot receive any new delegations.
  • candidate: A validator is considered for block creation and can receive delegations.

For each validator (in any state), the system also tracks total bonded tokens as the sum of the tokens in their self-bonds and delegated bonds. The total bonded tokens determine the validator's voting power by multiplication with the votes_per_token parameter. The voting power is used in selecting validators for block creation and in governance-related activities.

Validator actions

  • become validator: Any account that is not a validator already and that doesn't have any delegations may request to become a validator. It is required to provide a public consensus key and staking reward address. For the action applied in epoch n, the validator's state will be set to candidate for epoch n + pipeline_length and the consensus key is set for epoch n + pipeline_length.
  • deactivate: Only a validator whose state at or before the pipeline_length offset is candidate may deactivate. For this action applied in epoch n, the validator's state is set to become inactive in epoch n + pipeline_length.
  • reactivate: Only an inactive validator may reactivate. Similarly to become validator action, for this action applied in epoch n, the validator's state will be set to candidate for epoch n + pipeline_length.
  • self-bond: A validator may lock-up tokens into a bond only for its own validator's address.
  • unbond: Any self-bonded tokens may be partially or fully unbonded.
  • withdraw unbonds: Unbonded tokens may be withdrawn in or after the unbond's epoch.
  • change consensus key: Set the new consensus key. When applied in epoch n, the key is set for epoch n + pipeline_length.

Active validator set

From all the candidate validators, in each epoch the ones with the most voting power, limited up to the max_validator_slots parameter, are selected for the active validator set. The active validator set selected in epoch n is set for epoch n + pipeline_length.

Delegator

A delegator may have any number of delegations to any number of validators. Delegations are stored in bonds.

Delegator actions

  • delegate: An account which is not a validator may delegate tokens to any number of validators. This will lock-up tokens into a bond.
  • undelegate: Any delegated tokens may be partially or fully unbonded.
  • withdraw unbonds: Unbonded tokens may be withdrawn in or after the unbond's epoch.

Bonds

A bond locks up tokens from a validator's self-bonding or from a delegator's delegation. For self-bonding, the source address is equal to the validator's address, and only validators can self-bond. For a bond created from a delegation, the bond's source is the delegator's account.

For each epoch, bonds are uniquely identified by the pair of source and validator's addresses. A bond created in epoch n is written into epoch n + pipeline_length. If there already is a bond in the epoch n + pipeline_length for this pair of source and validator's addresses, its tokens are incremented by the newly bonded amount.

Any bonds created in epoch n increment the bond's validator's total bonded tokens by the bond's token amount and update the voting power for epoch n + pipeline_length.

The tokens put into a bond are immediately deducted from the source account.
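
As a concrete illustration of this bookkeeping, here is a minimal sketch of the bond write at the pipeline offset, using plain maps in place of the epoched storage described under Storage below; all names and types here are assumptions, not the actual implementation.

use std::collections::HashMap;

type Address = String;
type Epoch = u64;
type Tokens = u64;

/// Bonds keyed by (source, validator), then by the epoch in which the
/// bonded tokens start counting.
type Bonds = HashMap<(Address, Address), HashMap<Epoch, Tokens>>;

/// Apply a bond created in `current_epoch`: the tokens are written at
/// the pipeline offset, incrementing any existing bond for the same
/// (source, validator) pair and target epoch.
fn bond(
    bonds: &mut Bonds,
    source: Address,
    validator: Address,
    amount: Tokens,
    current_epoch: Epoch,
    pipeline_length: u64,
) {
    let target = current_epoch + pipeline_length;
    *bonds
        .entry((source, validator))
        .or_default()
        .entry(target)
        .or_insert(0) += amount;
    // The source account is debited immediately (not shown), and the
    // validator's total bonded tokens and voting power for `target`
    // would be updated accordingly.
}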

Unbond

An unbonding action (validator unbond or delegator undelegate) requested by the bond's source account in epoch n creates an "unbond" with its epoch set to n + unbonding_length. We also store the epoch of the bond(s) from which the unbond is created, in order to determine whether the unbond should be slashed if a fault occurred within the range from the bond epoch (inclusive) to the unbond epoch (exclusive).

Any unbonds created in epoch n decrement the bond's validator's total bonded tokens by the unbond's token amount and update the voting power for epoch n + unbonding_length.

An "unbond" with epoch set to n may be withdrawn by the bond's source address in or any time after the epoch n. Once withdrawn, the unbond is deleted and the tokens are credited to the source account.

Note that unlike bonding and unbonding, where token changes are delayed to some future epoch (the pipeline or unbonding offset), token withdrawal applies immediately. This is because when the tokens are withdrawable, they are already "unlocked" from the PoS system and do not contribute to voting power.

Staking rewards

Until we have programmable validity predicates, rewards can use the mechanism outlined in the F1 paper, but it should use the exponential model, so that withdrawing rewards more frequently provides no additional benefit (this is a design constraint we should follow in general, we don't want to accidentally encourage transaction spam). This should be written in a way that allows for a natural upgrade to a validator-customisable rewards model (defaulting to this one) if possible.

To a validator who proposed a block, the system rewards tokens based on the block_proposer_reward system parameter and each validator that voted on a block receives block_vote_reward.

Slashing

An important part of the security model of Namada is based on making attacking the system very expensive. To this end, the validator who has bonded stake will be slashed once an offence has been detected.

These are the types of offences:

  • Equivocation in consensus
    • voting: a validator has submitted two conflicting votes
    • block production: a block producer has created two different blocks for the same height
  • Invalidity:
    • block production: a block producer has produced an invalid block
    • voting: validators have voted on an invalid block

Unavailability is not considered an offense, but a validator who hasn't voted will not receive rewards.

Once an offence has been reported:

  1. Kicking out
  2. Slashing
  • Individual: once someone has reported an offence, it is reviewed by validators, and if confirmed the offender is slashed.
  • Cubic slashing: escalated slashing

Instead of absolute values, validators' total bonded token amounts and bonds' and unbonds' token amounts are stored as deltas (i.e. the change in quantity from a previous epoch), to allow distinguishing changes in different epochs, which is essential for determining whether tokens should be slashed. However, because slashes for a fault that occurred in epoch n may only be applied before the beginning of epoch n + unbonding_length, in epoch m we can sum all the deltas of total bonded token amounts, and of bonds and unbonds with the same source and validator, for epochs less than or equal to m - unbonding_length, into a single total bonded token amount, a single bond, and a single unbond record. This keeps the number of total bonded token amounts for a unique validator, and of bonds and unbonds for a unique pair of source and validator, bounded by a maximum number (equal to unbonding_length).

To disincentivize validator misbehaviour in the PoS system, a validator may be slashed for any fault it has committed. Evidence of misbehaviour may be submitted by any account for a fault that occurred in epoch n anytime before the beginning of epoch n + unbonding_length.

Valid evidence reduces the validator's total bonded token amount by the slash rate in and before the epoch in which the fault occurred. The validator's voting power must also be adjusted to the slashed total bonded token amount. Additionally, a slash is stored with the misbehaving validator's address and the relevant epoch in which the fault occurred. When an unbond is being withdrawn, we first look up whether any slash occurred within the range of epochs in which it was active, and if so, reduce its token amount by the slash rate. Note that bond and unbond amounts are not slashed until their tokens are withdrawn.

The invariant is that the sum of amounts that may be withdrawn from a misbehaving validator must always add up to the total bonded token amount.

System parameters

The default values that are relative to the epoch duration assume that an epoch lasts about 24 hours.

  • max_validator_slots: Maximum active validators, default 128
  • pipeline_len: Pipeline length in number of epochs, default 2 (see https://github.com/cosmos/cosmos-sdk/blob/019444ae4328beaca32f2f8416ee5edbac2ef30b/docs/architecture/adr-039-epoched-staking.md#pipelining-the-epochs)
  • unbonding_len: Unbonding duration in number of epochs, default 6
  • votes_per_token: Used in validators' voting power calculation, default 100‱ (1 voting power unit per 1000 tokens)
  • block_proposer_reward: Amount of tokens rewarded to a validator for proposing a block
  • block_vote_reward: Amount of tokens rewarded to each validator that voted on a block proposal
  • duplicate_vote_slash_rate: Portion of validator's stake that should be slashed on a duplicate vote
  • light_client_attack_slash_rate: Portion of validator's stake that should be slashed on a light client attack

Storage

The system parameters are written into storage to allow for changes to them. Additionally, each validator may record, under their sub-key, new parameter values that they wish to change to; these would override the system parameters when more than 2/3 of the voting power is in agreement on all the parameter values.

The validators' data are keyed by their addresses, conceptually:

type Validators = HashMap<Address, Validator>;

Epoched data are stored in the following structure:

struct Epoched<Data> {
  /// The epoch in which this data was last updated
  last_update: Epoch,
  /// Dynamically sized vector in which the head is the data for the epoch in which
  /// the `last_update` was performed and every consecutive array element is the
  /// successor epoch of the predecessor array element. For system parameters, 
  /// validator's consensus key and state, `LENGTH = pipeline_length + 1`. 
  /// For all others, `LENGTH = unbonding_length + 1`.
  data: Vec<Option<Data>>
}

Note that not all epochs will have data set, only the ones in which some changes occurred.

To look up a value for Epoched data with independent values in each epoch (such as the active validator set) in the current epoch n:

  1. let index = min(n - last_update, pipeline_length)
  2. read the data field at index:
    1. if there's a value at index return it
    2. else if index == 0, return None
    3. else decrement index and repeat this sub-step from 1.

To look up a value for Epoched data with delta values in the current epoch n:

  1. let end = min(n - last_update, pipeline_length) + 1
  2. sum all the values that are not None in the 0 .. end range bounded inclusively below and exclusively above

To update a value in Epoched data with independent values in epoch n with value new for epoch m:

  1. let shift = min(n - last_update, pipeline_length)
  2. if shift == 0:
    1. data[m - n] = new
  3. else:
    1. for i in 0 .. shift range bounded inclusively below and exclusively above, set data[i] = None
    2. rotate data left by shift
    3. set data[m - n] = new
    4. set last_update to the current epoch

To update a value in Epoched data with delta values in epoch n with value delta for epoch m:

  1. let shift = min(n - last_update, pipeline_length)
  2. if shift == 0:
    1. set data[m - n] = data[m - n].map_or(delta, |last_delta| last_delta + delta) (add the delta to the previous value, if any, otherwise use the delta as the value)
  3. else:
    1. let sum be the sum of all delta values at indices i in the range 0 .. shift (bounded inclusively below and exclusively above), setting data[i] = None for each such i
    2. rotate data left by shift
    3. set data[0] = data[0].map_or(sum, |last_delta| last_delta + sum)
    4. set data[m - n] = delta
    5. set last_update to the current epoch

The invariants for updates in both cases are that m - n >= 0 and m - n <= pipeline_length.
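
Putting these rules together, here is a minimal Rust sketch of both look-ups and both updates over the Epoched structure; it is illustrative only, assumes the invariants above hold, uses a generic offset in place of pipeline_length / unbonding_length, and elides error handling:

use std::cmp::min;

type Epoch = u64;

/// Mirrors the `Epoched` storage structure defined above.
struct Epoched<Data> {
  last_update: Epoch,
  data: Vec<Option<Data>>,
}

impl<Data: Copy + std::ops::Add<Output = Data>> Epoched<Data> {
  /// Look-up for data with independent values in each epoch: scan
  /// backwards from the bounded offset until a set value is found.
  fn get(&self, current_epoch: Epoch, offset: u64) -> Option<Data> {
    let mut index = min(current_epoch - self.last_update, offset) as usize;
    loop {
      if let Some(value) = self.data.get(index).copied().flatten() {
        return Some(value);
      }
      if index == 0 {
        return None;
      }
      index -= 1;
    }
  }

  /// Look-up for data with delta values: sum all deltas set up to the
  /// bounded offset for the current epoch.
  fn get_sum(&self, current_epoch: Epoch, offset: u64) -> Option<Data> {
    let end = (min(current_epoch - self.last_update, offset) as usize + 1).min(self.data.len());
    self.data[..end].iter().filter_map(|v| *v).reduce(|acc, v| acc + v)
  }

  /// Update for data with independent values: set `new` for epoch `m`
  /// while in epoch `n`, first shifting out entries older than `n`.
  fn update(&mut self, n: Epoch, m: Epoch, new: Data, offset: u64) {
    let shift = min(n - self.last_update, offset) as usize;
    if shift != 0 {
      for i in 0..shift {
        self.data[i] = None;
      }
      self.data.rotate_left(shift);
      self.last_update = n;
    }
    self.data[(m - n) as usize] = Some(new);
  }

  /// Update for data with delta values: deltas that fall out of the
  /// window are folded into the head so that their sum is preserved,
  /// then `delta` is recorded for epoch `m`.
  fn update_delta(&mut self, n: Epoch, m: Epoch, delta: Data, offset: u64) {
    let shift = min(n - self.last_update, offset) as usize;
    let idx = (m - n) as usize;
    if shift == 0 {
      self.data[idx] = Some(self.data[idx].map_or(delta, |last| last + delta));
    } else {
      let mut sum: Option<Data> = None;
      for i in 0..shift {
        if let Some(v) = self.data[i].take() {
          sum = Some(sum.map_or(v, |s| s + v));
        }
      }
      self.data.rotate_left(shift);
      if let Some(s) = sum {
        self.data[0] = Some(self.data[0].map_or(s, |h| h + s));
      }
      self.data[idx] = Some(delta);
      self.last_update = n;
    }
  }
}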

For the active validator set, we store all the active and inactive validators separately with their respective voting power:

type VotingPower = u64;

/// Validator's address with its voting power.
#[derive(PartialEq, Eq, PartialOrd, Ord)]
struct WeightedValidator {
  /// The `voting_power` field must be on top, because lexicographic ordering
  /// is based on the top-to-bottom declaration order, and in the
  /// `ValidatorSet` the `WeightedValidator`s need to be sorted by
  /// `voting_power`.
  voting_power: VotingPower,
  address: Address,
}

struct ValidatorSet {
  /// Active validator set with maximum size equal to `max_validator_slots`
  active: BTreeSet<WeightedValidator>,
  /// All the other validators that are not active
  inactive: BTreeSet<WeightedValidator>,
}

type ValidatorSets = Epoched<ValidatorSet>;

/// The sum of all active and inactive validators' voting power
type TotalVotingPower = Epoched<VotingPower>;

When any validator's voting power changes, we attempt to perform the following update on the ValidatorSet (a code sketch follows the list):

  1. let validator be the validator's address, power_before and power_after be the voting power before and after the change, respectively
  2. let power_delta = power_after - power_before
  3. let min_active = active.first() (active validator with lowest voting power)
  4. let max_inactive = inactive.last() (inactive validator with greatest voting power)
  5. find whether the validator is active, let is_active = power_before >= max_inactive.voting_power
    1. if is_active:
      1. if power_delta >= 0 || power_after >= max_inactive.voting_power, update the validator in the active set with voting_power = power_after
      2. else, remove the validator from active and insert it into inactive, then remove max_inactive.address from inactive and insert it into active
    2. else (!is_active):
      1. if power_delta <= 0 || power_after <= min_active.voting_power, update the validator in the inactive set with voting_power = power_after
      2. else, remove the validator from inactive and insert it into active, then remove min_active.address from active and insert it into inactive
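
A minimal Rust sketch of this update follows; it is illustrative only, assumes well-formed, non-empty active and inactive sets, and elides the Epoched wrapper:

use std::collections::BTreeSet;

type VotingPower = u64;
type Address = String; // stand-in for the real address type

#[derive(PartialEq, Eq, PartialOrd, Ord, Clone)]
struct WeightedValidator {
  voting_power: VotingPower,
  address: Address,
}

struct ValidatorSet {
  active: BTreeSet<WeightedValidator>,
  inactive: BTreeSet<WeightedValidator>,
}

impl ValidatorSet {
  fn update_power(&mut self, address: Address, power_before: VotingPower, power_after: VotingPower) {
    let max_inactive = self.inactive.last().cloned().expect("non-empty inactive set");
    let min_active = self.active.first().cloned().expect("non-empty active set");
    let old = WeightedValidator { voting_power: power_before, address: address.clone() };
    let new = WeightedValidator { voting_power: power_after, address };
    let is_active = power_before >= max_inactive.voting_power;
    if is_active {
      self.active.remove(&old);
      if power_after >= max_inactive.voting_power {
        // Still at or above the strongest inactive validator: stays active.
        self.active.insert(new);
      } else {
        // Swap with the strongest inactive validator.
        self.inactive.remove(&max_inactive);
        self.active.insert(max_inactive);
        self.inactive.insert(new);
      }
    } else {
      self.inactive.remove(&old);
      if power_after <= min_active.voting_power {
        // Still at or below the weakest active validator: stays inactive.
        self.inactive.insert(new);
      } else {
        // Swap with the weakest active validator.
        self.active.remove(&min_active);
        self.inactive.insert(min_active);
        self.active.insert(new);
      }
    }
  }
}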

Within each validator's address space, we store the public consensus key, state, total bonded token amount, and the voting power calculated from the total bonded token amount (even though the voting power is stored in the ValidatorSet, we also need it here because we cannot look it up in the ValidatorSet without iterating the whole set):

struct Validator {
  consensus_key: Epoched<PublicKey>,
  state: Epoched<ValidatorState>,
  total_deltas: Epoched<token::Amount>,
  voting_power: Epoched<VotingPower>,
}

enum ValidatorState {
  Inactive,
  Candidate,
}

The bonds and unbonds are keyed by their identifier:

type Bonds = HashMap<BondId, Epoched<Bond>>;
type Unbonds = HashMap<BondId, Epoched<Unbond>>;

struct BondId {
  validator: Address,
  /// The delegator address for delegations, or the same as the `validator`
  /// address for self-bonds.
  source: Address,
}

struct Bond {
  /// A key is the epoch at which the bond was set (i.e. became active).
  /// This is used in unbonding, where it's needed for the slash epoch
  /// range check.
  deltas: HashMap<Epoch, token::Amount>,
}

struct Unbond {
  /// A key is a pair of the epoch of the bond from which the unbond was
  /// created and the epoch of unbonding. This is needed for the slash
  /// epoch range check.
  deltas: HashMap<(Epoch, Epoch), token::Amount>
}
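
As an illustration of how these records interact, the following Rust sketch drains a bond's per-epoch deltas into an unbond keyed by the (bond start epoch, unbond epoch) pairs that the slash epoch range check later consults; the unbond function and the oldest-first draining order are illustrative assumptions, not part of the specification:

use std::collections::HashMap;

type Epoch = u64;
type Amount = u64;

/// Create an unbond of `amount` tokens in `unbond_epoch`, moving tokens
/// out of the bond's per-epoch deltas and recording each drained portion
/// under its (bond start epoch, unbond epoch) pair.
fn unbond(
  bond_deltas: &mut HashMap<Epoch, Amount>,
  unbond_deltas: &mut HashMap<(Epoch, Epoch), Amount>,
  unbond_epoch: Epoch,
  mut amount: Amount,
) {
  let mut epochs: Vec<Epoch> = bond_deltas.keys().copied().collect();
  epochs.sort();
  for epoch in epochs {
    if amount == 0 {
      break;
    }
    let available = bond_deltas.get_mut(&epoch).expect("key exists");
    let taken = amount.min(*available);
    *available -= taken;
    amount -= taken;
    *unbond_deltas.entry((epoch, unbond_epoch)).or_insert(0) += taken;
  }
}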

For slashes, we store the epoch and block height at which the fault occurred, the slash rate, and the slash type:

struct Slash {
  epoch: Epoch,
  block_height: u64,
  /// slash rate in ‱ (per ten thousand)
  rate: u16,
  r#type: SlashType,
}

Initialization

An initial validator set with self-bonded token amounts must be given on system initialization.

This set is used to pre-compute epochs in the genesis block from epoch 0 to epoch pipeline_length - 1.

Cubic slashing

Namada implements cubic slashing: the amount slashed is proportional to the cube of the total voting power committing infractions within a particular interval, since the slash rate scales with the square of that voting power and the slashed amount is the rate multiplied by the stake. This is designed to make it riskier to operate larger or similarly configured validators, and thus to encourage network resilience.

When a slash is detected:

  1. Using the height of the infraction, calculate the epoch just after which stake bonded at the time of infraction could have been fully unbonded. Enqueue the slash for processing at the end of that epoch (so that it will be processed before unbonding could have completed, and hopefully long enough for any other misbehaviour from around the same height as this misbehaviour to also be detected).
  2. Jail the validator in question (this will apply at the end of the current epoch). While the validator is jailed, it should be removed from the validator set (also being effective from the end of the current epoch). Note that this is the only instance in our proof-of-stake model when the validator set is updated without waiting for the pipeline offset.
  3. Prevent the delegators to this validator from altering their delegations in any way until the enqueued slash is processed.

At the end of each epoch, in order to process any slashes scheduled for processing at the end of that epoch:

  1. Iterate over all slashes for infractions committed within a window of (-1, +1) epochs' worth of block heights around the infraction in question (this window may need to be a protocol parameter).
  2. Calculate the slash rate according to the following function:
calculateSlashRate :: [Slash] -> Float

calculateSlashRate slashes =
    let totalFraction = sum [votingPowerFraction (validator slash) | slash <- slashes]
    in max 0.01 (min 1 (9 * totalFraction ** 2))
    -- minimum slash rate is 1%
    -- then quadratic between 0 & 1/3 voting power, saturating at 100%
    -- we can make this a more complex function later
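
For example, if a single validator holding 5% of the total voting power commits an infraction on its own, the slash rate is max(0.01, min(1, 9 · 0.05²)) = 2.25%; once validators holding 1/3 of the voting power misbehave together, the rate saturates at 100%.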

Note: The voting power of a slash is the voting power of the validator when they violated the protocol, not the voting power now or at the time of any of the other infractions. This does mean that these voting powers may not sum to 1, but this method should still be close to the incentives we want, and can't really be changed without making the system easier to game.

  3. Set the slash rate on the now "finalised" slash in storage.
  4. Update the validators' stored voting power appropriately.
  5. Delegations to the validator can now be redelegated / start unbonding / etc.

A validator can later submit a transaction to unjail itself after a configurable period. When the transaction is applied and accepted, the validator's state is updated to "candidate" and it is added back to the validator set (active or inactive, depending on its voting power) starting at the epoch at the pipeline offset.

At present, funds slashed are sent to the governance treasury. In the future we could potentially reward the slash discoverer with part of the slash, for which some sort of commit-reveal mechanism will be required to prevent front-running.

<span class="katex"><span class="katex-html" aria-hidden="true"></span></span>

Reward distribution

Namada uses the automatically-compounding variant of F1 fee distribution.

Rewards are given to validators for voting on finalizing blocks; the funds for these rewards can come from minting (creating new tokens). The amount minted depends on how much is staked and on the desired yearly inflation. When the total amount of tokens staked is very low, the rate of return per validator needs to increase, but as the total amount of stake rises, validators receive smaller rewards. Once the desired staking percentage is reached, the amount minted corresponds exactly to the desired yearly inflation.

Delegators pay a commission on their rewards to their validator, at a rate the two parties must have agreed upon beforehand. The minted rewards are auto-bonded and only transferred when the funds are unbonded. Once we have calculated the total to be minted at the end of the epoch, we split the minted tokens according to the stake that the relevant validators and delegators contributed, and distribute them to the validators and their delegators. This is similar to what Cosmos does.

Basic algorithm

Consider a system with

  • a canonical singular staking unit of account,
  • a set of validators $V_i$,
  • a set of delegations $D_{i,j}$, each to a particular validator and in a particular (initial) amount,
  • epoched proof-of-stake, where changes are applied as follows:
    • bonding after the pipeline length,
    • unbonding after the unbonding length,
    • rewards are paid out at the end of each epoch, to wit, in each epoch $e$ a reward $r_V(e)$ is paid out to validator $V$,
    • slashing is applied as described in slashing.

We wish to approximate as exactly as possible the following ideal delegator reward distribution system:

  • At each epoch, for a validator $V$, iterate over all of the delegations to that validator. Update each delegation $D$ as follows:

    $$D \rightarrow D \left(1 + \frac{r_V(e)}{s_V(e)}\right)$$

    where $r_V(e)$ and $s_V(e)$ respectively denote the reward and stake of validator $V$ at epoch $e$.
  • Similarly, multiply the validator's voting power by the same factor $\left(1 + \frac{r_V(e)}{s_V(e)}\right)$, which should now equal the sum of their revised-amount delegations.

In this system, rewards are automatically rebonded to delegations, increasing the delegation amounts and validator voting powers accordingly.

However, we wish to implement this without actually needing to iterate over all delegations each block, since this is too computationally expensive. We can exploit the fact that this multiplicative factor is constant across all of a validator's delegations to perform the calculation lazily, storing only a constant amount of data per validator per epoch, and calculate revised amounts for each individual delegation only when that delegation changes.

We will demonstrate this for a delegation $D$ to a validator $V$. Let $s_V(e)$ denote the stake of $V$ at epoch $e$.

For two epochs $m$ and $n$ with $m \leq n$, define the function $p$ as

$$p(n, m) = \prod_{e = m}^{n} \left(1 + \frac{r_V(e)}{s_V(e)}\right)$$

Denote $p(n, 0)$ as $p_n$. The function $p$ has a useful property:

$$\frac{p_n}{p_m} = p(n, m + 1) \tag{1}$$

One may calculate the accumulated changes up to epoch $n$ as

$$D_n = D_0 \cdot p_n$$

If we know the delegation $D_m$ up to epoch $m$, the delegation $D_n$ at epoch $n$ is obtained by the following formula,

$$D_n = D_m \cdot p(n, m + 1)$$

Using property (1),

$$D_n = D_m \cdot \frac{p_n}{p_m}$$

Clearly, the quantity $\frac{p_n}{p_m}$ does not depend on the delegation $D$. Thus, for a given validator, we need only store this product $p_e$ at each epoch $e$, with which updated amounts for all delegations can be calculated.
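
For instance, with illustrative numbers: if $p_3 = 1.06$ and $p_7 = 1.1236$, a delegation worth 100 tokens at epoch 3 is worth $100 \cdot \frac{1.1236}{1.06} = 106$ tokens at epoch 7, computed without iterating over the intermediate epochs.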

The product $p_e$ at the end of each epoch $e$ is updated as follows.


updateProducts
  :: HashMap Address (HashMap Epoch Float)  -- product p_e per validator, per epoch
  -> HashSet Address                        -- the active validator set
  -> Epoch                                  -- the current epoch
  -> HashMap Address (HashMap Epoch Float)

updateProducts validatorProducts activeSet currentEpoch =
  foldr update validatorProducts activeSet
  where
    update validator products =
      let stake       = PoS.readValidatorTotalDeltas validator currentEpoch
          reward      = PoS.reward stake currentEpoch
          entries     = lookup validator products
          lastProduct = lookup (currentEpoch - 1) entries
          -- p_e = p_{e-1} * (1 + r_V(e) / s_V(e))
      in insert validator (insert currentEpoch (lastProduct * (1 + reward / stake)) entries) products

Commission

Commission is charged by a validator on the rewards coming from delegations. These are set as percentages by the validator, who may charge any commission they wish between 0% and 100%.

Let $c_V(e)$ be the commission rate for a delegation $D$ to a validator $V$ at epoch $e$, and let $p_n$ be the product defined above.