Introduction submitted by
This essay contains a healthy dose of math sprinkled with opinion, and I would be the first to admit that my math and personal opinions are sometimes wrong. The beauty of these forums is that it allows us to discuss topics in depth, and with enough group scrutiny we should arrive at the truth. I'm actually a cryptocurrency noob; I've only been looking at it in earnest for a few months, but I've seen enough to conclude that we are in the middle of a revolution, and if I don't intellectually participate somehow, I think I'll regret it for the rest of my life.
Here I analyze sharding in a PoS (proof-of-stake) system, and I will show that not only is sharding good, but I will quantify just how beneficial it is to Tps
(transactions per second of the whole network) and mps
(messages per second processed by each individual node). I use Raiblocks as my point of departure, regarding it as both my inspiration and my object of critique. But much of the discussion should be relevant to any PoS sharded system.
As you may know, Raiblocks does not employ ledger sharding, but seeing as every wallet is already in its own separate blockchain, it's basically already half-way there! From an engineering perspective, sharding is low-hanging fruit for a block-lattice structure like Raiblock's, especially when you compare it to how complicated it is for single-blockchain currencies.
For the record, I think that Raiblocks will scale just fine according to the current strategy laid out by Colin LeMahieu
) . By using only full nodes and hosting them in enterprise grade servers (basically datacenters), chances are good that the network will be able to keep up with future Tps
(transaction per second) growth. Skeptics have been questioning if people are going to be willing to run nodes pro bono, just to support the network. But I don't doubt that many vendors will jump at the chance. If I'm Amazon, and I've been paying 3% of everything to Visa all these years, when there's an option to basically run my own Visa, I take it.
Payment networks like Paypal have been offering free person-to-person payments for years, eating the costs of processing those transactions in exchange for the opportunity to take their cut when those same people pay online vendors like Amazon. This makes business sense because only a minority of transactions are person-to-person anyway. Most payments result from people buying stuff. So, in a sense, vendors like Amazon have already been subsidizing our free transactions for years. By running Raiblocks nodes, they would still be subsidizing our transactions, but it would be a better deal than what they were getting before.
But have we forgotten something here? Is this really the dream of the instant, universal, decentralized, uncensorable payment network that was promised and only kinda delivered by Bitcoin? Decentralization comes in a spectrum, and while this is certainly better than a private blockchain like Ripple, the future of Raiblocks that we're looking at is a smallish number of supernodes run by a consortium of corporations, governments, and maybe a sprinkling of die-hard fans.
You may ask, but what about the nodes run by you and me on our dinky home computers and cable modem connections? Well, people need to remember that Raiblocks nodes need to talk to each other every time there's a transaction, in order to exchange their votes. The more nodes there are, the more messages have to be received and sent per node per transaction. Having more nodes may improve the decentralization, redudancy, and robustness of the network, but speed it definitely does not. Sure, the SSD of a computer running a mock node will handle 7000 tps
, but the real bottleneck is network IO, not disk IO, and how many Comcast internet plans are going to keep up with 7000 x N
messages per second, where N
is the total number of nodes? If you take the message size to be 260 bytes (credit to u/juanjux
's packet-sniffing skills
), and the number of nodes to be 1000, that's 1.8 GB/s. Also, if you consider that at least two messages will need to be exchanged with every node (one for the sending wallet, one for the receiving), the network requirements per node becomes 3.6 GB/s. This requirement applies to both the download and upload bandwidth, since in addition to receiving votes from other nodes, you have to announce your own vote to all of them as well. Maybe with multicasting upload requirements can be relaxed, but the overall story is the same: you almost want to convince small players not to run their own nodes, so N
doesn't grow too large. Hence, the lack of dividends.
So, if we're resigned to running Raiblocks from corporate supernodes in the future, we might want to ask ourselves, why is decentralization so important anyway? For 99.9% of the cases, I actually think it won't matter. People just want their transactions to complete in a low-cost and timely fashion. And that's why I think Ripple and Raiblocks on their current trajectories have bright futures. They are the petty cash of the future. But for bulk wealth storage, you want decentralization because it makes it hard for any one entity to gain control over your money. No government will be able to step in and freeze your funds if you're Wikileaks or a political dissident when your cryptocurrency network is hosted on millions of computers scattered across the internet. I know the millions number sounds outlandish given that Bitcoin itself has fewer than 12k nodes at present, but that's my vision for the future. And I hope that by the end of this essay, you'll agree it's plausible.
The main benefit of sharding is that it allows nodes to divide the task of hosting the ledger into smaller chunks, reducing the per-node bandwidth requirements to achieve a certain Tps
. I'll show that this benefit comes without having to sacrifice ledger redundancy, so long as sufficient nodes can be recruited. One disadvantage that must be noted is the increased overhead of coordinating a large number of nodes subscribed to partial ledgers. At the very least, nodes will need to know how wealthy other nodes are for voting purposes. However, I don't see how an up-to-the-second update of nodal wealth is necessary, since wealth changes on the timescale of months, if not years. It should be sufficient to conduct a role call once every few weeks to update nodes on who the other nodes are and to impart information about wealth and ledger subscriptions. Nonetheless, in principle this overhead means it is still possible to have too many nodes even with sharding.
Raiblocks has a unique advantage over single-chain cryptocoins in that each wallet address is already its own blockchain. This makes it especially amenable to sharding, since each wallet can already be thought of as its own shard! You just need a clever algorithm to decide which nodes subscribe to which wallets. For the purposes of this analysis, I assume a random subscription, so that for example if both you and I subscribe to 10% of the ledger, our subscriptions are probabilistically independent, and we intersect on roughly one percent of the total wallet space. I will also assume that all nodes are identical to each other in bandwidth, though in practice I think each node's owner should decide how much bandwidth he is willing to commit, letting the node's software dynamically adjust its P
to maintain the desired bandwidth, where P
, or the participation level, is the fraction of the ledger that the node is subscribed to. That way, when the Tps
of the network increases over time, each node will use the increasing bandwidth demand as a feedback signal to automatically lower its ledger subscription percentage. Then, all that would be missing for smooth and seamless network growth is a mechanism for ensuring node count growth. Some math
|Symbol ||Definition |
|mps ||messages per second received/sent per individual node |
|N ||total number of nodes |
|Tps ||transactions per second processed by the whole network |
|R ||ledger redundancy |
|P ||fractional participation level of an individual node |
|k ||role call frequency |
From the definitions, it should be apparent that
(1) R = NP
There are two types of messages that nodes have to deal with, transaction messages and role-call messages. Transaction messages are those related to updating the ledger when money is sent from one wallet to another. For each transaction, each node presiding over the sending wallet/shard will need to
- Broadcast its vote to the other R members of the shard. In the normal case this is a thumbs up signal and no conflict resolution is required.
- Receive votes from the other R members of the shard
- Broadcast its thumbs up to the R members of the receiving wallet/shard
Each node presiding over the receiving wallet/shard will need to
- receive thumbs up signals from the R members of the sending wallet/shard
Therefore, on a macro level upload and download requirements are the same. (Two messages sent, two messages received.)
Role-call messages are those related to disseminating an active directory of which nodes are participating in which wallets and information about nodal wealth. Knowledge about each individual node is broadcasted to the network at a rate of k
. I think 10-6
Hz is reasonable, for an update interval of 12 days. For each update, all R
nodes presiding over the wallet of the node whose information is being shared will broadcast their view of the node's wealth to all N
nodes. Therefore, from the perspective on an individual node:
- The rate that role-call messages are received is kRN.
- The rate that role-call messages are sent is k(# node wallets presided over)N = k(NP)N = kRN.
Again, upload and download rates are the same. Since upload and download rates are symmetric (which intuitively should be true since every message that is sent needs to be received), the parameter mps
can be used equally to describe upload and download bandwidth.
= 2R(PTps) + kRN
where the two terms correspond to the transaction and role-call messages, respectively. Using (1), (2) can be rewritten as
= 2R2Tps/N + kRN
Here, we see an interesting relationship between the different message categories and the node count. For a fixed ledger redundancy R
, the number of transaction messages is inversely proportional to the number of nodes. This is intuitive. If all of a sudden there are twice as many nodes and ledger redundancy remains the same, then each node has halved its ledger subscription and only has to deal with half as many transactions. This is the "many hands make light work" phenomenon in action. On the other hand, the number of role-call messages increases in proportion to the number of nodes. The interplay between these two factors determines the sweet spot where mps
is at a local minimum. Since the calculus is straightforward, I'll leave it as an exercise to the reader to show that
Alternatively, another way of looking things is to consider mps
to be fixed. This may be more appropriate if each node is pegged at its committed bandwidth. Then (3) describes the relationship between the ledger redundancy and N
. You may ask how this can be reconciled with (1), which seems to imply that N
are directly proportional, but in this scenario each node is dynamically adjusting its ledger subscription P
in response to a changing N
to maintain a constant bandwidth mps
. In this view, the sweet spot for N
is where R
is maximized. Interestingly, regardless of which view you take, you arrive at the same expression for the sweet spot (4).
If N < N_sweetspot
, then transaction messages dominate the total message count. The system is in the transaction-heavy regime and needs more nodes to help carry the transaction load. If N > N_sweetspot
(the node-heavy regime), transaction messages are low, but the number of role-call messages is large and it becomes expensive to keep the whole network in sync. When N = N_sweetspot
, the two message categories occur at the same rate, which is easily verified by plugging (4) back into (3). This is when the network is at its most decentralized: message count per node is low while redundancy is high.
Note that N_sweetspot
increases as Tps1/2
. This implies that, as transaction rate increases, the network will not optimally scale without somehow attracting new people to run nodes. But the incentives can't be too good either, or N
may increase beyond N_sweetspot
. Ideally, a feedback mechanism using market forces will encourage the network to gravitate towards the sweet spot (more on this later).
One special case is where P
=1 and N=R
. This is when the network is at its most centralized operating point, with every single node acting as a full node. This minimizes node count for a given redundancy level R and is how Raiblocks is currently designed. I will show that for most real-world numbers, the role-call term is so small as to be negligible, but the mps
is many orders of magnitude higher than in the decentralized case because of the large transaction term.
Assuming that we are able to keep the network operating at its sweet spot, by plugging (4) into (3), we arrive at
If instead we plug N=R
into (3), we arrive at
= 2RTps + kR2
So, we see that in the decentralized case the mps
of individual nodes increases as the square root of Tps
, a much more sustainable form of scaling than the linear relationship in the centralized case.
And now, the moment we've all been waiting for: plugging various network load scenarios into these formulas and comparing the most decentralized case to the most centralized. Real world operation will be somewhere in between these two extremes.
|Fixed parameters || |
|packet size (bytes) ||260 |
|k (Hz) ||1.00E-06 |
|R ||1000 |
|transaction fee ($) ||$0.01 |
| ||Tps || || || || || || |
| ||0.1 ||1 ||10 ||100 ||1,000 ||10,000 ||100,000 |
|Total monthly dividends ||$2,592 ||$25,920 ||$259,200 ||$2,592,000 ||$25,920,000 ||$259,200,000 ||$2,592,000,000 |
|Decentralized node requirements || |
|mps (Hz) ||28 ||89 ||283 ||894 ||2,828 ||8,944 ||28,284 |
|node traffic (bytes/s) ||7.35E+03 ||2.33E+04 ||7.35E+04 ||2.33E+05 ||7.35E+05 ||2.33E+06 ||7.35E+06 |
|N ||1.41E+04 ||4.47E+04 ||1.41E+05 ||4.47E+05 ||1.41E+06 ||4.47E+06 ||1.41E+07 |
|P ||7.07E-02 ||2.24E-02 ||7.07E-03 ||2.24E-03 ||7.07E-04 ||2.24E-04 ||7.07E-05 |
|Total Network Traffic (bytes/s) ||1.04E+08 ||1.04E+09 ||1.04E+10 ||1.04E+11 ||1.04E+12 ||1.04E+13 ||1.04E+14 |
|Yearly Network Traffic (bytes) ||3.28E+15 ||3.28E+16 ||3.28E+17 ||3.28E+18 ||3.28E+19 ||3.28E+20 ||3.28E+21 |
|Decentralized node income || || |
|monthly per node ($) ||$0.18 ||$0.58 ||$1.83 ||$5.80 ||$18.33 ||$57.96 ||$183.28 |
|income/GB ($/GB) ||$0.0096 ||$0.0096 ||$0.0096 ||$0.0096 ||$0.0096 ||$0.0096 ||$0.0096 |
|Centralized node requirements || || |
|mps (Hz) ||2.01E+02 ||2.00E+03 ||2.00E+04 ||2.00E+05 ||2.00E+06 ||2.00E+07 ||2.00E+08 |
|node traffic (bytes/s) ||5.23E+04 ||5.20E+05 ||5.20E+06 ||5.20E+07 ||5.20E+08 ||5.20E+09 ||5.20E+10 |
|N ||1000 ||1000 ||1000 ||1000 ||1000 ||1000 ||1000 |
|P ||1 ||1 ||1 ||1 ||1 ||1 ||1 |
|Total Network Traffic (bytes/s) ||5.23E+07 ||5.20E+08 ||5.20E+09 ||5.20E+10 ||5.20E+11 ||5.20E+12 ||5.20E+13 |
|Yearly Network Traffic (bytes) ||1.65E+15 ||1.64E+16 ||1.64E+17 ||1.64E+18 ||1.64E+19 ||1.64E+20 ||1.64E+21 |
|Centralized node income || || |
|monthly per node ($) ||$2.59 ||$25.92 ||$259.20 ||$2,592 ||$25,920 ||$259,200 ||$2,592,000 |
|income/GB ($/GB) ||0.0191 ||0.0192 ||0.0192 ||0.0192 ||0.0192 ||0.0192 ||0.0192 |
Yes, I did sneak a transaction fee in there, which is anathema to the Raiblocks way. But I wanted to incentivize people to run nodes. Observe that income per gigabyte remains the same, independent of network Tps
, because both total income and total bandwidth scale proportionally to Tps
. The decentralized case has half the income/GB because the role-call overhead doubles network activity. In either case, the income per GB depends on transaction fee and is independent of network load.
An interesting number to check online is the price/GB that various ISP's charge
. With Google Fiber, it is possible to purchase bandwidth as low as $0.00076 per GB, meaning that it may be possible for nodes to be profitable even if fees were lowered by another order of magnitude. As time progresses, bandwidth costs will only go down, so fees may be able to be lowered even further past that. But because of electricity and other miscellaneous costs, I think a one cent transaction fee is probably pretty close to what people need to incentivize them to run nodes.
With sharding, even many home broadband connections today can feasibly support 100,000 transactions per second, with each node subscribed to about one ten thousandth of the total ledger and handling about 7 MB/s. Getting 14 million people to run nodes may seem like a tall order, but the financial incentives are there. Just look at all the people who have rushed to do GPU mining. Here, bandwidth replaces hashing power as the tool used for mining.
According to a study done by Cisco, yearly internet traffic is projected to reach 3.3 ZB by 2021
. Looking at the table, that means if we ever reach 100,000 Tps
, Sharded Raiblocks traffic would be equal to the rest of the world combined. Yikes! But if you think about it, nobody along the way is taking on an unbearable load. Users pay low fees for transactions. Nodes get dividends. ISPs get additional customers. The only ones who lose out are Visa, Paypal, and banks.
With such a large network presence, the cultural impact of this coin would be huge. That, in addition to the sheer number of participants running nodes as side businesses would cement this as the coin of the people.
From a macro level, I see no red flags that would indicate this is economically or technically infeasible. Of course, the devil's in the details so I'm posting this to see if people think I'm on the right track. To me, it seems that the possibilities are tantalizing and someone needs to build a test net to see if this idea flies (u/meor
, if any of this sounds appealing, are you guys hiring? ;) ). Musings
I've only scratched the surface and there are many other topics that are worthy of deeper discussion:
- Efficient assignment and announcement of ledger subscriptions for each node
- Feedback control mechanism for transaction fee amount to nudge the network towards its N/√Tps sweet spot
- Culling slow nodes from the network using economic incentives
- Dividend distribution without creating too many extra transactions
- Fair dividend distribution based on bandwidth committed
- Shard security
Supernodes are most commonly used to enclose a floating voltage source in order to apply node voltage analysis. But supernodes are more general than that. For example, see these MIT EE course notes on nodal analysis. "The part of the circuit enclosed by the dotted ellipse is called a supernode. Kirchhoff’s current law may be applied to a supernode in the same way that it is applied to any ... An early promotional picture for ‘Conserving Up With The Kardashians’ [E! Entertainment]. Kendall, who's now 24, began out as a clumsy pre-teen on KUWTK, and now, she’s one of many world’s hottest supermodels.Alongside along with her development spurt, Kendall additionally grew into all of her options, and nearly appears to be like like a distinct particular person. Blockchain news, including news about blockchain, Bitcoin, Ethereum and more. (Page 3608) Blockchainster.com: blockchain news and blockchain resources. Toggle navigation. Home; Blockchain News; Blockchain News. Monero Analysis (XMR/USD) January 9, 2018. Monero buyers are definitely in charge and it is easy to see why. From previous mentions, we were expecting dips towards the middle BB for ... World Blockchain Hackathon, a 100% community funded global organization, has announced the date and details of their latest hackathon, a 3-day global virtual event that will attract more than 3000 participants and contributions across 50 countries. The Babylon Project aims to democratize access to global innovation with blockchain technology. Additionally, the project aims on developing […] Rising food prices are having impacts across the world, but especially among poor people in low-income developing countries. Since 2000, a year of low food prices, wheat prices in international markets have more than tripled, corn prices have doubled, and rice prices rose to unpreceden ted levels in March 2008. Such increases
Asian University of Bangladesh, EEE 1101 Electric Circuits online class, in Bengali language. Topic: supernodes. Next class: reviewing mesh analysis and usin... Second circuit with supernodes. This one has two. There are four variables (which could be reduced to two, given the two supernodes!) Nodal Analysis & Supernode Class on Feb 7th, 2018. This video is unavailable. Watch Queue Queue When more than one super nodes that are adjacent to each other are present in the circuit, we have two approaches to solve this circuit when applying KCL based nodal analysis. First is to treat ...