Excellent Wall of Numbers

Slicing data: what comprises blockchain transactions?

Over the past month we have seen nominal transaction volume on the Bitcoin network reach several all-time highs. Enthusiasts on social media have proposed any number of theories including a rise in retail payments or commercial volume.

Yet upon further inspection, there does not appear to be a silver bullet answer.

We know, for example, that these transactions can originate from or be composed of faucet outputs, mining rewards, coin mixing, gambling, movement to ‘change’ addresses and ordinary wallet shuffling. [1] So with this type of identification problem, how can analysts distinguish the signal from the noise? After all, as Peter Todd and others explained last month, for a few hundred dollars a day it is possible to inflate the transaction volume by an entire order of magnitude. [2]

For example, questions have arisen over a series of what some call “long chains.” Last month several commentators on a popular thread on Hacker News identified thousands of tiny transactions originating from a single source. [3] The source was continually sending transactions and paying a transaction fee for each of them. This struck many as odd because a rational actor would simply bundle the transactions together to save on fees.

While there are likely different motivations for doing so, one explanation was that the originating source was attempting to delink or otherwise mix and tumble coins to make it difficult to “dox” or identify that source. But it could also be a faucet, and at one point even pools paid out miners using chained transactions; perhaps some still do.

What does this look like? Below is a chart created by user “FatalLogic” in that thread:

Source: Hacker News

The green line shows the overall transaction volume on the Bitcoin blockchain, whereas the red line applies the heuristic that removes these “long chains.”

Is there a definition of long chains?

Two weeks ago published several similar charts excluding “Long Chains.”

According to Jonathan Levin, formerly of Coinometrics: [4] have implemented a heuristic to identify high velocity activity that is most likely unrelated to real world commerce. Every day the internal counter resets and counts how many times transaction outputs were spent on the same day. So if a wallet paid someone 1 BTC in one transaction output and they then transferred that to cold storage, that would be a chain of two. However, there are some chains where the chain of spent outputs of a given day exceeds 1000. Each day, on average, the sample size is 144 blocks. Therefore, for chains of more than 144, the chain of transactions involves zero confirmation transactions (i.e., they are not relying on the blockchain for their security). In other words, it is a measure of velocity.

These long chains demonstrate that there are some parts of the economy that are rolling outputs almost ten times a block with chains of over one thousand in a day. This may not relate to real world commerce or security processes, and is more likely to be gambling or mixing. In Satoshi Dice the bettor often just takes their winnings and gambles again, with everything being done with zero confirmations. Likewise, with mixing there is little need to wait for confirmations and the priority is obscuring the origin of the transaction outputs. Ultimately this is unlikely to capture a lot of activity run by the centralised services since their objective is fee minimisation.
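The counting rule Levin describes can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code behind any of the published charts: the input shape (a day's transactions as `(txid, parent_txids)` pairs, listed so that parents come before children), the function names and the default threshold of 10 are all hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical input shape: one day's transactions in the order they were seen,
# each as (txid, [txids of the transactions whose outputs it spends]).
DayTxs = List[Tuple[str, List[str]]]


def chain_lengths(day_txs: DayTxs) -> Dict[str, int]:
    """Length of the longest same-day spend chain each transaction belongs to."""
    depth: Dict[str, int] = {}
    for txid, parents in day_txs:
        # Parents outside today's set do not count: the counter resets daily.
        same_day = [depth[p] for p in parents if p in depth]
        depth[txid] = 1 + max(same_day, default=0)

    # Walk backwards so every ancestor inherits the full length of its chain.
    longest = dict(depth)
    for txid, parents in reversed(day_txs):
        for p in parents:
            if p in longest:
                longest[p] = max(longest[p], longest[txid])
    return longest


def count_excluding_long_chains(day_txs: DayTxs, threshold: int = 10) -> int:
    """Daily transaction count after dropping chains longer than `threshold`."""
    return sum(1 for n in chain_lengths(day_txs).values() if n <= threshold)


# A payment followed by a same-day move to cold storage is a chain of two.
example = [("pay", ["older_tx"]), ("to_cold_storage", ["pay"])]
print(chain_lengths(example))  # {'pay': 2, 'to_cold_storage': 2}
```

Whether a real implementation drops the whole chain or only the transactions past the threshold is not stated in the source; the sketch drops the whole chain.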

Furthermore, according to the description on’s site, “A chart displaying the total number of bitcoin transactions per day excluding those part of long chain transaction chains. There are many legitimate reasons to create long transaction chains however they may also be caused by coin mixing or possible attempts to manipulate transaction volume.”

The first chart below is the original, unmodified chart of total transactions on the Bitcoin blockchain: [5]

Using the same year-over-year time frame, below is the newly modified chart, applying the heuristic that removes “chains” longer than 10: [6]

As we can see above, removing these “long chains” reduces the volume by a factor of roughly three, yet there does appear to be an upward trend over the past several months.

I spoke with Atif Nazir, the CEO and co-founder of In his view: [7]

The term “longest chain” is vague – it would be misleading to say it is just coin mixing. The volume could be a series of transactions where the user cannot spend to the desired destinations in the same transaction. This could be a limitation of their wallet software’s user interface, or the backend of the software itself.

For example, if a faucet is built on, the holder spends coins rapidly, sometimes splitting them into a couple of transactions if they are efficient, and at other times into hundreds of transactions that spend unconfirmed change in rapid succession. We have seen chains of unconfirmed spends as long as 1,000 transactions, and they could be longer if blocks are not found.

In general, achieving provable privacy through coin mixing and coin shuffling is hard as long as you stay on the same Blockchain. With the current methods, you can look at a destination address and say, with some certainty, “hey, this dude is the one who stole the Bitstamp coins.” [8]

In the absence of a definite, no-nonsense way to look at “long chains” of transactions, the safest assumption would be to consider them as unconfirmed chain spends, where the user wants to spend transactions very quickly, either deliberately or due to their software’s limitation.

Another potential source is even smaller.

For example, Sidney Zhang, co-founder of HelloBlock, has noticed that: [9]

Another interesting thing is that people are sending dust transactions on the network as advertisements for high-yield investment programs (HYIPs). [10]

This transaction, 92aa, is an example of an ad (and the message was removed by [11]

What they do is look for transactions happening on the blockchain, pick a collection of addresses, send one satoshi to each of them and then attach a “public note” on The message is normally something like earn 7% per day at The public note in this case was removed, very likely reported as spam.

The second, 1cca, is an example of a faucet. If you look at the tag “win free bitcoins every hour!” it is the address for [12]

It is unlikely the long chains come directly from consumers because consumers don’t spend money rapidly.

A more likely scenario is that it is a ‘shared’ hot wallet operated by a service (e.g., Coinbase, Circle). A possible explanation then emerges – off-chain gambling sites such as Primedice / Moneypot / Betcoin casino and others operate hot wallets.

In terms of scale, very small casinos may receive 30+ deposits a day. A larger casino can easily operate with thousands of deposits a day and hundreds of withdrawals.

One interesting behavior is that bitcoin gamblers never keep funds in a casino. They tend to deposit, play and then immediately withdraw without leaving funds there overnight. That could create a large amount of activity from the same hot wallet, thus creating a large chain.

Last year Ken Shirriff also pointed out a few of the notable chunks of “spam” that permanently reside on the blockchain, including images. [13]

What does this look like altogether?

For additional analysis I reached out to Organ of Corti, who plotted these differences on two different charts. [14]

As shown above, these match up with the heuristic used by the original Hacker News post as well as that of In Organ’s view:

If long chains of transactions are used by entities of a very different nature to single transactions or short chains of transactions, then we might expect to see differences in transaction rates and transaction rate cycles between the short and long chain groups.

Starting with a visual comparison of the two groups, the most significant difference between the longer and shorter chain groups is variance. This is to be expected since one long chain of transactions increases transaction rates more than a single, unchained transaction.

Does the yellow line at the bottom represent the actual “real” volume? Perhaps, but maybe not.

In addition, Organ put together a spectrogram to analyze this weekly cycle that is visually apparent in all the charts:

Another way to look at it is through a spectral density chart, according to him:

Perhaps a more useful test is to check for periodicity in the data. We know from previous work that transactions presently demonstrate a daily and a weekly cycle. I’m using’s data which is daily, so a spectrogram will only reveal a weekly cycle.

The last plot shows the spectrograms for chains longer and shorter than 10, 100, 1000, or 10000. These demonstrate a periodicity similar to that for all transactions of one cycle per week.
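To make the periodicity check concrete, here is a minimal sketch that looks for the dominant cycle in a daily series. It uses a plain discrete Fourier periodogram, which is an assumption on my part and not necessarily the method behind Organ of Corti's spectrograms; the function names and the synthetic weekday/weekend series are hypothetical.

```python
import cmath
import math
from typing import Dict, List


def periodogram(series: List[float]) -> Dict[float, float]:
    """Map each frequency (cycles per day) to its power in a daily series."""
    n = len(series)
    mean = sum(series) / n
    centered = [x - mean for x in series]  # remove the constant level
    power = {}
    for k in range(1, n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(centered)
        )
        power[k / n] = abs(coeff) ** 2 / n
    return power


def dominant_period(series: List[float]) -> float:
    """Period, in days, of the strongest cycle in the series."""
    power = periodogram(series)
    strongest_freq = max(power, key=power.get)
    return 1.0 / strongest_freq


# Synthetic data: eight weeks with busier weekdays than weekends.
daily_counts = [100.0 if day % 7 < 5 else 60.0 for day in range(56)]
print(round(dominant_period(daily_counts), 2))  # 7.0
```

With daily data the shortest detectable cycle is two days, which is why a daily cycle cannot show up in this kind of series.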

We can also compare the transaction rates of chains longer and shorter than 10, 100, 1000, or 10000 by calculating the cross correlation function. In each case the maximum correlation is at lag zero and is much higher than the upper bound of the 99.9% confidence interval, so the periodicity of the transaction rates of each group (chains longer and shorter than 10, 100, 1000, or 10000) is similar, which also suggests that the times of use for shorter and longer chain transactions are similar.
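The cross correlation comparison can be sketched as follows. This is the standard sample cross-correlation function, written out by hand as an illustration; the function names and the two synthetic series are hypothetical, and the significance bound assumes a white-noise null (roughly 3.29 divided by the square root of the sample size for a two-sided 99.9% interval), which may differ from what was used for the original plots.

```python
import math
from typing import List


def cross_correlation(x: List[float], y: List[float], lag: int) -> float:
    """Sample correlation between x[t + lag] and y[t] for equal-length series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    scale_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    scale_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if lag >= 0:
        pairs = zip(x[lag:], y[: n - lag])
    else:
        pairs = zip(x[: n + lag], y[-lag:])
    return sum((a - mean_x) * (b - mean_y) for a, b in pairs) / (scale_x * scale_y)


def best_lag(x: List[float], y: List[float], max_lag: int = 7) -> int:
    """Lag (in days) at which the two series are most strongly correlated."""
    return max(range(-max_lag, max_lag + 1), key=lambda k: cross_correlation(x, y, k))


def white_noise_bound(n: int, z: float = 3.29) -> float:
    """Approximate upper bound of the 99.9% interval under a white-noise null."""
    return z / math.sqrt(n)


# Synthetic short-chain and long-chain daily counts sharing one weekly rhythm.
short_chain = [100.0 if day % 7 < 5 else 60.0 for day in range(56)]
long_chain = [10.0 * v + 5.0 for v in short_chain]
print(best_lag(short_chain, long_chain))  # 0
```

A peak at lag zero that clears the bound is what the quoted analysis reports for every chain-length split.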

Further, time series decomposition showed the same starting and ending days of each weekly cycle.

I think that a working week cycle implies that the largest number of uses of longer chain transactions are from businesses with a normal working week, and the correlation in the periodicity of the shorter and longer chains of transactions suggests the largest use of both longer and shorter chains of transactions is by entities with work days and weekends.

Is there anything that explains the increase then?

Earlier this month a new game called SaruTobi was approved for inclusion in the iOS store. [15] The game tips its users bitcoin on the blockchain (in contrast, ChangeTip does so off-chain). During its debut week, before running out of coins, according to its first public address, SaruTobi sent out more than 5,000 transactions, most of them during an 11-hour period. [16] Within its first two weeks it paid out roughly 6.4 bitcoins across more than 50,000 transactions. [17]

Another continual source of on-chain usage comes from Counterparty, a “2.0” platform that effectively sits on top of the Bitcoin blockchain and uses bitcoins for each Counterparty transaction (i.e., it is an embedded consensus mechanism). Below is a visual of the daily transaction volume over the past year: [18]

The variation follows some of the daily (and weekend) patterns we have observed with Bitcoin in general (e.g., less activity on “Sundays”), but on certain days and times usage peaks at up to 3% of the Bitcoin network. [19] One explanation is that Counterparty is a popular platform for issuing tokens during crowdsales. For example, the dual peaks in December are most likely related to the Gems crowdsale, in which 2,633 BTC were exchanged for 38 million “GEMZ” (the native coin of the Gems system). [20]

As I briefly described last month, over the past year a BitcoinTalk user, “dexX7”, has been parsing other data, usually related to alt platforms such as Counterparty, Mastercoin, Colored coins and proof of existence. [21] Recall that these ‘altcoins’ are, in practice, just watermarked bitcoin transactions. In order to use these platforms, a user has to interact with the Bitcoin network (i.e., they are embedded consensus mechanisms). Below is a chart he recently sent me that breaks down these component parts: [22]

  • Data captured at block height 340,018
  • There were at least 184,155 identifiable meta-transactions
  • There were 57,489,982 transactions in total
  • There were 16,511,696 unspent outputs

This only includes the transactions dexX7 was able to identify. Counterparty, Mastercoin and Chancecoin use almost entirely “naked multisig” scripts as the medium to embed and transport data. In contrast, Proof of Existence, Open Assets, Coinspark and Block Sign use OP_RETURN (note: there is still an active discussion over whether to use 40 bytes or 80 bytes). [23] Open Assets and Coinspark are a type of colored coin implementation, and both Proof of Existence and Block Sign are a type of notary service (previous charts are available in an album view). [24]

Some other analysis from dexX7:

Almost all Counterparty transactions carry data via naked multisig and there are about 5,000 non-multisig Mastercoin transactions. There are furthermore 17,620 unclassified, unspent multisig outputs and 6,286 unclassified, spent multisig outputs.

Almost all of those unclassified multisig outputs were created by Wikileaks and actually carry some data too. [25]

Proof of Existence, Open Assets, Coin Spark and Block Sign account for 7,363 OP_RETURN transactions. The total number of all OP_RETURN outputs, according to, is close to 11960, so more than 60% can be mapped to those four.
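As a quick check of the proportions quoted above, the arithmetic can be reproduced directly from dexX7's figures. The helper name `share` is hypothetical; the inputs are the numbers given in the text.

```python
def share(part: int, whole: int) -> float:
    """Percentage that `part` represents of `whole`."""
    return 100.0 * part / whole


# OP_RETURN transactions mapped to the four identified services.
op_return_share = share(7_363, 11_960)

# Identifiable meta-transactions as a share of all transactions at block 340,018.
meta_share = share(184_155, 57_489_982)

print(f"{op_return_share:.1f}% of OP_RETURN outputs")   # 61.6% of OP_RETURN outputs
print(f"{meta_share:.2f}% of all transactions")         # 0.32% of all transactions
```

So the four services cover a little under two thirds of OP_RETURN usage, while all identifiable meta-transactions together are well under 1% of the transactions ever recorded at that block height.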

Another slice of daily and weekly transactional volume comes from pay-to-script-hash, better known as P2SH. This was originally BIP 16, proposed by Gavin Andresen and incorporated into the protocol in 2012 to “let a spender create a pubkey script containing a hash of a second script, the redeem script.” [26]

This has replaced ‘bare’ multisig as a means of securing bitcoins. While its use and adoption started off very slowly, more than 6% of all bitcoins are now stored in this manner, including those of Bitstamp via its recent integration with BitGo: [27]

As has become apparent, it cannot be said that an increase in transaction volume is (most likely) due to any one specific variable. Yet, according to a popular narrative, the quadrupling of acceptance by merchants this past year (from 20,000 to 82,000) may have led to increased spending by consumers and therefore account for the increase. [28]

Last month, Jorge Stolfi, a computer science professor in Brazil, analyzed the BitPay addresses (BitPay reuses addresses) based on the Walletexplorer dataset. [29] Below is a visual of what BitPay has received over the past two years.

According to Stolfi:

The green line on this graph shows the number of BTC deposited each day into that wallet. [30] This graph is rather strange since the number has been practically constant since January 2013, about 500–1000 BTC/day, and shows no weekly pattern. And no Black Friday spike either.

In his analysis Stolfi also noticed two different types of orders processed by BitPay, what he labels “wholesale” versus “retail.” The “wholesale” coins are likely miners selling their block rewards in bulk, whereas “retail” is consumer behavior (e.g., buying coffee, food, tickets).

Furthermore, if this wallet heuristic is valid, according to Stolfi:

  • BitPay now processes about 1000-1500 “retail” payments per day, averaging less than one BTC each;
  • The number of retail transactions processed by BitPay has grown 3x since mid-2013, and has been flat through most of 2014;
  • The amount of BTC processed by BitPay (including “retail” and “wholesale” payments) has been fairly constant since Jan/2013, about 500-1000 BTC/day;
  • In terms of dollar value, the amount processed by BitPay (including “retail” and “wholesale” payments) increased a lot from 2013 to 2014, but has fallen 50% or more since February, as the BTC price fell;
  • Black Friday had a modest effect (2x to 3x) on the number of “retail” payments, but had no effect on the total BTC/day (which is dominated by the “wholesale” payments).

And what about off-chain retail transactions?

Below is a public chart from Coinbase that visualizes the off-chain activity that takes place on Coinbase’s platform. [31]

The noticeable pattern of higher activity on weekdays versus the weekend is apparent irrespective of holidays. On most days these self-reported numbers comprise between 3% and 5% of the total transactions on the Bitcoin blockchain. However, as Jonathan Levin has pointed out, it is not clear from these numbers alone what they refer to: Coinbase user to user, user to merchant, and possibly user wallet to user vault?

Another way of looking at whether or not transaction volume is increasing is through the “fees” to miner metric (recall that these are not real “fees” as they are not yet mandatory and may be more akin to “donations”). [32] Maybe transaction volume based on the methods above does not fully capture hypothesized growth.

Above is a new chart from Organ of Corti which visualizes the transaction fees included with each block over the past six years. [33] If on-chain retail commerce were increasing, it would likely in turn be paid for via some fee mechanism, yet this is not apparent. This is not to say that utility has not increased for certain participants. Volume as a whole has clearly increased, as shown by the second picture, yet these are users who likely opt to send a fee-less transaction to the mempool (these transactions typically take several hours or perhaps a day to be included within a block).

What is another explanation?

It does illustrate that the other narrative – that fees will replace block rewards – has not yet begun to occur. Maybe it will not.

For example, last year Robert Sams and Vitalik Buterin highlighted the economic costs of maintaining the infrastructure that are being overlooked, arguing that fees would be unlikely to adequately compensate miners. [34] And Dave Hudson independently explored what has actually occurred in practice, providing visualizations of the empirical data that highlight and reinforce their marginalized viewpoint. [35]

To put it another way, if more users were actively using the blockchain to transmit value, then it would likely be apparent via an aggregate increase in fees.

As shown above, over a four-year time span miners, the actual labor force of the network, are not witnessing the narrative play out as it is supposed to (block reward plus fees to miner). Denominated in bitcoin (the blue line), miners have not seen the increase in fees or revenue that many of the same social media promoters claim will happen. Whether this changes is unknown.

Again, recall the current narrative that in the end, transaction fees will purportedly replace the block reward. [36] But the causality runs in the opposite direction to that assumed by most: the fees people are willing to pay determine the number of miners, not the other way around. The takeaway is that, simply put, fees may not rise to cover the current block reward amounts. It may be that the block reward falls, miners just drop out and net transaction fees never increase, reducing the security of the network, but this is a topic for another article.

For perspective I spoke with Ernie Teo, a research fellow at the Sim Kee Boon Institute for Financial Economics (which hosted a cryptocurrency conference in November). [37] According to his team:

We observe similar trends to what has been mentioned in your article. We see a large increase in the one satoshi (or less) addresses over time. This could also be due to the long chain “spammer” you have described above. A few more things we can note from our upcoming analysis on the distribution of bitcoins over time:

  • 50 coin addresses: these are the only addresses in the very beginning, due to there being only miners on the network. However, we see that this does not fluctuate a lot over time, which indicates that most miners tend to cash out once they have mined.
  • Large increase in the number of addresses with less than one bitcoin. This indicates more “retail” type buyers.
  • Not much change or fluctuation in the large addresses.

I think it is very likely true that not a very large proportion of the transactions are retail transactions. In the long run, it doesn’t help the network. We can only wait for the next big innovative app that can boost retail-type usage.

How else can this be visualized?

John Ratcliff recently published several new charts describing “the State of the Blockchain Address(es)” in which he delves into token movements and in particular “zombie” addresses (addresses that have not been active in three or more years). [38] They are illuminating, although he and I disagree on the conclusions that can be drawn from them.

For example, he updated one chart that I previously described as showing that more than 70% of coins have not moved in more than six months: [39]

Source: John Ratcliff

What does the chart above illustrate? If it is velocity, then the color bands reinforce my explanation from two months ago: that the majority of coin holders who purchased during the November / December 2013 bubble are now underwater. [40] We see the transition over the year, in which these coin holders, rather than spending and realizing a loss, hold on to their coins through the months. Hence why we likely see another uptick to an “older” band starting in mid-November 2014 – the anniversary of the beginning of the most recent bubble.

This explanation is further reinforced by the demographics of bitcoin holders: mostly middle to upper-middle class residents of developed countries – most of whom have “low time preference” (e.g., speculators) and therefore do not need or want to spend bitcoins immediately because they have other means of payment (e.g., credit cards) and can therefore hold onto their coins longer than someone with “higher time preference” (e.g., less affluent individuals living paycheck to paycheck who in theory would have to continually, immediately spend bitcoins). Another potential explanation is the disposition effect, but this is also a topic for a different article. [41]

The chart above (originally Figure 15) was published this past month by two researchers at the Federal Reserve. [42] They independently used a methodology similar to the one Ratcliff has undertaken. In their view:

Figure 15 examines the degree of activity for the addresses in the network. For each date we partition the volume of addresses with positive balances according to their last activity. For example, the addresses that have transacted in the last week are likely to be frequently used (shown with the strip in the bottom). On the other hand, some of the addresses have not been active in the past 52 weeks. Those are likely to serve saving or investment purposes and much less so for transacting. From Figure 15 we can see that the volume of “investment” addresses (not used in the last year) has been steadily decreasing. Still, however, around 75 percent of the addresses in operation with positive balances have not been used in a transaction in the last four months.
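The partitioning described in that passage can be sketched as a simple bucketing of addresses by the age of their last activity. This is an illustration only: the band edges (1, 4, 17 and 52 weeks), the labels, the function name and the input shape are assumptions chosen to mirror the one-week, four-month and one-year cut-offs mentioned in the quote, not the bands actually used in the Federal Reserve paper or in Ratcliff's charts.

```python
from bisect import bisect_left
from typing import Dict

# Hypothetical band edges in weeks; 17 weeks is roughly four months.
BAND_EDGES_WEEKS = [1, 4, 17, 52]
BAND_LABELS = ["<=1 week", "1-4 weeks", "4-17 weeks", "17-52 weeks", ">52 weeks"]


def band_shares(last_active_day: Dict[str, int], as_of_day: int) -> Dict[str, float]:
    """Share of addresses in each last-activity band.

    `last_active_day` maps each address with a positive balance to the day
    index of its most recent transaction; `as_of_day` is the snapshot date.
    """
    counts = {label: 0 for label in BAND_LABELS}
    for day in last_active_day.values():
        age_weeks = (as_of_day - day) / 7
        counts[BAND_LABELS[bisect_left(BAND_EDGES_WEEKS, age_weeks)]] += 1
    total = sum(counts.values())
    if total == 0:
        return {label: 0.0 for label in BAND_LABELS}
    return {label: counts[label] / total for label in BAND_LABELS}


snapshot = {"addr_a": 399, "addr_b": 390, "addr_c": 300, "addr_d": 0}
print(band_shares(snapshot, as_of_day=400))
```

Repeating the calculation for each date and stacking the shares gives the kind of banded chart discussed here. Note that it counts addresses, not coins, so a single entity holding many addresses weighs heavily.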

While the rest of their report is illuminating, in their concluding remarks, they also do not see retail transactions as comprising more than a marginal amount of volume:

Broadly speaking, our empirical exercise documents general patterns of Bitcoin usage, and examines the use of Bitcoin for investment and payment purposes. We find that while the number of daily users may have doubled every eight months, the transaction volume is negligible compared to the domestic volume of U.S. payment systems. Our analysis of data from the Bitcoin system further suggests that Bitcoin is still hardly used for payments for goods and services. In addition, the patterns of circulations of bitcoins and the dynamics of the bitcoin exchange rate are consistent with low usage of Bitcoin for retail payment transactions. Finally, we provide evidence that the exchange rates between bitcoin and other currencies are not well aligned, which we interpret as a lack of depth of the exchange markets and as costly exchange rather than unexploited arbitrage opportunities.

Perhaps these trends will change. Maybe, as some claim, retail volume will increase. But as shown above and through Total Output Volume, we know what the maximum “purchasing power volume” of transactions is; this has not been a mystery. [43]

While merchant adoption continues to increase, consumer adoption for retail purchases appears to be flat (as shown by both BitPay and Coinbase numbers). Future analysis may need to look at correlating these trends for brick and mortar merchants. Without regular use at the register and point-of-sale, there are a number of anecdotal stories of the retraining and fumbling that goes on with floor employees with respect to accepting bitcoin. [44]

Perhaps, again, this will change in the future (e.g., Impulse), [45] but going forward a total traffic analysis such as the type created by Sarah Meiklejohn et al. two years ago would help the industry as a whole determine what consumer behavior looks like with greater accuracy. [46] And this is significant for a project whose white paper promotes itself as a payment network for online commerce (see section 1).

So what conclusions can be drawn from this?

As noted at the beginning, there does not appear to be one specific variable that explains the recent increases over the past several months. For example, most tipping from services like Bitui in China and ChangeTip internationally is already done off-chain (e.g., the independent site ‘ChangeTip stats’ describes activity on the company database). [47] SaruTobi is too new to account for anything but the last few weeks of growth, and DarkWallet activity will likely be “long chain” related. Perhaps offline P2P transactions from OpenBazaar should be identified, aggregated and brought into future analysis. [48]

Future analysis should also look to factor in or filter out activity related to “change” addresses. For example, the short-term ‘velocity’ seen in the daily and weekly bands of Ratcliff’s and Badev & Chen’s charts could be overstated due to coins which do not actually change hands but are rather “spent” to themselves due to how “change” is treated by the protocol. Furthermore, as has been described in Dave Hudson’s modeling of block sizes, it cannot be said that an increase in on-chain volume is axiomatically “good.” [49]
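One crude way to filter out coins that are “spent” to themselves is to ignore any output that pays an address already used as an input to the same transaction. The sketch below is a hypothetical illustration of that single rule, with an assumed transaction shape and function name; it misses change sent to a freshly generated address, which many wallets use, so it gives only a lower bound on how much change is being counted as volume.

```python
from typing import Dict, List, Tuple, Union

# Hypothetical transaction shape: input addresses plus (address, satoshis) outputs.
Tx = Dict[str, Union[List[str], List[Tuple[str, int]]]]


def value_excluding_self_spends(tx: Tx) -> int:
    """Satoshis sent to addresses that did not also fund the transaction."""
    funding_addresses = set(tx["inputs"])
    return sum(
        value for address, value in tx["outputs"]
        if address not in funding_addresses
    )


# 1 BTC input from address A: 0.3 BTC paid to B, 0.7 BTC returned to A as change.
tx = {
    "inputs": ["A"],
    "outputs": [("B", 30_000_000), ("A", 70_000_000)],
}
print(value_excluding_self_spends(tx))  # 30000000
```

Counted naively, this transaction moves 1 BTC; with the filter it moves 0.3 BTC, which is the amount that actually changed hands.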

All we can say for now is that there is an increase in usage from numerous sources, but not likely from on-chain retail commerce, which has remained flat for about a year.

This is still a dynamic space and it may be months or even years before we are able to fully identify all the major contributors to volume changes.

Special thanks to dexX7, Raffael Danielli, Michael Dann, Dave Hudson, David Lancashire, TM Lee, Jonathan Levin, Atif Nazir, Organ of Corti, Jorge Stolfi, Ernie Teo and Sidney Zhang for their constructive feedback and time.
