BUIP043: (closed) Exploring the Bitcoin Network

S Nadarajah

New Member
Dec 22, 2016
10
5
Project Title: Exploring the Bitcoin Network.

Bitcoin Address: 12w4PcnfSC13yPvTxDDATjeYJ8qj742Bhb

Motivation: Over the past few months there has been a notable growth of interest in Bitcoin. For example, the UK government is considering paying out research grants in Bitcoin; an increasing number of IT companies are stockpiling Bitcoin to defend against ransomware; growing numbers in China are buying into Bitcoin and seeing it as an investment opportunity. Perhaps most significantly, the Chair of the Board of Governors of the US Federal Reserve has been encouraging central bankers to study new innovations in the financial industry. In particular, the Chair expressed a need to learn more about financial innovations, including Bitcoin, blockchain, and distributed ledger technologies. With this recent surge in interest, we believe that now is the time to start studying Bitcoin as a key piece of financial technology, and not just as a novelty.

Objectives: Expand on existing research and analysis of the Bitcoin network. The focus will be on three main objectives: i) analyse the distribution of the Bitcoin network - distribution of degrees, transaction frequency, transaction sizes, costs, scalability, etc.; ii) investigate extreme value and quantile regression methods that could be used to detect fraudulent transactions and anomalies in the network by examining characteristics of Bitcoin addresses; iii) analyse speculative behaviour in the Bitcoin network, Bitcoin transactions, and financial markets.

Project Duration: We expect the project to be completed within 12 months.

Project Team: Dr Saralees Nadarajah, Senior Lecturer, School of Mathematics, University of Manchester, M13 9PL, UK; Dr Stephen Chan, EPSRC Doctoral prize Fellow, School of Mathematics, University of Manchester, M13 9PL, UK; Jeffrey Chu, PhD research student, School of Mathematics, University of Manchester, M13 9PL, UK.

Summary of Current Work: We have already performed a preliminary statistical analysis of the exchange rate of Bitcoin against the US dollar, using a wide range of parametric distributions known in finance. We believe it is the most comprehensive analysis using parametric distributions for any kind of exchange rate data. This was motivated by the fact that there exist many studies investigating the best fitting distributions for the exchange rates of major currencies; however, there are none (that we are aware of) for the exchange rate of Bitcoin. In addition, the exchange rate of Bitcoin versus the US dollar appears to behave very differently to the exchange rates of other major currencies. Using daily Bitcoin exchange rate data from September 2011 to May 2014 (approximately two and a half years) from the Bitstamp exchange, our results showed that the generalised hyperbolic distribution gave the best fit to the data, which is consistent with the observation that Bitcoin exchange rates have somewhat complicated dynamics. Given our preliminary results, we believe that there is great scope to extend this analysis through more complex mathematical and computational methods.

Description of Activities: To achieve the objectives stated above, we will complete the following activities:
  • Review existing literature on approaches to scaling of Bitcoin.
  • Collect the complete Bitcoin network data from its inception to present. This should include all Bitcoin addresses and transactions since Bitcoin was created.
  • Collect the data on the cost of setting up a bitcoin node and the ongoing running and maintenance costs.
  • Sort and clean data, creating specific data sets containing the degrees of each Bitcoin address, number of transactions in and out of each address, the sizes of all transactions etc.
  • Fit a wide range of parametric distributions to each of the data sets and find the most appropriate fit.
  • Analyse and estimate the cost of running a node for different periods in Bitcoin's history (Expected to finish by month 3-4).
  • Analyse the Bitcoin transaction graph, and model the number, size and time of transactions, and the price of Bitcoin to examine whether individuals buy into Bitcoin to profit from its high volatility.
  • Prediction and forecasting of the costs of running nodes in the future, based on the results of the analysis in the above tasks (Expected to finish by month 4-5).
  • Review existing literature on anomaly detection, and its application to financial markets.
  • Analyse the Bitcoin network graph to identify any patterns in transactions which may indicate money laundering behaviour --- e.g. when one user in the network performs transactions with many other users, who then each perform transactions with another common node.
  • Examine Bitcoin addresses with significantly different characteristics from others: transaction frequency or number of times an address pays or receives Bitcoins over a fixed time period; node degree or the number of users an address performs transactions with; transaction volume or the value of the transactions that an address is involved in.
  • If these characteristics are significantly different then they could indicate anomalies, and could give an indication of the overall health of the Bitcoin system and whether there are attacks on the Bitcoin network (Expected to finish by month 5-8).
  • Investigate appropriate methods in operational research which can be used to determine the optimal time to scale in the context of price. Also use quantile regression methods to analyse the transactional quantiles and provide an indication of when to scale.
  • Spatial analysis to study nodes globally and in regions of particular interest (Expected to finish by month 8-12).
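
The money-laundering pattern mentioned in the activities above (one user pays many users, who each then pay a common node) can be illustrated with a small search over a directed payment graph. This is a rough sketch only: the payment list, the function name `fan_out_fan_in` and the threshold of three intermediaries are all hypothetical, and the sketch ignores transaction timing and amounts, which a real analysis would need.

```python
from collections import defaultdict

# Hypothetical directed payments (sender, receiver); not real Bitcoin data.
payments = [
    ("S", "m1"), ("S", "m2"), ("S", "m3"),
    ("m1", "T"), ("m2", "T"), ("m3", "T"),
    ("S", "x"), ("x", "y"),
]

def fan_out_fan_in(edges, min_intermediaries=3):
    """Find (source, sink) pairs linked through at least
    `min_intermediaries` distinct intermediate addresses."""
    successors = defaultdict(set)
    for sender, receiver in edges:
        successors[sender].add(receiver)

    hits = {}
    for source, mids in successors.items():
        via = defaultdict(set)  # sink -> intermediaries that forward to it
        for mid in mids:
            for sink in successors.get(mid, ()):
                if sink != source:
                    via[sink].add(mid)
        for sink, intermediaries in via.items():
            if len(intermediaries) >= min_intermediaries:
                hits[(source, sink)] = sorted(intermediaries)
    return hits

suspicious = fan_out_fan_in(payments)
print(suspicious)
```

A match is only a candidate for closer inspection, since ordinary activity (for example, payouts that are later consolidated at an exchange) can produce the same shape.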

Anticipated Challenges and Uncertainties:
  • We require the latest Bitcoin network data; however, we will need to determine a cut-off point, as new Bitcoin transactions are added constantly.
  • Obtaining the whole Bitcoin data set may take significant time, in addition to modelling and constructing the Bitcoin network from the data. Analysing this graph will be time consuming due to the size of the graph and data.
  • Modelling the Bitcoin transactions and price of Bitcoin will require the analysis of high frequency Bitcoin transaction data, as it is assumed that trading of Bitcoin for profit will be similar to that of traditional financial securities.
  • Obtaining and estimating the exact cost of running a node may be complex, as some costs such as time, effort, and utility may not have specifically defined values. These values themselves may need to be estimated based on real data.

Budget: The total amount requested for the proposed work is $15,000. We anticipate that results produced by this funding will be published in relevant leading journals. $1000 will cover the potential publication fees for journals. We will attend and present our results at one UK conference. The corresponding costs for the UK conference are 3 x $300 for travel; 3 x $200 for accommodation/subsistence; 3 x $300 for registration fees. The remaining $11,600 would cover the compensation for the research time of Research Assistants (RA) over a 12-month academic period. The main objectives of the RA will be to obtain all the relevant Bitcoin data and conduct the analysis and estimations. I will oversee the project management and will be involved in the research itself. The total compensation for the RAs is costed at the basic salary, starting level for this grade.
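
The line items above can be checked against the requested total with a short calculation. This is only a sketch: the variable names are illustrative, and reading "3 x" as three conference attendees is an assumption.

```python
# Budget line items as stated in the proposal (USD).
ATTENDEES = 3  # assumption: "3 x" refers to three people attending the conference

line_items = {
    "publication fees": 1000,
    "conference travel": ATTENDEES * 300,
    "conference accommodation/subsistence": ATTENDEES * 200,
    "conference registration": ATTENDEES * 300,
    "research assistant compensation": 11600,
}

conference_total = sum(
    amount for name, amount in line_items.items() if name.startswith("conference")
)
total = sum(line_items.values())

for name, amount in line_items.items():
    print(f"{name:40s} ${amount:>7,}")
print(f"{'total':40s} ${total:>7,}")
```

The conference items come to $2,400, and together with the publication fees and RA compensation the items sum to the requested $15,000.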

Impact: We believe that our proposed work would benefit academics and also the Bitcoin community (miners and industry). We feel that our work could contribute to discussions on the scalability of Bitcoin Unlimited from the perspective of the cost of running Bitcoin nodes, identifying the optimal time for scaling, fraud detection and many other factors.

[edit renaming the BUIP to a temporary name until sponsorship is achieved]
 

solex

Moderator
Staff member
Aug 22, 2015
1,558
4,693
Hi @S Nadarajah
Thanks for your proposal. It sounds good. A couple of things:
  • A BU member is required to sponsor this BUIP for it to progress
    (I am amending the number until this happens).
  • A minimum two-week period of discussion is required (as per the BU Articles) and this will commence once it has sponsorship.
Assuming the above is achieved then it will go up for vote by the BU membership.
 
Reactions: Bloomie

Peter R

Well-Known Member
Aug 28, 2015
1,398
5,595
@solex: this is a properly formatted proposal as per our requirements here, so I am happy to sponsor it. That said, my sponsorship should not be taken as support for the project, just that the proposal is properly formatted and should go to voting.

Two questions for @S Nadarajah:

1. Our grants are specifically for projects that advance Satoshi Nakamoto's vision for Bitcoin as peer-to-peer electronic cash. I think your proposal is interesting and useful, but I'm not certain that it really forwards Bitcoin Unlimited's agenda. We're trying to gain users and increase the block size limit so that the network can support these additional users. How does your research help that objective?

2. How do you intend to disseminate your research? It is important that there is some follow-up so that Bitcoin Unlimited members get some tangible deliverable (e.g., a research paper acknowledging BU as a funder).
 
Reactions: Windowly and solex

S Nadarajah

New Member
Dec 22, 2016
10
5
Thank you for approving our submission.

1) Bitcoin is known not to satisfy features of a traditional financial time series, such as the efficient market hypothesis, random walk hypothesis, martingale difference hypothesis, etc. With these hypotheses not being satisfied, many of the known models in the academic community cannot be applied to Bitcoin. As part of the project, we aim to propose transformations which make Bitcoin satisfy these hypotheses.

One of the focuses was for proposals to improve communication with miners and industry leaders, or advance scientific discourse with the academic community. Our scientific research and results will be communicated to academics, miners and industry leaders through the publication of papers and software packages and by presenting the work at conferences. We believe that the results from this proposal can further assist in spreading awareness and information about Bitcoin not only to academics but also to business and industry. We feel that this can, indirectly, help to draw in new users, both individual and commercial. This can in turn help to encourage investment in further research and development of the Bitcoin technology, especially from larger corporations - for example, adjustments to the Bitcoin block size limit.


2) The results of research produced by the funding will be published in leading journals in statistics (Journal of the Royal Statistical Society B, Annals of Statistics, Biometrika and the Journal of the American Statistical Association) and in leading journals in finance (Mathematical Finance, Finance and Stochastics, Econometrica, Journal of Finance, Journal of Financial Economics, Review of Economic Studies and the Journal of Econometrics).

The results will also be presented at international conferences like the Extreme Value Theory Conference, Royal Statistical Society Conference and Joint Statistical Meetings. Funding for the research from Bitcoin Unlimited will of course be clearly stated and acknowledged, as is standard practice for academic publications.
 

theZerg

Moderator
Staff member
Aug 28, 2015
1,012
2,327
Looking at your objectives:

In 1, are you referring to the Bitcoin full node network, the transaction graph, or something else?

In 2, what are "fraudulent transactions" in the Bitcoin network?

In 3, what do you mean by "speculative behaviour in the Bitcoin network"? How can you identify speculative vs. non-speculative behavior?


Would you be receiving other funding to support this work?
 

S Nadarajah

New Member
Dec 22, 2016
10
5
In 1, we would be considering a graph theoretic representation of the Bitcoin network where nodes in the graph represent each unique Bitcoin address. However, we would construct multiple (and different) graphs, where pairs of nodes are connected by an edge if a specific condition is satisfied. For example, in the most basic graph, pairs of nodes may be connected by an edge if two Bitcoin addresses have engaged in a transaction (the identification of which of the two is the sender/receiver can be overlooked in the most basic case). In this instance, we would be using data from the transaction graph.
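
As a rough illustration of the most basic graph described above, here is a sketch that builds an undirected address graph from a toy transaction list and computes each address's degree. The transactions and function names are hypothetical, and treating each transaction as a single sender-receiver pair is a simplifying assumption (real Bitcoin transactions can have several inputs and outputs).

```python
from collections import defaultdict

# Hypothetical toy transactions: (sender_address, receiver_address, amount_btc).
transactions = [
    ("addrA", "addrB", 0.5),
    ("addrA", "addrC", 1.2),
    ("addrB", "addrC", 0.3),
    ("addrC", "addrA", 0.1),
    ("addrD", "addrA", 2.0),
]

def build_undirected_graph(txs):
    """Connect two addresses by an edge if they have ever transacted,
    ignoring which of the two was the sender."""
    neighbours = defaultdict(set)
    for sender, receiver, _amount in txs:
        if sender == receiver:
            continue  # ignore self-payments in this basic graph
        neighbours[sender].add(receiver)
        neighbours[receiver].add(sender)
    return neighbours

def degree_distribution(neighbours):
    """Map each degree to the number of addresses having that degree."""
    counts = defaultdict(int)
    for nbrs in neighbours.values():
        counts[len(nbrs)] += 1
    return dict(counts)

graph = build_undirected_graph(transactions)
degrees = {addr: len(nbrs) for addr, nbrs in graph.items()}
print(degrees)
print(degree_distribution(graph))
```

Other graphs mentioned above (directed, or with a different edge condition) would change only the rule inside `build_undirected_graph`.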

In 2, "fraudulent transactions" should perhaps be included as part of anomaly detection in general. This could be defined as trying to identify patterns in activity in the Bitcoin network (transactions) which may be suspicious, and are significantly different from some threshold or "normal" activity level (determined from the whole transaction data). For example, with respect to magnitude/number/speed of transactions.
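
A minimal sketch of the threshold idea, using a plain empirical percentile rather than the extreme value methods named in the proposal. The per-address counts, the 95% cut-off and the function name are all hypothetical choices for illustration.

```python
import statistics

# Hypothetical number of transactions per address over a fixed time window.
observed = [3, 5, 4, 6, 2, 5, 4, 3, 7, 5, 4, 6, 3, 5, 4, 2, 6, 5, 4, 250]
tx_counts = {f"addr{i}": count for i, count in enumerate(observed)}

def flag_anomalies(counts, upper_pct=95):
    """Flag addresses whose activity exceeds the empirical
    `upper_pct` percentile of all addresses."""
    cut_points = statistics.quantiles(counts.values(), n=100)  # 99 cut points
    threshold = cut_points[upper_pct - 1]
    flagged = sorted(addr for addr, c in counts.items() if c > threshold)
    return threshold, flagged

threshold, flagged = flag_anomalies(tx_counts)
print(f"threshold: {threshold:.2f}, flagged: {flagged}")
```

With so few observations the percentile itself is pulled upward by the outlier, which is one reason a dedicated tail model may be preferable on real data.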

In 3, we would define speculative behaviour as any significant indication or evidence of users attempting to profit from an increase in the price at some point in the future, based on historic Bitcoin price and transaction data. We acknowledge that distinguishing between speculative and non-speculative behaviour can be subjective. Some possible methods include (but are not limited to) identifying correlations and principal component analysis.

Lastly, we would not be receiving any additional funding from external sources.
 

freetrader

Moderator
Staff member
Dec 16, 2015
2,806
6,088
Budget: The total amount requested for the proposed work is $3500.
Are you able to put effort estimates to the proposed activities, or otherwise indicate how the amount requested was arrived at?

Reason I ask is because it seems that the list of proposed activities is quite extensive and the duration (12 mo) is sizeable too. Yet the amount quoted is small enough to make me wonder how it is going to be able to cover the involved efforts.
 

S Nadarajah

New Member
Dec 22, 2016
10
5
The $3500 would cover the compensation for the research time of a Research Assistant (RA), over a 12-month academic period. The main objectives of the RA will be to obtain the Bitcoin data and conduct the main numerical analysis – this will include simulations and analysis of the real data. The total amount is based on the RA spending around 4 to 5 hours per week working on this project.

Out of interest, could this budget total be adjusted (if necessary) prior to the proposal going to vote?
 

freetrader

Moderator
Staff member
Dec 16, 2015
2,806
6,088
Thanks for clarifying.

IMO the research assistant(s) involved should be named along with their affiliated academic institution(s).
I put plural in case the RAs change over the course of the project.
 

S Nadarajah

New Member
Dec 22, 2016
10
5
The research assistant involved in this project is:

Dr Stephen Chan, EPSRC Doctoral prize Fellow, School of Mathematics, University of Manchester, M13 9PL, UK.

Regarding my previous posts, out of interest, could this budget total be adjusted to include additional costs such as publication fees for journals, hardware costs for the complex statistical simulations, and conference registration fees to present our results?
 
Reactions: freetrader

Peter Tschipper

Active Member
Jan 8, 2016
254
357
@S Nadarajah I'm curious how you intend to gather your data. Are you working with just the blockchain data or do you also intend to gather real time node transaction traffic?
 

S Nadarajah

New Member
Dec 22, 2016
10
5
We intend to obtain bitcoin transaction data for all transactions occurring prior to a chosen cut-off date.
We are aware that some academic researchers have obtained this bitcoin transaction graph data directly, whilst others have obtained the bitcoin blockchain and extracted the transaction data using various software tools. We believe that the second method is more feasible.
 
I really have trouble understanding how this relates to Bitcoin Unlimited. How does it contribute to fulfilling the Unlimited roadmap?

Don't get me wrong. Cooperating with academic research is a great idea, and the very existence of this BUIP demonstrates how far Bitcoin Unlimited has come. Maybe this is a great chance to demonstrate that Bitcoin Unlimited has become a hub for Bitcoin research and high-level debate.

But still, I'd rather see the research incorporated in the scaling idea of Unlimited.

Maybe this idea is a starter:

Analyse the Bitcoin transaction graph, and model the number, size and time of transactions, and the price of Bitcoin to examine whether individuals buy into Bitcoin to profit from its high volatility (Expected to finish by month 11-12).
If you get a research question which is closer to the grand topic of scaling, I'd support your BUIP. As @Peter_R and I agreed earlier, there is a severe lack of qualified academic research on on-chain scaling. How costly is it to run a node? How do the costs increase, if Bitcoin is allowed to follow the trajectories of the past? What are the factors that should be taken into account? How do they develop? And so on. It would also be possible to set scaling into context with price, or with transactional behavior (of exchanges or dnm). Maybe the work of some KIT researcher on coin selection and UTXO / fees could also be a good starting point for cooperation.

However, if you want to do more general research into bitcoin statistics, as is widely done by commercial websites like blockchain.info or oxt.me and by a variety of academics and cryptographers, I don't see a chance of supporting this BUIP.
 
Reactions: freetrader

S Nadarajah

New Member
Dec 22, 2016
10
5
Our analysis of the Bitcoin transaction graph, and the modeling of the number, size and time of transactions, and the price of Bitcoin to examine whether individuals buy into Bitcoin to profit from its high volatility, will be used to help us understand current market trends.

How costly is it to run a node?
· There are many factors to consider when valuing the cost of running a node. The main costs are: time cost - initial learning curve (estimated as weeks/months); installation/configuration/initial sync cost – time, bandwidth, CPU (estimated as hours or weeks); on-going running costs – bandwidth, CPU, RAM, hard drive; maintenance cost – time to troubleshoot and upgrade (estimated as hours/month).

Different approaches to price up the cost of running a node are:
· To estimate the cost by summing all block subsidies and transaction fees.
· To analyze the cost of setting up a node in terms of physical hardware and the running costs in terms of power and utility over different time periods of Bitcoin’s history.
However, the cost of running nodes may be more complex – i.e. cost of hardware, specification of hardware, time, effort etc. Therefore, there may exist other parameters that are harder to validate but which should be included in the costing.
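
A minimal sketch of how the cost components listed above might be combined into a single monthly figure. Every number is a hypothetical placeholder rather than measured data, and the class, field and function names are illustrative only; amortising one-off costs over the hardware lifetime is an assumption, not a method stated in the proposal.

```python
from dataclasses import dataclass

@dataclass
class NodeCostInputs:
    hardware_cost: float                # one-off purchase, USD
    hardware_lifetime_months: int       # period over which one-off costs are spread
    setup_hours: float                  # install, configuration and initial sync
    maintenance_hours_per_month: float  # troubleshooting and upgrades
    hourly_rate: float                  # value placed on the operator's time, USD
    bandwidth_per_month: float          # USD
    electricity_per_month: float        # USD

def monthly_cost(c: NodeCostInputs) -> float:
    """Amortise one-off costs over the hardware lifetime and add running costs."""
    one_off = c.hardware_cost + c.setup_hours * c.hourly_rate
    amortised = one_off / c.hardware_lifetime_months
    running = (
        c.maintenance_hours_per_month * c.hourly_rate
        + c.bandwidth_per_month
        + c.electricity_per_month
    )
    return amortised + running

# Placeholder inputs, chosen only to make the sketch runnable.
example = NodeCostInputs(
    hardware_cost=600.0,
    hardware_lifetime_months=36,
    setup_hours=10.0,
    maintenance_hours_per_month=2.0,
    hourly_rate=20.0,
    bandwidth_per_month=15.0,
    electricity_per_month=5.0,
)
example_monthly = monthly_cost(example)
print(f"estimated monthly cost: ${example_monthly:.2f}")
```

Harder-to-value items such as the learning curve could enter as extra hours, which makes the result sensitive to the assumed hourly rate.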

In theory, costs will increase – i.e. greater computational resources will be required to validate and relay larger blocks, which may lead to fewer nodes (and vice versa). Conversely, if the transaction volume increases significantly due to increased adoption, then there may be an increase in the willingness to run nodes (even with higher running costs). If transaction sizes increase, then there is also an increased incentive to operate in a trustless manner.

Given our initial timeline of activities, the data for this aspect would also be obtained in the initial stages of the research.

How do the costs increase, if Bitcoin is allowed to follow the trajectories of the past?
Given the findings from the previous two tasks, we can formulate and fit projections of possible paths that the costs may follow in the future using the trends of the past. (Statistical theory such as extreme value analysis and distribution theory can be incorporated in this section to take into account extreme events that could influence the cost.)
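
One simple way to project a past trend forward is a log-linear least-squares fit, sketched below. The cost series is synthetic (constructed to grow 20% per year) and the names are illustrative; this is a stand-in for the idea of trend projection and does not include the extreme value analysis mentioned above.

```python
import math

def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Synthetic monthly node cost (USD) per year; not real data.
years_since_start = [0, 1, 2, 3, 4]
monthly_cost_usd = [10.0, 12.0, 14.4, 17.28, 20.736]

# Fit a straight line to log(cost), i.e. assume exponential growth.
slope, intercept = fit_line(
    years_since_start, [math.log(c) for c in monthly_cost_usd]
)

def projected_cost(year):
    """Extrapolate the fitted exponential trend to a given year index."""
    return math.exp(intercept + slope * year)

annual_growth = math.exp(slope) - 1
print(f"implied annual growth: {annual_growth:.1%}")
print(f"projected cost in year 5: ${projected_cost(5):.2f}")
```

Extrapolations of this kind assume the past growth rate persists, so they are best read as one scenario among several.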

What are the factors that should be taken into account?
In addition to factors such as the trends of the number, size and time of transactions etc., to further the research of an improved scaling approach, we have to identify some of the key metrics of the Bitcoin system that exist today:
· Maximum throughput.
· Latency.
· Bootstrap time.
· Cost per confirmed transaction (CPCT).
· Bandwidth provisioning and network topology.
· Network propagation rate – overall rate at which blocks propagate to, for example, 50%, 75%, 90% etc. of nodes.
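
The propagation metric in the list above can be computed from per-node arrival delays, as in the following sketch. The delays are made-up values and `time_to_reach` is a hypothetical helper; how such delays would be measured in practice is outside this sketch.

```python
def time_to_reach(arrival_seconds, percent):
    """Smallest delay by which at least `percent`% of nodes have the block."""
    ordered = sorted(arrival_seconds)
    # Integer ceiling of percent * n / 100, avoiding floating-point rounding.
    needed = (percent * len(ordered) + 99) // 100
    return ordered[max(needed, 1) - 1]

# Hypothetical delays (seconds after the block was first seen), one per node.
delays = [0.2, 0.4, 0.5, 0.9, 1.1, 1.3, 2.0, 3.5, 6.0, 14.0]

propagation = {pct: time_to_reach(delays, pct) for pct in (50, 75, 90)}
for pct, seconds in propagation.items():
    print(f"{pct}% of nodes reached within {seconds} s")
```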

It would also be possible to set scaling into context with price, or with transactional behavior (of exchanges or dnm).
Methods of Operational Research will be utilized when researching the optimal time to scale in the context of price. They can therefore assist us in determining limits or thresholds in pricing. In relation to transactional behavior and exchanges, quantile regression methods will be utilized to analyze the transactional quantiles and provide an indication of when to scale.
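
For readers unfamiliar with quantile regression, the sketch below shows the check (pinball) loss that it minimises, in the simplest intercept-only case where the fitted value is just a sample quantile. The transactions-per-block figures and function names are hypothetical; a full quantile regression would add covariates such as time or price.

```python
def pinball_loss(residual, tau):
    """Check loss: weight tau on under-predictions, (1 - tau) on over-predictions."""
    return tau * residual if residual >= 0 else (tau - 1) * residual

def fit_constant_quantile(values, tau):
    """Intercept-only quantile regression: pick the observed value
    that minimises the total pinball loss."""
    candidates = sorted(set(values))
    return min(
        candidates,
        key=lambda q: sum(pinball_loss(y - q, tau) for y in values),
    )

# Hypothetical transactions-per-block observations.
tx_per_block = [200, 300, 300, 400, 500, 600, 800, 900, 4000]

median_level = fit_constant_quantile(tx_per_block, 0.5)
upper_level = fit_constant_quantile(tx_per_block, 0.9)
print(median_level, upper_level)
```

Tracking an upper quantile rather than the mean is what lets the method focus on peak demand, which is the quantity most relevant to a capacity limit.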


These additional tasks will be complemented by a selection of the original proposed tasks as these will fuel and assist our final goal.

We will edit and update the initial proposal in the first post as soon as possible.

We are positive that this research will strengthen the cross disciplinary link between academics, miners and industry.
 
Thank you for answering my request. As I see it, you have a good overview of the topic of on-chain scaling. Do you think it is possible to conduct studies about it, maybe with hundreds of nodes with different systems all over the world?

I'm quite optimistic that a proposal from you about such studies will get approved by the Bitcoin Unlimited community. I even think it would be worth considering establishing a hub for on-chain scaling studies, which could be attractive for other scientific institutions, bitcoin institutions, miners and the industry in a wider scope.
 

S Nadarajah

New Member
Dec 22, 2016
10
5
I believe that it is possible to conduct such studies, subject to finding a suitable source for collecting this data. Such studies are very interesting and important as one can use spatial analysis to study nodes all over the world or in regions of particular interest.

Establishing a hub for onchain scaling studies will definitely assist and attract a wider scientific community. Academic institutions would also be very eager to collaborate with the Bitcoin community.
 

jbreher

Active Member
Dec 31, 2015
166
526
While the revision to emphasize building a cost model for node operation is a step in the right direction, I am not quite seeing how this research serves the Bitcoin Unlimited mission. Is there a possibility of modeling that compares node operational costs as a function of block size?
 
Reactions: lunar and Norway