BUIP126: (passed) Planet-on-a-LAN stress test model network

How much should we allocate to this project?

  • $1000: 0 votes (0.0%)
  • $3000: 0 votes (0.0%)
  • $10,000: 3 votes (60.0%)
  • $30,000: 2 votes (40.0%)
  • $100,000: 0 votes (0.0%)

Total voters: 5

jtoomim

BUIP126: Planet-on-a-LAN stress test model network
Submitted by: Jonathan Toomim
Date: 2019/5/17

Summary

I'd like to set up a LAN-based test apparatus for simulating a planet-wide network of nodes, using the Linux netem module to add network delays, packet loss, and bandwidth caps as needed. This will provide a controlled environment in which developers can test out new code or new configurations quickly and efficiently, and rapidly collect and collate performance data.
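To give a concrete flavor of what netem configuration looks like, here is a minimal sketch in Python (the interface name and the delay/loss/rate values are illustrative placeholders, not proposed settings):

```python
import subprocess

def apply_netem(dev="eth0", delay_ms=100, jitter_ms=10,
                loss_pct=1.0, rate_mbit=50):
    """Emulate a WAN link on `dev` (requires root). Equivalent to e.g.:
      tc qdisc add dev eth0 root netem delay 100ms 10ms loss 1% rate 50mbit
    """
    subprocess.run(
        ["tc", "qdisc", "add", "dev", dev, "root", "netem",
         "delay", f"{delay_ms}ms", f"{jitter_ms}ms",
         "loss", f"{loss_pct}%",
         "rate", f"{rate_mbit}mbit"],
        check=True)

def clear_netem(dev="eth0"):
    """Remove the emulation so the link reverts to native LAN behavior."""
    subprocess.run(["tc", "qdisc", "del", "dev", dev, "root"], check=True)
```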

The proposed form of this network is a server rack full of used servers, all interconnected via Gigabit Ethernet, operated in my datacenter in Moses Lake, WA on a separate LAN with a dedicated 100 Mbps connection to the internet. These servers will likely cost around $800 each after adding SSDs and HDDs, and will come with 16 to 40 CPU cores and around 128 GB of RAM. Each physical machine can run multiple nodes (either in separate VMs or just on different ports of the same machine). For around $8,000, we can set up 10 servers and get around 40 to 100 network nodes.

Operating costs for a 10-machine setup will be about $60/month for the 100 Mbps internet connection plus around $80/month for electricity, plus some undetermined amount for labor from my employees. In contrast, renting 10 dedicated servers with these hardware specs would cost around $2,000/month from standard cloud hosting providers. Owning will be cheaper than renting if this network is in operation for more than 4 months (roughly $8,000 up front versus $2,000 − $140 ≈ $1,860/month saved, a break-even point of about 4.3 months).

I expect to be able to keep this rig in operation at least until April 1st, 2020, at which time it will likely need to be relocated.

These machines can either be set up with one static global IP address per machine or behind NAT with forwarded ports for SSH and other services. I personally prefer the NAT/port-forward approach, as I expect it will be cheaper, more scalable, and more easily relocated.
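As a rough sketch of what the NAT approach involves (all addresses and port numbers here are hypothetical), the gateway would carry one DNAT rule per forwarded service, along these lines:

```python
import subprocess

# Hypothetical addressing: internal servers at 10.0.0.11-10.0.0.20, with
# SSH exposed on gateway ports 2211-2220 (one external port per server).
SERVERS = [f"10.0.0.{10 + i}" for i in range(1, 11)]

for i, host in enumerate(SERVERS, start=1):
    ext_port = str(2210 + i)
    # Standard iptables DNAT: rewrite gateway:ext_port to host:22 ...
    subprocess.run(["iptables", "-t", "nat", "-A", "PREROUTING",
                    "-p", "tcp", "--dport", ext_port,
                    "-j", "DNAT", "--to-destination", f"{host}:22"],
                   check=True)
    # ... and allow the forwarded traffic through.
    subprocess.run(["iptables", "-A", "FORWARD", "-p", "tcp",
                    "-d", host, "--dport", "22", "-j", "ACCEPT"],
                   check=True)
```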

Experiments will probably usually use regtest mode. Machines in this network can also run in testnet mode as part of an actual global network, but the 100 Mbps outbound pipe may become a bottleneck.
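For illustration, running several regtest nodes on one machine just means giving each its own datadir and port pair; a hypothetical launch script might look like:

```python
import os
import subprocess

# Launch several independent regtest nodes on one machine; each gets its
# own datadir and port pair (paths and port ranges are illustrative).
NUM_NODES = 10
for i in range(NUM_NODES):
    datadir = f"/data/node{i}"
    os.makedirs(datadir, exist_ok=True)
    subprocess.run(["bitcoind", "-regtest", "-daemon",
                    f"-datadir={datadir}",
                    f"-port={18500 + i}",      # p2p port, unique per node
                    f"-rpcport={18600 + i}"],  # RPC port, unique per node
                   check=True)
```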

For ssh, non-simulation admin tasks, and performance data collection, we can set up a separate LAN (using the secondary Ethernet ports) without any artificial latency or packet loss.

This test setup is intended to be made available to all developers in the BCH community, not just BU developers. Experiments using a heterogeneous mixture of node implementations will be allowed, as will experiments using a homogeneous set of non-BU nodes (e.g. bchd alone, or ABC alone). I expect BU's developers will make heavier use of it, though, as they tend to be more scaling-focused than those of the other implementations.

I will probably end up just buying this gear outright and setting it up with or without BU's financial support. However, if BU's membership wishes to reimburse me for the hardware costs by passing this BUIP, I will accept that support.

I am also interested in hearing how big our members think we should make this network. 10 machines? 5? 50? Small networks will outperform big ones, and many performance problems might not be apparent unless we have high per-node peer counts or high hop counts for total tx and block propagation paths. But also, mo nodes mo moneh. So. How much?

Budget

I would recommend setting a budget of $10k for this project to allow for some future expansion should we find the need for it (e.g. upgrading to 10 Gbps ethernet, or adding more servers).

Discussion of this project can happen in https://t.me/BCH_stress_testnet.


Jonathan Silverblood

I have absolutely no idea how much money BU has, which makes prioritization hard. Assuming BU has unlimited funds, however, I think somewhere in the $5k–$25k range is reasonable for setting up a public, somewhat permanent test network on real hardware.

Can the log files from the community mainnet stress test (or any other mainnet stress tests) be used to inform netem so that the virtualized nodes can be assigned virtual locations? (or, rather, can we use that information to set up one of many "standard" configurations to rapidly test different network conditions - mainnet similarity being one of them?)
 

jtoomim

@solex can you tell the public (or JS privately by DM) what the treasury holds?

The minimum data we need to configure netem is 2 numbers per connection:

Packet loss
Latency

Neither of those is given by bitcoind debug.log files. Both can be measured from pings -- e.g. sudo ping -i 0.01 -s 1472 targethost.com.
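As an illustration, a small helper could turn that ping output into the two netem inputs (a sketch assuming iputils ping on Linux; halving the RTT is a crude approximation of one-way delay):

```python
import re
import subprocess

def measure_link(host, count=100):
    """Estimate the two netem inputs (packet loss %, one-way latency ms)
    from ping's summary lines, e.g.:
      100 packets transmitted, 97 received, 3% packet loss, time 1010ms
      rtt min/avg/max/mdev = 11.148/13.961/19.266/1.834 ms
    Note: intervals below 0.2 s require root (hence the sudo above).
    """
    out = subprocess.run(
        ["ping", "-c", str(count), "-i", "0.01", "-s", "1472", host],
        capture_output=True, text=True).stdout
    loss_pct = float(re.search(r"([\d.]+)% packet loss", out).group(1))
    avg_rtt = float(re.search(r"= [\d.]+/([\d.]+)/", out).group(1))
    # Halving the round-trip time is a crude stand-in for one-way delay.
    return loss_pct, avg_rtt / 2
```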

Ideally, we would also have some additional information:

Latency variability
Packet loss variability vs time of day
Packet loss autocorrelations (do losses come in clusters?)
Maximum route bandwidth

The first three are much harder to measure. Including them in the network model will also make the regtest results less consistent, so it might be desirable to omit them most of the time in order to improve benchmark precision.

Yes, setting up different network conditions rapidly to test them is a central goal for this project. I want to be able to compare 1% packet loss against 10% packet loss against 0% by running a script each time with a different parameter.
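A hypothetical driver for that kind of sweep might look like this (run_benchmark is a placeholder for whatever experiment is being measured; interface name is illustrative):

```python
import subprocess

def run_benchmark():
    # Placeholder: launch the spam generator / propagation test and
    # return whatever metric this experiment measures.
    ...

for loss_pct in (0, 1, 10):
    # Reset, then apply the next loss rate (no qdisc at all for 0%).
    subprocess.run(["tc", "qdisc", "del", "dev", "eth0", "root"],
                   check=False)
    if loss_pct:
        subprocess.run(["tc", "qdisc", "add", "dev", "eth0", "root",
                        "netem", "loss", f"{loss_pct}%"], check=True)
    print(f"loss={loss_pct}%:", run_benchmark())
```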
 

solex

@jtoomim @Jonathan Silverblood
It is a matter of historical record that BU is well financed. Funds available are in BTC, BCH & BSV.
Obviously we want this to be spent wisely on scaling Bitcoin; we have focused primarily on BCH for the last two years, and now solely on BCH, following the mandate from the recent membership vote.

The test-net proposal for acquiring accurate metrics sounds good, and we can certainly put that to membership vote. BU has always welcomed other testers to its own testnet. I think we can do this in phases starting small and growing it as interest warrants.

One thing we found with the GTI project was that interest from developers outside BU never built up enough to gain its own momentum. This was even after the world's first 1 GB block was generated and propagated. Our original idea was to improve scaling for large blocks and then eventually step back so that it would continue on community usage alone.
 

solex

Assigning BUIP126 as the reference number.
@jtoomim Can you update the BUIP with a header and formalise it somewhat into sections e.g. background, objective, budget...
 

jtoomim

I ordered four servers today. They were 40-core, 64 GB machines, and cost about $530 each. I also bought four 1.6 TB PCIe (not M.2 or U.2) SSDs to go along with them. Total cost was $2,919.52. I may also need some more gigabit network switches, and will definitely need a dedicated 100 Mbps line (and maybe 4 public IP addresses) for these machines. If BU decides that it wants to reimburse me for these things, that would be neat. If not, that's fine too.
 

solex

@jtoomim
Well done on the work to achieve such a high-throughput bench-test result!
We can't presume the decision of the BU membership on BUIPs, and this one is still in draft. I have just updated the thread title with its reference number, but it still needs the formatting and clarification improvements mentioned further up; it can then be included in the next vote.

Your goals here are closely aligned with the existing and larger GTI initiative (BUIP065), which is still an active project and will soon get renewed attention from the BU developers. Please do liaise with @sickpig and @theZerg about extending your LAN to include some of the GTI network servers for wider testing. In my opinion, covering this setup cost is within the scope of the GTI, so we can discuss that. For this to run as an ongoing separate project, however, we would need the BUIP to be passed for BU funding.
 

jtoomim

I've ordered one port of 1 Gbps connectivity plus 4 static IPs for these machines. I'm expecting this to cost about $100/month. (This is good news -- I was expecting only 100 Mbps would be available. Our city's fiber network was apparently upgraded and nobody told me.)

I think I'll order a couple gigabit switches plus some big HDDs as well.
 

jtoomim

I think I prefer the separate-project scheme. This setup is intended more as an ongoing resource for developers, for performance testing of new code or for CI testing. While these servers could also be used for GTI-type testing, I intend for them to have their own life and usage.
 

solex

Right. In that case this BUIP will be put up for the next vote as a separate funded project.
It does need a clear budget section when it is re-formatted.
 

jtoomim

Is there a limit on the maximum age of a post that the author can edit? I don't see any Edit buttons on the OP, or any of my posts in this thread older than 1 week.
 

solex

Hmm. The protective time limit for editing earlier posts is a drawback for the BUIPs. For the moment, can you just paste an updated proposal in this thread?
 

jtoomim

I'll do that, @solex.

There's been a setback on the server acquisition. According to USPS, they were delivered on Monday at 10:40am. Supposedly, USPS left them "In/at the mailbox" -- as if they'd fit. When one of my employees showed up at 2pm, they weren't there. Either they had been stolen or they had not been delivered after all. We're following up now.
 

solex

@jtoomim That's very unfortunate. Hopefully, USPS can locate the boxes.
 

jtoomim

It turns out that they shipped me a "free gift" and posted the tracking number for that on eBay, but hadn't actually shipped the servers themselves. They delayed shipping the servers because they supposedly found an issue with one or more of the CPUs. They probably sent the free gift so they could stick a tracking number into eBay's system and evade eBay's penalties for shipping delays.
 

jtoomim

The four servers have all arrived. One of the servers had a bad memory slot, but we were able to move the RAM around and get it to work. We weren't able to get the PCIe SSDs to work, so I have ordered some SATA SSDs via Purse.io to replace them. One of those SATA SSDs has arrived, but the other three are still en route. These new SSDs are 960 GB Samsung 883 DCTs -- basically, the fastest SATA SSDs you can get. Each cost a little under $200. Total HW cost is now around $900 per machine unless I can get a refund on the PCIe SSDs.

We got a gigabit port for these machines plus 4 static global IPs. Unfortunately, it turns out that the gigabit port shares bandwidth with the 3 other ports we have in use here, so it's not dedicated 1 Gbps. Oh well. It should still be fast. Gigabit port cost is about $90/month. Static IPs are probably $5/mo or something.

The controller software has been coming along nicely. I don't have the most recent commit of my code on github yet, but you can still get an idea of how it works by checking it out:

https://github.com/jtoomim/remote_stress/tree/master/src

So far, with this code I've been able to set up a test network with 2 old quad-core machines, connect 8 nodes in a ring, generate spam at about 8,000 tx/sec, and validate transactions at about 2,500 tx/sec per node using one 3.5 GHz or 3.6 GHz core per node. Most of my tests so far have been on a LAN. I've done a little bit of stuff between my home desktop and two servers I have in my DC, but it was glitchy because of NAT screwing connections up, and it was slow because I haven't yet pipelined, asynced, or threaded all of the RPC requests that I need. Not sure if I will do that optimization or not; I might just do everything on a LAN with a low-latency RPC connection and high-latency/lossy p2p connection.
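For reference, the ring wiring itself needs nothing more exotic than addnode RPC calls. A stripped-down sketch (this is not the actual remote_stress code; the datadirs and port numbering match the hypothetical launch sketch earlier in the thread):

```python
import subprocess

NUM_NODES = 8

def cli(i, *args):
    """Call bitcoin-cli against node i (cookie auth comes from its datadir)."""
    return subprocess.run(
        ["bitcoin-cli", "-regtest", f"-datadir=/data/node{i}",
         f"-rpcport={18600 + i}", *args],
        capture_output=True, text=True, check=True).stdout

# Connect node i to node (i+1) mod N, forming a ring.
for i in range(NUM_NODES):
    peer = f"127.0.0.1:{18500 + (i + 1) % NUM_NODES}"
    cli(i, "addnode", peer, "onetry")
```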

For some reason, block propagation appears to be about 3x faster with the networked setup than it was with the single-machine stresstest script. I haven't been able to figure out why, but it appears to be true both for Compact Blocks and for Xthinner propagation. In each case, I think that it's really just measuring block validation time. Still, it's a bit curious.
 

solex

@jtoomim
Shall we proceed to vote on this? As I mentioned earlier, this work is closely aligned with the GTI project which already has membership approval. It would be good to have a budget paragraph with a clear total proposed spend.
I will add the header and formatting info for you.
 

jtoomim

Sure, we can put this to a vote. Total cost for the 4-machine setup (which is running now, btw) was around $4k, plus about $100/month for the static IPs and power. I might want to upgrade to 10G networking, though, as a 160-node test run ends up using about 42 MB/s right now, which is uncomfortably close to saturating the 125 MB/s theoretical capacity of 1 Gbps ethernet. But that should be pretty cheap, maybe $200 total. (Two of the machines already have 10G interfaces, but two are lacking, and I don't yet have any 10G-capable switches.)
 

AdrianX

@jtoomim what would be your minimum and ideal budget for this?

What is the predicted lifespan of the network?
Geographically, where will the infrastructure reside?
Who will maintain it, and why?

Would you consider a leasing arrangement: you build it and host it on the premise of X number of tests committed to in advance, with an agreed rate for the service?

In light of the questions above, if there is value in it (and I think there is), run it as a marginally profitable service, and let's assess the value proposition for each test and the ongoing service in advance.