BUIP044: Development Process

deadalnix

Active Member
Sep 18, 2016
115
196
Proposer: Amaury SECHET
Submitted on: 09/01/2017

Summary:

This BUIP aims to provide a world class development process for BU. The current process has several shortcomings, such as the developer being bound to become a bottleneck as the team grows, or development being stalled when a release is prepared. The proposed process is a variation fo what is used in many software companies and open source projects.

I propose we start moving toward this process after the next release so the release process isn't disturbed too much. If this BUIP is accepted, the articles will need to be amended to reflect its content.

Definition:

A developer is anyone writing code with the intention of having it included in BU. When using the term developer I'm not referring to the developer of BU's articles.
A committer is someone with commit right to the BU repository.
The release manager is the person deciding what goes into the next release and when.

Specification:

The developer as per the BU articles becomes the release manager and a committer. Other trusted BU members can become committer as well, but there is only one release manager.

The development happens in the branch "master". All developers must base their PR on this branch, the only exception is a merge conflict in the release branch.

To be merged, a PR needs to be accepted by a committer. A developer who is also a committer cannot accept its own PR.

PR are squashed and rebased on top of master. The only exception is when backporting work from 3rd party client such as bitcoin core or bitcoin classic, in which case a merge node is created and the commit structure kept "as this".

Release Process:

Once the release manager decides it is time, (s)he cuts a branch from master called bu_X.Y , where X and Y are major and minor version numbers. Any change made in master before the branch cut will be included into the release. Any change made after the branch cut won't, unless a developer specifically request the release manager do merge this specific change. The change must be already in master before such request is made. If the change doesn't merge cleanly into the release branch, the developer needs to do another PR with the change made on the release branch. The PR needs to be as close as possible as the original one.

Once the release manager consider the release reached an appropriate level of stability, (s)he creates a tag bu_X.Y.0 and publishes the binaries.

Change continue to be merged into the branch bu_X.Y branch and the release manager creates tag bu_X.Y.Z when (s)he judges a new revision needs to be published.

The process continues in the bu_X.Y branch until the release manager decides it is time to create the branch bu_X.(Y + 1), at which time the process moves into that branch.

Guideline and rationale:

This section contains a bunch of guideline about how to use this process successfully. It also contains various rationale to for above rules. The rules mentioned above are not very numerous on purpose. Complex process with many rules tends to get in the way by their lack of adaptability. In our case, we want to have a set of guideline, but the only rule to get something merged is to have a committer accept it. This way, committers can chose to disregard some guideline when it doesn't make sense in this particular case. For instance, it would be undesirable that the development work is stalled because the test infrastructure is broken, so having a rule such as nothing can be committed unless tests are green would be counter productive. On the other hand, we can trust committers to enforce that tests are green, but they can make exception when test do not run properly for some reason.

master needs to be in a state where a release branch can be cut pretty much at all time. Any breakage happening in master must be treated with the upmost importance. It is expected from the developer who made the code responsible for the breakage to fix it promptly. If this isn't happening, either because the developer is not reachable or because the problem is too complex to be fixed promptly, the diff will be reverted.

PR must be kept on point. It makes code review easier and make history easier to follow. In addition, this ensure that each diff can be reverted cleanly with as little risk of conflict as possible. This is important because we want to keep master in good shape at all time so we need to be able to revert diff breaking something easily.

It is advisable to have a bot merge code rather than the committer him/herself. The bot can rebase the code on top of master, squash it, run the tests and commit if the process is successful, or report the problem if it isn't. This avoid merging code conflict with something that was committed in between when the patch was created and when it was accepted.

It isn't because all tests are green that everything works properly. It is advisable to have a build machine creating a new build on a regular basis, run it in a monitored environment and report any abnormality.

It is a good idea to keep slow tests out of the test suite that run on each PR. This slow down development and has diminishing return. A good alternative is to have a machine pulling master and running the slow tests on a regular basis. When one of these fails, the machine can bisect to find what commit broke it and report it. This allow developer to move faster, and, if PR are kept on point, it won't be too hard to revert the offending code.

Rebasing/squashing is a better alternative to merge. If a PR is made of several commits, it either indicate that it is doing several thing and should be split in several PRs, or it is doing one thing and each commit present an incomplete picture of the change. Having a linear history with focused commits makes it much easier to understand for newcomers. It is also much better to work with when making tooling as the ones mentioned above.

When something fails in master, while no tests found about it, it is good to have an incident review. How did the problem happen, how come it wasn't detected, and coming up with actionable items to avoid or mitigate the problem is the goal of the review. Good direction usually are: make the problem impossible by design - for instance via refactoring, add testing for this specific case, add monitoring in pre-production for this kind of scenarii, etc...

It is often preferable to make failing cheap rather than prevent failure. Technique which prevent failure tends to be expensive in other ways such as development slowness and usually are self defeating in the long run - for instance development slowness will cause the counter measure to be implemented slowly, and may even cause the project to slowly die compared to the competition. Slowing down dev is *ALWAYS* a losing strategy.

Grilling the master and release branch will all the available tooling is a good idea. We should try to setup instrumented clang build (ASAN, UBSAN, TSAN) as soon as possible.

It is good to avoid time consuming tasks such as formatting. Providing a clang-format config is a good idea. having a linter flag style problem in PR before any reviewer waste time on it is also a good idea. The developer gets immediate feedback and the committer do not need to spend time on these problems.

Automate, automate, automate, automate, ...

Do not accept PR that come without tests without a good reason to do so. Try to figure out what the developer has done to test its changes.

"Shotgun diffs" (diff that touches many files) soon before the branch cut. This will reduce the number of merge conflict into the release branch.

Release early, release often. This keeps discrepancy between ongoing dev and code in the wild low, and allow for real world feedback on the work being done. Contrary to intuitive expectation, this also reduce the number of defects. The more unique change there is in a release, the more likely something surprising is going to happen and the harder it will be to track it down.

Any committer can accept any PR. However, it is a good idea to let committer with more expertise in the area review it. Making this a guideline rather than a rule allow for these reviewers to be bypassed if they can't handle the load for some reason. This way, nobody becomes a bottleneck.

The release manager should be working on the project full time.