Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduce interface between Bisq and BitcoinJ #30

Closed
2 tasks
dmos62 opened this issue Apr 23, 2020 · 9 comments
Closed
2 tasks

Introduce interface between Bisq and BitcoinJ #30

dmos62 opened this issue Apr 23, 2020 · 9 comments
Assignees
Labels
a:proposal bisq.wiki/Project_management#Proposal was:deferred bisq.wiki/Project_management#Closing_as_deferred

Comments

@dmos62
Copy link

dmos62 commented Apr 23, 2020

Description

This is a proposal for introducing an explicit Java interface between Bisq and BitcoinJ, with the purpose of improving maintainability of our Bitcoin network code.

Rationale

In summary

Why is it important?
  • Our Bitcoin network code is very difficult to maintain, which threatens reliability and blocks some features (SegWit). This would improve maintainability and be a stepping stone towards SegWit.

  • While our BitcoinJ fork is old, upstream BitcoinJ itself is widely regarded as outdated. It is possible that in the future we will want to make drastic changes, like switching Bitcoin clients. Reducing coupling between Bisq and BitcoinJ would improve our flexibility when making those kinds of changes.

Why now? What happens if we wait?
  • Our BitcoinJ code is not being audited (much less improved), because it is difficult to understand. The sooner we resume taking care of this code the better. At this time, reacting to a BitcoinJ vulnerability (salt over shoulder) would be very awkward.

  • Delaying work on our Bitcoin network code is also delaying SegWit.

In depth

Maintenance of Bitcoin network code

Bisq is currently tightly coupled with BitcoinJ, and our usage of it, being scattered throughout the rest of the code, is difficult to audit and to maintain. Abstracting our usage of BitcoinJ into an explicit interface would facilitate audits, maintenance of Bisq, maintenance of our BitcoinJ fork, and general comprehension of Bisq's internals.

Restoring maintainability of our BitcoinJ fork

BitcoinJ is a Bitcoin network client that is used by and tightly coupled with Bisq. Maintaining the BitcoinJ part of our codebase has proven challenging. Notable illustrations being that we're using a two year-old release of BitcoinJ (0.14.7, Mar 30, 2018) and that we have a dedicated role for its maintenance (which has not been filled since May, 2019).

Among other things, returning to beIng able to maintain our BitcoinJ fork (and thus being able to upgrade it) will allow us to support SegWit/Bech32 addresses, which is an often asked for feature.

Tight coupling and maintainability

Our BitcoinJ fork's maintenance involves understanding how we use BitcoinJ, which is made difficult by tight coupling. To illustrate what is meant by tight coupling here is the long list of unique BitcoinJ imports we use in Bisq (excluding tests, minor formatting added):

grep -r --include=\*\.java -E "import( static)* org.bitcoinj" ./*/src/main -h | sort -u

Output moved to pastebin.com for the sake of readability:
https://pastebin.com/raw/Jd0nA6Nk

And, below is the even longer list of (non-test) source files that import BitcoinJ classes (with minor formatting added):

grep -r --include=\*\.java -E "import( static)* org.bitcoinj" ./*/src/main -l | sort -u

Output moved to pastebin.com for the sake of readability:
https://pastebin.com/raw/Ap0wf3Pd

To maintain our BitcoinJ fork you'd have to make at least some kind of sense of the above usages of it, which is (I think evidently) intimidating and difficult. The primary way that the proposed interface would help is by structuring our Bisq usage.

Make sense of BitcoinJ usage by introducing an intermediary interface

The proposed interface would aim to hide the bulk of BitcoinJ specific logic and to centralize essential Bisq Bitcoin network logic.

BitcoinJ imports can be put into two categories: essential and auxiliary. I define auxiliary imports as those that serve to handle output from and pass input to the essential imports. The essential imports (such as Wallet, PeerGroup, BlockStore, net.discovery) are the exclusive focus of this refactor effort, in part due to the sheer number of aux imports, and, at the same time, because hiding essential imports behind an interface should be enough to see an appreciable improvement in maintainability.

In other words, the proposed interface would not hide all of the above listed imports, but only the most important ones. This also means that this wouldn't be a complete decoupling of BitcoinJ, though it would pave the way for one, if it were to ever happen.

Criteria for delivery

  • new intermediary interface abstracts away the most important BitcoinJ-specific logic;
  • there's a concensus that the new interface's clarity meets our standards.

Measures of success

  • positive difference in Bitcoin network code readability and maintainability;
  • reduced learning curve and complexity of resuming work on BitcoinJ fork.

Risks

Our Bitcoin network code is one of the most sensitive areas of Bisq; however, the risk of introducing bad changes should be minimal since this is a refactoring effort (meaning rearranging code without changing algorithms).

Tasks

This refactor effort is exploratory by nature, so making a precise plan is not possible ahead of time. Currently the plan is to analyse BitcoinJ usage and abstract it away behind the new interface, in order of descending importance, until only the least critical BitcoinJ code is left.

Estimates

Budget not yet discussed.

@dmos62 dmos62 added a:proposal bisq.wiki/Project_management#Proposal needs:triage bisq.wiki/Project_management#Triage labels Apr 23, 2020
@dmos62
Copy link
Author

dmos62 commented May 17, 2020

Concrete plan

Note: below text is written in a tree-like fashion, which helps me structure my writing and navigate the text. Hopefully it doesn't look too outlandish.

  • Below algorithm describes a refactoring of BitcoinJ usage in Bisq
    • the goal of the refactoring is to
      • generally speaking, improve maintenance of Bisq's Bitcoin network code
      • more specifically, lay the foundation for ongoing reduction of the BitcoinJ-Bisq coupling
        • ongoing, because the coupling is too extensive to be safely undone at once
    • the described refactoring aims to achieve that by
      • introducing a "btc.low" layer that
        • acts as a Bitcoin client API that is implementation agnostic
          • in other words, it would be the only place in Bisq with Bitcoin-client-specific (BitcoinJ-specific) logic
      • doing it in a way that allows safe incremental changes
        • what will make the process safe is the smallness of the refactoring steps
          • the small steps facilitate audits
          • further, being able to refactor in small steps will make opportunistic refactoring more common
    • the end product of this initial refactoring
      • a place where we can move BitcoinJ-specific logic
      • a place where we can see how exactly we're using BitcoinJ (in a much more centralized way)
      • a clear and simple abstraction and refactoring system for ongoing improvement of our BitcoinJ usage

Refactoring steps

For readability, I've bolded what you'd call the main variable names if this were code:

  • Make an identical copy of the BitcoinJ API inside Bisq
    • more specifically, of the subset of bitcoinj api that we use
    • initially this copy will just redirect all calls to BitcoinJ
      • nothing more than a fully transparent wrap
    • you call this copy in the same way that you call the original
      • and you get the same results
  • Designate the copy as foundation for Bisq-BitcoinJ separation layer
    • Bisq-BitcoinJ separation layer
      • or, in other words, Bitcoin client agnostic layer
    • the ultimate goal of this abstraction is to be the only repository of BitcoinJ-specific logic in Bisq
  • Make this layer the (only) Bitcoin client used by Bisq
    • as simple as
      • search and replace import org.bitcoinj.PeerGroup for import bisq.core.btc.low.PeerGroup
    • will now refer to this layer as btc.low, as in the package bisq.core.btc.low.
  • Incrementally mutate the btc.low abstraction
    • by
      • moving BitcoinJ-specific logic into it (from the rest of Bisq)
      • generally refactoring it to better match the needs of Bisq
    • this step is what makes the refactoring ongoing

Illustrative code sketch

Below sketch illustrates what btc.core.btc.low.PeerGroup could look like.

package bisq.core.btc.low;

import org.bitcoinj.core.BlockChain;
import org.bitcoinj.core.NetworkParameters;
import org.bitcoinj.net.BlockingClientManager;

import bisq.core.btc.nodes.ProxySocketFactory;
import bisq.core.btc.nodes.LocalBitcoinNode;

import java.net.Proxy;
import java.net.InetSocketAddress;

import com.runjva.sourceforge.jsocks.protocol.Socks5Proxy;

/* Below code is a sketch of what btc.low can look like.
 * To make it short and simple, it presumes that our BitcoinJ usage is only
 * what is in the bisq.core.btc.WalletConfig::createPeerGroup method.
 * This sketch tries to demonstrate how logic is moved into btc.low: you start
 * out with a lot of BitcoinJ-specific logic, and end up with a
 * BitcoinJ-agnostic endpoint and a clear view of what BitcoinJ endpoints are
 * used under the sheets.
 * The code would definitely be simplified more after moving, but that's
 * outside the scope of this sketch.
 */

public class PeerGroup {

    private final org.bitcoinj.core.PeerGroup bitcoinjPeerGroup;

    private PeerGroup(org.bitcoinj.core.PeerGroup bitcoinjPeerGroup) {
        this.bitcoinjPeerGroup = bitcoinjPeerGroup;
    }

    /* This method is a copy-paste of
     * bisq.core.btc.WalletConfig::createPeerGroup (except for the signature
     * and using locally defined constructors). It serves only demo purposes
     * and code quality inside it not representative of standards set for this
     * project.
     */
    private static PeerGroup createPeerGroup(
            Socks5Proxy socks5Proxy,
            NetworkParameters params,
            BlockChain vChain,
            LocalBitcoinNode localBitcoinNode
    ) {
        PeerGroup peerGroup;
        // no proxy case.
        if (socks5Proxy == null) {
            peerGroup = this(params, vChain);
        } else {
            // proxy case (tor).
            Proxy proxy = new Proxy(Proxy.Type.SOCKS,
                    new InetSocketAddress(
                        socks5Proxy.getInetAddress().getHostName(),
                        socks5Proxy.getPort()));

            ProxySocketFactory proxySocketFactory =
                new ProxySocketFactory(proxy);
            // We don't use tor mode if we have a local node running
            BlockingClientManager blockingClientManager =
                config.ignoreLocalBtcNode ?
                new BlockingClientManager() :
                new BlockingClientManager(proxySocketFactory);

            peerGroup = this(params, vChain, blockingClientManager);

            blockingClientManager.setConnectTimeoutMillis(TOR_SOCKET_TIMEOUT);
            peerGroup.setConnectTimeoutMillis(TOR_VERSION_EXCHANGE_TIMEOUT);
        }

        if (!localBitcoinNode.shouldBeUsed())
            peerGroup.setUseLocalhostPeerWhenPossible(false);

        return peerGroup;
    }

    /* Proxy all PeerGroup methods that Bisq uses.
     * These would probably be "protected" and be in a separate class, which
     * would be subclassed by this one, for the sake of separating the used
     * BitcoinJ endpoints and their usage.
     */

    private PeerGroup(
            NetworkParameters params,
            BlockChain vChain
    ) {
        org.bitcoinj.core.PeerGroup bitcoinjPeerGroup =
            new org.bitcoinj.core.PeerGroup(params, vChain);
        this(bitcoinjPeerGroup);
    }

    private PeerGroup(
            NetworkParameters params,
            BlockChain vChain,
            BlockingClientManager
            blockingClientManager
    ) {
        org.bitcoinj.core.PeerGroup bitcoinjPeerGroup =
            new org.bitcoinj.core.PeerGroup(
                    params, vChain, blockingClientManager);
        this(bitcoinjPeerGroup);
    }

    private void setUseLocalhostPeerWhenPossible(boolean bool) {
        this.bitcoinjPeerGroup.setUseLocalhostPeerWhenPossible(bool);
    }

    private void setConnectTimeoutMillis(int timeout) {
        this.bitcoinjPeerGroup.setConnectTimeoutMillis(timeout);
    }

}

@ghubstan
Copy link

@dmos62 , Just want to express some enthusiasm for this project... Really, really big thumbs up from me.

The newly forming gRPC/Core API is going to help in terms of quick dev/test iterations -- faster than re-starting the UI after small changes. It's a little too early today, and the priority for the API is to ship the simplest reference impl asap, but even that will help speed up functional testing.

@cbeams
Copy link
Contributor

cbeams commented May 18, 2020

@dmos62, what do you think about putting together an initial PR that demonstrates a single such refactoring that you think is demonstrative of the value that can be provided? I'm generally for this effort, but I'd like to have a look at a working example before signing off on a larger scale effort. My main concern here is that I'm not sure the "fully transparent wrap" approach you detail above will actually add much value. If we do nothing more than a 1:1 wrap of a subset of the BitcoinJ API, what do we really gain? I'd rather see this play out in code than discuss it further here, so if you'd like to take a stab at it, please go ahead.

Also, what is the significance of "low" in your proposed bisq.core.btc.low package? Do you mean "low-level bitcoin operations", perhaps? In any case, I don't think this is an intention-revealing name. Perhaps better naming will avail itself as we see and review an actual working example.

@dmos62
Copy link
Author

dmos62 commented May 18, 2020

@cbeams I agree a PR example is a good idea.

The 1:1 wrap part is just to get the foundation for where to move BitcoinJ-specific logic. I.e. the refactoring happens on top of that. You'd move the logic in one set of commits (like in the sketch) and then you'd most likely simplify it in another set of commits (not in the sketch).

At first I was thinking of calling it btc.client, since it would handle all Bitcoin networking, but then realised that actually the bisq.core.btc package is our Bitcoin client, and the proposed package would be the lowest-level of that client, hence the bisq.core.btc.low. It's low-level in that it hides BitcoinJ logic that the rest of the btc package shouldn't care about and exposes the basic interface that the rest of the btc package will use. It makes sense to me at this point, but that might change and I don't have strong feelings either way.

@sqrrm
Copy link
Member

sqrrm commented May 18, 2020

I like the idea of starting by just doing a transparent wrap and continue incrementally from there. One of the biggest issues will likely be keeping refactoring PRs small enough that they are easy enough to review and merge. I think the suggested route sounds good, but a sample PR will make it even clearer.

@oscarguindzberg
Copy link

oscarguindzberg commented May 23, 2020

  • I think this proposal is independent of Segwit support plan proposals#226. They don't compete with each other because they attack different problems.
  • If both proposals are accepted I suggest not to code both projects in parallel because that would lead to merging hell. They could be implemented one after the other. If both proposals are accepted I suggest merging [WIP] Use bisq's bitcoinj 0.15.8 bisq#4270 as a starting point, which is already coded and only requires peer review.
  • I did a research of bitcoinj alternatives back in Dec 2018 and found no good alternative (Feasibility for replacing Bitcoinj with Bitcoin Core SPV bisq#1062 (comment)). As of today I think there are 2 alternatives to bitcoinj that might make sense for bisq: a) bitcoin core and use its RPC interface or b) Electrum client for java (does not exist yet, but could be implemented). Every alternative has its pros and cons.
  • Alternatives to bitcoinj might work very differently and would be difficult to abstract things without knowing the interfaces of the other libraries.
  • This is a personal taste comment: bitcoinj usage in bisq could be subject to refactors and other code improvements, but I prefer to create an abstraction of a library once I am already implementing the usage of an alternative library. You guys know the bisq source code better than me, so this comment might not make sense for bisq.

@dmos62
Copy link
Author

dmos62 commented May 25, 2020

@oscarguindzberg I think there's potential for collaboration and see these projects working in complement to each other. More specifically, I can see this project facilitating the more interesting Bisq-side changes required for adopting SegWit and BitcoinJ 0.15.x in general.

Paving the way for an alternative library is more of a nice to have at this point. My priority with this project at this time is making our Bitcoin net code easier to maintain and audit. It just so happens that this is what's needed for switching out BitcoinJ as well.

@dmos62
Copy link
Author

dmos62 commented Jun 11, 2020

Status update:

While I believe that this project is very valuable in the medium to long term, and I very much enjoy working on it, I'm putting it on hold, because I feel there are short-term Bisq responsibilities that I should move my attention to. Follow a few notes I'd like to share.

This project is relatively slow to implement. To illustrate, I've spent 2.5 weeks refactoring logic surrounding PeerGroup, which is the simplest and least critical part of our Bitcoin code (that's also why I chose to start with it), and that's about 50% finished (intuition).

Whereas I started out thinking that this project would be low-hanging-fruit-only type of task, I soon realised that the really tangible increase in maintainability is from diving deep into Bisq code. Parts other than PeerGroup-related logic might be different, but probably not.

I've discovered a pretty complicated algorithm (which decides how we source peers) and I've not yet decided how to approach its audit and eventual refactor. I've created a TLA+ state machine of it (took about a week), but this algo has so many initial states that it's hard to define the properties that the algorithm should satisfy. For example, if a local bitcoin node is provided and custom nodes are provided too and we're in regtest, what sourcing method should be used? What about DAO regtest? I'm not really sure. And those are not all the input variables. Defining what to do for Mainnet is not that hard, but the rest is complicated, because it requires knowing a lot about Bisq. So that's the last thing I was working on.

Project's commit structure is such that it can be merged in installations and read one by one. I took care to have most if not all commits do only one thing, because for a reviewer 2 commits doing 2 simple things can be a lot simpler to review than those 2 commits squashed into 1.

I plan to now move to doing some PR reviewing and urgent bugfixing. I've considered continuing work on this refactor part-time, but I don't see myself being able to multitask this.

I'll PR the refactor to make my progress available.

@cbeams
Copy link
Contributor

cbeams commented Jun 16, 2020

Thanks for the update, @dmos62. I'm closing this as was:deferred as that seems most fitting (definition). Let me know if that doesn't make sense to you.

@cbeams cbeams closed this as completed Jun 16, 2020
@cbeams cbeams added was:deferred bisq.wiki/Project_management#Closing_as_deferred and removed needs:triage bisq.wiki/Project_management#Triage labels Jun 16, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
a:proposal bisq.wiki/Project_management#Proposal was:deferred bisq.wiki/Project_management#Closing_as_deferred
Projects
None yet
Development

No branches or pull requests

5 participants