Blockchains, Determinism, Monads, Agents and Functional Reactive Programming

(I am going to assume the reader is a software developer who's interested in programmable blockchains. If you feel confused by smart contracts etc, go read Programmable Blockchains in Context by Vinay Gupta, who explains them far better than I do!)

I would like to propose that blockchains are a network concurrency primitive, like the two-phase commit or distributed hash-tables. That modern Turing-Complete chains, like Ethereum, are simply Erlang or Smalltalk-style virtual machines, and that their blockchain nature is orthogonal to their VM nature. This makes blockchain's future simply a part (although a vital one) of the widening adoption of decentralized computing.

To start my argument, let's wind back to 90s LAN parties and the importance of determinism.

Networked computer games are not entirely dissimilar the usual international derivatives and securities trading examples used to promote blockchain: we have players, we have actions, there are elaborately computed rules and someone keeps score. In formal terms, StarCraft is a function over all possible moves and game states which maps to the resulting game state. Or in pseudo-code...

public Battlefield starcraft(
  Battlefield theBattlefield,
  Set<Commands> zerg,
  Set<Commands> terrans,
  Set<Commands> protos

If we only want to play a single-player game, it isn't important that this procedure is a pure-function. It is fine if the game states returned from an identical starting point are slightly different each time, maybe because different random numbers were used or some small difference in CPU architecture. Provided the result is reasonable I can still enjoy the game.

If I want to play multiplayer this becomes harder. We can take two approaches :-

  • Calculate the new game state centrally then replicate it to everyone.
  • Replicate the player moves and re-calculate the game state everywhere.

Generally game developers choose the second. The moves are fewer bytes than the game state. Less data, less bandwidth, less lag, more frames-per-second, happy gamers. (Happy gamers pay premium prices for enjoyable games from quality game studios who then bid a premium for the best network developers.) Computer games run as specialised virtual machines managing the interaction of their "pieces" [bullets, players, Zerg...] in discrete ticks that we could also call transactions or blocks.

(The challenges in this for game developers are beautifully laid out by Spencer Nielsen.)

We might imagine a particularly well-governed online game where each XBox hashes and signs the calculated game state, creating a cryptographically secure log held in common between the participant XBoxes.

We could even generalise to an N-player game where each unit is permissioned by a unique private key, who's censorship would be disastrous (see your science fiction bookshop for real-life examples) and who's players will accept a 1/600th of a second frame rate. Obviously no-one would build this, but he point I want to make is that the key to efficient distributed data structures is determinism. The defining factors in a distributed data structure are determinism and feedback. If the output of a transition is not an input of the next, then there is no persistence of data. If the transitions cannot be computed independently of any particular environment, then we cannot distribute the structure in any meaningful way.

I would suggest that on this basis, we can only build usable generic networking primitives within a functional programming paradigm. Hidden state allows unmanaged indeterminism which prevents us from slotting in new network logic in practice.

With this restriction, we can start to think of The Blockchain (cue heavenly choirs) - or hash-chains (used since the 90s) - as box carts on a railway. We write our logic, decide some of the structures need to be globally shared between all our clients and then we can pick the appropriate distributed network "rails" to load them onto. In the same way we might pick between MySQL or Redis today.

This is also a widely used description of monads. See Railway Orientated Programming from F# blog, F# For Fun and Profit.

Within the functional programming community there is a strong tradition of cross-compiling into different environments. Two good examples from F# are MBrace [seamlessly execute expensive calculations on cloud servers] and Brahma.FS [push a subset of F# to CUDO machine code to be executed on graphics cards].

let command =
            fun (r:_2D) columns (a:array<_>) (b:array<_>) (c:array<_>) ->
                let tx = r.GlobalID0
                let ty = r.GlobalID1
                let mutable buf = c.[ty * columns + tx]
                for k in 0 .. columns - 1 do
                    buf <- buf + (a.[ty * columns + k] * b.[k * columns + tx])
                c.[ty * columns + tx] <- buf
let task =
    cloud { return "Hello world!" }
    |> cluster.CreateProcess

In both cases, we are taking well-defined logic, transcribing it to a different set of low-level instructions and pushing it into a different execution context. We are free to switch the "computing substrate" for performance or governance reasons without worrying about re-writing. (Obviously there are limits, many operations are very slow on graphics cards, hauling gigabytes of data to and from a cloud environment is awkward, etc.)

But the most widely deployed functional language is not F#. It is Erlang. And Erlang is fascinating. Developed in Sweden at Ericsson in the 90s, Erlang was designed to run telephone switch fabrics. Erlang was created because C++ was not conservative enough for large Scandinavian corporates. (Magdalena, why half of Kista without phones? Oh that's just young Krister fumbling his pointer arithmetic again. How long take to fix now that Olov is parental leaving this year? ... You get the idea.)

Decades of real-life industrial use in critical systems has given Erlang a rich set of networking and concurrency primitives. All of them built from the basic Erlang entity of a process. The Erlang process model is an implementation of the Actor-model of distributed computing.

From Wikipedia :-

An actor is a computational entity that, in response to a message it receives, can concurrently:
  • send a finite number of messages to other actors;
  • create a finite number of new actors;
  • designate the behavior to be used for the next message it receives.
There is no assumed sequence to the above actions and they could be carried out in parallel.

As Erlang is a functional language, every process is simply a recursive function over the process state and its message queue. For example :-

loop() -> receive
    {Sender, Num} -&lt;
        Sender ! Num,
start() -> spawn(fun loop/0).

This is functionally equivalent to the Ethereum blockchain's concept of a smart contract. It's all just actors. (Ethereum should really call them something else. It confuses outsiders.) I would suggest that where a compilation to Ethereum-like VM instructions exists, any actor system can be simply be transpiled into "the blockchain". (By simply, I mean subject to various arbitrary restrictions but hopefully ones that can be type-checked.)

Which is interesting, since this implies that we can lift concepts for hot-swappable code, fault-tolerant supervisors, etc directly from Erlang's world into "the blockchain". Or for that matter, that we could take an actor and choose to 'blockchain it' (or run it on a central server with a DocChain time-stamping service to prove integrity, or let multiple nodes vote on the next state, or assign a master node in cluster then use a hash-chain to integrity check it and then, etc etc).

We might write something like this in pseudo-code:

  let my_blockchain = chain {
    uuid: d115066a-a74a-4366-941e-409ab06ddef9,
    dht: bittorrent,
    membership: { ip_limit: [UK, NZ, AU, US, UK] },
    proof-of-work: x -> leading_zeros(sha256(sha256(x)))
  let new_chain = chain_load(my_blockchain, agent_foo(initial_data));

And that would pop into existence a fresh blockchain restricted to the anglosphere, bootstrapped from the existing bittorrent p2p dictionaries, using double sha256 as its proof-of-work algorithm with a single initial agent of type agent_foo.

And this is the way I think we will need to go as an industry. Even if a full functional development is too great a leap, I can't see blockchains taking off without at least the same degree of language-level interop as say, Google's Protocol Buffers.

It would require an Android developer to be able to define an interface, generate the Java client stubs and then fill in the, for ex. Solidity [Ethereum's main higher-level language], code required to implement the distributed networking functionality. There are similar patterns in Zero-C's Ice or Apache Thrift [runs much of Facebook].

module YP {
  class PersonDetails {
     string phoneNumber;
     optional(1) string address;
  interface PhoneBook {
      PersonDetails find(string name);
service Calculator extends shared.SharedService {
   * A method definition looks like C code. It has a return type, arguments,
   * and optionally a list of exceptions that it may throw. Note that argument
   * lists and exception lists are specified using the exact same syntax as
   * field lists in struct or exception definitions.
   void ping(),
   i32 add(1:i32 num1, 2:i32 num2),
   i32 calculate(1:i32 logid, 2:Work w) throws (1:InvalidOperation ouch),
    * This method has a oneway modifier. That means the client only makes
    * a request and does not listen for any response at all. Oneway methods
    * must be void.
   oneway void zip()

Now, by formal programming norms, Erlang's actor model is actually not that formal or strict. When academics came to include UIs in their pure research languages they came to a paradigm called Functional Reactive Programming. (Very simplified history.)

Functional reactive programming [FRP] is (for our purposes) the actor model with a type system, but without unbounded addresses. In the FR-model, message senders and consumers are attached by strictly typed channels. External I/O is then a stream of events passed through a network of producers and consumers.

Separating data-flows from the "agent" logic simplifies code reuse and it allows us to use compile-time checks to prevent a wider range of errors. All the complexity required to deal with any agent interacting with any other agent is lost.

(For more functional programming, try Purely Functional Retrogames.)

I am unsure how a fully FRP blockchain would work, and I suspect that having unbounded parties spawning Turing-complete entities into a single address-space would not be feasible under strong-typing. But an FR-model could be used to develop the inside of a blockchain application. Those "smart contracts" that were within a single application boundary could be developed in a typed fashion and then compiled to untyped blockchain (or other distributed data-structure) instructions. We could even put the try definitions into some sort of repository. (An approached user for other difficult to manage languages.)

So there are my predictions :-

  • Short-term: Network data structures will be defined within their client application's source code using cross-language interfaces which are then implemented in high-level procedural, probably blockchain, languages. Most applications will run everything on large public chains/time-stamping services to begin with; then split out specific parts to their own chains and services as they gain users.
  • Long-term: Blockchains, permissioned-ledgers and the rest will be written in functional languages in a FRP-style. As will the clients that use them. There will be single codebases, in single (as far as possible) languages. The norm will be a hybrid of trust-less central servers (a al DocuChain), public/private blockchains and ad-hoc shared structures like Secure Multiparty Computation.

tl;dr -> You can have your shiny neo-libertarian cryto-anarchy when Alice the iPhone Developer can write Tinder for Dogs on top of it!