Coding as a Circular Dependency

Here at DEVLIKE, we’ve been doing something we think is very important, and I think it’s finally time to share it with you! We’ve taken some time to look at programming from a different angle, under a different light. We tried to take nothing for granted, and to evaluate with care between historical artifacts and first principles. The topic of our exploration: seeing coding itself as a circular dependency problem. To start, let’s observe non-trivial boilerplate code. There’s a lot of industry code that falls into this category. A good starting point would be finite state machines.

Example finite state machine from Game Programming Patterns

Whatever your purpose for implementing finite state machines in code is, you’re creating non-trivial boilerplate. It’s something we keep rewriting, and as with everything else we keep redoing, it keeps growing into a more generalized form. The Game programming Patterns book has a few tips on how to write these.

From a couple of simple if-then-elses wrapped into a stateful function, we get to generic libraries that have everything and the kitchen sink. In and of itself, openness isn’t a bad thing, but the foundations we have are never wide enough to support it. The ultimate open library always fails someone, so instead, we pull back and implement propriatary versions that are more versatile than the if-then-else code we could write, but only enough to cover exactly what the people implementing them need. So you end up with a finite state machine that you can show and edit, and then you come to saving and loading. Not having worked with many other libraries that you might have liked (because they either do too much and not enough, or just not enough), you’ll probably invent a new format, or at least a new embedding in some established data format (like JSON or XML).

XKCD 927

Now we have a new finite state machine format, and it can do whatever you need, but not more. We could have one for every person on the planet who needs finite state machines, and a whole forest of miscommunication. Time to start writing intermediate layers between these different ways of saying the same thing!


Why can’t we just do simple?

The semantics of finite state machines are usually quite simple, however, so let’s focus on that. We can draw it simply enough, in a way that everyone can understand, so why can’t we code it in a similar way? Why can’t we just do simple in a simple way? Let’s get back to the beginning – it might help to take a look at how finite state machines are built in practice. Usually, you do only what’s enough, and for a start, that’s to encode a single yes-no state – a boolean. To use this boolean, we need to translate it from data to logic, meaning that we need to turn it into a branch. We are now at our if-then-else, and this is a good spot.

However, adding only a few more booleans leads to an explosion of statefulness: booleans are too simple for what we need. We want to talk about a state defined by some number of them, not about all the parts by themselves, and the combinatorial explosion leads to truly unruly code. We most probably don’t want to add fuel to this fire, and take another route at this point, but the only other route is the one we’ve talked about in the first place: it’s either write code that will fail us now, or code that will fail us later. It might feel like I’m being pessimistic, but I truly do hope for more good code in the future, and intermediate layers good code do not make.

So, where’s our mistake? Should we throw out booleans? Should we not generalize? Do we need a better if-then-else? In a way, yes and no to all of these questions: our mistake is writing code that should have been an image in the first place. What’s the mysterious difference, you might now be wondering. Without wasting too much of your time, the difference is primarily in the linearity of text. We call text “strings” in programming, short for strings of characters, and this in fact can be taken literally: they are S -> t -> r -> i -> n -> g -> s. They are graphs that do not branch, and this is their defining quality. To represent graphs via code is to have to create an embedding of branchings into strings first. We’ve done this, but even when we do create if-then-elses, and even when we allow for code to be semi-two-dimensional and indented nicely, the linearity of code catches up with us rather quickly…

There’s a definite increase in no-code editing in recent years, including high-profile visual scripting languages, like Blueprints in Unreal Engine 5. The experience these editors bring ranges from fabulous (for newbies or small projects) to bad and untenable (for larger and more expansive teams), mostly for the same reasons we already gave for code: there’s boilerplate that comes up when things that should be text aren’t. A mix of both code and graphs seems best, but how do we choose between them? What makes finite state machines boilerplate, and visual scripting not, or vice versa?

Entropy as a guide

If you look at a picture of a finite state machine versus when you look at the code for the same automaton, you get a lot of information very quickly, evolutional tendencies notwithstanding. This is a guide towards choosing graphs in this case: lower entropy is good, and fewer ways to make mistakes is lower entropy. The foundational tenets of programming language design agrees: the compilers we make should make the well-defined states of our programs align with the code we are allowed to write. A better compiler will allow fewer badly defined states. A better compiler has lower entropy over the state space of our applications.

In a similar way, seeing a picture of a visual script is practically the same as reading the code for it: you get an instruction, and then follow along through to another instruction, and can’t pick up speed — a clear sign to keep this code in the form that requires less data to present. In this case, that’s text because we care not about the connections between instructions but the instructions themselves: data is data, whereas above structure was trying to be data. An increase in entropy, in these terms, is simply amounting to more time needed to understand things: more guesses, more attempts to bugfix, more tests written, more information added to cover for it.

It is obvious that all code carries both structure and data: we build types that hold values, and we use patterns to 1) embed structure into text, as previously stated, and 2) create a semantic context that is never truly in the code, but does unravel during execution. This context is, as is all execution, a graph between previous and next states, and thus conductive to the same presentation problems that we have already talked about. Having the full executive context of our programs in our heads with no clear way to present it to our peers (especially being dependent on our peers having the same model and knowledge that we do) is what makes higher-level coding in larger teams daunting. Any miscommunication introduces entropy and has led directly to the technological ceiling that the industry is facing nowadays, sinking billions of dollars to somehow build through. This is impossible for cyclical reasons, all of which is described above.

So, what now?

We need to build a new base, naturally, one that is (at least) capable of supporting any and all graphs that people can throw its way. To do so, we need to expand on what a graph is. Usually, what we think of is nodes connected to each other with arrows. There’s potentially a lot of extra detail work here: both nodes and arrows can have labels, both can be colored, some of them can be marked as “initial state” or “final state”. Whatever base representation we choose should be able to deal with all of these situations without making too many exceptions, othrewise we will most certainly be guilty of building yet another standard with the same problems as before.

This is where we introduce tiles. A tile is a tuple (id, source, target, type, data) that we’ll use for pretty much everything. It consists of a bit of structure and a bit of data, and every tile is independent. The id of every tile is its unique identifier, ie. a simple unsigned integer. source and target are of the same type, and they represent the structure of the arrow our tile represents: our tile quite simply is the arrow source -> target. It might seem unnatural to have all the tiles carry source and target information when only some of them are arrows, but we will use this to our advantage. Investigating id, source and target, we have several combinations that land in different categories:

  • id = a, source = a, target = a: all three of our identifiers are the same. This tile represents something that is its own source and target. Quite simply, this is, by definition, an object.
  • id = a, source = b, target = c: a tile is a connection between some other two tiles. It identifies an arrow.
  • id = a, source = b, target = b: this is an arrow that connects b -> b, meaning that it’s a looping arrow. Not too much of a difference, but still structurally different!
  • id = a, source = a, target = b: our tile’s identity is its source, meaning that this is, in a way, a virtual arrow starting nowhere but leading somewhere. It has a separate target, and can carry information about it. We call this a descriptor.
  • id = a, source = b, target = a: the dual of descriptors, our tile’s target is itself, and it has a source leading to itself, meaning that it’s a virtual arrow going nowhere specific. This is an extension.

With these four entity (the words object is taken) types, we can go on to define anything we need: a simple graph uses just objects and arrows. If we need to add properties (such as labels, colors, etc), these are descriptors of arrows and objects. If we need more of some data to describe a single tile, instead of making its type more complicated and embedding lists, trees, etc. inside a simple type system, we add extension tiles to it as simple data carriers.

The graph from the top looks like this when represented in tiles:

(1, 1, 1, "Label", "Standing")
(2, 2, 2, "Label", "Ducking")
(3, 3, 3, "Label", "Jumping")
(4, 4, 4, "Label", "Diving")
(5, 1, 2, "Label", "Press Down")
(6, 2, 1, "Label", "Release Down")
(7, 1, 3, "Label", "Press B")
(8, 3, 4, "Label", "Press Down")

Every tile here carries the type “Label”, which is quite simply described as Label: s32, meaning it’s a bounded 32-byte string. More on the type system later, but it’s quite expressive, allowing for basic types and records. The decision to have every tile carry only its label is based only on the picture we started from.

If someone came in and said that we need to make Standing an initial value, we can do this with no changes to the above:

(9, 9, 1, "Initial", none)

A new tile is added, a descriptor at that, to give us more info on its target. Of course, none of this is meant to be in textual form, instead being tightly packed in blocks of tiles called mosaics. One mosaic describes a full graph (or forest) with all its features, but the fact that tiles inside it are independent means that you can isolate sub-mosaics for the things you need (for example, if you no longer need the positions of your nodes, you can leave them out and then use a smaller, more condensed — and thus faster — mosaic going forward).

I hear you saying “yes, but you just represented the state machine there, it isn’t working!” and I agree – this is a representation, but one with which we can go far: while it’s completely possible to write code to work with this graph, we can get mosaics up to represent any graph, and what is a compiler, for example, if not a rulebook for how to transform one complex state into another? More on this topic, and our grasp project at some later time, though.

Data-oriented Design Adjacency

It’s common knowledge that graphs are the bane of data-oriented frameworks, mostly seen in ECS. It’s a real bother to query components to get references to other entities and then do mini-queries over those. It’s like throttling an engine, and always feels bad. Apparently, that’s an artifact of coupling data and structure.

Tiles come pre-packed with an entity identifier, and their relations are included in the source and targets, and strictly kept outside the data. The actual type and data fields of every tile are straightforward: data is a vector of bytes typechecked against the type field using a pretty simple type system.

We have base type definitions, such as:

Label: s32;
Number: i32;

As base types, we have unit for a void-data empty type with a single valid value that never needs to be set, i8, i16, i32, i64 for signed integer types, u8, u16, u32, u64 for unsigned integers, f32 and f64 for floating-point types, s32 for bounded 32-byte strings, s128 for 128-byte strings, and bool for booleans.

Together with these, there are record types such as:

Parameter: { name: s32 };
Position: { x: f32, y: f32 };
Error: { code: s32, line: u32, caret: u16 };

The notation used here is standard for struct-like types, and doesn’t require much explaining. Types also allow for nesting, such as in:

Line: { start: Position, end: Position, width: u8 };

Types defined this way resolve into a completely flat list while in use:

Line: { start.x: f32, start.y: f32, end.x: f32, end.y: f32, width: u8 };

Because of their structure, mosaics can be used in two ways, depending on your needs: one has you work just with tiles, with all the data in them — this is convenient for general usage, but with many tiles, it may be rough on memory due to cache invalidation. It’s trivial, however, to change storage so that tiles are (id, source, target, type) tuples, and data is kept outside it in a sequential memory store, indexed by id, optimized via sparse set usage, separated by type (as every type defines exactly the same size). This makes it easy to use mosaics naturally within an optimized ECS framework at a very low cost of the extra structural data.

Tiles seem to solve the circular dependency we observed before: the name is aptly chosen — it isn’t that the problem disappears, it’s just that it’s localized in the glue behind every tile of a mosaic. There’s no global structure and global data, it’s always cut into simple pieces and padded behind what looks like regular code. The underlying quality of mosaics not having a semantics assigned to them beforehand is what makes them wildly generalizable.

We’ve built an extensive library on top of this to allow for the various ways to extract data from mosaics to surface, but the real fun starts when other people take this into their own hands and write their own stuff on top. Or draw! Both work. If any of this sounds interesting, follow us for more!


Posted

in

by

Tags:

Comments

One response to “Coding as a Circular Dependency”

  1. […] from, and what the foundation for our solution looks like, we have a blogpost about it right here. Today, however, we’re going to be talking about building structure, both fundamentally and […]

Leave a Reply

Your email address will not be published. Required fields are marked *