Subterra: A Mistake Gone Right

Oct 24, 2017 programming, language design

About a year ago, I created a programming language. This wasn't the first time I'd done so; I'd churned out several Turing tarpits, none of which were likely anywhere near actually earning the Turing part of that name, and none of which had gotten much attention at all, even when I did release them. By this point, I was quite experienced at hacking together a bit of Python code to interpret a seemingly arbitrary sequence of symbols as simple operations on a single data stack. For a while, that had satisfied me, but I was starting to grow a bit restless. I wanted to make something useful - not necessarily something that would be used, but something that could be used if anyone happened to be insane enough to try.

This is the story, as best I remember it (and probably with a little poetic license), of how I created Subterra: the mistakes I made, the troubles I went through, and the lessons I learned.

The Goal

I set out creating Subterra with a simple goal in mind:

Subterra is an esoteric programming language designed to be simple yet powerful. Every instruction in Subterra consists of only one character, it uses a single stack for data storage, and the only datatype it handles is integers (a typical "tarpit language").

~ from the Subterra wiki

If that's somewhat cryptic to people not really involved in the esoteric programming language community, my apologies - essentially, it boils down to three goals:

The first two goals are simple enough; these are things I'd done with nearly every programming language I'd created before. The third, however, is a bit trickier - how do you define how powerful a programming language is?

In the case of Subterra, this essentially ended up meaning that it had a few features not usually found in tarpit languages:

This all seemed simple enough to accomplish on the outside, but in reality, it was anything but.

The Problems

For the sake of conciseness, I will skip over the issues raised by the latter three features above. For the most part, they were relatively simple to implement, and the worst I had to deal with in the process was tedium, not extreme challenge. Importing simply built on the existing subroutine structure, the source language imports were simply a matter of some slight tweaks to that logic and to the in-memory representation of subroutines, and error logging, while tedious to add, was far from complicated.

With that said, the other two problems provided much more difficulty. Let's start from the top...

Functions (Subroutines)

Subterra's implementation of functions is... interesting, to say the least. Since all data is manipulated through the stack, there is no concept of "parameters" as there is for functions in most programming languages; perhaps this is why I settled on the name "subroutines" instead.

There are three types of subroutines in Subterra, which can be told apart by the type of brackets that enclose them:

Subroutines also had a sort of implicit return built in - excepting the {} type, at the end of their execution they would pop the top value from their own stack, if it existed, and push it to the calling context's stack.

When a subroutine was defined, the interpreter would pop an integer from the stack to use as its ID. This ID could then be used to refer to and call the subroutine. Certain constructs, such as while (w) and if (?), were also implemented in terms of subroutines; they could be passed either a subroutine body itself or the ID of an already-defined subroutine.
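The ID lookup and implicit return described above can be sketched in Python roughly as follows. This is a minimal illustration, not the actual Subterra interpreter: the names `subroutines`, `define`, and `call`, the `implicit_return` flag, and the idea of representing a body as a Python callable are all assumptions made for the example. It also assumes each subroutine runs on a fresh stack of its own.

```python
# Hypothetical sketch; names and data layout are assumptions,
# not Subterra's real code.
subroutines = {}  # maps integer ID -> (body, implicit_return)


def define(stack, body, implicit_return=True):
    """Pop an integer ID from the stack and register the body under it."""
    sub_id = stack.pop()
    subroutines[sub_id] = (body, implicit_return)


def call(stack, sub_id):
    """Run a subroutine on its own stack, then apply the implicit return."""
    body, implicit_return = subroutines[sub_id]
    local = []  # assumed: the subroutine gets a fresh stack of its own
    body(local)
    if implicit_return and local:
        # Pop the top value, if one exists, and push it to the caller's stack.
        stack.append(local.pop())


def push_two_values(local):
    """Stand-in for a subroutine body: leaves 3 and 7 on its stack."""
    local.extend([3, 7])


main = [1]                     # the 1 is popped and used as the ID
define(main, push_two_values)
call(main, 1)
print(main)                    # [7]
```

Only the top value (7) makes it back to the caller; the 3 left underneath is discarded along with the subroutine's stack.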

This all sounds fairly simple in retrospect, but at the time, it caused me no end of trouble to implement. Stack overflow errors, recursion-related bugs, and more plagued me for days. To this day, I would say that the single biggest source of bugs in Subterra is this crazy, backwards way of implementing functions.

Lessons Learned


Strings

Strings, while not quite so large a problem as functions, were still quite the troublemaker. Trying to represent text in a language that only knew how to deal with integers was a perilous journey for a 16-year-old kid who had yet to fully grasp how binary worked for anything but math. There were two pieces to deal with: the representation of strings in the source code, and the representation of strings in memory.

The representation in the source code was the easier of the two. I used the typical style of enclosing the text in either single (') or double (") quotes. On the parsing end, I made a special exception in how the parser worked; on encountering a quote character, it would defer to a specialized parser that built the string, handled escape codes, and returned control to the main parser when the matching quote was found. On the whole, it was far less imposing a task than I'd expected.
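A specialized string parser of the kind described above might look something like this in Python. It is only a sketch under assumptions: the function name `parse_string`, the particular escape codes in `ESCAPES`, and the choice to keep unknown escapes literally are all invented for illustration, since the set of escapes Subterra actually supports isn't listed here.

```python
# Hypothetical sketch; the escape table and function name are assumptions.
ESCAPES = {"n": "\n", "t": "\t", "\\": "\\", "'": "'", '"': '"'}


def parse_string(source, start):
    """Parse a quoted string that begins at source[start].

    Returns (text, index just past the closing quote) so the main
    parser can resume from there.
    """
    quote = source[start]
    if quote not in ("'", '"'):
        raise ValueError("expected a quote character")
    chars = []
    i = start + 1
    while i < len(source):
        ch = source[i]
        if ch == "\\":
            if i + 1 >= len(source):
                break  # dangling backslash: report as unterminated below
            nxt = source[i + 1]
            # Unknown escapes are kept literally (an arbitrary choice here).
            chars.append(ESCAPES.get(nxt, nxt))
            i += 2
        elif ch == quote:
            return "".join(chars), i + 1
        else:
            chars.append(ch)
            i += 1
    raise ValueError("unterminated string literal")


text, resume_at = parse_string('"hi\\n" +', 0)
print(repr(text), resume_at)   # 'hi\n' 6
```

Returning the resume index alongside the text is what lets the main parser hand off control at the opening quote and pick up again right after the matching one.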

The representation in memory was somewhat more difficult. There wasn't any way to build in awareness of what was and wasn't a string on the stack without changing the core ideals of the language. This left the task of keeping track of the difference between string and number to the programmer, which made my job somewhat easier. I eventually settled on pushing the string, character code by character code, onto the stack from the end to the beginning, followed by a length value. This gave an easy way to find the end of the string, without resorting to strange hacks or workarounds.
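As a rough Python sketch of that layout (the helper names `push_string` and `pop_string` are made up for this example, and a plain list stands in for Subterra's stack):

```python
# Hypothetical helpers illustrating the length-prefixed layout.
def push_string(stack, text):
    """Push character codes from the end to the beginning, then the length."""
    for ch in reversed(text):
        stack.append(ord(ch))
    stack.append(len(text))


def pop_string(stack):
    """Pop the length, then pop that many character codes in order."""
    length = stack.pop()
    return "".join(chr(stack.pop()) for _ in range(length))


stack = []
push_string(stack, "Hi!")
print(stack)               # [33, 105, 72, 3]
print(pop_string(stack))   # Hi!
```

Because the characters go on in reverse, the first character ends up directly beneath the length, so popping reads the string back in its natural order.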

In the end, the trouble I had with strings was less an actual problem and more a problem with my own thinking. The solution was right in front of me, without even changing the limitations that I set for myself. I was simply too caught up in my own mindset to see it.

Lessons Learned

The Aftermath

A year later, I haven't really touched Subterra much. My last commit to the GitHub repo was on September 1st, 2016, only about 2 weeks after the initial commit. While it was a short-lived project, it's one that I remember fondly; not because it got me attention, or because I created something honestly useful, because it didn't and I didn't. What Subterra did do for me was teach me some important lessons about programming, about programming languages, and about myself.

I wouldn't recommend making your own programming language to everyone. Until you get the hang of it, it can be quite a grueling process; it takes time and effort that a lot of people just aren't willing to spend. If you do decide to try making your own programming language, though, don't let setbacks like the ones I had with Subterra get you down - learn from them, move through them, and never be afraid to ask for help. Who knows, maybe your creation will be the next Java... or maybe it will be your Subterra. Either way, I wish you the best of luck.