Writing a 2D platform game engine in Golang

Tue, Apr 3, 2018       22-minute read

This article describes the hurdles I encountered during the process of building a simple game engine in Golang. To aid in building the engine I built a demo game alongside it: an approximation of the first level of Super Mario Bros, to which this article makes occasional reference.

As a working software developer I very much respect the importance of copyright, so I feel it is important to note that neither the demo game nor its source code are available for download. If, like me, you enjoy playing the original Super Mario Bros, there are numerous legitimate options for obtaining a copy.


My day-job centres around video analytics and most of my professional life has involved the development of web services in the marketing and eCommerce sectors. So to say that I’m not a game developer is an understatement. That having been said, the game developers of the eighties and nineties greatly inspired me.

My first foray into game development occurred in the nineties, consisting of Borland Delphi, dozens of TImage components (and at least as many TTimer components) loaded haphazardly onto a form. I think I wanted to make a Sonic the Hedgehog game. Of course it was a disaster.

But now, in my second decade as a professional programmer, the thought occurs to me that maybe — just maybe — I might finally be good enough to make a game like they did back in the day.

Here is what I managed to build. Read on for details of the choices I made and the challenges I faced.

Finding inspiration

Recently I have been binge-watching videos from the GameHut YouTube channel.

For anyone unfamiliar with GameHut: it is the YouTube channel of Jon Burton, founder of the game studio Traveller’s Tales. On the channel Jon delves into the technical challenges of creating games for consoles such as the Sega Mega Drive (Genesis) and Saturn.

I found it fascinating how the limitations of the Mega Drive forced developers to be inventive when it came to storing and displaying graphical data. A video describing the challenges of animating the Sega logo at a frame rate well above the console’s limitations blew my mind.

As software developers in 2018, we can provision a virtual machine with whatever level of resources we require and deploy a containerised application to it within seconds. If we need to install an NPM dependency that pulls in an additional hundred dependencies, we don’t bat an eyelid.

I can’t speak for other developers, but every day a little part of me feels like I’m copping out by having so many luxuries at my fingertips. I felt inspired by GameHut’s videos and wanted to see if I could have cut it as a developer back in the day.

It all started with a sprite

At the core of any 2D game is the notion of sprites; small graphics that represent a game object or character. My initial inclination was to create a basic sprite engine using the knowledge I hoped I had gleaned from Jon’s videos.

As much of my recent development has been in Golang, it seemed to be the logical choice for the project. Not least of all because of the language’s reputation for speed: my past experience with graphics programming has been that working with scanlines is immeasurably faster than individual pixels, which my plan revolved around.

From the outset I assumed that my experiment would work, but that it might not be able to process more than a few frames a second.

To test the theory I settled on a sprite format that was 16x16 pixels in size, represented by a series of 32-bit hexadecimal integers. Using this approach, a single sprite would consume only 128 bytes of memory.

In code, a sprite looks like this:

{
    0xcccccc95, 0x52cccccc,
    0xcccccb7f, 0xf78ccccc,
    0xccccc5f7, 0x575ccccc,
    0xccccc59b, 0x975ccccc,
    0xccccc914, 0x17fccccc,
    0xccccc5e4, 0xa572cccc,
    0xcccc25dd, 0xd27fdccc,
    0xcccdf8dd, 0xddf75ccc,
    0xccc2fddd, 0xddb77dcc,
    0xcccfbddd, 0xdd877bcc,
    0xccdf2ddd, 0xd8277bcc,
    0xc8ea0bd8, 0x880ffbcc,
    0xc0aa4f2d, 0x824aa08c,
    0xc03333e8, 0xb534339c,
    0xc913336f, 0xff331bdc,
    0xccd2968d, 0xdd068ccc,
}

A sprite also requires a colour palette; a collection of up to 16 RGBA values named 0 to F (each character in the hexadecimal number refers to a colour on the palette).

A well-designed game would reuse colour palettes between sprites, and a sprite’s palette could even be swapped to create different game objects with the same sprite data. One of the best-known examples of this is that of clouds and bushes in Super Mario Bros:

In code, a colour palette looks like this:

{
    "0": color.RGBA{209,177,97,255},
    "1": color.RGBA{212,178,39,255},
    "2": color.RGBA{193,192,190,255},
    "3": color.RGBA{250,223,23,255},
    "4": color.RGBA{250,215,88,255},
    "5": color.RGBA{129,129,127,255},
    "6": color.RGBA{159,150,92,255},
    "7": color.RGBA{33,31,30,255},
    "8": color.RGBA{217,216,214,255},
    "9": color.RGBA{171,164,135,255},
    "a": color.RGBA{254,228,149,255},
    "b": color.RGBA{175,174,172,255},
    "c": color.RGBA{0,0,0,0},
    "d": color.RGBA{246,246,246,255},
    "e": color.RGBA{211,193,152,255},
    "f": color.RGBA{92,87,79,255},
}

To store the scanline integers and palette data, I created a Sprite struct that took the following form (where a Palette type defines the code above):

type Sprite struct {
    Palette   *Palette
    Scanlines *[]int
}

Attached to the Sprite struct is an AddToCanvas() method, into which a set of X / Y coordinates and an *image.RGBA pointer could be passed, making it trivial to paint a sprite onto a frame.

However, it soon became apparent that some game objects would require sprites larger than 16x16 pixels. To avoid complications, I opted to create a SpriteGroup struct that took the following form:

type SpriteGroup struct {
    GroupWidth  int
    GroupHeight int
    Sprites     *[]*Sprite
}

If I needed to create a sprite that was 32x32 pixels in size, four Sprite objects (following left-to-right, top-to-bottom) and width / height values of 2 / 2 could be passed into a new SpriteGroup object, which has its own AddToCanvas() method.

I then added a SpriteInterface interface type that describes both the Sprite and SpriteGroup structs, making it possible to treat either type as a generic sprite and not have to worry about the specifics.

As there is no technical limitation preventing sprites within a group from using different colour palettes, with a creative design it is possible to build a game object that seamlessly uses dozens or hundreds of colours.

After building the basic sprite structs I almost downed tools, satisfied that yes, I probably would have been able to work on games in the eighties and nineties, had I been in the workforce at the time.

But a little voice in my head said, “You’ve done the hard work now… just build a game around the sprites; it’ll be easy.”

I should really learn not to listen to that voice.

Abstracting the libraries

Being able to create sprites and paint them to a blank canvas is only useful for a game engine if that canvas can also be displayed on-screen. As Golang is primarily known as a language for writing command-line applications, I was unsure how best to proceed.

I initially tried two or three windowing libraries that I simply could not get working. I recall that one of them relied on system libraries that my development machine did not possess, and another seemed to require third-party dependencies that had since published breaking changes.

Eventually I found an exp package called shiny that seemed to be a favoured option for windowed applications.

It took a few hours of trial and error to divine from the documentation that I needed to upload and publish an image buffer to the window object in order to display graphics on-screen, but once I figured that out the hard part was over. I wrapped a Window object around shiny and a Game object around the Window object.

Eventually the entry point to the game ended up looking like this:

CreateGame("Window Title", 320, 240, 2, 30, painter, keyEvents, levels)

Where the second and third arguments are the window/canvas width and height, the fourth argument is the scaling factor (more on this later) and the fifth argument is the target frame rate in FPS.

The painter argument is a function that is called after the game engine has painted each frame (I used this to overlay debugging information), keyEvents is a function to which keyboard events are passed and levels is a slice of game level objects.

Rate limiting

Examples for using shiny to paint a window typically involved listening for a paint.Event event, repainting as necessary and then immediately triggering the next paint.Event event.

I wanted games to be able to set a desired frame rate, so initially I included a condition around the repainting logic to skip an iteration unless the correct amount of time had passed since the last repaint.

This seemed to work well, as I generally observed a frame rate accurate to within a few hundredths of a second, even up to rates of 120 frames per second. However, this method consumed 100% CPU.

A better approach was to compare the current time in nanoseconds against the time of the last repaint, and if not enough time had elapsed to sleep until the next frame was due. This brought the CPU levels down to a more reasonable 5-10%.

Embiggening your world view

My development machine has a HiDPI display, as do most devices nowadays. Working with a standard game resolution of 320x240 pixels, it soon became apparent that 16x16-pixel sprites on modern displays are generally too small to see in any detail.

To fix this I decided to implement a scaling factor to the game engine, using nearest neighbour resampling to maintain the ‘pixelated’ feel I sought from the project.

My first attempt involved adding a multiplier to every piece of code that dealt with painting, but this resulted in too many touchpoints to be manageable. In the end I decided to wait until the whole frame had been repainted and scale it just prior to it being published to the window.

This resulted not only in fewer lines of code, but also had the benefit of only requiring a single additional image operation, as opposed to an increase in the number of individual pixel operations by a factor of n².

Further painting optimisations

From my previous ill-fated attempts at recreating classic games (I have attempted on more than one occasion to build something akin to Age of Empires), I knew that a common mistake was to repaint game objects that are outside of the user’s viewport.

To that end, when iterating through the game objects prior to publishing a new frame, only those with at least one pixel intersecting the visible part of the level would be repainted.

Delegating responsibility

Initially, much of the game code that I was writing existed within the engine and much of the engine code I was writing existed within the game. It took quite some discipline, but one evening I forewent the development of new features to sort through the code, divide it into packages and more-or-less cleanly separate concerns.

Most of this tidying was simple, boring grunt work, but there was one decision I struggled with: where should the code that handled movement reside? Should the engine merely provide a canvas that the game can take full responsibility for painting sprites onto, or should the game merely tweak the precalculated properties of game objects (if needed) prior to the engine repainting them?

I eventually opted for a hybrid approach that leaned slightly closer to the latter option, deciding to build the concepts of basic movement and physics into the engine, but to require the game to define all of the starting values such as velocity and direction.

Defining game objects

The Level object became responsible for much of the frame-by-frame action, but largely in conjunction with the GameObject object. In hindsight these two objects are probably a bit too tightly coupled, but there are not many scenarios in which one might be used without the other.

The properties of the GameObject struct define the characteristics of the sprite(s) that it contains, whether it be a static floor tile or the game’s primary controllable character.

The basic struct looks like this:

type GameObject struct {
    CurrentState   string
    States         map[string]SpriteSeries
    Position       Vector
    Velocity       Vector
    Mass           float64
    Direction      int
    IsFlipped      bool
    IsControllable bool
    IsFloor        bool
    IsInteractive  bool
    IsHidden       bool
    FloorY         float64
}

The States property is the most important; it contains a map of sprite collections that can represent all states of the game object (e.g. ‘standing’, ‘running’, ‘jumping’, etc.). A sprite collection is simply a group of sprites that form an animation when run consecutively.

CurrentState is the key of the States map that is currently being used to display the game object. In the context of my Mario demo, part of the code executed when intercepting a directional key press is to set CurrentState to moving for the controllable game object (Mario).

The Vector type is a struct that contains X and Y float values. The Position property is intended to be set initially in code to define the starting point of the game object, but thereafter would be updated by the engine based on the values of the Velocity and Direction properties, which themselves can be set by the game where required.

The Direction property serves a dual purpose: its value (-1, 0 or 1) not only determines whether the engine should move the object left, not at all or right, respectively, but also toggles the IsFlipped flag if the value is set to -1. This flag instructs the engine to mirror the sprite horizontally when painting it, halving the number of sprites required to represent full movement of a non-symmetrical object.

Finally, the IsX flags keep track of what the object can and cannot do, and are mostly handled by the engine. However, they can also be set by the game to manipulate the behaviour of a game object (e.g. hiding an enemy’s sprite and making it non-interactive when it has been killed).

Staying down to Earth

Jumping is perhaps the primary action performed by a character in a platform game. Adding jump mechanics proved to be trickier than I originally thought, and it took a number of iterations of the code before I was satisfied that I had achieved something that felt right.

Initially I implemented a straight-up-straight-down mechanism. It was similar to what I had implemented in previous ill-advised game projects, but obviously did not feel at all accurate.

By factoring in mass and gravity I was able to solve this problem.

In a nutshell, jumping increases the game object’s velocity on the Y axis, which at rest is 0. For each frame the game engine increases the Y position of the object by its Y velocity, and decreases the velocity by the object’s mass multiplied by the level’s gravity:

gameObject.Position.Y += gameObject.Velocity.Y
gameObject.Velocity.Y -= (gravity * gameObject.Mass)

The graph on the left shows the Y position over time with gravity (black) vs the Y position without gravity (grey), and the graph on the right shows the Y position with gravity (black, flipped) and the pull of that gravity (grey).

Using this more realistic approach, each rendered frame would see the upward velocity gradually decrease until it strays into negative figures and the level’s gravity returns the object to the ground.

For the controllable game object, the following figures provided satisfactory results:

Level Gravity:     0.5
Mass:              1.0
Y Velocity (Slow): 6.0
Y Velocity (Fast): 8.0

An unintended (but extremely useful) side-effect of this approach is that tweaking the level’s gravity or the object’s mass allows different objects to exhibit different movement patterns. A water level, for example, could be achieved by reducing the level’s gravity, and floating objects (such as blocks) can be achieved by setting their mass to zero.

Maintaining momentum

Implementing a jumping mechanism brought its own set of problems, however. As directional key presses set the controllable game object’s direction (and therefore movement along the X axis), releasing a direction key mid-jump had the unfortunate effect of stopping the character mid-air.

In an attempt to solve this problem I made the game ignore the key event if a jump was underway, but store it for execution once the jump had run its course: a game object with a positive velocity would be carried forward by its momentum while it was in-flight, but come to a halt once it hit the ground.

Unfortunately, this revealed a further problem: old games simply don’t behave that way.

As a geek in his thirties I can probably navigate the first level of Super Mario Bros with my eyes closed, so when navigating the first hurdles in my demo I found that I was instinctively trying to do things that the game did not know how to handle — specifically: letting go of buttons mid-air to achieve a decelerated fall back to Earth.

This is not, of course, how physics works; if I flung myself into the air I could not alter the duration of my jump unless I had a parachute or some other means of significantly increasing or decreasing my drag.

Working around this challenge was tricky. The logic I eventually settled on was that if a direction key is released mid-jump, the character’s velocity on the X axis would half.

This more or less makes the game behave how I expected it to, but is not a perfect solution and my gut tells me that the old games also provided a degree of control over reversing the character’s direction mid-jump. This is something that I need to investigate and revisit at a later date.

Making an impact

One of the most difficult things to implement in the game engine turned out to be collision detection.

Detecting collisions themselves is relatively easy: simply map the X position of each game object and where two overlap check if their Y positions also overlap.

By noting the X coordinates of each object it is easy to tell if two objects overlap on the X axis — here the blue object overlaps with the green and red objects, meaning only two lookups on the Y axis are needed to find those objects that are colliding with it (in this case only the green object also overlaps on the Y axis).

Instinct is to compare the position of every object against every other object, which would be O(n²) in complexity. Utilising the aforementioned two pass approach, however, reduces the average number of required computations considerably.

There are possibly more efficient approaches, as even with this method it is possible to require n² calculations if every object were to overlap every other, but for game levels with a standard distribution of objects it is relatively low impact.

What proved surprisingly difficult was deciding on which edge a collision had occurred.

I read a little on the subject and favoured an approach that calculated how far into the colliding object the game object was in each direction. Assuming a uniform velocity, the edge with the shallowest intersection should be the most recent edge to have collided.

Assuming the character fell onto the block, it seems obvious that accepting the edge with the shortest intersection distance will provide the correct collision edge.

However, I soon found that this approach was not reliable, in particular when altering the velocity of a game object along one axis only. Consider a character that is moving at high speed along the X axis, and jumps into an object that it can almost (but not entirely) clear.

In this example the character hits from the side after just failing to clear the block, but the edge with the shortest intersection distance is the top.

In this scenario, the difference in the objects' intersection on the X axis is large due to the directional velocity, but the intersection on the Y axis is small as the upward velocity fell just short of that required to avoid a collision on the top edge. The algorithm would determine that the collision occurred from above, but a human observer would rightly state that the collision occurred from the side.

This issue reared its head almost immediately after programming collision logic for enemy game objects. Unless striking them directly from above, the algorithm found most of the time that the glancing hits that should kill an enemy (e.g. when jumping onto them whilst moving) instead registered as a collision from the side.

Hoping for a quick solution, I attempted to reverse the logic, so that the deepest intersection would indicate the collision edge. Unfortunately this just shifted the problem from one type of collision to another.

Eventually I decided to start from scratch with a new method that would require more game code but would be more accurate overall. Rather than worrying about depth, the engine would simply report the sides on which the objects intersected, including combinations such as top_left, top_right, etc.

I could now have enemies take damage if they suffered a top_left, top or top_right collision, but inflict damage if they collided only on the left or right sides.

This did, however, present one final challenge: I was initially testing with a 16x16-pixel object that only collided with other 16x16-pixel objects. When I began to piece together the scenery of the level I found that bugs emerged when one of the colliding objects was larger than the other (Super Mario, for example, would register a form of top collision with Goombas when he walked into them).

To solve this problem I implemented an additional piece of collision logic that would always report a simple left or right edge if both objects were stationary on the Y axis at the time of the collision.

Treading carefully

At an early stage I made the decision to add the concept of ‘floors’, rather than relying on vertical collision detections between objects to halt free fall. I am still in two minds about this decision, but a definite benefit was gained from allowing an object’s IsFloor flag to be toggled on-demand (in particular I’m reminded of the rotating floor sections of Sonic the Hedgehog 2’s Chemical Plant Zone).

The code that identifies floors is similar to the first pass of the collision detection code — each object inspects a list of other objects that are known to intersect it on its X position and its FloorY property is set to the upper bound of the closest object that is beneath it. The engine then uses this property to halt the object’s downward movement at the appropriate frame.

Follow the leader

A common element of side-scroller gameplay is for the camera to follow the controllable character very closely. In some games, such as Sonic the Hedgehog, the character may approach the far edge of the screen as an indicator of above-average speed, but the majority of games keep the player’s character centre-screen.

I wanted to keep the option open to take either approach, so instead of tying the camera position directly to the controllable character I opted to implement a horizontal and vertical scroll offset for the Level object.

Prior to the engine triggering a level repaint, a BeforePaint() function is called to allow the game to make last-moment adjustments of the level’s game objects. I was able to use this function to adjust the scroll offset based on the controllable character’s position — if it was within half a screen of the beginning or end of the level nothing would change, but at any point in-between the offset would be set such that the controllable character was always in the centre of the screen.

The offset approach has the nice side-effect of allowing each level to decide when and by how much the scrolling effect should be applied. It is just as easy to follow the controllable character as it navigates horizontally as it is to have the character progress vertically up a level and shift the view one screenful at a time (an effect used nicely in parts of Super Mario Bros 2).

Future development

Whether or not I’ll continue development of the engine is 50/50 at the moment. The world doesn’t really need another game engine and the primary reason for building this one was just to see if I could. Having satisfied myself that it was possible, there isn’t a whole lot of motivation to take it any further — I’m not a game developer, after all.

That having been said, I do wonder about sound. In a modern system it just wouldn’t be practical to interface directly with sound cards to play audio like they used to, but I think it would be interesting to build an interface that mocked such a sound card and generated PCM audio of the emulated channels. But this is something that I need to do some research on before delving into.

What would I do differently?

If I were starting this project again from scratch I would do a number of things differently.

I would certainly work more on early optimisation of the workload; I am not happy with the load the engine places on my test machine. There is a limit to how closely I can emulate the old 8-bit systems, of course — I do not have direct access to the hardware — but I can’t help but feel that I’m missing some tricks that developers in the eighties employed to stay within the limited number of CPU cycles they had at their disposal.

It may also be beneficial to dynamically place and remove game objects onto the level so that the engine has less to process each frame. Static objects that are out of view do not need to be processed every frame, and traditionally the starting position of game objects is fixed until the player approaches them, so I suspect that this may be a tactic already undertaken by most games.

Related to this: building up the demo game has seen its CPU usage creep back upward of 40% on my test machine, which I suspect to be a consequence of the per-pixel painting method and large number of game objects that require collision detection several dozen times a second. This is something to be investigated in the future and may be helped by implementing larger sprites for walls/floors rather than building them up from smaller sprites (one game object vs hundreds of game objects for the floor, for example).

I think I would also build a level editor into the game engine, making it possible to select and place PNGs as sprites, set their initial properties visually and maybe do some work around setting up collision detection functions, even if the specifics then have to be coded manually in the auto-generated .go files.

I would also plan out the general code structure before starting. As the engine grew organically out of a single-file proof of concept, much of the code design was reactive rather than planned. In general I think I’ve done a good job to separate concerns and logically place engine code, but too much of the codebase feels tightly coupled and I suspect that when I look at adding unit tests I might run into some issues.