The Alphabet of Games Programming (Part 1)

Games Programming is a large topic, containing a whole lot of different parts. I wanted to try something different by listing a whole bunch of concepts with the only constraint of starting with a letter of the alphabet. My goal was to gather some of the more relevant online resource per topic.

Here is my Part 1 (letter A-H) Games Programming Alphabet:

Soviet stample commemorating 1200th anniversary of Al-Khwarizmi

Algorithm, named after Persian polymath Al-Khwarizmi, is a sequence of instructions and can be set as the mathematical abstraction of a computer program. I still remember when I learned Binary Search or good old Quick Sort. The complexity O(n) notation can help realize that having more than three for-loop might not be the best idea. A good programmer has a toolbox of algorithms in his head. C++ has even the algorithm header (https://en.cppreference.com/w/cpp/algorithm) with pretty useful functions that you can combine with lambda functions (Jonathan Boccara CppCon talk goes through all of them: https://www.youtube.com/watch?v=2olsGf6JIkU). It is also important to implement algorithms by ourself, understand the data structures behind them and benchmark their performance against standard implementation (for example, implementing the A* algorithm by hand, visualize it and optimize it).

Benchmarking 4×4 matrices multiplication and beating Clang at it!

Benchmark is a very good way to prove that your implementation works. Obviously, it is easier with leaf functions that you can isolate, benchmark and check and understand the assembly behind it. Godbolt online compiler is an amazing tool to check assembly out of leaf function (for example here: https://godbolt.org/z/XnTkzx); you can also see the differences between compilers (MSVC, clang, gcc) and their different versions. I typically use google benchmark (https://github.com/google/benchmark) to run my benchmarks, check if my optimizations actually work and compare functions directly. I even have a little python script that generate a graphic out of a google benchmark json file (https://github.com/EliasFarhan/NekoEngine/blob/develop/scripts/bench_graph.py). One thing to always be careful with is confirmation bias, where you are calibrating your benchmark to run faster instead of optimizing your code. Good performance principles are nice, but data about performance is always more important.

Intel 80486DX2 microprocessor, because I did not have a CPU at hand

CPU is the center of everything a programmer does. A modern CPU can calculate 40 additions while light moves one meter (in a ideal situation of course). We basically have monster machines and we have no idea how to use them correctly. A lot of programmers have no idea that most of the time, the CPU is waiting for data from the RAM (Mike Acton CppCon 2014 is a pretty good talk to understand those kinds of problems https://www.youtube.com/watch?v=rX0ItVEVjHc, as well as Timur Doumler CppCon 2016 talk: https://www.youtube.com/watch?v=BP6NxVxDQIs). Fortunately, CPU have L-caches that reduce the latency to get data, but it requires data to be close to each other. This is the hardware reason why std::vector is way faster than std::list to iterate over. CPU typically have a 64 bytes cacheline, meaning that you get 64 bytes of data at once (sort of) through the typical three levels of cache before they arrive in CPU registers. Data locality is also a reason why object-oriented programming is not the way to go with high performance code (the 20% of code that is run 80% of the time). This is where memory layout transformations are used to tranform Arrays of Structures (AoS, the typical way we lay out data in memory in OOP) to Structures of Arrays (SoA, the better way for the CPU to process the data). The final stage is Arrays of Structures of Arrays (AoSoA) that combines strength of the both worlds (a very good article by Sam Reeve about it here: https://github.com/ECP-copa/Cabana/wiki/AoSoA). Jonathan Blow talked about it and the problem he had on the Witness here: https://www.youtube.com/watch?v=ZHqFrNyLlpA.

Data structures always have good sides and bad sides, and it is important to understand them. A good way to truly understand a data structure is to implement it (how is it allocating memory, how is data lay out in memory, how to insert and delete data from the structure, how to get data). There are data structures in C++ standard:

  • std::list is a double-linked list with pointer going forward and backward (and is never to be used! Data scaterred in memory is not a good idea for performance; if someone has a good example to use std::list instead of std::vector let me know),
  • std::vector has nothing to do with mathematical vectors, is a dyanimc array (probably one of the best data structures in C++, but that does not allow small array optimization; in other programming languages you might have something like ArrayList),
  • std::array is the child of good old C-array with some additional C++ functions like begin(), end() and size(),
  • std::unordered_map maps keys with values, with keys being stored as hash tables (unfortunately buckets of key-value pairs tend are linked lists as well. Chandler Carruth talk at CppCon 2014 is pretty at explaining performance with data structures: https://www.youtube.com/watch?v=fHNmRkzxHWs).

Typically, game programmers tend to implement their own data structures (like EASTL: https://github.com/electronicarts/EASTL).

ECS diagram by Tobias Stein (https://sudonull.com/post/64555-Entity-Component-System-Design-Pattern-Implementation-and-Example-Game)

Entity-Component-System is the typical architecture used to represent game objects (instead of Object-Oriented Programming, Vittorio Romeo talk at CppCon is pretty good introduction on this subject, even though I’m not a complete fan of the implementation: https://www.youtube.com/watch?v=NTWSeQtHZ9M). Typically, components are data, systems are code and entities are simple indexes. With such an architecture, it is way easier to build system using components with data locality (which is super important for CPU caches) and multithreading. Unity Game Engine is currently switching from their GameObject-based implementation of ECS, to a more pure ECS implementation, with a very promising approach through their JobSystem (that pushes code in a multithread manner sort of for free) and Burst Compiler (that compiles a subset of C# directly to assembly through LLVM, even switching some of their C++ engine systems to C#). Blizzard’s Timothy Ford had a very good talk on their ECS implementation for Overwatch here: https://www.youtube.com/watch?v=ZHqFrNyLlpA, with simple rules like components have no functions and systems have no state.

Because there is a difference between 30 and 60 frames per second.

Frames per second, because we only have 16.66 ms to render a frame on a computer/console/mobile if we want to target 60Hz. And there is so many things we have to do in 16.6 ms: input management, physics, resource loading, gameplay, AI, rendering and more… To be able to do that, game programmers have to use all the available resources (mulithreading for example) and to only do the necessary work for the frame. This is the main reason why game programmers go low-level to the same level than high-frequency trading finance programmer. Waiting 1 second on your HTML5 + Javascript for an action to complete is okay, but pressing a button in a game requires near instantenous reaction. There is a very good talk on frame pacing by Alen Ladavac at Reboot Develop Blue 2019 here: https://www.youtube.com/watch?v=n0zT8YSSFzw.

A quick way to explain garbage collection in programming langagues by Jyothsna Srinivas: https://medium.com/@jyothsnasrinivas/https-medium-com-jyothsnasrinivas-what-is-a-garbage-collector-d0e152110219

Garbage collection (GC) is the main memory management mechanism in OOP languages like C#, Java, Python. It means that allocating memory in those languages will be managed automatically (no need to free or delete pointers like in C++), but there are downsides. GC makes program stalls when collected, which is unacceptable for real-time applications. This is one of the reasons why game programmers are allergic to new languages even the most promising. You will typically here them disregarding a language completely just because of GC. In C#, there is a clear difference between classes and structs. Classes are reference types allocated on the heap (sort of at random like a list) and are memory managed by the GC (meaning that allocating a lot of them and destroying them will make your program stalls). I still remember a case using Spine on Unity was making the game stalls hard trying to read the length of an animation, because the whole skeleton data needed to instantiate 1MB per second (I am not completely sure why, but caching the value solved the problem). Structs are value types, meaning that an ArrayList of structs are aligned in memory, making it way faster to iterate over (this is why Unity defined the HPC# for jobs to use the Burst compiler, Joachim Ante talked about it as early as 2017: https://www.youtube.com/watch?v=tGmnZdY5Y-E).

Raspberry Pi 4 is a nice mini computer to benchmark code on ARM system for less than 100$.

Hardware is the platform we are programming on, not software (it is one of the three big lies by Mike Acton, here: https://cellperformance.beyond3d.com/articles/2008/03/three-big-lies.html). Even building a game with Unity requires you to change some specific code to work on other platforms (like consoles having their own abstraction of IO). With the rise of operating systems, we sort of forgot that our code is running on hardware (mostly the CPU, but more and more the GPU). With consoles, game engine can be optimized for specific hardware, but on PC, things are a bit more complicated. Games put minimum requirements on PC to be able to us specific features (for example SSE instructions are now a hard requirement to play games). Compilers have a larger knowledge of the hardware and know how to order instructions to make them run faster, but in the end they can optimize data structures memory layout.

And this is it for the part 1! I probably missed your favorite topic in Games Programming, but don’t hesitate to create your own list (in your own domain as well) and share it.

Read More

Nordic Game Conference 2019

Next to Malmö train station

During a cold May month in Switzerland, we went north to get warmer and showcase Soup Raiders at the Nordic Game Conference. I took the whole week to be able to visit on the first day. Our airbnb was a bit outside the city, but it takes less than 30 minutes to go throw the entire city. It is also 30 minutes by train from Copenhagen airport.

On Tuesday, we had the chance to visit several game studios like Massive, Avalanche, King, Tarsier, and all the lovely people at the Game Habitat. We visited the Game Assembly school and damn I was jealous.

Each student has a desk with a computer and two screens.

They have 5 curriculum in Swedish:
-Games Programming
-Game Art
-Game Animator
-Level Designer
-Technical Artist

The programmers build their own engine and make every 3 months with the other sections a game. So by the end of the education they did 8 games, while working on their own engine in C++.

But of course I was not there to just visit studios and schools…

So the goal was to meet as much publishers as possible and so I participated to the Publisher Market where we had 5min to pitch our game. This was fast and there were moments where I was pitching my game three time in 15min.

I also had the chance to talk in a small panel talking about the evolution of game programming languages starting from Assembly to C++. I was not expecting to talk there, but they ask the public and so I talked about what we do at the SAE Institute.

Malmö is a wonderful city, very similar to Lausanne, but by the sea and close to Copenhagen airport, one of the biggest airport in Europe. Even if I am comfortable in Lausanne, it is hard not to want to move there and work for those awesome companies there.

Bye bye Malmö

Read More