I rather enjoy debugging. It’s just another type of puzzle, one of the many challenges of gamedev approached with logic and the tools at hand. It should be noted I actually don’t much like debugging if it involves a bunch of code I didn’t write, but this is why I almost entirely use my own tech, stuff that I built, am familiar with and… capable of understanding 😛
Most bugs die a swift death in Cogmind, since it’s built using a pretty simple architecture and literally the first thing I put together for the framework was its error detection and reporting system, always taking into account what could go wrong whenever anything new is added.
But one nightmare bug in particular has been in there for a very long time…
“Seeding” a game’s RNG allows it to produce the same numbers in the same sequence, and is therefore a useful feature in roguelikes, especially where map generation is concerned. I’ve already written an article on seeds and how they work in Cogmind, along with their many applications, so I won’t go into all that again.
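For anyone who hasn't played with seeds before, here is a minimal sketch of the idea. It is written in Python with the standard `random` module standing in for the game's RNG, so it is not Cogmind's actual code, and `roll_sequence` is a made-up helper for illustration only.

```python
import random

def roll_sequence(seed, count=5):
    """Return the first `count` rolls from a generator seeded with `seed`."""
    # A dedicated generator object, so nothing else in the program can
    # consume numbers from it behind our back.
    rng = random.Random(seed)
    return [rng.randint(1, 100) for _ in range(count)]

first_run = roll_sequence(12345)
second_run = roll_sequence(12345)

# Same seed, same numbers, in the same order.
print(first_run == second_run)  # True
```

The catch, and the subject of this whole post, is that the guarantee only holds if both runs ask for numbers the same number of times in the same order.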
This time I’m here to talk about a specific seed-related issue that popped up and how it was uncovered and resolved.
Around early 2017, occasional reports started popping up of seeded runs not always generating the same maps. Now obviously this isn’t right, because the same seed should always produce the same map, so clearly some player action before that point had managed to affect the generation, causing the seeded content to “diverge.”
Major mapgen phases in Cogmind.
Map generation can be divided into three main phases: layout, content, and player-affected content. It’s important to separate out all the latter stuff (C) so that it doesn’t affect the base map that everyone using the same seed should share (A/B), so I’m generally careful to do that, but obviously something had slipped in somewhere…
I say this bug was a “nightmare,” though honestly the effect on players was minimal since it rarely came into play and wasn’t a show-stopper or anything like that. It was a nightmare for me because I couldn’t easily track down something like this!
Nonetheless, this is a vital sort of bug to fix, because not only are fully reliable seeds important for built-in weekly seeds or other similar events (which are still something I’d like to do), but this bug had also already affected me several times before in other bug-solving efforts. Oftentimes the quickest way to reproduce a bug in order to properly resolve it is to generate a map using the same seed it was created from, especially when I get a random remote crash report which is nothing more than a stack trace and log containing the seed. More than once over the past couple of years I couldn’t take that easiest route, or even recreate certain bugs at all, since the seed results might not match what the player encountered!
So you can see why it was pretty important to fix this, and when kiedra suddenly brought it up and later offered relevant save files, I was happy to jump on it immediately, brushing aside my previously scheduled work for the day. (It’s best to do this sort of thing when the events are freshest in the player’s mind, in case I had any other questions.)
kiedra provided exactly what I needed: two save files, each from separate runs, both from the map before the one in which the divergence was observed. The fact that Beta 6 added multiple interval autosaves to Cogmind made collecting these saves (and others needed for debugging) much easier, but solving this issue in particular still required that someone be playing actual seeded runs, observing the differences, and be both able to save this data and willing to share it with me. Whew, finally got an, uh, convergence of all these variables 😉
Here are two screenshot excerpts demonstrating divergence on the same section of map:
Seed divergence demo. You can also see the full size maps here and here, opening them both and flipping between the two to see the total changes. Having added the new map output feature in Beta 7 was great for getting full-sized maps like those 🙂
You can see how the layout is identical, as are a couple machines and certain locations chosen for item placement, but other machine and item choices are actually different! Gotta find out where the changes started…
My first guess was that it had something to do with global plot-related values. This is what I’d been thinking all along since I didn’t hear about this issue until much of the story and events were complete. In any case, this was really quick to check since we had two saves, so I loaded up each and just compared the list of globals…
Files match. D’oh!
That didn’t pan out, so I moved to comparing the values coming out of the RNG at several major points in the mapgen process, since if any value at a given point was different from that same point in the other save, then the divergence must be occurring between that point and the previous non-diverging one. Basically, if there’s a divergence the RNG must be handing out at least one extra number in one save, and that would entirely throw off where all the subsequent numbers are applied, hence different results from that point onward.
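The checkpoint idea can be sketched in a few lines. This is an illustration under assumptions rather than what I actually ran: the `generate` function, its `extra_roll_at` parameter, and the one-roll-per-step "mapgen" are all hypothetical, and it compares the full generator state at each checkpoint where I simply looked at the latest RNG output.

```python
import random

def generate(seed, extra_roll_at=None, steps=8):
    """Hypothetical mapgen: one RNG roll per step, one checkpoint per step.

    `extra_roll_at` simulates a bug that consumes one extra number at
    that step in one of the two runs.
    """
    rng = random.Random(seed)
    checkpoints = []
    for step in range(steps):
        rng.random()              # the step's normal work
        if step == extra_roll_at:
            rng.random()          # the stray extra roll
        # Snapshot the generator state so two runs can be compared later.
        checkpoints.append(rng.getstate())
    return checkpoints

def first_divergence(run_a, run_b):
    """Index of the first checkpoint that differs, or None if all match."""
    for index, (state_a, state_b) in enumerate(zip(run_a, run_b)):
        if state_a != state_b:
            return index
    return None

clean = generate(seed=42)
buggy = generate(seed=42, extra_roll_at=5)
print(first_divergence(clean, buggy))  # 5
```

Every checkpoint before the reported index matches, so the culprit has to live between that checkpoint and the one before it.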
Even before that, based on just the screenshots I could pretty much narrow it down to placeRandomObjects(). “Narrow” is an overstatement though, because that’s also the bulk of the map content initialization process :P. Anyway, that’s where the number comparisons would start.
The first 500 lines of placeRandomObjects(), marking major intervals where the latest RNG output was checked. (I just tested them by setting breakpoints in the debugger and recording the numbers on paper, nothing fancy.)
At the first three points the RNG gave the same number, so we can be pretty confident that the content generated prior to those points was identical between saves. Then comes the fourth check, and we have a winner! The RNG in each save gave a different number there, so they must have diverged somewhere between the last two checks.
Here I got a little ahead of myself and ended up wasting some time because I was excited about finally getting this close and immediately made an assumption based on the general code in that section. I thought it had something to do with how in a few cases later map generation stages were allowed to modify spawning restrictions for object types, different from what was set in the original layout. Problem was, this assumption was not at all based on actual evidence, so the lesson here is to follow the evidence, not your imagination, especially when there’s already a direct route to finding the solution. Oops.
Fortunately I realized my error when I was taking a quick break (it’s good to “get away” from problem solving for a bit, since it might allow for new perspectives, although clearly this was still rolling around in my head while on “break” xD).
I came up with a few ideas for narrowing down the problem space, and while most would solve the problem quickly once implemented, they’d also take a while to build and end up costing more time than they were worth, so I decided to just keep up the straightforward manual search. I did still chop out huge unrelated chunks of the content generation so that the resulting maps would have fewer distractions and be easier to visually analyze, possibly leading to more clues.
To go along with that view, I got a list of every room in the order they were filled, and what type of general content they included:
General room comparison.
Getting closer! From the data above, it’s either an issue with Room 14 or 15. Room 15 has a different composition, but since composition is set first, it’s probably an issue with the room before it at (1,67) on the map…
Room 14 we have you now!
To confirm real quick I also visually checked the final output of several rooms listed above 14, and those were identical in both saves.
Seeing as the Terminal looks identical but there are different numbers and types of items, I decided to take a look at the items first. Stepping through the code line by line for that room I recorded a few values under the first save, then went to the second save, only to discover that the very first numbers it started with were already different, so it must’ve been before item placement even started in there!
As usual when solving rather involved bugs, I have pages of notes recording the entire process and data along the way, so here we have this 😛
Well there wasn’t much before the items… just the Terminal, so I eyed it suspiciously and had an epiphany: it must be something inside the Terminals.
Caught red-handed with different hacking options!
As soon as I saw different hacks I knew the answer (although it becomes extra obvious by looking at the point from which the hacks change), recalling that schematic hacks at Terminals would favor the player by usually re-rolling if the randomly chosen schematic happened to be one they already had.
This kind of gameplay-improving tweak is fine, but it needs to be done in the player-affected content segment of mapgen! Here I’d checked for and applied the changes immediately, forgetting that we’re in the middle of the base content assignment. So if the player happened to already have a schematic which the game attempted to put on any Terminal on the new floor, it would roll again for a new one, advancing the RNG state, and bam, everything after that point would be different.
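Here is a small sketch of that mistake, again in Python and not the real code. The schematic and item names, `CountingRng`, and `fill_room_buggy` are all invented for the example, and the reroll here always happens, whereas in the game it only usually does.

```python
import random

# Made-up content tables for illustration.
SCHEMATICS = ["Laser", "Armor", "Treads", "Sensor", "Reactor", "Drone"]
ITEMS = ["Matter", "Battery", "Cannon", "Hover Unit", "Scanner", "Plating"]

class CountingRng:
    """Thin wrapper that counts how many rolls have been requested."""
    def __init__(self, seed):
        self._rng = random.Random(seed)
        self.rolls = 0

    def choice(self, options):
        self.rolls += 1
        return self._rng.choice(options)

def fill_room_buggy(seed, owned_schematics):
    rng = CountingRng(seed)
    schematic = rng.choice(SCHEMATICS)      # Terminal hack
    if schematic in owned_schematics:
        # The bug: player data consulted in the middle of base content,
        # costing an extra roll for some players only.
        schematic = rng.choice(SCHEMATICS)
    items = [rng.choice(ITEMS) for _ in range(3)]   # placed after the Terminal
    return schematic, items, rng.rolls

seed = 2017
new_pick, new_items, new_rolls = fill_room_buggy(seed, owned_schematics=set())
# A second player who already owns whatever the Terminal would have offered.
vet_pick, vet_items, vet_rolls = fill_room_buggy(seed, owned_schematics={new_pick})

print(new_rolls, vet_rolls)  # 4 5
print(new_items, vet_items)  # item rolls now start from different RNG positions
```

Same seed, but the second player's items are drawn one roll later in the sequence, which is exactly the kind of offset that made Room 14's items differ.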
This also explains why the issue tends to appear more often in the late-game (more time to accumulate schematics) and only for some players (those using schematics as part of their play style, and running seeds so they might actually notice it).
For the sake of double confirmation I did check that kiedra had different schematics in each save, four more in the second than the first, and one of them happened to be what was chosen for this Terminal.
Based on this finding I knew there were some other related instances, and fixed all of them at once. The same behavior exists (to varying degrees) with part schematics, robot schematics, lore records, and preloaded Fabricator schematics. Of course the fix is to move all these player-relative content modifications to the final mapgen phase.
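In sketch form, the shape of the fix looks something like the following. The function names, the dictionary-based rooms, and the choice to keep using the same generator for the final phase are assumptions made for this example, not a description of the real implementation.

```python
import random

# Made-up content tables for illustration.
SCHEMATICS = ["Laser", "Armor", "Treads", "Sensor", "Reactor", "Drone"]
ITEMS = ["Matter", "Battery", "Cannon", "Hover Unit", "Scanner", "Plating"]

def generate_base(rng, room_count=4):
    """Stand-in for the shared phases: never looks at player data."""
    return [
        {
            "schematic": rng.choice(SCHEMATICS),
            "items": [rng.choice(ITEMS) for _ in range(3)],
        }
        for _ in range(room_count)
    ]

def apply_player_tweaks(rng, rooms, owned_schematics):
    """Stand-in for the final phase: runs only after base content is done."""
    for room in rooms:
        if room["schematic"] in owned_schematics:
            room["schematic"] = rng.choice(SCHEMATICS)

def generate_map(seed, owned_schematics):
    rng = random.Random(seed)
    rooms = generate_base(rng)
    # Copy of the shared content before any player-relative changes.
    base = [(room["schematic"], tuple(room["items"])) for room in rooms]
    apply_player_tweaks(rng, rooms, owned_schematics)
    return base, rooms

base_new, rooms_new = generate_map(7, owned_schematics=set())
base_vet, rooms_vet = generate_map(7, owned_schematics=set(SCHEMATICS))

print(base_new == base_vet)  # True
```

Rerolls can still hand different players different schematics, but they can no longer shift which numbers the base content receives.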
The final check was to run the saves under the new code, and compare both those results to a completely fresh debug run using the same seed (which just teleports to that map so nothing at all can interfere with it). Same results across the board 😀
And now seeds should be fully reliable once again!
It’s worth mentioning (mainly to head off the inevitable comments to this effect :P) that there are ways to prevent this kind of thing from happening in the first place. Like if there are clear rules that should be obeyed, as there are here, then be sure to encapsulate all player-relative data and keep it hidden/inaccessible from the mapgen process until it’s allowed.
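One way that kind of encapsulation might look, as a rough sketch only: `MapgenContext`, `PlayerDataLocked`, and `unlock_player_data` are hypothetical names, and nothing like this is claimed to exist in Cogmind.

```python
class PlayerDataLocked(RuntimeError):
    """Raised when base mapgen tries to read player-relative data."""

class MapgenContext:
    def __init__(self, owned_schematics):
        self._owned = frozenset(owned_schematics)
        self._unlocked = False

    def unlock_player_data(self):
        # Called once, right before the player-affected phase begins.
        self._unlocked = True

    @property
    def owned_schematics(self):
        if not self._unlocked:
            raise PlayerDataLocked("player data read during base mapgen")
        return self._owned

context = MapgenContext({"Laser"})

try:
    context.owned_schematics        # too early: still in base content
    caught_early_access = False
except PlayerDataLocked:
    caught_early_access = True

context.unlock_player_data()
print(caught_early_access, "Laser" in context.owned_schematics)  # True True
```

With a guard like this, the schematic reroll would have failed loudly the first time it ran in the wrong phase instead of quietly shifting the RNG.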
In any case, this made for an exciting debugging adventure 😉