On Friday we pushed the code we’ve been working on for the past couple of months to the PTS, and we encountered the minor issue of most players not being able to log in at all! Some breakage was highly likely given that we’ve almost completely rewritten the entire login/logout system. However, we didn’t expect the issues to be this drastic, as we hadn’t seen anything like this during our QA process.

Initial observations indicate that logins work fine at first but then abruptly stop working completely. We can see this in the log history, where green bars are successful logins and red bars are failures:

We suspect what’s happening is a concurrency issue (such as a deadlock) that only occurs when enough players log in at the same time. We need to be sure of the diagnosis, so before we push a new build to the PTS we’re adding as many diagnostic tools as we can think of:

  • Profiling the number of messages sent between various systems. We’re seeing more messages in the queue than we would expect, particularly on the PTS. We don’t think this is causing the login problem, but it is definitely an issue.
  • Initial profiling suggests that message sending rates aren’t excessive, which indicates that we’re not processing messages fast enough, so the queues grow too big. To find out whether this is the case, we’ve added profiling to measure how long each message takes to process.
  • Adding a system heartbeat to determine the last line of code executed before everything freezes up.
  • Improving logging so that failures appear as errors, not info.
  • Adding a debug command to spam login attempts.
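
As a rough illustration of the per-message timing idea in the list above, here is a minimal Python sketch. It is not the actual server code: the `MessageProfiler` class, the message types, and the handlers are all hypothetical names made up for the example. It only shows the general shape of timing each handler call while draining a queue.

```python
import time
from collections import defaultdict, deque


class MessageProfiler:
    """Records how long each message type takes to process (hypothetical helper)."""

    def __init__(self, clock=time.perf_counter):
        self._clock = clock
        # message type -> list of per-message processing times in seconds
        self.durations = defaultdict(list)

    def process(self, queue, handlers):
        """Drain `queue`, timing every handler call."""
        while queue:
            msg_type, payload = queue.popleft()
            start = self._clock()
            handlers[msg_type](payload)
            self.durations[msg_type].append(self._clock() - start)

    def report(self):
        """Return {message type: (count, total seconds, slowest seconds)}."""
        return {
            msg_type: (len(times), sum(times), max(times))
            for msg_type, times in self.durations.items()
        }


# Made-up message types and handlers, for illustration only.
queue = deque([("login", "alice"), ("logout", "bob"), ("login", "carol")])
seen = []
handlers = {"login": seen.append, "logout": seen.append}

profiler = MessageProfiler()
profiler.process(queue, handlers)
stats = profiler.report()
```

A report like this would make it easier to spot a message type whose slowest or total processing time is far larger than the rest, which is the kind of evidence needed to tell slow processing apart from excessive sending.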

We’re hoping to get new builds to the PTS with various changes soon. It’s unlikely we’ll have a potential fix for the login issue unless the debug command to spam logins lets us reproduce it, so we’re going to need your help to break the server once again.
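
For the curious, a login-spam command of this kind could look roughly like the Python sketch below. Everything in it is an assumption made for illustration: `spam_logins`, `fake_login`, and the `debug-player-N` names are hypothetical and are not taken from the real server. The idea is simply to fire many login attempts from several threads at once and tally the outcomes.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def spam_logins(attempt_login, attempts, workers):
    """Run `attempts` logins across `workers` threads and count the outcomes."""

    def one_attempt(i):
        try:
            # `attempt_login` is a stand-in callable returning True on success.
            return "ok" if attempt_login(f"debug-player-{i}") else "failed"
        except Exception:
            return "error"

    results = Counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for outcome in pool.map(one_attempt, range(attempts)):
            results[outcome] += 1
    return results


def fake_login(name):
    """Hypothetical login that rejects every player whose name ends in 0."""
    return not name.endswith("0")


results = spam_logins(fake_login, attempts=100, workers=16)
```

Note that this sketch only counts successes, failures, and exceptions. A real deadlock would make a login hang rather than fail, so an actual tool would also need timeouts on each attempt.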

If you’d like minute-to-minute updates on the status of the PTS, the best place to look is our official Discord server: https://discord.gg/worldsadrift

Fly safe (on the main servers…)