Good day, good readers!

You might have heard of Thoriq Satriya, one of our talented software engineers on the analytics team. However, you might not know that he worked on a lobby service that facilitates communication and matchmaking for an AAA online multiplayer game scheduled for release on PC and consoles.

So are you curious about his story of working on the project? Keep reading!

Hi!

My name is Thoriq Satriya. I would like to share my recent experience creating the Lobby service for an AAA online multiplayer, multi-platform game. The Lobby service is a microservice that enables players to connect with each other so that they can play in multiplayer mode. The architecture of Lobby is quite complex, mainly because it is designed to handle a large number of concurrent users. To handle that level of concurrency, Lobby spawns a large number of goroutines. There are goroutines that handle players, send and receive messages to and from players, send and receive messages between Lobby services, and collect the Lobby status to be sent to telemetry for monitoring purposes. Most of these goroutines need to communicate with each other. To achieve that, Lobby uses channels to communicate between goroutines.
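To make the goroutine-and-channel idea concrete, here is a minimal sketch, not the actual Lobby code. It assumes one goroutine per player and a single shared inbox channel; the names message, simulatePlayers, and collect are made up for this illustration.

```go
package main

import (
	"fmt"
	"sync"
)

// message is a hypothetical envelope passed between goroutines.
type message struct {
	from string
	body string
}

// collect reads from the inbox until it is closed and returns the
// messages in the order they arrived.
func collect(inbox <-chan message) []message {
	var got []message
	for m := range inbox {
		got = append(got, m)
	}
	return got
}

// simulatePlayers starts one goroutine per player name. Each goroutine
// sends a single message into a shared inbox channel.
func simulatePlayers(names []string) []message {
	inbox := make(chan message)
	var wg sync.WaitGroup
	for _, name := range names {
		wg.Add(1)
		go func(n string) {
			defer wg.Done()
			inbox <- message{from: n, body: "hello from " + n}
		}(name)
	}
	// Close the inbox once every player goroutine has sent its message,
	// so that collect knows when to stop.
	go func() {
		wg.Wait()
		close(inbox)
	}()
	return collect(inbox)
}

func main() {
	// Arrival order depends on scheduling, so it can differ between runs.
	for _, m := range simulatePlayers([]string{"alice", "bob", "carol"}) {
		fmt.Printf("%s: %s\n", m.from, m.body)
	}
}
```

The receiving side never touches the senders' state directly; everything it knows arrives through the channel.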

The Lobby service leverages WebSocket, a communication protocol that provides a full-duplex communication channel over a single TCP connection, to keep a persistent connection between client and server with lower overhead, which makes lightweight real-time data transfer possible. As a result, it can handle a really large number of concurrent users! Keeping the connection open is what makes the interaction between client and server efficient: messages can be passed back and forth without opening a new connection for each transfer. I spent quite some time working with WebSocket, and it was one of the most interesting parts of building the Lobby.

Besides writing the service, I did the load testing too. Since our goal is to serve as many people as possible concurrently, load testing is needed to verify the service’s performance under real-life load conditions. During load testing, the hardest problem I had to solve was a bug related to concurrency. I didn’t see the bug when I tested the Lobby with a small number of concurrent connections. It only showed up when the Lobby was tested against a large number of concurrent connections. Imagine there are millions of goroutines running and then a panic happens. The error logs from all goroutines get printed, which makes debugging very hard! I needed to turn off the rate limiter of the logging system so that no important part would be missing. After that, we ran the load test again, and when the error occurred, the log file was more than 1 GB. My team and I had to filter those logs to find the actual error message.
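Filtering a huge log like that can be sketched in a few lines of Go. This is only an illustration, not the tooling we actually used. It relies on the fact that the Go runtime starts its crash messages with "panic:" or "fatal error:"; the function names and the sample log text are invented for the example.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"strings"
)

// findPanicLines scans a log line by line and returns every line that
// starts with "panic:" or "fatal error:", the prefixes the Go runtime
// uses for its crash messages.
func findPanicLines(r io.Reader) ([]string, error) {
	var hits []string
	sc := bufio.NewScanner(r)
	// Allow lines of up to 1 MiB; the default scanner limit is 64 KiB.
	sc.Buffer(make([]byte, 0, 64*1024), 1024*1024)
	for sc.Scan() {
		line := sc.Text()
		if strings.HasPrefix(line, "panic:") || strings.HasPrefix(line, "fatal error:") {
			hits = append(hits, line)
		}
	}
	return hits, sc.Err()
}

// findPanicLinesInString is a small convenience wrapper for in-memory text.
func findPanicLinesInString(text string) ([]string, error) {
	return findPanicLines(strings.NewReader(text))
}

func main() {
	// Invented sample input, shaped like a Go crash dump.
	sample := "info: player connected\n" +
		"panic: example failure\n" +
		"\n" +
		"goroutine 42 [running]:\n" +
		"info: player disconnected\n"

	hits, err := findPanicLinesInString(sample)
	if err != nil {
		fmt.Println("scan error:", err)
		return
	}
	for _, h := range hits {
		fmt.Println(h)
	}
}
```

For a real log file, the same function can be given the result of os.Open instead of an in-memory string.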

I also found a couple of exciting challenges during the test setup (tuning). The first one was configuring the OS, which needed the most tuning, to make it capable of accepting a million simultaneous connections. Basically, the OS has settings that limit how many resources can be in use at a time. Proper OS tuning improves system performance by preventing error conditions that can degrade performance. I ran into the ‘open files’ limit while tuning the OS for the Lobby deployment. This limit states the number of files that can be open at a time. It translates directly into a limit on connections, because a connection socket is treated as a file. That was one tricky case for me to solve (I enjoyed solving it, though). Another challenge I experienced in the setup was performance degradation when the team moved the test environment from AWS C4 to C5 instances. My team and I could not fully utilize the resources of the C5 instance even though we used the same tuning configuration. So I assumed that there must be some other tuning configuration for C5 instances that we just needed to find.
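As a small illustration of the ‘open files’ limit, the sketch below reads the limit that applies to the current process. It assumes a Linux host and uses the standard syscall package; the helper name openFileLimit is made up for this example, and the numbers printed depend entirely on how the machine is configured.

```go
package main

import (
	"fmt"
	"syscall"
)

// openFileLimit returns the soft and hard "open files" limits
// (RLIMIT_NOFILE) of the current process. Assumes Linux.
func openFileLimit() (soft, hard uint64, err error) {
	var lim syscall.Rlimit
	if err = syscall.Getrlimit(syscall.RLIMIT_NOFILE, &lim); err != nil {
		return 0, 0, err
	}
	return uint64(lim.Cur), uint64(lim.Max), nil
}

func main() {
	soft, hard, err := openFileLimit()
	if err != nil {
		fmt.Println("could not read the limit:", err)
		return
	}
	// Every accepted connection uses one file descriptor, so the soft
	// limit caps how many sockets this process can hold open at once.
	fmt.Printf("open files: soft=%d hard=%d\n", soft, hard)
}
```

The soft limit is the one the process actually hits; it can be raised up to the hard limit, for example with `ulimit -n` in the shell that starts the service.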

I learned a lot of things doing this project, from how distributed systems work to how to handle concurrency. Here is some of the reasoning I learned: Lobby is a microservice that should be able to scale horizontally; to make that possible, Lobby services need to be able to communicate with each other; to do so, we share data between Lobby services using Redis (as an in-memory database). To meet the goal of withstanding high concurrency, I wrote the service in Golang (this was my first experience coding in Golang!). I consider the language to be very good at handling concurrency: I can use channels to synchronize goroutines without explicit locks and condition variables!
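Here is a minimal sketch of what synchronizing through channels looks like, again as an illustration and not the Lobby code. A single goroutine owns a counter, and every other goroutine changes it only by sending on a channel, so no mutex or condition variable is needed. The function name countEvents and the numbers are made up.

```go
package main

import "fmt"

// countEvents starts `workers` goroutines that each report
// `eventsPerWorker` events. Only the owner goroutine touches the
// total, so no lock is needed.
func countEvents(workers, eventsPerWorker int) int {
	deltas := make(chan int)        // workers send increments here
	finished := make(chan struct{}) // each worker signals completion here
	result := make(chan int)        // the owner reports the final total here

	// Owner goroutine: the only place where total is read or written.
	go func() {
		total := 0
		for d := range deltas {
			total += d
		}
		result <- total
	}()

	for w := 0; w < workers; w++ {
		go func() {
			for i := 0; i < eventsPerWorker; i++ {
				deltas <- 1
			}
			finished <- struct{}{}
		}()
	}

	// Wait for every worker, then close deltas so the owner can finish.
	for w := 0; w < workers; w++ {
		<-finished
	}
	close(deltas)
	return <-result
}

func main() {
	fmt.Println(countEvents(100, 50)) // prints 5000
}
```

Because the counter lives inside one goroutine, there is nothing to lock: the channel both carries the data and orders the updates.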

This experience enriched my knowledge because it made me learn a lot.

The AccelByte environment played a great role too. The support was great, and everyone is willing to help each other, even people from different teams!

Thoriq Satriya
Software Development Engineer at AccelByte Inc.


Isn’t it great how this project turned out to be such an enriching experience for Thoriq?

Got any questions? Or want a similarly rewarding experience working on a project at AccelByte Inc?

Stay tuned on our pages!