DeepDame: Two Semesters, Two Halves of a System

DeepDame is a real-time multiplayer checkers game. We built it across two semesters, five of us, at FST Settat. The repo is public. The APK works.

It’s also one of my favorite projects. I’ve been trying to figure out why, and I think it comes down to this: the idea started with me, and I stayed close to the work at every stage. The game backend in S3. Leading the frontend in S4. The infrastructure that’s still running it on a desktop in my room.

This is a long post. I’m not going to summarize what we built. The README and Ilyass’s blog post on the backend infrastructure do that. What I want to write down is what the project felt like. What I worked on. What I learned about leading something I didn’t have time to plan.

The Idea

We were in S3 of our program, starting a project for the JEE module. The brief was open: build a backend. Spring, Java, the usual stack.

The team did what teams do. Everyone brought an idea, we voted on a few, the conversation kept going from there. I came in with multiplayer checkers. The team picked it up and it grew from there into what DeepDame became: real-time multiplayer, an AI opponent, in-game chat, a social layer with friends and notifications.

The thing I cared about, more than the specific idea, was not building another CRUD app. Every group project tends to drift toward the same shape: a form here, a database table there, a list view, a detail view. It’s safe, it’s gradeable, and it teaches you almost nothing you didn’t already know. We had a semester. I wanted to spend it on something where we’d actually hit real problems.

A game is full of real problems. State that needs to be synchronized across players in real time. Rules that have to be enforced server-side so nobody cheats. An AI opponent that has to play well enough to not be embarrassing. A chat layer that has to deliver messages live, not on refresh. These are things you don’t build by accident, and they don’t get solved by a tutorial.

We also knew something the brief didn’t account for: the next semester had a mobile development module with Flutter. If we built the backend in S3 and the frontend in S4, we’d have a full real product across both modules instead of two disconnected toy projects. That decision shaped everything that came after. S3 became “build the backend well enough that a mobile client can sit on top of it.” S4 became “build the client that shows the backend was worth building.”

I don’t remember exactly who proposed the two-semester plan. Probably it came up in one of the early conversations and nobody objected. What I remember is that once it was on the table, it felt obvious. It also meant the work I was about to do in S3 wasn’t a throwaway exercise. Every architectural decision was a decision that future-us would be living with.

S3: The Backend

I owned the game side of the backend. Game engine, game services, real-time WebSocket layer, AI opponent, game chat, matchmaking, statistics. The rest of the backend (auth, friends, notifications, infrastructure, admin) was other people’s work. This section is about my half.

Two early decisions shaped the rest.

The first was making the game engine stateless and framework-free. I put it in its own Maven module, game-engine, separate from the Spring application. The game-engine module knows about boards, pieces, moves, and rules. The DeepDame application module knows about HTTP, WebSockets, persistence, and security. The first module compiles without Spring on the classpath at all. I wanted to work on the engine in parallel with everyone else without stepping on the rest of the codebase, and I wanted the rules to be testable without spinning up a backend. The engine was probably the calmest part of the codebase to write because it was just logic. Inputs in, outputs out, no surprises.

We didn’t use all of Maven’s multi-module features. To be honest, we didn’t know them. But the basic separation gave us what we needed. The engine compiled on its own, the main app pulled it in as a dependency, and the rules lived in one place.

The engine had test coverage. Move correctness, capture chains, king promotion, win conditions. Tests for a game engine are easier to write than tests for most things because the engine is pure: given a board state and a move, did the engine produce the right next state? Yes or no. No mocks needed.
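
To make “pure” concrete, here’s a minimal sketch of the shape. It isn’t the real game-engine code: Board, Move, and Engine.apply are hypothetical names, and the rules are cut down to a plain diagonal step and a single jump, with direction and bounds checks left out.

```java
import java.util.Arrays;

// Hypothetical, simplified sketch: not the real game-engine module.
// Cells: '.' empty, 'w' white man, 'b' black man.
final class Board {
    private final char[] cells; // 8x8, row-major

    Board(char[] cells) {
        if (cells.length != 64) throw new IllegalArgumentException("need 64 cells");
        this.cells = cells.clone(); // defensive copy keeps the board immutable
    }

    static Board empty() {
        char[] c = new char[64];
        Arrays.fill(c, '.');
        return new Board(c);
    }

    char at(int row, int col) { return cells[row * 8 + col]; }

    // Returns a new board; the receiver is never modified.
    Board with(int row, int col, char piece) {
        char[] copy = cells.clone();
        copy[row * 8 + col] = piece;
        return new Board(copy);
    }
}

record Move(int fromRow, int fromCol, int toRow, int toCol) {}

final class Engine {
    // Pure function: same board and move in, same result out. No I/O, no framework.
    // Direction rules, kings, and bounds checks are omitted from this sketch.
    static Board apply(Board board, Move m) {
        char piece = board.at(m.fromRow(), m.fromCol());
        if (piece == '.') throw new IllegalArgumentException("no piece on source square");
        if (board.at(m.toRow(), m.toCol()) != '.') throw new IllegalArgumentException("target occupied");

        int dr = m.toRow() - m.fromRow();
        int dc = m.toCol() - m.fromCol();
        Board next = board.with(m.fromRow(), m.fromCol(), '.').with(m.toRow(), m.toCol(), piece);

        if (Math.abs(dr) == 1 && Math.abs(dc) == 1) return next; // plain diagonal step
        if (Math.abs(dr) == 2 && Math.abs(dc) == 2) {            // single jump
            int midRow = m.fromRow() + dr / 2;
            int midCol = m.fromCol() + dc / 2;
            char jumped = board.at(midRow, midCol);
            if (jumped == '.' || jumped == piece) throw new IllegalArgumentException("nothing to capture");
            return next.with(midRow, midCol, '.');
        }
        throw new IllegalArgumentException("not a diagonal step or jump");
    }
}
```

A test against something like this is just “build a board, apply a move, compare squares.” Nothing to mock, nothing to start up.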

The second early decision was the persistence strategy. I chose MongoDB for game data, which surprised some people on the team, but I had reasons. Games have a flexible shape. The move history is variable length, the state changes over time, and I wasn’t sure on day one what every game entity would need to carry. A relational schema would have been a guess about that shape, and changing a relational schema mid-project is painful. MongoDB let me change the shape without paying for migrations. The other reason was write throughput. I was anticipating a lot of state updates per game, and Mongo handles that better than Postgres does for what we were doing.

Then Redis came in later, and it changed how I thought about the game lifecycle entirely.

The original plan was simpler: save state to Mongo, edit it from Mongo. That works for a single-player app. It doesn’t work when you have two players sending moves over WebSockets and you want sub-100ms responses. So I split the storage. During a game, the state lives in Redis in two pieces: a Game object that holds the full mutable state and gets updated on every move, and a GameStaticInfo object that holds the immutable header data (players, game ID, start time). Splitting the immutable part from the mutable part meant I could reference the static info without serializing the whole game state every time.

Every move was an atomic update on the Redis state. With two players sending moves in real time, you can’t have one move clobbering another by reading-then-writing. Atomic ops or you have a bug waiting to happen.
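
The read-then-write race is easy to show without Redis at all. The sketch below uses a ConcurrentHashMap as an in-memory stand-in for the live-game store. That’s an assumption for illustration only; against Redis the same guarantee would come from its own primitives, such as a WATCH/MULTI/EXEC transaction or a server-side script. GameState and LiveGames are hypothetical names, not classes from the repo.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.UnaryOperator;

// Hypothetical sketch: an in-memory stand-in for the live-game store.
record GameState(int moveCount, List<String> moves) {
    GameState withMove(String move) {
        List<String> next = new ArrayList<>(moves);
        next.add(move);
        return new GameState(moveCount + 1, List.copyOf(next));
    }
}

final class LiveGames {
    private final ConcurrentHashMap<String, GameState> store = new ConcurrentHashMap<>();

    void start(String gameId) { store.put(gameId, new GameState(0, List.of())); }

    // Atomic read-modify-write: compute() applies the change as one atomic step
    // per key, so two concurrent moves cannot both start from the same old state.
    GameState update(String gameId, UnaryOperator<GameState> change) {
        return store.compute(gameId, (id, current) -> {
            if (current == null) throw new IllegalStateException("no live game " + id);
            return change.apply(current);
        });
    }

    GameState get(String gameId) { return store.get(gameId); }
}
```

The non-atomic version would be get(), modify, put(). With two players moving at once, both can read the same state and the second put() silently erases the first move.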

The full game state never touches the database until the game ends. When the game finishes (win, loss, or surrender), the final state is written to MongoDB as a single document, the player statistics get recomputed and saved, and Redis cleans itself up.

This pattern is what made the latency I cared about possible. I won’t claim I knew this would be the right design on day one. I tried the simpler version first and watched the response times. Redis fixed the response times.

The statistics piece is worth calling out because it’s the kind of small decision that disproportionately matters. The naive approach to stats is to compute them on demand: when a player opens their profile, query all their games, count wins, count losses, compute head-to-head against other players, return the result. That works for a small number of games. It does not work at scale, and more importantly, it doesn’t feel instant.

Instead, statistics get aggregated once, at game end, and stored as denormalized documents. Profile screen loads pull a single document instead of running an aggregation. Head-to-head records are precomputed. Win rate is precomputed. The cost is a few extra writes when a game ends; the benefit is that every read is constant-time. The user opens their profile and the data is just there.
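
As a rough sketch of the trade, assuming a plain in-memory map in place of the MongoDB documents (PlayerStats and StatsStore are hypothetical names, and the fields are a guess at the minimum a profile needs):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of write-time aggregation; the real documents live in MongoDB.
record PlayerStats(int wins, int losses, Map<String, Integer> winsAgainst) {
    static PlayerStats empty() { return new PlayerStats(0, 0, Map.of()); }

    double winRate() {
        int played = wins + losses;
        return played == 0 ? 0.0 : (double) wins / played;
    }

    PlayerStats recordWin(String opponent) {
        Map<String, Integer> headToHead = new HashMap<>(winsAgainst);
        headToHead.merge(opponent, 1, Integer::sum);
        return new PlayerStats(wins + 1, losses, Map.copyOf(headToHead));
    }

    PlayerStats recordLoss() { return new PlayerStats(wins, losses + 1, winsAgainst); }
}

final class StatsStore {
    private final Map<String, PlayerStats> byPlayer = new ConcurrentHashMap<>();

    // Called once, when a game ends: two small writes.
    void onGameEnd(String winner, String loser) {
        byPlayer.compute(winner, (id, s) -> (s == null ? PlayerStats.empty() : s).recordWin(loser));
        byPlayer.compute(loser, (id, s) -> (s == null ? PlayerStats.empty() : s).recordLoss());
    }

    // Called on every profile load: one lookup, no scan over past games.
    PlayerStats profile(String player) {
        return byPlayer.getOrDefault(player, PlayerStats.empty());
    }
}
```

The work moves from profile(), which runs constantly, to onGameEnd(), which runs once per game.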

The in-game chat was the same kind of thinking. Messages live in a Redis list with a 24-hour TTL. When a game ends, the chat is gone. Players talk during the game, and that conversation doesn’t follow them into the rest of the app. Chat is part of the game’s lifecycle, not part of the player’s permanent history.
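
The lifecycle is small enough to sketch in memory. This is a stand-in for the Redis list, not the real code: GameChat is a hypothetical name, the clock is injected so time can be moved in a test, and setting the expiry once on the first message is my assumption for the sketch.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.LongSupplier;

// Hypothetical in-memory stand-in for a Redis list with a TTL.
final class GameChat {
    static final long TTL_MILLIS = 24L * 60 * 60 * 1000; // 24 hours

    private final LongSupplier clock; // injected so tests can move time forward
    private final Map<String, List<String>> messages = new HashMap<>();
    private final Map<String, Long> expiresAt = new HashMap<>();

    GameChat(LongSupplier clock) { this.clock = clock; }

    synchronized void post(String gameId, String message) {
        dropIfExpired(gameId);
        messages.computeIfAbsent(gameId, id -> new ArrayList<>()).add(message);
        // Assumption: the expiry is set once, when the first message creates the list.
        expiresAt.putIfAbsent(gameId, clock.getAsLong() + TTL_MILLIS);
    }

    synchronized List<String> history(String gameId) {
        dropIfExpired(gameId);
        return List.copyOf(messages.getOrDefault(gameId, List.of()));
    }

    // Game over: the conversation goes with it.
    synchronized void endGame(String gameId) {
        messages.remove(gameId);
        expiresAt.remove(gameId);
    }

    private void dropIfExpired(String gameId) {
        Long deadline = expiresAt.get(gameId);
        if (deadline != null && clock.getAsLong() >= deadline) endGame(gameId);
    }
}
```

The TTL is the safety net for games that never reach a clean end; endGame() is the normal path.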

The AI was the other piece I worked on. I wanted three difficulty levels: easy, medium, hard. Easy is trivial: pick a random legal move from the engine’s MoveValidator. Medium and hard delegate to an LLM with a structured prompt describing the board state.

The interesting decision was the fallback. LLMs are slow and unreliable. They time out, they return invalid moves, they hallucinate pieces that aren’t on the board. I built the AI orchestrator so that if the LLM fails for any reason (timeout, malformed output, illegal move suggested) it falls back to a random legal move. The game continues. The player might lose a small amount of difficulty for that turn, but the game never freezes and never crashes. This is the kind of defensive design I think about a lot now. An AI feature that breaks the rest of the app when it fails isn’t a feature, it’s a liability.
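
Here’s a minimal sketch of that fallback, under some assumptions: moves are plain strings, the model call is any Supplier, and AiMoveChooser is a hypothetical name rather than the orchestrator class in the repo.

```java
import java.util.List;
import java.util.Random;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

// Hypothetical sketch of the fallback idea; names are not from the DeepDame codebase.
final class AiMoveChooser {
    private final Random random;
    private final long timeoutMillis;

    AiMoveChooser(Random random, long timeoutMillis) {
        this.random = random;
        this.timeoutMillis = timeoutMillis;
    }

    // legalMoves comes from the engine; llmCall stands in for the model request.
    String choose(List<String> legalMoves, Supplier<String> llmCall) {
        if (legalMoves.isEmpty()) throw new IllegalArgumentException("no legal moves");
        try {
            String suggested = CompletableFuture.supplyAsync(llmCall)
                    .get(timeoutMillis, TimeUnit.MILLISECONDS);
            // Trust nothing the model says: only accept a move the engine allows.
            if (suggested != null && legalMoves.contains(suggested.trim())) {
                return suggested.trim();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt, then fall back
        } catch (Exception e) {
            // Timeout, provider error, malformed output: all handled the same way.
        }
        // Fallback: a random legal move, so the turn always completes.
        return legalMoves.get(random.nextInt(legalMoves.size()));
    }
}
```

The important property is that choose() always returns a move the engine considers legal, whatever the model did. One thing the sketch doesn’t do is cancel the model request after a timeout, so a slow call keeps running in the background.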

The provider was pluggable. Ollama running llama3.2 was the default, but I built it so you could swap in Gemini or HuggingFace by changing a config value. Spring Profiles let us switch between local Ollama and cloud providers without touching code.

The real-time layer was the other thing I built. GameSocketController is the STOMP entry point for everything game-related: moves, join events, surrender, chat messages. When a player makes a move on their client, the message hits this controller, the server validates it against the game engine, updates Redis, and broadcasts the result to both players. The whole round-trip is over WebSocket. No polling, no refresh.

The bug that made me lose sleep on this layer was a routing one. Spring’s convertAndSendToUser lets you send a message to a specific user, but you have to give it a key. The key has to match whatever key the user is subscribed under. I had paths that used the player’s username as the key and other paths that used the player’s email. Most of the time these matched, because most accounts had matching values. But edge cases (matchmaking specifically) used a different code path, and the keys diverged. Symptom: a player joins a game, the server “sends” them the join event, but the client never receives it. The game looks frozen on one side. The logs say the send succeeded.

I spent longer than I want to admit chasing that. It looked like a WebSocket problem. It looked like a subscription problem. It was an identity problem. Two different paths in the codebase used two different keys for the same user, and Spring did exactly what I told it to: sent a message to a user who, from its perspective, didn’t exist.

The fix was three lines. Standardize on one key, audit every convertAndSendToUser call to use the same one. The lesson was more than three lines. Pick one canonical identifier for a user early in the project and route every identity-keyed operation through it. If you’re using two, you have a bug, you just haven’t found it yet.
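
The failure mode is easier to see in a toy model than in the real stack. This sketch is not Spring’s implementation: ToyBroker is a hypothetical stand-in for user-destination routing, UserKeys is a hypothetical helper, and choosing the username as the canonical key is an assumption for the example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: a toy stand-in for user-destination routing.
record Player(String username, String email) {}

final class UserKeys {
    private UserKeys() {}

    // The single place that decides how a player is addressed.
    // Assumption for illustration: the username is the canonical key.
    static String of(Player p) { return p.username(); }
}

final class ToyBroker {
    private final Map<String, List<String>> inboxes = new HashMap<>();

    void subscribe(String userKey) { inboxes.putIfAbsent(userKey, new ArrayList<>()); }

    // Mirrors the symptom: sending to an unknown key is not an error,
    // the message just reaches nobody.
    boolean sendToUser(String userKey, String payload) {
        List<String> inbox = inboxes.get(userKey);
        if (inbox == null) return false;
        inbox.add(payload);
        return true;
    }

    List<String> received(String userKey) {
        return List.copyOf(inboxes.getOrDefault(userKey, List.of()));
    }
}
```

Subscribe under UserKeys.of(player), send to player.email(), and nothing arrives and nothing throws. Route every send through UserKeys.of and the two sides can’t drift apart.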

There’s one more thing I want to mention because it became a small obsession: latency. Once Redis was in place, the bugs were fixed, and the game flow was tight, I started measuring response times for moves. The numbers got down low enough that I could see the floor. The remaining latency wasn’t in the game code anymore. It was in the JWT auth filter that runs on every request. Not because there was a bug in the auth code, just because that’s how the protocol is designed: validate the token, decode the claims, set the security context, run the handler. The auth layer became the floor I couldn’t optimize past without changing the security model, which I wasn’t going to do.

I’m quietly proud of that. Most projects never optimize enough to see the floor of the protocol they’re built on. We did.

I stress-tested the backend on my own k3s deployment running on the Proxmox server in my room. I wrote the k6 scenarios myself: simulated players joining lobbies, making moves, sending chat, surrendering. The setup got to 6,000 concurrent WebSocket sessions before saturating. Ilyass later ran similar tests on EKS as part of his infrastructure work.

By the end of S3, the backend had everything a multiplayer checkers game needs: a real game engine, real-time WebSocket play through STOMP, an AI opponent with graceful fallback, in-game chat, a matchmaking queue, open lobbies, and a statistics layer that’s instant to read. What we didn’t have was anyone using it. The Postman tests were green. The integration tests passed. The system scaled. But it was a backend without a face. That’s what S4 was for.

Between S3 and S4

The semester ended. We submitted the backend, presented it, got the grade. There was a gap before S4 started, which felt like rest at the time and now in retrospect feels like the only rest I got that year.

When S4 began, the question was who would lead the frontend.

The team had no experience with Flutter. None of us. On the previous projects I’d led, at least one or two people on the team had used the relevant tech before. This time the whole team was starting from zero on the framework, on the tooling, on the patterns. Whoever led it would have to learn it themselves before they could lead anyone else through it.

I took the lead. The conversation that landed me there was short, with Ilyass, and I won’t recount it in detail. What I can say is that nobody else on the team was in a position to take it, and I knew that walking in. So did everyone else. The project happening at all depended on someone stepping into that role.

I want to name what that role was, because I don’t think the title “frontend lead” captures it. I wasn’t going to be a senior contributor on a frontend team with peers. I was going to be the person who learned Flutter first, then taught the team, then made every architectural decision, then unblocked everyone, then debugged everyone’s work, then shipped the polish. The team was capable but new. The framework was new. The deadline was four weeks. There wasn’t time for everyone to learn at the same pace and converge on a shared style. Someone had to set the style on day one, and everyone else had to build inside it.

The first decision I made, the same day I took the lead, was to archive the existing frontend.

We had an S3 frontend. While the rest of us were on the backend in S3, a teammate had worked on a prototype frontend in parallel. It was real work and real time put into it, but it wasn’t going to be the right starting point for what we needed to do in S4. None of us knew Flutter well enough at that point to have built a foundation that would carry a full real-time multiplayer client. Building on top of it would have meant inheriting decisions made under uncertainty, and we needed to start from decisions made with the full picture in mind.

So I made a new branch from main, archived the old frontend there, set the branch to read-only, and started fresh on main. The archive wasn’t theater. The work existed, somebody had done it, and even if we weren’t going to build on it, the history deserved to be preserved. Deleting it would have felt like erasing the fact that we’d tried something and it didn’t fit. Archiving it said “this happened, it just isn’t where we’re going.”

I told the team what I was doing and why. Nobody objected. We started over.

The thing I want to be honest about is the conditions we were working in. Four weeks sounds like a real timeline. It wasn’t, not in the way “four weeks of focused work” would have been. We had classes from 8:30 to 12:30 and 14:00 to 18:00, Monday through Friday. That’s eight hours a day inside a classroom before any project work could happen. We had other modules running in parallel, each with its own deliverables. Some of the team was dealing with family stuff. Some had health stuff. We came into S4 already tired from S3, and S3 had been hard.

So when I say four weeks, what I mean is: four weeks of whatever we could scrape from the edges of everything else.

That’s the project that started in the third week of May. Five people, none of us knowing Flutter, building a real-time multiplayer mobile client on top of a backend that had taken a whole semester to write. We were going to ship something. I didn’t know yet what.

S4: Building the Client

The first week was structure.

I spent the first few days reading. Not coding, reading. Riverpod docs, GoRouter docs, Freezed docs, looking at example apps with shapes I might want to copy, watching talks. I didn’t know Flutter. I had to know it well enough to make architectural decisions before anyone else could write meaningful code.

Then I built the skeleton. Feature-folder layout under lib/features/, with each feature getting its own data, presentation, providers, and widgets subdirectories. lib/core/ for shared infrastructure: networking, routing, storage, theme. Riverpod for state with codegen, Freezed for models, GoRouter for navigation, Dio for HTTP with a persistent cookie jar for the session, STOMP over SockJS for the WebSocket layer. The whole skeleton was committed to main before anyone else started writing features.

This is the decision I’m most glad I made. Setting the structure on day one meant nobody had to invent it for themselves. Hajar started on the home screen knowing exactly where the data layer lived, how providers were wired, what the file naming convention was. Amine started on the chat knowing the same. The work everyone did fit into the same shape because the shape existed before they touched the codebase.

If I’d skipped this and let everyone start writing features in parallel, we would have had four different style decisions per feature by the end of week one, and the rest of the project would have been spent reconciling them. Time spent on structure in week one was the highest-leverage time I spent in the project. The visible work is the features. The invisible work is the structure that lets the features be built quickly.

The other thing I did in week one was set up the backend deployment.

I have a desktop at home that I run Proxmox on. It’s a hobby setup, basically a small home lab where I run my own services (I wrote about the full setup here). For DeepDame S4, I needed somewhere the team could hit the backend from their own machines while they developed the client. Running the backend locally on each laptop was not going to work. Mobile development is heavy. Running Android Studio, the emulator, the Flutter toolchain, and a Spring Boot backend with three databases on the same laptop is the kind of thing that crashes a normal machine.

So I spun up a new VM on the Proxmox host, cloned the backend repo onto it, brought it up with Docker Compose, and put a Cloudflare tunnel in front to expose it on a subdomain of my personal domain. From any device anywhere, the team’s app talked to my server. That setup ran for the entire duration of S4.

What I didn’t plan was that this dev setup would also become the production deployment. We never moved off it. The same Cloudflare tunnel that the team hit during development is the same one users hit if they download the APK and run it. The professor’s demo phone hit it during the presentation. It’s still running today.

This is funny to me in retrospect. The whole S3 backend was designed for horizontal scaling on Kubernetes, with Redis pub/sub for cross-instance synchronization and an LGTM observability stack. S4 production turned out to be a single VM on a desktop in my room behind a Cloudflare tunnel. It works.

By week two, the team was building features. Hajar took the home screen, the friends layout, the profile and stats. Amine picked up the general chat. Yasmine did the settings screen and the banned screen with Ilyass. Ilyass handled the auth flow, the FCM push notification integration, and the friend request flow on the client side. I took the game, the in-game chat, the matchmaking flow, the theme system, the sound, and everything that didn’t fit cleanly under someone else’s domain.

The way I led the team during this period was different from how I’d led before. On the previous projects I’d led, I could review someone’s work and give pointed feedback because I knew the framework better than they did and could see where they’d gone wrong. This time we were all learning at the same time. My feedback couldn’t be authoritative because I wasn’t an authority. So I shifted from “you should do it this way” to “let’s both look at this together and figure it out.” That worked better than I expected. It also meant I was spending real time on every PR, not just rubber-stamping, because I needed to learn the pattern from reading the code in front of me before I could comment on it.

I was probably too lenient with the team during this period. I understood everyone’s other commitments because I was carrying the same load, so I gave a lot of slack on deadlines. Some of that slack turned into work I absorbed myself. When something needed to happen and the person owning it wasn’t going to get there in time, I’d just do it. That kept the project moving but it also meant I ended up doing more than was sustainable.

Week three was the slowest. Real-time multiplayer is the kind of feature that doesn’t work, doesn’t work, doesn’t work, and then suddenly works all at once. We had the STOMP subscriptions in place, the game state syncing, the moves being sent, but every test session would find a new edge case. Reconnects didn’t survive network drops. The chat sheet would lose messages when reopened. The opponent’s avatar wouldn’t load. Every fix found three more bugs. By the end of week three I genuinely didn’t know if we’d ship.

Week four was when it turned. I don’t know how to explain this except to say that real-time projects have a moment when all the pieces stop fighting each other. The reconnect logic started working consistently. The chat lifecycle stopped dropping messages. The animations stopped freezing the board. Each fix was small. The cumulative effect was that I could pick up my phone, start a game, and the game just worked. That’s the moment I knew we were going to ship.

I did most of week four’s work myself. Polish, sound effects, the in-game chat I built end-to-end after a teammate’s attempt didn’t come together, the theming pass, the onboarding flow, a final bug-fixing sweep. By the end of the week, the app on my phone was the app we demoed to the professor.

Then came presentation morning.

I left home and commuted to school. Somewhere in that hour, the power at my house went out. I didn’t know yet. I got to school, opened my laptop, and the backend was unreachable. Cloudflare tunnel down. Server unreachable. The app we were about to demo to the professor was talking to nothing.

The presentation was hours away and the gap was closing. I checked with home. The power was already back, the desktop just hadn’t been turned on after the outage. I asked someone there to power it on. The whole recovery took maybe ten minutes from my side, plus however long the machine took to boot.

The setup paid off because it was set up to pay off. And not in the easy sense, like “I checked the autostart box on a Docker container.” The recovery worked because every layer of the stack underneath was set up to recover. The Proxmox host comes back on power-on. The local DNS server on the host starts before anything else, because nothing else can resolve names without it. The Tailscale subnet route comes up. The backend VM starts. Inside the VM, Docker Compose brings up the databases in the right order, then the backend service that depends on them. The Cloudflare tunnel reconnects to a backend that’s now reachable. Each layer assumes the layer below it is healthy. If any of them had been misconfigured, the chain would have broken somewhere and a person would have had to log in and fix it manually. There was no person available to log in and fix it manually. There was me, in a classroom, on a laptop, with hours but not unlimited time.

The accumulated discipline of “build the home lab properly even when it’s a hobby” turned out to be the difference between a presentation and a catastrophe. I’d spent months getting the service ordering right. Months figuring out which services needed to come up before which others. Months learning what happens when DNS isn’t ready and a service tries to resolve a hostname anyway. None of that work was for DeepDame. It was for me, for the home lab, because I wanted it to be reliable for its own sake. The morning of the presentation, it paid for itself.

What I’d Do Differently

When I read other engineers’ retrospectives, the “what I’d do differently” section is usually the most thoughtful part of the post. It’s also the part most likely to be wrong.

The honest answer for DeepDame is: probably not much, given the same constraints.

What would I plan more carefully? I don’t know. You can’t plan what you don’t know. On day one of S4, none of us had used Flutter. The plans I could have written that morning would have been based on guesses about a framework I’d never touched. Some of those guesses would have been right. Most would have been wrong. We would have ended up rewriting them anyway, and the time spent writing them would have been time not spent building.

The real answer to “what would I do differently” isn’t a plan. It’s that next time I have a head start. Next time I lead a mobile project, I already know Flutter. I know what Riverpod looks like at scale. I know how to think about real-time state on a mobile client. I know the shape of the work. The next project gets the benefit of this one, and that’s the actual return on the time we spent.

What I would push myself on is the leniency. I gave the team a lot of slack because I understood the load they were under. The reason I understood it was that I was under the same load. Some of that slack turned into work I absorbed myself, which kept the project moving but also meant I burned out faster than I should have. Next time I lead a team in the same kind of conditions, I want to find the line between “understanding the human” and “absorbing the work” more deliberately. It’s a real line and I don’t think I drew it well this time.

The other thing, which is less about leadership and more about myself, is that I worked alone too much. Week four, when most of the polish happened, was almost entirely solo. Some of that was unavoidable because by week four the team had other obligations that didn’t pause for our deadline. But some of it was me, defaulting to “I’ll just do it” because the path of least resistance was to ship myself rather than coordinate. That instinct shipped the project. It also means I have a habit I need to watch for. Leaders who absorb the work scale to one project. Leaders who delegate scale to many.

DeepDame was good for me because the idea started with me and I stayed close to the work at every stage. That’s also the limit of it. The next thing I want to learn is how to lead something where I’m not the person doing most of the building. That’s the kind of leadership the projects after this one are going to demand. DeepDame was the project where I led from the front. The next one needs to be a project where I lead from somewhere else.

The code is here if you want to look. The APK is in the releases. The backend setup is a screen on first launch, so you can point the app at your own server if you want to run the whole thing.

Thanks for reading.
