Tuesday, December 30, 2014

Microservice Architecture

Microactions, Parent Actions, Faction Actions, Family Actions

Microservices? Hah, just kidding. This isn't some web service SOA game, because screw micropayment games. We're here to talk about the amazing and fascinating world of actions: the state machines that get your units doing the things you told them to do. In Cultura, it happens in a hierarchical manner: faction, family, unit. The short version is that when you queue up a big faction-wide action, it doles out unit actions to individual units. Some automated actions, like finding food for the family, are family actions, which dole out actions to individual units within a family.

As it turns out, debugging this is super hard. A unit has a stack of actions to track what it should be doing; as each action finishes, it pops and the next action in the stack is worked on. Every action has a simple shared interface so that the unit's processTick can just call doAction on generic actions. Underneath, each action is a child class of UnitAction, which has its own various states. But big actions are complex. Something as simple as "harvest node" turns into a confusing mess of state transitions (what if my inventory is full? resource node is exhausted? I'm under attack? drop off location disappeared or is full?).
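To make the stack idea concrete, here is a minimal sketch of how such a loop could look. This is not Cultura's actual code: only UnitAction, processTick and doAction come from the description above, while WaitAction, ticksWorked and the bool return value are assumptions made up for the example.

```cpp
#include <memory>
#include <vector>

struct Unit;

// Shared interface: the unit only ever talks to actions through doAction.
struct UnitAction {
    virtual ~UnitAction() = default;
    // Does one tick of work; returns true once the action has finished (assumed convention).
    virtual bool doAction(Unit& unit) = 0;
};

struct Unit {
    std::vector<std::unique_ptr<UnitAction>> actions;  // back() is the top of the stack
    int ticksWorked = 0;                               // hypothetical counter for the example

    void processTick() {
        if (actions.empty()) return;            // nothing queued, unit is idle
        if (actions.back()->doAction(*this)) {  // generic call, no knowledge of the subclass
            actions.pop_back();                 // finished: the next action down takes over
        }
    }
};

// Hypothetical example action: finishes after a fixed number of ticks.
struct WaitAction : UnitAction {
    int remaining;
    explicit WaitAction(int ticks) : remaining(ticks) {}
    bool doAction(Unit& unit) override {
        ++unit.ticksWorked;
        return --remaining <= 0;
    }
};
```

Pushing a WaitAction(1) and then a WaitAction(2) and calling processTick three times should empty the stack, with the two-tick action on top finishing first.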

So, we go into refactoring code. The current round of refactoring involves devolving each action into "microactions". An action is now defined by three simple concepts: Exit Conditions, Stacking Conditions and the Action. A microaction is an action without Stacking Conditions.

Exit Conditions

In order to keep things simple, an action has clear exit conditions. By breaking things down to microactions, the exit conditions become very simple. Move action? If I'm there then exit. Build action? Apply enough labour and then exit. Drop off action? Drop off what I can and then exit. The larger actions usually wait until a microaction is finished for some of their exit conditions and merely track the number of microactions successfully completed.
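As a rough illustration, a microaction can be thought of as an exit condition paired with one tick of work. The sketch below is an assumption about shape only: UnitState, MicroAction, makeMove and makeBuild are hypothetical names, and movement is reduced to one dimension to keep it short.

```cpp
#include <functional>

// Hypothetical, stripped-down unit state for the example.
struct UnitState {
    int x = 0;
    int labourApplied = 0;
};

struct MicroAction {
    std::function<bool(const UnitState&)> shouldExit;  // exit condition
    std::function<void(UnitState&)> act;               // one tick of the core action

    // Returns true when the microaction is finished and can be popped.
    bool doAction(UnitState& unit) const {
        if (shouldExit(unit)) return true;
        act(unit);
        return shouldExit(unit);
    }
};

// Move action: "If I'm there then exit."
MicroAction makeMove(int targetX) {
    return MicroAction{
        [targetX](const UnitState& u) { return u.x == targetX; },
        [targetX](UnitState& u) { u.x += (targetX > u.x) ? 1 : -1; }};
}

// Build action: "Apply enough labour and then exit."
MicroAction makeBuild(int labourNeeded) {
    return MicroAction{
        [labourNeeded](const UnitState& u) { return u.labourApplied >= labourNeeded; },
        [](UnitState& u) { ++u.labourApplied; }};
}
```

The point of the shape is that every microaction answers exactly one question ("am I done?"), so there are no hidden state transitions to chase when debugging.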

This requires a bit of a paradigm shift within the code. Actions don't get wiped off the stack. There are a few edge cases where things get tricky (making sure that resources used in the construction of an object don't get lost if a unit dies, but do get lost if the building they're in is destroyed/looted). Also, player-initiated actions can get replaced or wiped, but only certain types of actions, so that they don't hit those edge cases.

Stacking Conditions

More complex actions are built using a variety of microactions. A single harvest action involves move actions, pick up actions, drop off actions and the actual harvesting itself. The core of a harvest action can be thought of as the "harvest" action. Everything else is a condition where we bump out of the action and do something else. Microactions only ever exit; they never stack any other action, which is what makes them atomic. Bigger actions may need to use a microaction to complete a task. By never wiping the stack, we can rely on the fact that no matter what happens between one action and the next, an action can be completed to the right conditions and we can safely pop.

A stack condition is something that is checked and if needed, a new action is stacked on the unit. For a harvest action, this may mean that a move action is triggered if the unit isn't close enough to a desired location. We can safely assume when this action completes we are at the right location. This is key. All actions can have any number of things happen before they actually complete. A move can be interrupted by another move, by a flee action or anything else but when the stack pops and goes back to the same move action, it does what we want it to do and when it is done we are assured we are in the right condition. And even if we are not, it'd hit the same stacking condition in the parent action.

As an example, say we are harvesting. We try to harvest but we hit the "are we close enough" stacking condition. We aren't! So a move action is stacked. When that completes, we are at the node and can then do the harvest.

A more complex example: we are harvesting. We try to harvest but aren't close enough, so we stack a move action. But then the player selects the unit to go chop a tree. Later, when the unit has finished chopping the tree, it is somewhere else entirely, but the stack pops back to the original stacked move action. The unit then moves to the node before harvesting.
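The two examples above can be sketched roughly as follows. Everything here is illustrative rather than Cultura's real code: Action, MoveAction, HarvestAction and the one-dimensional position are assumptions, and the player's "chop a tree" interruption is stood in for by any other action pushed on top of the stack.

```cpp
#include <memory>
#include <vector>

struct Unit;

struct Action {
    virtual ~Action() = default;
    // Returns true when finished; may push a child action onto the unit's stack.
    virtual bool doAction(Unit& unit) = 0;
};

struct Unit {
    int x = 0;        // hypothetical 1-D position
    int carried = 0;  // hypothetical inventory count
    std::vector<std::unique_ptr<Action>> stack;

    void processTick() {
        if (stack.empty()) return;
        Action* top = stack.back().get();
        bool finished = top->doAction(*this);
        // Pop only if the action finished and did not stack something above itself.
        if (finished && stack.back().get() == top) stack.pop_back();
    }
};

// Microaction: exit condition only, never stacks anything.
struct MoveAction : Action {
    int targetX;
    explicit MoveAction(int t) : targetX(t) {}
    bool doAction(Unit& u) override {
        if (u.x == targetX) return true;  // exit condition: already there
        u.x += (targetX > u.x) ? 1 : -1;
        return u.x == targetX;
    }
};

struct HarvestAction : Action {
    int nodeX;
    int wanted;
    HarvestAction(int node, int amount) : nodeX(node), wanted(amount) {}
    bool doAction(Unit& u) override {
        if (u.carried >= wanted) return true;  // exit condition
        if (u.x != nodeX) {                    // stacking condition: "are we close enough?"
            u.stack.push_back(std::make_unique<MoveAction>(nodeX));
            return false;
        }
        ++u.carried;                           // the core "harvest" action
        return u.carried >= wanted;
    }
};
```

If another action is pushed on top while the move is pending and drags the unit away, the move still runs to its own exit condition once the stack pops back to it, and the harvest re-checks its stacking condition afterwards anyway.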

Action

Most actions have some core thing that they do. This is the "harvest" or the "drop off". However, some actions are merely a collection of Exit Conditions and Stacking Conditions. Refactoring everything into simple if statements leaves the core of any action as an incredibly simple and small amount of code. This is what makes the debugging super easy. State transitions are clearly defined.

The Future

At more well funded studios (ones with more than zero dollars) the state transitions and actions are usually data-driven. The engine defines a bunch of atomic actions and a whack of possible enter/exit conditions and/or triggers. Eventually, I would like to get Cultura to that point, but the microaction refactoring is the first step. It clears the way for me to go ahead with the data-driven model. For instance, every "if statement" could instead be codified in a class with an interface that has a single bool function, shouldContinue. The C++ code encompasses what the check means, while the data-driven part just uses labels like "check if inventory full/empty". Then actions are built dynamically at load time with arrays of exit and stacking conditions.
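A minimal sketch of what that data-driven step might look like, under stated assumptions: only shouldContinue comes from the description above. The label strings, the Condition class names, makeCondition, ActionDefinition and buildAction are all hypothetical, and a real loader would read the labels from a data file rather than a hard-coded list.

```cpp
#include <map>
#include <memory>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical unit state for the example.
struct UnitState {
    int carried = 0;
    int capacity = 5;
};

// Each former "if statement" becomes a class with a single bool function.
struct Condition {
    virtual ~Condition() = default;
    virtual bool shouldContinue(const UnitState& unit) const = 0;
};

struct InventoryNotFull : Condition {
    bool shouldContinue(const UnitState& u) const override { return u.carried < u.capacity; }
};

struct InventoryNotEmpty : Condition {
    bool shouldContinue(const UnitState& u) const override { return u.carried > 0; }
};

// Maps a label from the data side to the C++ condition that implements it.
std::shared_ptr<Condition> makeCondition(const std::string& label) {
    static const std::map<std::string, std::shared_ptr<Condition>> registry = {
        {"inventory_not_full", std::make_shared<InventoryNotFull>()},
        {"inventory_not_empty", std::make_shared<InventoryNotEmpty>()},
    };
    auto it = registry.find(label);
    if (it == registry.end()) throw std::invalid_argument("unknown condition: " + label);
    return it->second;
}

struct ActionDefinition {
    std::vector<std::shared_ptr<Condition>> exitConditions;

    // The action keeps going only while every condition says to continue.
    bool shouldContinue(const UnitState& unit) const {
        for (const auto& c : exitConditions)
            if (!c->shouldContinue(unit)) return false;
        return true;
    }
};

// Built at load time from an array of labels.
ActionDefinition buildAction(const std::vector<std::string>& labels) {
    ActionDefinition def;
    for (const auto& label : labels) def.exitConditions.push_back(makeCondition(label));
    return def;
}
```

Stacking conditions could follow the same pattern with a second array; they are left out here to keep the sketch short.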

Huzzah.

Tuesday, December 9, 2014

Building Skynet

Artificial Intelligence

So, after the Alpha there's going to be some thought about how to build the opponents in the game: the AI. Most games are built using scripted AI and some limited heuristics in order for the player to compete against some rudimentary opponents. In games such as Civilization it's a make or break scenario; bad AI translates to a truly awful game experience. For most big budget games, ironically there's zero time/money for developing good AI, whereas for indie developed games, there's a lot more room for it. It's mostly because AI is about the most unreliable (timeline wise) aspect of a game to work on.

Cultura is going to use pre-trained AI. How many hand-written heuristics it needs depends on how well the training goes, but let's talk about the idealized world of having an AI trained from nothing!

The Metrics

The Cultura AI is going to be looked at from three different angles:

  • Data: What metrics is the AI going to rely upon and measure?
  • Score: This is a continuous game, so rather than a concept of winning/losing which the AI can train upon, it continuously checks itself against a score to see whether it is doing well
  • Actions: What is it that the AI actually has control over in order to attempt maximizing the Score?

The exact method used to maximize score could vary: a perceptron, a multi-layer neural net or something more sophisticated would be the obvious choices. These can lead to different output values for types of actions and what to act upon. A highly scripted AI would be built similarly, except that the metric -> action translation would be built by hand. In the case of a trained AI, it would rely on the Data and then produce a choice about which action to take.
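For a sense of what the simplest version of that metric -> action translation might look like, here is a sketch of a single-layer linear scorer with one output per action, where the highest output wins. It only shows the forward pass; training is out of scope. ActionScorer and chooseAction are hypothetical names, and the weights would come from training (or be written by hand for a scripted AI).

```cpp
#include <cstddef>
#include <vector>

// One scorer per possible action: score = bias + sum(weights[i] * metrics[i]).
struct ActionScorer {
    std::vector<double> weights;
    double bias;

    double score(const std::vector<double>& metrics) const {
        double total = bias;
        for (std::size_t i = 0; i < weights.size() && i < metrics.size(); ++i)
            total += weights[i] * metrics[i];
        return total;
    }
};

// Returns the index of the highest-scoring action (the first one wins ties).
// Assumes scorers is non-empty.
std::size_t chooseAction(const std::vector<ActionScorer>& scorers,
                         const std::vector<double>& metrics) {
    std::size_t best = 0;
    for (std::size_t i = 1; i < scorers.size(); ++i)
        if (scorers[i].score(metrics) > scorers[best].score(metrics)) best = i;
    return best;
}
```

With metrics such as {food level, neighbour hostility}, a "farm" scorer might weight food negatively and a "train soldiers" scorer might weight hostility positively, so the choice flips as hostility rises.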

Data

The data is one of the most important aspects of the AI. Choosing insufficient or irrelevant metrics will mean that the AI is not paying attention to the right things when choosing its actions. Some of the data is simple. It should know what land it owns, what resources are on that land at any given moment and the number of people. It might also try to track buildings, industrial buildings of different types, tools, weapons and military power. And finally, the most difficult thing to track is the relationship with each faction it knows about.

  • Non-renewable Resources: It should know about the current maximum draw rate and the current actual draw rate.
  • Renewable Resources: It should know about the current amount left, the rate of growth and the current draw rate.
  • People: It should know about the number of people, and it might split them apart by labour skill.
  • Buildings: It should know about the number of buildings and industrial buildings.
  • Goods: It should track the number of tools and weapons available, by type, and award more points for superior tools/weapons in a particular category.
  • Military Power: It should track how powerful its currently active military is.

How might we track a relationship with someone? The issue is that an AI should be able to care that it has hostile neighbours. But how to do that?

  • Relationship Score
  • Military Power
  • Distance

We could track these metrics for each known neighbour: keep a relationship score and a military power estimate, and have some interesting way of calculating distance (perhaps the closest distance between two land areas of the two factions). Then, we could lump all this together for a total relationship score that is used for calculations... or we could leave it separate. Lumping it together makes it a bit easier for the "national AI" to make overall decisions. One hopes that through training it figures out that highly hostile military powers near it mean that it needs to defend itself against the enemy.
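Here is a sketch of both ideas: the closest-distance calculation between two land areas, and one possible way of lumping the three metrics into a single number. The threat formula is purely an assumption for illustration (hostile, strong and close should score high), as are the names Tile, closestDistance and threat and the choice of a relationship range of -1 to 1.

```cpp
#include <algorithm>
#include <cmath>
#include <limits>
#include <vector>

struct Tile { int x; int y; };

// Closest distance between any tile of one faction and any tile of another.
// Returns infinity if either faction owns no land.
double closestDistance(const std::vector<Tile>& a, const std::vector<Tile>& b) {
    double best = std::numeric_limits<double>::infinity();
    for (const Tile& p : a)
        for (const Tile& q : b)
            best = std::min(best, std::hypot(double(p.x - q.x), double(p.y - q.y)));
    return best;
}

// Hypothetical lumped score. relationship is assumed to be in [-1, 1],
// where -1 is fully hostile and +1 is fully friendly.
double threat(double relationship, double militaryPower, double distance) {
    double hostility = (1.0 - relationship) / 2.0;  // 0 = friendly, 1 = hostile
    return hostility * militaryPower / (1.0 + distance);
}
```

The brute-force distance loop is quadratic in the number of tiles, which is fine for a sketch but would probably want a border-only or spatially indexed version in a real game.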

Score

Ultimately, any AI needs some kind of objective function or else it doesn't train against much. In the case of Cultura there is no winning or losing. It is a continuous game that you continue to play FOREVER. So, how might we picture this? I figure the simplest way is to see the game as a race with positions and you always want to be number one. So, how well an AI thinks it is doing depends on how well others are doing. This could make the AI pretty hard to train (though, another thought is to have opponents who are fake but always improving at a constant rate and try to train AI against this).

This means that we have a way of calculating score and then all the AI is trying to do is get one more point compared to the highest score.

The score is likely to be a function that is something like Total = Population * Happiness * Health + Wealth + Buildings + Land. Another option is to try to track population, health, happiness and wealth separately and have the AI try to be first place in each category.

Actions

And what is an AI without the ability to do anything? The AI's main picture of the world is Labour Distribution. Oh ho ho, how communist! But more seriously, this is where the AI tries to decide economic activity to the best of its ability in order to cope with the world and become number one. Also, for those that might actually be concerned about a fully communist AI-run society, the future of Cultura involves a concept known as administrative cost, which limits the amount the government can do (because any action it takes costs administration points, it might be better to leave some things to private hands).

Okay so one type of action is changing the amount of labour in a particular industry at the margin. Doing it at the margin is important because we don't want an AI that makes crazy decisions, we want one that edges a society toward the optimal solution in a finite amount of time. Another type of decision would be to expand or upgrade the industrial structures related to a particular industry. In a similar vein, it might decide to increase the number of tools being made for a particular industry to increase production.
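A sketch of what "at the margin" could mean in practice: each decision moves at most a small, fixed number of workers between two industries, so the society drifts toward a better allocation instead of lurching. LabourPlan, shiftLabour and the industry names are hypothetical.

```cpp
#include <algorithm>
#include <map>
#include <string>

using LabourPlan = std::map<std::string, int>;  // industry -> workers assigned

// Moves at most maxStep workers from one industry to another.
// Returns how many workers actually moved.
int shiftLabour(LabourPlan& plan, const std::string& from, const std::string& to, int maxStep) {
    int available = plan[from];
    int moved = std::max(0, std::min(available, maxStep));
    plan[from] -= moved;
    plan[to] += moved;
    return moved;
}
```

Keeping maxStep small is the whole trick: the AI's output only picks the direction of the change, and the cap keeps any single decision from being crazy.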

But the more difficult decisions come from foreign interaction. Okay, so it's pretty easy to say "Gee my happiness is low, build some kind of luxury". Then it's more difficult to say what kind of good. But even if you were to decide that, when would you figure out "Gee let's instead trade ivory figurines for wooden furniture because the comparative advantage makes this a superior choice than forming a domestic industry to do this?" One choice is to say that a generic action is "increase production of x" and then a subsequent sub-choice to that is "how to increase production of x", where "trade" is an action and then the sub-choice there is "what do I trade off and to whom for this good?"

And then of course, there's raiding, war and peace. How to train an AI to be able to figure out when it should make these types of decisions? "Relationship metric is very low, attempt to fix this is triggered, peace is less costly than war" OR "Wealth Score needs to be improved, weak nation nearby, crush nation and take goods"

Conclusion

This is a pretty good start to the AI question. In the end, the most important thing is for there to be an awesome AI that is interesting to play against and is highly intelligent. This is one of the few games where an AI that is really awesome at managing a society is totally okay. Striving for Nintendo Hard here.