Machine Learning in Unity — Part 2 of 3

In Part 1 of this series, I gave a brief history of Artificial Intelligence (AI) in games and the crucial role games have played in advancing certain areas of AI.

In Part 2, I would like to introduce some terminology and concepts that help explain machine learning and how it relates to AI, deep learning, and neural networks. This sets the foundation for our ultimate goal: applying machine learning to Unity development.

You may be hearing the term AI (Artificial Intelligence) mentioned liberally these days. But wait: Machine Learning (ML), Deep Learning (DL), Neural Networks (NN), and Deep Neural Networks (DNN) have also arrived on the scene. Let’s take a few minutes to unpack these terms and cover what we need to know as machine learning practitioners in our Unity development.

Here’s an older, more informal definition of Artificial Intelligence by Arthur Samuel from the 1950s:

Yes, the concept of AI was conceived over 70 years ago! Although we can already write software that performs very complex behavior, artificial intelligence represents something more: the ability to “learn” and perform “without explicitly being programmed,” to adapt to changes in an environment, and to respond to the behavior of others.

The terms AI, ML, and DL essentially refer to different aspects of the same general idea. This may provide some context:

AI vs ML vs DL

  • AI — A program’s ability to sense, reason, act, and adapt
  • ML — Algorithms whose performance improves as they are exposed to more data over time
  • DL — Multilayered neural networks learn from vast amounts of data
    (conceptualized decades ago, but only now do we have the computing resources to implement them)

High-end gaming technology drove the standardization of powerful Graphics Processing Units (GPUs), which we can now use to implement deep learning algorithms theorized decades ago.

At a basic level, machine learning falls into three categories: supervised learning, unsupervised learning, and reinforcement learning.

Ref: Data Flair

For our purposes, we will be primarily focusing on Reinforcement Learning (RL) for game development.

Deep learning sits at the intersection of machine learning and neural networks.

Ref: The Scientist

Scientific research is heavily rooted in statistics and probability. With the advent of massive amounts of data (big data), the field known as data science has emerged to apply statistical analysis to that data. The ability to learn from the data (statistical learning) has evolved into the various models and algorithms that make up machine learning in general.

Ref: ICT Institute

This is, of course, a very high-level overview that hopefully puts some of the buzzwords in perspective.

Reinforcement Learning (RL)

We often hear that computers are pretty dumb and will do only what they are told, yet AI somehow makes them smart(er)? One way to understand how a computer could possibly “learn” to do something without being explicitly programmed to do so is illustrated here.


Baby steps:
Think of the Agent as the player that is assigned an objective. The Agent player starts out taking random actions in the game space, or environment (sometimes referred to as the Arena). The state of the agent(s) in the environment is continuously monitored, and a positive or negative reward is issued for every move or action the Agent player makes. This drives the Agent player to seek more and more positive rewards (moves or actions that lead toward the goal) and to avoid negative rewards (unsuccessful moves or actions). This feedback loop runs until the Agent succeeds as often as possible and errs as rarely as possible.

That’s it — Reinforcement Learning in a nutshell!
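To make the loop concrete, here is a minimal Python sketch of an agent taking random actions and receiving rewards. Everything in it is an assumption made for illustration: the `CorridorEnv` arena, its reward values, and the `run_random_episode` helper are hypothetical and are not part of Unity or any library. No learning happens yet; the sketch only shows the state, action, and reward cycle.

```python
import random


class CorridorEnv:
    """Hypothetical 1-D arena: the agent starts in cell 0, the goal is the last cell."""

    def __init__(self, length=5):
        self.length = length
        self.position = 0

    def reset(self):
        self.position = 0
        return self.position  # the state is simply the agent's cell index

    def step(self, action):
        """Apply an action (-1 = left, +1 = right); return (state, reward, done)."""
        self.position = min(max(self.position + action, 0), self.length - 1)
        done = self.position == self.length - 1
        reward = 1.0 if done else -0.1  # positive at the goal, small penalty otherwise
        return self.position, reward, done


def run_random_episode(env, rng, max_steps=100):
    """Let an untrained agent act at random; record (state, action, reward) tuples."""
    state = env.reset()
    history = []
    for _ in range(max_steps):
        action = rng.choice([-1, 1])  # random action: nothing has been learned yet
        next_state, reward, done = env.step(action)
        history.append((state, action, reward))
        state = next_state
        if done:
            break
    return history


history = run_random_episode(CorridorEnv(), random.Random(0))
total_reward = sum(reward for _, _, reward in history)
print(f"steps taken: {len(history)}, total reward: {total_reward:.1f}")
```

A learning algorithm would use the recorded rewards to make good actions more likely the next time around; the random agent here simply wanders until it stumbles on the goal or runs out of steps.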

An alternative way of looking at this iterative process of training a model is illustrated here.

Ref: Lil’Log

Think of it this way: the agent player picks one of the possible actions available in the environment, receives a reward for the action(s) taken, and then tries to generate the next best action based on the most recent state. This repeats until the agent player gets it right most of the time.
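One simple way to turn "reward the action, then pick the next best action" into code is tabular Q-learning. This is only a sketch on a made-up five-cell corridor; the constants, the reward values, and the `step` and `train` functions are assumptions for illustration, and it is not the algorithm Unity uses.

```python
import random

N_CELLS = 5        # corridor cells 0..4; cell 4 is the goal
ACTIONS = [-1, 1]  # move left or move right


def step(state, action):
    """Toy environment dynamics: return (next_state, reward, done)."""
    next_state = min(max(state + action, 0), N_CELLS - 1)
    done = next_state == N_CELLS - 1
    return next_state, (1.0 if done else -0.1), done


def train(episodes=200, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning: q[state][i] estimates the long-term reward of ACTIONS[i]."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_CELLS)]
    for _ in range(episodes):
        state, done, steps = 0, False, 0
        while not done and steps < 100:
            # Explore occasionally; otherwise take the best-known action.
            if rng.random() < epsilon:
                a = rng.randrange(2)
            else:
                a = 0 if q[state][0] > q[state][1] else 1
            next_state, reward, done = step(state, ACTIONS[a])
            # Nudge the estimate toward reward + discounted best future value.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
            steps += 1
    return q


q_table = train()
# Best-known action for every non-goal cell.
policy = [ACTIONS[0 if left > right else 1] for left, right in q_table[:-1]]
print("learned action per cell:", policy)
```

On this toy problem the agent is expected to end up preferring +1 (move right) in every cell, since that is the shortest path to the reward.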

If you are curious to know more of the nitty gritty details, but not too much, check out What is Reinforcement Learning.

Looks simple enough, but what makes this work? Think, quite literally, of how the human brain and nervous system work.

“Cells within the nervous system, called neurons, communicate with each other in unique ways. The neuron is the basic working unit of the brain, a specialized cell designed to transmit information to other nerve cells, muscle, or gland cells.” —


Translated to machine learning, artificial neurons are known as perceptrons. However, you will more likely encounter the biological term neuron.

In its simplest form as originally imagined:


“Perceptrons can be viewed as building blocks in a single layer in a neural network, made up of four different parts:

  1. Input Values (One Input Layer, x’s)
  2. Weights and Bias (W’s and b’s)
  3. Net sum
  4. Activation function”
  5. Output (One Output Layer, y)

Ref: Modified for clarity from TowardsDataScience

A more precise representation of an artificial neuron (perceptron):

Ref: Carnegie Mellon University

Okay, we will go with the flow and hereafter refer to the artificial neuron as simply a neuron, but keep in mind, it is technically a perceptron.

The weights are applied to the inputs, and the bias is added to the net sum just before the Activation Function. Together they help determine whether the result of the summation “fired,” that is, activated the desired outcome.
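In symbols, using the x, W, b, and y labels from the list of parts, the computation can be written as follows. The step function shown for f is just one common choice of activation function, picked here as an assumption for illustration.

```latex
z = \sum_{i=1}^{n} w_i x_i + b,
\qquad
y = f(z),
\qquad
f(z) =
\begin{cases}
1 & \text{if } z \ge 0 \\
0 & \text{otherwise}
\end{cases}
```

Here z is the net sum, and y = 1 means the neuron "fired."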

For example, consider the inputs as different features of our player (armor, health, powerup, collectible, etc.). If the objective is to defend the castle, what combination of features (and their values), given their current state, would best allow the player to meet the objective? Depending on the scenario, different weights would be assigned to each feature, or input x, to achieve the best result.
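As a rough illustration, here is a single neuron in Python that scores a player from those features. The weights, the bias, and the two example players are made-up numbers chosen by hand for this sketch; in a real system they would be learned during training rather than typed in.

```python
def perceptron(inputs, weights, bias):
    """Weighted sum of the inputs plus a bias, passed through a step activation."""
    net_sum = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1 if net_sum >= 0 else 0  # 1 means the neuron "fires"


# Feature order (hypothetical, each scaled to 0..1): armor, health, powerup, collectible.
# Made-up weights: for defending the castle, health and armor matter most.
WEIGHTS = [0.4, 0.5, 0.2, 0.0]
BIAS = -0.6  # the weighted sum must reach 0.6 before the neuron fires

strong_player = [0.9, 0.8, 1.0, 0.3]  # well armored, healthy, powered up
weak_player = [0.2, 0.3, 0.0, 1.0]    # many collectibles, little else

print("strong player can defend:", perceptron(strong_player, WEIGHTS, BIAS))
print("weak player can defend:", perceptron(weak_player, WEIGHTS, BIAS))
```

With these numbers the strong player's net sum is positive, so the neuron fires, while the weak player's is negative, so it does not. Changing the weights changes which features the decision depends on.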

In this basic form of interconnected neurons (perceptrons) and a single (linear) function, we have the foundation of a neural network.

Neural Networks (NNs) are generalized into two basic forms:

  • Single Layer (Linear)
  • Multiple Hidden Layers (Deep Learning Neural Network)

Simply put, the activation or firing of a given set of inputs in a single layer can be represented by a linear equation (looks like a line!), whereas a network with one or more additional hidden layers is represented by a more complex (non-linear) function.
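A classic way to see the difference is the XOR pattern ("one or the other, but not both"), which no single straight line can separate. The sketch below uses hand-picked weights, not trained ones, and the function names are invented for this example: one neuron is enough for OR, while XOR needs a small hidden layer.

```python
def step(z):
    """Step activation: fire (1) only when the net sum is positive."""
    return 1 if z > 0 else 0


def neuron(inputs, weights, bias):
    """A single artificial neuron: weighted sum plus bias, then activation."""
    return step(sum(x * w for x, w in zip(inputs, weights)) + bias)


def single_layer_or(x1, x2):
    # One neuron draws one line, which is enough to separate OR.
    return neuron([x1, x2], [1, 1], -0.5)


def two_layer_xor(x1, x2):
    # Hidden layer: one neuron detects OR, another detects AND.
    h_or = neuron([x1, x2], [1, 1], -0.5)
    h_and = neuron([x1, x2], [1, 1], -1.5)
    # Output neuron fires for "OR but not AND", which is XOR.
    return neuron([h_or, h_and], [1, -1], -0.5)


for x1, x2 in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x1, x2, "OR:", single_layer_or(x1, x2), "XOR:", two_layer_xor(x1, x2))
```

The hidden layer is what lets the network combine two simple linear decisions into one that a single line could not express.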

Ref: TowardsDataScience

Not to worry about the details here, because there are people who specialize in this area to create algorithms to abstract this process for the rest of us.

As humans with limited resources and time, we could never explicitly compare all the possible inputs and weights of all the features, especially in a multi-layered (non-linear) situation. The availability of GPUs is what has made it practical to train these models.

The human brain learns and makes decisions, up to a certain point, instantaneously and based on past experience. We can only have a limited number of those experiences, but an artificial deep learning neural network operating in a reinforcement learning environment can go beyond our limits of time and complexity to consider far more of them, and in combination with each other. Note that when first training such a model, it will appear hopeless, as the agent player barely knows how to move. After several minutes of training, some success or even expert-level behavior will become apparent. Depending on the complexity of the scenario, several hours or days of training can reveal behavior we as humans would not imagine to be effective, including seemingly perplexing decisions that nonetheless result in a winning outcome.

Putting it all together, we would have our deep neural network (DNN) acting in the reinforcement learning system.

There are, of course, several other arrangements, but the aim here is specifically to provide a high-level overview and background for understanding machine learning in Unity. Only a general understanding of the diagrams presented is necessary for where we are going with this topic, so do not worry if some of them appear as gibberish. In practice, pre-packaged algorithms and pre-trained models are available, depending on the desired results.

Ref: SmartLab AI

For machine learning or reinforcement learning in Unity, we will be looking at PPO (Proximal Policy Optimization), SAC (Soft Actor-Critic), and Imitation Learning.

Congratulations if you are still with me! Now that we have a general understanding of the relevant parts of “AI” we are ready for the final part, Part 3, in this series, where we will dive into machine learning in Unity.

Happy New Year 2021!!



