IJCAI 2022 - The Neural MMO Challenge
Neural MMO Environment Tutorial
A tutorial notebook on Neural MMO Environment, observation and actions
Tutorial on NMMO's observation and action space¶
This is a tutorial to show the observation space and action space of Neural MMO's environment.
More information is available at https://neuralmmo.github.io/.
Installation¶
See https://gitlab.aicrowd.com/neural-mmo/ijcai2022-nmmo-starter-kit
# Create a new conda env
conda create -n nmmo python=3.9
conda activate nmmo
# install ijcai competition setup and nmmo
pip install git+http://gitlab.aicrowd.com/henryz/ijcai2022nmmo.git
After installation, we can set up a demo environment to show how NMMO works.
Environment Initialization¶
!pip install -q --ignore-requires-python openskill==0.2.0a0
!pip install -q git+http://gitlab.aicrowd.com/henryz/ijcai2022nmmo.git
import nmmo
from ijcai2022nmmo import CompetitionConfig
env = nmmo.Env(CompetitionConfig())
The env settings for this competition:
- Map of 128 * 128 tiles
- 128 players (16 teams * 8 players/team)
- All players spawn concurrently when the game starts
If you want to see the detailed configuration:
def printConfig(config):
    for attr in dir(config):
        if not attr.startswith('__'):
            print('{}: {}'.format(attr, getattr(config, attr)))

printConfig(env.config)
Observation Space¶
You can get the observation space via env.observation_space(agent), but the raw output is quite verbose.
env.observation_space(agent=1)
It is hard to read on its own, so the obs is explained in detail in the following sections.
obs = env.reset()
print(obs.keys())
The env returns obs as a dict whose keys are the agent IDs, from 1 to 128.
Let's take a look at the obs a single agent receives:
obs[1]
obs[1].keys()
The agent's obs consists of two parts:
- Entity: information about yourself, other players, and NPCs.
- Tile: information about the local map within a 15x15 window.
Entity Information¶
obs[1]['Entity'].keys()
obs[1]['Entity']['Continuous'].shape
obs[1]['Entity']['Discrete'].shape
obs[1]['Entity']['N']
The entity information is a dictionary with the following keys (a short reading sketch follows the list):
- Continuous: the continuous features, a 2d ndarray with shape 100*13.
  - The first dimension, 100, is the max number of agents that can be observed, controlled by config.N_AGENT_OBS.
  - The second dimension, 13, is the number of feature columns; the meaning of each column is explained in detail below.
- Discrete: the discrete features, a 2d ndarray with shape 100*4.
  - The first dimension, 100, is the max number of agents that can be observed, controlled by config.N_AGENT_OBS.
  - The second dimension, 4, is the number of feature columns.
  - Note that the discrete information duplicates (a part of) the continuous information, so you can simply drop it.
- N: the number of agents observed (including yourself) within the current vision.
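As a minimal sketch of reading these arrays (using only the Mask column and N described above; the variable names are illustrative, not part of the nmmo API):

entity = obs[1]['Entity']
cont = entity['Continuous']     # shape (100, 13)
mask = cont[:, 0] == 1          # column 0 is the Mask: 1 = real entity, 0 = padding
visible = cont[mask]            # rows for entities actually observed
print("rows kept:", visible.shape[0], "| N reported by the env:", entity['N'])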
Tile Information¶
obs[1]['Tile'].keys()
obs[1]['Tile']['Continuous'].shape
obs[1]['Tile']['Discrete'].shape
The tile information is a dictionary with the following keys (a short reading sketch follows the list):
- Continuous: the continuous features, a 2d ndarray with shape 225*4.
  - The first dimension, 225, is the number of tiles within the agent's vision, controlled by config.NSTIM. With the default config.NSTIM=7, the number of tiles is (2*7+1)^2 = 225.
  - The second dimension, 4, is the number of feature columns; the meaning of each column is explained in detail below.
- Discrete: the discrete features, a 2d ndarray with shape 225*3.
  - The first dimension, 225, is the number of tiles within the agent's vision, controlled by config.NSTIM.
  - The second dimension, 3, is the number of feature columns.
  - Note that the discrete information also duplicates (a part of) the continuous information, so you can simply drop it.
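Since the 225 tiles cover a square window centered on the agent, they can be viewed as a 15x15 grid. The reshape below is a minimal sketch that assumes the tiles are listed in row-major order within the vision window:

tile = obs[1]['Tile']['Continuous']               # shape (225, 4)
window = 2 * env.config.NSTIM + 1                 # 15 with the default NSTIM=7
tile_types = tile[:, 1].reshape(window, window)   # column 1 is the tile type (see the table below)
print(tile_types)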
Feature Columns¶
| Information | Type | Index | Feature | Description |
|---|---|---|---|---|
| Entity | Continuous(100*13) | 0 | Mask | whether this row contains information; 1 for useful, 0 for null |
| | | 1 | Entity_ID | ID of this entity; for players, ID>0; for NPCs, ID<0 |
| | | 2 | Attacker_ID | the ID of the last agent that attacked this entity |
| | | 3 | Level | the level of this entity |
| | | 4 | Population | the population this entity belongs to, which can be used to identify teammates and opponents; for players, population>0; for NPCs, population<0 |
| | | 5 | Row_index | the row index of this entity |
| | | 6 | Column_index | the column index of this entity |
| | | 7 | Damage | the damage this entity has received |
| | | 8 | Timealive | the time this entity has been alive |
| | | 9 | Food | the current food this entity has |
| | | 10 | Water | the current water this entity has |
| | | 11 | Health | the current health of this entity |
| | | 12 | Is_freezed | whether this entity is frozen right now; 1 for frozen, 0 for not |
| | Discrete(100*4) | 0 | Mask | whether this row contains information; 1 for useful, 0 for null |
| | | 1 | Population | the population this entity belongs to, which can be used to identify teammates and opponents; for players, population>0; for NPCs, population<0 |
| | | 2 | Row_index | the row index of this entity |
| | | 3 | Column_index | the column index of this entity |
| | N | / | N_agents_observed | the number of agents within this entity's vision right now |
| Tile | Continuous(225*4) | 0 | N_entity | the current number of entities on this tile |
| | | 1 | Type | the type of this tile: 0 lava, 1 water, 2 grass, 3 scrub, 4 forest, 5 stone |
| | | 2 | Row_index | the row index of this tile |
| | | 3 | Column_index | the column index of this tile |
| | Discrete(225*3) | 0 | Type | the type of this tile: 0 lava, 1 water, 2 grass, 3 scrub, 4 forest, 5 stone |
| | | 1 | Row_index | the row index of this tile |
| | | 2 | Column_index | the column index of this tile |
Action Space¶
from nmmo.io import action

act_space = env.action_space(agent=0)
print("Action space:")
print("*" * 2, action.Attack, ": ", act_space[action.Attack])
print("-" * 8, action.Style, ": ", act_space[action.Attack][action.Style])
print("-" * 8, action.Target, ": ", act_space[action.Attack][action.Target])
print("*" * 2, action.Move, ": ", act_space[action.Move])
print("-" * 8, action.Direction, act_space[action.Move][action.Direction])
As shown above, the action space is presented as a nested dictionary whose keys are the classes from nmmo.io.action.
The agent can perform two actions at the same timestep:
- Attack: attack an entity (including NPCs and other players) within your vision. This action can be empty; if you don't send it, the agent will not attack anyone.
  - Target: choose the target the agent attacks.
  - Style: choose the style the agent uses to attack.
- Move: move in one of the four directions. This action can be empty; if you don't send it, the agent will stay where it is.
  - Direction: choose the direction the agent moves.
You should return actions keyed by agent ID, like this:
from nmmo.io import action

actions = {
    1: {
        action.Attack: {
            action.Style: 0,
            action.Target: 3,
        },
        action.Move: {
            action.Direction: 1,
        },
    },
    2: {
        action.Attack: {
            action.Style: 2,
            action.Target: 4,
        },
        action.Move: {
            action.Direction: 3,
        },
    },
    ...
}
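As a minimal sketch of a full interaction loop (the random-move policy here is purely illustrative), build one such action dict for every agent you control and pass it to env.step:

import random
from nmmo.io import action

obs = env.reset()
for _ in range(10):
    # every observed agent moves in a random direction; attacks are omitted (no-op)
    actions = {
        agent_id: {action.Move: {action.Direction: random.randint(0, 3)}}
        for agent_id in obs
    }
    obs, rewards, dones, infos = env.step(actions)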
Team-Based Env¶
Given the team-based game setting of the competition, it is better to make the environment's inputs and outputs team-based as well. Users can then focus on training teams, without having to manually separate players into teams.
TeamBasedEnv is almost the same as nmmo.Env, except that its reset() and step() methods handle the team-player mapping automatically. TeamBasedEnv is used in evaluation, and it can be used in training too.
from ijcai2022nmmo import CompetitionConfig, TeamBasedEnv
env = TeamBasedEnv(CompetitionConfig())
# input
actions_by_team = {
    # actions of team0
    0: {
        0: action_of_player0_in_team0,
        1: action_of_player1_in_team0,
        ...
        7: action_of_player7_in_team0,
    },
    # actions of team1
    1: {
        0: action_of_player0_in_team1,
        ...
    },
    ...
    # actions of team15
    15: {
        ...
        7: action_of_player7_in_team15,
    },
}

(
    observations_by_team,
    rewards_by_team,
    dones_by_team,
    infos_by_team,
) = env.step(actions_by_team)
print(observations_by_team)
...
{
    # observations of team0
    0: {
        0: obs_of_player0_in_team0,
        1: obs_of_player1_in_team0,
        ...
        7: obs_of_player7_in_team0,
    },
    # observations of team1
    1: {
        0: obs_of_player0_in_team1,
        ...
    },
    ...
    # observations of team15
    15: {
        ...
        7: obs_of_player7_in_team15,
    },
}
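As a minimal sketch of how these nested dicts fit together (the do-nothing policy is purely illustrative, and it assumes reset() returns observations keyed by team just like step() does):

from ijcai2022nmmo import CompetitionConfig, TeamBasedEnv

env = TeamBasedEnv(CompetitionConfig())
obs_by_team = env.reset()
for _ in range(10):
    # an empty per-player action dict means that player takes no action this step
    actions_by_team = {
        team_id: {player_id: {} for player_id in team_obs}
        for team_id, team_obs in obs_by_team.items()
    }
    obs_by_team, rewards_by_team, dones_by_team, infos_by_team = env.step(actions_by_team)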
Replay¶
Before getting started, you need to download an NMMO client from this page and run it locally. Here is an example showing how to save and load a replay.
Note: this code block won't run in Colab. Please run it on your local machine.
import nmmo
from ijcai2022nmmo import CompetitionConfig, TeamBasedEnv, scripted


class Config(CompetitionConfig):
    SAVE_REPLAY = "demo"


def save_replay():
    """Demo for saving replay"""
    config = Config()
    env = TeamBasedEnv(config=config)
    scripted_ai = scripted.CombatTeam(None, config)
    obs = env.reset()
    t, horizon = 0, 32
    while True:
        env.render()
        decision = {}
        for team_id, o in obs.items():
            decision[team_id] = scripted_ai.act(o)
        obs, _, _, _ = env.step(decision)  # update obs so the scripted team acts on the latest state
        t += 1
        if t >= horizon:
            break
    env.terminal()


def load_replay():
    """Demo for loading replay"""
    replay = nmmo.Replay.load(Config.SAVE_REPLAY + ".replay")
    replay.render()


if __name__ == "__main__":
    save_replay()
    # load_replay()