NeurIPS 2022 - The Neural MMO Challenge
NeurIPS 2022 Neural MMO Challenge Tutorial
Introduction of the action space and the observation space of the Neural MMO Challenge Environment
Tutorial on NMMO's observation and action space¶
This tutorial walks through the observation space and action space of the Neural MMO (NMMO) environment.
Installation¶
See https://gitlab.aicrowd.com/neural-mmo/neurips2022-nmmo-starter-kit
# install neurips competition setup and nmmo
!pip install --ignore-requires-python openskill==0.2.0a0
!pip install git+http://gitlab.aicrowd.com/neural-mmo/neurips2022-nmmo.git
After installation, we can set up a demo environment to show how NMMO works.
import nmmo
from nmmo import config
class TestConfig(config.Medium, config.AllGameSystems):
    pass
conf = TestConfig()
env = nmmo.Env(conf)
The environment settings for this competition:
- A map of 128 * 128 tiles
- 128 players (16 teams * 8 players/team)
- All players spawn concurrently when the game starts
If you want to see the detailed configuration:
# iterate over the config instance created above (conf), not the config module
for attr in dir(conf):
    if not attr.startswith('__'):
        print(f'{attr}: {getattr(conf, attr)}')
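Printing every attribute produces a long listing. As a minimal sketch, the helper below (hypothetical name `public_settings`) collects the non-dunder, non-callable attributes of any object into a dictionary, which is easier to filter or compare. It is exercised on a stand-in class whose attribute names are illustrative only, not on a real nmmo config.

```python
def public_settings(obj):
    """Return {name: value} for attributes that are neither dunder names nor callables."""
    settings = {}
    for name in dir(obj):
        if name.startswith('__'):
            continue
        value = getattr(obj, name)
        if callable(value):
            continue  # skip methods and nested classes
        settings[name] = value
    return settings


class DemoConfig:
    """Stand-in for a config object; the attribute names are made up for this example."""
    MAP_SIZE = 128
    TEAM_COUNT = 16

    def describe(self):
        return 'demo'


demo_settings = public_settings(DemoConfig())
print(demo_settings)
```

With a real environment, the same call would be `public_settings(conf)`.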
Observation Space¶
You can get the observation space via env.observation_space(agent), but the output is very verbose.
env.observation_space(agent=1)
The raw output is hard to read, so the observations are explained in detail in the sections below.
obs = env.reset()
print(obs.keys())
The env returns the observations as a dict whose keys are the agent IDs, from 1 to 128.
Now let's look at the observation a single agent receives:
obs[1].keys()
The agent's obs consists of four parts:¶
- Entity: information about yourself, other players, and NPCs.
- Tile: information about the local map, a 15x15 window.
- Item: information about the weapons, tools, and consumables that the entity has equipped.
- Market: information about the goods on sale in the global market.
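Each of these groups is itself a dictionary with `Continuous`, `Discrete`, and `N` entries, as the following sections describe. The sketch below summarizes the shapes of one agent's observation in a single pass. The helper names `shape_of` and `summarize_agent_obs` are hypothetical, and the mock observation uses made-up sizes and nested lists instead of real ndarrays so that it runs without nmmo installed.

```python
def shape_of(array_like):
    """Return the shape of a numpy array or of a rectangular nested list."""
    if hasattr(array_like, 'shape'):
        return tuple(array_like.shape)
    shape = []
    level = array_like
    while isinstance(level, (list, tuple)):
        shape.append(len(level))
        if not level:
            break
        level = level[0]
    return tuple(shape)


def summarize_agent_obs(agent_obs):
    """Map each group name to the shapes of its Continuous and Discrete parts."""
    summary = {}
    for group_name, group in agent_obs.items():
        summary[group_name] = {
            'Continuous': shape_of(group['Continuous']),
            'Discrete': shape_of(group['Discrete']),
        }
    return summary


# Tiny mock observation with made-up sizes (2 entity rows, 4 tile rows).
mock_obs = {
    'Entity': {'Continuous': [[0.0] * 24 for _ in range(2)],
               'Discrete': [[0] * 5 for _ in range(2)], 'N': 1},
    'Tile': {'Continuous': [[0.0] * 4 for _ in range(4)],
             'Discrete': [[0] * 3 for _ in range(4)], 'N': 4},
}
obs_summary = summarize_agent_obs(mock_obs)
print(obs_summary)
```

With a real environment, `summarize_agent_obs(obs[1])` would report the shapes for agent 1.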
Entity Information¶
obs[1]['Entity'].keys()
obs[1]['Entity']['Continuous'].shape
obs[1]['Entity']['Discrete'].shape
obs[1]['Entity']['N']
The entity information is a dictionary with the following keys:
- Continuous: the continuous features, a 2D ndarray with shape 100*24.
  - The first dimension, 100, is the maximum number of agents that can be observed and is controlled by config.N_AGENT_OBS.
  - The second dimension, 24, is the number of feature columns; the meaning of each column is listed in the Feature Columns section.
- Discrete: the discrete features, a 2D ndarray with shape 100*5.
  - The first dimension, 100, is the maximum number of agents that can be observed and is controlled by config.N_AGENT_OBS.
  - The second dimension, 5, is the number of feature columns.
  - Note that the discrete information duplicates part of the continuous information, so you can simply drop it.
- N: the number of agents observed (including yourself) in the current field of vision.
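Because the first dimension is a fixed maximum while `N` counts the agents actually observed, a common first step is to keep only the first `N` rows. The sketch below assumes that the observed rows are stored first and that the remaining rows are padding, which this tutorial does not state explicitly. It also accepts `N` as either a plain integer or a length-1 array. The helper names are hypothetical.

```python
def observed_count(group):
    """Read the 'N' entry, accepting a plain number or a length-1 sequence."""
    n = group['N']
    try:
        return int(n)
    except TypeError:
        # e.g. a one-element list such as [2]
        return int(n[0])


def valid_rows(group, part='Continuous'):
    """Return the first N rows of the chosen part (assumed to be the observed ones)."""
    return group[part][:observed_count(group)]


# Mock Entity group: 4 row slots, of which 2 are reported as observed.
mock_entity = {
    'Continuous': [[1.0] * 24, [1.0] * 24, [0.0] * 24, [0.0] * 24],
    'Discrete': [[1] * 5, [1] * 5, [0] * 5, [0] * 5],
    'N': [2],
}
visible = valid_rows(mock_entity)
print(len(visible))
```

With a real observation the call would be `valid_rows(obs[1]['Entity'])`. The Mask column listed under Feature Columns may serve the same purpose, but that is also an assumption to verify.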
Tile Information¶
obs[1]['Tile'].keys()
obs[1]['Tile']['Continuous'].shape
obs[1]['Tile']['Discrete'].shape
obs[1]['Tile']['N']
The tile information is a dictionary with the following keys:
- Continuous: the continuous features, a 2D ndarray with shape 225*4.
  - The first dimension, 225, is the number of tiles within the agent's vision, which is controlled by config.NSTIM. With the default config.NSTIM=7, the number of tiles is (2*7+1)^2 = 225.
  - The second dimension, 4, is the number of feature columns; the meaning of each column is listed in the Feature Columns section.
- Discrete: the discrete features, a 2D ndarray with shape 225*3.
  - The first dimension, 225, is the number of tiles within the agent's vision, which is controlled by config.NSTIM.
  - The second dimension, 3, is the number of feature columns.
  - Note that the discrete information also duplicates part of the continuous information, so you can simply drop it.
- N: the number of distinct tile observations (fixed).
Item Information¶
obs[1]['Item'].keys()
obs[1]['Item']['Continuous'].shape
obs[1]['Item']['Discrete'].shape
obs[1]['Item']['N']
The item information is a dictionary with the following keys:
- Continuous: the continuous features, a 2D ndarray with shape 170*16.
  - The first dimension, 170, is the number of items, which is controlled by config.NPC_LEVEL_MAX. With the default config.NPC_LEVEL_MAX=10, the number of items is 17*10 = 170.
  - The second dimension, 16, is the number of feature columns; the meaning of each column is listed in the Feature Columns section.
- Discrete: the discrete features, a 2D ndarray with shape 170*3.
  - The first dimension, 170, is the number of items, which is controlled by config.NPC_LEVEL_MAX. With the default config.NPC_LEVEL_MAX=10, the number of items is 17*10 = 170.
  - The second dimension, 3, is the number of feature columns.
  - Note that the discrete information also duplicates part of the continuous information, so you can simply drop it.
- N: the number of distinct item observations (fixed).
Market Information¶
obs[1]['Market'].keys()
obs[1]['Market']['Continuous'].shape
obs[1]['Market']['Discrete'].shape
obs[1]['Market']['N']
The market information is a dictionary with the following keys:
- Continuous: the continuous features, a 2D ndarray with shape 170*16.
  - The first dimension, 170, is the maximum number of items in the market, which is controlled by config.NPC_LEVEL_MAX. With the default config.NPC_LEVEL_MAX=10, the number of items is 17*10 = 170.
  - The second dimension, 16, is the number of feature columns; the meaning of each column is listed in the Feature Columns section.
- Discrete: the discrete features, a 2D ndarray with shape 170*3.
  - The first dimension, 170, is the number of items, which is controlled by config.NPC_LEVEL_MAX. With the default config.NPC_LEVEL_MAX=10, the number of items is 17*10 = 170.
  - The second dimension, 3, is the number of feature columns.
  - Note that the discrete information also duplicates part of the continuous information, so you can simply drop it.
- N: the number of distinct item observations (fixed).
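The row counts quoted in the four sections above all follow from a few config values. The sketch below reproduces that arithmetic so the numbers can be rechecked if a config value changes. The function name `expected_rows` and the parameter name `item_kinds` are hypothetical; the factor 17 is taken from the "17*10" given above.

```python
def expected_rows(n_agent_obs=100, nstim=7, npc_level_max=10, item_kinds=17):
    """Compute the first dimension of each observation group from config values."""
    window = 2 * nstim + 1  # side length of the square vision window
    return {
        'Entity': n_agent_obs,
        'Tile': window * window,
        'Item': item_kinds * npc_level_max,
        'Market': item_kinds * npc_level_max,
    }


rows = expected_rows()
print(rows)  # {'Entity': 100, 'Tile': 225, 'Item': 170, 'Market': 170}
```

For instance, `expected_rows(nstim=5)['Tile']` would give 121 for an 11x11 window.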
Feature Columns¶
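Raw Features | Group (shape) | Column index | Dimension | Type | Attribute |
---|---|---|---|---|---|
Obs (raw obs) | Entity (100*24) | 0 | 1 | scalar | Mask |
Obs (raw obs) | Entity (100*24) | 1 | 1 | scalar | ID |
Obs (raw obs) | Entity (100*24) | 2 | 1 | scalar | AttackerID |
Obs (raw obs) | Entity (100*24) | 3 | 1 | scalar | Level |
Obs (raw obs) | Entity (100*24) | 4 | 1 | scalar | ItemLevel |
Obs (raw obs) | Entity (100*24) | 5 | 1 | scalar | Comm |
Obs (raw obs) | Entity (100*24) | 6 | 1 | scalar | Population |
Obs (raw obs) | Entity (100*24) | 7 | 1 | scalar | R |
Obs (raw obs) | Entity (100*24) | 8 | 1 | scalar | C |
Obs (raw obs) | Entity (100*24) | 9 | 1 | scalar | Damage |
Obs (raw obs) | Entity (100*24) | 10 | 1 | scalar | TimeAlive |
Obs (raw obs) | Entity (100*24) | 11 | 1 | scalar | Freeze (deprecated) |
Obs (raw obs) | Entity (100*24) | 12 | 1 | scalar | Gold |
Obs (raw obs) | Entity (100*24) | 13 | 1 | scalar | Health |
Obs (raw obs) | Entity (100*24) | 14 | 1 | scalar | Food |
Obs (raw obs) | Entity (100*24) | 15 | 1 | scalar | Water |
Obs (raw obs) | Entity (100*24) | 16 | 1 | scalar | Melee Level |
Obs (raw obs) | Entity (100*24) | 17 | 1 | scalar | Range Level |
Obs (raw obs) | Entity (100*24) | 18 | 1 | scalar | Mage Level |
Obs (raw obs) | Entity (100*24) | 19 | 1 | scalar | Fishing Level |
Obs (raw obs) | Entity (100*24) | 20 | 1 | scalar | Herbalism Level |
Obs (raw obs) | Entity (100*24) | 21 | 1 | scalar | Prospecting Level |
Obs (raw obs) | Entity (100*24) | 22 | 1 | scalar | Carving Level |
Obs (raw obs) | Entity (100*24) | 23 | 1 | scalar | Alchemy Level |
Obs (raw obs) | Tile (225*4) | 0 | 1 | scalar | NEnts |
Obs (raw obs) | Tile (225*4) | 1 | 1 | scalar | Index |
Obs (raw obs) | Tile (225*4) | 2 | 1 | scalar | R |
Obs (raw obs) | Tile (225*4) | 3 | 1 | scalar | C |
Obs (raw obs) | Item (170*16) | 0 | 1 | scalar | ID |
Obs (raw obs) | Item (170*16) | 1 | 1 | scalar | Index |
Obs (raw obs) | Item (170*16) | 2 | 1 | scalar | Level |
Obs (raw obs) | Item (170*16) | 3 | 1 | scalar | Capacity |
Obs (raw obs) | Item (170*16) | 4 | 1 | scalar | Quantity |
Obs (raw obs) | Item (170*16) | 5 | 1 | scalar | Tradable |
Obs (raw obs) | Item (170*16) | 6 | 1 | scalar | MeleeAttack |
Obs (raw obs) | Item (170*16) | 7 | 1 | scalar | RangeAttack |
Obs (raw obs) | Item (170*16) | 8 | 1 | scalar | MageAttack |
Obs (raw obs) | Item (170*16) | 9 | 1 | scalar | MeleeDefense |
Obs (raw obs) | Item (170*16) | 10 | 1 | scalar | RangeDefense |
Obs (raw obs) | Item (170*16) | 11 | 1 | scalar | MageDefense |
Obs (raw obs) | Item (170*16) | 12 | 1 | scalar | HealthRestore |
Obs (raw obs) | Item (170*16) | 13 | 1 | scalar | ResourceRestore |
Obs (raw obs) | Item (170*16) | 14 | 1 | scalar | Price |
Obs (raw obs) | Item (170*16) | 15 | 1 | scalar | Equipped |
Obs (raw obs) | Market (170*16) | 0 | 1 | scalar | ID |
Obs (raw obs) | Market (170*16) | 1 | 1 | scalar | Index |
Obs (raw obs) | Market (170*16) | 2 | 1 | scalar | Level |
Obs (raw obs) | Market (170*16) | 3 | 1 | scalar | Capacity |
Obs (raw obs) | Market (170*16) | 4 | 1 | scalar | Quantity |
Obs (raw obs) | Market (170*16) | 5 | 1 | scalar | Tradable |
Obs (raw obs) | Market (170*16) | 6 | 1 | scalar | MeleeAttack |
Obs (raw obs) | Market (170*16) | 7 | 1 | scalar | RangeAttack |
Obs (raw obs) | Market (170*16) | 8 | 1 | scalar | MageAttack |
Obs (raw obs) | Market (170*16) | 9 | 1 | scalar | MeleeDefense |
Obs (raw obs) | Market (170*16) | 10 | 1 | scalar | RangeDefense |
Obs (raw obs) | Market (170*16) | 11 | 1 | scalar | MageDefense |
Obs (raw obs) | Market (170*16) | 12 | 1 | scalar | HealthRestore |
Obs (raw obs) | Market (170*16) | 13 | 1 | scalar | ResourceRestore |
Obs (raw obs) | Market (170*16) | 14 | 1 | scalar | Price |
Obs (raw obs) | Market (170*16) | 15 | 1 | scalar | Equipped |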
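Indexing rows by bare column numbers is error-prone, so it can help to attach the attribute names from the table to each row. The sketch below does this for the Entity and Tile groups. The list names and the `decode_row` helper are hypothetical, the names are written without spaces for convenience, and the mock row simply holds its own column indices so that the mapping is easy to check.

```python
# Column names copied from the Feature Columns table, in index order.
ENTITY_COLUMNS = [
    'Mask', 'ID', 'AttackerID', 'Level', 'ItemLevel', 'Comm', 'Population',
    'R', 'C', 'Damage', 'TimeAlive', 'Freeze', 'Gold', 'Health', 'Food',
    'Water', 'MeleeLevel', 'RangeLevel', 'MageLevel', 'FishingLevel',
    'HerbalismLevel', 'ProspectingLevel', 'CarvingLevel', 'AlchemyLevel',
]
TILE_COLUMNS = ['NEnts', 'Index', 'R', 'C']


def decode_row(row, columns):
    """Pair each value of one Continuous row with its column name."""
    if len(row) != len(columns):
        raise ValueError(f'expected {len(columns)} values, got {len(row)}')
    return dict(zip(columns, row))


# Mock Entity row whose values equal their own column index.
mock_row = list(range(24))
entity = decode_row(mock_row, ENTITY_COLUMNS)
print(entity['R'], entity['C'], entity['Health'])  # 7 8 13
```

With a real observation, `decode_row(obs[1]['Entity']['Continuous'][0], ENTITY_COLUMNS)` would label the first Entity row.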
Action Space¶
from nmmo.io import action
act_space = env.action_space(agent=1)
print("Action space:")
print("*"*2,action.Attack,": ",act_space[action.Attack])
print("-"*8,action.Style,": ",act_space[action.Attack][action.Style])
print("-"*8,action.Target,": ",act_space[action.Attack][action.Target])
print("*"*2,action.Move,": ",act_space[action.Move])
print("-"*8,action.Direction,act_space[action.Move][action.Direction])
print("*"*2,action.Buy,": ",act_space[action.Buy])
print("-"*8,action.Item, act_space[action.Buy][action.Item])
print("*"*2,action.Sell,": ",act_space[action.Sell])
print("-"*8,action.Item, act_space[action.Sell][action.Item])
print("-"*8,action.Price, act_space[action.Sell][action.Price])
print("*"*2,action.Use,": ",act_space[action.Use])
print("-"*8,action.Item, act_space[action.Use][action.Item])
print("*"*2,action.Comm,": ",act_space[action.Comm])
print("-"*8,action.Token, act_space[action.Comm][action.Token])
As shown above, the action space is a nested dictionary whose keys are the classes from nmmo.io.action.
The agent can perform up to 6 actions in the same timestep:
Attack: attack an entity (an NPC or another player) within your vision. This action can be omitted; if you don't send it, the agent attacks no one.
- Target: the entity the agent attacks.
- Style: the attack style the agent uses.
Move: move in one of the four directions. This action can be omitted; if you don't send it, the agent stays in place.
- Direction: the direction in which the agent moves.
Buy: buy an item from the global market. If you don't send this action, the agent buys nothing.
- Item: the item in the global market that the agent wants to buy.
Sell: put an item on the global market at a preset price. If you don't send this action, the agent sells nothing.
- Item: the item that the agent no longer needs and wants to sell.
- Price: the preset price at which to sell the item.
Use: use a consumable item to restore health, food, or water. If you don't send this action, the agent uses nothing.
- Item: the consumable item used to restore health, food, or water.
Comm: signal to other agents which items you need.
- Token: represents the items you need.
And you should return actions like this:
from nmmo.io import action
actions = {
    # actions of agent 1
    1: {
        action.Attack: {
            action.Style: 0,
            action.Target: 3,
        },
        action.Move: {
            action.Direction: 1,
        },
        action.Buy: {
            action.Item: 4,
        },
        action.Sell: {
            action.Item: 2,
            action.Price: 5,
        },
        action.Use: {
            action.Item: 4,
        },
        action.Comm: {
            action.Token: 0,
        },
    },
    # actions of agent 2
    2: {
        action.Attack: {
            action.Style: 0,
            action.Target: 3,
        },
        action.Move: {
            action.Direction: 1,
        },
        action.Buy: {
            action.Item: 4,
        },
        action.Sell: {
            action.Item: 2,
            action.Price: 5,
        },
        action.Use: {
            action.Item: 4,
        },
        action.Comm: {
            action.Token: 0,
        },
    },
    # ... entries for the remaining agents
}
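Since every action is optional, a policy often produces only some of them for a given agent. The sketch below drops sub-actions that are `None` or empty before the dictionary is returned. The helper name `compact_actions` is hypothetical, and plain strings stand in for the `nmmo.io.action` classes so that the example runs without nmmo. Whether the environment expects an empty dict or no entry at all for an idle agent is not covered here; this sketch keeps an empty dict.

```python
def compact_actions(per_agent):
    """Drop sub-actions that are None or empty so they are simply not sent."""
    compact = {}
    for agent_id, agent_actions in per_agent.items():
        compact[agent_id] = {
            name: args for name, args in agent_actions.items() if args
        }
    return compact


# String keys are stand-ins for action.Move, action.Attack, action.Use, ...
raw_actions = {
    1: {'Move': {'Direction': 1}, 'Attack': None, 'Use': {}},
    2: {'Attack': {'Style': 0, 'Target': 3}},
    3: {'Move': None},
}
sent_actions = compact_actions(raw_actions)
print(sent_actions)
```

An argument value of 0, such as `{'Direction': 0}`, is kept, because only the argument dictionary itself is tested for emptiness.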
Team-Based Env¶
Given the competition's game settings, it is more convenient to organize inputs and outputs by team. Users can then focus on training teams and need not separate players into teams manually.
TeamBasedEnv is almost the same as nmmo.Env, except that its reset() and step() methods handle the team-player mapping automatically.
TeamBasedEnv is used in evaluation and may be used in training too.
from neurips2022nmmo import CompetitionConfig, TeamBasedEnv
env = TeamBasedEnv(CompetitionConfig())
# input
actions_by_team = {
    # actions of team0
    0: {
        0: action_of_player0_in_team0,
        1: action_of_player1_in_team0,
        ...
        7: action_of_player7_in_team0,
    },
    # actions of team1
    1: {
        0: action_of_player0_in_team1,
        ...
    },
    ...
    # actions of team15
    15: {
        ...
        7: action_of_player7_in_team15
    },
}
(
    observations_by_team,
    rewards_by_team,
    dones_by_team,
    infos_by_team,
) = env.step(actions_by_team)
print(observations_by_team)
...
{
    # observations of team0
    0: {
        0: obs_of_player0_in_team0,
        1: obs_of_player1_in_team0,
        ...
        7: obs_of_player7_in_team0,
        'stat': current statistical data,
    },
    # observations of team1
    1: {
        0: obs_of_player0_in_team1,
        ...
    },
    ...
    # observations of team15
    15: {
        ...
        7: obs_of_player7_in_team15
    },
}
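The listing shows a `'stat'` entry next to the player observations of team 0, so code that loops over a team's observations has to set that key aside. The sketch below separates the two and builds an `actions_by_team` dictionary with the same team and player keys. The helper names and the `policy` callable are hypothetical, and the mock data uses placeholder strings instead of real observations.

```python
def split_team_obs(team_obs):
    """Separate the 'stat' entry (if any) from the per-player observations."""
    stat = team_obs.get('stat')
    players = {pid: ob for pid, ob in team_obs.items() if pid != 'stat'}
    return players, stat


def act_for_teams(observations_by_team, policy):
    """Apply policy(obs) to every player and mirror the team/player nesting."""
    actions_by_team = {}
    for team_id, team_obs in observations_by_team.items():
        players, _stat = split_team_obs(team_obs)
        actions_by_team[team_id] = {
            pid: policy(ob) for pid, ob in players.items()
        }
    return actions_by_team


# Mock observations: two teams of two players, placeholder strings as obs.
mock_observations = {
    0: {0: 'obs_t0_p0', 1: 'obs_t0_p1', 'stat': {'placeholder': True}},
    1: {0: 'obs_t1_p0', 1: 'obs_t1_p1'},
}
# A policy that sends no actions for any player.
team_actions = act_for_teams(mock_observations, policy=lambda ob: {})
print(team_actions)
```

In real use, `policy` would return a per-player action dictionary of the form shown in the Action Space section, and the result would be passed to `env.step`.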