Load Predefined Grid World Environments

Reinforcement Learning Toolbox™ software provides several predefined grid world environments for which the actions, observations, rewards, and dynamics are already defined. You can use these environments to:

  • Learn reinforcement learning concepts.

  • Gain familiarity with Reinforcement Learning Toolbox software features.

  • Test your own reinforcement learning agents.

You can load the following predefined MATLAB® grid world environments using the rlPredefinedEnv function.

  • Basic grid world: Move from a starting location to a target location on a two-dimensional grid by selecting moves from the discrete action space {N,S,E,W}.

  • Waterfall grid world: Move from a starting location to a target location on a larger two-dimensional grid with unknown deterministic or stochastic dynamics.
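
Each environment is created by passing its keyword string to rlPredefinedEnv, as shown in the sections that follow. Collected in one place, a minimal sketch:

% Load each predefined grid world environment by its keyword
% (the same keywords used later on this page).
basicEnv      = rlPredefinedEnv('BasicGridWorld');
detWaterfall  = rlPredefinedEnv('WaterFallGridWorld-Deterministic');
stocWaterfall = rlPredefinedEnv('WaterFallGridWorld-Stochastic');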

For more information on the properties of grid world environments, see Create Custom Grid World Environments.

You can also load predefined MATLAB control system environments. For more information, see Load Predefined Control System Environments.

Basic Grid World

The basic grid world environment is a two-dimensional 5-by-5 grid with a starting location, terminal location, and obstacles. The environment also contains a special jump from state [2,4] to state [4,4]. The goal of the agent is to move from the starting location to the terminal location while avoiding obstacles and maximizing the total reward.

To create a basic grid world environment, use the rlPredefinedEnv function. This function creates an rlMDPEnv object representing the grid world.

env = rlPredefinedEnv('BasicGridWorld');
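
To see what the environment expects from an agent, you can query its observation and action specifications and step it manually. A minimal sketch, assuming the environment object supports the standard getObservationInfo, getActionInfo, reset, and step calls, and that the four discrete actions are encoded as the indices 1 through 4:

% Inspect the finite observation (state) and action specifications.
obsInfo = getObservationInfo(env)
actInfo = getActionInfo(env)

% Reset to the initial state, then take one action and observe the result.
% The mapping of indices 1-4 to the moves {N,S,E,W} is an assumption here.
initialObs = reset(env);
[nextObs, reward, isDone] = step(env, 1);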

You can visualize the grid world environment using the plot function.

  • The agent location is a red circle. By default, the agent starts in state [1,1].

  • The terminal location is a blue square.

  • The obstacles are black squares.

plot(env)

Basic 5-by-5 grid world with the agent (indicated by a red circle) in the top left corner, the terminal location (indicated by a blue square) in the bottom right corner, and four obstacle squares, in black, in the middle.

Actions

The agent can move in one of four possible directions (north, south, east, or west).

Rewards

The agent receives the following rewards or penalties:

  • 10 reward for reaching the terminal state at [5,5]

  • 5 reward for jumping from state [2,4] to state [4,4]

  • -1 penalty for every other action
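
Because both the states and actions are finite, this environment is well suited to a tabular agent. A minimal training sketch, assuming a tabular Q-learning agent; the object and option names below follow recent Reinforcement Learning Toolbox releases (older releases use rlQValueRepresentation instead of rlQValueFunction, and may not accept the name=value argument syntax):

% Build a table-based Q-value critic over the finite state and action spaces.
obsInfo = getObservationInfo(env);
actInfo = getActionInfo(env);
qTable  = rlTable(obsInfo, actInfo);
critic  = rlQValueFunction(qTable, obsInfo, actInfo);

% Create a Q-learning agent from the critic.
agent = rlQAgent(critic);

% Train, stopping once the average reward is high (the threshold here is illustrative).
trainOpts = rlTrainingOptions( ...
    MaxEpisodes=200, ...
    MaxStepsPerEpisode=50, ...
    StopTrainingCriteria="AverageReward", ...
    StopTrainingValue=10);

trainingStats = train(agent, env, trainOpts);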

Deterministic Waterfall Grid Worlds

The deterministic waterfall grid world environment is a two-dimensional 8-by-7 grid with a starting location and terminal location. The environment includes a waterfall that pushes the agent toward the bottom of the grid. The goal of the agent is to move from the starting location to the terminal location while maximizing the total reward.

To create a deterministic waterfall grid world, use the rlPredefinedEnv function. This function creates an rlMDPEnv object representing the grid world.

env = rlPredefinedEnv('WaterFallGridWorld-Deterministic');

As with the basic grid world, you can visualize the environment, where the agent is a red circle and the terminal location is a blue square.

plot(env)

8-by-7 grid world with the agent positioned on the left and the terminal location in the middle.

Actions

The agent can move in one of four possible directions (north, south, east, or west).

Rewards

The agent receives the following rewards or penalties:

  • 10 reward for reaching the terminal state at [4,5]

  • -1 penalty for every other action

Waterfall Dynamics

In this environment, a waterfall pushes the agent toward the bottom of the grid.

8-by-7 grid world with blue arrows indicating a waterfall that pushes the agent position downward.

The intensity of the waterfall varies between the columns, as shown at the top of the preceding figure. When the agent moves into a column with a nonzero intensity, the waterfall pushes it downward by the indicated number of squares. For example, if the agent goes east from state [5,2], it reaches state [7,3].
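
You can confirm this deterministic push by inspecting the transition probabilities of the underlying Markov decision process. A minimal sketch, assuming the rlMDPEnv object exposes the grid world model through its Model property, that the model stores state names in States and transition probabilities in T, and that action index 3 corresponds to east in the {N,S,E,W} ordering:

% Check the waterfall transition east from [5,2] to [7,3].
env  = rlPredefinedEnv('WaterFallGridWorld-Deterministic');
mdp  = env.Model;                          % underlying grid world model (assumed property)

from = find(mdp.States == "[5,2]");        % index of the source state
to   = find(mdp.States == "[7,3]");        % index of the expected landing state
east = 3;                                  % assumed index of action 'E'

% For the deterministic waterfall, this probability should be 1.
prob = mdp.T(from, to, east)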

Stochastic Waterfall Grid Worlds

The stochastic waterfall grid world environment is a two-dimensional 8-by-7 grid with a starting location and terminal locations. The environment includes a waterfall that pushes the agent toward the bottom of the grid with a stochastic intensity. The goal of the agent is to move from the starting location to the target terminal location while avoiding the penalty terminal states along the bottom of the grid and maximizing the total reward.

To create a stochastic waterfall grid world, use the rlPredefinedEnv function. This function creates an rlMDPEnv object representing the grid world.

env = rlPredefinedEnv('WaterFallGridWorld-Stochastic');

As with the basic grid world, you can visualize the environment, where the agent is a red circle and the terminal locations are blue squares.

plot(env)

8-by-7 grid world with terminal locations indicated by blue squares in the bottom row.

Actions

The agent can move in one of four possible directions (north, south, east, or west).

Rewards

The agent receives the following rewards or penalties:

  • 10 reward for reaching the terminal state at [4,5]

  • -10 penalty for reaching any terminal state in the bottom row of the grid

  • -1 penalty for every other action

Waterfall Dynamics

In this environment, a waterfall pushes the agent toward the bottom of the grid with a stochastic intensity. The baseline intensity matches the intensity of the deterministic waterfall environment. However, in the stochastic waterfall case, the agent has an equal chance of experiencing the indicated intensity, one level above that intensity, or one level below that intensity. For example, if the agent goes east from state [5,2], it has an equal chance of reaching state [6,3], [7,3], or [8,3].
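
You can inspect these stochastic transitions in the same way as the deterministic case. A minimal sketch, under the same assumptions as above about the Model, States, and T properties and the action ordering:

% List the possible landing states and their probabilities when moving
% east from [5,2]; the destinations [6,3], [7,3], and [8,3] should each
% appear with probability of about 1/3.
env  = rlPredefinedEnv('WaterFallGridWorld-Stochastic');
mdp  = env.Model;                          % underlying grid world model (assumed property)

from = find(mdp.States == "[5,2]");
east = 3;                                  % assumed index of action 'E'

p     = squeeze(mdp.T(from, :, east));     % transition probabilities from [5,2] going east
dests = mdp.States(p > 0)                  % reachable states
probs = p(p > 0)                           % their probabilities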
