Reinforcement learning environment with a dynamic model implemented in Simulink
Since R2019a
Description
The SimulinkEnvWithAgent object represents a reinforcement learning environment that uses a dynamic model implemented in Simulink®. The environment object acts as an interface: when you call sim or train, these functions in turn call the Simulink model to generate experiences for the agents.
Creation
To create a SimulinkEnvWithAgent object, use one of the following functions.
rlSimulinkEnv - Create an environment using a Simulink model with at least one RL Agent block.
createIntegratedEnv - Use a reference model as a reinforcement learning environment.
rlPredefinedEnv - Create a predefined reinforcement learning environment.
Properties
Model - Simulink model name
string | character vector
Simulink model name, specified as a string or character vector. The specified model must contain one or more RL Agent blocks.
AgentBlock - Agent block paths
string | string array
Agent block paths, specified as a string or string array.
If Model contains a single RL Agent block for training, then AgentBlock is a string containing the block path.
If Model contains multiple RL Agent blocks for training, then AgentBlock is a string array, where each element contains the path of one agent block.
Model can contain RL Agent blocks whose paths are not included in AgentBlock. Such agent blocks behave as part of the environment and select actions based on their current policies. When you call sim or train, the experiences of these agents are not returned and their policies are not updated.
The agent blocks can be inside a model reference. For more information on configuring an agent block for reinforcement learning, see RL Agent.
ResetFcn - Reset behavior for environment
function handle | anonymous function handle
Reset behavior for the environment, specified as a function handle or anonymous function handle. The function must have a single Simulink.SimulationInput input argument and a single Simulink.SimulationInput output argument.
The reset function sets the initial state of the Simulink environment. For example, you can create a reset function that randomizes certain block states so that each training episode begins from different initial conditions.
If you have an existing reset function myResetFunction on the MATLAB® path, set ResetFcn using a handle to the function.
env.ResetFcn = @(in)myResetFunction(in);
If your reset behavior is simple, you can implement it using an anonymous function handle. For example, the following code sets the variable x0 to a random value.
env.ResetFcn = @(in) setVariable(in,'x0',rand());
The sim function calls the reset function to reset the environment at the start of each simulation, and the train function calls it at the start of each training episode.
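When the reset logic needs more than one statement, a named function is usually easier to read than an anonymous handle. The following is a minimal sketch of what a function such as myResetFunction might look like. The model name "myModel", the second variable y0, and the sampling ranges are assumptions made for illustration; they do not come from any shipped example.

```matlab
% Build a SimulationInput for a hypothetical model name, then apply the reset.
in = Simulink.SimulationInput("myModel");
in = myResetFunction(in);

% Inspect the variables that the reset function attached to the input.
for k = 1:numel(in.Variables)
    fprintf("%s = %.3f\n", in.Variables(k).Name, in.Variables(k).Value);
end

function in = myResetFunction(in)
% Hypothetical reset function: one Simulink.SimulationInput in, one out.
    x0 = -1 + 2*rand();            % uniform sample in [-1, 1] (assumed range)
    y0 = 0.5*randn();              % zero-mean Gaussian sample (assumed spread)
    in = setVariable(in,"x0",x0);  % override x0 for the next run
    in = setVariable(in,"y0",y0);  % override y0 for the next run
end
```

With this function on the path, env.ResetFcn = @myResetFunction; is equivalent to the wrapped handle form, since the signature already matches.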
UseFastRestart - Option to toggle fast restart
"on" (default) | "off"
Option to toggle fast restart, specified as either "on" or "off". Fast restart allows you to perform iterative simulations without compiling a model or terminating the simulation each time.
For more information on fast restart, see the Simulink documentation.
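As a brief sketch, the property can be changed with dot notation after the environment is created. The predefined pendulum environment is used here only as a convenient way to obtain an environment object. Turning fast restart off is one option to consider when something that cannot be changed without recompiling the model has to differ between episodes, at the cost of slower episode startup.

```matlab
% Create a predefined Simulink environment (fast restart is on by default).
env = rlPredefinedEnv("SimplePendulumModel-Continuous");

% Turn fast restart off so the model is compiled again for each simulation.
env.UseFastRestart = "off";

% Display the current setting.
disp(env.UseFastRestart)
```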
Object Functions
train | Train reinforcement learning agents within a specified environment |
sim | Simulate trained reinforcement learning agents within a specified environment |
getObservationInfo | Obtain observation data specifications from reinforcement learning environment, agent, or experience buffer |
getActionInfo | Obtain action data specifications from reinforcement learning environment, agent, or experience buffer |
Examples
Create Simulink Environment Using Agent in Workspace
Create a Simulink environment using the trained agent and corresponding Simulink model from the Create Simulink Environment and Train Agent example.
Load the agent in the MATLAB® workspace.
load rlWaterTankDDPGAgent
Create an environment for the rlwatertank model, which contains an RL Agent block. Since the agent used by the block is already in the workspace, you do not need to pass the observation and action specifications to create the environment.
env = rlSimulinkEnv("rlwatertank","rlwatertank/RL Agent")
env = 
SimulinkEnvWithAgent with properties:
           Model : rlwatertank
      AgentBlock : rlwatertank/RL Agent
        ResetFcn : []
  UseFastRestart : on
Validate the environment by performing a short simulation for two sample times.
validateEnvironment(env)
You can now train and simulate the agent within the environment by using train and sim, respectively.
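A simulation of the loaded agent could look like the following sketch. It assumes that the MAT file loads the agent into a variable named agent (the name the RL Agent block is expected to reference) and that 200 steps is a reasonable episode length for this model; both are assumptions, so adjust them to your setup.

```matlab
% Load the trained agent; assumed to create a workspace variable named agent.
load rlWaterTankDDPGAgent

% Create the environment from the model and the agent block path.
env = rlSimulinkEnv("rlwatertank","rlwatertank/RL Agent");

% Limit the simulation length (200 steps is an assumed value).
simOpts = rlSimulationOptions("MaxSteps",200);

% Run one simulation and collect the logged experience.
experience = sim(env,agent,simOpts);

% The reward is logged as a timeseries; sum it to get the episode reward.
totalReward = sum(experience.Reward.Data);
disp(totalReward)
```

The same env object can be passed to train together with an rlTrainingOptions object to continue training instead of simulating.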
Create Reinforcement Learning Environment for Simulink Model
For this example, consider the rlSimplePendulumModel Simulink® model. The model is a simple frictionless pendulum that initially hangs in a downward position.
Open the model.
mdl = "rlSimplePendulumModel";
open_system(mdl)
Create rlNumericSpec and rlFiniteSetSpec objects for the observation and action information, respectively.
The observation is a vector containing three signals: the sine, cosine, and time derivative of the angle.
obsInfo = rlNumericSpec([3 1])
obsInfo = 
rlNumericSpec with properties:
   LowerLimit: -Inf
   UpperLimit: Inf
         Name: [0x0 string]
  Description: [0x0 string]
    Dimension: [3 1]
     DataType: "double"
The action is a scalar expressing the torque and can take one of three possible values: -2 Nm, 0 Nm, and 2 Nm.
actInfo = rlFiniteSetSpec([-2 0 2])
actInfo = 
rlFiniteSetSpec with properties:
     Elements: [3x1 double]
         Name: [0x0 string]
  Description: [0x0 string]
    Dimension: [1 1]
     DataType: "double"
You can use dot notation to assign property values for the rlNumericSpec and rlFiniteSetSpec objects.
obsInfo.Name = "observations";
actInfo.Name = "torque";
Assign the agent block path information, and create the reinforcement learning environment for the Simulink model using the information extracted in the previous steps.
agentBlk = mdl + "/RL Agent";
env = rlSimulinkEnv(mdl,agentBlk,obsInfo,actInfo)
env = 
SimulinkEnvWithAgent with properties:
           Model : rlSimplePendulumModel
      AgentBlock : rlSimplePendulumModel/RL Agent
        ResetFcn : []
  UseFastRestart : on
You can also specify a reset function using dot notation. For this example, randomly initialize theta0 in the model workspace.
env.ResetFcn = @(in) setVariable(in,"theta0",randn,"Workspace",mdl)
env = 
SimulinkEnvWithAgent with properties:
           Model : rlSimplePendulumModel
      AgentBlock : rlSimplePendulumModel/RL Agent
        ResetFcn : @(in)setVariable(in,"theta0",randn,"Workspace",mdl)
  UseFastRestart : on
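To check that the environment carries the specifications you passed in, you can read them back with the getObservationInfo and getActionInfo object functions. The sketch below rebuilds the environment so that it runs on its own; the variable names obsFromEnv and actFromEnv are chosen here for illustration.

```matlab
% Recreate the environment from the pendulum model and its specifications.
mdl = "rlSimplePendulumModel";
obsInfo = rlNumericSpec([3 1]);
actInfo = rlFiniteSetSpec([-2 0 2]);
env = rlSimulinkEnv(mdl,mdl + "/RL Agent",obsInfo,actInfo);

% Read the specifications back from the environment object.
obsFromEnv = getObservationInfo(env);
actFromEnv = getActionInfo(env);

% Show the observation dimension and the allowed torque values.
disp(obsFromEnv.Dimension)
disp(actFromEnv.Elements.')
```

These are typically the specification objects you would then use when constructing an agent for this environment.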
Create Simulink Environment for Multiple Agents
Create an environment for the Simulink model from the example Train Multiple Agents to Perform Collaborative Task.
Load the agents in the MATLAB workspace.
load rlCollaborativeTaskAgents
Create an environment for the rlCollaborativeTask model, which has two agent blocks. Since the agents used by the two blocks (agentA and agentB) are already in the workspace, you do not need to pass their observation and action specifications to create the environment.
env = rlSimulinkEnv( ...
    "rlCollaborativeTask", ...
    ["rlCollaborativeTask/Agent A","rlCollaborativeTask/Agent B"])
env = 
SimulinkEnvWithAgent with properties:
           Model : rlCollaborativeTask
      AgentBlock : [
                     rlCollaborativeTask/Agent A
                     rlCollaborativeTask/Agent B
                   ]
        ResetFcn : []
  UseFastRestart : on
You can now simulate or train the agents within the environment using sim or train, respectively.
Create Continuous Simple Pendulum Model Environment
Use the predefined "SimplePendulumModel-Continuous" keyword to create a continuous simple pendulum model reinforcement learning environment.
env = rlPredefinedEnv("SimplePendulumModel-Continuous")
env = 
SimulinkEnvWithAgent with properties:
           Model : rlSimplePendulumModel
      AgentBlock : rlSimplePendulumModel/RL Agent
        ResetFcn : []
  UseFastRestart : on
Create Environment from Simulink Model
This example shows how to use createIntegratedEnv to create an environment object starting from a Simulink model that implements the system with which the agent will interact, and that does not have an agent block. Such a system is often referred to as the plant, open-loop system, or reference system, while the whole (integrated) system that includes the agent is often referred to as the closed-loop system.
For this example, use the flying robot model described in Train DDPG Agent to Control Sliding Robot as the reference (open-loop) system.
Open the flying robot model.
open_system("rlFlyingRobotEnv")
Initialize the state variables and sample time.
% initial model state variables
theta0 = 0;
x0 = -15;
y0 = 0;
% sample time
Ts = 0.4;
Create the Simulink model myIntegratedEnv containing the flying robot model connected in a closed loop to the agent block. The function also returns the reinforcement learning environment object env to be used for training.
env = createIntegratedEnv( ...
    "rlFlyingRobotEnv", ...
    "myIntegratedEn")
env = 
SimulinkEnvWithAgent with properties:
           Model : myIntegratedEn
      AgentBlock : myIntegratedEn/RL Agent
        ResetFcn : []
  UseFastRestart : on
The function can also return the block path to the RL Agent block in the new integrated model, as well as the observation and action specifications for the reference model.
[~,agentBlk,observationInfo,actionInfo] = ...
    createIntegratedEnv( ...
    "rlFlyingRobotEnv","myIntegratedEnv")
agentBlk = 
"myIntegratedEnv/RL Agent"
observationInfo = 
rlNumericSpec with properties:
   LowerLimit: -Inf
   UpperLimit: Inf
         Name: "observation"
  Description: [0x0 string]
    Dimension: [7 1]
     DataType: "double"
actionInfo = 
rlNumericSpec with properties:
   LowerLimit: -Inf
   UpperLimit: Inf
         Name: "action"
  Description: [0x0 string]
    Dimension: [2 1]
     DataType: "double"
Returning the block path and specifications is useful when you need to modify descriptions, limits, or names in observationInfo and actionInfo. After modifying the specifications, you can then create an environment from the integrated model myIntegratedEnv using the rlSimulinkEnv function.
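The workflow described above can be sketched as follows. The model name "myIntegratedEnvSketch", the action limits of -1 and 1, and the new observation name are assumptions chosen for illustration, and a separate model name is used here so that the sketch does not collide with a model created earlier.

```matlab
% Variables that the flying robot reference model expects in the workspace.
theta0 = 0;
x0 = -15;
y0 = 0;
Ts = 0.4;

% Create an integrated model under a hypothetical name and keep the
% block path and specifications instead of the environment object.
newMdl = "myIntegratedEnvSketch";
[~,agentBlk,observationInfo,actionInfo] = ...
    createIntegratedEnv("rlFlyingRobotEnv",newMdl);

% Modify the specifications (assumed values, for illustration only).
actionInfo.LowerLimit = -1;
actionInfo.UpperLimit = 1;
observationInfo.Name = "robot states";

% Create the environment from the integrated model with the edited specs.
env = rlSimulinkEnv(newMdl,agentBlk,observationInfo,actionInfo);
```

Creating the environment from the edited specifications keeps the limits and names seen by the agent consistent with the ones you intend to train against.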
Version History
Introduced in R2019a