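Create environment object from a Simulink environment model that does not contain an agent block
Since R2019a
Syntax
env = createIntegratedEnv(refModel,newModel)
[env,agentBlock,obsInfo,actInfo] = createIntegratedEnv(___)
[___] = createIntegratedEnv(___,Name=Value)
Description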
Given a Simulink® environment model that does not include your agent block, the createIntegratedEnv function generates a new closed-loop Simulink model that contains an agent block and references your original environment model from its environment block. The function also returns an environment object that you can use for training and simulation. The environment object acts as an interface: when you call sim or train, these functions in turn call the created (and compiled) Simulink model to generate experiences for the agents.
To create an environment object from a Simulink model that already includes an agent block, use rlSimulinkEnv instead. For more information on Simulink reinforcement learning environments, see Create Custom Simulink Environments.
env = createIntegratedEnv(refModel,newModel) creates a Simulink model with the name specified by newModel and returns a reinforcement learning environment object, env, for this model. The new model contains an RL Agent block and references refModel within its environment block. For more information on model referencing, see the Simulink documentation.
[env,agentBlock,obsInfo,actInfo] = createIntegratedEnv(___) also returns the block path to the RL Agent block in the new model, agentBlock, and the observation and action data specifications for the reference model, obsInfo and actInfo, respectively.
[___] = createIntegratedEnv(___,Name=Value) creates a model and environment interface using port, observation, and action sets specified using one or more Name=Value arguments.
Examples
Create Environment from Simulink Model
This example shows how to use createIntegratedEnv to create an environment object starting from a Simulink model that implements the system with which the agent will interact and that does not contain an agent block. Such a system is often referred to as the plant, open-loop system, or reference system, while the whole (integrated) system that includes the agent is often referred to as the closed-loop system.
For this example, use the flying robot model described in Train DDPG Agent to Control Sliding Robot as the reference (open-loop) system.
Open the flying robot model.
open_system("rlFlyingRobotEnv")
Initialize the state variables and sample time.
% initial model state variables
theta0 = 0;
x0 = -15;
y0 = 0;
% sample time
Ts = 0.4;
Create the Simulink model myIntegratedEnv containing the flying robot model connected in a closed loop to the agent block. The function also returns the reinforcement learning environment object env to be used for training.
env = createIntegratedEnv( ...
    "rlFlyingRobotEnv", ...
    "myIntegratedEnv")
env =
SimulinkEnvWithAgent with properties:
           Model : myIntegratedEnv
      AgentBlock : myIntegratedEnv/RL Agent
        ResetFcn : []
  UseFastRestart : on
The function can also return the block path to the RL Agent block in the new integrated model, as well as the observation and action specifications for the reference model.
[~,agentBlk,observationInfo,actionInfo] = ...
    createIntegratedEnv( ...
    "rlFlyingRobotEnv","myIntegratedEnv")
agentBlk =
"myIntegratedEnv/RL Agent"
observationInfo =
  rlNumericSpec with properties:
     LowerLimit: -Inf
     UpperLimit: Inf
           Name: "observation"
    Description: [0x0 string]
      Dimension: [7 1]
       DataType: "double"
actionInfo =
  rlNumericSpec with properties:
     LowerLimit: -Inf
     UpperLimit: Inf
           Name: "action"
    Description: [0x0 string]
      Dimension: [2 1]
       DataType: "double"
Returning the block path and specifications is useful when you need to modify descriptions, limits, or names in observationInfo and actionInfo. After modifying the specifications, you can then create an environment from the integrated model myIntegratedEnv using the rlSimulinkEnv function.
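The workflow just described might look like the following sketch. It reuses the flying robot variables from this example. The model name mySketchEnv and the names and limits assigned to the specifications are illustrative assumptions, not values taken from the example.

```matlab
% Variables required by the flying robot reference model (from this example).
theta0 = 0;
x0 = -15;
y0 = 0;
Ts = 0.4;

% Create the integrated model and retrieve the agent block path and the
% specifications. "mySketchEnv" is a hypothetical model name for this sketch.
[~,agentBlk,obsInfo,actInfo] = createIntegratedEnv( ...
    "rlFlyingRobotEnv","mySketchEnv");

% Modify the specifications (illustrative values only).
obsInfo.Name = "robot states";
actInfo.Name = "thrusts";
actInfo.LowerLimit = -1;
actInfo.UpperLimit = 1;

% Create a new environment object from the integrated model, this time
% using the modified specifications.
env = rlSimulinkEnv("mySketchEnv",agentBlk,obsInfo,actInfo);
```

Calling rlSimulinkEnv on the generated model, rather than calling createIntegratedEnv a second time, avoids generating another model and lets the environment object carry the edited specifications.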
Create Integrated Environment with Specified Port Names
Open the open-loop water tank model.
open_system("rlWatertankOpenloop")
Set the sample time of the discrete integrator block used to generate the observation, so the simulation can run.
Ts = 1;
Call createIntegratedEnv using name-value arguments to specify port names. The first argument of createIntegratedEnv is the name of the reference Simulink model that contains the system with which the agent must interact. Such a system is often referred to as the plant or open-loop system.
For this example, the reference system is the model of a water tank. The input port is called u (instead of action), and the first and third output ports are called y and stop (instead of observation and isdone). Specify the port names using name-value arguments.
env = createIntegratedEnv("rlWatertankOpenloop","IntegratedWatertank", ...
    ActionPortName="u",ObservationPortName="y",IsDonePortName="stop")
env =
SimulinkEnvWithAgent with properties:
           Model : IntegratedWatertank
      AgentBlock : IntegratedWatertank/RL Agent
        ResetFcn : []
  UseFastRestart : on
The new model IntegratedWatertank contains the reference model connected in a closed loop with the agent block. The function also returns the reinforcement learning environment object to be used for training.
Input Arguments
refModel - Name of environment reference model
string | character vector
Name of environment reference model, specified as a string or character vector. This is the Simulink model implementing the environment that the agent interacts with. Such a system is often referred to as the plant or open-loop system, while the whole (integrated) system that includes both agent and environment is often referred to as the closed-loop system. The generated model newModel is a closed-loop system that contains an RL Agent block and references refModel within its environment block. For more information on model referencing, see the Simulink documentation.
Note
The reward signal at time t must be the one corresponding to the transition between the observation output at time t-1 and the observation output at time t. Therefore, the observation output of the environment corresponds to the signal called next observation in the agent-environment illustration presented in Reinforcement Learning Environments.
If your observation contains multiple channels, group the signals carried by the channels into a single observation bus. For more information about bus signals, see the Simulink documentation.
Note
To avoid (potentially unsolvable) algebraic loops, you must avoid any direct feedthrough (that is, any direct dependency in the same time step) from the action to any of the output signals. This is because Simulink treats the agent block that is eventually added to the model as having direct feedthrough from all its inputs (that is, the action output at a given time step is considered to be directly dependent on the observation, reward, and is-done inputs at the same time step). Additionally, if the environment block is a referenced subsystem, it is also normally treated as a direct feedthrough block unless the Minimize algebraic loop occurrences parameter is enabled.
In general, adding a Delay or Memory block to the action signal between the agent and environment blocks removes the algebraic loop. Alternatively, you can add Delay or Memory blocks to all the environment output signals. For more information on algebraic loops and how to remove some of them, see the Simulink documentation.
newModel - Name of the generated model
string | character vector
Name of the generated model, specified as a string or character vector. createIntegratedEnv creates a Simulink model with this name, but does not save the model. The created model newModel is a closed-loop system that contains an RL Agent block and references refModel within its environment block. For more information on model referencing, see the Simulink documentation.
Name-Value Arguments
Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.
Before R2021a, use commas to separate each name and value, and enclose Name in quotes.
Example: IsDonePortName="stopSim" sets the stopSim port of the reference model as the source of the isdone signal.
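As a sketch of the two calling conventions described above, the following uses the water tank model and port names from the example on this page. The model name nvSketchWatertank is a hypothetical name chosen for illustration.

```matlab
% Sample time used by the water tank reference model (see the example above).
Ts = 1;

% R2021a and later: Name=Value syntax.
% "nvSketchWatertank" is a hypothetical model name for this sketch.
env = createIntegratedEnv("rlWatertankOpenloop","nvSketchWatertank", ...
    ActionPortName="u",ObservationPortName="y",IsDonePortName="stop");

% Before R2021a: the equivalent call separates each name and value with a
% comma and encloses the name in quotes. It is left commented out so that
% this sketch does not try to create the same model twice.
% env = createIntegratedEnv("rlWatertankOpenloop","nvSketchWatertank", ...
%     "ActionPortName","u","ObservationPortName","y","IsDonePortName","stop");
```

Because the order of the pairs does not matter, the three port arguments can be listed in any order after refModel and newModel.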
ObservationPortName - Name of the observation output port in the environment reference model
"observation" (default) | string | character vector
Name of the observation output port in the environment reference model, specified as a string or character vector. Specify ObservationPortName when the name of the observation output port of the reference model is not "observation".
Example: ObservationPortName="x"
ActionPortName - Name of the action input port in the environment reference model
"action" (default) | string | character vector
Name of the action input port in the environment reference model, specified as a string or character vector. Specify ActionPortName when the name of the action input port of the reference model is not "action".
Example: ActionPortName="u"
RewardPortName - Name of the reward output port in the environment reference model
"reward" (default) | string | character vector
Name of the reward output port in the environment reference model, specified as a string or character vector. Specify RewardPortName when the name of the reward output port of the reference model is not "reward".
Example: RewardPortName="r"
IsDonePortName - Name of the is-done output port in the environment reference model
"isdone" (default) | string | character vector
Name of the is-done output port in the environment reference model, specified as a string or character vector. Specify IsDonePortName when the name of the is-done flag output port of the reference model is not "isdone".
Example: IsDonePortName="done"
ObservationBusElementNames - Names of observation bus leaf elements
string array
Names of observation bus leaf elements for which to create specifications, specified as a string array. To create observation specifications for a subset of the elements in a Simulink bus object, specify ObservationBusElementNames. If you do not specify ObservationBusElementNames, a data specification is created for each leaf element in the bus.
ObservationBusElementNames is applicable only when the observation output port is a bus signal.
Example: ObservationBusElementNames=["sin" "cos"] creates specifications for the observation bus elements with the names "sin" and "cos".
ObservationDiscreteElements - Elements of finite observation set
cell array of name-value pairs
Elements of finite observation set, specified as a cell array of name-value pairs. Each name-value pair consists of an element name and an array of discrete values.
If the observation output port of the reference model is:
- A bus signal, specify the name of one of the leaf elements of the bus specified by ObservationBusElementNames
- A nonbus signal, specify the name of the observation port, as specified by ObservationPortName
The specified discrete values must be castable to the data type of the observation signal arriving at the observation output port in the environment reference model.
If you do not specify discrete values for an observation channel, the signals carried by the channel are continuous.
Example: ObservationDiscreteElements={"observation",[-1 0 1]} specifies discrete values for a nonbus observation signal with port name "observation".
Example: ObservationDiscreteElements={"gear",[-1 0 1 2],"direction",[1 2 3 4]} specifies discrete values for the "gear" and "direction" leaf elements of a bus observation signal.
ActionDiscreteElements - Elements of finite action set
cell array of name-value pairs
Elements of finite action set, specified as a cell array of name-value pairs. Each name-value pair consists of an element name and an array of discrete values.
If the action input port of the reference model is:
- A bus signal, specify the name of a leaf element of the bus
- A nonbus signal, specify the name of the action port, as specified by ActionPortName
The specified discrete values must be castable to the data type of the action signal that can be accepted by the action input port in the environment reference model.
If you do not specify discrete values for the action channel, the signals carried by the channel are continuous.
Example: ActionDiscreteElements={"action",[-1 0 1]} specifies discrete values for a nonbus action signal with port name "action".
Example: ActionDiscreteElements={"force",[-10 0 10],"torque",[-5 0 5]} specifies discrete values for the "force" and "torque" leaf elements of a bus action signal.
Note
While creating an integrated environment with more than one action channel is possible, Reinforcement Learning Toolbox™ agents only allow a single action channel.
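To illustrate ActionDiscreteElements for a nonbus action signal, the following sketch reuses the water tank model and port names from the example on this page. The model name discreteSketchWatertank and the discrete values [0 5 10] are assumptions chosen only for illustration.

```matlab
% Sample time used by the water tank reference model (see the example above).
Ts = 1;

% The action port of the water tank model is named "u", so the element
% name in ActionDiscreteElements is the port name "u".
% "discreteSketchWatertank" and the values [0 5 10] are hypothetical.
[env,agentBlk,obsInfo,actInfo] = createIntegratedEnv( ...
    "rlWatertankOpenloop","discreteSketchWatertank", ...
    ActionPortName="u",ObservationPortName="y",IsDonePortName="stop", ...
    ActionDiscreteElements={"u",[0 5 10]});

% Inspect the returned action specification. For a single discrete action
% channel, the function is documented to return a finite-set specification.
disp(class(actInfo))
disp(actInfo.Elements)
```

The observation port is left without discrete values here, so its specification remains continuous.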
Output Arguments
env - Reinforcement learning environment
SimulinkEnvWithAgent object
Reinforcement learning environment interface, returned as a SimulinkEnvWithAgent object. You can use this object to train and simulate agents in the same way as with any other environment.
For more information on reinforcement learning environments, see Create Custom Simulink Environments.
agentBlock - Block path to the agent block
string
Block path to the agent block in the new model, returned as a string. To train an agent in the new Simulink model, you must create an agent and specify the agent name in the RL Agent block indicated by agentBlock.
For more information on creating agents, see Reinforcement Learning Agents.
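The step of creating an agent and specifying its name in the RL Agent block might be sketched as follows. This sketch assumes the flying robot model from the examples, a default DDPG agent created from the returned specifications (which requires Deep Learning Toolbox), and that the block parameter holding the agent variable name is called Agent. The model name agentSketchEnv is hypothetical.

```matlab
% Variables required by the flying robot reference model (from the example).
theta0 = 0;
x0 = -15;
y0 = 0;
Ts = 0.4;

% Create the integrated model. "agentSketchEnv" is a hypothetical name.
[env,agentBlk,obsInfo,actInfo] = createIntegratedEnv( ...
    "rlFlyingRobotEnv","agentSketchEnv");

% Create a default DDPG agent from the observation and action
% specifications (illustrative choice of agent type).
agent = rlDDPGAgent(obsInfo,actInfo);

% Specify the name of the agent workspace variable in the RL Agent block.
% Assumption: the relevant block parameter is named "Agent".
set_param(agentBlk,"Agent","agent");
```

With the block pointing at the agent variable, env and agent can then be passed to train or sim as described at the top of this page.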
obsInfo - Observation data specifications
rlNumericSpec object | rlFiniteSetSpec object | array of data specification objects
Observation data specifications, returned as one of the following:
- rlNumericSpec object for a single continuous observation specification
- rlFiniteSetSpec object for a single discrete observation specification
- Array of data specification objects for multiple specifications
actInfo - Action data specifications
rlNumericSpec object | rlFiniteSetSpec object | array of data specification objects
Action data specifications, returned as one of the following:
- rlNumericSpec object for a single continuous action specification
- rlFiniteSetSpec object for a single discrete action specification
- Array of data specification objects for multiple action specifications
Note
While creating an integrated environment with more than one action channel is possible, Reinforcement Learning Toolbox agents only allow a single action channel.
Version History
Introduced in R2019a