Create specifications object for a numeric action or observation channel

Since R2019a

Description

An rlNumericSpec object contains specifications for a channel that carries an action or observation belonging to a continuous (infinite) set.

Creation

Description

spec = rlNumericSpec(dimension) creates a data specification object for a continuous action or observation channel and sets the Dimension property.

spec = rlNumericSpec(dimension,Name=Value) creates the specification object spec and sets its properties using one or more name-value arguments.
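
As a minimal sketch of the two syntaxes above, the following creates one unbounded and one bounded channel. The variable names and the numeric bounds are illustrative assumptions, and the Name=Value form assumes MATLAB R2021a or later (in earlier releases, pass the pairs as "LowerLimit",-1 instead).

```matlab
% Unbounded 3-by-1 continuous observation channel (first syntax).
obsSpec = rlNumericSpec([3 1]);

% 2-by-1 continuous action channel, every entry bounded to [-1, 1]
% (second syntax; the bounds are illustrative values).
actSpec = rlNumericSpec([2 1], LowerLimit=-1, UpperLimit=1);

% Display the dimension stored in each specification.
disp(obsSpec.Dimension)
disp(actSpec.Dimension)
```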

Properties

LowerLimit

Lower limit of the data space, specified as a scalar or a matrix of the same size as the data space. When LowerLimit is specified as a scalar, rlNumericSpec applies it to all entries in the data space. DDPG, TD3, and SAC agents use this property to enforce lower limits on the action. When using other agents, if you need to enforce constraints on the action, you must do so within the environment.

Example: LowerLimit=-1

UpperLimit

Upper limit of the data space, specified as a scalar or a matrix of the same size as the data space. When UpperLimit is specified as a scalar, rlNumericSpec applies it to all entries in the data space. DDPG, TD3, and SAC agents use this property to enforce upper limits on the action. When using other agents, if you need to enforce constraints on the action, you must do so within the environment.

Example: UpperLimit=1
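
For agents other than DDPG, TD3, and SAC, the limits are left to the environment to enforce. One possible way to do that, sketched here under assumptions, is to saturate the incoming action against the limits stored in the specification. The variable names and the sample action are hypothetical; in a custom environment this line would typically sit at the top of your step function.

```matlab
% Action channel with a different range for each entry (illustrative values).
actInfo = rlNumericSpec([2 1], LowerLimit=[-1; -5], UpperLimit=[1; 5]);

% Hypothetical raw action received from an agent that does not enforce limits.
rawAction = [3; -0.5];

% Saturate each entry to its [LowerLimit, UpperLimit] range.
% This also works when the limits are scalars, through implicit expansion.
clampedAction = min(max(rawAction, actInfo.LowerLimit), actInfo.UpperLimit);
disp(clampedAction)
```

Here the first entry exceeds its upper limit and is reduced to 1, while the second entry is already within range and is unchanged.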

Name

Name of the rlNumericSpec object, specified as a string. Use this property to set a meaningful name for the signal carried by this data channel. The RL Agent block uses this property to match the bus signal elements with their corresponding environment channels.

Example: Name="observation"

Description

Description of the rlNumericSpec object, specified as a string. Use this property to give a meaningful description of the signal carried by this environment channel.

Example: Description="measured cart velocity in m/s"

Dimension

This property is read-only.

Dimension of the data space, specified as a numeric vector. This property is essential for creating agents and function approximator objects that work with a given environment.

Example: Dimension=[3 1]

DataType

This property is read-only.

Information about the type of data, specified as a string, such as "double" or "single". The software uses this property to enforce data type consistency for observations and actions.

Example: DataType="double"

Object Functions

rlSimulinkEnv - Create environment object from a Simulink model already containing agent and environment
rlFunctionEnv - Create custom reinforcement learning environment using your reset and step functions
rlValueFunction - Value function approximator object for reinforcement learning agents
rlQValueFunction - Q-value function approximator object for reinforcement learning agents
rlVectorQValueFunction - Vector Q-value function approximator for reinforcement learning agents
rlContinuousDeterministicActor - Deterministic actor with a continuous action space for reinforcement learning agents
rlDiscreteCategoricalActor - Stochastic categorical actor with a discrete action space for reinforcement learning agents
rlContinuousGaussianActor - Stochastic Gaussian actor with a continuous action space for reinforcement learning agents
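
The functions listed above take specification objects as input arguments. As a hedged sketch of that pattern, the following sizes a small network from the Dimension property and passes the specification to a value function approximator. The two-layer network is an illustrative assumption, not a recommended architecture, and the code assumes Deep Learning Toolbox and a release in which rlValueFunction is available.

```matlab
% Observation channel carrying a 3-by-1 continuous vector.
obsInfo = rlNumericSpec([3 1]);

% Illustrative network: the input size is taken from the specification,
% and the single output is the scalar state value.
layers = [
    featureInputLayer(prod(obsInfo.Dimension))
    fullyConnectedLayer(1)
    ];
net = dlnetwork(layers);

% Create the approximator, tying the network to the observation channel.
critic = rlValueFunction(net, obsInfo);

% Evaluate the critic for a random observation.
value = getValue(critic, {rand(obsInfo.Dimension)});
```

Reading the input size from obsInfo.Dimension rather than hard-coding it keeps the network consistent with the environment if the observation channel changes.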

Examples

For this example, consider the rlSimplePendulumModel Simulink® model. The model is a simple frictionless pendulum that initially hangs in a downward position.

Open the model.

mdl = "rlSimplePendulumModel";
open_system(mdl)

Create rlNumericSpec and rlFiniteSetSpec objects for the observation and action information, respectively.

The observation is a vector containing three signals: the sine, cosine, and time derivative of the angle.

obsInfo = rlNumericSpec([3 1]) 
obsInfo = 
  rlNumericSpec with properties:
     LowerLimit: -Inf
     UpperLimit: Inf
           Name: [0x0 string]
    Description: [0x0 string]
      Dimension: [3 1]
       DataType: "double"

The action is a scalar expressing the torque and can take one of three possible values: -2 Nm, 0 Nm, and 2 Nm.

actInfo = rlFiniteSetSpec([-2 0 2])
actInfo = 
  rlFiniteSetSpec with properties:
       Elements: [3x1 double]
           Name: [0x0 string]
    Description: [0x0 string]
      Dimension: [1 1]
       DataType: "double"

You can use dot notation to assign property values for the rlNumericSpec and rlFiniteSetSpec objects.

obsInfo.Name = "observations";
actInfo.Name = "torque";

Assign the agent block path information, and create the reinforcement learning environment for the Simulink model using the information extracted in the previous steps.

agentBlk = mdl + "/RL Agent";
env = rlSimulinkEnv(mdl,agentBlk,obsInfo,actInfo)
env = 
SimulinkEnvWithAgent with properties:
           Model : rlSimplePendulumModel
      AgentBlock : rlSimplePendulumModel/RL Agent
        ResetFcn : []
  UseFastRestart : on

You can also specify a reset function using dot notation. For this example, randomly initialize theta0 in the model workspace.

env.ResetFcn = @(in) setVariable(in,"theta0",randn,"Workspace",mdl)
env = 
SimulinkEnvWithAgent with properties:
           Model : rlSimplePendulumModel
      AgentBlock : rlSimplePendulumModel/RL Agent
        ResetFcn : @(in)setVariable(in,"theta0",randn,"Workspace",mdl)
  UseFastRestart : on

If your environment has an observation space consisting of multiple channels, some continuous and some discrete, use a vector of specification objects (each defining a single channel) to define the observation space.

For example, define an observation space consisting of four channels. The first channel carries a single number that can be 7, 9, 19, or -2. The second channel carries a vector over a continuous three-dimensional space. The third channel carries a two-by-two matrix that can be either the zero matrix or the identity matrix. Finally, the fourth channel carries a continuous matrix with four rows and three columns.

obsInfo = [  rlFiniteSetSpec([7 9 19 -2])
             rlNumericSpec([3 1])
             rlFiniteSetSpec({zeros(2), eye(2)})
             rlNumericSpec([4 3]) ]
obsInfo=4×1 object
  4x1 heterogeneous RLDataSpec (rlFiniteSetSpec, rlNumericSpec) array with properties:
    Name
    Description
    Dimension
    DataType

You can access each channel specification by indexing into the array, and read or set its properties using dot notation.

obsInfo(1)
ans = 
  rlFiniteSetSpec with properties:
       Elements: [4x1 double]
           Name: [0x0 string]
    Description: [0x0 string]
      Dimension: [1 1]
       DataType: "double"

obsInfo(2).Name = "velocity";
obsInfo(2).Description = "velocity vector in m/s in body reference frame";
obsInfo(2)
ans = 
  rlNumericSpec with properties:
     LowerLimit: -Inf
     UpperLimit: Inf
           Name: "velocity"
    Description: "velocity vector in m/s in body reference frame"
      Dimension: [3 1]
       DataType: "double"
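
When a specification array mixes continuous and discrete channels, it can be useful to pick out only the continuous ones. The following is a minimal sketch of one way to do this with a loop and isa; the variable names isContinuous and continuousChannels are hypothetical.

```matlab
% Four-channel observation space, as defined in the example above.
obsInfo = [  rlFiniteSetSpec([7 9 19 -2])
             rlNumericSpec([3 1])
             rlFiniteSetSpec({zeros(2), eye(2)})
             rlNumericSpec([4 3]) ];

% Flag the channels whose specification is an rlNumericSpec object.
isContinuous = false(size(obsInfo));
for k = 1:numel(obsInfo)
    isContinuous(k) = isa(obsInfo(k), "rlNumericSpec");
end

% Keep only the continuous channels (here, the second and fourth).
continuousChannels = obsInfo(isContinuous);
disp(continuousChannels(1).Dimension)
```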

Version History

Introduced in R2019a
