
rlFiniteSetSpec

Create specifications object for a finite-set action or observation channel

Since R2019a

Description

An rlFiniteSetSpec object contains specifications for a channel that carries an action or observation belonging to a finite set.

Creation

Description


spec = rlFiniteSetSpec(elements) creates a data specification object for a finite-set action or observation channel, setting the Elements property.

spec = rlFiniteSetSpec(elements,Name=Value) creates the specification object spec and sets its properties using one or more name-value arguments.

Properties

Elements

Set of valid actions or observations for the environment, specified as one of the following:

  • Vector — specify valid numeric values for a single action or single observation.

  • Cell array — specify valid numeric value combinations when you have more than one action or observation. Each entry of the cell array must have the same dimensions.

Example: Elements=[-2 -1 0 1 2]
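For instance, a minimal sketch of the two forms of Elements (the variable names here are illustrative, not part of the API):

```matlab
% Vector form: a single channel whose value is one scalar from the set.
% The resulting Dimension is [1 1].
scalarSpec = rlFiniteSetSpec([-2 -1 0 1 2]);

% Cell-array form: each element is one valid combination of values, and
% every cell entry must have the same size. Here each entry is a 1-by-2
% vector, so the resulting Dimension is [1 2].
pairSpec = rlFiniteSetSpec({[1 10],[1 20],[2 10],[2 20]});
```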

Name

Name of the rlFiniteSetSpec object, specified as a string. Use this property to set a meaningful name for the signal carried by this data channel. This property is used by the RL Agent block to match the bus signals with their corresponding environment channels.

Example: Name="action"

Description

Description of the rlFiniteSetSpec object, specified as a string. You can use this property to specify a meaningful description of the signal carried by this environment channel.

Example: Description="Applied force in N"

Dimension

This property is read-only.

Size of each element, specified as a vector.

If you specify Elements as a vector, then Dimension is [1 1]. Otherwise, if you specify a cell array, then Dimension indicates the size of the entries in Elements. This property is essential for creating agent and function approximator objects that work with a given environment.

Example: Dimension=[1 1]

DataType

This property is read-only.

Information about the type of data, specified as a string, such as "double" or "single". The software uses this property to enforce data type consistency for observations and actions.

Example: DataType="single"

Object Functions

rlSimulinkEnv - Create environment object from a Simulink model already containing agent and environment
rlFunctionEnv - Create custom reinforcement learning environment using your reset and step functions
rlValueFunction - Value function approximator object for reinforcement learning agents
rlQValueFunction - Q-value function approximator object for reinforcement learning agents
rlVectorQValueFunction - Vector Q-value function approximator for reinforcement learning agents
rlContinuousDeterministicActor - Deterministic actor with a continuous action space for reinforcement learning agents
rlDiscreteCategoricalActor - Stochastic categorical actor with a discrete action space for reinforcement learning agents
rlContinuousGaussianActor - Stochastic Gaussian actor with a continuous action space for reinforcement learning agents

Examples

For this example, consider the rlSimplePendulumModel Simulink® model. The model is a simple frictionless pendulum that initially hangs in a downward position.

Open the model.

mdl = "rlSimplePendulumModel";
open_system(mdl)

Create rlNumericSpec and rlFiniteSetSpec objects for the observation and action information, respectively.

The observation is a vector containing three signals: the sine, cosine, and time derivative of the angle.

obsInfo = rlNumericSpec([3 1])
obsInfo = 
  rlNumericSpec with properties:
     LowerLimit: -Inf
     UpperLimit: Inf
           Name: [0x0 string]
    Description: [0x0 string]
      Dimension: [3 1]
       DataType: "double"

The action is a scalar expressing the torque, and it can be one of three possible values: -2 Nm, 0 Nm, and 2 Nm.

actInfo = rlFiniteSetSpec([-2 0 2])
actInfo = 
  rlFiniteSetSpec with properties:
       Elements: [3x1 double]
           Name: [0x0 string]
    Description: [0x0 string]
      Dimension: [1 1]
       DataType: "double"

You can use dot notation to assign property values for the rlNumericSpec and rlFiniteSetSpec objects.

obsInfo.Name = "observations";
actInfo.Name = "torque";

Assign the agent block path information, and create the reinforcement learning environment for the Simulink model using the information extracted in the previous steps.

agentBlk = mdl + "/RL Agent";
env = rlSimulinkEnv(mdl,agentBlk,obsInfo,actInfo)
env = 
SimulinkEnvWithAgent with properties:
           Model : rlSimplePendulumModel
      AgentBlock : rlSimplePendulumModel/RL Agent
        ResetFcn : []
  UseFastRestart : on

You can also specify a reset function using dot notation. For this example, randomly initialize theta0 in the model workspace.

env.ResetFcn = @(in) setVariable(in,"theta0",randn,"Workspace",mdl)
env = 
SimulinkEnvWithAgent with properties:
           Model : rlSimplePendulumModel
      AgentBlock : rlSimplePendulumModel/RL Agent
        ResetFcn : @(in)setVariable(in,"theta0",randn,"Workspace",mdl)
  UseFastRestart : on

If your environment has an observation space consisting of multiple channels, some continuous and some discrete, use a vector of specification objects (each defining a single channel) to define the observation space.

For example, define an observation space consisting of four channels. The first carries a single number that can be 7, 9, 19, or -2. The second carries a vector over a continuous three-dimensional space. The third carries a 2-by-2 matrix that can be either the zero matrix or the identity matrix. Finally, the fourth channel carries a continuous matrix with four rows and three columns.

obsInfo = [  rlFiniteSetSpec([7 9 19 -2])
             rlNumericSpec([3 1])
             rlFiniteSetSpec({zeros(2), eye(2)})
             rlNumericSpec([4 3]) ]
obsInfo=4×1 object
  4x1 heterogeneous rlDataSpec (rlFiniteSetSpec, rlNumericSpec) array with properties:
    Name
    Description
    Dimension
    DataType

You can access each channel specification using indexing, and set its properties using dot notation.

obsInfo(1)
ans = 
  rlFiniteSetSpec with properties:
       Elements: [4x1 double]
           Name: [0x0 string]
    Description: [0x0 string]
      Dimension: [1 1]
       DataType: "double"
obsInfo(2).Name = "velocity";
obsInfo(2).Description = "Velocity vector in m/s in body reference frame";
obsInfo(2)
ans = 
  rlNumericSpec with properties:
     LowerLimit: -Inf
     UpperLimit: Inf
           Name: "velocity"
    Description: "Velocity vector in m/s in body reference frame"
      Dimension: [3 1]
       DataType: "double"

Within Reinforcement Learning Toolbox™ software, agents can have only one action channel. However, if your desired design involves multiple discrete action channels, you can convert them into a single action channel whose number of elements equals the product of the numbers of elements of the action channels in your original design. Specifically, each element of the single action channel corresponds to a particular combination of elements across the action channels of your original design.

For example, suppose that the valid values for a two-output system are [1 2] for the first output and [10 20 30] for the second output. Create a discrete action space specification for all possible output combinations.

actionSpec = rlFiniteSetSpec({[1 10],[1 20],[1 30],...
                              [2 10],[2 20],[2 30]})
actionSpec = 
  rlFiniteSetSpec with properties:
       Elements: {6x1 cell}
           Name: [0x0 string]
    Description: [0x0 string]
      Dimension: [1 2]
       DataType: "double"
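As a quick sanity check on such a combined specification, one sketch (assuming the toolbox usample function, which draws a random sample from a data specification) is:

```matlab
% Draw a random action from the combined specification; each sample is one
% of the six 1-by-2 combinations listed in Elements, such as [2 10].
act = usample(actionSpec);
```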

Version History

Introduced in R2019a
