rlContinuousDeterministicActor
Deterministic actor with a continuous action space for reinforcement learning agents
Since R2022a
Description
This object implements a function approximator to be used as a deterministic actor within a reinforcement learning agent with a continuous action space. A continuous deterministic actor takes an environment observation as input and returns as output an action that is a parametrized deterministic function of the observation, thereby implementing a parametrized deterministic policy. After you create an rlContinuousDeterministicActor object, use it to create a suitable agent, such as rlDDPGAgent or rlTD3Agent. For more information on creating representations, see Create Policies and Value Functions.
Creation
Syntax
Description
actor = rlContinuousDeterministicActor(net,observationInfo,actionInfo) creates a continuous deterministic actor object using the deep neural network net as underlying approximation model. For this actor, actionInfo must specify a continuous action space. The network input layers are automatically associated with the environment observation channels according to the dimension specifications in observationInfo. The network must have a single output layer with the same data type and dimensions as the action specified in actionInfo. This function sets the ObservationInfo and ActionInfo properties of actor to the observationInfo and actionInfo input arguments, respectively.
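For example, the following is a minimal sketch of this syntax; the observation and action dimensions, layer sizes, and network architecture are hypothetical assumptions, not requirements of the function.

% Hypothetical specifications: a 4-element observation and a 2-element action
obsInfo = rlNumericSpec([4 1]);
actInfo = rlNumericSpec([2 1]);
% Simple network mapping the observation to the action (sizes match the specifications)
net = [
    featureInputLayer(prod(obsInfo.Dimension))
    fullyConnectedLayer(16)
    reluLayer
    fullyConnectedLayer(prod(actInfo.Dimension))
    ];
net = dlnetwork(net);
% Create the deterministic actor
actor = rlContinuousDeterministicActor(net,obsInfo,actInfo);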
actor = rlContinuousDeterministicActor(net,observationInfo,actionInfo,ObservationInputNames=netObsNames) specifies the names of the network input layers to be associated with the environment observation channels. The function assigns, in sequential order, each environment observation channel specified in observationInfo to the layer specified by the corresponding name in the string array netObsNames. Therefore, the network input layers, ordered as the names in netObsNames, must have the same data type and dimensions as the observation channels, as ordered in observationInfo.
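As a sketch, assuming an environment with two observation channels, you could match input layers to channels by name as follows (the layer names, sizes, and architecture are hypothetical):

% Hypothetical specifications: two observation channels and a 2-element action
obsInfo = [rlNumericSpec([4 1]) rlNumericSpec([1 1])];
actInfo = rlNumericSpec([2 1]);
% Two input paths, one per observation channel, merged into a common path
lgraph = layerGraph();
lgraph = addLayers(lgraph,featureInputLayer(4,Name="obsA"));
lgraph = addLayers(lgraph,featureInputLayer(1,Name="obsB"));
lgraph = addLayers(lgraph,[
    concatenationLayer(1,2,Name="concat")
    fullyConnectedLayer(16)
    reluLayer
    fullyConnectedLayer(2)
    ]);
lgraph = connectLayers(lgraph,"obsA","concat/in1");
lgraph = connectLayers(lgraph,"obsB","concat/in2");
net = dlnetwork(lgraph);
% Associate the input layers with the observation channels, in the order of obsInfo
actor = rlContinuousDeterministicActor(net,obsInfo,actInfo, ...
    ObservationInputNames=["obsA","obsB"]);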
actor = rlContinuousDeterministicActor({basisFcn,W0},observationInfo,actionInfo) creates a continuous deterministic actor object using a custom basis function as underlying approximation model. The first input argument is a two-element cell array whose first element is the handle basisFcn to a custom basis function and whose second element is the initial weight vector W0. This function sets the ObservationInfo and ActionInfo properties of actor to the observationInfo and actionInfo input arguments, respectively.
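A minimal sketch of this syntax, assuming the action is computed as a linear combination of the basis function output (all dimensions here are hypothetical):

% Hypothetical specifications: a 3-element observation and a 2-element action
obsInfo = rlNumericSpec([3 1]);
actInfo = rlNumericSpec([2 1]);
% Custom basis function mapping the observation to a 6-element feature vector
basisFcn = @(obs) [obs; obs.^2];
% Initial weights: one row per feature, one column per action element
% (assuming the action is obtained as W0'*basisFcn(obs))
W0 = rand(6,2);
actor = rlContinuousDeterministicActor({basisFcn,W0},obsInfo,actInfo);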
actor = rlContinuousDeterministicActor(___,UseDevice=useDevice) specifies the device used to perform computational operations on the actor object, and sets the UseDevice property of actor to the useDevice input argument. You can use this syntax with any of the previous input-argument combinations.
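For example, assuming a supported GPU is available, a sketch of placing the actor computations on it:

% Create the actor on a GPU (assuming a supported GPU and the net, obsInfo, and actInfo from the earlier sketches)
actor = rlContinuousDeterministicActor(net,obsInfo,actInfo,UseDevice="gpu");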
Input Arguments
Properties
Object Functions
rlDDPGAgent | Deep deterministic policy gradient (DDPG) reinforcement learning agent
rlTD3Agent | Twin-delayed deep deterministic (TD3) policy gradient reinforcement learning agent
getAction | Obtain action from agent, actor, or policy object given environment observations
evaluate | Evaluate function approximator object given observation (or observation-action) input data
gradient | Evaluate gradient of function approximator object given observation and action input data
accelerate | Option to accelerate computation of gradient for approximator object based on neural network
getLearnableParameters | Obtain learnable parameter values from agent, function approximator, or policy object
setLearnableParameters | Set learnable parameter values of agent, function approximator, or policy object
setModel | Set approximation model in function approximator object
getModel | Get approximation model from function approximator object
Examples
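A minimal end-to-end sketch, assuming the hypothetical specifications used in the sketches above, showing how to create an actor from a network and check it with a random observation:

% Hypothetical observation and action specifications
obsInfo = rlNumericSpec([4 1]);
actInfo = rlNumericSpec([2 1]);
% Simple network mapping the observation to the action
net = dlnetwork([
    featureInputLayer(4)
    fullyConnectedLayer(16)
    reluLayer
    fullyConnectedLayer(2)
    ]);
actor = rlContinuousDeterministicActor(net,obsInfo,actInfo);
% Obtain an action for a random observation
act = getAction(actor,{rand(obsInfo.Dimension)});
act{1}
% Evaluate the actor directly (returns the deterministic action for the given observation)
out = evaluate(actor,{rand(obsInfo.Dimension)});
out{1}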
Version History
Introduced in R2022a