rlTD3Agent
Twin-delayed deep deterministic (TD3) policy gradient reinforcement learning agent
Since R2020a
Description
The twin-delayed deep deterministic (TD3) policy gradient algorithm is an actor-critic, model-free, online, off-policy reinforcement learning method for continuous action spaces. A TD3 agent attempts to learn the policy that maximizes the expected discounted cumulative long-term reward.
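To make the phrase "discounted cumulative long-term reward" concrete, the following sketch computes the discounted return of one recorded episode. It is plain Python rather than toolbox code; the function name `discounted_return` and the default discount factor are illustrative assumptions and are not part of rlTD3Agent.

```python
def discounted_return(rewards, gamma=0.99):
    """Return sum over k of gamma**k * rewards[k] for one episode.

    `rewards` is a sequence of per-step rewards; `gamma` is the discount
    factor (hypothetical default, chosen only for illustration).
    """
    total = 0.0
    # Accumulate backwards: G_k = r_k + gamma * G_{k+1}
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total
```

For example, three rewards of 1 with a discount factor of 0.5 give 1 + 0.5 + 0.25 = 1.75. The agent seeks the policy that maximizes the expected value of this quantity.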
Use rlTD3Agent to create one of the following types of agents.
Twin-delayed deep deterministic policy gradient (TD3) agent with two Q-value functions. This agent prevents overestimation of the value function by learning two Q-value functions and using the minimum values for policy updates.
Delayed deep deterministic policy gradient (delayed DDPG) agent with a single Q-value function. This agent is a DDPG agent with target policy smoothing and delayed policy and target updates.
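The mechanisms named in these two descriptions (taking the minimum of two Q-value estimates, smoothing the target policy with clipped noise, and delaying policy updates) can be sketched for a single transition with a scalar action. This is an illustrative Python sketch of the general TD3 target computation, not the toolbox implementation. The names `td3_target` and `should_update_policy`, the callables `target_actor`, `target_q1`, and `target_q2`, and every default value are assumptions chosen for the example.

```python
import random


def clip(x, lo, hi):
    """Limit x to the closed interval [lo, hi]."""
    return max(lo, min(hi, x))


def td3_target(reward, next_obs, done, target_actor, target_q1, target_q2=None,
               gamma=0.99, noise_std=0.2, noise_clip=0.5,
               act_low=-1.0, act_high=1.0, rng=random):
    """Compute the critic training target for one transition (scalar action).

    With target_q2 given, this is the two-critic (TD3) case; with
    target_q2=None it is the single-critic (delayed DDPG) case.
    """
    # Target policy smoothing: add clipped Gaussian noise to the target action.
    noise = clip(rng.gauss(0.0, noise_std), -noise_clip, noise_clip)
    next_action = clip(target_actor(next_obs) + noise, act_low, act_high)

    # Use the smaller of the two estimates to limit overestimation.
    q_next = target_q1(next_obs, next_action)
    if target_q2 is not None:
        q_next = min(q_next, target_q2(next_obs, next_action))

    # No bootstrapping from a terminal state.
    return reward + (0.0 if done else gamma * q_next)


def should_update_policy(step, policy_delay=2):
    """Delayed updates: refresh the actor only every policy_delay critic steps."""
    return step % policy_delay == 0
```

Passing `target_q2=None` corresponds to the single Q-value function variant: the smoothing and delay still apply, but no minimum is taken.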
For more information, see Twin-Delayed Deep Deterministic (TD3) Policy Gradient Agents. For more information on the different types of reinforcement learning agents, see Reinforcement Learning Agents.
Creation
Syntax
Description
Create Agent from Observation and Action Specifications
agent = rlTD3Agent(observationInfo,actionInfo) creates a TD3 agent for an environment with the given observation and action specifications, using default initialization options. The actor and critics in the agent use default deep neural networks built from the observation specification observationInfo and the action specification actionInfo. The ObservationInfo and ActionInfo properties of agent are set to the observationInfo and actionInfo input arguments, respectively.
agent = rlTD3Agent(observationInfo,actionInfo,initOpts) creates a TD3 agent for an environment with the given observation and action specifications. The agent uses default networks configured using options specified in the initOpts object. For more information on the initialization options, see rlAgentInitializationOptions.
Create Agent from Actor and Critic
Specify Agent Options
agent = rlTD3Agent(___,agentOptions) creates a TD3 agent and sets the AgentOptions property to the agentOptions input argument. Use this syntax after any of the input arguments in the previous syntaxes.
Input Arguments
Properties
Object Functions
train | Train reinforcement learning agents within a specified environment |
sim | Simulate trained reinforcement learning agents within specified environment |
getAction | Obtain action from agent, actor, or policy object given environment observations |
getActor | Extract actor from reinforcement learning agent |
setActor | Set actor of reinforcement learning agent |
getCritic | Extract critic from reinforcement learning agent |
setCritic | Set critic of reinforcement learning agent |
generatePolicyFunction | Generate MATLAB function that evaluates policy of an agent or policy object |
Examples
Version History
Introduced in R2020a
See Also
Apps
Functions
getActionInfo | getObservationInfo
Objects
rlTD3AgentOptions | rlQValueFunction | rlContinuousDeterministicActor