rlOptimizer

Creates an optimizer object for actors and critics

Since R2022a

Description

Create an optimizer object that updates the learnable parameters of an actor or critic in a custom training loop.

algObj = rlOptimizer returns a default optimizer object. You can modify the object properties using dot notation.

algObj = rlOptimizer(algOptions) returns an optimizer object with properties specified by the optimizer options object algOptions.

Examples

Use rlOptimizer to create a default optimizer algorithm object to use for training an actor or critic in a custom training loop.

myAlg = rlOptimizer
myAlg = 
  rlADAMOptimizer with properties:

           GradientDecayFactor: 0.9000
    SquaredGradientDecayFactor: 0.9990
                       Epsilon: 1.0000e-08
                     LearnRate: 0.0100
        L2RegularizationFactor: 1.0000e-04
             GradientThreshold: Inf
       GradientThresholdMethod: "l2norm"

By default, the function returns an rlADAMOptimizer object with default options. You can use dot notation to change some parameters.

myAlg.LearnRate = 0.1;

You can now create a structure and set its CriticOptimizer or ActorOptimizer field to myAlg. When you call runEpisode, pass the structure as an input parameter. The runEpisode function can then use the update method of myAlg to update the learnable parameters of your actor or critic, as sketched below.
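For instance, here is a minimal sketch of that wiring. It assumes env, policy, and an experience-processing function processExperienceFcn are defined elsewhere in your custom training loop; the names other than the CriticOptimizer field are illustrative.

% Bundle the optimizer into a structure so that runEpisode can pass it
% to your experience-processing function (myAlg is the object created above).
data.CriticOptimizer = myAlg;

% env, policy, and processExperienceFcn are assumed to be defined
% elsewhere in your custom training loop.
out = runEpisode(env,policy, ...
    ProcessExperienceFcn=@processExperienceFcn, ...
    ProcessExperienceData=data);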

Use rlOptimizerOptions to create an optimizer options object. Specify the algorithm as "rmsprop" and set the learning rate to 0.2.

myOptions = rlOptimizerOptions( ...
    Algorithm="rmsprop", ...
    LearnRate=0.2);

Use rlOptimizer to create an optimizer algorithm object to use for training an actor or critic in a custom training loop. Specify the optimizer options object myOptions as an input parameter.

myAlg = rlOptimizer(myOptions)
myAlg = 
  rlRMSPropOptimizer with properties:

    SquaredGradientDecayFactor: 0.9990
                       Epsilon: 1.0000e-08
                     LearnRate: 0.2000
        L2RegularizationFactor: 1.0000e-04
             GradientThreshold: Inf
       GradientThresholdMethod: "l2norm"

The function returns an rlRMSPropOptimizer object with default options. You can use dot notation to change some parameters.

myAlg.GradientThreshold = 2;

You can now create a structure and set its CriticOptimizer or ActorOptimizer field to myAlg. When you call runEpisode, pass the structure as an input parameter. The runEpisode function can then use the update method of myAlg to update the learnable parameters of your actor or critic, as in the sketch that follows.
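A rough sketch of such a processing function follows. The field names on exp and data, and in particular the exact calling signature of update, are assumptions here; check the update reference page for the definitive syntax.

function [policy,data] = processExperienceFcn(exp,episodeInfo,policy,data)
    % Hypothetical: compute the gradient of the critic output with respect
    % to its learnable parameters for the current observation (the field
    % names on exp and data are assumptions).
    criticGradient = gradient(data.Critic,"output-parameters",exp.Observation);

    % Assumed update signature: returns the updated approximator and the
    % updated optimizer, which carries its internal state between calls.
    [data.Critic,data.CriticOptimizer] = update( ...
        data.CriticOptimizer,data.Critic,criticGradient);
end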

Input Arguments

algOptions — Algorithm options object, specified as an rlOptimizerOptions object.

Example: rlOptimizerOptions(Algorithm="sgdm",LearnRate=0.2)

Output Arguments

algObj — Algorithm optimizer object, returned as an rlADAMOptimizer, rlSGDMOptimizer, or rlRMSPropOptimizer object. The runEpisode function uses the update method of the returned object to update the learnable parameters of an actor or critic.
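For reference, the class of the returned object follows the Algorithm value set in the options object, using the "adam", "sgdm", and "rmsprop" values shown in the examples above:

% The Algorithm option selects the class of the returned object.
adamAlg = rlOptimizer(rlOptimizerOptions(Algorithm="adam"));    % rlADAMOptimizer
sgdmAlg = rlOptimizer(rlOptimizerOptions(Algorithm="sgdm"));    % rlSGDMOptimizer
rmspAlg = rlOptimizer(rlOptimizerOptions(Algorithm="rmsprop")); % rlRMSPropOptimizer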

Version History

Introduced in R2022a
