Specify Training Options in Reinforcement Learning Designer

To configure the training of an agent in the Reinforcement Learning Designer app, specify training options on the Train tab.

Figure: The Train tab, showing example training options.

Specify Basic Options

On the Train tab, you can specify the following basic training options.

Option	Description
Max Episodes	Maximum number of episodes to train the agent, specified as a positive integer.
Max Episode Length	Maximum number of steps to run per episode, specified as a positive integer.
Stopping Criteria	Training termination condition, specified as one of the following values. `AverageSteps`: stop training when the running average number of steps per episode equals or exceeds the critical value specified by Stopping Value. `AverageReward`: stop training when the running average reward equals or exceeds the critical value. `EpisodeReward`: stop training when the reward in the current episode equals or exceeds the critical value. `GlobalStepCount`: stop training when the total number of steps in all episodes (the total number of times the agent is invoked) equals or exceeds the critical value. `EpisodeCount`: stop training when the number of training episodes equals or exceeds the critical value.
Stopping Value	Critical value of the training termination condition in Stopping Criteria, specified as a scalar.
Average Window Length	Window length for averaging the scores, rewards, and number of steps for the agent when either Stopping Criteria or Save agent criteria specifies an averaging condition.
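
As an illustration only, each stopping criterion in the table can be read as a comparison of one training metric against Stopping Value. The following Python sketch is not the app's or the toolbox's implementation. The function name `should_stop` and its arguments are hypothetical, and averaging over fewer episodes than the window length early in training is an assumption.

```python
from statistics import mean


def should_stop(criterion, stopping_value, episode_rewards, episode_steps,
                window_length):
    """Return True when the chosen metric equals or exceeds stopping_value.

    episode_rewards and episode_steps hold one entry per completed episode.
    """
    if window_length < 1:
        raise ValueError("window_length must be a positive integer")
    if not episode_rewards:
        return False  # nothing to evaluate before the first episode ends

    # Assumption: if fewer than window_length episodes exist, average them all.
    recent_rewards = episode_rewards[-window_length:]
    recent_steps = episode_steps[-window_length:]

    if criterion == "AverageSteps":
        metric = mean(recent_steps)
    elif criterion == "AverageReward":
        metric = mean(recent_rewards)
    elif criterion == "EpisodeReward":
        metric = episode_rewards[-1]
    elif criterion == "GlobalStepCount":
        metric = sum(episode_steps)
    elif criterion == "EpisodeCount":
        metric = len(episode_rewards)
    else:
        raise ValueError(f"unknown stopping criterion: {criterion}")
    return metric >= stopping_value
```

With rewards `[10, 20, 30, 40]`, the `AverageReward` criterion with a stopping value of 35 is met for a window length of 2 (average 35) but not for a window length of 4 (average 25), which shows how Average Window Length changes when training stops.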

Specify Additional Options

To specify additional training options, on the Train tab, click More Options.

In the More Training Options dialog box, you can specify the following options.

Option	Description
Save agent criteria	Condition for saving agents during training, specified as one of the following values. `none`: do not save any agents during training. `AverageSteps`: save the agent when the running average number of steps per episode equals or exceeds the critical value specified by Save agent value. `AverageReward`: save the agent when the running average reward equals or exceeds the critical value. `EpisodeReward`: save the agent when the reward in the current episode equals or exceeds the critical value. `GlobalStepCount`: save the agent when the total number of steps in all episodes (the total number of times the agent is invoked) equals or exceeds the critical value. `EpisodeCount`: save the agent when the number of training episodes equals or exceeds the critical value.
Save agent value	Critical value of the save agent condition in Save agent criteria, specified as a scalar or `"none"`.
Save directory	Folder for saved agents. If you specify a name and the folder does not exist, the app creates the folder in the current working directory. To interactively select a folder, click Browse.
Show verbose output	Select this option to display training progress at the command line.
Stop on error	Select this option to stop training when an error occurs during an episode.
Training plot	Option to graphically display the training progress in the app, specified as one of the following values. `training-progress`: show training progress. `none`: do not show training progress.
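
The Save directory behavior described in the table, where a folder name that does not yet exist is created under the current working directory, can be sketched as follows. This is a Python illustration under stated assumptions, not the app's code. The function name `resolve_save_directory`, the `base_dir` argument, and the folder name `savedAgents` are hypothetical.

```python
from pathlib import Path


def resolve_save_directory(name, base_dir=None):
    """Return the folder for saved agents, creating it if it is missing.

    A relative name is resolved against base_dir, which defaults to the
    current working directory. An absolute name is used as given.
    """
    base = Path(base_dir) if base_dir is not None else Path.cwd()
    folder = base / name  # pathlib keeps `name` unchanged if it is absolute
    folder.mkdir(parents=True, exist_ok=True)  # no error if it already exists
    return folder
```

Calling `resolve_save_directory("savedAgents")` twice returns the same path both times, and the second call leaves the existing folder untouched.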

Specify Parallel Training Options

To train your agent using parallel computing, on the Train tab, click Use Parallel. Training agents using parallel computing requires Parallel Computing Toolbox™ software. For more information, see Train Agents Using Parallel Computing and GPUs.

To specify options for parallel training, select Use Parallel > Parallel training options.

Figure: The Parallel Training Options dialog box.

In the Parallel Training Options dialog box, you can specify the following training options.

Option	Description
Parallel computing mode	Parallel computing mode, specified as one of the following values. `sync`: use `parpool` to run synchronous training on the available workers. The parallel pool client (the process that starts the training) updates the parameters of its actor and critic, based on the results from all the workers, and sends the updated parameters to all workers. In this case, workers must pause execution until all workers are finished, so training advances only as fast as the slowest worker allows. `async`: use `parpool` to run asynchronous training on the available workers. In this case, workers send their data back to the client as soon as they finish and receive updated parameters from the client. The workers then continue with their task.
Transfer workspace variables to workers	Select this option to send model and workspace variables to parallel workers. When you select this option, the parallel pool client (the process that starts the training) sends variables used in models and defined in the MATLAB® workspace to the workers.
Random seed for workers	Randomizer initialization for workers, specified as one of the following values. `-1`: assign a unique random seed to each worker. The value of the seed is the worker ID. `-2`: do not assign a random seed to the workers. Vector: manually specify the random seed for each worker. The number of elements in the vector must match the number of workers.
Files to attach to parallel pool	Additional files to attach to the parallel pool. Specify names of files in the current working directory, with one name on each line.
Worker setup function	Function to run before training starts, specified as a handle to a function having no input arguments. This function is run once per worker before training begins. Write this function to perform any processing that you need prior to training.
Worker cleanup function	Function to run after training ends, specified as a handle to a function having no input arguments. You can write this function to clean up the workspace or perform other processing after training terminates.
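
The three forms of the Random seed for workers setting can be summarized in a short sketch. This is a Python illustration of the rules in the table, not the toolbox's implementation. The function name `resolve_worker_seeds` is hypothetical, and numbering worker IDs from 1 to the number of workers is an assumption.

```python
def resolve_worker_seeds(setting, num_workers):
    """Return one seed per worker, or None where no seed is assigned.

    setting is -1, -2, or a sequence with one seed per worker.
    """
    if setting == -1:
        # Unique seed per worker: the seed equals the worker ID.
        # Assumption: worker IDs run from 1 to num_workers.
        return list(range(1, num_workers + 1))
    if setting == -2:
        # Do not assign a random seed to any worker.
        return [None] * num_workers
    if isinstance(setting, int):
        raise ValueError("scalar setting must be -1 or -2")
    seeds = list(setting)
    if len(seeds) != num_workers:
        raise ValueError(
            f"expected {num_workers} seeds, got {len(seeds)}"
        )
    return seeds
```

For example, with three workers, a setting of `-1` gives the seeds `[1, 2, 3]`, while a two-element vector is rejected because its length does not match the number of workers.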

The following figure shows an example parallel training configuration for the following files and functions.

Data file attached to the parallel pool: workerdata.mat
Worker setup function: mysetup.m
Worker cleanup function: mycleanup.m

Figure: The Parallel Training Options dialog box, showing file and function information.
