FileLogger
Log reinforcement learning training data to MAT-files
Since R2022b
Description
Use a FileLogger object to log data to MAT-files, within the train function or inside a custom training loop. To log data when using the train function, specify appropriate callback functions in FileLogger, as shown in the examples. These callbacks are executed at different stages of training. For example, EpisodeFinishedFcn is executed after the completion of an episode. The output of a callback function is a structure containing the data to log at that stage of training.
Note
Using a FileLogger object to log data when using the train function does not affect (and is not affected by) any option to save agents during training specified within an rlTrainingOptions object.
Note
FileLogger is a handle object. If you assign an existing FileLogger object to a new FileLogger object, both the new object and the original one refer to the same underlying object in memory. To preserve the original object parameters for later use, save the object to a MAT-file. For more information about handle objects, see the MATLAB documentation on handle objects.
Creation
Create a FileLogger object using rlDataLogger without any input arguments.
Properties
LoggingOptions — Object containing a set of logging options
MatFileLoggingOptions object (default)
Object containing a set of logging options, returned as a MatFileLoggingOptions object. A MatFileLoggingOptions object has the following properties, which you can access using dot notation after creating the FileLogger object.
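As a sketch, the logging options can be read and set through dot notation on the logger object (the directory name and naming rule shown here are illustrative values, not defaults):

```matlab
% Create a logger; its LoggingOptions property holds a
% MatFileLoggingOptions object.
logger = rlDataLogger();

% Read and set individual options using dot notation.
% "myLogs" and "run<id>" are example values chosen for this sketch.
logger.LoggingOptions.LoggingDirectory = "myLogs";
logger.LoggingOptions.FileNameRule = "run<id>";
```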
LoggingDirectory — Name of the logging directory
"logs" subdirectory of the current directory (default) | string | character vector
Name or fully qualified path of the logging directory, specified as a string or a character vector. This is the name of the directory where the MAT-files containing the logged data are saved. By default, a subdirectory called logs is created in the current folder during setup, and files are saved there during training.
Example: LoggingDirectory="myLogs"
FileNameRule — Rule to name the MAT-files
"loggedData" (default) | string | character vector
Rule to name the MAT-files, specified as a string or a character vector. For example, the naming rule "episode<id>" results in the file names episode001.mat, episode002.mat, and so on.
Example: FileNameRule="thirdRun<id>"
Version — MAT-file version
"-v7" (default) | string | character vector
MAT-file version, specified as a string or a character vector. The default is "-v7". For more information, see the MATLAB documentation on MAT-file versions.
Example: Version="-v7.3"
UseCompression — Option to use compression
true (default) | false
Option to use compression when saving data to a MAT-file, specified as a logical value. The default is true. For more information, see the MATLAB documentation on MAT-file versions.
Example: UseCompression=false
DataWriteFrequency — Disk data write period
1 (default) | positive integer
Disk data write period, specified as a positive integer. It is the number of episodes after which data is saved to disk. For example, if DataWriteFrequency is 5, then data from episodes 1 to 5 is cached in memory and written to disk at the end of the fifth episode. This improves performance in some cases. The default is 1.
Example: DataWriteFrequency=10
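For instance, a minimal sketch of caching several episodes in memory before each disk write:

```matlab
logger = rlDataLogger();
% Cache logged data in memory and write to disk only every
% 10 episodes, reducing file I/O during training.
logger.LoggingOptions.DataWriteFrequency = 10;
```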
MaxEpisodes — Maximum number of episodes
500 (default) | positive integer
Maximum number of episodes, specified as a positive integer. When using train, the value is automatically initialized. Set this value when using the logger object in a custom training loop. The default is 500.
Example: MaxEpisodes=1000
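In a custom training loop, a minimal sketch of setting MaxEpisodes before calling setup (the episode budget is a hypothetical value):

```matlab
numEpisodes = 1000;   % hypothetical episode budget for this sketch
logger = rlDataLogger();
% train initializes this automatically; in a custom loop, set it yourself.
logger.LoggingOptions.MaxEpisodes = numEpisodes;
setup(logger);        % performs setup tasks such as creating the logging directory
```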
EpisodeFinishedFcn — Callback to log data after episode completion
[] (default) | function handle
Callback to log data after episode completion, specified as a function handle. The specified function is automatically called by the training loop at the end of each episode, and must return a structure containing the data to log, such as experiences, simulation information, or initial observations.
Your function must have the following signature.
function dataToLog = myEpisodeFinishedFcn(data)
Here, data is a structure that contains the following fields:
EpisodeCount — Current episode number
Environment — Environment object
Agent — Agent object
Experience — Structure array containing the experiences. Each element of this array corresponds to a step and is a structure containing the fields NextObservation, Observation, Action, Reward, and IsDone.
EpisodeInfo — Structure containing the fields CumulativeReward, StepsTaken, and InitialObservation.
SimulationInfo — Contains simulation information from the episode. For MATLAB environments this is a structure with the field SimulationError, and for Simulink® environments it is a Simulink.SimulationOutput object.
The function output dataToLog is the structure containing the data to be logged to disk.
Example: EpisodeFinishedFcn=@myEpLoggingFcn
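As a sketch, a callback of this kind could log the cumulative reward together with the episode experiences (the function name is illustrative):

```matlab
function dataToLog = myEpLoggingFcn(data)
    % Log the cumulative episode reward and the raw experiences.
    dataToLog.CumulativeReward = data.EpisodeInfo.CumulativeReward;
    dataToLog.Experience = data.Experience;
end
```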
AgentStepFinishedFcn — Callback to log data after training step completion
[] (default) | function handle | cell array of function handles
Callback to log data after the completion of a training step within an episode, specified as a function handle. The specified function is automatically called by the training loop at the end of each training step, and must return a structure containing the data to log, such as the state of the agent's exploration policy.
Your function must have the following signature.
function dataToLog = myAgentStepFinishedFcn(data)
Here, data is a structure that contains the following fields:
EpisodeCount — Current episode number
AgentStepCount — Cumulative number of steps taken by the agent
SimulationTime — Current simulation time in the environment
Agent — Agent object
The function output dataToLog is the structure containing the data to be logged to disk.
For multiagent training, AgentStepFinishedFcn can be a cell array of function handles with as many elements as the number of agent groups.
Note
Logging data using the AgentStepFinishedFcn callback is not supported when training agents in parallel with the train function.
Example: AgentStepFinishedFcn=@myAgtStepLoggingFcn
AgentLearnFinishedFcn — Callback to log data after completion of the learn subroutine
[] (default) | function handle | cell array of function handles
Callback to log data after completion of the learn subroutine, specified as a function handle. The specified function is automatically called by the training loop at the end of each learning subroutine, and must return a structure containing the data to log, such as the training losses of the actor and critic networks, or, for a model-based agent, the environment model training losses.
Your function must have the following signature.
function dataToLog = myAgentLearnFinishedFcn(data)
Here, data is a structure that contains the following fields:
EpisodeCount — Current episode number
AgentStepCount — Cumulative number of steps taken by the agent
AgentLearnCount — Cumulative number of learning steps taken by the agent
EnvModelTrainingInfo — Structure containing fields related to model-based agents: TransitionFcnLoss, RewardFcnLoss, and IsDoneFcnLoss.
Agent — Agent object
ActorLoss — Loss of the actor
CriticLoss — Loss of the critic
The function output dataToLog is the structure containing the data to be logged to disk.
For multiagent training, AgentLearnFinishedFcn can be a cell array of function handles with as many elements as the number of agent groups.
Example: AgentLearnFinishedFcn=@myAgtLearnLoggingFcn
Object Functions
Examples
Log Data to Disk During Built-In Training
This example shows how to log data to disk when using train.
Create a FileLogger object using rlDataLogger.
logger = rlDataLogger();
Specify a directory to save logged data.
logger.LoggingOptions.LoggingDirectory = "myDataLog";
Create callback functions to log the data (for this example, see the helper function section), and specify the appropriate callback functions in the logger object.
logger.EpisodeFinishedFcn = @myEpisodeFinishedFcn;
logger.AgentStepFinishedFcn = @myAgentStepFinishedFcn;
logger.AgentLearnFinishedFcn = @myAgentLearnFinishedFcn;
To train the agent, you can now call train, passing logger as an argument, as in the following command.
trainResult = train(agent, env, trainOpts, Logger=logger);
While the training progresses, data is logged to the specified directory, according to the rule specified in the FileNameRule property of logger.LoggingOptions.
logger.LoggingOptions.FileNameRule
ans = "loggedData"
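After training, the logged MAT-files can be inspected with load. The file and directory names below assume the default naming rule and the directory chosen above; the exact file names on disk depend on FileNameRule:

```matlab
% Load one of the logged files from the logging directory.
% "loggedData001.mat" is an assumed name for illustration only.
episodeData = load(fullfile("myDataLog", "loggedData001.mat"));
disp(fieldnames(episodeData))
```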
Example Logging Functions
Episode completion logging function. This function is automatically called by the training loop at the end of each episode, and must return a structure containing the episode-related data to log, such as experiences, simulation information, or initial observations.
function dataToLog = myEpisodeFinishedFcn(data)
    dataToLog.Experience = data.Experience;
end
Agent step completion logging function. This function is automatically called by the training loop at the end of each training step, and must return a structure containing the step-related data to log, such as the state of the agent's exploration policy.
function dataToLog = myAgentStepFinishedFcn(data)
    dataToLog.State = getState(getExplorationPolicy(data.Agent));
end
Learn subroutine completion logging function. This function is automatically called by the training loop at the end of each learning subroutine, and must return a structure containing the learning-related data to log, such as the training losses of the actor and critic networks, or, for a model-based agent, the environment model training losses.
function dataToLog = myAgentLearnFinishedFcn(data)
    dataToLog.ActorLoss = data.ActorLoss;
    dataToLog.CriticLoss = data.CriticLoss;
end
For the specific function signatures and more information on the function input structure, see the corresponding properties of FileLogger. For a related example, see Log Training Data to Disk.
Log Data to Disk in a Custom Training Loop
This example shows how to log data to disk when training an agent using a custom training loop.
Create a FileLogger object using rlDataLogger.
flgr = rlDataLogger();
Set up the logger object. This operation initializes the object, performing setup tasks such as creating the directory to save the data files.
setup(flgr);
Within a custom training loop, you can now store data to the logger object memory and write data to file.
For this example, store random numbers to the file logger object, grouping them in the variables context1 and context2. When you issue a write command, a MAT-file corresponding to an iteration and containing both variables is saved with the name specified in flgr.LoggingOptions.FileNameRule, in the folder specified by flgr.LoggingOptions.LoggingDirectory.
for iter = 1:10
    % Store three random numbers in memory
    % as elements of the variable "context1"
    for ct = 1:3
        store(flgr, "context1", rand, iter);
    end
    % Store a random number in memory
    % as the variable "context2"
    store(flgr, "context2", rand, iter);
    % Write data to file every 4 iterations
    if mod(iter,4)==0
        write(flgr);
    end
end
Clean up the logger object. This operation performs cleanup tasks such as writing to file any data still in memory.
cleanup(flgr);
Limitations
Logging data using the AgentStepFinishedFcn callback is not supported when training agents in parallel with the train function.
Version History
Introduced in R2022b