Custom evaluation function, specified as a function handle. The train function calls evalfcn after evalperiod episodes.
Your evaluation function must have three inputs and three outputs, as illustrated by the following signature.
[statistic,scores,data] = myevalfcn(agent,environment,traininginfo)
Given an agent, its environment, and training episode information, the custom evaluation function runs a number of evaluation episodes and returns a corresponding summarizing statistic, a vector of episode scores, and any additional data that might be needed for logging.
The required input arguments (passed to evalfcn from train) are:
agent: Agent to evaluate, specified as a reinforcement learning agent object. For multiagent environments, this is a cell array of agent objects.
environment: Environment within which the agents are evaluated, specified as a reinforcement learning environment object.
traininginfo: A structure containing the following fields.
    episodeindex: Current episode index, specified as a positive integer.
    episodeinfo: A structure containing the fields cumulativereward, stepstaken, and initialobservation, which contain, respectively, the cumulative reward, the number of steps taken, and the initial observations of the current training episode.
The output arguments (passed from evalfcn to train) are:
statistic: A statistic computed from a group of consecutive evaluation episodes. Common statistics are the mean, median, maximum, and minimum. At the end of training, train returns this value as the element of the evaluationstatistics vector corresponding to the last training episode.
scores: A vector of episode scores, one for each evaluation episode. You can use a logger object to store this argument during training.
data: Any additional data from evaluation that you might find useful, for example for logging purposes. You can use a logger object to store this argument during training.
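The inputs and outputs described above can be sketched as follows. This is a minimal illustration rather than the toolbox's own implementation: it assumes a single-agent environment, assumes that sim returns one experience structure per simulation with the rewards stored in Reward.Data, and uses a hypothetical choice of five evaluation episodes with the median as the statistic.

```matlab
function [statistic, scores, data] = myevalfcn(agent, environment, traininginfo)
% Sketch of a custom evaluation function with three inputs and three outputs.
% Assumption: single agent (not a cell array of agents).

numEpisodes = 5;  % hypothetical number of evaluation episodes

% Run the evaluation episodes
simOpts = rlSimulationOptions('NumSimulations', numEpisodes);
experiences = sim(environment, agent, simOpts);

% Score each evaluation episode by its cumulative reward
scores = zeros(numEpisodes, 1);
for k = 1:numEpisodes
    scores(k) = sum(experiences(k).Reward.Data(:));
end

% Summarizing statistic over the group of evaluation episodes
statistic = median(scores);

% Additional data for logging: here, the training information is passed through
data = traininginfo;
end
```

Returning the training information in data is only one option; any value that is useful for later logging can be placed there.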
To use additional input arguments beyond the allowed three, define your additional arguments in the MATLAB workspace, then specify evalfcn as an anonymous function that in turn calls your custom function with the additional arguments defined in the workspace. The same anonymous-function pattern is shown in the example Create Custom Environment Using Step and Reset Functions.
Example: evalfcn=@myevalfcn
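The anonymous-function approach for passing additional arguments might look like the sketch below. The function name myevalfcnwithargs and the extra argument numEvalEpisodes are hypothetical, introduced only for illustration. The same assumptions apply as for any simulation-based evaluation: a single agent, and rewards available in Reward.Data of each experience structure returned by sim.

```matlab
% Hypothetical additional argument, defined in the MATLAB workspace
numEvalEpisodes = 10;

% Anonymous function exposing only the three inputs that train passes.
% The value of numEvalEpisodes is captured when the handle is created.
evalfcn = @(agent, environment, traininginfo) ...
    myevalfcnwithargs(agent, environment, traininginfo, numEvalEpisodes);

function [statistic, scores, data] = myevalfcnwithargs(agent, environment, traininginfo, numEvalEpisodes)
% Hypothetical four-input evaluation function (illustration only)
simOpts = rlSimulationOptions('NumSimulations', numEvalEpisodes);
experiences = sim(environment, agent, simOpts);

% One score per evaluation episode: the cumulative reward
scores = arrayfun(@(e) sum(e.Reward.Data(:)), experiences);

statistic = mean(scores);
data = traininginfo;
end
```

Because the anonymous function captures numEvalEpisodes by value, changing the workspace variable afterward does not affect the handle; recreate the handle to pick up a new value. When this code is placed in a script, the local function must appear at the end of the file.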