Monitor and plot training progress for deep learning custom training loops

Since R2022b

Description

Use a trainingProgressMonitor object to track training progress when using a custom training loop.

You can use a trainingProgressMonitor object to:

  • Create animated custom metric plots and record custom metrics during training.

  • Display and record training information during training.

  • Stop training early.

  • Track training progress with a progress bar.

  • Track elapsed time.

The figure described below shows an example of the Training Progress window during training. For more information about configuring the Training Progress window and an example showing how to generate such a figure, see the Deep Learning Toolbox documentation on monitoring custom training loop progress.

Figure: Training Progress window. The figure contains plots of the loss and accuracy for both the training and validation data, and information about the training progress, status, elapsed time, epoch number, execution environment, iteration, and learning rate.

Creation

Description

monitor = trainingProgressMonitor creates a trainingProgressMonitor object that you can use to track the training progress and create training plots.

monitor = trainingProgressMonitor(Name=Value) sets the Metrics, Info, Visible, Progress, Status, and XLabel properties using one or more name-value arguments.

Properties

Metrics

Metric names, specified as a string scalar, character vector, string array, or cell array of character vectors. Valid names begin with a letter and contain only letters, digits, and underscores. Each metric appears in its own training subplot. To plot more than one metric in a single subplot, use the groupSubPlot function.

Example: ["TrainingLoss","ValidationLoss"];

Data Types: char | string | cell
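As a rough way to check candidate names before assigning them, the naming rule above can be approximated with the built-in isvarname function. This is only a sketch: it assumes the rule matches the MATLAB variable-name rule, and isvarname additionally rejects MATLAB keywords and overly long names, which the description above does not mention. The candidate names are illustrative.

```matlab
% Candidate metric names (illustrative); the second and third break the rule.
candidates = {'TrainingLoss','2ndLoss','val-loss','Val_Loss2'};

% isvarname is true for names that start with a letter and contain only
% letters, digits, and underscores. It also rejects MATLAB keywords.
isValid = cellfun(@isvarname,candidates);

% Keep only the names that pass the check.
validNames = string(candidates(isValid));
disp(validNames)
```

Filtering names this way before setting the Metrics or Info property can surface a bad name earlier than an error raised in the middle of a training script.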

Info

Information names, specified as a string scalar, character vector, string array, or cell array of character vectors. Valid names begin with a letter and contain only letters, digits, and underscores. These names appear in the Training Progress window but do not appear as training plots.

Example: ["GradientDecayFactor","SquaredGradientDecayFactor"];

Data Types: char | string | cell

Stop

This property is read-only.

Request to stop training, specified as a numeric or logical 0 (false) or 1 (true). The value of this property changes to 1 when you click the Stop button in the Training Progress window. The Stop button appears only if you set the Visible property to 'on' or 1 (true).

Data Types: logical

Visible

State of visibility, specified as 'on' or 'off', or as numeric or logical 1 (true) or 0 (false). A value of 'on' is equivalent to true, and 'off' is equivalent to false. Thus, you can use the value of this property as a logical value. The value is stored as an on/off logical value of type matlab.lang.OnOffSwitchState.

  • 'on': Display the Training Progress window.

  • 'off': Hide the Training Progress window without deleting it. You can still access the properties of an invisible object.

Example: 'off'
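A minimal sketch of how the visibility setting might be used, assuming Deep Learning Toolbox is installed and a display is available for the final step. The metric name Loss is illustrative.

```matlab
% Create a monitor whose Training Progress window starts hidden.
monitor = trainingProgressMonitor(Metrics="Loss",Visible="off");

% The stored on/off value behaves like a logical, so it works in conditions.
if ~monitor.Visible
    disp("Training Progress window is hidden.")
end

% Properties of the hidden object remain accessible.
disp(monitor.Metrics)

% Show the window later without recreating the monitor.
monitor.Visible = "on";
```

Starting hidden and showing the window on demand is one option when a script should run without opening a window by default.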

Progress

Training progress percentage, specified as a scalar or dlarray object in the range [0, 100].

Example: 17;

XLabel

Horizontal axis label in the training plot, specified as a string scalar or character vector.

Example: "Iteration";

Data Types: char | string | cell

Status

Training status, specified as a string scalar or character vector.

Example: "Running";

Data Types: char | string | cell

MetricData

This property is read-only.

Metric values, specified as a structure. Use the Metrics property to specify the field names for the structure. Each field contains a matrix with two columns. The first column contains the custom training loop step values and the second column contains the metric values recorded by the recordMetrics function.

Data Types: struct

InfoData

This property is read-only.

Information values, specified as a structure. Use the Info property to specify the field names for the structure. Each field is a column vector that contains the values updated by the updateInfo function.

Data Types: struct
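The recorded values can be read back from the two read-only structures after, or during, training. The following is a minimal sketch based on the layout described above; the metric name TrainingLoss, the information name Epoch, and the loss values are illustrative assumptions.

```matlab
% Monitor with one metric and one information value (names are illustrative).
monitor = trainingProgressMonitor( ...
    Metrics="TrainingLoss", ...
    Info="Epoch");

% Record three made-up loss values at steps 1 to 3.
lossValues = [0.9 0.6 0.4];
for step = 1:3
    recordMetrics(monitor,step,TrainingLoss=lossValues(step));
    updateInfo(monitor,Epoch=step);
end

% Each metric field is a two-column matrix: [step, value].
steps  = monitor.MetricData.TrainingLoss(:,1);
losses = monitor.MetricData.TrainingLoss(:,2);

% Each information field is a column vector of the updated values.
epochs = monitor.InfoData.Epoch;
```

Reading the data this way allows, for example, saving the loss history to a MAT file or replotting it with custom styling once training finishes.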

Object Functions

groupSubPlot: Group metrics in training plot
recordMetrics: Record metric values for custom training loops
updateInfo: Update information values for custom training loops

Examples

Use a trainingProgressMonitor object to track training progress and produce training plots for custom training loops.

Create a trainingProgressMonitor object. The monitor automatically tracks the start time and the elapsed time. The timer starts when you create the object.

Tip

To ensure that the elapsed time accurately reflects the training time, create the trainingProgressMonitor object close to the start of your custom training loop.

monitor = trainingProgressMonitor;

Before you start the training, specify names for the information and metric values.

monitor.Info = ["LearningRate","Epoch","Iteration"];
monitor.Metrics = ["TrainingLoss","ValidationLoss","TrainingAccuracy","ValidationAccuracy"];

Specify the horizontal axis label for the training plot. Group the training and validation loss in the same subplot, and group the training and validation accuracy in the same subplot.

monitor.XLabel = "Iteration";
groupSubPlot(monitor,"Loss",["TrainingLoss","ValidationLoss"]);
groupSubPlot(monitor,"Accuracy",["TrainingAccuracy","ValidationAccuracy"]);

During training:

  • Evaluate the Stop property at the start of each step in your custom training loop. When you click the Stop button in the Training Progress window, the Stop property changes to 1. Training stops only if your training loop checks the Stop property and exits when it is 1.

  • Update the information values. The updated values appear in the Training Progress window.

  • Record the metric values. The recorded values appear in the training plot.

  • Update the training progress percentage based on the fraction of iterations completed.

Note

The following example code is a template. You must edit this training loop to compute your metric and information values. For a complete example that you can run in MATLAB, see the Deep Learning Toolbox documentation on monitoring custom training loop progress.

epoch = 0;
iteration = 0;
monitor.Status = "Running";
while epoch < maxEpochs && ~monitor.Stop
    epoch = epoch + 1;
    while hasdata(mbq) && ~monitor.Stop
        iteration = iteration + 1;
        % Add code to calculate metric and information values.
        % lossTrain = ...
        updateInfo(monitor, ...
            LearningRate=learnRate, ...
            Epoch=string(epoch) + " of " + string(maxEpochs), ...
            Iteration=string(iteration) + " of " + string(numIterations));
        recordMetrics(monitor,iteration, ...
            TrainingLoss=lossTrain, ...
            TrainingAccuracy=accuracyTrain, ...
            ValidationLoss=lossValidation, ...
            ValidationAccuracy=accuracyValidation);
        monitor.Progress = 100*iteration/numIterations;
    end
end
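Because the template depends on variables that it does not define (such as the minibatch queue and the computed losses), it does not run as written. The following sketch shows the same loop structure with synthetic values in place of a real model, so that the monitor calls can be tried in isolation. The epoch and iteration counts, the learning rate, the decaying loss formulas, and the final status strings are all assumptions made for illustration, not part of the original example.

```matlab
% Synthetic stand-ins for a real training setup (illustrative assumptions).
maxEpochs = 3;
iterationsPerEpoch = 20;
numIterations = maxEpochs*iterationsPerEpoch;
learnRate = 0.01;

monitor = trainingProgressMonitor( ...
    Metrics=["TrainingLoss","ValidationLoss"], ...
    Info=["LearningRate","Epoch","Iteration"], ...
    XLabel="Iteration");
groupSubPlot(monitor,"Loss",["TrainingLoss","ValidationLoss"]);

epoch = 0;
iteration = 0;
monitor.Status = "Running";
while epoch < maxEpochs && ~monitor.Stop
    epoch = epoch + 1;
    for i = 1:iterationsPerEpoch
        % Exit early if the Stop button was clicked.
        if monitor.Stop
            break
        end
        iteration = iteration + 1;

        % Fake decaying losses in place of a forward and backward pass.
        lossTrain = exp(-iteration/20) + 0.02*rand;
        lossValidation = exp(-iteration/25) + 0.05;

        updateInfo(monitor, ...
            LearningRate=learnRate, ...
            Epoch=string(epoch) + " of " + string(maxEpochs), ...
            Iteration=string(iteration) + " of " + string(numIterations));
        recordMetrics(monitor,iteration, ...
            TrainingLoss=lossTrain, ...
            ValidationLoss=lossValidation);
        monitor.Progress = 100*iteration/numIterations;
    end
end

% Report how the loop ended.
if monitor.Stop
    monitor.Status = "Stopped by user";
else
    monitor.Status = "Training complete";
end
```

Replacing the two synthetic loss lines with real loss computations, and the for loop with a loop over a minibatch queue, brings this sketch back to the shape of the template.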

The Training Progress window shows animated plots of the metrics, along with the information values, training progress bar, and elapsed time.

  • The training plots update each time you call recordMetrics.

  • The values under Information update each time you call updateInfo.

  • The elapsed time updates each time you call recordMetrics or updateInfo, and when you update the Progress property.

Figure: Training Progress window. The first plot shows the training and validation loss and the second plot shows the training and validation accuracy.

A trainingProgressMonitor object has the same properties and object functions as an experiments.Monitor object. Therefore, you can easily adapt your plotting code for use in an Experiment Manager setup script.

How you monitor training depends on where you are training.

  • If you are using a custom training loop script, you must create and manage a trainingProgressMonitor object yourself.

  • If you are using a custom training experiment, Experiment Manager creates an experiments.Monitor object for each trial of your experiment. By default, Experiment Manager saves the experiments.Monitor object as the variable monitor.

In Experiment Manager, you can use the experiments.Monitor object in place of the trainingProgressMonitor object in your custom training loop code.

For example, suppose your training script creates a trainingProgressMonitor object to track and plot training and validation loss.

monitor = trainingProgressMonitor( ...
    Metrics=["TrainingLoss","ValidationLoss"], ...
    XLabel="Iteration");
groupSubPlot(monitor,"Loss",["TrainingLoss","ValidationLoss"]);
iteration = 1;
recordMetrics(monitor,iteration,TrainingLoss=loss,ValidationLoss=lossVal);

To adapt this code for use in Experiment Manager with an experiments.Monitor object:

  • Convert any code that sets properties using Name=Value syntax to use dot notation.

  • Delete the call to trainingProgressMonitor, because Experiment Manager creates a monitor for you.

Use the adapted code inside your Experiment Manager setup function.

% Inside custom training experiment setup function
monitor.Metrics = ["TrainingLoss","ValidationLoss"];
monitor.XLabel = "Iteration";
groupSubPlot(monitor,"Loss",["TrainingLoss","ValidationLoss"]);
iteration = 1;
recordMetrics(monitor,iteration,TrainingLoss=loss,ValidationLoss=lossVal);

Note

Experiment Manager passes the monitor object as the second input argument of the training function. Check that the name of the second input argument matches the variable name that your code uses for the monitor object. For more information, see the Experiment Manager documentation on custom training experiments.

Tips

  • The information values appear in the Training Progress window, and the training plot shows a record of the metric values. Use information values for text and for numerical values that you want to display in the training window but not in the training plot.

  • When you click the Stop button in the Training Progress window, the Stop property is set to 1 (true). Training stops only if your training loop checks the Stop property and exits when it is 1. For example, to enable early stopping, include the following code in your custom training loop.

    while numEpochs < maxEpochs && ~monitor.Stop
        % Custom training loop code.
    end

  • The elapsed time updates each time you call recordMetrics or updateInfo, and when you update the Progress property.

Version History

Introduced in R2022b
