Training and Validation

Train and simulate reinforcement learning agents

To learn an optimal policy, a reinforcement learning agent interacts with the environment through a repeated trial-and-error process. During training, the agent tunes the parameters of its policy representation to maximize the long-term reward. Reinforcement Learning Toolbox™ software provides functions for training agents and validating the training results through simulation. For more information, see Train Reinforcement Learning Agents.
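The train-then-validate workflow described above can be sketched as follows. This is a minimal illustration, not a recommended configuration: the predefined cart-pole environment, the default DQN agent, and every numeric option value are assumptions chosen only to keep the example short.

```matlab
% Sketch: train a default DQN agent on a predefined environment, then
% validate the result by simulation. Requires Reinforcement Learning Toolbox.

% Create a predefined environment and read its observation/action specs.
env = rlPredefinedEnv("CartPole-Discrete");
obsInfo = getObservationInfo(env);
actInfo = getActionInfo(env);

% Create a DQN agent with default networks derived from the specs.
agent = rlDQNAgent(obsInfo, actInfo);

% Training options (all values below are illustrative assumptions).
trainOpts = rlTrainingOptions( ...
    "MaxEpisodes", 200, ...
    "MaxStepsPerEpisode", 500, ...
    "StopTrainingCriteria", "AverageReward", ...
    "StopTrainingValue", 480, ...
    "Plots", "none", ...
    "Verbose", true);

% Train the agent. The agent is a handle object and is updated in place.
trainStats = train(agent, env, trainOpts);

% Validate the trained agent by simulating one episode.
simOpts = rlSimulationOptions("MaxSteps", 500);
experience = sim(env, agent, simOpts);

% Total reward collected during the simulated episode.
totalReward = sum(experience.Reward.Data(:));
```

Setting `Plots` to `"none"` keeps the sketch usable in a non-interactive session; with the default value, training opens a progress window instead.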

Apps

Reinforcement Learning Designer

Design, train, and simulate reinforcement learning agents

Functions

Train agents

`train`	Train reinforcement learning agents within a specified environment
`rlTrainingOptions`	Options for training reinforcement learning agents
`rlMultiAgentTrainingOptions`	Options for training multiple reinforcement learning agents
`trainFromData`	Train off-policy reinforcement learning agent using existing data
`rlTrainingFromDataOptions`	Options to train reinforcement learning agents using existing data
`inspectTrainingResult`	Plot training information from a previous training session

Log data

`rlDataLogger`	Create either a file logger object or a monitor logger object to log training data
`rlDataViewer`	Open Reinforcement Learning Data Viewer tool
`FileLogger`	Log reinforcement learning training data to MAT files
`MonitorLogger`	Log reinforcement learning training data to monitor window
`trainingProgressMonitor`	Monitor and plot training progress for deep learning custom training loops
`setup`	Set up reinforcement learning environment or initialize data logger object
`store`	Store data in the internal memory of a (file or monitor) logger object
`write`	Transfer stored data from the internal logger memory to the logging target
`cleanup`	Clean up reinforcement learning environment or data logger object

Simulate agents

`sim`	Simulate trained reinforcement learning agents within specified environment
`rlSimulationOptions`	Options for simulating a reinforcement learning agent within an environment

Custom training

`runEpisode`	Simulate reinforcement learning environment against policy or agent
`setup`	Set up reinforcement learning environment or initialize data logger object
`cleanup`	Clean up reinforcement learning environment or data logger object
`Future`	Object that supports deferred outputs for reinforcement learning environment simulations running on workers
`fetchNext`	Retrieve next available unread outputs from reinforcement learning environment simulations running on workers
`fetchOutputs`	Retrieve results from all reinforcement learning environment simulations running on workers
`cancel`	Cancel unfinished reinforcement learning environment simulations on workers
`wait`	Wait for reinforcement learning environment simulations running on workers to finish

Blocks

RL Agent	Reinforcement learning agent
Policy	Reinforcement learning policy

Topics

Training and Simulation Basics

Train Reinforcement Learning Agents
Find the optimal policy by training your agent within a specified environment.
Train Reinforcement Learning Agent in Basic Grid World
Train Q-learning and SARSA agents to solve a grid world in MATLAB®.
Train Reinforcement Learning Agent in MDP Environment
Train a reinforcement learning agent in a generic Markov decision process environment.
Create Simulink Environment and Train Agent
Train a controller using reinforcement learning with a plant modeled in Simulink® as the training environment.
Train Reinforcement Learning Agent for Simple Contextual Bandit Problem
Train Q-learning and DQN agents to solve a contextual bandit problem.
Log Training Data to Disk
Log a variety of data to disk while training an agent.
Train Agent or Tune Environment Parameters Using Parameter Sweeping
Tune a DDPG agent using hyperparameter sweeping.

Use the Reinforcement Learning Designer App

Design and Train Agent Using Reinforcement Learning Designer
Design and train a DQN agent for a cart-pole system using the Reinforcement Learning Designer app.
Specify Simulation Options in Reinforcement Learning Designer
Interactively specify options for simulating reinforcement learning agents using the Reinforcement Learning Designer app.
Specify Training Options in Reinforcement Learning Designer
Interactively specify options for training reinforcement learning agents using the Reinforcement Learning Designer app.

Use Multiple Processes and GPUs

Train Agents Using Parallel Computing and GPUs
Accelerate agent training by running simulations in parallel on multiple cores, GPUs, clusters, or cloud resources.
Train AC Agent to Balance Cart-Pole System Using Parallel Computing
Train a discrete action space AC agent using asynchronous parallel computing.
Train DQN Agent for Lane Keeping Assist Using Parallel Computing
Train a DQN agent for an automated driving application using parallel computing.

Multi-Agent Training

Train Multiple Agents to Perform Collaborative Task
Train two continuous action space PPO agents to collaboratively move an object.
Train Multiple Agents for Area Coverage
Train three discrete action space PPO agents to explore a grid-world environment in a collaborative-competitive manner.
Train Multiple Agents for Path Following Control
Train a DQN agent and a DDPG agent to collaboratively perform adaptive cruise control and lane keeping assist to follow a path.

Train Agents to Control Double Integrator System

Train DDPG Agent to Control Double Integrator System
Train a DDPG agent to control a second-order dynamic system modeled in MATLAB and compare it to an LQR controller.
Train a discrete action space PG agent with a baseline to control a double integrator system modeled in MATLAB.

Train Agents to Balance Cart-Pole System

Train DQN Agent to Balance Cart-Pole System
Train a DQN agent to balance a cart-pole system modeled in MATLAB.
Train a discrete action space PG agent to balance a cart-pole system modeled in MATLAB.
Train AC Agent to Balance Cart-Pole System
Train a discrete action space AC agent to balance a cart-pole system modeled in MATLAB.
Train a DDPG agent to swing up and balance a cart-pole system modeled in Simscape™ Multibody™.
Train MBPO Agent to Balance Cart-Pole System
A model-based reinforcement learning agent learns a model of its environment that it can use to generate additional experiences for training.

Train Agents to Swing Up and Balance Pendulum

Train DQN Agent to Swing Up and Balance Pendulum
Train a DQN agent to swing up and balance a pendulum modeled in Simulink.
Train a DDPG agent to balance a pendulum modeled in Simulink.
Train a DDPG agent to balance a pendulum in a Simulink model that contains observations in a bus signal.
Train DDPG Agent to Swing Up and Balance Pendulum with Image Observation
Train a DDPG agent using an image-based observation signal.
Create DQN Agent Using Deep Network Designer and Train Using Image Observations
Create a reinforcement learning agent using the Deep Network Designer app from Deep Learning Toolbox™.

Train Agents to Perform Control Tasks

Tune PI Controller Using Reinforcement Learning
Tune the gains of a PI controller using a TD3 agent.
Train SAC Agent for Ball Balance Control
Train a SAC agent to balance a ball on a flat surface using a robot arm.
Train SAC and PPO agents to balance the Quanser QUBE rotational inverted pendulum.
Train a TD3 agent to control the currents in a permanent magnet synchronous motor.
Train a DQN agent with a recurrent network to control the temperature of a house.
Train a DDPG agent with actions constrained using the Constraint Enforcement block.

Train Agents to Control Robots

Train DDPG Agent to Control Flying Robot
Train a DDPG agent to control a flying robot model.
Train a discrete action space PPO agent to land a flying robot.
Train Biped Robot to Walk Using Reinforcement Learning Agents
Compare DDPG and TD3 agents for the control of a biped walking robot modeled in Simscape Multibody.

Generate Rewards from Control Specifications

Generate Reward Function from a Model Predictive Controller for a Servomotor
Generate a reward function from an MPC controller applied to a servomotor and use it to train a TD3 agent.
Generate Reward Function from a Model Verification Block for a Water Tank System
Generate a reward function from a Model Verification block applied to a water tank system and use it to train a TD3 agent.

Imitation Learning

Train a deep neural network to imitate the behavior of a model predictive controller within a lane keeping assist system.
Train a deep neural network to imitate the behavior of a nonlinear model predictive controller for a flying robot.
Train a DDPG agent using an actor network that has been previously trained using supervised learning.

Train Agents for Automotive Applications

Train DQN Agent for Lane Keeping Assist
Train a DQN agent for a lane keeping assist application.
Train a DDPG agent for an adaptive cruise control application.
Train a DDPG agent for a lane following application.
Train a discrete action space PPO agent to park a car in an open parking space.

Other Applications

Use Reinforcement Learning Toolbox™ and Deep Learning Toolbox™ to design agents for optimal trade execution.
Train a deep Q-network (DQN) reinforcement learning agent for beam selection in a 5G New Radio communications system.
Water Distribution System Scheduling Using Reinforcement Learning
Train a DQN agent to optimally activate pumps in a water distribution system.

Develop Custom Agents and Training Algorithms

Train Reinforcement Learning Policy Using Custom Training Loop
Train a reinforcement learning policy using your own custom training loop.
Custom Training Loop with Simulink Action Noise
Use a custom training loop to train a continuous action space reinforcement learning policy in Simulink when action noise is generated within the model.
Create an agent for a custom reinforcement learning algorithm.
Create and train a custom agent that solves an LQR problem.
Model-Based Reinforcement Learning Using Custom Training Loop
Create a model-based reinforcement learning agent using your own custom training loop.

Deploy Agents and Policies

Verify a reinforcement learning agent in software-in-the-loop and processor-in-the-loop modes.
Generate a Policy block to deploy a trained policy.