rlContinuousDeterministicRewardFunction
Deterministic reward function approximator object for neural network-based environment
Since R2022a
Description
When creating a neural network-based environment using rlNeuralNetworkEnvironment, you can specify the reward function approximator using an rlContinuousDeterministicRewardFunction object. Do so when you do not know a ground-truth reward signal for your environment but you expect the reward signal to be deterministic.
The reward function approximator object uses a deep neural network as its internal approximation model to predict the reward signal for the environment given one of the following input combinations:
Observations, actions, and next observations
Observations and actions
Actions and next observations
Next observations
To specify a stochastic reward function, use an rlContinuousGaussianRewardFunction object.
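Once you have created the reward function approximator (see Creation below), you pass it to rlNeuralNetworkEnvironment together with the other approximators that model the environment. The following is a minimal sketch only; it assumes that obsInfo and actInfo specification objects and transitionFcnAppx, rwdFcnAppx, and isDoneFcnAppx approximator objects already exist in the workspace, and the variable names are placeholders.
% Sketch: assumes obsInfo, actInfo, transitionFcnAppx, rwdFcnAppx, and
% isDoneFcnAppx already exist in the workspace; names are placeholders.
env = rlNeuralNetworkEnvironment(obsInfo,actInfo, ...
    transitionFcnAppx,rwdFcnAppx,isDoneFcnAppx);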
Creation
Syntax
rwdFcnAppx = rlContinuousDeterministicRewardFunction(net,observationInfo,actionInfo,Name=Value)
Description
rwdFcnAppx = rlContinuousDeterministicRewardFunction(net,observationInfo,actionInfo,Name=Value) creates the deterministic reward function approximator object rwdFcnAppx using the deep neural network net and sets the ObservationInfo and ActionInfo properties.
When creating a reward function, you must specify the names of the deep neural network inputs using one of the following combinations of name-value pair arguments (see the sketch after this list):
ObservationInputNames, ActionInputNames, and NextObservationInputNames
ObservationInputNames and ActionInputNames
ActionInputNames and NextObservationInputNames
NextObservationInputNames
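For example, the following minimal sketch builds a network that predicts the reward from observations, actions, and next observations, and then creates the approximator object. The specification dimensions, layer sizes, and input layer names (obs, action, and nextObs) are assumptions chosen for illustration.
% Assumed observation and action specifications.
obsInfo = rlNumericSpec([4 1]);
actInfo = rlNumericSpec([1 1]);
% One input path per input channel; the layer names are arbitrary and
% are referenced again when creating the approximator object.
obsPath = featureInputLayer(obsInfo.Dimension(1),Name="obs");
actPath = featureInputLayer(actInfo.Dimension(1),Name="action");
nextObsPath = featureInputLayer(obsInfo.Dimension(1),Name="nextObs");
commonPath = [
    concatenationLayer(1,3,Name="concat")
    fullyConnectedLayer(32)
    reluLayer
    fullyConnectedLayer(1)];
% Assemble the paths and convert to a dlnetwork object.
net = layerGraph(obsPath);
net = addLayers(net,actPath);
net = addLayers(net,nextObsPath);
net = addLayers(net,commonPath);
net = connectLayers(net,"obs","concat/in1");
net = connectLayers(net,"action","concat/in2");
net = connectLayers(net,"nextObs","concat/in3");
net = dlnetwork(net);
% Match the name-value arguments to the input layer names.
rwdFcnAppx = rlContinuousDeterministicRewardFunction(net, ...
    obsInfo,actInfo, ...
    ObservationInputNames="obs", ...
    ActionInputNames="action", ...
    NextObservationInputNames="nextObs");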
You can also specify the UseDevice property using an optional name-value pair argument. For example, to use a GPU for prediction, specify UseDevice="gpu".
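For example, reusing the assumed network and layer names from the sketch above, a call like the following would place prediction on a GPU (this requires a supported GPU device and Parallel Computing Toolbox).
% Sketch: same assumed net, obsInfo, actInfo, and layer names as above,
% with prediction placed on a GPU.
rwdFcnAppx = rlContinuousDeterministicRewardFunction(net, ...
    obsInfo,actInfo, ...
    ObservationInputNames="obs", ...
    ActionInputNames="action", ...
    NextObservationInputNames="nextObs", ...
    UseDevice="gpu");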
Input Arguments
Properties
Object Functions
rlNeuralNetworkEnvironment | Environment model with deep neural network transition models
Examples
Version History
Introduced in R2022a