rlContinuousGaussianRewardFunction
Stochastic Gaussian reward function approximator object for neural network-based environment
Since R2022a
Description
When creating a neural network-based environment using rlNeuralNetworkEnvironment, you can specify the reward function approximator using an rlContinuousGaussianRewardFunction object. Do so when you do not know a ground-truth reward signal for your environment and you expect the reward signal to be stochastic.
The reward function object uses a deep neural network as an internal approximation model to predict the reward signal for the environment given one of the following input combinations:
Observations, actions, and next observations
Observations and actions
Actions and next observations
Next observations
To specify a deterministic reward function approximator, use an rlContinuousDeterministicRewardFunction object.
Creation
Description
rwdFcnAppx = rlContinuousGaussianRewardFunction(net,observationInfo,actionInfo,Name=Value) creates the stochastic reward function approximator object rwdFcnAppx using the deep neural network net, and sets the ObservationInfo and ActionInfo properties.
When creating a reward function you must specify the names of the deep neural network inputs using one of the following combinations of name-value pair arguments:
ObservationInputNames, ActionInputNames, and NextObservationInputNames
ObservationInputNames and ActionInputNames
ActionInputNames and NextObservationInputNames
NextObservationInputNames
You must also specify the names of the deep neural network outputs using the RewardMeanOutputName and RewardStandardDeviationOutputName name-value pair arguments.
You can also specify the UseDevice property using an optional name-value pair argument. For example, to use a GPU for prediction, specify UseDevice="gpu". A construction sketch follows.
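For example, the following minimal sketch builds a network that takes observations, actions, and next observations and returns the mean and standard deviation of the reward, then constructs the approximator. The specification objects, layer sizes, and layer names (obs, action, nextObs, rewardMean, rewardStdDev) are illustrative assumptions rather than values required by this object, and the network is assumed to be supplied as a dlnetwork object.

% Observation and action specifications (placeholder dimensions).
obsInfo = rlNumericSpec([4 1]);
actInfo = rlNumericSpec([1 1]);

% One input path per network input.
obsPath  = featureInputLayer(obsInfo.Dimension(1),Name="obs");
actPath  = featureInputLayer(actInfo.Dimension(1),Name="action");
nobsPath = featureInputLayer(obsInfo.Dimension(1),Name="nextObs");

% Common body followed by two output heads (reward mean and standard deviation).
commonPath = [
    concatenationLayer(1,3,Name="concat")
    fullyConnectedLayer(32)
    reluLayer(Name="body")
    ];
meanHead = fullyConnectedLayer(1,Name="rewardMean");
stdHead  = [
    fullyConnectedLayer(1,Name="fcStd")
    softplusLayer(Name="rewardStdDev")  % keeps the predicted standard deviation positive
    ];

% Assemble the network graph and convert it to a dlnetwork object.
lgraph = layerGraph(commonPath);
lgraph = addLayers(lgraph,obsPath);
lgraph = addLayers(lgraph,actPath);
lgraph = addLayers(lgraph,nobsPath);
lgraph = addLayers(lgraph,meanHead);
lgraph = addLayers(lgraph,stdHead);
lgraph = connectLayers(lgraph,"obs","concat/in1");
lgraph = connectLayers(lgraph,"action","concat/in2");
lgraph = connectLayers(lgraph,"nextObs","concat/in3");
lgraph = connectLayers(lgraph,"body","rewardMean");
lgraph = connectLayers(lgraph,"body","fcStd");
net = dlnetwork(lgraph);

% Create the stochastic reward function approximator.
rwdFcnAppx = rlContinuousGaussianRewardFunction(net,obsInfo,actInfo, ...
    ObservationInputNames="obs", ...
    ActionInputNames="action", ...
    NextObservationInputNames="nextObs", ...
    RewardMeanOutputName="rewardMean", ...
    RewardStandardDeviationOutputName="rewardStdDev");

Ending the standard-deviation head with a softplus layer is one design choice for guaranteeing a positive value; any output transformation that enforces positivity would serve the same purpose.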
Input Arguments
Properties
Object Functions
rlNeuralNetworkEnvironment | Environment model with deep neural network transition models
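As a usage sketch, the reward function approximator can then be passed to rlNeuralNetworkEnvironment together with transition and is-done approximators. Here tsnFcnAppx and isDoneFcnAppx are hypothetical placeholder objects built on their own reference pages, and the argument order is assumed to follow the rlNeuralNetworkEnvironment constructor.

% tsnFcnAppx and isDoneFcnAppx are placeholder transition and is-done
% approximator objects; see rlNeuralNetworkEnvironment for how to build them.
env = rlNeuralNetworkEnvironment(obsInfo,actInfo,tsnFcnAppx,rwdFcnAppx,isDoneFcnAppx);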
Examples
Version History
Introduced in R2022a