compute confidence intervals for estimated parameters (requires statistics and machine learning toolbox) -凯发k8网页登录
compute confidence intervals for estimated parameters (requires statistics and machine learning toolbox)
description
computes 95% confidence intervals for the estimated parameters from
ci
= sbioparameterci(fitresults
)fitresults
, an or returned by the function.
ci
is a object that contains the computed
confidence intervals.
uses additional options specified by one or more ci
= sbioparameterci(fitresults
,name,value
)name,value
pair arguments.
examples
compute confidence intervals for estimated pk parameters and model predictions
this example uses:
load data
load the sample data to fit. the data is stored as a table with variables id , time , centralconc , and peripheralconc. this synthetic data represents the time course of plasma concentrations measured at eight different time points for both central and peripheral compartments after an infusion dose for three individuals.
load data10_32r.mat gdata = groupeddata(data); gdata.properties.variableunits = {'','hour','milligram/liter','milligram/liter'}; sbiotrellis(gdata,'id','time',{'centralconc','peripheralconc'},'marker',' ',... 'linestyle','none');
create model
create a two-compartment model.
pkmd = pkmodeldesign; pkc1 = addcompartment(pkmd,'central'); pkc1.dosingtype = 'infusion'; pkc1.eliminationtype = 'linear-clearance'; pkc1.hasresponsevariable = true; pkc2 = addcompartment(pkmd,'peripheral'); model = construct(pkmd); configset = getconfigset(model); configset.compileoptions.unitconversion = true;
define dosing
define the infusion dose.
dose = sbiodose('dose','targetname','drug_central'); dose.starttime = 0; dose.amount = 100; dose.rate = 50; dose.amountunits = 'milligram'; dose.timeunits = 'hour'; dose.rateunits = 'milligram/hour';
define parameters
define the parameters to estimate. set the parameter bounds for each parameter. in addition to these explicit bounds, the parameter transformations (such as log, logit, or probit) impose implicit bounds.
responsemap = {'drug_central = centralconc','drug_peripheral = peripheralconc'}; paramstoestimate = {'log(central)','log(peripheral)','q12','cl_central'}; estimatedparam = estimatedinfo(paramstoestimate,... 'initialvalue',[1 1 1 1],... 'bounds',[0.1 3;0.1 10;0 10;0.1 2]);
fit model
perform an unpooled fit, that is, one set of estimated parameters for each patient.
unpooledfit = sbiofit(model,gdata,responsemap,estimatedparam,dose,'pooled',false);
perform a pooled fit, that is, one set of estimated parameters for all patients.
pooledfit = sbiofit(model,gdata,responsemap,estimatedparam,dose,'pooled',true);
compute confidence intervals for estimated parameters
compute 95% confidence intervals for each estimated parameter in the unpooled fit.
ciparamunpooled = sbioparameterci(unpooledfit);
display results
display the confidence intervals in a table format. for details about the meaning of each estimation status, see .
ci2table(ciparamunpooled)
ans = 12x7 table group name estimate confidenceinterval type alpha status _____ ______________ ________ __________________ ________ _____ ___________ 1 {'central' } 1.422 1.1533 1.6906 gaussian 0.05 estimable 1 {'peripheral'} 1.5629 0.83143 2.3551 gaussian 0.05 constrained 1 {'q12' } 0.47159 0.20093 0.80247 gaussian 0.05 constrained 1 {'cl_central'} 0.52898 0.44842 0.60955 gaussian 0.05 estimable 2 {'central' } 1.8322 1.7893 1.8751 gaussian 0.05 success 2 {'peripheral'} 5.3368 3.9133 6.7602 gaussian 0.05 success 2 {'q12' } 0.27641 0.2093 0.34351 gaussian 0.05 success 2 {'cl_central'} 0.86034 0.80313 0.91755 gaussian 0.05 success 3 {'central' } 1.6657 1.5818 1.7497 gaussian 0.05 success 3 {'peripheral'} 5.5632 4.7557 6.3708 gaussian 0.05 success 3 {'q12' } 0.78361 0.65581 0.91142 gaussian 0.05 success 3 {'cl_central'} 1.0233 0.96375 1.0828 gaussian 0.05 success
plot the confidence intervals. if the estimation status of a confidence interval is success
, it is plotted in blue (the first default color). otherwise, it is plotted in red (the second default color), which indicates that further investigation into the fitted parameters may be required. if the confidence interval is not estimable
, then the function plots a red line with a centered cross. if there are any transformed parameters with estimated values 0 (for the log transform) and 1 or 0 (for the probit or logit transform), then no confidence intervals are plotted for those parameter estimates. to see the color order, type get(groot,'defaultaxescolororder')
.
groups are displayed from left to right in the same order that they appear in the groupnames
property of the object, which is used to label the x-axis. the y-labels are the transformed parameter names.
plot(ciparamunpooled)
compute the confidence intervals for the pooled fit.
ciparampooled = sbioparameterci(pooledfit);
display the confidence intervals.
ci2table(ciparampooled)
ans = 4x7 table group name estimate confidenceinterval type alpha status ______ ______________ ________ __________________ ________ _____ ___________ pooled {'central' } 1.6626 1.3287 1.9965 gaussian 0.05 estimable pooled {'peripheral'} 2.687 0.89848 4.8323 gaussian 0.05 constrained pooled {'q12' } 0.44956 0.11445 0.85152 gaussian 0.05 constrained pooled {'cl_central'} 0.78493 0.59222 0.97764 gaussian 0.05 estimable
plot the confidence intervals. the group name is labeled as "pooled" to indicate such fit.
plot(ciparampooled)
plot all the confidence interval results together. by default, the confidence interval for each parameter estimate is plotted on a separate axes. vertical lines group confidence intervals of parameter estimates that were computed in a common fit.
ciall = [ciparamunpooled;ciparampooled]; plot(ciall)
you can also plot all confidence intervals in one axes grouped by parameter estimates using the 'grouped' layout.
plot(ciall,'layout','grouped')
in this layout, you can point to the center marker of each confidence interval to see the group name. each estimated parameter is separated by a vertical black line. vertical dotted lines group confidence intervals of parameter estimates that were computed in a common fit. parameter bounds defined in the original fit are marked by square brackets. note the different scales on the y-axis due to parameter transformations. for instance, the y-axis of q12
is in the linear scale, but that of central
is in the log scale due to its log transform.
compute confidence intervals for model predictions
calculate 95% confidence intervals for the model predictions, that is, simulation results using the estimated parameters.
% for the pooled fit cipredpooled = sbiopredictionci(pooledfit); % for the unpooled fit cipredunpooled = sbiopredictionci(unpooledfit);
plot confidence intervals for model predictions
the confidence interval for each group is plotted in a separate column, and each response is plotted in a separate row. confidence intervals limited by the bounds are plotted in red. confidence intervals not limited by the bounds are plotted in blue.
plot(cipredpooled)
plot(cipredunpooled)
input arguments
fitresults
— parameter estimation results from sbiofit
nlinresults
object | optimresults
object | vector
parameter estimation results from , specified as an , , or a vector of objects for unpooled fits
that were returned from the same sbiofit
call.
name-value arguments
specify optional pairs of arguments as
name1=value1,...,namen=valuen
, where name
is
the argument name and value
is the corresponding value.
name-value arguments must appear after other arguments, but the order of the
pairs does not matter.
before r2021a, use commas to separate each name and value, and enclose
name
in quotes.
example: 'alpha',0.01,'type','profilelikelihood'
specifies to
compute a 99% confidence interval using the profile likelihood
approach.
depending on the type of confidence interval, the compatible name-value arguments differ. the table below lists all the name-value arguments and their corresponding confidence interval types. a check mark (✔) indicates that the name-value argument is applicable for that type.
name-value argument | gaussian (default) | optimization-based profile likelihood | integration-based profile likelihood | bootstrap |
---|---|---|---|---|
alpha | ✔ | ✔ | ✔ | ✔ |
type | ✔ | ✔ | ✔ | ✔ |
display | ✔ | ✔ | ✔ | ✔ |
useparallel | ✔ | ✔ | ✔ | ✔ |
numsamples | ✔ | |||
tolerance | ✔ | ✔ | ✔ | |
parameters | ✔ | ✔ | ||
maxstepsize | ✔ | ✔ | ||
useintegration | ✔ | ✔ | ||
integrationoptions | ✔ |
alpha
— confidence level
0.05 (default) | positive scalar
confidence level, (1-alpha) * 100%
, specified as the
comma-separated pair consisting of 'alpha'
and a positive scalar
between 0 and 1. the default value is 0.05
, meaning a 95%
confidence interval is computed.
example: 'alpha',0.01
type
— confidence interval type
'gaussian'
(default) | 'profilelikelihood'
| 'bootstrap'
confidence interval type, specified as the comma-separated pair consisting of
'type'
and a character vector. the valid choices are:
'gaussian'
— use the gaussian approximation of the distribution of parameter estimates.'profilelikelihood'
— compute the profile likelihood intervals. the function has two methods to compute profile likelihood curves. by default, the function uses the optimization-based method. to use the integration-based method, you must also set'useintegration'
totrue
.the optimization-based method fixes one parameter value at a time and reruns an optimization to compute the maximum likelihood. this optimization is done for every parameter and every point on the curve of the profile likelihood. the integration-based method is based on integrating the differential equations derived from the lagrange equations of the optimization-based method. for details about these two methods, see profile likelihood confidence interval calculation.
note
this type is not supported for parameter estimates from hierarchical models, that is, estimated results from (such as age or sex). in other words, if you set the
categoryvariablename
property of the in your original fit, then the fit results are hierarchical and you cannot compute theprofilelikelihood
confidence intervals on the results.'bootstrap'
— compute confidence intervals using the bootstrap method.
example: 'type','bootstrap'
display
— level of display returned to the command line
'off'
(default) | 'none'
| 'final'
| 'iter'
level of display returned to the command line, specified as the comma-separated
pair consisting of 'display'
and a character vector.
'off'
(default) or 'none'
displays no
output. 'final'
displays a message when a computation finishes.
'iter'
displays output at each iteration.
example: 'display','final'
useparallel
— logical flag to compute confidence intervals in parallel
true
| false
logical flag to compute confidence intervals in parallel, specified as the
comma-separated pair consisting of 'useparallel'
and
true
or false
. by default, the parallel
options in the original fit are used. if this argument is set to
true
and parallel computing toolbox™ is available, the parallel options in the original fit are ignored,
and confidence intervals are computed in parallel.
for the gaussian confidence intervals:
if the input
fitresults
is a vector of results objects, then the computation of confidence intervals for each object is performed in parallel. the gaussian confidence intervals are quick to compute. so, it might be more beneficial to parallelize the original fit (sbiofit
) and not setuseparallel
to true forsbioparameterci
.
for the profile likelihood confidence intervals:
if the number of results objects in the input
fitresults
vector is greater than the number of estimated parameters, then the computation of confidence intervals for each object is performed in parallel.otherwise, the confidence intervals for all estimated parameters within one results object are computed in parallel before the function moves on to the next results object.
for the bootstrap confidence intervals:
the function forwards the
useparallel
flag tobootci
. there is no parallelization over the input vector of results objects.
note
if you have a global stream for random number generation with several
substreams to compute in parallel in a reproducible fashion,
sbioparameterci
first checks to see if the number
of workers is same as the number of substreams. if so,
sbioparameterci
sets
usesubstreams
to true
in the
statset
option and passes it to (statistics and machine learning toolbox). otherwise, the
substreams are ignored by default.
example: 'useparallel',true
numsamples
— number of samples for bootstrapping
1000 (default) | positive integer
number of samples for bootstrapping, specified as the comma-separated pair
consisting of 'numsamples'
and a positive integer. this number
defines the number of fits that are performed during the confidence interval
computation to generate bootstrap samples. the smaller the number is, the faster the
computation of the confidence intervals becomes, at the cost of decreased
accuracy.
example: 'numsamples',500
tolerance
— tolerance for profile likelihood and bootstrap confidence interval computations
1e-5
(default) | positive scalar
tolerance for the profile likelihood and bootstrap confidence interval
computations, specified as the comma-separated pair consisting of
'tolerance'
and a positive scalar.
the profile likelihood method uses this value as a termination tolerance. for details, see profile likelihood confidence interval calculation.
the bootstrap method uses this value to determine whether a confidence interval is constrained by bounds specified in the original fit. for details, see bootstrap confidence interval calculation.
example: 'tolerance',1e-6
parameters
— names of parameters for which profile likelihood curves are calculated
character vector | string | string vector | cell array of character vectors
names of parameters for which the profile likelihood curves are calculated,
specified as a character vector, string, string vector, or cell array of character
vectors. by default, the function computes the confidence intervals for all
parameters listed in the property of the
fitresults
object. you can also specify a subset of those
parameters if needed.
note
this name-value argument is applicable only when you specify
type
as 'profilelikelihood'
.
example: 'parameters',{'ka'}
maxstepsize
— maximum step size used for computing profile likelihood curves
positive scalar | []
| cell array
maximum step size used for computing profile likelihood curves, specified as the
comma-separated pair consisting of 'maxstepsize'
and a positive
scalar, []
, or cell array.
for the optimization-based method, the default value is
0.1
. if you set'maxstepsize'
to[]
, then the maximum step size is set to 10% of the width of the gaussian approximation of the confidence interval, if it exists. you can specify a maximum step size (or[]
) for each estimated parameter using a cell array.for the integration-based method, the default value is
inf
. internally, the function uses theode15s
solver.
example: 'maxstepsize',0.5
useintegration
— flag to use integration-based profile likelihood confidence interval method
false
(default) | true
flag to use the integration-based profile likelihood confidence interval method,
specified as true
or false
. the
integration-based method integrates differential equations derived from the lagrange
equations. by default, the function uses the optimization-based method. for details
about these two methods, see profile likelihood confidence interval calculation.
example: 'useintegration',true
integrationoptions
— options for integration-based profile likelihood confidence interval method
structure
options for the integration-based profile likelihood confidence interval method, specified as a structure. specify options as fields of the structure as follows.
field name | field value description |
---|---|
hessian |
|
correctionfactor | nonnegative scalar. the default value is 0. |
absolutetolerance | positive scalar for the step size control in
ode15s . the default value is
1e-2 . |
relativetolerance | positive scalar less than 1 for the step size
control in ode15s . the default value is
1e-2 . |
initialstepsize | positive scalar as the initial step size for solving the
differential equations. if a parameter is bounded, the function uses
the default initial step size of . if not, it
uses 1e-4 . |
output arguments
ci
— confidence interval results
parameterconfidenceinterval
object
confidence interval results, returned as a
parameterconfidenceinterval
object. for an unpooled
fit, ci
can be a vector of
parameterconfidenceinterval
objects.
more about
gaussian confidence interval calculation
the function uses the wald test statistic [1] to compute the
confidence intervals. assuming that there are enough data, the parameter estimates,
pest, are approximately student's
t-distributed with the covariance matrix s (the
covariancematrix
property of the results object) returned
by sbiofit
.
the confidence interval for the ith parameter estimate pest,i is computed as follows:
, where tinv is the
student's t inverse cumulative distribution function ( (statistics and machine learning toolbox)) with the probability 1-(alpha/2)
, and
si,i is the diagonal element
(variance) of the covariance matrix s.
in cases where the confidence interval is constrained by the parameter bounds defined in the original fit, the confidence interval bounds are adjusted according to the approach described by wu, h. and neale, m. [2].
for each parameter estimate, the function first decides whether the confidence interval of the parameter estimate is unbounded. if so, the function sets the estimation status of the corresponding parameter estimate to
not estimable
.otherwise, if the confidence interval for a parameter estimate is constrained by a parameter bound defined in the original fit, the function sets the estimation status to
constrained
. parameter transformations (such aslog
,probit
, orlogit
) impose implicit bounds on the estimated parameters, for example, positivity constraints. such bounds can lead to the overestimation of confidence, that is, the confidence interval can be smaller than expected.if no confidence interval has the estimation status
not estimable
orconstrained
, then the function sets the estimation statuses of all parameter estimates tosuccess
. otherwise, the estimation statuses of remaining parameter estimates are set toestimable
.
profile likelihood confidence interval calculation
define l to be the likelihood,
lh, of the parameter estimates (stored in the
parameterestimates
property of the results object) returned
by sbiofit
, , where pest is a vector
of parameter estimates, pest,1,
pest,2, …,
pest,n.
the profile likelihood function pl for a parameter pi is defined as , where n is the total number of parameters.
per wilks's theorem [3], the likelihood ratio test statistic, , is chi-square distributed with 1 degree of freedom.
therefore, find all pi so that: .
equivalently, , where is the target value used in computing the log profile likelihood curve. the function provides two methods to compute such curve.
start at pest,i and evaluate the likelihood l.
compute the log profile likelihood at pest,i
k * maxstepsize
for each side (or direction) of the confidence interval, that is,k = 1, 2, 3,…
andk = -1, -2, -3,…
.stop if one of these stopping criteria is met on each side.
the log profile likelihood falls below the target value. in this case, start bisecting between pbelow and pabove, where pbelow is the parameter value with the largest log profile likelihood value below the target value, and pabove the parameter value with the smallest log profile likelihood value greater than the target value. stop the bisection if one of the following is true:
either neighboring log profile likelihood values are less than tolerance apart. set the status for the corresponding side of the confidence interval to
success
.the bisection interval becomes smaller than
max(tolerance,2*eps('double'))
and the profile likelihood curve computed so far is above the target value. set the status of the corresponding side tonot estimable
.the linear gradient approximation of the profile likelihood curve (finite difference between two neighboring parameter values) is larger than -tolerance (the negative value of the tolerance). set the status of the corresponding side to
not estimable
.
the step is limited by a bound defined in the original fit. evaluate at the bound and set the status of the corresponding side to
constrained
.
this method [4] solves the constrained optimization problem by integrating the differential equations derived from the lagrange equations
here, is the ith
canonical unit vector, the lagrange multiplier is , and c =
pi
.
in other words, instead of optimizing point by point, this method solves differential equations that define the profile likelihood curve as follows.
here, is the hessian of the log likelihood function.
using the finite-difference approximation of the hessian matrix is recommended. however, the numerical computation of the hessian matrix using finite differencing can be computationally expensive. to reduce the computational costs, chen and jennrich [4] proposed an approximate version based on the assumption that the second-order sufficient karush-kuhn-tucker conditions must hold with strict inequality at every point in the domain of the profile likelihood curve as outlined in assumption 2 in the appendix of [4]. in other words, at every point on the profile likelihood curve, the remaining parameters must be estimable.
if this assumption holds, then the hessian can be replaced with the identity matrix i as follows:
here, is the gradient of the log likelihood and γ is a correction factor to ensure the solution of the differential equation stays on the path of the profile likelihood curve.
if γ is too small, the approximation of the profile likelihood curve may become inaccurate, resulting in an underestimation of the profile likelihood confidence intervals. setting γ to a large value ensures accurate results, but might require to take smaller steps, which increases the computational cost.
tip
you can specify the hessian approximation and correction factor using
the integrationoptions
name-value argument.
the stopping criterion of the algorithm is when one of the following conditions becomes true:
the gradient approximation of the profile likelihood curve is larger than -tolerance.
the profile likelihood falls below the target value.
a parameter bound is reached.
if both sides of the confidence interval are unsuccessful, that is, have the status
not estimable
, the function sets the estimation status () tonot estimable
.if no side has the status
not estimable
and one side has the statusconstrained
, the function sets the estimation status () toconstrained
.if the computation for all parameters on both sides of the confidence intervals is successful, set the estimation status () to
success
.otherwise, the function sets the estimation statuses of the remaining parameter estimates to
estimable
.
bootstrap confidence interval calculation
the (statistics and machine learning toolbox) function from statistics and machine learning toolbox™ is used to compute the bootstrap confidence intervals. the first input
nboot is the number of samples
(numsamples
), and the second input bootfun is
a function that performs these actions:
resample the data (independently within each group, if multiple groups are available).
run a parameter fit with the resampled data.
return the estimated parameters.
if a confidence interval is closer than tolerance
to a
parameter bound, as defined in the original fit, the function sets the
estimation status to constrained
. if all confidence intervals
are further away from the parameter bounds than tolerance
,
the function sets the status to success
. otherwise, it is set
to estimable
.
references
[1] wald, a. "tests of statistical hypotheses concerning several parameters when the number of observations is large." transactions of the american mathematical society. 54 (3), 1943, pp. 426-482.
[2] wu, h., and m.c. neale. "adjusted confidence intervals for a bounded parameter." behavior genetics. 42 (6), 2012, pp. 886-898.
[3] wilks, s.s. "the large-sample distribution of the likelihood ratio for testing composite hypotheses." the annals of mathematical statistics. 9 (1), 1938, pp. 60–62.
[4] chen, jian-shen, and robert i. jennrich. “simple accurate approximation of likelihood profiles.” journal of computational and graphical statistics 11, no. 3 (september 2002): 714–32.
extended capabilities
automatic parallel support
accelerate code by automatically running computation in parallel using parallel computing toolbox™.
to run in parallel, set 'useparallel'
to true
.
for more information, see the 'useparallel'
name-value pair argument.
version history
introduced in r2017b
see also
| | |
打开示例
您曾对此示例进行过修改。是否要打开带有您的编辑的示例?
matlab 命令
您点击的链接对应于以下 matlab 命令:
请在 matlab 命令行窗口中直接输入以执行命令。web 浏览器不支持 matlab 命令。
select a web site
choose a web site to get translated content where available and see local events and offers. based on your location, we recommend that you select: .
you can also select a web site from the following list:
how to get best site performance
select the china site (in chinese or english) for best site performance. other mathworks country sites are not optimized for visits from your location.
americas
- (español)
- (english)
- (english)
europe
- (english)
- (english)
- (deutsch)
- (español)
- (english)
- (français)
- (english)
- (italiano)
- (english)
- (english)
- (english)
- (deutsch)
- (english)
- (english)
- switzerland
- (english)
asia pacific
- (english)
- (english)
- (english)
- 中国
- (日本語)
- (한국어)