LDHFITR.DOC VERS:- 01.00 DATE:- 09/26/86 TIME:- 10:02:47 PM
description of computer reduction of data from kinetic
measurements for the enzyme lactate dehydrogenase
information on how to run program LDHFIT
By J. A. Rupley, Tucson, Arizona
NOTES ON DATA REDUCTION BY COMPUTER
AND INFORMATION ON RUNNING THE PROGRAM LDHFIT
INTRODUCTION:
In order to obtain conclusions from quantitative measurements,
there must be some form of data reduction. This can be as simple
as a comparison by eye of two curves drawn through the data. If,
however, the data set is large and complex, for example with more
than one independent variable, and if the questions posed are
detailed or involve a complicated nonlinear model, then visual or
graphical methods are less satisfactory than a computer-based
analysis. Procedures of the latter type are now widely used.
This laboratory is a short introduction to data reduction by use
of a computer. The intent is to show that a sophisticated
computer program can be handled easily, that its use saves time
and effort, that it can treat a more complicated model than can
be treated graphically, and that it produces information, such as
estimates of uncertainties in the parameters, that is difficult or
impossible to obtain from graphical methods.
The data to be analyzed are initial rate measurements made
on the lactate dehydrogenase catalyzed reaction of pyruvate with
NADH, in the presence and absence of lactate as inhibitor. The
results of the computer fit are the following:
(1) Values of the kinetic constants V, KmA, KmB, KmAB, KmQ/KmPQ,
and KBInhib. The first five constants are those that can be
evaluated by the standard graphical methods of primary and
secondary reciprocal plots. The constant KBInhib is the
dissociation constant for the dead-end complex LDH-NADH-lactate,
which is included in the mechanism fit by the computer program
but cannot be included in the mechanism on which the graphical
methods are based.
(2) Estimates of the standard deviations of the kinetic constants.
These are needed for an understanding of the reliability and
significance of the values calculated for the kinetic constants.
(3) A list of the coordinates of points suitable for construction
of the lines of the reciprocal plots of the standard graphical
methods.
THEORY:
A. REMARKS ON FITTING OF A MODEL TO DATA
In a typical data reduction, a particular model to be tested is
fit to a set of data points under some criterion for best fit.
The ith data point of a set of N data points consists of a single
value for the dependent variable Yobserved(i) measured for
corresponding single values for the one or more independent
variables Xobserved(i). The commonly used least squares criterion
for quality of fit is the minimum value of the sum of the squares
of the deviations between the observed values of Y and the values
of Y calculated according to the model being tested.
Working from the model to be fit to the data, one develops an
equation relating, for each of the N data points, the dependent
variable Y to the independent variables X and to a set of M
variable parameters p:
    Ymodel(i) = F(Xobserved(i); p(j), j=1,M)               eq. 1
For example, if the model predicts a linear relationship between
Y and a single independent variable X :
    Ymodel(i) = p(1) + p(2) * Xobserved(i)                 eq. 2
The constants p(1) and p(2) of equation (2) are the Y axis
intercept and the slope, respectively, and of course are the same
for all data points (for all pairs of values Y(i) and X(i)).
The fitting of a model to data consists of finding the values of
the M variable parameters p that give the best agreement between
the N pairs of values of Ymodel(i) and Yobserved(i). Best
agreement can be defined as the minimum value of the least
squares function y:
         N
    y = SUM (Yobserved(i) - Ymodel(i))^2 * W(i)            eq. 3
        i=1
The factor W(i) of equation (3) is the normalized reciprocal
variance (the statistical weighting) of the ith data point, and
it can be set at unity if the data points are all of equal
estimated uncertainty.
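
As an illustration only, equations (2) and (3) can be sketched in a
few lines of Python. The function names and the data values below
are made up for this example and are not taken from LDHFIT; the
parameters p(1) and p(2) are stored as p[0] and p[1] because Python
lists are indexed from zero.

```python
def linear_model(x, p):
    """Equation (2): Ymodel = p(1) + p(2) * X, with p stored 0-indexed."""
    return p[0] + p[1] * x


def least_squares(p, x_obs, y_obs, weights=None, model=linear_model):
    """Equation (3): weighted sum of squared deviations for parameters p."""
    if weights is None:
        # Equal estimated uncertainty for every point: W(i) = 1.
        weights = [1.0] * len(y_obs)
    return sum(w * (y - model(x, p)) ** 2
               for x, y, w in zip(x_obs, y_obs, weights))


# Made-up illustrative data scattered about the line Y = 1 + 2 * X.
x_obs = [0.0, 1.0, 2.0, 3.0]
y_obs = [1.1, 2.9, 5.2, 6.8]

# Deviations are 0.1, -0.1, 0.2, -0.2, so the sum of squares is about 0.10.
print(least_squares([1.0, 2.0], x_obs, y_obs))
```

Passing a list of weights W(i) in place of the default gives less
influence to points of larger estimated uncertainty.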
Combining equations (1) and (3), one sees that the least squares
function y of equation (3) is a function of the full set of N
data points and a set of M variable parameters:
    y = f(Yobserved(i), Xobserved(i), i=1,N; p(j), j=1,M)  eq. 4
The fitting problem therefore consists of finding the minimum
value of the least squares function y, which for a given set of
data depends only on the M variable parameters p (the data points
Yobserved(i), Xobserved(i) in equation (4) are constant in the
fitting). There are several methods commonly used to find the
minimum of y and thus evaluate the best fit values of the
parameters p. The more useful of these can handle nonlinear model
functions F (equation (1)) of arbitrary mathematical form. The
rate law for lactate dehydrogenase is an example of a nonlinear
model function.
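
The point that y depends only on the parameters once the data are
fixed can be sketched as follows. The model used is the simple
one-substrate hyperbola v = V * S / (Km + S), chosen only as a
familiar nonlinear example; it is not the lactate dehydrogenase
rate law fit by LDHFIT, and the names make_objective and
hyperbolic_rate are hypothetical.

```python
def hyperbolic_rate(s, p):
    """Illustrative nonlinear model: v = V * S / (Km + S), p = [V, Km]."""
    v_max, km = p
    return v_max * s / (km + s)


def make_objective(model, x_obs, y_obs, weights=None):
    """Freeze the data and return y as a function of the parameters only."""
    if weights is None:
        weights = [1.0] * len(y_obs)

    def y(p):
        # Equation (3), with the observed points held constant.
        return sum(w * (yo - model(xo, p)) ** 2
                   for xo, yo, w in zip(x_obs, y_obs, weights))

    return y


# Synthetic, noise-free data generated from V = 10, Km = 1.
s_obs = [0.5, 1.0, 2.0, 4.0]
v_obs = [hyperbolic_rate(s, [10.0, 1.0]) for s in s_obs]

y = make_objective(hyperbolic_rate, s_obs, v_obs)
print(y([10.0, 1.0]))  # zero at the generating parameters
print(y([8.0, 2.0]))   # larger for any other parameter values
```

Any minimization method can then be applied to y without knowing
the mathematical form of the model function.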
In the simplex method used here, one constructs an M dimensional
polyhedron with M + 1 vertices (the simplex). Each dimension of
the simplex corresponds to a variable parameter of equation (4).
Each vertex of the simplex is a point in the M dimensional space,
which is called "parameter space" or "factor space." The M
coordinates of each vertex are values of the M parameters. Thus
each vertex of the simplex has an associated value of the least
squares function y. The starting simplex is constructed to be so
large as to include within it the point corresponding to the
minimum value of y. This minimum point has as its coordinates
the best fit values of the parameters.
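
One common way to build a starting simplex, shown here as an
assumption rather than as the construction used by LDHFIT, is to
take a starting guess as one vertex and to offset one parameter at
a time to obtain the other M vertices. The function name, the step
sizes, and the stand-in function y are hypothetical.

```python
def starting_simplex(p0, steps):
    """Return M + 1 vertices: p0, plus p0 offset along each parameter."""
    vertices = [list(p0)]
    for j, step in enumerate(steps):
        vertex = list(p0)
        vertex[j] += step  # move along parameter j only
        vertices.append(vertex)
    return vertices


def y(p):
    """Stand-in least squares function with its minimum at (1, 2)."""
    return (p[0] - 1.0) ** 2 + (p[1] - 2.0) ** 2


# M = 2 parameters, so the simplex is a triangle with 3 vertices.
simplex = starting_simplex([0.0, 0.0], [5.0, 5.0])
values = [y(vertex) for vertex in simplex]

for vertex, value in zip(simplex, values):
    print(vertex, value)
```

With these steps the triangle (0,0), (5,0), (0,5) encloses the
minimum point (1,2), as the text requires of a starting simplex.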
The minimization process shrinks the simplex about the minimum
point, even though the coordinates of the minimum are not known
beforehand, until the vertices of the simplex are so close
together and so nearly equal that an exit test is satisfied. The
exit test is set so that a desired level of accuracy is obtained.
The values of the M parameters averaged over all the vertices, i.e.
the parameter values for the centroid of the simplex, serve as
reliable estimates of the best fit parameter values (those for
the least squares function minimum), because the minimum point is
known to be inside the shrunken simplex and thus near the
centroid.
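
The shrinking process can be sketched with the widely used
Nelder-Mead form of the simplex method. This is a minimal sketch
under stated assumptions: the reflection, expansion, contraction
and shrink moves, the form of the exit test, and the tolerances
tol and xtol are standard textbook choices and are not taken from
the LDHFIT source.

```python
def nelder_mead(y, simplex, tol=1e-12, xtol=1e-6, max_iter=2000):
    """Shrink a simplex (list of M + 1 vertices) about a minimum of y.

    Returns (centroid, vertices, values) for the final simplex.
    """
    m = len(simplex) - 1
    verts = [list(v) for v in simplex]
    vals = [y(v) for v in verts]
    for _ in range(max_iter):
        # Order the vertices from best (lowest y) to worst (highest y).
        order = sorted(range(m + 1), key=lambda k: vals[k])
        verts = [verts[k] for k in order]
        vals = [vals[k] for k in order]
        # Assumed exit test: vertices close together, y values nearly equal.
        size = max(abs(v[j] - verts[0][j]) for v in verts[1:] for j in range(m))
        if vals[-1] - vals[0] <= tol and size <= xtol:
            break
        # Centroid of every vertex except the worst one.
        c = [sum(v[j] for v in verts[:-1]) / m for j in range(m)]
        worst = verts[-1]

        def along(t):
            # Point on the line through c and the worst vertex.
            return [c[j] + t * (worst[j] - c[j]) for j in range(m)]

        xr = along(-1.0)  # reflection of the worst vertex through c
        fr = y(xr)
        if fr < vals[0]:
            xe = along(-2.0)  # expansion beyond the reflected point
            fe = y(xe)
            verts[-1], vals[-1] = (xe, fe) if fe < fr else (xr, fr)
        elif fr < vals[-2]:
            verts[-1], vals[-1] = xr, fr
        else:
            # Inside contraction if reflection was no better than worst,
            # otherwise outside contraction.
            xc = along(0.5) if fr >= vals[-1] else along(-0.5)
            fc = y(xc)
            if fc < min(fr, vals[-1]):
                verts[-1], vals[-1] = xc, fc
            else:
                # Shrink every vertex halfway toward the best vertex.
                best = verts[0]
                verts = [best] + [
                    [best[j] + 0.5 * (v[j] - best[j]) for j in range(m)]
                    for v in verts[1:]
                ]
                vals = [vals[0]] + [y(v) for v in verts[1:]]
    # Parameter values averaged over all vertices of the final simplex.
    centroid = [sum(v[j] for v in verts) / (m + 1) for j in range(m)]
    return centroid, verts, vals


# Synthetic data lying exactly on Y = 1 + 2 * X (equation (2)).
x_obs = [0.0, 1.0, 2.0, 3.0]
y_obs = [1.0, 3.0, 5.0, 7.0]


def y(p):
    """Equation (3) with W(i) = 1 for the data above."""
    return sum((yo - (p[0] + p[1] * xo)) ** 2 for xo, yo in zip(x_obs, y_obs))


centroid, verts, vals = nelder_mead(y, [[0.0, 0.0], [5.0, 0.0], [0.0, 5.0]])
print(centroid)  # expected to lie close to p(1) = 1, p(2) = 2
```

The centroid of the final simplex is returned as the estimate of
the best fit parameters, following the description above.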
We generally want to estimate the uncertainties in the parameter
values obtained for a model fit to a particular set of data
points. Standard deviations of the parameters are calculated by
the program used here.
There are likely to be large uncertainties in the parameters if
there are few data points or if there are large deviations
between Ymodel and Yobserved. As a rule, one should have 5 to 10
times as many data points as parameters.
The first try at estimating uncertainties of the parameters can
fail. The calculation involves matrix inversion, the use of
differences between nearly equal large numbers, and the