As we have mentioned previously, one way of de-correlating probabilities
is to use a model. Take for example a set of data described by the
function $y_i = f(x_i, {\bf a})$, where ${\bf a}$ defines the set of free parameters
defining $f$ and $\{x_i\}$ is the generating data set. If we now
define the variation of the observed measurements about the
generating function with some random error $\epsilon_i$,

$y_i = f(x_i, {\bf a}) + \epsilon_i$

we can see that the probability $P(y_i | x_i, {\bf a})$ will be equivalent to
$P(\epsilon_i)$, as the model and generation point completely define
all but the random error.

Choosing Gaussian random errors with a standard deviation of $\sigma_i$ gives

$P(\epsilon_i) = k \exp( - \epsilon_i^2 / 2 \sigma_i^2 )$

where $k$ is a normalisation constant. We can now construct the maximum likelihood function

$P = \prod_i k \exp( - (y_i - f(x_i, {\bf a}))^2 / 2 \sigma_i^2 )$

which leads to the definition of the log likelihood

$\log P = {\rm const} - \sum_i (y_i - f(x_i, {\bf a}))^2 / 2 \sigma_i^2$
This expression can be maximized as a function of the parameters ${\bf a}$, and this process is generally called a least-squares fit. Whenever you encounter least squares there is therefore a built-in assumption of independent, Gaussian-distributed errors. In practical situations the validity of these assumptions should be checked by plotting the distribution of the residuals $\epsilon_i$ to make sure that it is Gaussian.
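A minimal sketch of this equivalence (the linear model, data, and noise level here are assumed for illustration, not taken from the text): maximising the Gaussian log likelihood over ${\bf a}$ is exactly a least-squares fit, and the residual distribution can then be inspected as suggested above.

```python
# Sketch: least squares as maximum likelihood under independent Gaussian
# errors. Model, data and sigma are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

# Generating function f(x, a) = a0 + a1*x with known parameters.
a_true = np.array([2.0, -0.5])
x = np.linspace(0.0, 10.0, 200)
sigma = 0.3
y = a_true[0] + a_true[1] * x + rng.normal(0.0, sigma, x.size)

# Maximising log P = const - sum((y - f)^2 / 2 sigma^2) over a is a
# least-squares problem; for a model linear in a it is solved directly.
A = np.column_stack([np.ones_like(x), x])
a_hat, *_ = np.linalg.lstsq(A, y, rcond=None)

# The assumption check suggested in the text: inspect the residual
# (epsilon) distribution; mean ~ 0 and spread ~ sigma if it is Gaussian.
residuals = y - A @ a_hat
print(a_hat, residuals.mean(), residuals.std())
```

With well-behaved data the recovered parameters sit close to the generating ones and the residual spread matches the assumed $\sigma$.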

The choice of a least-squares error metric gives many advantages in terms
of computational simplicity, and later we will see that it is also
used extensively for definitions of error covariance and optimal
combination of data. However, the distribution of random variation
on the observed data $X$ is something over which we generally have
no initial control, and it could well be arbitrary. This may initially
be seen as an overwhelming problem, but
in most circumstances it is possible to make distributions
tractable (Gaussian) by a transformation $y_i' = g(y_i)$,
where $g$ is chosen so that the initial distribution of $y_i$ maps
to a Gaussian distribution in $g$.

One good example of this is in the
location of a known object in 3D data derived from a stereo vision
system. In the coordinate system where the viewing direction
corresponds to the $z$ axis, the $x$ and $y$ measures have errors
determined by image-plane measurement. However, the depth
for a given point is given by

$z = \frac{I f}{x_l - x_r}$

where $I$ is the interoccular separation, $f$ is the focal length
and $x_l$ and $x_r$ are image-plane measurements. Attempts to
perform a least-squares fit directly in $z$ space result
in instability due to the non-Gaussian nature of the $z$
distribution. However, transformation to $1/z$, which is linear
in the image-plane disparity $x_l - x_r$,
yields Gaussian distributions and good results.
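This effect is easy to demonstrate numerically (the separation, focal length, disparity and error magnitude below are assumed values, chosen only to make the asymmetry visible): Gaussian error on the disparity produces a skewed, long-tailed distribution in $z$, while $1/z$ remains Gaussian.

```python
# Sketch of the stereo example with assumed numbers: Gaussian image-plane
# disparity error is non-Gaussian in depth z = I*f/(x_l - x_r).
import numpy as np

rng = np.random.default_rng(1)
I, f = 0.1, 0.05                  # interoccular separation, focal length (assumed)
d_true = 0.002                    # true disparity x_l - x_r (assumed)
d = d_true + rng.normal(0.0, 2e-4, 100_000)  # Gaussian image-plane error

z = I * f / d                     # reciprocal of a Gaussian: skewed, long-tailed
inv_z = 1.0 / z                   # proportional to disparity, so Gaussian again

def skewness(v):
    """Sample skewness: zero for a symmetric (e.g. Gaussian) distribution."""
    v = v - v.mean()
    return (v**3).mean() / (v**2).mean() ** 1.5

print(skewness(z), skewness(inv_z))  # z clearly skewed, 1/z approximately not
```

The measured skewness of the $z$ sample is substantially positive, while that of the $1/z$ sample is consistent with zero, which is why fitting in $1/z$ space behaves well.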

Under many circumstances, even after taking care to obtain Gaussian variation on the fitted quantities, there is still one final problem which needs to be addressed: the problem of fliers, or outliers. Fliers are the name given to data generated by any real system which do not conform to the assumed statistical distribution. They are generally caused by complete failure of the data measurement system and are generated well away from the expected mean of the distribution. If ignored they can completely dominate the fitting process, giving meaningless results. For example, measurement of the distance to an object presupposes that we have selected the correct object.

The correct way to deal with these measures is to modify the expected probability distribution to include the long tails from fliers; this leads to the branch of numerical methods known as robust statistics. The simplest way to do this, which allows us to continue to use the standard methods for covariance estimation and optimal data combination that assume a Gaussian distribution, is to limit the contribution to the distribution from any data point to some maximum value $\epsilon_{max}$. This makes the assumption that the statistical distribution is constant for any residual greater than $\epsilon_{max}$ from the expected position. Unfortunately this process precludes the use of standard least-squares solution methods, and the solution must generally be iterative, as the set of points exceeding the cut-off will vary during parameter estimation. This process is efficiently executed by the probabilistic Hough transform for small numbers of parameters [7].
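The clipped-contribution scheme can be sketched as follows (the data, the flier fraction, the threshold value and the function name are assumptions for illustration; once a point's contribution is capped at a constant, it no longer influences the gradient, so clipped points can simply be excluded from each least-squares step):

```python
# Sketch of robust fitting by clipping residual contributions at eps_max.
# Iterative, because which points are clipped changes as the fit moves.
import numpy as np

rng = np.random.default_rng(2)
x = np.linspace(0.0, 1.0, 100)
y = 1.0 + 3.0 * x + rng.normal(0.0, 0.05, x.size)
y[::10] += 5.0                      # fliers: gross failures far from the mean

def robust_line_fit(x, y, eps_max=1.0, iters=10):
    A = np.column_stack([np.ones_like(x), x])
    a, *_ = np.linalg.lstsq(A, y, rcond=None)   # start from plain least squares
    for _ in range(iters):
        r = y - A @ a
        keep = np.abs(r) < eps_max              # clipped points contribute a
        a, *_ = np.linalg.lstsq(A[keep], y[keep], rcond=None)  # constant: drop them
    return a

print(robust_line_fit(x, y))        # close to the generating (1, 3)
```

A plain least-squares fit on these data is pulled visibly towards the fliers; the iterated clipped fit recovers the generating parameters despite 10% gross failures.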

Fri Mar 28 14:12:50 GMT 1997