Mathematics Behind the UVS Library

Variability is a common feature of everyday life. The weather, the volume of traffic we face on the way to work, and the stock market all vary in ways we cannot control. A machine that makes machine screws never makes two exactly alike, and complex systems sometimes work and sometimes don't. Although we can't control variability, we can't ignore it either. About the best we can do is study it and try to minimize the damage it does.

To develop a C++ library to help with these studies, one must limit its scope in some way, since the complete set of tools provided by statisticians is voluminous indeed. The "UVS" library deals with statistical distributions of a single variable ("UniVariate Statistics"). Thus multivariate tools such as regression, ANOVA, and design of experiments are excluded. However, all of these are based at least in part on univariate statistics, so the library may still be of use to authors of multivariate analysis software.

The first line of defense used by statisticians against variability is statistical modeling. For independent trials of a process that can be characterized by a single variable, the model takes the form of a univariate statistical distribution. "Discrete" random variables, usually found by counting, generally take on non-negative integer values. "Continuous" random variables, usually found by measuring, can take on any value in a continuous range. Although mathematical techniques exist for dealing with both types in a single formalism, statisticians have maintained a tradition of treating them separately, and we will honor that tradition here.
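The practical difference between the two kinds of variable can be sketched in a few lines of standard C++. This is only an illustration, not the UVS interface (which this section does not describe): the function names poisson_pmf, normal_cdf, and normal_interval are hypothetical, and the Poisson and normal distributions are chosen simply as familiar examples of a discrete and a continuous model.

```cpp
#include <cassert>
#include <cmath>

// Discrete case: a count X ~ Poisson(mean) has a positive probability
// at each non-negative integer k. Computed in log space so that large k
// does not overflow the factorial.
// (Hypothetical helper for illustration; not part of the UVS library.)
double poisson_pmf(int k, double mean) {
    assert(k >= 0 && mean > 0.0);
    return std::exp(k * std::log(mean) - mean - std::lgamma(k + 1.0));
}

// Continuous case: a measurement X ~ Normal(mu, sigma) is described by
// its cumulative distribution function P(X <= x), written here with the
// complementary error function from <cmath>.
double normal_cdf(double x, double mu, double sigma) {
    assert(sigma > 0.0);
    return 0.5 * std::erfc(-(x - mu) / (sigma * std::sqrt(2.0)));
}

// For a continuous variable, probabilities belong to intervals:
// P(a < X <= b) is a difference of two CDF values.
double normal_interval(double a, double b, double mu, double sigma) {
    assert(a <= b);
    return normal_cdf(b, mu, sigma) - normal_cdf(a, mu, sigma);
}
```

Under these assumptions, poisson_pmf(3, 2.0) is the probability of counting exactly three events when two are expected on average, while normal_interval(a, a, mu, sigma) is zero for any a: a continuous variable has no probability concentrated at a single point, which is one reason the two types are traditionally handled separately.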

Once we have a statistical model for a variable process, we can use it to compute probabilities for various outcomes. At the highest level, this allows us to distinguish expected sampling variability from unexpected changes in the state of a process. This is important in many fields because, without this understanding, the tendency is either to waste time chasing noise or to fail to act on real information that is mistaken for noise.
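As a rough illustration of this use of a model, suppose (hypothetically) that a process has historically produced 2% defective parts and that a sample of 100 parts contains 6 defectives. If the defect count is modeled as a binomial variable, the probability of seeing 6 or more defectives from an unchanged process can be computed directly. The sketch below uses only the C++ standard library; the names binomial_pmf and binomial_upper_tail are invented for this example and are not the UVS interface.

```cpp
#include <cassert>
#include <cmath>

// P(X = k) for X ~ Binomial(n, p), computed in log space so that the
// binomial coefficient does not overflow for large n.
// (Hypothetical helper for illustration; not part of the UVS library.)
double binomial_pmf(int n, int k, double p) {
    assert(n >= 0 && k >= 0 && k <= n);
    assert(p > 0.0 && p < 1.0);
    const double log_choose = std::lgamma(n + 1.0)
                            - std::lgamma(k + 1.0)
                            - std::lgamma(n - k + 1.0);
    return std::exp(log_choose + k * std::log(p) + (n - k) * std::log1p(-p));
}

// P(X >= k): the chance of an outcome at least as extreme as k
// if the process is still behaving according to the model.
double binomial_upper_tail(int n, int k, double p) {
    assert(k >= 0 && k <= n);
    double tail = 0.0;
    for (int i = k; i <= n; ++i) {
        tail += binomial_pmf(n, i, p);
    }
    return tail;
}
```

Calling binomial_upper_tail(100, 6, 0.02) gives the probability of 6 or more defectives under the assumed 2% rate. A small value suggests that the process may have changed and deserves attention; a large value suggests that the sample is consistent with ordinary sampling variability. Where to draw the line is a judgment that depends on the cost of each kind of mistake.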