Three Pythagorean means are typically defined, and they constitute the classical means used across scientific fields: the arithmetic mean, the geometric mean, and the harmonic mean. The Pythagoreans are reported to have been the first to study these means, followed by later generations of Greek mathematicians, which is how the means came to be called ‘Pythagorean’.
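Formally, for a vector of $n$ values $x = (x_1, x_2, \dots, x_n)$, they are defined as follows:

$$\text{arithmetic mean:}\quad \mathrm{AM}(x) = \frac{1}{n}\sum_{i=1}^{n} x_i$$

$$\text{geometric mean:}\quad \mathrm{GM}(x) = \left(\prod_{i=1}^{n} x_i\right)^{1/n}$$

$$\text{harmonic mean:}\quad \mathrm{HM}(x) = \frac{n}{\sum_{i=1}^{n} \frac{1}{x_i}}$$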
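The three definitions translate almost directly into code. The following is a minimal sketch using only the Python standard library; the function names and the sample data are illustrative assumptions, and the geometric mean is computed through logarithms (a common choice, not something stated in the text) to avoid overflow in the product.

```python
import math

def arithmetic_mean(values):
    """Sum of the values divided by their count."""
    return sum(values) / len(values)

def geometric_mean(values):
    """n-th root of the product, computed via logarithms for stability."""
    if any(v <= 0 for v in values):
        raise ValueError("geometric mean requires strictly positive values")
    return math.exp(sum(math.log(v) for v in values) / len(values))

def harmonic_mean(values):
    """Count of the values divided by the sum of their reciprocals."""
    if any(v <= 0 for v in values):
        raise ValueError("harmonic mean requires strictly positive values")
    return len(values) / sum(1.0 / v for v in values)

# Illustrative data (an assumption, not taken from the text).
data = [1.0, 2.0, 4.0]
print("arithmetic:", arithmetic_mean(data))
print("geometric: ", geometric_mean(data))
print("harmonic:  ", harmonic_mean(data))
```

For this small data set the arithmetic mean is 7/3, the geometric mean is the cube root of 8 (that is, 2, up to floating-point rounding), and the harmonic mean is 12/7.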
The Pythagorean means differ in how they are applied, because each captures the central tendency of data in its own way. If the data are drawn from a simple Gaussian, the most likely value coincides with the central tendency of the data, which is the average value, that is, the arithmetic mean. The arithmetic mean is therefore most appropriate for Gaussian-like distributions whose values share the same unit of measure. The geometric mean, on the other hand, can be used when the data are expressed in different units. The harmonic mean is better suited to data that express rates (ratios of quantities, such as the true positive rate). Its most prominent use in machine learning is the $F_1$-score (the traditional F-measure or balanced F-score), the harmonic mean of precision and recall, typically defined as

$$F_1 = 2 \cdot \frac{\text{precision} \cdot \text{recall}}{\text{precision} + \text{recall}}$$
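To make the link between the balanced F-score and the harmonic mean concrete, here is a small sketch. The function name `f1_score`, the sample precision and recall values, and the convention of returning 0.0 when both inputs are zero are assumptions made for illustration.

```python
import statistics

def f1_score(precision, recall):
    """Balanced F-score: the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0  # assumed convention for the degenerate case
    return 2 * precision * recall / (precision + recall)

# Illustrative values (assumptions, not taken from the text).
precision, recall = 0.9, 0.5

f1 = f1_score(precision, recall)
hm = statistics.harmonic_mean([precision, recall])
am = statistics.fmean([precision, recall])

print(f"F1 score:        {f1:.4f}")
print(f"harmonic mean:   {hm:.4f}")  # same value as the F1 score
print(f"arithmetic mean: {am:.4f}")  # higher: it hides the weak recall
```

Because the harmonic mean is pulled towards the smaller of the two rates, a classifier cannot reach a high score by excelling at precision alone or recall alone.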
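Note that the definitions imply that, for positive values, the harmonic mean never exceeds the geometric mean, which in turn never exceeds the arithmetic mean:

$$\mathrm{HM}(x) \le \mathrm{GM}(x) \le \mathrm{AM}(x),$$

where equality holds only if the $x_i$ are all equal.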
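The ordering of the means can be checked numerically. The sketch below relies on the standard-library functions `statistics.harmonic_mean`, `statistics.geometric_mean`, and `statistics.fmean` (Python 3.8 or later); the helper `pythagorean_means`, the random data, and the seed are assumptions made for illustration.

```python
import math
import random
import statistics

def pythagorean_means(values):
    """Return (harmonic, geometric, arithmetic) means of positive values."""
    return (
        statistics.harmonic_mean(values),
        statistics.geometric_mean(values),
        statistics.fmean(values),
    )

random.seed(0)  # assumed seed, only for reproducibility

# Positive values that are not all equal: the ordering is strict.
values = [random.uniform(0.5, 20.0) for _ in range(200)]
hm, gm, am = pythagorean_means(values)
print(f"HM={hm:.4f}  GM={gm:.4f}  AM={am:.4f}  ordered: {hm < gm < am}")

# All values equal: the three means coincide (up to rounding).
constant = [3.0] * 10
hm_c, gm_c, am_c = pythagorean_means(constant)
print("all equal:", math.isclose(hm_c, gm_c) and math.isclose(gm_c, am_c))
```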
Several restrictions to keep in mind include:
- The arithmetic mean is heavily affected by outliers, and it describes the data poorly when they are not Gaussian-distributed (e.g. multi-modal probability distributions, which have multiple peaks).
- The geometric mean is defined only for positive values.
- The harmonic mean likewise works only with positive values (rates); in addition, it tends strongly towards the minimum value in the data.
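Two of these restrictions are easy to see on toy data. In the sketch below, the data sets `base`, `with_outlier`, and `rates` are made-up values chosen for illustration; the means come from Python's standard `statistics` module.

```python
import statistics

def describe(label, values):
    """Print and return the (arithmetic, geometric, harmonic) means."""
    am = statistics.fmean(values)
    gm = statistics.geometric_mean(values)
    hm = statistics.harmonic_mean(values)
    print(f"{label:14s} AM={am:9.4f}  GM={gm:9.4f}  HM={hm:9.4f}")
    return am, gm, hm

# Made-up data: one large outlier moves the arithmetic mean the most.
base = [2.0, 4.0, 4.0, 5.0]
with_outlier = base + [500.0]
am_base, gm_base, hm_base = describe("base", base)
am_out, gm_out, hm_out = describe("with outlier", with_outlier)

# Made-up rates: one small value drags the harmonic mean towards the minimum.
rates = [0.01, 10.0, 10.0, 10.0]
am_r, gm_r, hm_r = describe("rates", rates)
print("minimum rate:", min(rates))
```

In the first pair the outlier lifts the arithmetic mean from 3.75 to 103, while the harmonic mean stays below 5. In the second case the harmonic mean stays close to the smallest rate even though three of the four values are 10.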
To see how the Pythagorean means capture the “average” value of data, a Colab notebook was created that explores how the means “behave” in simple cases: linear, exponential, and Gaussian data.
The following figure shows where the means fall on a graph of linear data. The line was sampled at 100 linearly spaced locations. The harmonic mean clearly tends towards the minimum data value.
The following figure shows where the means fall on a graph of exponential data. The function was sampled at 100 linearly spaced locations.
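The linear and exponential cases can be explored with a short sketch. Only the count of 100 linearly spaced samples comes from the text; the line `y = 2x + 1`, the function `y = exp(x)`, both sampling ranges, and the helpers `linspace` and `pythagorean_means` are illustrative assumptions, not the settings behind the figures.

```python
import math
import statistics

def linspace(start, stop, num):
    """Hypothetical helper: `num` evenly spaced points from start to stop, inclusive."""
    step = (stop - start) / (num - 1)
    return [start + i * step for i in range(num)]

def pythagorean_means(values):
    """Return (arithmetic, geometric, harmonic) means of positive values."""
    return (
        statistics.fmean(values),
        statistics.geometric_mean(values),
        statistics.harmonic_mean(values),
    )

# Assumed line and range (not from the source): y = 2x + 1 on [0, 10].
linear = [2.0 * x + 1.0 for x in linspace(0.0, 10.0, 100)]
# Assumed function and range (not from the source): y = exp(x) on [0, 5].
exponential = [math.exp(x) for x in linspace(0.0, 5.0, 100)]

for name, data in (("linear", linear), ("exponential", exponential)):
    am, gm, hm = pythagorean_means(data)
    print(
        f"{name:12s} min={min(data):8.3f}  HM={hm:8.3f}  "
        f"GM={gm:8.3f}  AM={am:8.3f}  max={max(data):8.3f}"
    )
```

For exponential data, the geometric mean equals the exponential of the arithmetic mean of the exponents, so under these assumed settings it sits at exp(2.5), the function value at the midpoint of the sampling range.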
The following figure shows where the means fall on a graph of 1-D Gaussian data with zero mean and a standard deviation of 5, taken as absolute values to accommodate the means that require positive values. In this case, 1000 samples were used. The graph also shows the minimum and maximum.
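A comparable experiment can be sketched with the standard library alone. The zero mean, the standard deviation of 5, the absolute values, and the 1000 samples follow the description above; the random seed and the use of `random.gauss` are assumptions, so the numbers will not match the figure.

```python
import random
import statistics

random.seed(42)  # assumed seed, only for reproducibility

# 1000 samples from a zero-mean Gaussian with standard deviation 5,
# taken as absolute values so that all three means are defined.
samples = [abs(random.gauss(0.0, 5.0)) for _ in range(1000)]

summary = {
    "min": min(samples),
    "harmonic": statistics.harmonic_mean(samples),
    "geometric": statistics.geometric_mean(samples),
    "arithmetic": statistics.fmean(samples),
    "max": max(samples),
}
for name, value in summary.items():
    print(f"{name:10s} {value:.4f}")
```

Because absolute Gaussian samples include values very close to zero, the harmonic mean is expected to land far below the other two means.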
Last but not least, the following figure shows an example of the computation of the means on 2-D Gaussian mixture data, in which the two distributions have the following characteristics:
- Gaussian #1: with an offset of 7 and a scaling factor of 2
- Gaussian #2: with an offset of 3 and a scaling factor of 4
There are a number of resources to consult regarding the Pythagorean means; the following list is just a starting point for exploring them: