*** HOW TO CHOOSE THE BIN SIZE OF A HISTOGRAM? ***


[NB:  Implemented in NAPA header] :


=======================
Scott rule
=======================

  From http://www.fmrib.ox.ac.uk/analysis/techrep/tr00mj2/tr00mj2/node24.html

  It has been shown [Scott, 1979] that the optimal histogram bin size, which provides the most efficient,
  unbiased estimation of the probability density function, is achieved when:

    Bin size W = 3.49 S / N**(1/3)

  where W is the width of the histogram bin, S is the standard deviation of the distribution and N is the
  number of available samples. In practice, the estimated standard deviation, s, must be used.
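
  The Scott formula above can be sketched in a few lines of Python. This is only an illustration under
  stated assumptions, not the NAPA implementation: the function names scott_bin_width and bin_count are
  hypothetical, and the sample standard deviation (N-1 denominator) is assumed as the estimate s.

```python
import math
import statistics


def scott_bin_width(samples):
    """Scott rule: W = 3.49 * s / N**(1/3), with s the sample standard deviation."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples to estimate s")
    s = statistics.stdev(samples)  # assumption: N-1 (sample) estimate of S
    return 3.49 * s / n ** (1.0 / 3.0)


def bin_count(samples, width):
    """Number of bins of the given width needed to cover the data range."""
    if width <= 0:
        return 1  # degenerate case: all samples identical
    span = max(samples) - min(samples)
    return max(1, math.ceil(span / width))


data = [102, 104, 105, 107, 108, 109, 110, 112, 115, 115, 118]
w = scott_bin_width(data)
print("bin width:", w, "bins:", bin_count(data, w))
```

  The bin count is derived from the width and the data range, so the rule fixes the width first and the
  number of bins follows from it.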


..................................................................................................................................

  [NB:  NOT yet implemented in NAPA header] :

  A similar, but more robust, result was also obtained by Freedman and Diaconis (summarised in [Izenman, 1991]);
  their bin width is given under the Freedman-Diaconis rule heading below.


========================
Freedman-Diaconis rule
========================
  From Wikipedia, the free encyclopedia

  In statistics,
  the Freedman-Diaconis rule is used to select the width of the bins to be used in a histogram
  (and hence, for a given data range, the number of bins).
  The bin width controls how strongly the histogram smooths the data. The general equation for the rule is:

    Bin size = 2.0 IQR / N**(1/3)

  where IQR is the interquartile range of the data and N is the number of observations in the sample.

  Reference:
   Freedman D and Diaconis P (1981).
   On the histogram as a density estimator: L2 theory. Probability Theory and Related Fields. 57(4): 453-476
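
  A minimal Python sketch of the Freedman-Diaconis formula follows. The function names are hypothetical,
  and the quartile convention (median of the lower and upper halves, leaving out the middle value when N
  is odd) is an assumption chosen to agree with the worked examples in these notes; other conventions
  give slightly different IQR values.

```python
import statistics


def iqr(samples):
    """Interquartile range Q3 - Q1 using the median-of-halves convention."""
    xs = sorted(samples)
    half = len(xs) // 2  # the middle value is left out when N is odd
    if half == 0:
        raise ValueError("need at least two samples")
    q1 = statistics.median(xs[:half])
    q3 = statistics.median(xs[-half:])
    return q3 - q1


def freedman_diaconis_bin_width(samples):
    """Freedman-Diaconis rule: bin size = 2.0 * IQR / N**(1/3)."""
    n = len(samples)
    return 2.0 * iqr(samples) / n ** (1.0 / 3.0)


# The 11-point table from the interquartile range section of these notes.
data = [102, 104, 105, 107, 108, 109, 110, 112, 115, 115, 118]
print("IQR:", iqr(data))
print("bin width:", freedman_diaconis_bin_width(data))
```

  Because the IQR ignores the outer quarters of the data, a few extreme values change this bin width
  much less than they change the standard deviation used by the Scott rule.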


..................................................................................................................................

  [NB:  NOT yet implemented in NAPA header] :

========================
Interquartile range
========================
  From Wikipedia, the free encyclopedia

  In descriptive statistics, the interquartile range (IQR) is the difference between the
  third and first quartiles and is a measure of statistical dispersion. The interquartile
  range is a more stable statistic than the range, and is often preferred to that statistic.
  Since 25% of the data are less than or equal to the first quartile and 25% are greater than
  or equal to the third quartile, the difference is the length of an interval that includes
  about half of the data. This difference should be measured in the same units as the data.
  The interquartile range is used to build box plots, which give a simple graphical
  representation of a probability distribution.

Example
     i    x[i]
     1    102
     2    104
     3    105 ---- the first quartile, Q1 = 105
     4    107
     5    108
     6    109 ---- the second quartile, Q2 or median = 109
     7    110
     8    112
     9    115 ---- the third quartile, Q3 = 115
    10    115
    11    118

  From this table, the interquartile range is Q3 - Q1 = 115 - 105 = 10.
  The interquartile range is a measure of statistical dispersion.


For normally distributed data: IQR = 1.35 Sigma (approximately), where Sigma is the standard deviation
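
  The factor relating IQR and Sigma for a normal distribution can be checked numerically. The sketch below
  uses NormalDist from the Python standard library; the function name normal_iqr is hypothetical. It also
  shows what the Freedman-Diaconis coefficient becomes when expressed in units of Sigma, which makes it
  directly comparable with the 3.49 of the Scott rule.

```python
from statistics import NormalDist


def normal_iqr(sigma):
    """IQR of a normal distribution: distance between its 75% and 25% quantiles."""
    nd = NormalDist(mu=0.0, sigma=sigma)
    return nd.inv_cdf(0.75) - nd.inv_cdf(0.25)


ratio = normal_iqr(1.0)        # IQR / Sigma, close to 1.35
fd_coefficient = 2.0 * ratio   # Freedman-Diaconis: 2.0 * IQR = fd_coefficient * Sigma
print("IQR / Sigma:", round(ratio, 3))
print("Freedman-Diaconis coefficient in units of Sigma:", round(fd_coefficient, 2))
```

  For normally distributed data the Freedman-Diaconis rule therefore gives somewhat narrower bins than
  the Scott rule (a coefficient of about 2.7 instead of 3.49).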


..................................................................................................................................

  [NB:  NOT yet implemented in NAPA header] :

========================
Quartile
========================
  From Wikipedia, the free encyclopedia

  In descriptive statistics, a quartile is any of the three values which divide the sorted data set into
  four equal parts, so that each part represents 1/4th of the sample or population.

  Thus:

  first quartile  (designated Q1) = lower quartile = cuts off lowest 25% of data                 = 25th percentile
  second quartile (designated Q2) = median         = cuts data set in half                       = 50th percentile
  third quartile  (designated Q3) = upper quartile = cuts off highest 25% of data, or lowest 75% = 75th percentile

  The difference between the upper and lower quartiles is called the interquartile range.


  Example 1:

  Data Set:         6, 47, 49, 15, 42, 41,  7, 39, 43, 40, 36
  Ordered Data Set: 6,  7, 15, 36, 39, 40, 41, 42, 43, 47, 49

  Q1 = 15
  Q2 = 40
  Q3 = 43

  Example 2:

  Ordered Data Set: 7, 15, 36, 39, 40, 41

  Q1 = 15
  Q2 = (39+36)/2 = 37.5
  Q3 = 40
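
  The two examples above are consistent with the following convention, which is assumed here rather than
  stated by the source: Q2 is the median of the whole ordered set, and Q1 and Q3 are the medians of its
  lower and upper halves, with the middle value left out when the number of values is odd. A short Python
  sketch (the function name quartiles is hypothetical) reproduces both examples:

```python
import statistics


def quartiles(samples):
    """Return (Q1, Q2, Q3) using the median-of-halves convention."""
    xs = sorted(samples)
    half = len(xs) // 2  # the middle value is left out when N is odd
    if half == 0:
        raise ValueError("need at least two values")
    q1 = statistics.median(xs[:half])    # median of the lower half
    q2 = statistics.median(xs)           # median of the whole set
    q3 = statistics.median(xs[-half:])   # median of the upper half
    return q1, q2, q3


print(quartiles([6, 47, 49, 15, 42, 41, 7, 39, 43, 40, 36]))  # (15, 40, 43)
print(quartiles([7, 15, 36, 39, 40, 41]))                     # (15, 37.5, 40)
```

  Other quartile conventions (for example, interpolating between neighbouring values) can give different
  Q1 and Q3 for the same data, so the convention should be fixed before computing the IQR.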


..................................................................................................................................