JuliaStats
diff --git a/docs/source/counts.rst b/docs/source/counts.rst
Lines changed: 1 addition & 1 deletion
diff --git a/docs/source/cov.rst b/docs/source/cov.rst
Lines changed: 13 additions & 6 deletions
diff --git a/docs/source/empirical.rst b/docs/source/empirical.rst
Lines changed: 5 additions & 6 deletions
diff --git a/docs/source/means.rst b/docs/source/means.rst
Lines changed: 2 additions & 2 deletions
diff --git a/docs/source/sampling.rst b/docs/source/sampling.rst
Lines changed: 13 additions & 16 deletions
diff --git a/docs/source/scalarstats.rst b/docs/source/scalarstats.rst
Lines changed: 41 additions & 22 deletions
@@ -8,7 +8,7 @@ Counting over an Integer Range
 
 .. function:: counts(x, a:b[, wv])
 
-    Count the number of times (or total weights if a weight vector ``wv`` is given) values in ``a:b`` appear in array ``x``. Here, the optional argument ``wv`` should be a weight vector of type ``WeightVec`` (see :ref:`weightvec`).
+    Count the number of times (or total weights if a weight vector ``wv`` is given) values in ``a:b`` appear in array ``x``. Here, the optional argument ``wv`` should be a weight vector of type ``AbstractWeights`` (see :ref:`weightvec`).
 
     This function returns a vector ``r`` of length ``n``, with ``n = length(a:b) = b-a+1``. In particular, we have
 
 
@@ -22,14 +22,21 @@ This package implements functions for computing scatter matrix, as well as weigh
 
 .. function:: scatter(X, wv[; vardim=..., mean=...])
 
-    Weighted scatter matrix. The weights are given by a weight vector ``wv`` of type ``WeightVec`` (see :ref:`weightvec`).
+    Weighted scatter matrix. The weights are given by a weight vector ``wv`` of type ``AbstractWeights`` (see :ref:`weightvec`).
 
-.. function:: cov(X, wv[; vardim=..., mean=...])
+.. function:: cov(X, w[; vardim=..., mean=..., corrected=...])
 
-    Weighted covariance matrix. 
+  Compute the weighted covariance matrix. Similar to ``var`` and ``std``, the biased covariance matrix (``corrected=false``) is computed by multiplying ``scattermat(X, w)`` by :math:`\frac{1}{\sum{w}}` to normalize.
+  However, the normalization used for the unbiased covariance matrix (``corrected=true``) depends on the type of weights used:
 
-    **Note:** By default, the covariance is normalized by the sum of weights, that is, ``cov(X, wv)`` is equal to ``scatter(X, wv) / sum(wv)``.
+    * ``AnalyticWeights``: :math:`\frac{1}{\sum w - \sum {w^2} / \sum w}`
+    * ``FrequencyWeights``: :math:`\frac{1}{\sum{w} - 1}`
+    * ``ProbabilityWeights``: :math:`\frac{n}{(n - 1) \sum w}` where ``n`` equals ``count(!iszero, w)``
+    * ``Weights``: ``ArgumentError`` (bias correction not supported)
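
As a rough illustration of the normalization described above, the following plain-Julia sketch builds a weighted scatter matrix by hand and scales it for the biased case and for frequency weights. The helper ``weighted_scatter`` is hypothetical and not part of the package; it assumes observations are stored in rows (``vardim=1``) and uses only Base functions.

```julia
# Hypothetical helper, not part of the package: weighted scatter matrix of X
# (observations in rows, variables in columns), centered at the weighted
# column means.
function weighted_scatter(X, w)
    m = (w' * X) ./ sum(w)   # 1×d row of weighted column means
    Z = X .- m               # centered observations
    return Z' * (w .* Z)     # d×d scatter matrix
end

X = [1.0 2.0;
     2.0 1.0;
     4.0 3.0]
w = [2, 1, 1]                # read as frequencies: the first row counts twice

S = weighted_scatter(X, w)
cov_biased    = S ./ sum(w)          # factor 1 / sum(w), as for corrected=false
cov_frequency = S ./ (sum(w) - 1)    # factor 1 / (sum(w) - 1), as for frequency weights
```

With frequency weights, the corrected result should match the ordinary unbiased covariance of the data with the first row repeated twice.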
 
-.. function:: mean_and_cov(x[, wv][; vardim=...])
+.. function:: mean_and_cov(x[, wv][; vardim=..., corrected=...])
 
-  Jointly compute the mean and covariance of ``x``. 
+  Jointly compute the mean and covariance matrix as a tuple.
+  A weighting vector ``wv`` can be specified. The ``vardim`` keyword designates whether the variables are columns in the matrix (``1``) or rows (``2``).
+  Finally, bias correction is applied to the covariance calculation if ``corrected=true``.
+  See ``cov`` documentation for more details.
@@ -14,12 +14,12 @@ Histograms can be fitted to data using the ``fit`` method.
 
 **Arguments:**
 
-``data`` 
+``data``
   is either a vector (for a 1-dimensional histogram), or a tuple of
   vectors of equal length (for an *n*-dimensional histogram).
 
 ``weight``
-  is an optional ``:ref:`weightvec` WeightVec``` (of the same length as the
+  is an optional weight vector of type ``AbstractWeights``, described in :ref:`weightvec` (of the same length as the
   data vectors), denoting the weight each observation contributes to the
   bin. If no weight vector is supples, each observation has weight 1.
 
@@ -30,7 +30,7 @@ Histograms can be fitted to data using the ``fit`` method.
 
 **Keyword arguments:**
 
-``closed=:left/:right`` 
+``closed=:left/:right``
   determines whether the bin intervals are left-closed [a,b), or right-closed
   (a,b] (default = ``:right``).
 
@@ -48,7 +48,7 @@ Histograms can be fitted to data using the ``fit`` method.
     h = fit(Histogram, rand(100), weights(rand(100)), 0:0.1:1.0)
     h = fit(Histogram, [20], 0:20:100)
     h = fit(Histogram, [20], 0:20:100, closed=:left)
-    
+
     # Multivariate
     h = fit(Histogram, (rand(100),rand(100)))
     h = fit(Histogram, (rand(100),rand(100)),nbins=10)
@@ -60,7 +60,6 @@ Empirical Cumulative Distribution Function
 
 .. function:: ecdf(x)
 
-  Return an empirical cumulative distribution function based on a vector of samples given in ``x``. 
+  Return an empirical cumulative distribution function based on a vector of samples given in ``x``.
 
   **Note:** this is a higher-level function that returns a function, which can then be applied to evaluate CDF values on other samples.
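
To make the "function that returns a function" idea concrete, here is a minimal sketch of an empirical CDF built as a closure. ``simple_ecdf`` is a hypothetical stand-in written for illustration, not the package's implementation.

```julia
# Hypothetical stand-in for illustration: build a closure that evaluates the
# empirical CDF of the samples `x` at a point `t`.
function simple_ecdf(x)
    sorted = sort(x)
    n = length(sorted)
    # searchsortedlast gives the number of sorted samples that are <= t
    return t -> searchsortedlast(sorted, t) / n
end

F = simple_ecdf([3.0, 1.0, 2.0, 2.0])
F(2.0)    # fraction of samples <= 2.0, here 0.75
F(0.5)    # below every sample, here 0.0
```

The returned ``F`` captures the sorted samples once, so repeated evaluations only cost a binary search.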
-
@@ -32,7 +32,7 @@ The package provides functions to compute means of different kinds.
 
 .. function:: mean(x, w)
 
-  The ``mean`` function is also extended to accept a weight vector of type ``WeightVec`` (see :ref:`weightvec`) to compute weighted mean.
+  The ``mean`` function is also extended to accept a weight vector of type ``AbstractWeights`` (see :ref:`weightvec`) to compute weighted mean.
 
   **Examples:**
 
@@ -43,7 +43,7 @@ The package provides functions to compute means of different kinds.
 
 .. function:: mean(x, w, dim)
 
-  Compute weighted means of ``x`` along a certain dimension (specified by an integer ``dim``). The weights are given by a weight vector ``w`` (of type ``WeightVec``).
+  Compute weighted means of ``x`` along a certain dimension (specified by an integer ``dim``). The weights are given by a weight vector ``w`` (of type ``AbstractWeights``).
 
 .. function:: mean!(dst, x, w, dim)
 
 
@@ -9,12 +9,11 @@ The package provides functions for sampling from a given population (with or wit
 .. function:: sample([rng], a)
 
     Randomly draw an element from an array ``a``.
-
     Optionally specify a random number generator ``rng`` as the first argument (defaults to ``Base.GLOBAL_RNG``).
 
-.. function:: sample([rng], a, n[; replace=true, ordered=false])  
+.. function:: sample([rng], a, n[; replace=true, ordered=false])
 
-    Randomly draw ``n`` elements from ``a``. 
+    Randomly draw ``n`` elements from ``a``.
 
     Optionally specify a random number generator ``rng`` as the first argument (defaults to ``Base.GLOBAL_RNG``).
 
@@ -26,14 +25,13 @@ The package provides functions for sampling from a given population (with or wit
 .. function:: sample!([rng], a, x[; replace=true, ordered=false])
 
     Draw ``length(x)`` elements from ``a`` and write them to a pre-allocated array ``x``.
-
     Optionally specify a random number generator ``rng`` as the first argument (defaults to ``Base.GLOBAL_RNG``).
 
-.. function:: sample([rng], wv) 
+.. function:: sample([rng], wv)
 
-    Draw an integer in ``1:length(wv)`` with probabilities proportional to the weights given in ``wv``. 
+    Draw an integer in ``1:length(wv)`` with probabilities proportional to the weights given in ``wv``.
 
-    Here, ``wv`` should be a weight vector of type ``WeightVec`` (see :ref:`weightvec`).
+    Here, ``wv`` should be a weight vector of type ``AbstractWeights`` (see :ref:`weightvec`).
 
     Optionally specify a random number generator ``rng`` as the first argument (defaults to ``Base.GLOBAL_RNG``).
 
@@ -52,7 +50,7 @@ The package provides functions for sampling from a given population (with or wit
     **Keyword arguments**
 
     - ``replace``: indicates whether to have replacement (default = ``true``).
-    - ``ordered``: indicates whether to arrange the samples in ascending order (default = ``false``).    
+    - ``ordered``: indicates whether to arrange the samples in ascending order (default = ``false``).
 
 .. function:: sample!([rng], a, wv, x[; replace=true, ordered=false])
 
@@ -74,7 +72,7 @@ Here are a list of algorithms implemented in the package. The functions below ar
 
 - ``a``: source array representing the population
 - ``x``: the destination array
-- ``wv``: the weight vector (of type ``WeightVec``), for weighted sampling
+- ``wv``: the weight vector (of type ``AbstractWeights``), for weighted sampling
 - ``n``: the length of ``a``
 - ``k``: the length of ``x``. For sampling without replacement, ``k`` must not exceed ``n``.
 - ``rng``: optional random number generator (defaults to ``Base.GLOBAL_RNG``)
@@ -108,7 +106,7 @@ All following functions write results to ``x`` (pre-allocated) and return ``x``.
 
 .. function:: fisher_yates_sample!([rng], a, x)
 
-    *Fisher-Yates shuffling* (with early termination). 
+    *Fisher-Yates shuffling* (with early termination).
 
     Pseudo-code ::
 
@@ -118,15 +116,15 @@ All following functions write results to ``x`` (pre-allocated) and return ``x``.
             swap inds[i] with a random one in inds[i:n]
             set x[i] = a[inds[i]]
         end
-    
+
 
     This algorithm consumes ``k`` random numbers. It uses an integer array of length ``n`` internally to maintain the shuffled indices. It is considerably faster than Knuth's algorithm especially when ``n`` is greater than ``k``.
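
The pseudo-code above can be sketched in runnable form roughly as follows. The name ``fisher_yates_sketch!`` is hypothetical, and the sketch assumes 1-based vectors; it is meant to show the early termination, not to reproduce the package's implementation.

```julia
using Random

# Sketch of Fisher-Yates shuffling with early termination: only the first
# k = length(x) positions of the index permutation are ever fixed.
function fisher_yates_sketch!(rng::AbstractRNG, a::AbstractVector, x::AbstractVector)
    n, k = length(a), length(x)
    k <= n || throw(ArgumentError("cannot draw more than length(a) samples without replacement"))
    inds = collect(1:n)                       # integer array of length n
    for i in 1:k
        j = rand(rng, i:n)                    # random position in inds[i:n]
        inds[i], inds[j] = inds[j], inds[i]   # swap
        x[i] = a[inds[i]]
    end
    return x
end

a = collect(10:10:100)
x = fisher_yates_sketch!(MersenneTwister(1), a, zeros(Int, 4))
```

Exactly one call to ``rand`` is made per output element, which matches the ``k`` random numbers mentioned above.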
 
 .. function:: self_avoid_sample!([rng], a, x)
 
-    Use a set to maintain the index that has been sampled. Each time draw a new index, if the index has already been sampled, redraw until it draws an unsampled one. 
+    Use a set to maintain the index that has been sampled. Each time draw a new index, if the index has already been sampled, redraw until it draws an unsampled one.
 
-    This algorithm consumes about (or slightly more than) ``k`` random numbers, and requires ``O(k)`` memory to store the set of sampled indices. Very fast when ``n >> k``. 
+    This algorithm consumes about (or slightly more than) ``k`` random numbers, and requires ``O(k)`` memory to store the set of sampled indices. Very fast when ``n >> k``.
 
     However, if ``k`` is large and approaches ``n``, the rejection rate would increase drastically, resulting in poorer performance.
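
A minimal sketch of this rejection scheme, under the assumption of 1-based vectors, might look like the following. ``self_avoid_sketch!`` is a hypothetical name used only for illustration.

```julia
using Random

# Sketch of self-avoiding sampling: keep a set of indices already drawn and
# redraw whenever a repeated index comes up.
function self_avoid_sketch!(rng::AbstractRNG, a::AbstractVector, x::AbstractVector)
    n, k = length(a), length(x)
    k <= n || throw(ArgumentError("cannot draw more than length(a) samples without replacement"))
    seen = Set{Int}()                 # O(k) memory
    for i in 1:k
        idx = rand(rng, 1:n)
        while idx in seen             # rejection step
            idx = rand(rng, 1:n)
        end
        push!(seen, idx)
        x[i] = a[idx]
    end
    return x
end

a = collect(1:1000)
x = self_avoid_sketch!(MersenneTwister(1), a, zeros(Int, 5))
```

When ``k`` is close to ``n``, the inner ``while`` loop runs many times per sample, which is the slowdown described above.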
 
@@ -153,7 +151,7 @@ All following functions write results to ``x`` (pre-allocated) and return ``x``.
 
     *Direct sampling.*
 
-    Draw each sample by scanning the weight vector. 
+    Draw each sample by scanning the weight vector.
 
     This algorithm: (1) consumes ``k`` random numbers; (2) has time complexity ``O(n k)``, as scanning the weight vector each time takes ``O(n)``; and (3) requires no additional memory space.
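
The scan can be sketched as below, assuming sampling with replacement, non-negative weights held in a plain vector ``w``, and 1-based vectors. ``direct_sample_sketch!`` is a hypothetical name, not the package function.

```julia
using Random

# Sketch of direct weighted sampling: for each output, draw one random number
# and walk the cumulative weights until they exceed it.
function direct_sample_sketch!(rng::AbstractRNG, a::AbstractVector,
                               w::AbstractVector{<:Real}, x::AbstractVector)
    length(a) == length(w) || throw(DimensionMismatch("a and w must have the same length"))
    total = sum(w)
    n = length(w)
    for i in eachindex(x)
        t = rand(rng) * total         # one random number per sample
        j = 1
        acc = w[1]
        while acc <= t && j < n       # O(n) scan of the weight vector
            j += 1
            acc += w[j]
        end
        x[i] = a[j]
    end
    return x
end

a = ["low", "mid", "high"]
x = direct_sample_sketch!(MersenneTwister(1), a, [0.0, 1.0, 0.0], fill("", 6))
```

No auxiliary table is built, which is why the method needs no extra memory but pays ``O(n)`` per draw.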
 
@@ -173,5 +171,4 @@ All following functions write results to ``x`` (pre-allocated) and return ``x``.
 
     It makes a copy of the weight vector at initialization, and sets the weight to zero when the corresponding sample is picked.
 
-    This algorithm consumes ``O(k)`` random numbers, and has overall time complexity ``O(n k)``. 
-
+    This algorithm consumes ``O(k)`` random numbers, and has overall time complexity ``O(n k)``.
@@ -6,54 +6,73 @@ The package implements functions for computing various statistics over an array
 Moments
 ---------
 
-.. function:: var(x, wv[; mean=...])
+.. function:: var(x, w, [dim][; mean=..., corrected=...])
 
-  Compute weighted variance.
+  Compute the variance of a real-valued array ``x``, optionally over a dimension ``dim``.
+  Observations in ``x`` are weighted using weight vector ``w``.
+  The uncorrected (when ``corrected=false``) sample variance is defined as:
 
-  One can set the keyword argument ``mean``, which can be either ``nothing`` (to compute the mean value within the function), ``0``, or a pre-computed mean value.
+  :math:`\frac{1}{\sum{w}} \sum_{i=1}^n {w_i\left({x_i - m}\right)^2 }`
 
-  **Note:** the result is normalized by ``sum(wv)`` without correction.
+  where ``n`` is the length of the input and ``m`` is the mean.
+  The unbiased estimate (when ``corrected=true``) of the population variance is computed by
+  replacing :math:`\frac{1}{\sum{w}}` with a factor dependent on the type of weights used:
 
-.. function:: var(x, wv, dim[; mean=...])
+    * ``AnalyticWeights``: :math:`\frac{1}{\sum w - \sum {w^2} / \sum w}`
+    * ``FrequencyWeights``: :math:`\frac{1}{\sum{w} - 1}`
+    * ``ProbabilityWeights``: :math:`\frac{n}{(n - 1) \sum w}` where ``n`` equals ``count(!iszero, w)``
+    * ``Weights``: ``ArgumentError`` (bias correction not supported)
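
As a small worked example of the factors listed above, the following plain-Julia sketch computes each normalization by hand for a plain vector of weights. All helper names are hypothetical and not part of the package; only Base functions are used.

```julia
# Hypothetical helpers, not part of the package: each returns the factor that
# multiplies sum(w .* (x .- m).^2) for a plain vector of weights `w`.
uncorrected_factor(w) = 1 / sum(w)
analytic_factor(w)    = 1 / (sum(w) - sum(abs2, w) / sum(w))
frequency_factor(w)   = 1 / (sum(w) - 1)
function probability_factor(w)
    n = count(!iszero, w)            # number of non-zero weights
    return n / ((n - 1) * sum(w))
end

# Weighted sum of squared deviations around the weighted mean.
function weighted_ss(x, w)
    m = sum(w .* x) / sum(w)
    return sum(w .* (x .- m) .^ 2)
end

x = [1.0, 2.0, 4.0]
w = [2, 1, 1]                        # read as frequencies: 1.0 appears twice

ss = weighted_ss(x, w)               # weighted mean is 2.0, so ss is 6.0
var_uncorrected = uncorrected_factor(w) * ss    # 6 / 4
var_frequency   = frequency_factor(w) * ss      # 6 / 3
var_analytic    = analytic_factor(w) * ss       # 6 / 2.5
var_probability = probability_factor(w) * ss    # 6 * 3 / 8
```

The frequency-weighted value agrees with the usual unbiased variance of the expanded data ``[1.0, 1.0, 2.0, 4.0]``, which is the intent of that correction.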
 
-  Weighted variance along a specific dimension.
+.. function:: std(x, w, [dim][; mean=..., corrected=...])
 
-.. function:: std(x, wv[; mean=...])
+  Compute the standard deviation of a real-valued array ``x``, optionally over a dimension ``dim``.
+  Observations in ``x`` are weighted using weight vector ``w``.
+  The uncorrected (when ``corrected=false``) sample standard deviation is defined as:
 
-  Compute weighted standard deviation.
+  :math:`\sqrt{\frac{1}{\sum{w}} \sum_{i=1}^n {w_i\left({x_i - m}\right)^2 }}`
 
-  One can set the keyword argument ``mean``, which can be either ``nothing`` (to compute the mean value within the function), ``0``, or a pre-computed mean value.
+  where ``n`` is the length of the input and ``m`` is the mean.
+  The unbiased estimate (when ``corrected=true``) of the population standard deviation is
+  computed by replacing :math:`\frac{1}{\sum{w}}` with a factor dependent on the type of
+  weights used:
 
-.. function:: std(x, wv, dim[; mean=...])
+    * ``AnalyticWeights``: :math:`\frac{1}{\sum w - \sum {w^2} / \sum w}`
+    * ``FrequencyWeights``: :math:`\frac{1}{\sum{w} - 1}`
+    * ``ProbabilityWeights``: :math:`\frac{n}{(n - 1) \sum w}` where ``n`` equals ``count(!iszero, w)``
+    * ``Weights``: ``ArgumentError`` (bias correction not supported)
 
-  Weighted standard deviation along a specific dimension.
+.. function:: mean_and_var(x[, w][, dim][; corrected=...])
 
-.. function:: mean_and_var(x[, wv][, dim])
+  Jointly compute the mean and variance of a real-valued array ``x``, optionally over a dimension ``dim``, as a tuple.
+  Observations in ``x`` can be weighted using weight vector ``w``.
+  Finally, bias correction is applied to the variance calculation if ``corrected=true``.
+  See ``var`` documentation for more details.
 
-  Jointly compute the mean and variance of ``x``.
+.. function:: mean_and_std(x[, w][, dim][; corrected=...])
 
-.. function:: mean_and_std(x[, wv][, dim])
-
-  Jointly compute the mean and standard deviation of ``x``.
+  Jointly compute the mean and standard deviation of a real-valued array ``x``, optionally over a dimension ``dim``, as a tuple.
+  A weighting vector ``w`` can be specified to weight the estimates.
+  Finally, bias correction is applied to the standard deviation calculation if ``corrected=true``.
+  See ``std`` documentation for more details.
 
 .. function:: skewness(x[, wv])
 
   Compute the (standardized) `skewness <http://en.wikipedia.org/wiki/Skewness>`_ of ``x``.
 
-  One can optionally supply a weight vector of type ``WeightVec`` (see :ref:`weightvec`).
+  One can optionally supply a weight vector of type ``AbstractWeights`` (see :ref:`weightvec`).
 
 .. function:: kurtosis(x[, wv])
 
   Compute the (excessive) `kurtosis <http://en.wikipedia.org/wiki/Kurtosis>`_ of ``x``.
 
-  One can optionally supply a weight vector of type ``WeightVec`` (see :ref:`weightvec`).
+  One can optionally supply a weight vector of type ``AbstractWeights`` (see :ref:`weightvec`).
 
 .. function:: moment(x, k[, m][, wv])
 
   Compute the ``k``-th order central moment of the values in `x`. It is the sample mean of
   ``(x - mean(x)).^k``.
 
-  One can optionally supply the center ``m``, and/or a weight vector of type ``WeightVec`` (see :ref:`weightvec`).
+  One can optionally supply the center ``m``, and/or a weight vector of type ``AbstractWeights`` (see :ref:`weightvec`).
 
 
 Measurements of Variation
@@ -160,7 +179,7 @@ Quantile and Friends
 
 .. function:: median(x, w)
 
-  Compute the weighted median of ``x``, using weights given by a weight vector ``w`` (of type ``WeightVec``).  The weight and data vectors must have the same length.  The weighted median :math:`x_k` is the element of ``x`` that satisfies :math:`\sum_{x_i < x_k} w_i \le \frac{1}{2} \sum_{j} w_j` and :math:`\sum_{x_i > x_k} w_i \le \frac{1}{2} \sum_{j} w_j`.  If a weight has value zero, then its associated data point is ignored.  If none of the weights are positive, an error is thrown.  ``NaN`` is returned if ``x`` contains any ``NaN`` values.  An error is raised if ``w`` contains any ``NaN`` values.
+  Compute the weighted median of ``x``, using weights given by a weight vector ``w`` (of type ``AbstractWeights``).  The weight and data vectors must have the same length.  The weighted median :math:`x_k` is the element of ``x`` that satisfies :math:`\sum_{x_i < x_k} w_i \le \frac{1}{2} \sum_{j} w_j` and :math:`\sum_{x_i > x_k} w_i \le \frac{1}{2} \sum_{j} w_j`.  If a weight has value zero, then its associated data point is ignored.  If none of the weights are positive, an error is thrown.  ``NaN`` is returned if ``x`` contains any ``NaN`` values.  An error is raised if ``w`` contains any ``NaN`` values.
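
The defining inequalities can be checked with a short sketch. ``weighted_median_sketch`` is a hypothetical helper that returns one element satisfying the condition above; it takes a plain vector of weights, does not handle ``NaN`` values, and may treat exact ties differently from the package function.

```julia
# Sketch: sort the points with positive weight and return the first element
# whose cumulative weight reaches half of the total weight.
function weighted_median_sketch(x::AbstractVector{<:Real}, w::AbstractVector{<:Real})
    length(x) == length(w) || throw(DimensionMismatch("x and w must have the same length"))
    keep = findall(wi -> wi > 0, w)           # zero-weight points are ignored
    isempty(keep) && throw(ArgumentError("no positive weights"))
    order = keep[sortperm(x[keep])]           # indices of kept points, sorted by value
    half = sum(w[keep]) / 2
    acc = zero(half)
    for i in order
        acc += w[i]
        acc >= half && return x[i]
    end
    return x[last(order)]                     # only reached through rounding
end

weighted_median_sketch([1.0, 2.0, 10.0], [1, 1, 5])   # the heavy point wins
```

At the returned element the weight strictly below it is less than half the total, and the weight strictly above it is at most half, which is the pair of conditions stated above.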
 
   **Examples:**
 
@@ -171,8 +190,8 @@ Quantile and Friends
 
 .. function:: quantile(x, w, p)
 
-  Compute the weighted quantiles of a vector ``x`` at a specified set of probability values ``p``, using weights given by a weight vector ``w`` (of type ``WeightVec``).  Weights must not be negative. The weights and data vectors must have the same length. The quantile for :math:`p` is defined as follows.  Denoting :math:`S_k = (k-1)w_k + (n-1) \sum_{i<k}w_i`, define :math:`x_{k+1}` the smallest element of ``x`` such that :math:`S_{k+1}/S_{n}` is strictly superior to :math:`p`. The function returns :math:`(1-\gamma) x_k + \gamma x_{k+1}` with  :math:`\gamma = (pS_n- S_k)/(S_{k+1}-S_k)`. This corresponds to  R-7, Excel, SciPy-(1,1), Maple-6 when ``w`` is one (see https://en.wikipedia.org/wiki/Quantile).
-  
+  Compute the weighted quantiles of a vector ``x`` at a specified set of probability values ``p``, using weights given by a weight vector ``w`` (of type ``AbstractWeights``).  Weights must not be negative. The weights and data vectors must have the same length. The quantile for :math:`p` is defined as follows.  Denoting :math:`S_k = (k-1)w_k + (n-1) \sum_{i<k}w_i`, define :math:`x_{k+1}` the smallest element of ``x`` such that :math:`S_{k+1}/S_{n}` is strictly superior to :math:`p`. The function returns :math:`(1-\gamma) x_k + \gamma x_{k+1}` with  :math:`\gamma = (pS_n- S_k)/(S_{k+1}-S_k)`. This corresponds to  R-7, Excel, SciPy-(1,1), Maple-6 when ``w`` is one (see https://en.wikipedia.org/wiki/Quantile).
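
The definition above can be followed step by step in a short sketch for a single probability ``p``. ``weighted_quantile_sketch`` is a hypothetical helper written from the formula in this paragraph, taking a plain vector of non-negative weights with at least one positive entry; it is not the package's implementation.

```julia
# Sketch of the weighted quantile defined via
#   S_k = (k-1) * w_k + (n-1) * sum_{i<k} w_i   (x sorted in ascending order)
function weighted_quantile_sketch(x::AbstractVector{<:Real},
                                  w::AbstractVector{<:Real}, p::Real)
    length(x) == length(w) || throw(DimensionMismatch("x and w must have the same length"))
    0 <= p <= 1 || throw(ArgumentError("p must lie in [0, 1]"))
    order = sortperm(x)
    xs, ws = x[order], w[order]
    n = length(xs)
    S = zeros(Float64, n)
    before = 0.0                                # running sum of w_i for i < k
    for k in 1:n
        S[k] = (k - 1) * ws[k] + (n - 1) * before
        before += ws[k]
    end
    target = p * S[n]
    next = findfirst(s -> s > target, S)        # index k+1 in the text
    next === nothing && return float(xs[n])     # p == 1: no S exceeds the target
    k = next - 1                                # S[1] == 0, so next >= 2
    γ = (target - S[k]) / (S[next] - S[k])
    return (1 - γ) * xs[k] + γ * xs[next]
end

weighted_quantile_sketch([1, 2, 3, 4], [1, 1, 1, 1], 0.5)   # unit weights: 2.5
```

With unit weights ``S_k / S_n`` reduces to ``(k-1)/(n-1)``, so the sketch reproduces the R-7 interpolation mentioned above.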
+
 Mode and Modes
 ---------------