statistics-0.16.2.1: A library of statistical types, data, and functions
Copyright: (c) 2009 Bryan O'Sullivan
License: BSD3
Maintainer: bos@serpentine.com
Stability: experimental
Portability: portable
Safe Haskell: Safe-Inferred
Language: Haskell2010

Statistics.Quantile

Description

Functions for approximating quantiles, i.e. points taken at regular intervals from the cumulative distribution function of a random variable.

The number of quantiles is described below by the variable q, so with q=4, a 4-quantile (also known as a quartile) has 4 intervals and contains 5 points. The parameter k describes the desired point, where 0 ≤ k ≤ q.

Quantile estimation functions

Below is a family of functions that use the same algorithm to estimate sample quantiles. The algorithm approximates the empirical CDF as a continuous piecewise-linear function that interpolates between the points \((X_k,p_k)\), where \(X_k\) is the k-th order statistic (the k-th smallest element) and \(p_k\) is the probability corresponding to it. ContParam determines how \(p_k\) is chosen. For a more detailed explanation, see [Hyndman1996].

This is the method used by most statistical software, such as R, Mathematica, SPSS, and S.
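The interpolation described above can be sketched in a few lines of plain Haskell. This is an illustrative re-implementation, not the library's code: the name `quantileSketch` is hypothetical, it works on lists rather than vectors, and it assumes the position formula t = α + (k/q)·(n + 1 − α − β) from the piecewise-linear family in [Hyndman1996]. It also assumes a nonempty, NaN-free sample with 0 ≤ k ≤ q and makes no attempt at input validation or rounding safeguards.

```haskell
import Data.List (sort)

-- | Sketch of the piecewise-linear quantile estimator with parameters
--   alpha and beta (hypothetical helper, not part of the library).
--   Assumes a nonempty, NaN-free sample and 0 <= k <= q.
quantileSketch :: Double -> Double -> Int -> Int -> [Double] -> Double
quantileSketch alpha beta k q xs = (1 - g) * item (j - 1) + g * item j
  where
    sorted = sort xs
    n      = length sorted
    p      = fromIntegral k / fromIntegral q
    -- Fractional, 1-based position of the quantile among the order statistics.
    t      = alpha + p * (fromIntegral n + 1 - alpha - beta)
    -- Integer part selects the lower neighbour, fractional part the weight.
    j      = floor t :: Int
    g      = t - fromIntegral j
    -- 0-based lookup, clamped so positions outside the sample reuse the
    -- smallest or largest element.
    item i = sorted !! max 0 (min (n - 1) i)
```

With α = β = 1/3 and the sample [1,1,2,2,3], the difference between the k=3 and k=1 quartiles comes out at about 1.3333, which agrees with the midspread example given later on this page.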

data ContParam Source #

Parameters α and β to the continuousBy function. The exact meaning of the parameters is described in [Hyndman1996], in the section "Piecewise linear functions".

Constructors

ContParam !Double !Double 

Instances

FromJSON ContParam Source # 
Instance details

Defined in Statistics.Quantile

ToJSON ContParam Source # 
Instance details

Defined in Statistics.Quantile

Data ContParam Source # 
Instance details

Defined in Statistics.Quantile

Methods

gfoldl :: (forall d b. Data d => c (d -> b) -> d -> c b) -> (forall g. g -> c g) -> ContParam -> c ContParam #

gunfold :: (forall b r. Data b => c (b -> r) -> c r) -> (forall r. r -> c r) -> Constr -> c ContParam #

toConstr :: ContParam -> Constr #

dataTypeOf :: ContParam -> DataType #

dataCast1 :: Typeable t => (forall d. Data d => c (t d)) -> Maybe (c ContParam) #

dataCast2 :: Typeable t => (forall d e. (Data d, Data e) => c (t d e)) -> Maybe (c ContParam) #

gmapT :: (forall b. Data b => b -> b) -> ContParam -> ContParam #

gmapQl :: (r -> r' -> r) -> r -> (forall d. Data d => d -> r') -> ContParam -> r #

gmapQr :: forall r r'. (r' -> r -> r) -> r -> (forall d. Data d => d -> r') -> ContParam -> r #

gmapQ :: (forall d. Data d => d -> u) -> ContParam -> [u] #

gmapQi :: Int -> (forall d. Data d => d -> u) -> ContParam -> u #

gmapM :: Monad m => (forall d. Data d => d -> m d) -> ContParam -> m ContParam #

gmapMp :: MonadPlus m => (forall d. Data d => d -> m d) -> ContParam -> m ContParam #

gmapMo :: MonadPlus m => (forall d. Data d => d -> m d) -> ContParam -> m ContParam #

Generic ContParam Source # 
Instance details

Defined in Statistics.Quantile

Associated Types

type Rep ContParam :: Type -> Type #

Show ContParam Source # 
Instance details

Defined in Statistics.Quantile

Binary ContParam Source # 
Instance details

Defined in Statistics.Quantile

Default ContParam Source #

The default value is s, which is the same as R's default.

Instance details

Defined in Statistics.Quantile

Methods

def :: ContParam Source #

Eq ContParam Source # 
Instance details

Defined in Statistics.Quantile

Ord ContParam Source # 
Instance details

Defined in Statistics.Quantile

type Rep ContParam Source # 
Instance details

Defined in Statistics.Quantile

type Rep ContParam = D1 ('MetaData "ContParam" "Statistics.Quantile" "statistics-0.16.2.1-CZx41IRMcmf3DlPKOW81PQ" 'False) (C1 ('MetaCons "ContParam" 'PrefixI 'False) (S1 ('MetaSel ('Nothing :: Maybe Symbol) 'SourceUnpack 'SourceStrict 'DecidedStrict) (Rec0 Double) :*: S1 ('MetaSel ('Nothing :: Maybe Symbol) 'SourceUnpack 'SourceStrict 'DecidedStrict) (Rec0 Double)))

class Default a where Source #

A class for types with a default value.

Minimal complete definition

Nothing

Methods

def :: a Source #

The default value for this type.

Instances

Default All 
Instance details

Defined in Data.Default.Class

Methods

def :: All Source #

Default Any 
Instance details

Defined in Data.Default.Class

Methods

def :: Any Source #

Default CClock 
Instance details

Defined in Data.Default.Class

Methods

def :: CClock Source #

Default CDouble 
Instance details

Defined in Data.Default.Class

Methods

def :: CDouble Source #

Default CFloat 
Instance details

Defined in Data.Default.Class

Methods

def :: CFloat Source #

Default CInt 
Instance details

Defined in Data.Default.Class

Methods

def :: CInt Source #

Default CIntMax 
Instance details

Defined in Data.Default.Class

Methods

def :: CIntMax Source #

Default CIntPtr 
Instance details

Defined in Data.Default.Class

Methods

def :: CIntPtr Source #

Default CLLong 
Instance details

Defined in Data.Default.Class

Methods

def :: CLLong Source #

Default CLong 
Instance details

Defined in Data.Default.Class

Methods

def :: CLong Source #

Default CPtrdiff 
Instance details

Defined in Data.Default.Class

Methods

def :: CPtrdiff Source #

Default CSUSeconds 
Instance details

Defined in Data.Default.Class

Default CShort 
Instance details

Defined in Data.Default.Class

Methods

def :: CShort Source #

Default CSigAtomic 
Instance details

Defined in Data.Default.Class

Default CSize 
Instance details

Defined in Data.Default.Class

Methods

def :: CSize Source #

Default CTime 
Instance details

Defined in Data.Default.Class

Methods

def :: CTime Source #

Default CUInt 
Instance details

Defined in Data.Default.Class

Methods

def :: CUInt Source #

Default CUIntMax 
Instance details

Defined in Data.Default.Class

Methods

def :: CUIntMax Source #

Default CUIntPtr 
Instance details

Defined in Data.Default.Class

Methods

def :: CUIntPtr Source #

Default CULLong 
Instance details

Defined in Data.Default.Class

Methods

def :: CULLong Source #

Default CULong 
Instance details

Defined in Data.Default.Class

Methods

def :: CULong Source #

Default CUSeconds 
Instance details

Defined in Data.Default.Class

Methods

def :: CUSeconds Source #

Default CUShort 
Instance details

Defined in Data.Default.Class

Methods

def :: CUShort Source #

Default Int16 
Instance details

Defined in Data.Default.Class

Methods

def :: Int16 Source #

Default Int32 
Instance details

Defined in Data.Default.Class

Methods

def :: Int32 Source #

Default Int64 
Instance details

Defined in Data.Default.Class

Methods

def :: Int64 Source #

Default Int8 
Instance details

Defined in Data.Default.Class

Methods

def :: Int8 Source #

Default Word16 
Instance details

Defined in Data.Default.Class

Methods

def :: Word16 Source #

Default Word32 
Instance details

Defined in Data.Default.Class

Methods

def :: Word32 Source #

Default Word64 
Instance details

Defined in Data.Default.Class

Methods

def :: Word64 Source #

Default Word8 
Instance details

Defined in Data.Default.Class

Methods

def :: Word8 Source #

Default Ordering 
Instance details

Defined in Data.Default.Class

Methods

def :: Ordering Source #

Default NewtonParam 
Instance details

Defined in Numeric.RootFinding

Default RiddersParam 
Instance details

Defined in Numeric.RootFinding

Default ContParam Source #

The default value is s, which is the same as R's default.

Instance details

Defined in Statistics.Quantile

Methods

def :: ContParam Source #

Default Integer 
Instance details

Defined in Data.Default.Class

Methods

def :: Integer Source #

Default () 
Instance details

Defined in Data.Default.Class

Methods

def :: () Source #

Default Double 
Instance details

Defined in Data.Default.Class

Methods

def :: Double Source #

Default Float 
Instance details

Defined in Data.Default.Class

Methods

def :: Float Source #

Default Int 
Instance details

Defined in Data.Default.Class

Methods

def :: Int Source #

Default Word 
Instance details

Defined in Data.Default.Class

Methods

def :: Word Source #

(Default a, RealFloat a) => Default (Complex a) 
Instance details

Defined in Data.Default.Class

Methods

def :: Complex a Source #

Default (First a) 
Instance details

Defined in Data.Default.Class

Methods

def :: First a Source #

Default (Last a) 
Instance details

Defined in Data.Default.Class

Methods

def :: Last a Source #

Default a => Default (Dual a) 
Instance details

Defined in Data.Default.Class

Methods

def :: Dual a Source #

Default (Endo a) 
Instance details

Defined in Data.Default.Class

Methods

def :: Endo a Source #

Num a => Default (Product a) 
Instance details

Defined in Data.Default.Class

Methods

def :: Product a Source #

Num a => Default (Sum a) 
Instance details

Defined in Data.Default.Class

Methods

def :: Sum a Source #

Integral a => Default (Ratio a) 
Instance details

Defined in Data.Default.Class

Methods

def :: Ratio a Source #

Default a => Default (IO a) 
Instance details

Defined in Data.Default.Class

Methods

def :: IO a Source #

Default (Maybe a) 
Instance details

Defined in Data.Default.Class

Methods

def :: Maybe a Source #

Default [a] 
Instance details

Defined in Data.Default.Class

Methods

def :: [a] Source #

(Default a, Default b) => Default (a, b) 
Instance details

Defined in Data.Default.Class

Methods

def :: (a, b) Source #

Default r => Default (e -> r) 
Instance details

Defined in Data.Default.Class

Methods

def :: e -> r Source #

(Default a, Default b, Default c) => Default (a, b, c) 
Instance details

Defined in Data.Default.Class

Methods

def :: (a, b, c) Source #

(Default a, Default b, Default c, Default d) => Default (a, b, c, d) 
Instance details

Defined in Data.Default.Class

Methods

def :: (a, b, c, d) Source #

(Default a, Default b, Default c, Default d, Default e) => Default (a, b, c, d, e) 
Instance details

Defined in Data.Default.Class

Methods

def :: (a, b, c, d, e) Source #

(Default a, Default b, Default c, Default d, Default e, Default f) => Default (a, b, c, d, e, f) 
Instance details

Defined in Data.Default.Class

Methods

def :: (a, b, c, d, e, f) Source #

(Default a, Default b, Default c, Default d, Default e, Default f, Default g) => Default (a, b, c, d, e, f, g) 
Instance details

Defined in Data.Default.Class

Methods

def :: (a, b, c, d, e, f, g) Source #

quantile Source #

Arguments

:: Vector v Double
=> ContParam    -- Parameters α and β.
-> Int          -- k, the desired quantile.
-> Int          -- q, the number of quantiles.
-> v Double     -- x, the sample data.
-> Double

O(n·log n). Estimate the kth q-quantile of a sample x, using the continuous sample method with the given parameters.

The following properties must hold; otherwise an error is thrown.

  • the input sample is nonempty
  • the input does not contain NaN
  • 0 ≤ k ≤ q
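A minimal usage sketch for quantile follows. It assumes the sample is stored in an unboxed vector from the vector package, imported qualified as U (the same prefix the midspread example on this page uses); the names `sample`, `lowerQuartile` and `upperQuartile` are chosen only for illustration.

```haskell
import qualified Data.Vector.Unboxed as U
import Statistics.Quantile (quantile, s, medianUnbiased)

-- Illustrative sample; any nonempty, NaN-free vector works.
sample :: U.Vector Double
sample = U.fromList [1, 1, 2, 2, 3]

-- Lower quartile (k = 1 of q = 4) under the S definition.
lowerQuartile :: Double
lowerQuartile = quantile s 1 4 sample

-- Upper quartile (k = 3 of q = 4) with the median-unbiased parameters.
upperQuartile :: Double
upperQuartile = quantile medianUnbiased 3 4 sample
```

Because the preconditions are enforced with a thrown error rather than a Maybe result, callers that cannot guarantee a nonempty, NaN-free input may want to check it before calling.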

quantiles :: (Vector v Double, Foldable f, Functor f) => ContParam -> f Int -> Int -> v Double -> f Double Source #

O(k·n·log n). Estimate a set of kth q-quantiles of a sample x, using the continuous sample method with the given parameters. This is faster than calling quantile repeatedly, since the sample needs to be sorted only once.

The following properties must hold; otherwise an error is thrown.

  • the input sample is nonempty
  • the input does not contain NaN
  • 0 ≤ k ≤ q for every k in the set of requested quantiles

quantilesVec :: (Vector v Double, Vector v Int) => ContParam -> v Int -> Int -> v Double -> v Double Source #

O(k·n·log n). Same as quantiles, but uses a Vector container instead of a Foldable one.
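As a hedged illustration of the two batch variants, the sketch below requests all three quartile cut points at once. It assumes an ordinary list as the Foldable container for quantiles and an unboxed vector of indices for quantilesVec; `sample`, `quartiles` and `quartilesVec` are illustrative names, and def is the Default instance documented above.

```haskell
import qualified Data.Vector.Unboxed as U
import Statistics.Quantile (Default (def), quantiles, quantilesVec)

-- Illustrative sample; any nonempty, NaN-free vector works.
sample :: U.Vector Double
sample = U.fromList [1, 1, 2, 2, 3]

-- The three quartile cut points in one call, returned as a list.
quartiles :: [Double]
quartiles = quantiles def [1, 2, 3] 4 sample

-- The same request with the indices supplied as an unboxed vector.
quartilesVec :: U.Vector Double
quartilesVec = quantilesVec def (U.fromList [1, 2, 3]) 4 sample
```

The result keeps the shape of the index container, so a list of indices yields a list of estimates and a vector of indices yields a vector.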

Parameters for the continuous sample method

cadpw :: ContParam Source #

California Department of Public Works definition, α=0, β=1. Gives a linear interpolation of the empirical CDF. This corresponds to method 4 in R and Mathematica.

hazen :: ContParam Source #

Hazen's definition, α=0.5, β=0.5. This is claimed to be popular among hydrologists. This corresponds to method 5 in R and Mathematica.

spss :: ContParam Source #

Definition used by the SPSS statistics application, with α=0, β=0 (also known as Weibull's definition). This corresponds to method 6 in R and Mathematica.

s :: ContParam Source #

Definition used by the S statistics application, with α=1, β=1. The interpolation points divide the sample range into n-1 intervals. This corresponds to method 7 in R and Mathematica and is the default in R.

medianUnbiased :: ContParam Source #

Median unbiased definition, α=1/3, β=1/3. The resulting quantile estimates are approximately median unbiased regardless of the distribution of x. This corresponds to method 8 in R and Mathematica.

normalUnbiased :: ContParam Source #

Normal unbiased definition, α=3/8, β=3/8. An approximately unbiased estimate if the empirical distribution approximates the normal distribution. This corresponds to method 9 in R and Mathematica.
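The parameter sets above can be read as choices of plotting position. Assuming the piecewise-linear family of [Hyndman1996], in which the k-th order statistic of a sample of size n is assigned the probability shown in the first formula, substituting each pair of α and β gives the positions listed below. This is a worked substitution under that assumption, not a statement taken from the library's source.

```latex
\[
p_k = \frac{k - \alpha}{n + 1 - \alpha - \beta}, \qquad k = 1, \dots, n
\]

\begin{align*}
\text{cadpw}\ (\alpha = 0,\ \beta = 1):\quad & p_k = \frac{k}{n} \\
\text{hazen}\ (\alpha = \beta = 1/2):\quad & p_k = \frac{k - 1/2}{n} \\
\text{spss}\ (\alpha = \beta = 0):\quad & p_k = \frac{k}{n + 1} \\
\text{s}\ (\alpha = \beta = 1):\quad & p_k = \frac{k - 1}{n - 1} \\
\text{medianUnbiased}\ (\alpha = \beta = 1/3):\quad & p_k = \frac{k - 1/3}{n + 1/3} \\
\text{normalUnbiased}\ (\alpha = \beta = 3/8):\quad & p_k = \frac{k - 3/8}{n + 1/4}
\end{align*}
```

The s row shows why that definition divides the sample range into n-1 intervals: the smallest element sits at probability 0 and the largest at probability 1.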

Other algorithms

weightedAvg Source #

Arguments

:: Vector v Double
=> Int          -- k, the desired quantile.
-> Int          -- q, the number of quantiles.
-> v Double     -- x, the sample data.
-> Double

O(n·log n). Estimate the kth q-quantile of a sample, using the weighted average method. Up to rounding errors, it is the same as quantile s.

The following properties must hold; otherwise an error is thrown.

  • the input is nonempty
  • the input does not contain NaN
  • 0 ≤ k ≤ q

Median & other specializations

median Source #

Arguments

:: Vector v Double
=> ContParam    -- Parameters α and β.
-> v Double     -- x, the sample data.
-> Double

O(n·log n). Estimate the median of a sample.

mad Source #

Arguments

:: Vector v Double
=> ContParam    -- Parameters α and β.
-> v Double     -- x, the sample data.
-> Double

O(n·log n). Estimate the median absolute deviation (MAD) of a sample x using continuousBy. It is a robust estimate of the variability in a sample and is defined as:

\[ MAD = \operatorname{median}(| X_i - \operatorname{median}(X) |) \]
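The formula above can be sketched directly in plain Haskell. This is an illustration rather than the library's implementation: `medianSketch` and `madSketch` are hypothetical names, they work on lists, and they assume the plain textbook median (middle element, or the mean of the two middle elements) instead of a ContParam-dependent estimate. The input is assumed to be nonempty and free of NaN.

```haskell
import Data.List (sort)

-- | Plain sample median: the middle element, or the mean of the two
--   middle elements for an even-sized sample. Assumes a nonempty list.
medianSketch :: [Double] -> Double
medianSketch xs
  | odd n     = sorted !! mid
  | otherwise = (sorted !! (mid - 1) + sorted !! mid) / 2
  where
    sorted = sort xs
    n      = length sorted
    mid    = n `div` 2

-- | Median absolute deviation: the median of the absolute distances
--   of each element from the sample median.
madSketch :: [Double] -> Double
madSketch xs = medianSketch [abs (x - m) | x <- xs]
  where
    m = medianSketch xs
```

For the sample [1,2,3,4,100] the median is 3 and the absolute deviations are [2,1,0,1,97], so the sketch returns 1; the outlier 100 has no effect on the result, which is the robustness the description refers to.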

midspread Source #

Arguments

:: Vector v Double
=> ContParam    -- Parameters α and β.
-> Int          -- q, the number of quantiles.
-> v Double     -- x, the sample data.
-> Double

O(n·log n). Estimate the range between q-quantiles 1 and q-1 of a sample x, using the continuous sample method with the given parameters.

For instance, the interquartile range (IQR) can be estimated as follows:

midspread medianUnbiased 4 (U.fromList [1,1,2,2,3])
==> 1.333333

Deprecated

continuousBy Source #

Arguments

:: Vector v Double
=> ContParam    -- Parameters α and β.
-> Int          -- k, the desired quantile.
-> Int          -- q, the number of quantiles.
-> v Double     -- x, the sample data.
-> Double

Deprecated: Use quantile instead

References