Copyright	(c) 2010 Neil Brown
License	BSD3
Maintainer	bos@serpentine.com
Stability	experimental
Portability	portable
Safe Haskell	Safe-Inferred
Language	Haskell2010

Statistics.Test.MannWhitneyU

Contents

Mann-Whitney U test
- Wilcoxon rank sum test
References

Description

The Mann-Whitney U test (also known as the Mann-Whitney-Wilcoxon test and the Wilcoxon rank sum test) is a non-parametric test for assessing whether two samples of independent observations differ in position (for example, in mean or median).

Synopsis

mannWhitneyUtest :: (Ord a, Unbox a) => PositionTest -> PValue Double -> Vector a -> Vector a -> Maybe TestResult
mannWhitneyU :: (Ord a, Unbox a) => Vector a -> Vector a -> (Double, Double)
mannWhitneyUCriticalValue :: (Int, Int) -> PValue Double -> Maybe Int
mannWhitneyUSignificant :: PositionTest -> (Int, Int) -> PValue Double -> (Double, Double) -> Maybe TestResult
wilcoxonRankSums :: (Ord a, Unbox a) => Vector a -> Vector a -> (Double, Double)
data TestResult
- = Significant
- | NotSignificant
data PositionTest
- = SamplesDiffer
- | AGreater
- | BGreater
significant :: Bool -> TestResult

Mann-Whitney U test

mannWhitneyUtest

Arguments

:: (Ord a, Unbox a)
=> PositionTest	Which test to perform: one-tailed (AGreater, BGreater) or two-tailed (SamplesDiffer).
-> PValue Double	The p-value at which to test (e.g. 0.05)
-> Vector a	First sample
-> Vector a	Second sample
-> Maybe TestResult	Return `Nothing` if the sample was too small to make a decision.

Performs the Mann-Whitney U test on two samples at the required significance level. This is a helper function; for additional information, see the documentation of mannWhitneyU and mannWhitneyUSignificant.

A one-tailed test checks whether one sample is significantly larger than the other (the first for AGreater, the second for BGreater). A two-tailed test checks whether the two samples are significantly different.

mannWhitneyU :: (Ord a, Unbox a) => Vector a -> Vector a -> (Double, Double)

The Mann-Whitney U Test.

This is sometimes known as the Mann-Whitney-Wilcoxon U test and, confusingly, many sources state that the Mann-Whitney U test is the same as Wilcoxon's rank sum test (which is provided as wilcoxonRankSums). The Mann-Whitney U statistic is a simple transform of Wilcoxon's rank sums.

Again confusingly, different sources state reversed definitions for U₁ and U₂, so it is worth being explicit about what this function returns. Given two samples, the first, xs₁, of size n₁ and the second, xs₂, of size n₂, this function returns (U₁, U₂) where U₁ = W₁ - (n₁(n₁+1))/2 and U₂ = W₂ - (n₂(n₂+1))/2, where (W₁, W₂) is the return value of wilcoxonRankSums xs₁ xs₂.

Some sources instead state that U₁ and U₂ should be the other way round, often expressing this using U₁' = n₁n₂ - U₁ (since U₁ + U₂ = n₁n₂).

All of which you probably don't care about if you just feed this into mannWhitneyUSignificant.
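As a small worked illustration of the definitions above, the following sketch converts a pair of rank sums into the corresponding (U₁, U₂) pair. The name uFromRankSums is hypothetical and is not part of this module; it only restates the two formulas given above.

```haskell
-- | Convert rank sums (W1, W2) into (U1, U2) using
--   U = W - n(n+1)/2 for each sample.
--   Hypothetical helper for illustration; not exported by the library.
uFromRankSums :: (Int, Int) -> (Double, Double) -> (Double, Double)
uFromRankSums (n1, n2) (w1, w2) = (w1 - minRankSum n1, w2 - minRankSum n2)
  where
    -- Smallest possible rank sum for a sample of size n: 1 + 2 + ... + n.
    minRankSum :: Int -> Double
    minRankSum n = fromIntegral (n * (n + 1)) / 2
```

For the samples [1,2,3] and [4,5] the rank sums are (6, 9), and uFromRankSums (3, 2) (6, 9) evaluates to (0.0,6.0). The two components add up to n₁n₂ = 6, as expected from U₁ + U₂ = n₁n₂.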

mannWhitneyUCriticalValue

Arguments

:: (Int, Int)	The sample size
-> PValue Double	The p-value (e.g. 0.05) for which you want the critical value.
-> Maybe Int	The critical value (of U).

Calculates the critical value of Mann-Whitney U for the given sample sizes and significance level.

This function returns the exact calculated value of U for all sample sizes; it does not use the normal approximation at all. Above sample size 20 it is generally recommended to use the normal approximation instead, but this function will calculate the higher critical values if you need them.

The algorithm to generate these values is a faster, memoised version of the simple unoptimised generating function given in section 2 of "The Mann Whitney Wilcoxon Distribution Using Linked Lists" (Cheung and Klotz, 1997; see References).
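To give a feel for how exact values of this kind can be computed, here is a minimal, unmemoised sketch of a standard counting recurrence for the null distribution of U when there are no ties. It is an illustration only: the names countU and lowerTail are hypothetical, and the recurrence may differ in detail from both the paper's formulation and the library's memoised implementation.

```haskell
-- | Number of orderings of m + n distinct observations for which the
--   U statistic of the sample of size m equals u.
--   If the largest observation belongs to the first sample it adds n to U;
--   otherwise it adds nothing. Exponential time; for illustration only.
countU :: Int -> Int -> Int -> Integer
countU m n u
  | u < 0            = 0
  | m == 0 || n == 0 = if u == 0 then 1 else 0
  | otherwise        = countU (m - 1) n (u - n) + countU m (n - 1) u

-- | Exact probability, under the null hypothesis, that U <= u.
lowerTail :: Int -> Int -> Int -> Double
lowerTail m n u = fromIntegral below / fromIntegral total
  where
    below = sum [countU m n k | k <- [0 .. u]]
    -- U ranges over 0 .. m*n, so this equals the binomial coefficient C(m+n, n).
    total = sum [countU m n k | k <- [0 .. m * n]]
```

For two samples of size 2, map (countU 2 2) [0 .. 4] evaluates to [1,1,2,1,1], which sums to the 6 possible orderings. A critical value could then be read off as the largest u whose tail probability does not exceed the chosen significance level.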

mannWhitneyUSignificant

Arguments

:: PositionTest	Which test to perform: one-tailed (AGreater, BGreater) or two-tailed (SamplesDiffer).
-> (Int, Int)	The sizes of the samples from which the (U₁,U₂) values were derived.
-> PValue Double	The p-value at which to test (e.g. 0.05)
-> (Double, Double)	The (U₁, U₂) values from `mannWhitneyU`.
-> Maybe TestResult	Return `Nothing` if the sample was too small to make a decision.

Calculates whether the Mann-Whitney U test is significant.

If both sample sizes are less than or equal to 20, the exact U critical value (as calculated by mannWhitneyUCriticalValue) is used. If either sample is larger than 20, the normal approximation is used instead.

If you use a one-tailed test, the test indicates whether the first sample is significantly larger than the second. If you want the opposite, simply reverse the order in both the sample size and the (U₁, U₂) pairs.
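The normal approximation mentioned above is usually stated in terms of a z-score for U. The sketch below uses the textbook mean n₁n₂/2 and standard deviation sqrt(n₁n₂(n₁+n₂+1)/12), with no correction for ties or continuity. The name uZScore is hypothetical, and the sign convention and any corrections used inside the library are not specified here, so treat this as an illustration rather than a description of the implementation.

```haskell
-- | z-score of a U statistic under the normal approximation to its
--   null distribution (no tie or continuity correction).
--   Hypothetical helper for illustration; not exported by the library.
uZScore :: (Int, Int) -> Double -> Double
uZScore (n1, n2) u = (u - mu) / sigma
  where
    m     = fromIntegral n1 :: Double
    n     = fromIntegral n2 :: Double
    mu    = m * n / 2                        -- mean of U under the null hypothesis
    sigma = sqrt (m * n * (m + n + 1) / 12)  -- standard deviation of U
```

For example, uZScore (30, 30) 450 evaluates to 0.0, because 450 is exactly the null mean for two samples of size 30. The resulting z-score would then be compared against a quantile of the standard normal distribution at the chosen significance level.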

Wilcoxon rank sum test

wilcoxonRankSums :: (Ord a, Unbox a) => Vector a -> Vector a -> (Double, Double)

The Wilcoxon Rank Sums Test.

This test calculates the sum of ranks for the given two samples. The samples are ordered, and assigned ranks (ties are given their average rank), then these ranks are summed for each sample.

The return value is (W₁, W₂) where W₁ is the sum of ranks of the first sample and W₂ is the sum of ranks of the second sample. This test is trivially transformed into the Mann-Whitney U test. You will probably want to use mannWhitneyU and the related functions for testing significance, but this function is exposed for completeness.
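The ranking procedure described above (sort the pooled samples, give tied values their average rank, then sum the ranks per sample) can be sketched with plain lists as follows. The name rankSums is hypothetical; this is not the library's vector-based implementation, only an illustration of the same idea.

```haskell
import Data.Function (on)
import Data.List (groupBy, sortBy)
import Data.Ord (comparing)

-- | Rank sums (W1, W2) of two samples; tied values share their average rank.
--   Hypothetical list-based helper for illustration only.
rankSums :: Ord a => [a] -> [a] -> (Double, Double)
rankSums xs ys = (sumFor True, sumFor False)
  where
    -- Tag each value with the sample it came from, then sort the pool by value.
    pooled = sortBy (comparing fst)
                    ([(x, True) | x <- xs] ++ [(y, False) | y <- ys])
    -- Equal values end up in the same group and will share one rank.
    ranked = assign 1 (groupBy ((==) `on` fst) pooled)
    assign _ [] = []
    assign start (g : gs) =
      let k   = length g
          -- Average of the ranks start .. start + k - 1.
          avg = fromIntegral (2 * start + k - 1) / 2 :: Double
      in  [(avg, isFirst) | (_, isFirst) <- g] ++ assign (start + k) gs
    sumFor which = sum [r | (r, isFirst) <- ranked, isFirst == which]
```

For example, rankSums [1, 2, 2] [2, 3] evaluates to (7.0,8.0): the three tied 2s occupy ranks 2 to 4 and each receive rank 3, so the first sample gets 1 + 3 + 3 and the second gets 3 + 5.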

data TestResult

Result of hypothesis testing

Constructors

Significant	Null hypothesis should be rejected
NotSignificant	Data is compatible with the null hypothesis

Instances

- FromJSON TestResult
- ToJSON TestResult
- Data TestResult
- Generic TestResult
- Show TestResult
- Binary TestResult
- NFData TestResult
- Eq TestResult
- Ord TestResult

All instances are defined in Statistics.Test.Types.

data PositionTest

Test type for tests which compare positional information (mean, median, etc.) of samples.

Constructors

SamplesDiffer	Test whether the samples differ in position. The null hypothesis is that the samples are not different.
AGreater	Test whether the first sample (A) is larger than the second (B). The null hypothesis is that the first sample is not larger than the second.
BGreater	Test whether the second sample is larger than the first.

Instances

- FromJSON PositionTest
- ToJSON PositionTest
- Data PositionTest
- Generic PositionTest
- Show PositionTest
- Binary PositionTest
- NFData PositionTest
- Eq PositionTest
- Ord PositionTest

All instances are defined in Statistics.Test.Types.

significant :: Bool -> TestResult

Returns Significant if the parameter is True, and NotSignificant otherwise.

References

Cheung, Y.K.; Klotz, J.H. (1997) The Mann Whitney Wilcoxon distribution using linked lists. Statistica Sinica 7:805–813. http://www3.stat.sinica.edu.tw/statistica/oldpdf/A7n316.pdf.