Compute some basic descriptive statistics.
Values of type factor, character and logical are treated as categorical. For logicals, the two categories are given the labels `Yes` for TRUE, and `No` for FALSE. Factor levels with zero counts are retained.
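The categorical handling described above can be illustrated with base R alone. This is a minimal sketch, not the package's implementation; the vectors `f` and `b` are made-up data, and the relabelling via `factor()` is an assumption about one way to obtain the `Yes`/`No` labels.

```r
# Hypothetical data: a factor with an unused level "c" and a logical vector.
f <- factor(c("a", "b", "a"), levels = c("a", "b", "c"))
counts <- table(f)  # table() keeps level "c" with a count of 0

b <- c(TRUE, FALSE, TRUE)
# Relabel the logical as a factor: TRUE becomes "Yes", FALSE becomes "No".
b_lab <- factor(b, levels = c(TRUE, FALSE), labels = c("Yes", "No"))
b_counts <- table(b_lab)

print(counts)
print(b_counts)
```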
Arguments
- x
A vector of numeric, factor, character or logical values.
- digits
An integer specifying the number of significant digits to keep for numerical results. See signif_pad.
- digits_pct
An integer specifying the number of significant digits to keep for percentages.
- rounding_fn
The function to use to do the rounding. Defaults to signif_pad.
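To make the idea of keeping a fixed number of significant digits concrete, here is a rough base R sketch. The helper `pad_signif` is hypothetical and is not the package's `signif_pad`; it only assumes that rounding keeps `digits` significant digits and pads with trailing zeros.

```r
# Hypothetical helper (not signif_pad itself): round a single number to
# `digits` significant digits and pad with trailing zeros.
pad_signif <- function(x, digits = 3) {
  r <- signif(x, digits)
  if (is.na(r) || r == 0) return(as.character(r))
  # Decimal places needed so that `digits` significant digits are displayed.
  dec <- max(0, digits - 1 - floor(log10(abs(r))))
  formatC(r, digits = dec, format = "f")
}

out <- vapply(c(4.9432, 0.19934, 42.83, 4.9), pad_signif, character(1))
print(out)  # "4.94" "0.199" "42.8" "4.90"
```

A function of this shape, taking a value and a `digits` argument, is the kind of thing one might pass as `rounding_fn`, though the exact interface expected by the package should be checked against its documentation.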
Value
A list. For numeric x, the list contains the numeric elements:
- N: the number of non-missing values
- NMISS: the number of missing values
- SUM: the sum of the non-missing values
- MEAN: the mean of the non-missing values
- SD: the standard deviation of the non-missing values
- MIN: the minimum of the non-missing values
- MEDIAN: the median of the non-missing values
- CV: the percent coefficient of variation of the non-missing values
- GMEAN: the geometric mean of the non-missing values if non-negative, or NA
- GCV: the percent geometric coefficient of variation of the non-missing values if non-negative, or NA
- GSD: the geometric standard deviation of the non-missing values if non-negative, or NA
- Q1: the first quartile of the non-missing values (alias q25)
- Q2: the second quartile of the non-missing values (alias q50 or Median)
- Q3: the third quartile of the non-missing values (alias q75)
- IQR: the inter-quartile range of the non-missing values (i.e., Q3 - Q1)
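The less familiar entries in this list (CV, GMEAN, GCV, GSD) can be sketched in base R as follows. This is an illustration under stated assumptions, not the package's code: `describe_numeric` is a hypothetical name, the geometric statistics use the usual log-scale definitions, and `quantile()` is called with its default method, which may differ from what the package uses. Results are left unrounded.

```r
# Hypothetical helper illustrating the definitions; results are unrounded.
describe_numeric <- function(x) {
  v <- x[!is.na(x)]                        # non-missing values only
  nonneg <- all(v >= 0)
  lv <- if (nonneg) log(v) else NA_real_   # log scale, NA if any negative
  q <- unname(quantile(v, c(0.25, 0.5, 0.75)))
  list(
    N     = length(v),
    NMISS = sum(is.na(x)),
    MEAN  = mean(v),
    SD    = sd(v),
    CV    = 100 * sd(v) / mean(v),               # percent CV
    GMEAN = exp(mean(lv)),                       # geometric mean
    GCV   = 100 * sqrt(exp(sd(lv)^2) - 1),       # percent geometric CV
    GSD   = exp(sd(lv)),                         # geometric SD
    Q1 = q[1], Q2 = q[2], Q3 = q[3],
    IQR = q[3] - q[1]
  )
}

res <- describe_numeric(c(1, 2, 4, NA))
str(res)
```

With the made-up input `c(1, 2, 4, NA)`, the geometric mean is 2 because the logs are evenly spaced around `log(2)`. A zero in the data would make the log `-Inf`; the sketch does not treat that case specially.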
If x is categorical (i.e. factor, character or logical), the list contains a sublist for each category, where each sublist contains the numeric elements:
- FREQ: the frequency count
- PCT: the percent relative frequency, including NA in the denominator
- PCTnoNA: the percent relative frequency, excluding NA from the denominator
- Nall: the total count, including NA values
- N: the total count, excluding NA values
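The difference between PCT and PCTnoNA comes down to the denominator. The base R sketch below uses made-up counts (45 "Female", 44 "Male", 10 missing) chosen to mirror the example output on this page; it is an illustration of the arithmetic, not the package's implementation.

```r
# Hypothetical data: 45 "Female", 44 "Male" and 10 missing values.
y <- factor(c(rep("Female", 45), rep("Male", 44), rep(NA, 10)),
            levels = c("Female", "Male"))

freq  <- table(y)          # FREQ: NA values are not counted as a category
n_all <- length(y)         # Nall: 99, includes the NA values
n     <- sum(!is.na(y))    # N: 89, excludes the NA values

pct       <- 100 * freq / n_all   # PCT: NA included in the denominator
pct_no_na <- 100 * freq / n       # PCTnoNA: NA excluded from the denominator

print(round(pct, 1))        # Female 45.5, Male 44.4
print(round(pct_no_na, 1))  # Female 50.6, Male 49.4
```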
Examples
x <- exp(rnorm(100, 1, 1))
num_stat(x)
#> $N
#> [1] "100"
#>
#> $NMISS
#> [1] "0"
#>
#> $SUM
#> [1] "494"
#>
#> $MEAN
#> [1] "4.94"
#>
#> $SD
#> [1] "6.55"
#>
#> $CV
#> [1] "132.7%"
#>
#> $GMEAN
#> [1] "2.92"
#>
#> $GCV
#> [1] "142.0%"
#>
#> $GSD
#> [1] "2.86"
#>
#> $MEDIAN
#> [1] "2.98"
#>
#> $MIN
#> [1] "0.199"
#>
#> $Q1
#> 25%
#> "1.90"
#>
#> $Q2
#> 50%
#> "2.98"
#>
#> $Q3
#> 75%
#> "5.07"
#>
#> $IQR
#> 75%
#> "3.17"
#>
#> $MAX
#> [1] "42.8"
#>
y <- factor(sample(0:1, 99, replace = TRUE), labels = c("Female", "Male"))
y[1:10] <- NA
cat_stat(y)
#> $Female
#> $Female$FREQ
#> [1] 45
#>
#> $Female$PCT
#> [1] "45.5%"
#>
#> $Female$PCTnoNA
#> [1] "50.6%"
#>
#> $Female$Nall
#> [1] 99
#>
#> $Female$N
#> [1] 89
#>
#>
#> $Male
#> $Male$FREQ
#> [1] 44
#>
#> $Male$PCT
#> [1] "44.4%"
#>
#> $Male$PCTnoNA
#> [1] "49.4%"
#>
#> $Male$Nall
#> [1] 99
#>
#> $Male$N
#> [1] 89
#>
#>
cat_stat(is.na(y))
#> $Yes
#> $Yes$FREQ
#> [1] 10
#>
#> $Yes$PCT
#> [1] "10.1%"
#>
#> $Yes$PCTnoNA
#> [1] "10.1%"
#>
#> $Yes$Nall
#> [1] 99
#>
#> $Yes$N
#> [1] 99
#>
#>
#> $No
#> $No$FREQ
#> [1] 89
#>
#> $No$PCT
#> [1] "89.9%"
#>
#> $No$PCTnoNA
#> [1] "89.9%"
#>
#> $No$Nall
#> [1] 99
#>
#> $No$N
#> [1] 99
#>
#>