
Statistical functions in R

Zigya Academy

In this article, you’ll learn about the statistical functions used in R. We will go through each of them with an example and show the various ways to use them.

A standard R installation contains a wide range of statistical functions. In this article, we will briefly look at the most important ones.

Arithmetic Mean mean()

Generic function for the (trimmed) arithmetic mean.

Usage

mean(x, ...)
mean(x, trim = 0, na.rm = FALSE, ...)

Arguments

Argument | Description
x | An R object. Currently, there are methods for numeric/logical vectors and date, date-time, and time interval objects. Complex vectors are allowed for trim = 0 only.
trim | The fraction (0 to 0.5) of observations to be trimmed from each end of x before the mean is computed. Values of trim outside that range are taken as the nearest endpoint.
na.rm | A logical value indicating whether NA values should be stripped before the computation proceeds.

Example

# Create a vector with random values with mean of 50 and sd of 5
> x <- round(rnorm(10, mean = 50, sd = 5))
> x
 [1] 52 47 54 51 49 49 60 54 56 50

> mean(x)
[1] 52.2

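# trim is capped at 0.5; at that value everything but the middle is trimmed, so the result is the median
> mean(x, trim = 0.5)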
[1] 51.5
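
The effect of the trim and na.rm arguments is easier to see on a fixed vector. The following is a minimal sketch that reuses the ten values printed above; the vector y with an appended NA is an illustrative assumption and not part of the original example.

```r
# Same ten values as in the example above, typed in so the result is reproducible
x <- c(52, 47, 54, 51, 49, 49, 60, 54, 56, 50)

# trim = 0.1 drops 10% of the observations from each end,
# here the smallest (47) and the largest (60) value
trimmed <- mean(x, trim = 0.1)
print(trimmed)            # 51.875

# Hypothetical vector with one missing value appended
y <- c(x, NA)
print(mean(y))            # NA, because the missing value propagates

# na.rm = TRUE strips the NA before the mean is computed
with_na_removed <- mean(y, na.rm = TRUE)
print(with_na_removed)    # 52.2
```

A small trim such as 0.1 is a common way to reduce the influence of extreme values while still averaging most of the data.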

Median Value

Compute the sample median.

Usage

median(x, na.rm = FALSE, ...)

Arguments

Argument | Description
x | An object for which a method has been defined, or a numeric vector containing the values whose median is to be computed.
na.rm | A logical value indicating whether NA values should be stripped before the computation proceeds.

Let’s see it with an example.

> median(1:10)
[1] 5.5

> median(c(2,5,1,3,5,23,34,12,67))
[1] 5
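
A short sketch of how the median behaves with outliers, with an even number of values, and with missing values. All vectors below are made-up illustrative data, not taken from the article.

```r
# Hypothetical data: four typical values and one extreme outlier
v <- c(1, 2, 3, 4, 100)
m_mean <- mean(v)        # 22, pulled up by the outlier
m_median <- median(v)    # 3, unaffected by the outlier
print(c(mean = m_mean, median = m_median))

# With an even number of values, the two middle values are averaged
even_median <- median(c(1, 2, 3, 4))
print(even_median)       # 2.5

# A missing value makes the result NA unless na.rm = TRUE
w <- c(7, NA, 3, 5)
print(median(w))         # NA
na_median <- median(w, na.rm = TRUE)
print(na_median)         # 5
```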

Variance

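The variance is a numerical measure of how the data values are dispersed around the mean. In particular, the sample variance of n observations x_1, ..., x_n with sample mean \bar{x} is defined as:

$$ s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2 $$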

In R, the sample variance is computed with var(), which divides by n - 1. It should not be confused with VAR() from the vars package, which estimates a vector autoregression and is unrelated.

Usage

var(x, y = NULL, na.rm = FALSE, use)

Arguments

Argument | Description
x | A numeric vector, matrix, or data frame.
y | NULL (default) or a vector, matrix, or data frame with dimensions compatible with x.
na.rm | A logical value indicating whether missing values should be removed.
use | An optional character string giving the method for computing covariances in the presence of missing values.

Let’s take the built-in dataset cars and find the variance of speed.

# Load the cars dataset
> dt <- cars
> speed <- dt$speed
> speed
 [1]  4  4  7  7  8  9 10 10 10 11 11 12 12 12 12 13 13 13 13 14 14 14 14 15 15
[26] 15 16 16 17 17 17 18 18 18 18 19 19 19 20 20 20 20 20 22 23 24 24 24 24 25

> var(speed)
[1] 27.95918
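
The sample variance is the sum of squared deviations from the mean divided by n - 1. As a quick check, this sketch computes that quantity by hand for the same speed column and compares it with var(); the variable names manual_var and builtin_var are illustrative.

```r
# cars ships with base R (datasets package), so no download is needed
speed <- cars$speed
n <- length(speed)

# Sum of squared deviations from the mean, divided by n - 1
manual_var <- sum((speed - mean(speed))^2) / (n - 1)

# Built-in sample variance
builtin_var <- var(speed)

print(c(manual = manual_var, builtin = builtin_var))

# all.equal() compares the two numbers with a small floating-point tolerance
print(isTRUE(all.equal(manual_var, builtin_var)))
```

Both values agree with the 27.95918 shown above, which confirms that var() uses the n - 1 denominator rather than n.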

Standard Deviation

The sd() function computes the standard deviation of the values in x. If na.rm is TRUE, missing values are removed before the computation proceeds.

Usage

sd(x, na.rm = FALSE)

Arguments

Argument | Description
x | A numeric vector, or an R object (other than a factor) that can be coerced to numeric by as.double(x).
na.rm | A logical value indicating whether missing values should be removed before the computation proceeds.

Let’s see it with an example using the rnorm() function.

> x = rnorm(10, 10, 20)
> x
 [1]  36.1663953  32.9141902  18.7596222 -20.5158583 -15.9542984   0.8033739
 [7]  -3.1769068   6.7160742   3.7626753   8.7092254

> sd(x)
[1] 18.55818
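
The standard deviation is the square root of the sample variance, and the sketch below checks that relationship. Because rnorm() draws new random numbers on every call, set.seed() is used here to make the run repeatable; the seed value 42 and the vector x_na are arbitrary assumptions for illustration.

```r
# Fix the random number generator so the same values are drawn each run
set.seed(42)
x <- rnorm(10, mean = 10, sd = 20)

# sd() should match the square root of var()
sd_builtin <- sd(x)
sd_from_var <- sqrt(var(x))
print(isTRUE(all.equal(sd_builtin, sd_from_var)))

# Hypothetical vector with a missing value appended
x_na <- c(x, NA)
print(sd(x_na))                     # NA, because the missing value propagates
sd_clean <- sd(x_na, na.rm = TRUE)  # same as sd(x) once the NA is removed
print(isTRUE(all.equal(sd_clean, sd_builtin)))
```

With only 10 draws, the sample standard deviation can differ noticeably from the sd = 20 passed to rnorm(), as the example output above also shows.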

Conclusion

Hence, we have seen the main functions used for statistical programming in R, along with an example of how to use each of them.

This brings us to the end of this blog. We really appreciate your time.

Hope you liked it.

Do visit our page www.zigya.com/blog for more informative blogs on Data Science

Keep Reading! Cheers!
