Fitting Linear Models
lm is used to fit linear models. It can be used to carry out regression, single stratum analysis of variance and analysis of covariance (although aov may provide a more convenient interface for these).
- Keywords
- regression
Usage
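The argument list can be inspected directly from an R session. A minimal sketch, assuming the stats package that ships with a standard R installation; the variable names are only illustrative:

```r
# List the formal arguments of lm() and a few of their defaults,
# as defined by the stats package in the running R session.
lm_formals <- formals(stats::lm)
arg_names <- names(lm_formals)
print(arg_names)

# Defaults referred to in the Arguments section
default_method <- lm_formals$method   # the fitting method
default_model  <- lm_formals$model    # whether the model frame is returned
```

Calling args(lm) at the prompt prints the same information as a single signature.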
Arguments
formula: an object of class 'formula' (or one that can be coerced to that class): a symbolic description of the model to be fitted. The details of model specification are given under ‘Details’.
data: an optional data frame, list or environment (or object coercible by as.data.frame to a data frame) containing the variables in the model. If not found in data, the variables are taken from environment(formula), typically the environment from which lm is called.
subset: an optional vector specifying a subset of observations to be used in the fitting process.
weights: an optional vector of weights to be used in the fitting process. Should be NULL or a numeric vector. If non-NULL, weighted least squares is used with weights weights (that is, minimizing sum(w*e^2)); otherwise ordinary least squares is used. See also ‘Details’.
na.action: a function which indicates what should happen when the data contain NAs. The default is set by the na.action setting of options, and is na.fail if that is unset. The ‘factory-fresh’ default is na.omit. Another possible value is NULL, no action. Value na.exclude can be useful.
method: the method to be used; for fitting, currently only method = 'qr' is supported; method = 'model.frame' returns the model frame (the same as with model = TRUE, see below).
model, x, y, qr: logicals. If TRUE the corresponding components of the fit (the model frame, the model matrix, the response, the QR decomposition) are returned.
singular.ok: logical. If FALSE (the default in S but not in R) a singular fit is an error.
contrasts: an optional list. See the contrasts.arg of model.matrix.default.
offset: this can be used to specify an a priori known component to be included in the linear predictor during fitting. This should be NULL or a numeric vector or matrix of extents matching those of the response. One or more offset terms can be included in the formula instead or as well, and if more than one are specified their sum is used. See model.offset.
...: additional arguments to be passed to the low level regression fitting functions (see below).
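The interaction of subset and na.action is easiest to see on data with missing values. The following is a sketch only, using the airquality data set that ships with R; the choice of model and the object names are illustrative assumptions.

```r
# airquality contains missing Ozone values.
fit_omit <- lm(Ozone ~ Temp + Wind, data = airquality,
               subset = Month != 5,        # drop the May observations
               na.action = na.omit)
fit_excl <- lm(Ozone ~ Temp + Wind, data = airquality,
               subset = Month != 5,
               na.action = na.exclude)

# Rows that survive the subset, before missing values are handled
n_used <- sum(airquality$Month != 5)

# na.omit returns residuals for complete cases only;
# na.exclude pads them with NA so they line up with the subsetted rows.
length(residuals(fit_omit))
length(residuals(fit_excl))
```

Both fits use the same complete cases, so the coefficients agree; only the shape of the residuals and fitted values differs.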
Details
Models for lm are specified symbolically. A typical model has the form response ~ terms where response is the (numeric) response vector and terms is a series of terms which specifies a linear predictor for response. A terms specification of the form first + second indicates all the terms in first together with all the terms in second with duplicates removed. A specification of the form first:second indicates the set of terms obtained by taking the interactions of all terms in first with all terms in second. The specification first*second indicates the cross of first and second. This is the same as first + second + first:second.
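The equivalence of first*second and first + second + first:second can be checked on a small factorial data set. This is a minimal sketch, assuming the warpbreaks data that ships with R; the object names are illustrative.

```r
# warpbreaks: breaks (numeric), wool and tension (factors).
fit_cross    <- lm(breaks ~ wool * tension, data = warpbreaks)
fit_expanded <- lm(breaks ~ wool + tension + wool:tension, data = warpbreaks)

# Both formulas expand to the same term labels ...
labels_cross    <- attr(terms(fit_cross), "term.labels")
labels_expanded <- attr(terms(fit_expanded), "term.labels")

# ... and therefore give the same coefficients.
same_fit <- isTRUE(all.equal(coef(fit_cross), coef(fit_expanded)))

# Terms repeated with + are kept only once.
labels_dup <- attr(terms(breaks ~ wool + tension + wool), "term.labels")
```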
If the formula includes an offset, this is evaluated and subtracted from the response.
If response is a matrix a linear model is fitted separately by least-squares to each column of the matrix.
See model.matrix for some further details. The terms in the formula will be re-ordered so that main effects come first, followed by the interactions, all second-order, all third-order and so on: to avoid this pass a terms object as the formula (see aov and demo(glm.vr) for an example).
A formula has an implied intercept term. To remove this use either y ~ x - 1 or y ~ 0 + x. See formula for more details of allowed formulae.
Non-NULL weights can be used to indicate that different observations have different variances (with the values in weights being inversely proportional to the variances); or equivalently, when the elements of weights are positive integers w_i, that each response y_i is the mean of w_i unit-weight observations (including the case that there are w_i observations equal to y_i and the data have been summarized). However, in the latter case, notice that within-group variation is not used. Therefore, the sigma estimate and residual degrees of freedom may be suboptimal; in the case of replication weights, even wrong. Hence, standard errors and analysis of variance tables should be treated with care.
lm calls the lower level functions lm.fit, etc, see below, for the actual numerical computations. For programming only, you may consider doing likewise.
All of weights, subset and offset are evaluated in the same way as variables in formula, that is first in data and then in the environment of formula.
Value
lm returns an object of class 'lm' or for multiple responses of class c('mlm', 'lm').
The functions summary and anova are used to obtain and print a summary and analysis of variance table of the results. The generic accessor functions coefficients, effects, fitted.values and residuals extract various useful features of the value returned by lm.
An object of class 'lm' is a list containing at least the following components:
coefficients: a named vector of coefficients
residuals: the residuals, that is response minus fitted values.
fitted.values: the fitted mean values.
rank: the numeric rank of the fitted linear model.
weights: (only for weighted fits) the specified weights.
df.residual: the residual degrees of freedom.
call: the matched call.
terms: the terms object used.
contrasts: (only where relevant) the contrasts used.
xlevels: (only where relevant) a record of the levels of the factors used in fitting.
offset: the offset used (missing if none were used).
y: if requested, the response used.
x: if requested, the model matrix used.
model: if requested (the default), the model frame used.
na.action: (where relevant) information returned by model.frame on the special handling of NAs.
In addition, non-null fits will have components assign, effects and (unless not requested) qr relating to the linear fit, for use by extractor functions such as summary and effects.
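The components listed above can be inspected on any fitted object. A minimal sketch, assuming the cars data set that ships with R (50 rows); the second model exists only to show the class of a multiple-response fit.

```r
fit <- lm(dist ~ speed, data = cars)

names(fit)        # component names of the fitted object
coef(fit)         # accessor for fit$coefficients
fit$rank          # intercept and slope
fit$df.residual   # 50 observations minus 2 parameters
class(fit)

# Residuals are the response minus the fitted values.
resid_check <- all.equal(unname(residuals(fit)),
                         unname(cars$dist - fitted(fit)))

# A matrix response gives an object of class c("mlm", "lm").
fit_multi <- lm(cbind(dist, speed^2) ~ speed, data = cars)
class(fit_multi)
```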
Note
Offsets specified by offset will not be included in predictions by predict.lm, whereas those specified by an offset term in the formula will be.
Using time series
Considerable care is needed when using lm with time series.
Unless na.action = NULL, the time series attributes are stripped from the variables before the regression is done. (This is necessary as omitting NAs would invalidate the time series attributes, and if NAs are omitted in the middle of the series the result would no longer be a regular time series.)
Even if the time series attributes are retained, they are not used to line up series, so that the time shift of a lagged or differenced regressor would be ignored. It is good practice to prepare a data argument by ts.intersect(…, dframe = TRUE), then apply a suitable na.action to that data frame and call lm with na.action = NULL so that residuals and fitted values are time series.
References
Chambers, J. M. (1992) Linear models. Chapter 4 of Statistical Models in S eds J. M. Chambers and T. J. Hastie, Wadsworth & Brooks/Cole.
Wilkinson, G. N. and Rogers, C. E. (1973). Symbolic descriptions of factorial models for analysis of variance. Applied Statistics, 22, 392-399. doi:10.2307/2346786.
See Also
summary.lm for summaries and anova.lm for the ANOVA table; aov for a different interface.
The generic functions coef, effects, residuals, fitted, vcov.
predict.lm (via predict) for prediction, including confidence and prediction intervals; confint for confidence intervals of parameters.
lm.influence for regression diagnostics, and glm for generalized linear models.
The underlying low level functions, lm.fit for plain, and lm.wfit for weighted regression fitting.
More lm() examples are available e.g., in anscombe, attitude, freeny, LifeCycleSavings, longley, stackloss, swiss.
biglm in package biglm for an alternative way to fit linear models to large datasets (especially those with many cases).
Aliases
- lm
Examples
library(stats)
require(graphics)

## Annette Dobson (1990) 'An Introduction to Generalized Linear Models'.
## Page 9: Plant Weight Data.
ctl <- c(4.17,5.58,5.18,6.11,4.50,4.61,5.17,4.53,5.33,5.14)
trt <- c(4.81,4.17,4.41,3.59,5.87,3.83,6.03,4.89,4.32,4.69)
group <- gl(2, 10, 20, labels = c('Ctl','Trt'))
weight <- c(ctl, trt)
lm.D9 <- lm(weight ~ group)
lm.D90 <- lm(weight ~ group - 1) # omitting intercept

anova(lm.D9)
summary(lm.D90)

opar <- par(mfrow = c(2,2), oma = c(0, 0, 1.1, 0))
plot(lm.D9, las = 1)      # Residuals, Fitted, ...
par(opar)

### less simple examples in 'See Also' above
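The advice in the Note on time series can be sketched as follows. The two series are simulated and purely hypothetical, as are the object and column names; the point is only that ts.intersect does the alignment that lm itself would not do.

```r
set.seed(1)
# Two hypothetical monthly series of 24 observations each.
x <- ts(rnorm(24), start = c(2020, 1), frequency = 12)
y <- ts(0.5 * rnorm(24), start = c(2020, 1), frequency = 12)

# Align y with x lagged by one period; only the common time window is kept.
# stats::lag is spelled out because other packages also define lag().
aligned <- ts.intersect(y = y, x_lag1 = stats::lag(x, -1), dframe = TRUE)

# The aligned data frame has no NAs here, so no further na.action step
# is needed before calling lm() with na.action = NULL.
fit_ts <- lm(y ~ x_lag1, data = aligned, na.action = NULL)

nrow(aligned)   # one observation is lost to the lag
```

Passing lag(x, -1) straight into the formula would not shift the regressor, because lm does not use the time series attributes to line up the series.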
Community examples
linearmod1 <- lm(iq ~ read_ab, data = basedata1)
summary(linearmod1)
Jan 17, 2017, stats v3.3.1
`lm()` takes a formula and a data frame. See [`formula()`](https://www.rdocumentation.org/packages/stats/topics/formula) for how to construct the first argument.

```{r}
(model_with_intercept <- lm(weight ~ group, PlantGrowth))
(model_without_intercept <- lm(weight ~ group - 1, PlantGrowth))
```

You get more information about the model using [`summary()`](https://www.rdocumentation.org/packages/stats/topics/summary.lm).

```{r}
(model_without_intercept <- lm(weight ~ group - 1, PlantGrowth))
summary(model_without_intercept)
```

Diagnostic plots are available; see [`plot.lm()`](https://www.rdocumentation.org/packages/stats/topics/plot.lm) for more examples.

```{r}
(model_without_intercept <- lm(weight ~ group - 1, PlantGrowth))
layout(matrix(1:6, nrow = 2))
plot(model_without_intercept, which = 1:6)
```

You can predict new values; see [`predict()`](https://www.rdocumentation.org/packages/stats/topics/predict) and [`predict.lm()`](https://www.rdocumentation.org/packages/stats/topics/predict.lm).

```{r}
(model_without_intercept <- lm(weight ~ group - 1, PlantGrowth))
# factor() keeps group categorical regardless of the stringsAsFactors default
predictions <- data.frame(group = factor(levels(PlantGrowth$group)))
predictions$weight <- predict(model_without_intercept, predictions)
predictions
# Plot predictions against the data
boxplot(weight ~ group, PlantGrowth, ylab = 'weight')
points(weight ~ group, predictions, col = 'red')
```

There are many methods available for inspecting `lm` objects.

```{r}
(model_without_intercept <- lm(weight ~ group - 1, PlantGrowth))
confint(model_without_intercept)
anova(model_without_intercept)
residuals(model_without_intercept)
fitted(model_without_intercept)
influence(model_without_intercept)
methods(class = 'lm')
```