Linear regression

From Simple English Wikipedia, the free encyclopedia
The idea is to find the red curve, the blue points are actual samples. With linear regression all points can be connected using a single, straight line. This example uses simple linear regression, where the square of the distance between the red line and each sample point is minimized.

Linear regression is a way to look at how something changes when other things change using math. A linear regression uses a dependent variable and one or more explanatory variables to create a straight line. This straight line is known as a "line of regression".

Linear regression was the first of many ways of performing regression analysis. This is because models which depend linearly on their unknown parameters are easier to fit than models which are non-linearly related to their parameters. Another advantage of linear regression is that the statistical properties of the resulting estimators are easier to determine.

Linear regression has many practical uses. Most applications fall into one of the following two broad categories:

  • Linear regression can be used to fit a predictive model to a set of observed values (data). This is useful, if the goal is prediction, forecasting or reduction. After developing such a model, if an additional value of X is then given without its accompanying value of y, the fitted model can be used to make a predicted value of y (written as [1]).
  • Given a variable y and a number of variables X1, ..., Xp that may be related to y, linear regression analysis can be applied to quantify the strength of the relationship between y and the Xj, to assess which Xj has no relationship with y at all, and to identify which subsets of the Xj contain redundant information about y.

Linear regression models try to make the vertical distance between the line and the data points (that is, the residuals) as small as possible.[2] This is called "fitting the line to the data." Often, linear regression models try to minimize the sum of the squares of the residuals (least squares), but other ways of fitting exist.[3] They include minimizing the "lack of fit" in some other norm (as with least absolute deviations regression), or minimizing a penalized version of the least squares loss function as in ridge regression. The least squares approach can also be used to fit models that are not linear. As outlined above, the terms "least squares" and "linear model" are closely linked, but they are not synonyms.

Usage[change | change source]

Economics[change | change source]

Linear regression is the main analytical tool in economics. For example, it is used to guess consumption spending,[4] fixed investment spending, inventory investment, purchases of a country's exports,[5] spending on imports,[5] the demand to hold liquid assets,[6] labor demand[7] and labor supply.[7]

Related pages[change | change source]

References[change | change source]

  1. "List of Probability and Statistics Symbols". Math Vault. 2020-04-26. Retrieved 2020-10-13.
  2. "Linear Regression - Basics | Data Basecamp". 2021-11-27. Retrieved 2022-07-22.
  3. "Linear Regression". www.stat.yale.edu. Retrieved 2020-10-13.
  4. Deaton, Angus (1992). Understanding Consumption. Oxford University Press. ISBN 978-0-19-828824-4.
  5. 5.0 5.1 Krugman, Paul R.; Obstfeld, M.; Melitz, Marc J. (2012). International Economics: Theory and Policy (9th global ed.). Harlow: Pearson. ISBN 9780273754091.
  6. Laidler, David E. W. (1993). The Demand for Money: Theories, Evidence, and Problems (4th ed.). New York: Harper Collins. ISBN 978-0065010985.
  7. 7.0 7.1 Ehrenberg; Smith (2008). Modern Labor Economics (10th international ed.). London: Addison-Wesley. ISBN 9780321538963.