Regression analysis

From Wikipedia, the free encyclopedia

Regression analysis is a field of statistics. It is a tool to show the relationship between the inputs and the outputs of a system. There are different ways to do this, from fitting a straight line to fitting more complex curves. A curve that fits the data better usually needs more complex calculations.
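
As an illustration of the simplest case, the sketch below fits a straight line to paired inputs and outputs by ordinary least squares. The function name `fit_line` and the data values are made up for this example, and least squares is only one of the possible fitting methods.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two paired observations")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of cross products of x and y
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical measurements: inputs and noisy outputs of some system
inputs = [1.0, 2.0, 3.0, 4.0, 5.0]
outputs = [2.9, 5.2, 6.8, 9.1, 11.0]

intercept, slope = fit_line(inputs, outputs)
print(f"output is about {intercept:.2f} + {slope:.2f} * input")
```

Fitting a curve instead of a line (for example a polynomial) follows the same idea, but the calculation involves solving a larger system of equations.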

Data modeling can be used without any knowledge of the underlying processes that have generated the data;[1] in this case the model is an empirical model. Moreover, in data modeling, knowledge of the probability distribution of the errors is not required. Regression analysis, in contrast, requires assumptions to be made regarding the probability distribution of the errors. Statistical tests are made on the basis of these assumptions. In regression analysis the term "model" embraces both the function used to model the data and the assumptions concerning probability distributions.
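
A rough sketch of how a statistical test depends on the error assumptions: the code below computes the t statistic for the slope of a straight-line fit. It assumes the errors are independent and normally distributed with the same variance, which is one common set of assumptions and not the only possible one. The function name `slope_t_statistic` and the data are invented for this example.

```python
import math


def slope_t_statistic(xs, ys):
    """Return (slope, standard error of slope, t statistic) for a line fit.

    Assumes independent, normally distributed errors with constant variance.
    """
    n = len(xs)
    if n != len(ys) or n < 3:
        raise ValueError("need at least three paired observations")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residuals: the part of each output that the fitted line does not explain
    residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    # Estimate of the error variance, using n - 2 degrees of freedom
    error_variance = sum(r * r for r in residuals) / (n - 2)
    se_slope = math.sqrt(error_variance / sxx)
    if se_slope == 0:
        return slope, 0.0, math.inf  # perfect fit: no residual scatter
    return slope, se_slope, slope / se_slope


# Hypothetical data, made up for illustration
inputs = [1.0, 2.0, 3.0, 4.0, 5.0]
outputs = [2.9, 5.2, 6.8, 9.1, 11.0]

slope, se_slope, t_stat = slope_t_statistic(inputs, outputs)
print(f"slope = {slope:.3f}, standard error = {se_slope:.4f}, t = {t_stat:.1f}")
```

Under the stated assumptions, and if the true slope were zero, the t statistic would follow a t distribution with n - 2 degrees of freedom. If the assumptions do not hold, the statistic can still be computed, but the test based on it may be misleading.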

Regression can be used for prediction (including forecasting of time-series data), inference, hypothesis testing, and modeling of causal relationships. These uses of regression rely heavily on the underlying assumptions being satisfied. Regression analysis has been criticized as being misused for these purposes in many cases where the appropriate assumptions cannot be verified to hold.[1][2] One factor contributing to the misuse of regression is that it can take considerably more skill to critique a model than to fit a model.[3]

References

  1. Richard A. Berk, Regression Analysis: A Constructive Critique, Sage Publications (2004)
  2. David A. Freedman, Statistical Models: Theory and Practice, Cambridge University Press (2005)
  3. R. Dennis Cook; Sanford Weisberg, "Criticism and Influence Analysis in Regression", Sociological Methodology, Vol. 13 (1982), pp. 313–361.