If the given bivariate data are plotted on a graph, the points so obtained on the scatter diagram will more or less concentrate around a curve, called the ‘curve of regression’. The mathematical equation of the regression curve, usually called the regression equation, enables us to study the average change in the value of the dependent variable for any given value of the independent variable.
If the regression curve is a straight line, we say that there is linear regression between the variables under study. The equation of such a curve is the equation of a straight line, i.e., a first-degree equation in the variables x and y. In the case of linear regression, the value of the dependent variable changes by a constant absolute amount (an increase or a decrease, depending on the sign of the slope) for a unit change in the value of the independent variable. If the curve of regression is not a straight line, the regression is called curved or non-linear regression. We now give some of the advantages and properties of regression analysis.
Advantages of Regression Analysis:
- Regression analysis helps in developing a regression equation by which the value of a dependent variable can be estimated given a value of an independent variable.
- Regression analysis helps to determine the standard error of estimate, which measures the variability or spread of the values of the dependent variable about the regression line. The smaller the standard error of estimate, the closer the pairs of values (x, y) fall to the regression line and the better the line fits the data, so that a good estimate can be made of the value of variable y. When all the points fall on the line, the standard error of estimate equals zero.
- When the sample size is large (greater than 30), the interval estimation for predicting the value of a dependent variable based on the standard error of the estimate is considered to be quite accurate.
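The first two points can be illustrated with a minimal sketch. It assumes the usual least-squares estimates for the line y = a + bx and takes the standard error of estimate as the square root of the residual sum of squares divided by n - 2; that divisor, the function names, and the sample data are assumptions made for illustration and do not come from the text above.

```python
import math

def fit_line(xs, ys):
    """Least-squares estimates (a, b) for the line y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx              # slope: change in y per unit change in x
    a = mean_y - b * mean_x    # intercept
    return a, b

def standard_error_of_estimate(xs, ys, a, b):
    """Spread of the observed y values about the fitted line.

    Assumption: divisor n - 2, since two parameters (a and b) are estimated.
    """
    n = len(xs)
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    return math.sqrt(sse / (n - 2))

# Hypothetical data: one set lying exactly on y = 2 + 3x, one scattered around it.
xs = [1, 2, 3, 4, 5]
ys_exact = [5, 8, 11, 14, 17]
ys_noisy = [5.2, 7.7, 11.3, 13.8, 17.1]

a_exact, b_exact = fit_line(xs, ys_exact)
se_exact = standard_error_of_estimate(xs, ys_exact, a_exact, b_exact)

a_noisy, b_noisy = fit_line(xs, ys_noisy)
se_noisy = standard_error_of_estimate(xs, ys_noisy, a_noisy, b_noisy)

print(a_exact, b_exact, se_exact)
print(a_noisy, b_noisy, se_noisy)
```

For the first data set every point lies on the line, so the standard error of estimate is zero; for the second it is positive, reflecting the scatter of the points about the fitted line.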
Properties of Regression Analysis:
- The relationship between the dependent variable y and independent variable x exists and is linear. The average relationship between x and y can be described by a simple linear regression equation y = a + bx + e, where e is the deviation of a particular value of y from its expected value for a given value of independent variable x.
- For every value of the independent variable x, the values of the dependent variable y are normally distributed about an expected (or mean) value. The mean of these normally distributed values falls on the line of regression.
- The dependent variable y is a continuous random variable, whereas values of the independent variable x are fixed values and are not random.
- The sampling error associated with the expected value of the dependent variable y is assumed to be an independent random variable distributed normally with mean zero and constant standard deviation. The errors are not related to each other in successive observations.
- The standard deviation and variance of the values of the dependent variable y about the regression line are constant for all values of the independent variable x within the range of the sample data.
- The value of the dependent variable cannot be reliably estimated for a value of the independent variable lying outside the range of values in the sample data, because the fitted relationship is only known to hold within that range.
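The properties listed above can be sketched as a small simulation. This is an illustration under stated assumptions rather than a definitive implementation: the parameter values (a = 10, b = 2, sigma = 1.5), the fixed seed, and the helper names `simulate`, `fit_line`, and `predict` are all hypothetical. The x values are fixed, the errors are drawn independently from a normal distribution with mean zero and constant standard deviation, and prediction is refused outside the sampled range of x.

```python
import random

def simulate(xs, a, b, sigma, rng):
    """Generate y = a + b*x + e, with e ~ Normal(0, sigma) drawn independently."""
    return [a + b * x + rng.gauss(0.0, sigma) for x in xs]

def fit_line(xs, ys):
    """Least-squares estimates (a, b) for the line y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    return mean_y - b * mean_x, b

def predict(x, a, b, x_min, x_max):
    """Estimate y only for x inside the range covered by the sample."""
    if not (x_min <= x <= x_max):
        raise ValueError("x lies outside the range of the sample data")
    return a + b * x

rng = random.Random(0)                     # fixed seed so the run is repeatable
xs = [float(x) for x in range(1, 41)]      # fixed, non-random values of x
ys = simulate(xs, a=10.0, b=2.0, sigma=1.5, rng=rng)

a_hat, b_hat = fit_line(xs, ys)
residuals = [y - (a_hat + b_hat * x) for x, y in zip(xs, ys)]

print("estimated a, b:", a_hat, b_hat)
print("sum of residuals:", sum(residuals))  # zero up to rounding for a least-squares fit
print("estimate at x = 20:", predict(20.0, a_hat, b_hat, min(xs), max(xs)))
```

Raising an error for out-of-range x is one possible design choice; a softer alternative would be to return the estimate together with a warning that it is an extrapolation.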
Fundamentals of Business Statistics – JK Sharma