For more information on partitioning, please see the Data Mining Partition section.

Click Advanced to display the Multiple Linear Regression - Advanced Options dialog.

When the Studentized residuals option is selected, the Studentized Residuals are displayed in the output. Studentized residuals are computed by dividing the unstandardized residuals by quantities related to the diagonal elements of the hat matrix, using a common scale estimate computed without the ith case in the model. These residuals have t-distributions with (n-k-1) degrees of freedom. As a result, any residual with absolute value exceeding 3 usually requires attention.

When the Deleted residuals option is selected, the Deleted Residuals are displayed in the output. This residual is computed for the ith observation by first fitting a model without the ith observation, then using this model to predict the ith observation. Afterwards, the difference is taken between the predicted observation and the actual observation.
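The studentized and deleted residuals described here can be illustrated with a small sketch. It is not XLMiner's implementation: it assumes a single predictor plus a constant term so that the hat-matrix diagonal has a closed form, and the function names and sample data are hypothetical. The count p is the number of estimated coefficients (constant included), on the assumption that the k in the text counts coefficients the same way.

```python
from math import sqrt

def fit_line(x, y):
    """Least-squares intercept and slope for one predictor."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    slope = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    return ybar - slope * xbar, slope

def residual_diagnostics(x, y):
    """Return unstandardized residuals, leverages, deleted and studentized residuals."""
    n, p = len(x), 2  # p = estimated coefficients (constant + slope); an assumption
    b0, b1 = fit_line(x, y)
    xbar = sum(x) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    resid = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]   # actual - predicted
    lev = [1 / n + (xi - xbar) ** 2 / sxx for xi in x]      # hat-matrix diagonal
    sse = sum(e * e for e in resid)
    deleted, studentized = [], []
    for e, h in zip(resid, lev):
        # Shortcut for "fit without case i, then predict case i".
        deleted.append(e / (1 - h))
        # Scale estimate computed without case i.
        s2_without_i = (sse - e * e / (1 - h)) / (n - p - 1)
        studentized.append(e / sqrt(s2_without_i * (1 - h)))
    return resid, lev, deleted, studentized

def deleted_residual_by_refit(x, y, i):
    """Literal version: drop case i, refit, predict case i, take actual - predicted."""
    b0, b1 = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
    return y[i] - (b0 + b1 * x[i])

# Hypothetical sample data; the last point is deliberately unusual.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [1.1, 1.9, 3.2, 3.9, 5.3, 9.0]
resid, lev, deleted, studentized = residual_diagnostics(x, y)
for i, (d, t) in enumerate(zip(deleted, studentized)):
    flag = "  <- exceeds 3 in absolute value" if abs(t) > 3 else ""
    print(f"obs {i}: deleted={d:.3f} studentized={t:.3f}{flag}")
```

The refit helper is included only to show that the leverage shortcut and the literal leave-one-out procedure give the same deleted residual.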
Under Residuals, select Unstandardized to display the Unstandardized Residuals in the output, which are computed by the formula: Unstandardized residual = Actual response - Predicted response.

When the variance-covariance matrix option is selected, the variance-covariance matrix of the estimated regression coefficients is displayed in the output.

Under Score Training Data and Score Validation Data, select all options to produce all four reports in the output. Since we did not create a Test Partition, the options under Score Test Data are disabled.

XLMiner V2015 provides the ability to partition a data set from within a classification or prediction method by selecting Partitioning Options on the Step 2 of 2 dialog. If Partitioning Options is selected, XLMiner partitions the data set before running the prediction method. If partitioning has already occurred on the data set, this option is disabled.
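As a rough illustration of two of the outputs mentioned in this step, the sketch below computes unstandardized residuals (actual response minus predicted response) and the variance-covariance matrix of the estimated coefficients. It assumes one predictor plus a constant term so the 2 x 2 matrix inverse can be written out by hand; the function name and the sample data are hypothetical, and XLMiner's own computation may differ in detail.

```python
def coefficient_covariance(x, y):
    """Fit y = b0 + b1*x and return coefficients, residuals and Cov(b0, b1)."""
    n = len(x)
    sx, sy = sum(x), sum(y)
    sxx = sum(v * v for v in x)
    sxy = sum(a * b for a, b in zip(x, y))

    # Inverse of X'X = [[n, sx], [sx, sxx]] written out explicitly.
    det = n * sxx - sx * sx
    inv = [[sxx / det, -sx / det],
           [-sx / det, n / det]]

    # Coefficients: (X'X)^-1 X'y
    b0 = inv[0][0] * sy + inv[0][1] * sxy
    b1 = inv[1][0] * sy + inv[1][1] * sxy

    # Unstandardized residual = actual response - predicted response
    residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]

    # Residual variance with n - 2 degrees of freedom (two coefficients).
    s2 = sum(e * e for e in residuals) / (n - 2)

    # Variance-covariance matrix of the estimates: s^2 * (X'X)^-1
    cov = [[s2 * v for v in row] for row in inv]
    return (b0, b1), residuals, cov

# Hypothetical sample data.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
coefs, residuals, cov = coefficient_covariance(x, y)
print("coefficients:", coefs)
print("residuals:", residuals)
print("covariance matrix:", cov)
```

The square roots of the diagonal entries of the matrix are the standard errors of the constant term and the slope.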
In addition to these variables, the data set also contains an additional variable, CAT. MEDV, which has been created by categorizing median value (MEDV) into two categories: high (MEDV > 30) and low (MEDV < 30). This variable will not be used in this example.

On the XLMiner ribbon, from the Applying Your Model tab, select Help - Examples, then Forecasting/Data Mining Examples to open Boston_Housing.xlsx from the data sets folder. A portion of the data set is shown below.

To partition the data into Training and Validation Sets, use the Standard Data Partition defaults with percentages of 60% of the data randomly allocated to the Training Set, and 40% of the data randomly allocated to the Validation Set. For more information on partitioning a data set, see the Data Mining Partition section. On the XLMiner ribbon, from the Data Mining tab, select Partition - Standard Partition to open the Standard Data Partition dialog.

Select a cell on the Data_Partition worksheet. On the XLMiner ribbon, from the Data Mining tab, select Predict - Multiple Linear Regression to open the Multiple Linear Regression - Step 1 of 2 dialog.

At Output Variable, select MEDV, and from the Selected Variables list, select all remaining variables (except CAT. MEDV).

Click Next to advance to the Step 2 of 2 dialog. If the number of rows in the data is less than the number of variables selected as Input variables, XLMiner displays the following prompt. Select OK to advance to the Variable Selection dialog.

If Force constant term to zero is selected, there is no constant term in the equation. Leave this option unchecked for this example.

When the fitted values option is selected, the fitted values are displayed in the output. When the ANOVA table option is selected, the ANOVA table is displayed in the output.

Under Residuals, select Standardized to display the Standardized Residuals in the output. Standardized residuals are obtained by dividing the unstandardized residuals by the respective standard deviations.
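The 60% / 40% random allocation used for the Training and Validation Sets can be sketched as follows. This is only an illustration of a random split, not XLMiner's partitioning algorithm: the function name, the seed value, and the rounding rule are assumptions, and the row count of 506 is the figure usually quoted for the Boston Housing data rather than something stated in this example.

```python
import random

def standard_partition(n_rows, train_fraction=0.6, seed=12345):
    """Randomly split row indices into training and validation sets.

    The seed is a hypothetical value chosen so the split is repeatable.
    """
    rng = random.Random(seed)
    indices = list(range(n_rows))
    rng.shuffle(indices)                       # random allocation of rows
    n_train = round(train_fraction * n_rows)   # rounding rule is an assumption
    train_rows = sorted(indices[:n_train])
    validation_rows = sorted(indices[n_train:])
    return train_rows, validation_rows

# 506 rows is assumed here for the Boston Housing data set.
train_rows, validation_rows = standard_partition(506)
print("training rows:", len(train_rows))
print("validation rows:", len(validation_rows))
```

Fixing the seed means the same rows land in each partition on every run, which makes the regression output reproducible.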
The following example illustrates XLMiner's Multiple Linear Regression method using the Boston Housing data set to predict the median house prices in housing tracts. A description of each variable is given in the following table.