Support vector machine/regression feature selection with an application to classification
Initial data mining and response surface methodology applications demonstrated support vector machines and support vector regression (SVM and SVR, respectively) as the classification and function-approximation methods of choice for market data provided by the United States Army Accessions and Recruiting Commands (59), (60). Unlike logistic regression and random forests, SVM/SVR alone provides no insight into feature selection, instead exhibiting "black box" behavior similar to artificial neural networks. Motivated by the predictive performance of SVM/SVR, this research obtains SVM/SVR feature subsets by combining a kernel recursive criterion with an alternate branch and bound enumeration. The kernel recursive criterion eliminates repetitive SVM/SVR calculations, thereby speeding convergence to an optimal feature subset; optimality is guaranteed by the alternate enumeration together with the monotonic property of the quadratic loss function.
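To illustrate the pruning idea that the monotonic loss enables, the sketch below implements a generic backward branch and bound feature search. It is not the paper's kernel recursive criterion: a plain least-squares quadratic loss stands in for the SVM/SVR objective, and all function and variable names (`quadratic_loss`, `branch_and_bound_select`, `k`) are illustrative assumptions. The key property it relies on is the one the text states: removing a feature can never decrease a quadratic loss, so any branch whose current loss already exceeds the incumbent can be discarded without enumerating its descendants.

```python
import numpy as np

def quadratic_loss(X, y, features):
    # Least-squares residual (quadratic) loss on a feature subset.
    # Monotone: removing a feature can never decrease this loss,
    # which is what justifies the branch and bound pruning below.
    if not features:
        return float(np.sum((y - y.mean()) ** 2))
    A = np.column_stack([X[:, sorted(features)], np.ones(len(y))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return float(resid @ resid)

def branch_and_bound_select(X, y, k):
    # Backward branch and bound: start from the full feature set, remove
    # one feature at a time, and prune any branch whose loss already
    # meets or exceeds the best size-k subset found so far.
    n = X.shape[1]
    best = {"loss": np.inf, "subset": None}

    def recurse(subset, next_removable):
        loss = quadratic_loss(X, y, subset)
        if loss >= best["loss"]:      # bound: descendants are only worse
            return
        if len(subset) == k:
            best["loss"], best["subset"] = loss, frozenset(subset)
            return
        # Remove features in increasing index order to avoid revisiting
        # the same subset along two different branches.
        for f in range(next_removable, n):
            if f in subset:
                recurse(subset - {f}, f + 1)

    recurse(frozenset(range(n)), 0)
    return sorted(best["subset"]), best["loss"]
```

On synthetic data where the response depends only on two of six columns, the search returns exactly those two, while the bound spares it from fitting every size-2 subset's full subtree. The paper's kernel recursive criterion would replace `quadratic_loss` with a kernel-based quantity updated recursively, avoiding a fresh SVM/SVR solve at each node.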
We compare the SVM/SVR feature selection results with stepwise logistic regression to show the usefulness of the method. The experiment is conducted within a response surface methodology environment using designed experimental data, controlling the number of predictor variables, the number of noise variables, the error on the noise variables, and the training data size. In support of the SVM/SVR feature selection, the research also demonstrates a method for deploying it on Army recruiting and market data.
The contributions of this research are: (1) developing an SVM/SVR-specific kernel recursive criterion within an alternate branch and bound enumeration; (2) assessing the effects of training data size on the kernel recursive criterion; and (3) demonstrating the effectiveness of the kernel recursive criterion for feature selection in an Army market data application.