value of (MEAN1 of other functions - MEAN1 of
Revised IP-OLDF) in the training samples, and
‘Diff2’ is the value of (MEAN2 of other functions -
MEAN2 of Revised IP-OLDF) in the validation
samples. The maximum values of ‘Diff1’ given by
SVM4 and logistic regression were 2.33 and 3.13,
respectively, and the maximum values of ‘Diff2’
given by these functions were 1.7 and 1.62,
respectively. The minimum values of ‘Diff1’ and
‘Diff2’ given by LDF were greater than 7.97% and
6.23%, respectively. We conclude from the 100-fold
cross-validation that LDF was not as good as
Revised IP-OLDF, S-SVM, and logistic regression.
In 2014, these results were recalculated using
LINGO Ver. 14. The elapsed runtimes of Revised
IP-OLDF and S-SVM were 3 minutes 54 seconds and 2
minutes 22 seconds, respectively. The elapsed
runtimes of LDF and logistic regression by JMP were
24 minutes and 21 minutes, respectively.
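The 'Diff1' and 'Diff2' comparison described above amounts to a per-model subtraction followed by a minimum and maximum over the models. The following is a minimal sketch of that arithmetic for 'Diff2'; the helper name `diffs` and every number in it are hypothetical placeholders, not results from this study.

```python
def diffs(other_means, baseline_means):
    """Per-model difference (other function - Revised IP-OLDF), in percentage points."""
    return [round(o - b, 2) for o, b in zip(other_means, baseline_means)]

# Hypothetical MEAN2 values (mean error rates in %, validation samples)
# for three models; these are NOT the values reported in the paper.
mean2_revised_ip_oldf = [3.0, 2.5, 2.0]
mean2_ldf = [10.25, 9.0, 8.5]

diff2_ldf = diffs(mean2_ldf, mean2_revised_ip_oldf)

# A positive minimum means LDF was worse than Revised IP-OLDF on every model.
print("Diff2 per model:", diff2_ldf)
print("min:", min(diff2_ldf), "max:", max(diff2_ldf))
```

The same function applies to 'Diff1' by passing the MEAN1 values of the training samples instead.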
5 CONCLUSIONS
In this research, we have discussed three problems
of discriminant analysis. Problem 1 is solved by
Revised IP-OLDF, which looks for the interior
points of the “Optimal Convex Polyhedron”
directly. Problem 2 is theoretically solved by
Revised IP-OLDF and H-SVM, but H-SVM can
only be applied to linearly separable data. The error rates
of LDF and QDF are very high for linearly separable
data. This means that these functions should not be
used for important discrimination tasks, such as
medical diagnosis and genome discrimination.
Problem 3 only concerns QDF and RDA. This
problem was detected using a t-test after three years
of investigation, and can be solved by adding a small
noise term to variables. Now, JMP offers a modified
RDA, and if we can find clear rules to choose proper
parameters, it may be better than LDF and QDF.
However, these conclusions were confirmed only on the
training samples. In many cases, statistical users
have small samples and cannot evaluate the functions
on validation samples. Therefore, a k-fold
cross-validation method for small samples was proposed.
Its results confirm the above conclusions for the
validation samples. Many discriminant functions have been
developed using various criteria since Warmack and
Gonzalez (1973). Ibaraki and Muroga (1970)
defined the same model as Revised IP-OLDF. The mission of
discrimination should be based on the MNM
criterion. Statisticians have tried to develop
functions based on the MNM criterion, but this can
now be achieved by Revised IP-OLDF using MIP. It
is widely believed that Revised IP-OLDF leads to
overestimations, but LDF is worse for validation
samples. It is a realistic option for users to choose
logistic regression if they do not use Revised IP-
OLDF or S-SVM. The evaluation of modified RDA
is a topic for future work.
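The remedy for Problem 3 mentioned above, adding a small noise term to variables, can be illustrated with a short sketch. It assumes that the difficulty arises when a variable is constant within one class, so that its within-class variance is zero and the class covariance matrix used by QDF cannot be inverted; this assumption, the sample size, and the noise scale of 1e-3 are illustrative choices and are not taken from the text above.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible (arbitrary choice)

# Hypothetical variable that takes the same value for all 20 cases of one class.
constant_var = [5.0] * 20
var_before = statistics.variance(constant_var)  # zero within-class variance

# Add a small Gaussian noise term to each value (scale is an assumption).
noisy_var = [v + random.gauss(0.0, 1e-3) for v in constant_var]
var_after = statistics.variance(noisy_var)      # now strictly positive

print("variance before noise:", var_before)
print("variance after noise: ", var_after)
```

With a strictly positive variance, the quadratic term for that variable is defined again. How large the noise should be in practice is a separate question that this sketch does not address.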
ACKNOWLEDGEMENTS
My research started in 1997 and finished in 2012. It
was made possible by What’s Best! and LINGO from
LINDO Systems Inc., and SAS and JMP from SAS
Institute Inc.
REFERENCES
Edgar, A., 1935. The irises of the Gaspe Peninsula. Bulletin
of the American Iris Society, 59, 2-5.
Fisher, R. A., 1936. The Use of Multiple Measurements in
Taxonomic Problems. Annals of Eugenics, 7, 179–
188.
Flury, B., Riedwyl, H., 1988. Multivariate Statistics: A
Practical Approach. Cambridge University Press.
Friedman, J. H., 1989. Regularized Discriminant
Analysis. Journal of the American Statistical
Association, 84/405, 165-175.
Goodnight, J. H., 1978. SAS Technical Report – The
Sweep Operator: Its Importance in Statistical
Computing – (R-100). SAS Institute Inc.
Ibaraki, T., Muroga, S., 1970. Adaptive linear
classifier by linear programming. IEEE Transactions
on Systems Science and Cybernetics, SSC-6, 53-62.
Lachenbruch, P. A., Mickey, M. R., 1968. Estimation of
error rates in discriminant analysis. Technometrics,
10, 1-11.
Liittschwager, J. M., Wang, C., 1978. Integer programming
solution of a classification problem. Management
Science, 24/14, 1515-1525.
Sall, J. P., 1981. SAS Regression Applications. SAS
Institute Inc. (Japanese version is translated by
Shinmura, S.)
Sall, J. P., Creighton, L., Lehman, A., 2004. JMP Start
Statistics, Third Edition. SAS Institute Inc. (Japanese
version is edited by Shinmura, S.)
Schrage, L., 1991. LINDO-An Optimization Modeling
System-. The Scientific Press. (Japanese version is
translated by Shinmura, S., & Takamori, H.)
Schrage, L., 2006. Optimization Modeling with LINGO.
LINDO Systems Inc. (Japanese version is translated by
Shinmura, S.)
Shinmura, S., Miyake, A., 1979. Optimal linear
discriminant functions and their application.
COMPSAC79, 167-172.