This article revisits recently proposed methods for determining the kernel parameter and the number of latent components when identifying kernel principal component analysis (KPCA) and kernel partial least squares (KPLS) models. A detailed analysis shows that existing methods are neither optimal nor efficient in determining these important parameters and may lead to erroneous estimates. Moreover, most methods are not designed to estimate both parameters simultaneously, i.e. they require one parameter to be predetermined. To address these practically important issues, the article introduces a cross-validatory framework to optimally determine both parameters. Application studies involving a simulation example and three experimental or industrial data sets confirm that the cross-validatory framework outperforms existing methods and yields optimal estimates for both parameters. In sharp contrast, existing methods can substantially overestimate the number of latent components and provide inadequate estimates of the kernel parameter.
Chemometrics and Intelligent Laboratory Systems 167, pp. 196-207 (2017)