Replies: 1 comment
-
Good question. As far as I'm aware, it is completely fine to use un-normalized vectors. Even the choice of kernel is somewhat arbitrary in many cases, and I don't see a particularly good reason to stick with polynomial kernels; you could just as well use a Gaussian kernel. DScribe was designed to only deal with producing high-dimensional vectors whose inner product is a meaningful measure of the similarity of two samples - it is up to you to then decide how to use these vectors. You will most likely need to benchmark different methods and hyperparameters to find out what produces good results for your particular data.
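A minimal sketch of the kind of benchmarking mentioned above, using sklearn's cross-validation with both a dot-product and a Gaussian (RBF) kernel. The arrays `X` and `y` are synthetic placeholders standing in for real SOAP descriptors and target values:

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import DotProduct, RBF
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(60, 20))                   # stand-in for SOAP descriptors
X /= np.linalg.norm(X, axis=1, keepdims=True)   # optional L2 normalization
y = X @ rng.normal(size=20) + 0.05 * rng.normal(size=60)  # synthetic target

# Compare a linear (dot-product) kernel against a Gaussian (RBF) kernel
# via cross-validated mean squared error.
results = {}
for name, kernel in [("DotProduct", DotProduct()), ("RBF", RBF())]:
    gpr = GaussianProcessRegressor(kernel=kernel, alpha=1e-3)
    scores = cross_val_score(gpr, X, y, cv=5, scoring="neg_mean_squared_error")
    results[name] = -scores.mean()
    print(f"{name}: mean CV MSE = {results[name]:.4f}")
```

The same pattern extends to any other sklearn-compatible kernel or estimator; only the `kernel` argument changes.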
-
I am using the SOAP output of the DScribe library in combination with the sklearn library for kernel-based methods. As provided in the documentation, the original definition of the SOAP kernel is a normalized polynomial kernel. As far as I understand, the output is not normalized by default in DScribe, which I was not initially aware of. Since the prediction errors I obtain with the normalized and unnormalized output are comparable, I am now wondering if it is even necessary to stick to the original SOAP kernel definition. To be more precise: I am not using some implementation of the SOAP kernel, but directly use the SOAP output from the DScribe library as input for the (exponentiated) DotProduct kernel implemented in sklearn. I would appreciate any advice!
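For reference, the two approaches coincide when the vectors are L2-normalized: the original SOAP kernel (a normalized polynomial kernel with exponent ζ) equals a plain dot product of normalized vectors raised to ζ. A small numpy check, using random vectors as stand-ins for DScribe's SOAP output:

```python
import numpy as np

# Two stand-in descriptor vectors (in practice, SOAP output from DScribe).
rng = np.random.default_rng(42)
x, y = rng.normal(size=50), rng.normal(size=50)
zeta = 2  # exponent of the polynomial SOAP kernel

# Original SOAP kernel definition: normalized polynomial kernel.
k_soap = (x @ y / (np.linalg.norm(x) * np.linalg.norm(y))) ** zeta

# Equivalent: L2-normalize the vectors first, then exponentiate the dot product.
xn, yn = x / np.linalg.norm(x), y / np.linalg.norm(y)
k_dot = (xn @ yn) ** zeta
```

So applying sklearn's exponentiated DotProduct kernel to normalized DScribe output reproduces the original SOAP kernel, while applying it to unnormalized output gives a different (but not necessarily worse) kernel.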