[REVIEW] SVM class and sample weights #2618

Merged: 11 commits, Aug 5, 2020

@tfeher tfeher commented Jul 29, 2020

This PR adds SVM class and sample weights, and closes issue #2222.

  • The Python layer accepts both sample and class weights.
  • The C++ layer accepts only sample weights. Class weights can be applied by setting the sample weights based on the class label (as is done here).
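
The folding of class weights into sample weights described above can be sketched in a few lines of NumPy. This is only an illustration, not the code from this PR: the helper name `combine_weights` and the dict form of `class_weight` are assumptions made for the example.

```python
import numpy as np

def combine_weights(y, class_weight=None, sample_weight=None):
    """Hypothetical helper: fold per-class weights into per-sample weights."""
    y = np.asarray(y)
    # Start from the user-supplied sample weights, or all ones if none are given.
    if sample_weight is None:
        w = np.ones(y.shape[0], dtype=np.float64)
    else:
        w = np.asarray(sample_weight, dtype=np.float64).copy()
    # Multiply each sample's weight by the weight assigned to its class label.
    if class_weight is not None:
        for label, cw in class_weight.items():
            w[y == label] *= cw
    return w

y = np.array([0, 0, 1, 1, 1])
w = combine_weights(y, class_weight={0: 2.0, 1: 0.5},
                    sample_weight=[1, 1, 1, 2, 2])
print(w)  # [2.  2.  0.5 1.  1. ]
```

With this approach the lower layer only ever sees one weight vector, which is why a C++ interface that accepts just `sample_weight` is enough.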

The weights scale the penalty parameter C for each sample individually. The essence of this PR is a two-line change in SmoBlockSolver; the rest just makes sure that the correctly weighted penalty parameter reaches that point and that the results are correctly processed afterwards.
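
To illustrate what a per-sample penalty means, here is a minimal NumPy sketch. It is not the SmoBlockSolver code; it assumes the standard weighted C-SVM dual, in which each coefficient is bounded by 0 ≤ α_i ≤ C·w_i, and the function names are hypothetical.

```python
import numpy as np

def per_sample_penalty(C, sample_weight):
    """Scale the scalar penalty C by each sample's weight (C_i = C * w_i)."""
    return C * np.asarray(sample_weight, dtype=np.float64)

def clip_alpha(alpha, C_vec):
    """Apply the box constraint 0 <= alpha_i <= C_i element-wise."""
    return np.clip(alpha, 0.0, C_vec)

C_vec = per_sample_penalty(10.0, [1.0, 0.5, 2.0])
alpha = clip_alpha(np.array([12.0, 3.0, -1.0]), C_vec)
print(C_vec)  # [10.  5. 20.]
print(alpha)  # [10.  3.  0.]
```

A solver that previously compared every α_i against one scalar C would instead look up the bound for sample i, so a small change in the update step is plausible once the weighted vector is available.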

Additional unrelated change: the unused fit method from SVMBase was removed (#2409).

@tfeher tfeher requested review from a team as code owners July 29, 2020 13:40
@GPUtester

Please update the changelog in order to start CI tests.

Review threads:
cpp/src_prims/vectorized.cuh (resolved)
cpp/src/svm/workingset.cuh (outdated, resolved)
cpp/src/svm/workingset.cuh (outdated, resolved)

@tfeher tfeher left a comment

Thanks @teju85 for the review, I have fixed the issues!

@teju85 teju85 left a comment

Thank you @tfeher for the PR. Changes LGTM.

@dantegd dantegd left a comment

The doxygen docs are now checked as part of CI, and the parameter sample_weight seems to be missing from the docstring:

Generating docs for compound ML::Internals::Cal/jenkins/workspace/rapidsai/gpuci/cuml/prb/cuml-gpu-build/cpp/include/cuml/svm/svc.hpp:49: error: The following parameter of ML::SVM::svcFit(const cumlHandle &handle, math_t *input, int n_rows, int n_cols, math_t *labels, const svmParameter &param, MLCommon::Matrix::KernelParams &kernel_params, svmModel< math_t > &model, const math_t *sample_weight=nullptr) is not documented:

  parameter 'sample_weight' (warning treated as error, aborting now)

@tfeher tfeher force-pushed the fea-ext-svm-sample-weights branch from 469de3b to d2fff63 Compare August 4, 2020 12:49

tfeher commented Aug 4, 2020

Thanks @dantegd, I have corrected the docstring!

@dantegd dantegd merged commit 29df7e5 into rapidsai:branch-0.15 Aug 5, 2020