[SPARK-11401] [MLLIB] PMML export for Logistic Regression Multiclass Classification #9397

selvinsource
Contributor

selvinsource changed the title from "[SPARK-7272] [MLLIB] PMML export for Logistic Regression Multiclass Classification" to "[SPARK-11401] [MLLIB] PMML export for Logistic Regression Multiclass Classification" on Nov 1, 2015
@dbtsai
Member

dbtsai commented Nov 2, 2015

Jenkins, ok to test

@dbtsai
Member

dbtsai commented Nov 2, 2015

It's hard to see the changes because the file was moved from BinaryClassificationPMMLModelExport.scala to ClassificationPMMLModelExport.scala. Can you create a separate sub-task for renaming the class?

@SparkQA

SparkQA commented Nov 2, 2015

Test build #44801 has finished for PR 9397 at commit 14a4c8d.

  • This patch fails MiMa tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@selvinsource
Contributor Author

Here is the diff between the two files (BinaryClass vs Class):
http://www.mergely.com/w7ufbahQ/

In practice, the difference between the Binary and the Class version is that, instead of building two fixed regression tables (YES/NO), I create a Category Zero table (the one without the predictors) plus one regression table for each remaining category (numClasses-1 tables).

When numClasses = 2, the code does exactly what the BinaryClassificationPMMLModelExport class that I originally wrote was doing before.
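
The table layout described above can be sketched as follows. This is a minimal illustration, not the code from this PR: `TableSketch` and `MulticlassTablesSketch` are hypothetical names standing in for the PMML regression-table type, and the weight layout (one row per non-pivot class, features first, intercept last) is an assumption made for the example.

```scala
// Hypothetical stand-in for a PMML regression table; not the JPMML class.
final case class TableSketch(
    targetCategory: String,
    intercept: Double,
    coefficients: Seq[Double])

object MulticlassTablesSketch {
  // Assumption: `weights` is a flattened (numClasses - 1) x (numFeatures + 1)
  // matrix, one row per non-pivot class, with the intercept as the last entry.
  def buildTables(
      weights: Array[Double],
      numClasses: Int,
      numFeatures: Int): Seq[TableSketch] = {
    require(numClasses >= 2, "need at least two classes")
    val rowLength = numFeatures + 1
    require(weights.length == (numClasses - 1) * rowLength, "unexpected weight count")

    // Category zero is the pivot: zero intercept and no predictors.
    val categoryZero = TableSketch("0", 0.0, Seq.empty)

    // One table per remaining category, i.e. numClasses - 1 tables.
    val others = (1 until numClasses).map { k =>
      val row = weights.slice((k - 1) * rowLength, k * rowLength)
      TableSketch(k.toString, row.last, row.init.toSeq)
    }
    categoryZero +: others
  }
}
```

With numClasses = 2 this yields the pivot table plus a single table with predictors, which matches the two-table shape of the binary case.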

My Spark validator project confirms that both binary and multiclass PMML export work correctly:
https://github.com/selvinsource/spark-pmml-exporter-validator/tree/logistic_regression_multi_class

See sections:
Logistic Regression (Binary Classification)
Logistic Regression (Multiclass Classification)

@dbtsai
Member

dbtsai commented Nov 3, 2015

Okay. Please fix the failing MiMa tests. Thanks.

@selvinsource
Contributor Author

@dbtsai
How do I fix the following?

[info] spark-mllib: found 1 potential binary incompatibilities (filtered 51)
[error]  * class org.apache.spark.mllib.pmml.export.BinaryClassificationPMMLModelExport does not have a correspondent in new version
[error]    filter with: ProblemFilters.exclude[MissingClassProblem]("org.apache.spark.mllib.pmml.export.BinaryClassificationPMMLModelExport")

I'm not familiar with MiMa.

Thanks.

@dbtsai
Member

dbtsai commented Nov 4, 2015

You need to add ProblemFilters.exclude[MissingClassProblem]("org.apache.spark.mllib.pmml.export.BinaryClassificationPMMLModelExport") to the MiMa excludes in the project/ folder.
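
As a rough illustration, the filter from the MiMa error message could be declared like this. The object and value names are hypothetical, and where exactly the entry belongs inside the excludes file under project/ (for example, which version-specific list) is an assumption to check against the current master.

```scala
// Sketch only: MimaExcludesSketch and pmmlExcludes are hypothetical names.
// Compiling this requires the MiMa sbt plugin classes on the build classpath.
import com.typesafe.tools.mima.core._

object MimaExcludesSketch {
  // SPARK-11401: the binary-only exporter class no longer exists under its
  // old name, so MiMa is told to ignore the missing class.
  val pmmlExcludes = Seq(
    ProblemFilters.exclude[MissingClassProblem](
      "org.apache.spark.mllib.pmml.export.BinaryClassificationPMMLModelExport")
  )
}
```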

@SparkQA

SparkQA commented Nov 5, 2015

Test build #45152 has finished for PR 9397 at commit 63945b5.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Nov 8, 2015

Test build #45306 has finished for PR 9397 at commit 7db2168.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@selvinsource
Contributor Author

@dbtsai I don't think the failed Spark unit tests have anything to do with my code.

@SparkQA

SparkQA commented Nov 13, 2015

Test build #45877 has finished for PR 9397 at commit 96a19d6.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@dbtsai
Member

dbtsai commented Nov 23, 2015

Jenkins, retest this please

@SparkQA

SparkQA commented Nov 23, 2015

Test build #46554 has finished for PR 9397 at commit 96a19d6.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@selvinsource
Contributor Author

@dbtsai any advice on why it is failing? All the PMML tests passed, and the PMML export is the only thing I changed.

@dbtsai
Member

dbtsai commented Nov 24, 2015

Can you rebase onto master?

selvinsource force-pushed the mllib_pmml_model_export_SPARK-11401 branch from 96a19d6 to fd38551 on November 26, 2015 23:19
@SparkQA

SparkQA commented Nov 27, 2015

Test build #46784 has finished for PR 9397 at commit fd38551.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@selvinsource
Contributor Author

@dbtsai thanks for the suggestion, rebasing onto master seems to have fixed it.

@rxin
Contributor

rxin commented Jun 15, 2016

Thanks for the pull request. I'm going through a list of pull requests to cut them down since the sheer number is breaking some of the tooling we have. Due to lack of activity on this pull request, I'm going to push a commit to close it. Feel free to reopen it or create a new one.

@dbtsai there are a few pull requests that were waiting on your review. Can you revisit them even if they are closed?

@asfgit asfgit closed this in 1a33f2e Jun 15, 2016