Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-6724] [MLlib] Support model save/load for FPGrowthModel #9267

Closed
wants to merge 11 commits into from

Conversation

yanboliang
Copy link
Contributor

Support model save/load for FPGrowthModel

@SparkQA
Copy link

SparkQA commented Oct 25, 2015

Test build #44311 has finished for PR 9267 at commit 81f667a.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):\n * class FPGrowthModel[Item: ClassTag: TypeTag] @Since(\"1.3.0\") (\n

throw new UnsupportedOperationException(s"Schema for type $other is not supported")
}
result
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we can make the inferring one of Spark SQL functions just like ScalaReflection.schemaFor?

@SparkQA
Copy link

SparkQA commented Dec 10, 2015

Test build #47509 has finished for PR 9267 at commit 81f667a.

  • This patch passes all tests.
  • This patch does not merge cleanly.
  • This patch adds the following public classes (experimental):\n * class FPGrowthModel[Item: ClassTag: TypeTag] @Since(\"1.3.0\") (\n

@jkbradley
Copy link
Member

I'll take a look at this now.

@@ -42,8 +53,9 @@ import org.apache.spark.storage.StorageLevel
*/
@Since("1.3.0")
@Experimental
class FPGrowthModel[Item: ClassTag] @Since("1.3.0") (
@Since("1.3.0") val freqItemsets: RDD[FreqItemset[Item]]) extends Serializable {
class FPGrowthModel[Item: ClassTag: TypeTag] @Since("1.3.0") (
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will break API compatibility since old user code might provide a class for which a TypeTag is not available.

@jkbradley
Copy link
Member

Your PR looks good, but I hope we can eliminate the need for Item to be restricted by TypeTag.

Also, eliminating inferItemType will be nice to avoid the need for updates if Catalyst supports more types in the future.

This should probably include a Java unit test b/c of how it uses types.

Thanks! I'll watch for updates.

@yanboliang
Copy link
Contributor Author

@jkbradley I updated the PR and it's not necessary to provide TypeTag now. I also add Java unit test. But I can not figure out a way to eliminate inferItemType, because there is no exist API to do the map between DataFrame datatypes and the Scala types. I'm very glad to hear your opinions.

@SparkQA
Copy link

SparkQA commented Dec 31, 2015

Test build #48546 has finished for PR 9267 at commit 7381b31.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@@ -20,17 +20,27 @@ package org.apache.spark.mllib.fpm
import java.{util => ju}
import java.lang.{Iterable => JavaIterable}

import org.json4s.DefaultFormats
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Organize imports (scala before 3rd-party libraries)

@jkbradley
Copy link
Member

@yanboliang This change should let you remove inferItemType. (Tests pass at least.)

jkbradley@4f5c5a3

@SparkQA
Copy link

SparkQA commented Jan 5, 2016

Test build #48731 has finished for PR 9267 at commit 41b31eb.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@yanboliang
Copy link
Contributor Author

@jkbradley That's cool! It also works well in my environment. Thanks for your kindly help!

@jkbradley
Copy link
Member

@yanboliang No problem; thanks for your updates! This LGTM

Merging with master

@asfgit asfgit closed this in 13a3b63 Jan 5, 2016
@yanboliang yanboliang deleted the spark-6724 branch January 6, 2016 02:32
ghost pushed a commit to dbtsai/spark that referenced this pull request Apr 13, 2016
```PrefixSpanModel``` supports ```save/load```. It's similar with apache#9267.

cc jkbradley

Author: Yanbo Liang <[email protected]>

Closes apache#10664 from yanboliang/spark-10386.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants