[SPARK-6083] [MLLib] [DOC] Make Python API example consistent in NaiveBayes #4834

Closed
wants to merge 2 commits into from

Conversation

MechCoder
Contributor

No description provided.

@SparkQA

SparkQA commented Feb 28, 2015

Test build #28130 has started for PR 4834 at commit 0c5fe03.

  • This patch merges cleanly.

@MechCoder
Contributor Author

cc: @mengxr Would you be able to verify this?

@MechCoder
Contributor Author

Hmm. I get an accuracy of zero for the given example. Not sure where I'm going wrong though :(

@SparkQA

SparkQA commented Feb 28, 2015

Test build #28130 has finished for PR 4834 at commit 0c5fe03.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/28130/

@MechCoder
Contributor Author

I changed the randomSplit seed and it works better. It should look good now.

@SparkQA

SparkQA commented Mar 1, 2015

Test build #28139 has started for PR 4834 at commit 65bbbe9.

  • This patch merges cleanly.

@SparkQA

SparkQA commented Mar 1, 2015

Test build #28139 has finished for PR 4834 at commit 65bbbe9.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/28139/

@mengxr
Contributor

mengxr commented Mar 1, 2015

@MechCoder Thanks for the update! We only have 6 lines in sample_naive_bayes_data.txt. That's why some random seeds give bad splits.
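
To illustrate why a 6-row dataset is so sensitive to the seed, here is a minimal sketch in plain Python. It does not use Spark's `randomSplit`; `bernoulli_split` and `count_bad_splits` are hypothetical helpers that only mimic the idea of assigning each row to the training set independently with a given weight. The labels (three classes, two rows each) are an assumed stand-in for the sample file. Under those assumptions, a class is missing from training with probability 0.4² = 0.16, so all three classes appear with probability 0.84³ ≈ 0.59, which leaves a large share of seeds with a degenerate split.

```python
import random

# Assumed stand-in for a 6-row dataset: three classes, two rows each.
labels = [0.0, 0.0, 1.0, 1.0, 2.0, 2.0]


def bernoulli_split(rows, train_weight, seed):
    """Assign each row to train or test independently (hypothetical helper)."""
    rng = random.Random(seed)
    train, test = [], []
    for row in rows:
        (train if rng.random() < train_weight else test).append(row)
    return train, test


def count_bad_splits(rows, train_weight, seeds):
    """Count seeds whose training set misses a class or whose test set is empty."""
    bad = 0
    for seed in seeds:
        train, test = bernoulli_split(rows, train_weight, seed)
        if set(train) != set(rows) or not test:
            bad += 1
    return bad


if __name__ == "__main__":
    seeds = range(100)
    bad = count_bad_splits(labels, 0.6, seeds)
    print(f"{bad} of {len(seeds)} seeds gave a degenerate split")
```

With so few rows, picking a seed that happens to give a usable split is a reasonable fix for a documentation example, though it would not be a sound evaluation strategy on real data.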

@MechCoder
Contributor Author

Great. Do you have any more comments?


# Preprocessing
splitData = data.map(lambda line: line.split(','))
parsedData = splitData.map(
Contributor

We can define a parse function to make the code more readable. Btw, we use 4-space indentation in Python, following PEP 8.

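from pyspark.mllib.linalg import Vectors
from pyspark.mllib.regression import LabeledPoint

def parseLine(line):
    # Label comes before the comma; space-separated features come after it.
    parts = line.split(',')
    label = float(parts[0])
    features = Vectors.dense([float(x) for x in parts[1].split(' ')])
    return LabeledPoint(label, features)

data = sc.textFile('data/mllib/sample_naive_bayes_data.txt').map(parseLine)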
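
The parsing step suggested here can be tried without a Spark installation. The sketch below is a plain-Python analogue, not the code that was merged: `parse_line` is a hypothetical name, it returns a `(label, features)` tuple instead of a `LabeledPoint`, and the sample rows are assumed to follow a `label,f1 f2 f3` layout.

```python
def parse_line(line):
    """Split 'label,f1 f2 ...' into a float label and a list of float features."""
    label_text, features_text = line.split(',')
    label = float(label_text)
    features = [float(x) for x in features_text.split(' ')]
    return label, features


# Hypothetical rows in the assumed 'label,f1 f2 f3' layout.
sample_lines = ["0,1 0 0", "1,0 2 0", "2,0 0 1"]

if __name__ == "__main__":
    for line in sample_lines:
        print(parse_line(line))
```

Keeping the parsing in one named function, rather than chaining lambdas, makes it easy to test on a single line before mapping it over an RDD.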

@MechCoder
Contributor Author

@mengxr Fixed!

@SparkQA

SparkQA commented Mar 1, 2015

Test build #28152 has started for PR 4834 at commit 1cdd7b5.

  • This patch merges cleanly.

@mengxr
Contributor

mengxr commented Mar 1, 2015

LGTM.

@SparkQA

SparkQA commented Mar 1, 2015

Test build #28152 has finished for PR 4834 at commit 1cdd7b5.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/28152/

asfgit pushed a commit that referenced this pull request Mar 2, 2015
…eBayes

Author: MechCoder <[email protected]>

Closes #4834 from MechCoder/spark-6083 and squashes the following commits:

1cdd7b5 [MechCoder] Add parse function
65bbbe9 [MechCoder] [SPARK-6083] Make Python API example consistent in NaiveBayes

(cherry picked from commit 3f00bb3)
Signed-off-by: Xiangrui Meng <[email protected]>
@asfgit asfgit closed this in 3f00bb3 Mar 2, 2015
@mengxr
Contributor

mengxr commented Mar 2, 2015

Merged into master and branch-1.3. Thanks!

@MechCoder MechCoder deleted the spark-6083 branch March 2, 2015 02:54