[SPARK-3723] [MLlib] Adding instrumentation to random forests #13881

Closed

smurching
Contributor

What changes were proposed in this pull request?

In RandomForest.run(), added instrumentation for the number of node groups, along with the min, max, and average number of nodes per group.
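For readers without the diff at hand, the bookkeeping described above could look roughly like the following. This is only a sketch and not the code from the patch: `NodeGroupStats` and its members are hypothetical names, and the real change lives inside `RandomForest.run()` rather than in a separate class.

```scala
// Hypothetical helper (not part of Spark): tracks how many node groups were
// processed and the min, max, and average number of nodes per group.
final class NodeGroupStats {
  private var numGroups = 0
  private var totalNodes = 0L
  private var minNodes = Int.MaxValue
  private var maxNodes = Int.MinValue

  /** Record one processed group that contained `numNodes` nodes. */
  def record(numNodes: Int): Unit = {
    numGroups += 1
    totalNodes += numNodes
    minNodes = math.min(minNodes, numNodes)
    maxNodes = math.max(maxNodes, numNodes)
  }

  def count: Int = numGroups

  // min, max, and average are undefined when no groups were processed,
  // so they are exposed as Options instead of sentinel values.
  def min: Option[Int] = if (numGroups == 0) None else Some(minNodes)
  def max: Option[Int] = if (numGroups == 0) None else Some(maxNodes)
  def average: Option[Double] =
    if (numGroups == 0) None else Some(totalNodes.toDouble / numGroups)

  /** Human-readable summary suitable for a log statement. */
  def summary: String = (min, max, average) match {
    case (Some(lo), Some(hi), Some(avg)) =>
      s"Processed $numGroups node groups (min=$lo, max=$hi, avg=$avg nodes per group)"
    case _ =>
      "No node groups were processed"
  }
}
```

A training loop would call `record(nodesInGroup)` once per group and log `summary` at the end. Returning `Option` for the statistics keeps the empty case explicit instead of reporting `Int.MaxValue` as a minimum.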

Also fixed a typo in BaggedPoint.scala documentation.

How was this patch tested?

Tested by running RandomForestClassifierSuite, checking the test output manually to make sure instrumentation information was present and reasonable.

@mengxr
Contributor

mengxr commented Jun 24, 2016

ok to test

@SparkQA

SparkQA commented Jun 24, 2016

Test build #61144 has finished for PR 13881 at commit bd7d24d.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

no groups of nodes are processed; updated log statements to reflect this
@SparkQA

SparkQA commented Jun 24, 2016

Test build #61150 has finished for PR 13881 at commit f5a6893.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Jun 24, 2016

Test build #61151 has finished for PR 13881 at commit 7fb031e.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@smurching
Contributor Author

Does it make sense to only perform instrumentation-related computations (i.e. updating the max/min nodes per group) if the instrumentation argument to RandomForest.run (instr) is not None? This isn't checked for in the current implementation.
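To make the question concrete, one way to skip the bookkeeping when no instrumentation is supplied might look like the sketch below. Everything here is an assumption for illustration: `StatsSink` is a hypothetical stand-in for Spark's internal `Instrumentation` class (whose actual methods are not shown in this thread), and `processGroups` is a simplified placeholder for the loop in `RandomForest.run`.

```scala
object InstrumentationSketch {
  // Hypothetical stand-in for an instrumentation object; not Spark's API.
  trait StatsSink {
    def log(name: String, value: Double): Unit
  }

  /**
   * Simplified placeholder for the training loop: `groupSizes` holds the
   * number of nodes in each group. Statistics are only updated and reported
   * when `instr` is defined.
   */
  def processGroups(groupSizes: Seq[Int], instr: Option[StatsSink]): Unit = {
    var numGroups = 0
    var totalNodes = 0L
    var minNodes = Int.MaxValue
    var maxNodes = Int.MinValue

    groupSizes.foreach { n =>
      // ... the actual work for this group of nodes would happen here ...
      if (instr.isDefined) {
        // Bookkeeping is skipped entirely when nobody will read it.
        numGroups += 1
        totalNodes += n
        minNodes = math.min(minNodes, n)
        maxNodes = math.max(maxNodes, n)
      }
    }

    instr.foreach { sink =>
      sink.log("numNodeGroups", numGroups.toDouble)
      // min/max/avg are only meaningful if at least one group was processed.
      if (numGroups > 0) {
        sink.log("minNodesPerGroup", minNodes.toDouble)
        sink.log("maxNodesPerGroup", maxNodes.toDouble)
        sink.log("avgNodesPerGroup", totalNodes.toDouble / numGroups)
      }
    }
  }
}
```

The per-group cost of these updates is a few integer operations, so the guard is mostly about clarity of intent, not measurable speed.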

@jkbradley
Member

Sorry for the long delay! Whenever you get a chance to update this, it'd be nice to log this info via the Instrumentation class, rather than logInfo.

@HyukjinKwon
Member

Hi @smurching, is this still active?
