[7.7][ML] Add information about samples per node to the tree #1006

Merged: valeriy42 merged 1 commit into elastic:7.x from backport-pr-991 on Feb 18, 2020


valeriy42
Contributor

Backport of #991

This PR extends the definition of the tree node by adding the number of training samples that passed through the node (numberSamples or number_samples). The JSON schema for the inference model is adjusted accordingly.
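To illustrate the idea, here is a minimal Python sketch, not the actual C++ implementation from this PR. Only the `number_samples` name comes from the description above; the `Node` class, its other fields, `count_samples`, `to_json`, and the nested JSON layout are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    # Hypothetical node layout; only number_samples is named in the PR text.
    split_feature: int = -1
    threshold: float = 0.0
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    leaf_value: float = 0.0
    number_samples: int = 0

    def is_leaf(self) -> bool:
        return self.left is None and self.right is None


def count_samples(root: Node, rows: List[List[float]]) -> None:
    """Increment number_samples on every node each training row passes through."""
    for row in rows:
        node = root
        while True:
            node.number_samples += 1
            if node.is_leaf():
                break
            # Assumed split convention: go left when the feature is below the threshold.
            node = node.left if row[node.split_feature] < node.threshold else node.right


def to_json(node: Node) -> dict:
    """Serialize a node to a dict; the layout is illustrative, not the real schema."""
    doc = {"number_samples": node.number_samples}
    if node.is_leaf():
        doc["leaf_value"] = node.leaf_value
    else:
        doc["split_feature"] = node.split_feature
        doc["threshold"] = node.threshold
        doc["left"] = to_json(node.left)
        doc["right"] = to_json(node.right)
    return doc
```

With a single split and three training rows, two of which go left, the root would carry a count of 3 and the leaves counts of 2 and 1.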

Since this changes the schema used to persist and restore the tree implementation, I bumped the version and removed 7.5 and 7.6 from the list of supported versions. My reasoning: restoring from the old schema and setting the number of samples to 0 would break feature importance at inference time.
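The version gate could look roughly like the following Python sketch. Everything here is an assumption made for illustration: the `"7.7"` version string, the `version` and `nodes` keys, and the `restore_tree` function are hypothetical and do not reflect the real persisted state format.

```python
# Hypothetical list of restorable versions after the bump; 7.5 and 7.6 are gone.
SUPPORTED_VERSIONS = ("7.7",)


class UnsupportedVersionError(ValueError):
    """Raised when persisted state comes from a schema we no longer restore."""


def restore_tree(state: dict) -> dict:
    """Validate persisted tree state before using it (illustrative only)."""
    version = state.get("version")
    if version not in SUPPORTED_VERSIONS:
        # Refuse old state rather than restoring it with zeroed sample counts.
        raise UnsupportedVersionError(f"unsupported state version: {version!r}")
    for node in state["nodes"]:
        if "number_samples" not in node:
            raise ValueError("node is missing number_samples")
    return state
```

Rejecting old state outright trades restorability for correctness: a tree restored with zero counts would load without error but would carry unusable weights into the feature importance computation.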

I also adjusted the feature importance computation to use the pre-computed number of samples instead of recomputing it on the fly.
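As a rough illustration of why the stored counts matter, the sketch below assumes a SHAP-style computation in which the expected value of a subtree is the average of its children's values, weighted by how many training samples reached each child. The `Node` class and `expected_value` function are hypothetical and are not taken from the PR's code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    # Hypothetical minimal node; number_samples is read, never recomputed.
    leaf_value: float = 0.0
    number_samples: int = 0
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def expected_value(node: Node) -> float:
    """Expected prediction of a subtree, weighting children by stored sample counts."""
    if node.left is None or node.right is None:
        return node.leaf_value
    n_left = node.left.number_samples
    n_right = node.right.number_samples
    # Raises ZeroDivisionError when both counts are 0 (no usable weights).
    return (n_left * expected_value(node.left)
            + n_right * expected_value(node.right)) / (n_left + n_right)
```

For a root whose left leaf (value 1.0) saw 2 samples and whose right leaf (value 3.0) saw 1, this gives (2 * 1.0 + 1 * 3.0) / 3. With all counts at 0 the weights are undefined and the function raises, which is one way restoring from the old schema could break feature importance.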

valeriy42 merged commit 030d608 into elastic:7.x on Feb 18, 2020
valeriy42 deleted the backport-pr-991 branch on May 6, 2020 11:14