Implementations of various Naive Bayes classifier. #39

Merged: 2 commits merged on May 15, 2015

Conversation

rleonid
Owner

@rleonid rleonid commented May 11, 2015

Multinomial version with

  • smoothing
  • Bernoulli evaluation

Gaussian version for dealing with continuous data.
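
As a rough illustration of the two ingredients named above, here is a minimal sketch of additive smoothing for count-based estimates and of the per-feature Gaussian log density. The function names, signatures, and the default `alpha` are assumptions made for this example and are not taken from the PR's code.

```ocaml
(* Illustrative helpers; names and signatures are assumptions, not the PR's API. *)

(* Additive smoothing over one class's feature counts:
   p_i = (count_i + alpha) / (total + alpha * k), where k is the number of features. *)
let smoothed_probs ?(alpha = 1.0) counts =
  let k = float_of_int (Array.length counts) in
  let total = float_of_int (Array.fold_left (+) 0 counts) in
  Array.map (fun c -> (float_of_int c +. alpha) /. (total +. alpha *. k)) counts

(* Log density of a univariate normal with the given mean and variance,
   as would be evaluated per feature for continuous data. *)
let gaussian_log_pdf ~mean ~var x =
  let pi = 4.0 *. atan 1.0 in
  let d = x -. mean in
  (-0.5) *. log (2.0 *. pi *. var) -. (d *. d) /. (2.0 *. var)
```

With smoothing, a feature that was never observed for a class still gets a non-zero probability, so a single unseen feature cannot zero out the whole product.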

let aa = feature_size + 1 in
let update arr idx =
  Array.iter (fun i -> arr.(i) <- arr.(i) + 1) idx;
  (* keep track of the class count at the end of array. *)
Contributor

Putting the count at the end is a bit sneaky. Does it have a significant performance impact or keep the code significantly simpler?

Owner Author

It probably does not have a significant performance impact. The type signature

'a * float * float array

looked awkward to me. Plus un/re-boxing the tuple in the association list below seemed like a waste. It did make things easy to multiply since you could just fold over the entire array, until I started dealing with the smoothing.

Let's see if I like the way the code looks in 3-6 months. If it seems like a bad choice then, I'll factor out the prior into a separate float.
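
To make the trade-off concrete, here is a hedged sketch of the two layouts under discussion. The packed version assumes the class count is incremented in slot `feature_size` (the diff excerpt only shows the comment, not that line), and the record type `class_counts` is a hypothetical alternative, not something from the PR.

```ocaml
(* Layout A: one array per class of length feature_size + 1.
   Feature counts live in slots 0 .. feature_size - 1; the class count
   sits in the last slot. The last-slot update is an assumption based on
   the comment in the diff. *)
let update_packed feature_size arr idx =
  Array.iter (fun i -> arr.(i) <- arr.(i) + 1) idx;
  arr.(feature_size) <- arr.(feature_size) + 1

(* Layout B (hypothetical): keep the class count in its own field. *)
type class_counts = { mutable n : int; features : int array }

let update_record c idx =
  Array.iter (fun i -> c.features.(i) <- c.features.(i) + 1) idx;
  c.n <- c.n + 1
```

Layout A lets one fold walk the whole array, but every consumer has to remember that the last slot is not a feature. Layout B spells that out in the type at the cost of an extra allocation per class.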

@struktured
Contributor

Another nice-to-have feature for this package would be an incremental version, perhaps in the spirit of scikit's implementation (although that still leaves you with some tough choices to make).

I was trying to dig up a nice paper on online multinomial NB but haven't found one yet (here's a good paper on Online LDA though, which also has a Python implementation).

@rleonid
Owner Author

rleonid commented May 12, 2015

An online version wouldn't be that difficult to implement. Move the code after https://github.com/rleonid/oml/blob/naive_bayes/src/lib/classify.ml#L104 into a closure for the type to be evaluated at the eval step. If you don't mind, I'll file an issue to do it later.

I'll also add the LDA paper link to another issue.
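
One possible shape for such an online version, sketched under assumptions: counts are the only stored state, and all probability estimates are computed at evaluation time. The type `online`, the functions `create`, `observe`, and `eval`, and the two-outcome smoothing denominator are all hypothetical and do not come from `classify.ml`.

```ocaml
(* Hypothetical incremental model: only raw counts are stored. *)
type 'a online = {
  feature_size : int;
  table : ('a, int array) Hashtbl.t; (* class -> counts, class count in last slot *)
  mutable total : int;               (* number of observations seen so far *)
}

let create feature_size = { feature_size; table = Hashtbl.create 16; total = 0 }

(* Fold one labelled observation (given as the indices of its present features)
   into the counts. *)
let observe m cls idx =
  let arr =
    match Hashtbl.find_opt m.table cls with
    | Some a -> a
    | None ->
      let a = Array.make (m.feature_size + 1) 0 in
      Hashtbl.add m.table cls a;
      a
  in
  Array.iter (fun i -> arr.(i) <- arr.(i) + 1) idx;
  arr.(m.feature_size) <- arr.(m.feature_size) + 1;
  m.total <- m.total + 1

(* Log score per class, computed from the current counts. Only the present
   features are scored; smoothing assumes two outcomes per feature. *)
let eval ?(alpha = 1.0) m idx =
  Hashtbl.fold
    (fun cls arr acc ->
      let n = float_of_int arr.(m.feature_size) in
      let prior = log (n /. float_of_int m.total) in
      let score =
        Array.fold_left
          (fun s i -> s +. log ((float_of_int arr.(i) +. alpha) /. (n +. 2.0 *. alpha)))
          prior idx
      in
      (cls, score) :: acc)
    m.table []
```

Because nothing is normalised until `eval` runs, further calls to `observe` can be interleaved with predictions without retraining.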

@rleonid rleonid mentioned this pull request May 12, 2015
2 tasks
rleonid added a commit that referenced this pull request May 15, 2015
Implementations of various Naive Bayes classifier.
@rleonid rleonid merged commit 572aab7 into master May 15, 2015
@rleonid rleonid deleted the naive_bayes branch May 15, 2015 03:36
rleonid added a commit that referenced this pull request Jul 22, 2015
Implementations of various Naive Bayes classifier.
2 participants