organize or extract some useful utils #46
Comments
The code has definitely outgrown being all in one package, but I haven't had time to reorganize it. I'm happy to provide feedback on proposals and accept pull requests, though. The data file formats may not be the best starting point: they produce FeatureMatrixes filled with DenseNumFeature and DenseCatFeature values, and these types and interfaces also implement logic specific to the split searching and splitting criteria used in decision trees, and to handling missing values the way I do, so they aren't great for general use. A general-purpose parser should probably parse to either simple slices of data or something like the matrix types defined in gonum, and I'm not sure those handle missing values or categorical data. Something like a pandas or R dataframe for Go would be a good target, but I'm not aware of one. The code is BSD licensed, so parts could also be spun off into independent projects if there is something you have a pressing need for.
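To make the "parse to simple slices" idea concrete, here is a minimal sketch of what a standalone libsvm line parser might look like. It is not code from this repository: `SparseRow` and `ParseLibSVMLine` are hypothetical names, and the sketch assumes the common `label index:value ...` line layout with numeric labels.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// SparseRow is a hypothetical plain-slice view of one libsvm line.
// It carries no tree-specific logic, only the parsed data.
type SparseRow struct {
	Label   float64
	Indices []int     // feature indices exactly as written in the file
	Values  []float64 // Values[i] belongs to feature Indices[i]
}

// ParseLibSVMLine parses a line of the form "label idx:val idx:val ...".
// Assumption: labels are numeric; other label schemes would need extra handling.
func ParseLibSVMLine(line string) (SparseRow, error) {
	fields := strings.Fields(line)
	if len(fields) == 0 {
		return SparseRow{}, fmt.Errorf("empty line")
	}
	label, err := strconv.ParseFloat(fields[0], 64)
	if err != nil {
		return SparseRow{}, fmt.Errorf("bad label %q: %v", fields[0], err)
	}
	row := SparseRow{Label: label}
	for _, f := range fields[1:] {
		parts := strings.SplitN(f, ":", 2)
		if len(parts) != 2 {
			return SparseRow{}, fmt.Errorf("bad feature %q: want index:value", f)
		}
		idx, err := strconv.Atoi(parts[0])
		if err != nil {
			return SparseRow{}, fmt.Errorf("bad index in %q: %v", f, err)
		}
		val, err := strconv.ParseFloat(parts[1], 64)
		if err != nil {
			return SparseRow{}, fmt.Errorf("bad value in %q: %v", f, err)
		}
		row.Indices = append(row.Indices, idx)
		row.Values = append(row.Values, val)
	}
	return row, nil
}

func main() {
	row, err := ParseLibSVMLine("1 3:0.5 7:2")
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println(row.Label, row.Indices, row.Values)
}
```

Keeping the result as plain slices leaves decisions such as how to represent missing values or categorical features to the caller, which is the part that is currently tied to the decision-tree types.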
Do you have any ideas or draft thoughts on how we could organize it?
I agree. I've always felt we should have a dataframe or pandas equivalent in Go. I've used gonum in another machine learning package, golearn, and it's really good. However, I think it would be better to have a higher-level wrapper built on gonum (something like a data frame). That kind of fundamental infrastructure is necessary for building good ML packages in Go. Do you have any thoughts on such a dataframe? I'd definitely like to find someone to discuss and build such a tool with, though it's a little off topic for this issue, LOL...
Hi,
I noticed there are many useful tools in this package, for example reading and writing libsvm format files, the data matrix package, etc. Do you have any plans to organize or extract them into packages? I think it would be good to put them into separate packages so that other ML-related packages can be built on these utils.
I'd like to help with this if you have any plans for it! Please let me know your thoughts!
Thanks.