-
Notifications
You must be signed in to change notification settings - Fork 31
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Load data to atomspace #20
Conversation
* @param row_num row number | ||
* @return | ||
*/ | ||
Handle mk_row_number_cell(const int& row_num) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You don't need to pass int
by const ref as int
is as big as its pointer or smaller.
* @param contin contin value of a cell | ||
* @return | ||
*/ | ||
Handle mk_real_cell(const contin_t& contin) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No need to pass contin
by ref (it's the same size as a pointer).
* @return | ||
*/ | ||
Handle mk_real_cell_exec(const contin_t& contin, | ||
const int& row_num, const string& label) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same remark about passing by ref (string though be passed by ref).
* @param b builtin value | ||
* @return | ||
*/ | ||
Handle mk_boolean_cell(const builtin& b) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same remark about builtin ref passing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
builtin
is an enum. Shouldn't we pass it by ref?
const type_node& type = oTable.get_type(); | ||
const string& label = oTable.get_label(); | ||
HandleSeq rows; | ||
int row_num = 1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's better if it starts by 0 (contrary to what I've indicated on the issue). It makes the subsequent code slightly simpler and it's a more frequent start index value.
mk_real_cell(boost::get<contin_t>(v))); | ||
rows.push_back(h); | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alternatively you can put the conditional in the createLink
and select either mk_real_cell
or mk_boolean_cell
depending on type
. It slightly less efficient (though probably imperceptibly) but it reduces the number of lines of code.
Another even better option is to create a mk_cell
visitor! that way you just call that mk_cell
function on v
and you're good. There are plenty of examples of visitors throughout MOSES' code (in table.h
for instance). See also https://www.boost.org/doc/libs/1_67_0/doc/html/variant.html for its documentation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It slightly less efficient (though probably imperceptibly) but it reduces the number of lines of code.
Will it be imperceptible as we're going to do the checking for every single cell in the data?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In fact, if we're changing it for only the output table, it wouldn't affect much. But, what if we're doing the same for other parts such as this line (https://github.com/opencog/as-moses/pull/20/files#diff-85f2b83d9bc9a6c48aeb22bb8535b034R325)?
* all features have similar type with the output feature. | ||
* @return | ||
*/ | ||
Handle mk_input_table(const ITable& iTable, const type_node& type) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No need to pass type_node
by ref.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
type_node
is an enum too.
HandleSeq cells; | ||
|
||
cells.push_back(mk_boolean_cell( | ||
boost::get<builtin>(table.otable[row_num - 1]))); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you start row_num
from 0 you wouldn't need to subtract 1.
@Yidnekachew it is OK if you make you next commits on top the existing one (i.e. not overwrite the git history) I mean the branch is clean enough at this point. |
@Yidnekachew said
Yes because the branch prediction of the CPU will choose the right function after the first few cycles. |
HandleSeq cells; | ||
|
||
for (contin_t cell_value : m.get_seq<contin_t>()) | ||
cells.push_back(mk_real_cell(cell_value)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, you could simplify here as well by having the whole push_back
loop wrapped in a visitor, but maybe it's just too much worry. I mean this code is already quite clean as it is.
cells.push_back(mk_boolean_cell(cell_value)); | ||
|
||
rows.push_back( | ||
similarity ? mk_numbered_row(cells, row_num + 1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can drop + 1
in row_num + 1
(in that and the other instances).
for (builtin cell_value : m.get_seq<builtin>()) | ||
cells.push_back( | ||
mk_boolean_cell_exec(cell_value, row_num + 1, | ||
labels[cell_number++])); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, yes, I feel that if you wrap that code in a visitor you may simply a lot your code. Maybe like having a visitor for turning a multi_type_seq
into a cells
vector.
HandleSeq operator()(const T& t) const | ||
{} | ||
|
||
int cell_number = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a bit dangerous to have this here because if you reapply the visitor it will crash. I would suggest to move it in the operators.
|
||
for (const multi_type_seq& m : table.itable) | ||
{ | ||
HandleSeq cells = boost::apply_visitor(mk_cell_seq_visitor(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
BTW, it terms of performance, it might be better to create the visitor out of the loop. But frankly it's not even sure as both the compiler and system (CPU branch prediction) optimize that sort of things. Just keep that in mind for the day you want to speed up that function (although it's unlikely it'll be problematic).
* @param table | ||
* @return | ||
*/ | ||
Handle mk_unfolded_boolean_table_eval(const Table& table) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
{ | ||
mk_exec_cell_seq_visitor cseqv; | ||
cseqv.row_num = row_num; | ||
cseqv.labels = labels; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think C++11 allows you to write directly
mk_exec_cell_seq_visitor cseqv{row_num, labels};
{} | ||
|
||
int row_num; | ||
string label; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Better be a const string&
{} | ||
|
||
int cell_number = 0; | ||
HandleSeq cellSeq; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Better move this in the operators as well.
int cell_number = 0; | ||
HandleSeq cellSeq; | ||
int row_num; | ||
vector<string> labels; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Better be a const ref.
@@ -0,0 +1,474 @@ | |||
/** atomese_representation.cc --- |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Need name update. Your IDE likely offers a way to find and replace a string across multiple files, if not you can go old school with sed
(though interactivity is usually better handled via an IDE).
mk_exec_cell_seq_visitor(const int row_num, | ||
const vector<string>& labels) | ||
: row_num(row_num), labels(labels) | ||
{} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you don't need to define that constructor, C++11 initializer will work regardless.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm unable to do that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My bad, they need to be public for it to work. You may either use struct
instead of class
or declare them public
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a struct
now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cool, so now you should be able to remove mk_exec_cell_visitor
and mk_exec_cell_seq_visitor
constructors.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was already struct
and doesn't work if constructors removed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh I see, C++11 aggregate initialization is very stringent, see https://en.cppreference.com/w/cpp/language/aggregate_initialization for more info.
It actually works with -std=c++17
but then this produces other problems with the cogutil tree
class. And C++17 is too recent to force users to use a compiler compatible with it.
So you're gonna have to let it here, but you may add a comment that such a constructor isn't needed for C++17.
using namespace opencog; | ||
using namespace std; | ||
|
||
class atomese_reprUTest : public CxxTest::TestSuite { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please update that unit test name (and the filename it's in) to follow the functions it is testing), load_tableUTest
would do.
@@ -0,0 +1,238 @@ | |||
/** atomese_representation.h --- |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Update
*/ | ||
|
||
#ifndef _OPENCOG_ATOMESE_REPRESENTATION_H | ||
#define _OPENCOG_ATOMESE_REPRESENTATION_H |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Update
Last thing @Yidnekachew please fix the warnings (by having the default visitor operators return something, like |
I've merged, it seems alright, including the git history. Thanks for your patience and for putting up with my stringent requirements. |
@ngeiswei It's an honor to get your comments! |
Implemented based on #12