I have cleaned the repository down to ten issues, in two projects:
I still consider adopting a better, state-of-the-art discovery method (#10) the most impactful piece of work.
Renewing the concept indexes would also be nice, to keep them fresh (though tedious, as one needs to ensure the crawling code is still correct for the 2019 versions of the websites).
The big feature set I swept under the rug concerns "invalidation" (#16), which is platform-dependent and difficult to implement in the standalone repo.
Lastly, depending on who has time to invest in upgrading nnexus into a 2019 "best in class" tool, and when, I am also tempted to suggest a second rewrite, this time departing from Perl for good. If I were the one undertaking it, I would certainly choose Rust, which is a great language for this type of tooling. I can point to the statement classification showcase I recently completed, which is a decent exhibit of what a future nnexus web service may look like. That said, the port can be done in parts, and the index database generation can remain in Perl for a while: web crawling is definitely one of the places where Perl code is just much quicker for getting things done.
If you're wondering about the current project's size, here is the report from the `cloc` tool:
P.S. I should also say that I am not making rewrite suggestions lightly. I have ported all of my research work to Rust, and I credit it as one of the main reasons the tooling I ended up with is usable and productive. Nnexus is not yet experiencing any explicit disadvantage from being in Perl, except maybe the much more limited community of potential developers.