David/fix cmake shared library ubuntu #4466
- Note that this is only for Linux.
- If this is not done, the copied scripts will not be executed, resulting in errors during training.
- OpenFST is now installed first; Kaldi then uses the built binaries to compile its source code.
Great, thanks!! |
I'll try my best to review ASAP.
|
For the changes I made:
README.md: That was only a comment for my own use; I shall revert it before merging. Will patch it.
install_openfst.sh, OpenFST.py and .gitignore: These are used to install OpenFST. The main motivation is that, for the CMake-installed binaries, I put the installed files into a prefix location (such as /usr/local/lib or ~/APP/), so OpenFST also needs to be installed into the corresponding location. As far as I remember, the original OpenFST installation script has no way to set the installation prefix, which is why I added a new script for OpenFST installation. If the original install script turns out to have a way to set the installation prefix, these new scripts will not be needed. As for the Python version, I wrote it because Kaldi might also support Windows: Python can run natively on Windows, while a shell script needs a special environment (such as WSL or a virtual machine). However, that script seems to violate a CodeFactor rule, so we still need to see how the Python script fares and whether it is suitable.
|
Revert README.md change.
I wish @jtrmal could look at this. This CMake thing is going back and forth. I would be very happy if it finally materialized. Unfortunately, we are both very busy now. If you could take on this project, it would be very helpful. The main goal, in my view, is to transition to CMake smoothly so we do not break the current users. My opinion is there is no perfect build system, or, come to think of it, even a non-brain-damaged one. They all require RTFM if your build is anything but trivial. I rewrote tools/Makefile, and it is still hard to use in even minimally non-typical scenarios. CMake+ninja is likely better than make, in the end, if you do it correctly. |
Thank you for the information about what to watch out for when patching Kaldi. I am new to Kaldi (about three months), so I need that information to patch Kaldi without breaking the environment. I worked on the CMake patch because our work is related to Kaldi, but I currently have very limited time, day or night, for it. After reading your post, I came up with an idea; let's see whether the plan works. For these changes, my main focus is to ensure that our project's code can hook into Kaldi. Our code is an extension to Kaldi and lives in a separate repository. Unlike the GitHub project "phonetisaurus g2p", I don't have time to rewrite all the training scripts in Python, although we currently use Python to run the entire training process. My original intention was therefore to concentrate all binaries and libraries in fixed locations (bin/ and lib/), as old-style Linux does. But since I must not break the Kaldi environment, there may be a middle ground that fixes the current CMake issue while largely preserving Kaldi's overall structure.
Under those changes, eventually only |
Based on the strategy stated above, I reverted the changes that are not necessary, and use Kaldi's For the files changed in this pull request, apart from the two files stated above, one additional file is changed, For the DNN part, there is actually a bug in CUDA compilation for CUDA capability 35 during the CMake build (the Makefile build is fine), so DNN testing will need another patch (which I will put in another pull request?). Therefore, in this pull request, I focus only on verifying the non-DNN training part. Do you think that this testing is enough, or do I need to ensure that training succeeds up to Tri4? |
Thanks!! Let me see what @kkm000 thinks.. |
Thank you David, you took on a long unsolved problem. I'm
LD_LIBRARY_PATH should not normally be used. You build Kaldi after OpenFST has been installed, so you do not need any tricks to make linker hardcode the path to them. There is a way to compile a binary against libraries in one place, but have them at runtime in another (path to libraries to link against is specified with OpenFST is installed into tools/openfst, which is a link to the directory with a version number, like tools/openfst-1.7.1. Yes, it is the same directory as the source, but it creates directories that do not exist at the root level (bin/, lib/ and include/; do not get confused: the original includes are in src/include, and src/lib also exists--it's the library source). In the end, whatever is linked, should be linked against e.g. ../../tools/openfst/lib/libfst.so.12 or similar. Use Are you building static or shared Kaldi? I'll look carefully at your changes today. This one point looked strange to me.
I do not think we support it any more, certainly not in the cudadecoder, so if you enable it (in configure, it is disabled by default), it's expected to fail. It's really very old. I do not remember which CUDA devices were limited to it, but the ancient K20 is newer, AFAICR. Generally, it's better to specify the architecture(s) that you are testing on when building. Every additional sm_XX significantly adds to compilation time, and there is no point embedding a CUDA fatbin into the executable if the hardware cannot run it. |
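The advice to build only for the architectures being tested can be sketched in plain CMake as follows. This is a minimal illustration, not Kaldi's actual build code: it assumes CMake 3.18 or newer (where `CMAKE_CUDA_ARCHITECTURES` exists), the project name is hypothetical, and the value `61` is only a placeholder for whatever GPU is actually in the machine. Whether Kaldi's own CMake files honor this variable is not established in this thread.

```cmake
# Sketch only: restrict CUDA code generation to the architecture(s) under test.
# Assumes CMake >= 3.18; "cuda_arch_sketch" and the default of 61 are placeholders.
cmake_minimum_required(VERSION 3.18)

# Set the default before the CUDA language is enabled, so that CMake does not
# fill in its own default. A value passed with -D on the command line wins.
if(NOT DEFINED CMAKE_CUDA_ARCHITECTURES)
  set(CMAKE_CUDA_ARCHITECTURES 61 CACHE STRING "CUDA architectures to build for")
endif()

project(cuda_arch_sketch LANGUAGES CXX CUDA)

message(STATUS "Building CUDA code for: ${CMAKE_CUDA_ARCHITECTURES}")
```

With such a setup, a different architecture could be selected at configure time, for example with `-DCMAKE_CUDA_ARCHITECTURES=70`, without editing the project file.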
Hi @kkm000: Thank you for the tip. Indeed, when linking OpenFST, I do not set For this patch, the focus is on the issue that the CMake build of shared libraries fails, so I am building Kaldi with shared libraries. I have not checked the static build yet; it could be tested and, if a problem exists, patched in another pull request? For the CUDA build, does Kaldi already have a way to specify the CUDA capability in the CMake project file? If not, I can open yet another pull request to fix this issue (I have already built CUDA with CMake successfully, but the solution is ugly). |
This reverts commit 0a2b995.
Hi @kkm000: I have added two more commits, to first remove change to |
CMakeLists.txt
# get_third_party(openfst)
# set(OPENFST_ROOT_DIR ${CMAKE_BINARY_DIR}/openfst)
# include(third_party/openfst_lib_target)
find_library(OpenFST
The variable is cached, so it must conform to CMake conventions:
- find_library(OpenFST
+ find_library(OpenFST_LIBRARY
CMakeLists.txt
  NAMES fst
  PATHS ${CMAKE_SOURCE_DIR}/tools/openfst/lib
  REQUIRED)
include_directories(${CMAKE_SOURCE_DIR}/tools/openfst/include)
Add, before this line,
find_path(OpenFST_INCLUDE_DIRS "fst.h"
PATHS "${CMAKE_SOURCE_DIR}/tools/openfst/include"
REQUIRED)
get_filename_component(OpenFST_LIBRARY_DIR ${OpenFST_LIBRARY} DIRECTORY CACHE)
mark_as_advanced(FORCE OpenFST_INCLUDE_DIRS OpenFST_LIBRARY OpenFST_LIBRARY_DIR)
Since find_library caches the library path, it makes sense to do the same to the include path. find_path() will cache the path variable. As long as the path is cached, the search will not be performed again.
CMakeLists.txt
  PATHS ${CMAKE_SOURCE_DIR}/tools/openfst/lib
  REQUIRED)
include_directories(${CMAKE_SOURCE_DIR}/tools/openfst/include)
link_directories(${CMAKE_SOURCE_DIR}/tools/openfst/lib)
- link_directories(${CMAKE_SOURCE_DIR}/tools/openfst/lib)
+ link_directories(${OpenFST_LIBRARY_DIR})
This is a cached path to the library. Once found, the find_library() command won't perform the search again. Also, use OpenFST_INCLUDE_DIRS for the include_directories(). It's tempting to omit link_directories() here altogether and use link_libraries(${OpenFST_LIBRARY}), which contains a full path to the library, but it will be incorrect if someone toggles the BUILD_SHARED_LIBS value: the cached value retains the .a or .so suffix which was selected according to the value of that variable.
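Put together, the three review suggestions on CMakeLists.txt might look like the fragment below. It is a sketch assembled from the comments in this thread, not the merged code. Two points are assumptions: the `REQUIRED` keyword of `find_library()` and `find_path()` needs CMake 3.18 or newer, and the header is searched for as `fst/fst.h` on the assumption that OpenFST installs its headers under `include/fst/` (the review comment itself writes `"fst.h"`).

```cmake
# Sketch: locate a pre-installed OpenFST under tools/openfst and cache the results.

# Locate libfst; the full path is cached in OpenFST_LIBRARY.
find_library(OpenFST_LIBRARY
  NAMES fst
  PATHS "${CMAKE_SOURCE_DIR}/tools/openfst/lib"
  REQUIRED)

# Locate the include root; find_path() caches OpenFST_INCLUDE_DIRS as well.
# Assumption: headers live in include/fst/, so the search key is fst/fst.h.
find_path(OpenFST_INCLUDE_DIRS "fst/fst.h"
  PATHS "${CMAKE_SOURCE_DIR}/tools/openfst/include"
  REQUIRED)

# Derive the library directory from the cached library path and cache it too.
get_filename_component(OpenFST_LIBRARY_DIR "${OpenFST_LIBRARY}" DIRECTORY CACHE)

# These are discovered values, not user settings, so hide them from basic GUIs.
mark_as_advanced(FORCE OpenFST_INCLUDE_DIRS OpenFST_LIBRARY OpenFST_LIBRARY_DIR)

# Use the cached variables instead of repeating the hard-coded paths.
include_directories("${OpenFST_INCLUDE_DIRS}")
link_directories("${OpenFST_LIBRARY_DIR}")
```

Keeping the library path, the library directory and the include directory all in the cache means they stay consistent with one another between configure runs.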
@davidlin409, I ran the build a couple times, and everything is just fine, from the very first attempt. I'll merge this, there's only a couple things to fix. Thanks for your work, really appreciated! I have a question for you: how willing are you to improve CMake build for Kaldi? What I'm seeing is kinda raw, compared to our configure. It would be nice to switch to CMake, but there is a lot of work needed. I'm bogged down for the month or so, but I could at the least draw a roadmap for it, what needs to be done. However much you want to bite off it, is great; it's a large project. Read "Modern CMake" and google for "more modern CMake." I found them very helpful.
We call them binaries, not executables in the documentation, and the standard CMake variable (although undocumented and is a horrible name) is
This does not apply to libraries or tests; only the main binaries must reside in the expected locations. Just for a reference, this is how I ran the build (
For reference, these are the actual .so files that the binary will load:
All generated stuff is in a separate directory now:
the
and here are example contents of a library and a binary directory under it: only build artifacts and a couple of extra files
Object files are hidden out of sight:
It's all nice and good, except the scripts expect the binaries to be somewhere else. :) |
Forgot to mention a sensible and well-known hack. You can add It's not a biggie unless you open the CMake project in Visual Studio or VSCode, which will show all these targets, or run |
Hi @kkm000: For Kaldi CMake improvement, is there a place where we could have a Kanban view? The items you mentioned above look like a really large project and are not directly linked to my work, so I might need to organize them into several topics and handle them separately. Another option is that I create an open page on Notion.so where we can add CMake-related topics. But since this is directly related to Kaldi, I think putting a Kanban/Wiki on the Kaldi GitHub would be the safest option. What do you think? Of the four topics you mentioned above, I have a question about where object files are placed (question 4). I know that traditional Kaldi places object files in the same directory as the source code, but that is what old-style Makefiles do. For example, when I was a graduate student at university, I already separated object files into their own directory (also called a build folder) in a pure Makefile project. That would need tuning of the Makefiles, since the Kaldi Makefile system is large, but it is doable. So shall we make CMake place object files the same way the Makefile does, or shall we follow CMake and tune the Makefile project?
As for taking on those topics: for now, my first priority is to patch the bugs related to my project at work (a patch to an egs script that causes a warning in nnet1 DNN training, and the CUDA compilation bug for CUDA capability 35). After that, I might start to look at those topics. But since they are not related to my work, I can only work on them at night on my own computer, which runs Windows instead of native Linux. So it will be much slower (my work computer has 32 CPU threads...), and my time will be limited. Progress will be slow, but by looking into the topics one by one, they can be handled gradually. For some topics (especially placing objects in the source directory). If I were to start on those 4 topics, the first one I might try is question 3, since I am also bugged by CTest. The first and second can be investigated later (there might need to be discussion to make sure that what I have in mind matches what you do). For the 4th question, changing the Makefile is easy, but how to change where CMake places object files is still unknown. |
Let's just try to close this issue and if there are further TODOs we can
leave them up as separate issues on github. Kanban is probably more useful
when there is a dedicated full time team, which there is not really here.
Done is better than perfect.
I think it's best to leave the Makefile build as-is, it will cause a lot of
hassle to those who have forked version if we make big changes.
…On Tue, Mar 16, 2021 at 12:04 AM davidlin409 ***@***.***> wrote:
Hi @kkm000 <https://github.com/kkm000> :
For kaldi CMake improvement, may I ask if there is any place that can have
Kanban view? Those items you mentioned above seems like really large
project, while those projects are not directly linked to my work; I might
need to organize them by several topics and handle them separately. The
other option is that I create a open page on Notion.so, and we can add
CMake related topic on top of it. But since this is directly related to
Kaldi, I think that putting Kanban/Wiki on Kaldi Github will be most safe
option. What do you think?
From the four topics you mentioned above, I do have a question about how
those object files are placed in question 4. I know that traditional Kaldi
place object files in the same directory as source code. But that is
actually what old style Makefile does. For example when I am studying in
university as graduated student, I already separate those object files into
separate directory (also in the directory called build folder) in pure
Makefile project, which needs tuning of Makefile since Kaldi Makefile
system is large, but doable. So shall we make CMake place object file the
same as Makefile, or shall we follow CMake and tune Makefile project?
- Making those objects into build directory have an advantage of
deleting those objects in a folder, which does not tweak a lot in
.gitignore file. Also in current system, if I want to clean those binaries,
I needs to purge git repository. For binary/library installation, it can
still be inside src/ folder, which does not affect current binary library
topology, and binary can found library properly with rpath, and those
scripts will still functioning without risk of being break.
As for taking on those topics. For now my first priority is to patch those
bug that is related to my project at work (a patch to egs script that
causes warning in DNN training of nnet1, and CUDA compilation bug of CUDA
capability 35). After that, I might start to take a look at those topics.
But since those topics are not related to my work, I can only do it at
night with my own computer, which actually install Windows instead of
native Linux. So the speed will be much slower (work computer is with 32
CPU threads...), and I will have limited time. The progress will be slow,
but by one-by-one looking into those topics, it can be gradually handled.
For some topics (especially the object to be placed at source directory).
If I would start from those 4 topics, the first one that I might try is
question 3, since I am also bugged by CTest. First and second can be later
investigated (where there might be discussion to ensure what I think is the
same as you). For the 4th question, changing Makefile is easy, but changing
where CMake place object file is still unknown.
|
Co-authored-by: Cy 'kkm' Katsnelson <[email protected]>
- ENDIF changed to endif, to follow Kaldi convention.
- Add the BUILD_SHARED_LIBS option; otherwise users will not know that this option exists (based on kkm000's suggestion).
- For the OpenFST include/library paths, use `find_library`, `find_path`, and `get_filename_component` to locate them, and use the corresponding cached variables in `include_directories` and `link_directories`.
- Mark the OpenFST-related variables as advanced, since they are found by the CMake procedure instead of being set by the user.
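Declaring `BUILD_SHARED_LIBS` explicitly, as this commit message describes, could look like the line below. It is only a sketch: the help text and the default of `OFF` are assumptions, not taken from the merged CMakeLists.txt.

```cmake
# Sketch: expose the standard BUILD_SHARED_LIBS switch so that it shows up in
# cmake-gui/ccmake. The help text and the OFF default are assumptions.
option(BUILD_SHARED_LIBS "Build libraries as shared (.so) instead of static (.a)" OFF)

# add_library() calls that name neither STATIC nor SHARED follow this switch.
```

A shared build would then be requested at configure time with `-DBUILD_SHARED_LIBS=ON`.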
For the comment above
Indeed, issues are also a way to track those tasks. Thank you @danpovey for the tip :-D.
For the final change
With these changes, I compiled again using the "shared_library" option, and the build was successful. I also checked the OpenFST-related CMake variables, which point to the right locations. Since the final modification only concerns compilation details, I did not run training again this time.
If a training test is needed, please let me know so that I can schedule it again.
PS: (The diff shown below is the old diff, so please use the "File Changed" tab above to see the latest modification)
I'm with @danpovey on this. Let's bite little pieces, easy to chew. And we cannot touch the Makefile layouts. There are tens of thousands users out there who expect it to work as they used to. Most of them know little about build, and are in a rush to write the paper by deadline. So this part is very sensitive. This is also the realistic CMake build should, at the least by default, reproduce the make build exactly. It's not because it's perfect or even good. Look at all the files in tools/config/, which is exactly one, sourced by every run.sh script. Why is it in tools/, why does it require KALDI_ROOT be set when it perfectly knows it is invariably the full path to '../../src'--which could have been just Unfortunately, CMake is not very consistent and being consistent with the rest of CMake does not mean consistency in the normal sense. The convention is _INCLUDE_DIRS plural and _LIBRARY_DIR singular. I can concoct a rationale: when linking to a library X, you may need includes from multiple dirs, but linking to a single libX.a or libX.so, not multiple libraries. But most likely, I'm telling a "just so story". Check FindXXXX.cmake own modules (in /usr/share somewhere), and go along with them. The naming of target properties is very unobvious. To place a .so or an executable binary to a directory you want, set its property
|
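The property name did not survive in the comment above, so the following is not a reconstruction of it. As a separate illustration, CMake documents per-target output-directory properties (`LIBRARY_OUTPUT_DIRECTORY` for shared libraries on Linux, `RUNTIME_OUTPUT_DIRECTORY` for executables), and a self-contained sketch of using them might look like this. All target names, file names and directories are hypothetical and unrelated to Kaldi's real layout.

```cmake
# Sketch: steer where a shared library and an executable are written.
# Every name below (demo-lib, demo-bin, out/...) is hypothetical.
cmake_minimum_required(VERSION 3.13)
project(output_dir_sketch LANGUAGES CXX)

# Generate two tiny placeholder sources so the sketch configures on its own.
file(WRITE "${CMAKE_BINARY_DIR}/demo_lib.cc" "int demo_answer() { return 42; }\n")
file(WRITE "${CMAKE_BINARY_DIR}/demo_main.cc"
     "int demo_answer();\nint main() { return demo_answer() == 42 ? 0 : 1; }\n")

add_library(demo-lib SHARED "${CMAKE_BINARY_DIR}/demo_lib.cc")
add_executable(demo-bin "${CMAKE_BINARY_DIR}/demo_main.cc")
target_link_libraries(demo-bin PRIVATE demo-lib)

# On Linux a shared library is a LIBRARY artifact; an executable is a RUNTIME
# artifact. (On Windows, DLLs count as RUNTIME artifacts instead.)
set_target_properties(demo-lib PROPERTIES
  LIBRARY_OUTPUT_DIRECTORY "${CMAKE_BINARY_DIR}/out/lib")
set_target_properties(demo-bin PROPERTIES
  RUNTIME_OUTPUT_DIRECTORY "${CMAKE_BINARY_DIR}/out/bin")
```

In a build meant to reproduce the make layout, the output directories would presumably point at the locations the scripts expect rather than at a hypothetical `out/` tree.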
I am really surprised anyone uses nnet1. Just yesterday I thought of removing it. If you do not mind, can you explain why this choice? What is your GPU? Do you really need sm_35? sm_35 is not a capability, it's an architecture. The capability is e.g. compute_35. Nothing prevents you from compiling capability 3.5 code for e.g. GTX 1080 (architecture 6.1) with And if you really stumble, and are using nnet1, check out Kaldi around 5.2 or 5.3. I think we dropped support for cap 3.5 about a year ago. |
Re kanban, I used to use Kerika for my TODO lists. I tried many different things, but to me, nothing works better than Microsoft OneNote. Drag items up or down or between boxes on a page, highlight in color--what else do you need? If you create a notebook on OneDrive, you can have it synced on all machines that you use. There are 2 OneNote apps: one comes with Windows 10, another with MS Office. Office is not free, but its OneNote is available somewhere as a free standalone download. I think if you get a trial of Office, you'll be able to install OneNote from it and use it after the trial expires, too. I'm just used to it. Maybe the "new" Windows 10 OneNote gives you the same functionality. I never bothered to check, really :) Office also includes Planner, which is more board-like. I do not know if it's available for free though. |
Hi @kkm000
So, final question is, is there any further blocking factor for my changes made that needs to be fixed? PS: For further CMake discussion, I have opened a free-issue feel free to discuss all kind of stuff there. |
Your PR is very good, actually, thanks! I liked the defensive style in the Python part, asserting the boolean type. CMake produces inconsistent cache, however, this is what we need to solve. Either skip find_library() for OpenFST altogether (but it's good, it will stop build right there if it does not exist), or force-remove its variable from cache, or cache the include and lib directory too. Two cached variables for testing is also bad practice. Since it's work in progress, I can merge almost anything. The whole CMake code set requires more cleanup than this. But it's better be correct than not when adding a feature. Review comments are not a blame, please! Definitely get an older version of Kaldi and older CUDA (9.0 would do), if your GPU is this old. You're using the parts of Kaldi not touched in years. Check out 5.4 and see if it just builds with CUDA 9, with make. CMake is not there yet when you just need a working Kaldi. |
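Of the options listed above, "force-remove its variable from cache" can be sketched as follows. This is one possible reading of that suggestion rather than the fix that was merged; the `REQUIRED` keyword assumes CMake 3.18 or newer.

```cmake
# Sketch: drop the stale cache entry so that find_library() searches again on
# every configure run instead of reusing a path found under earlier settings.
unset(OpenFST_LIBRARY CACHE)

find_library(OpenFST_LIBRARY
  NAMES fst
  PATHS "${CMAKE_SOURCE_DIR}/tools/openfst/lib"
  REQUIRED)
```

The trade-off is a repeated search on each configure, in exchange for never linking against a path left over from a previous configuration.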
OK, so do I need to take further action to finish this PR, or is it already good to go? I am new to GitHub PRs, so I don't know how this works. I still see a sign "1 review requesting changes"; I don't know whether that will block anything @@? |
I missed your changes, sorry. Merging.
Co-authored-by: davidlin409 <[email protected]> Co-authored-by: David2 Lin <[email protected]>
@davidlin409, I cannot log in to the Notion page I got an invitation for: access denied. But there is not so much little stuff that we won't be able to track it in a single ticket (possibly forking off X-ref'd tickets as needed). |
Hi @kkm000 For Notion, I also don't think it is adequate for issue-tracking-like stuff. On my personal fork, I started an issue for general discussion, as below: Also, you can create issues there. In my personal fork, I also opened a project for CMake. Does that fit our needs? |
This branch contains a fix for a CMake build bug that causes Kaldi shared libraries to not function properly. I hope it helps with the CMake progress.
This diff is exactly what I mentioned in the issue "Build System".