Cannot configure devlink for an object target built from CUDA files #4992

Closed
TOMO-CAT opened this issue Apr 19, 2024 · 6 comments
@TOMO-CAT

Xmake version

xmake v2.8.5

Operating system version and architecture

Linux 720ce3a659a2 5.15.90.1-microsoft-standard-WSL2 #1 SMP Fri Jan 27 02:56:13 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux

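Describe the problem

According to the xmake documentation, devlink can be enabled through a policy for CUDA static targets:
image
For simple cases, the CUDA sources can be compiled directly into a static target:
image
That target goes through compile, devlink, and ar as expected:
image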
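
For readers without access to the screenshots, the static case described above might look roughly like the following. This is a minimal sketch, not the configuration from the screenshots: the target name `cuda_test_kernel_static` is hypothetical, and it reuses the `add_values("cuda.build.devlink", true)` setting that appears in the project configuration of this issue.

```lua
-- Hypothetical static-library variant of the CUDA target in this issue.
-- The target name is an assumption made for illustration.
target("cuda_test_kernel_static", function()
    -- A static target has an archive step, so the devlink step
    -- has a place to run before the objects are archived.
    set_kind("static")
    add_files("cuda_test_kernel.cu")
    -- Same devlink setting as in the project configuration of this issue.
    add_values("cuda.build.devlink", true)
end)
```

Building such a target with `-v` should show the compile, devlink, and ar commands in that order, which is the behavior reported as working here.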

However, if the CUDA sources are built as an object target, the devlink step is missing, even though I configured the corresponding policy:
image
image

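Expected result

I would like an object target built from CUDA sources, once the corresponding policy is configured, to generate the matching cuda_test_kernel_gpucode.cu.o and add it to target:objectfiles().

Project configuration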

target("zgpu.cuda_util.cuda_test_kernel", function()
    set_kind("object")
    add_files("cuda_test_kernel.cu")
    add_deps("zgpu.cuda_util.cuda_base_memory")
    set_optimize("faster")
    add_values("cuda.build.devlink", true)
end)

target("zgpu.cuda_util.cuda_base_memory_test", function()
    set_kind("binary")
    set_default(false)
    add_tests("default")
    add_files("cuda_base_memory_test.cc")
    add_deps("zgpu.cuda_util.cuda_base_memory",
             "zgpu.cuda_util.cuda_test_kernel")
    add_packages("gtest")
end)

Additional information and error logs

@TOMO-CAT TOMO-CAT added the bug label Apr 19, 2024
@Issues-translate-bot

Bot detected the issue body's language is not English, translate it automatically.


Title: The object target corresponding to the cuda file cannot be configured with devlink

@TOMO-CAT
Author

I tested the following and it works. It looks like the existing devlink support does not handle object targets well:

rule("cuda_devlink", function()
    on_config(function(target)
        import("core.platform.platform")
        -- get cuda sdk
        local cuda = assert(target:data("cuda"), "Cuda SDK not found!")
        -- add arch
        if target:is_arch("i386", "x86") then
            target:add("cuflags", "-m32", {force = true})
            target:add("culdflags", "-m32", {force = true})
        else
            target:add("cuflags", "-m64", {force = true})
            target:add("culdflags", "-m64", {force = true})
        end
        -- add rdc, @see https://github.com/xmake-io/xmake/issues/1975
        if target:values("cuda.rdc") ~= false then
            target:add("cuflags", "-rdc=true")
        end
        -- add links
        target:add("syslinks", "cudadevrt")
        local cudart = false
        for _, link in ipairs(table.join(target:get("links") or {},
                                         target:get("syslinks"))) do
            if link == "cudart" or link == "cudart_static" then
                cudart = true
                break
            end
        end
        if not cudart then target:add("syslinks", "cudart_static") end
        if target:is_plat("linux") then
            target:add("syslinks", "rt", "pthread", "dl")
        end
        target:add("linkdirs", cuda.linkdirs)
        target:add("rpathdirs", cuda.linkdirs)

        -- add includedirs
        target:add("includedirs", cuda.includedirs)
    end)

    after_build(function(target, opt)
        import("core.base.option")
        import("core.tool.linker")
        import("core.project.depend")
        import("utils.progress")

        -- load linker instance
        local linkinst = linker.load("gpucode", "cu", {target = target})
        -- init culdflags
        local culdflags = {"-dlink"}
        -- add shared flag
        if target:is_shared() then table.insert(culdflags, "-shared") end
        -- get link flags
        local linkflags = linkinst:linkflags({
            target = target,
            configs = {force = {culdflags = culdflags}}
        })
        -- get target file
        local targetfile = target:objectfile(
                               path.join("rules", "cuda", "devlink",
                                         target:basename() .. "_gpucode.cu"))
        -- get object files
        local objectfiles = nil
        for _, sourcebatch in pairs(target:sourcebatches()) do
            if sourcebatch.sourcekind == "cu" then
                objectfiles = sourcebatch.objectfiles
            end
        end
        if not objectfiles then return end
        -- insert gpucode.o to the object files
        table.insert(target:objectfiles(), targetfile)
        -- need build this target?
        local depfiles = objectfiles
        for _, dep in ipairs(target:orderdeps()) do
            if dep:kind() == "static" then
                if depfiles == objectfiles then
                    depfiles = table.copy(objectfiles)
                end
                table.insert(depfiles, dep:targetfile())
            end
        end
        local dryrun = option.get("dry-run")
        local depvalues = {linkinst:program(), linkflags}
        depend.on_changed(function()

            -- is verbose?
            local verbose = option.get("verbose")

            -- trace progress info
            progress.show(opt.progress,
                          "${color.build.target}devlinking.$(mode) %s",
                          path.filename(targetfile))

            -- trace verbose info
            if verbose then
                -- show the full link command with raw arguments, it will expand @xxx.args for msvc/link on windows
                print(linkinst:linkcmd(objectfiles, targetfile,
                                       {linkflags = linkflags, rawargs = true}))
            end

            -- link it
            if not dryrun then
                assert(linkinst:link(objectfiles, targetfile,
                                     {linkflags = linkflags}))
            end

        end, {
            dependfile = target:dependfile(targetfile),
            lastmtime = os.mtime(targetfile),
            changed = target:is_rebuilt(),
            values = depvalues,
            files = depfiles,
            dryrun = dryrun
        })
    end)
end)

target("zgpu.cuda_util.cuda_test_kernel", function()
    set_kind("object")
    add_files("cuda_test_kernel.cu")
    add_deps("zgpu.cuda_util.cuda_base_memory")
    set_optimize("faster")
    add_rules("cuda_devlink")
end)

@star-hengxing
Contributor

star-hengxing commented Apr 19, 2024

devlink is implemented as a rule that runs in before_link. My guess is that the object kind has no link step, so it never runs.

Maybe adding a phony link step for the object kind would be better?
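
One way to check this guess is a small probe rule that reports which hooks fire for an object target. The sketch below is illustrative only: the rule name `hook_probe` and the target name `probe_object` are hypothetical, and it relies on the rule hooks `before_link` and `after_build`.

```lua
-- Hypothetical probe rule: prints a line whenever one of the hooks runs.
rule("hook_probe", function()
    -- Expected to run only when the target has a link step.
    before_link(function(target)
        print("before_link: %s (%s)", target:name(), target:kind())
    end)
    -- Runs once the build of the target has finished.
    after_build(function(target)
        print("after_build: %s (%s)", target:name(), target:kind())
    end)
end)

-- Hypothetical object target used only to observe the hooks.
target("probe_object", function()
    set_kind("object")
    add_files("cuda_test_kernel.cu")
    add_rules("hook_probe")
end)
```

If only the `after_build` line shows up for `probe_object`, that would be consistent with the explanation above and with the workaround in this thread, which does its device link in `after_build`.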

@TOMO-CAT
Author

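devlink is implemented as a rule that runs in before_link. My guess is that the object kind has no link step, so it never runs.

Maybe adding a phony link step for the object kind would be better?

I think native support in xmake would still be better. For now I am working around it with a package rule.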

@waruqi
Member

waruqi commented Apr 25, 2024

I'll look into whether this can be improved when I have time later. I haven't had much time recently.

@Issues-translate-bot

Bot detected the issue body's language is not English, translate it automatically.


I can see if I can improve it when I have time later, I don’t have much time recently.
