Unnecessary -1 for cache index #2

Open
andreasKroepelin opened this issue Jun 23, 2020 · 0 comments

andreasKroepelin commented Jun 23, 2020

In the section "Shared Memory and Synchronisation", the dot product kernel defines the variable cacheIndex as threadIdx().x - 1, but after that the code always refers to cacheIndex + 1. It would therefore be more concise to drop the -1 and use the 1-based index directly.

The updated code with my edits highlighted:

function dot(a,b,c, N, threadsPerBlock, blocksPerGrid)

    # Set up shared memory cache for this current block.
    cache = @cuDynamicSharedMem(Int64, threadsPerBlock)

    # Initialise some variables.
    tid = (threadIdx().x - 1) + (blockIdx().x - 1) * blockDim().x
    totalThreads = blockDim().x * gridDim().x
    cacheIndex = threadIdx().x # <<< HERE >>>
    temp = 0

    # Each thread accumulates its share of the element-wise products, striding by the total thread count
    while tid < N
        temp += a[tid + 1] * b[tid + 1]
        tid += totalThreads
    end

    # set cache values
    cache[cacheIndex] = temp # <<< HERE >>>

    # synchronise threads
    sync_threads()

    # In the step below, we add up all of the values stored in the cache
    i::Int = blockDim().x ÷ 2
    while i!=0
        if cacheIndex <= i # <<< HERE >>>
            cache[cacheIndex] += cache[cacheIndex + i] # <<< HERE >>>
        end
        sync_threads()
        i = i ÷ 2
    end

    # cache[1] now contains the sum of vector dot product calculations done in
    # this block, so we write it to c
    if cacheIndex == 1 # <<< HERE >>>
        c[blockIdx().x] = cache[1]
    end

    return nothing
end
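
For anyone who wants to check the index arithmetic without a GPU, here is a minimal CPU-only sketch of the reduction loop with the 1-based cacheIndex. The helper name reduce_block is made up for illustration and is not part of the tutorial. The inner for loop stands in for the threads of one block, and the sketch assumes the block size is a power of two, which the halving loop appears to rely on.

```julia
# CPU-only sketch of the block-level tree reduction with 1-based indices.
# `reduce_block` is a hypothetical helper, not part of the tutorial.
function reduce_block(values::Vector{Int})
    cache = copy(values)          # stands in for the shared-memory cache
    i = length(cache) ÷ 2
    while i != 0
        # On the GPU every thread runs this body at once, followed by
        # sync_threads(); here the threads are simulated one after another.
        for cacheIndex in 1:length(cache)
            if cacheIndex <= i
                cache[cacheIndex] += cache[cacheIndex + i]
            end
        end
        i ÷= 2
    end
    return cache[1]               # the block's partial sum
end

vals = collect(1:8)
@assert reduce_block(vals) == sum(vals)
```

Within one step, the threads that write (indices 1 to i) only read from indices i + 1 to 2i, so running them sequentially gives the same result as running them in parallel.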

Sorry for not opening a proper pull request; I thought this would be a bit easier.

Thank you for taking the time to write such a tutorial!
