Skip to content
This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

mxnet.nd.random.multinomial is very slow on CPU and stucked on GPU #15231

Closed
chinakook opened this issue Jun 13, 2019 · 9 comments
Closed

mxnet.nd.random.multinomial is very slow on CPU and stucked on GPU #15231

chinakook opened this issue Jun 13, 2019 · 9 comments

Comments

@chinakook
Copy link
Contributor

chinakook commented Jun 13, 2019

MXNet version is mxnet-cu100mkl 1.5.0b20190605(also tested on mxnet-cu100 1.4.1)

import time
import numpy as np
import mxnet as mx


p = [1 / 300000.] * 300000

probs = np.array(p)
t0 = time.time()
sample_times = np.random.multinomial(len(probs), np.array(probs))
t1 = time.time()
print("Numpy MultiNomial Time", t1 - t0)

probs = mx.nd.array(p, ctx=mx.cpu())
t0 = time.time()
indices = mx.nd.random.multinomial(probs, len(probs))
mx.nd.waitall()
t1 = time.time()
print("MXNet CPU MultiNomial Time", t1 - t0)

# stucked
# probs = mx.nd.array(p, ctx=mx.gpu(0))
# t0 = time.time()
# indices = mx.nd.random.multinomial(probs, len(probs))
# mx.nd.waitall()
# t1 = time.time()
# print("MXNet GPU MultiNomial Time", t1 - t0)

The result is

Numpy MultiNomial Time 0.01881909
MXNet CPU MultiNomial Time 43.93881797
@mxnet-label-bot
Copy link
Contributor

Hey, this is the MXNet Label Bot.
Thank you for submitting the issue! I will try and suggest some labels so that the appropriate MXNet community members can help resolve it.
Here are my recommended labels: Performance

@pengzhao-intel
Copy link
Contributor

@zixuanweeei could you take a look for the CPU performance?

@zixuanweeei
Copy link
Contributor

While np.random.multinomial gives the frequency of each outcome drawn from n experiments (sampling without replacement), mx.nd.random.multinomial gives the outcomes sampled from the multinomial distribution (sampling with replacement). At first thought, they produce the result in different ways. mx.nd.random.multinomial uses a very simple method as described in this link. The complexity of the method above can be O(N^2) for sampling K times from K possible outcomes with replacement. So it will be time-consuming for a large value of K as 300000 in your script. Any method can be taken to optimize the sampling method after that we further analysis the complexity and take some surveys on advance sampling algorithms.
Any insights of this?

@chinakook
Copy link
Contributor Author

You can get nearly the same result as mx.nd.random.multinomial, but very swift.

        sample_list = []
        p = [1 / 300000.] * 300000

        probs = np.array(p)
        sample_times = np.random.multinomial(len(probs), np.array(probs))
        for i, t in enumerate(sample_times):
            if t > 0:
                sample_list.extend([i]*t)
        random.shuffle(sample_list)

@chinakook
Copy link
Contributor Author

pyt*ch version

t0 = time.time()
p = [1/300000.] * 300000
probs = torch.as_tensor(p)
indices = torch.multinomial(probs, len(probs), True)
print("Pytorch output:\t\t", indices.tolist())
t1 = time.time()
print(t1 - t0)

@zixuanweeei
Copy link
Contributor

zixuanweeei commented Jun 17, 2019

@chinakook Thanks for your examples. It's informative. Pytorch uses a binary search for the slot in which the prob falls, while it uses linear search in MXNet. Pytorch uses the alias method as well. But I am not sure whether it works for drawing samples just once. We will optimize it using binary search at this moment.

@zixuanweeei
Copy link
Contributor

We used binary search to optimize mx.nd.random.multinomial and it got interesting result. On Xeon Platinum 8180 platform, it costs ~0.0535s for the example @chinakook provided, while it costs ~59s before our optimization. By the way, the numpy one costs ~0.0279s.

Thanks for @chinakook 's examples. The optimization requires further verification. And the PR is on the way:tada:.

@zixuanweeei
Copy link
Contributor

@chinakook Thanks for the issue. You can close it if there are no problems. BTW, considering that

... the binomial distribution describes the probability of k successes in n draws with replacement.

from this, I think np.random.multinomial also draws samples with replacement. Sorry for the mistake in my reply.

Any insights of this?

@chinakook
Copy link
Contributor Author

No problem.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

4 participants