Reading a huge file line-by-line, Go is eightfold slower than PHP7? #13730

Closed
anthonyfok opened this issue Dec 25, 2015 · 5 comments

Comments

@anthonyfok

@lixuancn started a thread (in Chinese) at astaxie/build-web-application-with-golang#578, saying that he has written two equivalent programs, one in Go, one in PHP, to read a 13GB text file, split by space into an array, and sum up one of the fields.

He asked the same question on http://www.oschina.net/question/938918_2145778 with posted source code. The rough benchmark that he posted is as follows:

Go: 300 seconds
PHP5.6: 200 seconds
PHP7: 47 seconds

Another user did a similar test, albeit on a much smaller 200MB file, and noticed that:

  • Go, bufio.ReadBytes('\n'): 0.6 second
  • Go, bufio.ReadLine(): 0.1 second
  • PHP: 5.5.9-1ubuntu4.9, Zend Engine v2.5.0: 0.3 second

Yet another noted that the discrepancy between ReadLine() and ReadBytes('\n') may have to do with the fact that the latter does an extra copy operation.

I have not done any testing personally, and @lixuancn did not provide the actual test data, but if @lixuancn's result that Go could be eightfold slower than PHP7 is true, that is rather disconcerting.

If ReadBytes('\n') were indeed 6 times slower than ReadLine(), then perhaps the following "advice" in https://golang.org/pkg/bufio/#Reader.ReadLine is not to be relied on in performance-critical situations?

ReadLine is a low-level line-reading primitive. Most callers should use ReadBytes('\n') or ReadString('\n') instead or use a Scanner.

Sorry for not digging into this further, but I thought I should report this issue here.
Hope you can point out any possible way of optimizing @lixuancn's code, or see if any optimization can be done in Go itself.

Many thanks!
Anthony

@davecheney
Contributor

Can you please take this to the mailing list? The issue tracker is only for bugs, and it is not clear from this report that there is an issue to be addressed.

Thanks

Dave

@bradfitz
Contributor

Links to the mailing list and other forums are in: https://golang.org/wiki/Questions

@davecheney
Contributor

@wheelcomplex please do not continue to discuss this on a closed issue.

@wheelcomplex
Contributor

Ok, sorry for that.

@lixuancn

@anthonyfok Thanks for your interest

@golang golang locked and limited conversation to collaborators Dec 28, 2015