Network performance issue (30 secs for 8mb PUT) #1409
Comments
Attachments are a known weak point of CouchDB. What happens if you attempt to PUT an 8MB JSON doc (not an attachment, just an 8MB document)?
I base64-encoded the file as text and did the PUT with application/text, and it was worse: the resulting 10MB text file takes 1 minute 29 seconds to PUT.
And I don't believe this issue has to do with attachments, as the same PUT that takes 1 min 29 secs takes less than 1 second when done from a client on the same local network as the CouchDB server. It must be some network packet size/timing issue in the network layer of CouchDB.
OK, good news: I found a solution to the performance issue. It turns out the recbuf setting is the cause. I had to comment out the recbuf param in mochiweb/mochiweb_socket_server.erl, and then the time to PUT a 10 MB text file went from 1 min 29 seconds to < 1 second! Could this be the cause of all the complaints about how slow CouchDB is for attachments?

I believe that not setting recbuf on Linux allows the operating system to handle it more efficiently. (When set, it seems to cause TCP window size performance issues that lead to the huge delay for larger PUTs.)

I also found PR mochi/mochiweb#153, which fixes mochiweb to allow an "undefined" setting for recbuf, but CouchDB 2.1.1 is not using this version. I tried this version with CouchDB, but CouchDB did not seem to allow undefined as a value for recbuf in the ini file.

So it would seem we need CouchDB to support setting recbuf to undefined, maybe even making that the default?

Thanks,
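For readers who want to see the effect of pinning a receive buffer, here is a minimal Python sketch (not CouchDB or mochiweb code). It compares the `SO_RCVBUF` the OS hands out by default with the value reported after setting it explicitly. The function name and the 8192-byte figure are illustrative assumptions, not values taken from this thread. On Linux, explicitly setting `SO_RCVBUF` also turns off receive-buffer autotuning for that socket, which is consistent with the behaviour described above.

```python
import socket


def recv_buffer_sizes(explicit_bytes=8192):
    """Return (os_default, after_explicit_set) SO_RCVBUF values for fresh TCP sockets.

    explicit_bytes is an arbitrary illustrative size, not a value from the report.
    """
    # Socket 1: leave the receive buffer alone and read what the OS chose.
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        default = s.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)

    # Socket 2: pin the receive buffer, then read back what the kernel reports.
    # (Linux reports roughly double the requested value to cover bookkeeping.)
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, explicit_bytes)
        pinned = s.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)

    return default, pinned


if __name__ == "__main__":
    default, pinned = recv_buffer_sizes()
    print(f"OS default SO_RCVBUF: {default} bytes")
    print(f"After explicit set:   {pinned} bytes")
```

The exact numbers depend on the operating system and its sysctl settings, so treat the output as a local observation rather than a reference value.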
I think this one dates back to https://issues.apache.org/jira/browse/COUCHDB-1986. I wasn't too involved with that investigation but when I read through the comments it seems that we didn't have a great reason for customizing the buffer size beyond the fact that it improved things in those specific scenarios where tests were timing out. Note that under the hood mochi was using a fixed CouchDB master has upgraded the mochiweb dependency to
@kocolosk thank you for the sleuthing and memory here! I'll get a PR up to change the default in master to
@stevedrew for reference, I help people run CouchDB on AWS all the time, and we've not seen this one crop up before, presumably because people are on low-latency links to AWS from their clients/app servers, whereas mochi/mochiweb#153 points the finger at high-latency links. So, thank you for the report!
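A rough back-of-the-envelope sketch of why latency matters here, under a deliberately simplified model: the sender can have at most one receive window of data in flight per round trip. The window size and round-trip times below are illustrative assumptions, not measurements from this issue, and real TCP (slow start, window scaling, delayed ACKs) behaves less neatly.

```python
def window_limited_seconds(size_bytes, window_bytes, rtt_seconds):
    """Lower bound on transfer time when one window is delivered per round trip."""
    round_trips = -(-size_bytes // window_bytes)  # ceiling division
    return round_trips * rtt_seconds


if __name__ == "__main__":
    size = 10 * 1024 * 1024  # a 10 MiB upload
    window = 8192            # hypothetical small fixed window, in bytes
    for label, rtt in (("remote link, 50 ms RTT", 0.05), ("local network, 0.5 ms RTT", 0.0005)):
        seconds = window_limited_seconds(size, window, rtt)
        print(f"{label}: at least {seconds:.2f} s")
```

Under these assumed numbers the same upload needs on the order of a minute over the 50 ms link but well under a second on the local network, which is the shape of the difference reported in this thread.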
@stevedrew this isn't the root cause for CouchDB's slow attachment behaviour universally, no. A lot of that slowness comes from the serialisation of attachment data both into the b-tree and over the wire. Internal attachment replication between nodes in a 2.x cluster is also unoptimized, and can block other operations on very large files, leading to database-wide issues. We still recommend keeping attachments in CouchDB below 10MB per document, which you have already met, hooray! There is an ongoing discussion about how to help guide users towards these defaults, see #1200 and #1253.
Gotcha, thanks. Yes, in fact our attachments are actually less than 1MB, but we noticed very slow network transfers even with that, and then tested with 8MB to draw out the problem.
Due to the concerns about changing the defaults here, and the need to get a 2.2.0 release out the door, the change to CouchDB's defaults will not happen until post-2.2.0. |
Expected Behavior
Should complete in a reasonable amount of time, i.e. < 1 second (it's only an 8 MB file).
Current Behavior
The above command, if issued from a remote client (but over a very fast network), can take up to 30 seconds.
The same command issued from a client on the same network takes < 1 second.
The network is very fast between servers; an scp of the same file takes < 0.1 seconds.
Tested on dedicated servers and multiple AWS servers. Always the same result.
It seems like the CouchDB network layer may be sending too small a packet?
I have configured a total of 6 servers, a mix of AWS and dedicated; all experience the same slowness.
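A minimal sketch of how the PUT timing could be compared from a remote client and a same-network client, using only Python's standard library. The helper name `time_put`, and the host, port, and path in the usage note, are hypothetical placeholders and are not taken from this report.

```python
import http.client
import time


def time_put(host, port, path, body, content_type="application/octet-stream"):
    """PUT `body` to http://host:port/path and return (status, elapsed_seconds)."""
    conn = http.client.HTTPConnection(host, port, timeout=120)
    try:
        start = time.perf_counter()
        conn.request("PUT", path, body=body, headers={"Content-Type": content_type})
        response = conn.getresponse()
        response.read()  # drain the reply so the timing covers the whole exchange
        elapsed = time.perf_counter() - start
        return response.status, elapsed
    finally:
        conn.close()
```

For example, `time_put("couch.example.org", 5984, "/testdb/doc1/blob.bin?rev=...", payload)` could be run once from each client location with the same payload; the rev value is left elided because it depends on the target document.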
Steps to Reproduce (for bugs)
Your Environment
Just single nodes and no special configuration.