Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Connections keep cycling #4

Closed
nabeelio opened this issue Jul 4, 2013 · 19 comments
Closed

Connections keep cycling #4

nabeelio opened this issue Jul 4, 2013 · 19 comments
Labels

Comments

@nabeelio
Copy link

nabeelio commented Jul 4, 2013

Another one... could just be me doing something wrong again. The same code that basically was in the other bug.

module.exports.query = function(query, values) {

    var deferred = when.defer();

    if(!_.isArray(values)) {
        values = [];
    }

    var args = [query];
    args.push(values);

    var type = query.charAt(0).toLowerCase();
    if (type !== 's') {
        args.push(Cassandra.types.consistencies.any);
    }

    // Le' callback
    args.push(function(err, results) {
        if(err) {
            logger.error(err);
            return deferred.reject(err);
        }

        return deferred.resolve(results);
    });

    _.log('cassandra::binary: Executing query:', query);
    _cqlClient.execute.apply(_cqlClient, args);
    return deferred.promise;
};

This keeps cycling:

2013-07-04T12:58:23-04:00 'cassandra::binary: Executing query:' 'SELECT id,json FROM filters LIMIT 300;'
2013-07-04T12:58:23-04:00 'info' 'Connecting to all hosts'
2013-07-04T12:58:23-04:00 'info' 'Opened connection #0'
2013-07-04T12:58:23-04:00 'info' 'Checking next connection'
2013-07-04T12:58:23-04:00 'info' 'connection #0 aquired, executing: '
2013-07-04T12:58:23-04:00 'error' 'Connection #0 was set to Unhealthy'
2013-07-04T12:58:23-04:00 'info' 'Retrying execute'
2013-07-04T12:58:23-04:00 'info' 'Checking next connection'
2013-07-04T12:58:23-04:00 'info' 'Checking next connection'
2013-07-04T12:58:23-04:00 'info' 'Checking next connection'
2013-07-04T12:58:23-04:00 'info' 'Checking next connection'
2013-07-04T12:58:23-04:00 'info' 'Checking next connection'
2013-07-04T12:58:23-04:00 'info' 'Checking next connection'
2013-07-04T12:58:23-04:00 'info' 'Checking next connection'
2013-07-04T12:58:23-04:00 'info' 'Checking next connection'
2013-07-04T12:58:23-04:00 'info' 'Checking next connection'
2013-07-04T12:58:23-04:00 'info' 'Checking next connection'
2013-07-04T12:58:23-04:00 'info' 'Checking next connection'
2013-07-04T12:58:23-04:00 'info' 'Checking next connection'
2013-07-04T12:58:23-04:00 'info' 'Checking next connection'
2013-07-04T12:58:23-04:00 'info' 'Checking next connection'
2013-07-04T12:58:23-04:00 'info' 'Checking next connection'
2013-07-04T12:58:23-04:00 'info' 'Checking next connection'
2013-07-04T12:58:24-04:00 'info' 'Checking next connection'
2013-07-04T12:58:24-04:00 'info' 'Checking next connection'
2013-07-04T12:58:24-04:00 'info' 'Checking next connection'
2013-07-04T12:58:24-04:00 'info' 'Checking next connection'
2013-07-04T12:58:24-04:00 'info' 'Checking next connection'
2013-07-04T12:58:24-04:00 'info' 'Retrying to open #0'
2013-07-04T12:58:24-04:00 'info' 'Connection #0 was set to healthy'
2013-07-04T12:58:24-04:00 'info' 'connection #0 aquired, executing: '
2013-07-04T12:58:24-04:00 'error' 'Connection #0 was set to Unhealthy'
@jorgebay
Copy link
Owner

jorgebay commented Jul 4, 2013

Mmmmm.... its get connected but it gets an error executing that is setting the flag to connection "unhealthy"...
(TODO: there should be a maximum amount of retries to execute!)

Could you emit the error, modifying the line 193 of the index.js file to include the err:

self.emit('log', 'info', 'Retrying execute', err);

And post the error code?

@jorgebay
Copy link
Owner

jorgebay commented Jul 4, 2013

These are the error codes that set the connection as "unhealthy". At first glance, it looks correct.

@nabeelio
Copy link
Author

nabeelio commented Jul 4, 2013

2013-07-04T14:05:58-04:00 'info' 'Retrying execute' { code: 4096,
  message: 'Cannot achieve consistency level ALL',
  isServerUnhealthy: true }

@nabeelio
Copy link
Author

nabeelio commented Jul 4, 2013

So I changed it to Cassandra.types.consistencies.any

2013-07-04T14:06:45-04:00 'info' 'connection #0 aquired, executing: '
[2013-07-04T18:06:46.657Z]ERROR: populate_cache/10452: Trying to access beyond buffer length
    RangeError: Trying to access beyond buffer length
        at checkOffset (buffer.js:582:11)
        at Buffer.readInt32BE (buffer.js:750:5)
        at FrameReader.readInt (/mnt/hgfs/Projects/Liveclips/dev/odrt/www/node_modules/node-cassandra-cql/lib/readers.js:22:26)
        at FrameReader.readBytes (/mnt/hgfs/Projects/Liveclips/dev/odrt/www/node_modules/node-cassandra-cql/lib/readers.js:81:24)
        at FrameReader.parseRows (/mnt/hgfs/Projects/Liveclips/dev/odrt/www/node_modules/node-cassandra-cql/lib/readers.js:248:28)
        at FrameReader.parseResult (/mnt/hgfs/Projects/Liveclips/dev/odrt/www/node_modules/node-cassandra-cql/lib/readers.js:214:23)
        at ResponseHandler.handle (/mnt/hgfs/Projects/Liveclips/dev/odrt/www/node_modules/node-cassandra-cql/lib/readers.js:402:23)
        at Connection.handleData (/mnt/hgfs/Projects/Liveclips/dev/odrt/www/node_modules/node-cassandra-cql/lib/connection.js:158:18)
        at Socket.EventEmitter.emit (events.js:95:17)
        at Socket.<anonymous> (_stream_readable.js:736:14)

Setting it to Cassandra.types.consistencies.one - which I use with SELECT on the Thrift driver:

2013-07-04T14:08:20-04:00 'info' 'connection #0 aquired, executing: '
[2013-07-04T18:08:21.088Z]ERROR: populate_cache/10513: Trying to access beyond buffer length
    RangeError: Trying to access beyond buffer length
        at checkOffset (buffer.js:582:11)
        at Buffer.readInt32BE (buffer.js:750:5)
        at FrameReader.readInt (/mnt/hgfs/Projects/Liveclips/dev/odrt/www/node_modules/node-cassandra-cql/lib/readers.js:22:26)
        at FrameReader.readBytes (/mnt/hgfs/Projects/Liveclips/dev/odrt/www/node_modules/node-cassandra-cql/lib/readers.js:81:24)
        at FrameReader.parseRows (/mnt/hgfs/Projects/Liveclips/dev/odrt/www/node_modules/node-cassandra-cql/lib/readers.js:248:28)
        at FrameReader.parseResult (/mnt/hgfs/Projects/Liveclips/dev/odrt/www/node_modules/node-cassandra-cql/lib/readers.js:214:23)
        at ResponseHandler.handle (/mnt/hgfs/Projects/Liveclips/dev/odrt/www/node_modules/node-cassandra-cql/lib/readers.js:402:23)
        at Connection.handleData (/mnt/hgfs/Projects/Liveclips/dev/odrt/www/node_modules/node-cassandra-cql/lib/connection.js:158:18)
        at Socket.EventEmitter.emit (events.js:95:17)
        at Socket.<anonymous> (_stream_readable.js:736:14)

@nabeelio
Copy link
Author

nabeelio commented Jul 4, 2013

This is that table's schema:

CREATE TABLE filters (
  id text PRIMARY KEY,
  created timestamp,
  domain text,
  json text,
  protobuf blob,
  search_spec text,
  type text,
  updated timestamp
) WITH
  bloom_filter_fp_chance=0.010000 AND
  caching='KEYS_ONLY' AND
  comment='' AND
  dclocal_read_repair_chance=0.000000 AND
  gc_grace_seconds=864000 AND
  read_repair_chance=0.100000 AND
  replicate_on_write='true' AND
  populate_io_cache_on_flush='false' AND
  compaction={'class': 'SizeTieredCompactionStrategy'} AND
  compression={'sstable_compression': 'SnappyCompressor'};

CREATE INDEX domainFiltersIdx ON filters (domain);

@nabeelio
Copy link
Author

nabeelio commented Jul 4, 2013

Wrapped it in a util.inspect(buf, {depth: null}), not sure why it's showing a ...>

2013-07-04T14:11:31-04:00 'info' 'connection #0 aquired, executing: '
<Buffer 81 00 1f 08 00 01 98 88 00 00 00 02 00 00 00 01 00 00 00 02 00 05 6f 72 69 6f 6e 00 07 66 69 6c 74 65 72 73 00 02 69 64 00 0d 00 04 6a 73 6f 6e 00 0d 00 ...>
<Buffer 81 00 1f 08 00 01 98 88 00 00 00 02 00 00 00 01 00 00 00 02 00 05 6f 72 69 6f 6e 00 07 66 69 6c 74 65 72 73 00 02 69 64 00 0d 00 04 6a 73 6f 6e 00 0d 00 ...>
<Buffer 81 00 1f 08 00 01 98 88 00 00 00 02 00 00 00 01 00 00 00 02 00 05 6f 72 69 6f 6e 00 07 66 69 6c 74 65 72 73 00 02 69 64 00 0d 00 04 6a 73 6f 6e 00 0d 00 ...>
<Buffer 81 00 1f 08 00 01 98 88 00 00 00 02 00 00 00 01 00 00 00 02 00 05 6f 72 69 6f 6e 00 07 66 69 6c 74 65 72 73 00 02 69 64 00 0d 00 04 6a 73 6f 6e 00 0d 00 ...>
<Buffer 81 00 1f 08 00 01 98 88 00 00 00 02 00 00 00 01 00 00 00 02 00 05 6f 72 69 6f 6e 00 07 66 69 6c 74 65 72 73 00 02 69 64 00 0d 00 04 6a 73 6f 6e 00 0d 00 ...>
<Buffer 81 00 1f 08 00 01 98 88 00 00 00 02 00 00 00 01 00 00 00 02 00 05 6f 72 69 6f 6e 00 07 66 69 6c 74 65 72 73 00 02 69 64 00 0d 00 04 6a 73 6f 6e 00 0d 00 ...>
<Buffer 81 00 1f 08 00 01 98 88 00 00 00 02 00 00 00 01 00 00 00 02 00 05 6f 72 69 6f 6e 00 07 66 69 6c 74 65 72 73 00 02 69 64 00 0d 00 04 6a 73 6f 6e 00 0d 00 ...>
<Buffer 81 00 1f 08 00 01 98 88 00 00 00 02 00 00 00 01 00 00 00 02 00 05 6f 72 69 6f 6e 00 07 66 69 6c 74 65 72 73 00 02 69 64 00 0d 00 04 6a 73 6f 6e 00 0d 00 ...>
<Buffer 81 00 1f 08 00 01 98 88 00 00 00 02 00 00 00 01 00 00 00 02 00 05 6f 72 69 6f 6e 00 07 66 69 6c 74 65 72 73 00 02 69 64 00 0d 00 04 6a 73 6f 6e 00 0d 00 ...>
<Buffer 81 00 1f 08 00 01 98 88 00 00 00 02 00 00 00 01 00 00 00 02 00 05 6f 72 69 6f 6e 00 07 66 69 6c 74 65 72 73 00 02 69 64 00 0d 00 04 6a 73 6f 6e 00 0d 00 ...>
<Buffer 81 00 1f 08 00 01 98 88 00 00 00 02 00 00 00 01 00 00 00 02 00 05 6f 72 69 6f 6e 00 07 66 69 6c 74 65 72 73 00 02 69 64 00 0d 00 04 6a 73 6f 6e 00 0d 00 ...>
<Buffer 81 00 1f 08 00 01 98 88 00 00 00 02 00 00 00 01 00 00 00 02 00 05 6f 72 69 6f 6e 00 07 66 69 6c 74 65 72 73 00 02 69 64 00 0d 00 04 6a 73 6f 6e 00 0d 00 ...>
<Buffer 81 00 1f 08 00 01 98 88 00 00 00 02 00 00 00 01 00 00 00 02 00 05 6f 72 69 6f 6e 00 07 66 69 6c 74 65 72 73 00 02 69 64 00 0d 00 04 6a 73 6f 6e 00 0d 00 ...>
<Buffer 81 00 1f 08 00 01 98 88 00 00 00 02 00 00 00 01 00 00 00 02 00 05 6f 72 69 6f 6e 00 07 66 69 6c 74 65 72 73 00 02 69 64 00 0d 00 04 6a 73 6f 6e 00 0d 00 ...>
<Buffer 81 00 1f 08 00 01 98 88 00 00 00 02 00 00 00 01 00 00 00 02 00 05 6f 72 69 6f 6e 00 07 66 69 6c 74 65 72 73 00 02 69 64 00 0d 00 04 6a 73 6f 6e 00 0d 00 ...>
<Buffer 81 00 1f 08 00 01 98 88 00 00 00 02 00 00 00 01 00 00 00 02 00 05 6f 72 69 6f 6e 00 07 66 69 6c 74 65 72 73 00 02 69 64 00 0d 00 04 6a 73 6f 6e 00 0d 00 ...>
<Buffer 81 00 1f 08 00 01 98 88 00 00 00 02 00 00 00 01 00 00 00 02 00 05 6f 72 69 6f 6e 00 07 66 69 6c 74 65 72 73 00 02 69 64 00 0d 00 04 6a 73 6f 6e 00 0d 00 ...>
<Buffer 81 00 1f 08 00 01 98 88 00 00 00 02 00 00 00 01 00 00 00 02 00 05 6f 72 69 6f 6e 00 07 66 69 6c 74 65 72 73 00 02 69 64 00 0d 00 04 6a 73 6f 6e 00 0d 00 ...>
<Buffer 81 00 1f 08 00 01 98 88 00 00 00 02 00 00 00 01 00 00 00 02 00 05 6f 72 69 6f 6e 00 07 66 69 6c 74 65 72 73 00 02 69 64 00 0d 00 04 6a 73 6f 6e 00 0d 00 ...>
<Buffer 81 00 1f 08 00 01 98 88 00 00 00 02 00 00 00 01 00 00 00 02 00 05 6f 72 69 6f 6e 00 07 66 69 6c 74 65 72 73 00 02 69 64 00 0d 00 04 6a 73 6f 6e 00 0d 00 ...>
<Buffer 81 00 1f 08 00 01 98 88 00 00 00 02 00 00 00 01 00 00 00 02 00 05 6f 72 69 6f 6e 00 07 66 69 6c 74 65 72 73 00 02 69 64 00 0d 00 04 6a 73 6f 6e 00 0d 00 ...>
<Buffer 81 00 1f 08 00 01 98 88 00 00 00 02 00 00 00 01 00 00 00 02 00 05 6f 72 69 6f 6e 00 07 66 69 6c 74 65 72 73 00 02 69 64 00 0d 00 04 6a 73 6f 6e 00 0d 00 ...>
<Buffer 81 00 1f 08 00 01 98 88 00 00 00 02 00 00 00 01 00 00 00 02 00 05 6f 72 69 6f 6e 00 07 66 69 6c 74 65 72 73 00 02 69 64 00 0d 00 04 6a 73 6f 6e 00 0d 00 ...>
<Buffer 81 00 1f 08 00 01 98 88 00 00 00 02 00 00 00 01 00 00 00 02 00 05 6f 72 69 6f 6e 00 07 66 69 6c 74 65 72 73 00 02 69 64 00 0d 00 04 6a 73 6f 6e 00 0d 00 ...>
<Buffer 81 00 1f 08 00 01 98 88 00 00 00 02 00 00 00 01 00 00 00 02 00 05 6f 72 69 6f 6e 00 07 66 69 6c 74 65 72 73 00 02 69 64 00 0d 00 04 6a 73 6f 6e 00 0d 00 ...>
<Buffer 81 00 1f 08 00 01 98 88 00 00 00 02 00 00 00 01 00 00 00 02 00 05 6f 72 69 6f 6e 00 07 66 69 6c 74 65 72 73 00 02 69 64 00 0d 00 04 6a 73 6f 6e 00 0d 00 ...>
<Buffer 81 00 1f 08 00 01 98 88 00 00 00 02 00 00 00 01 00 00 00 02 00 05 6f 72 69 6f 6e 00 07 66 69 6c 74 65 72 73 00 02 69 64 00 0d 00 04 6a 73 6f 6e 00 0d 00 ...>
<Buffer 81 00 1f 08 00 01 98 88 00 00 00 02 00 00 00 01 00 00 00 02 00 05 6f 72 69 6f 6e 00 07 66 69 6c 74 65 72 73 00 02 69 64 00 0d 00 04 6a 73 6f 6e 00 0d 00 ...>
<Buffer 81 00 1f 08 00 01 98 88 00 00 00 02 00 00 00 01 00 00 00 02 00 05 6f 72 69 6f 6e 00 07 66 69 6c 74 65 72 73 00 02 69 64 00 0d 00 04 6a 73 6f 6e 00 0d 00 ...>
[2013-07-04T18:11:31.437Z]ERROR: populate_cache/10661: Trying to access beyond buffer length
    RangeError: Trying to access beyond buffer length

@nabeelio
Copy link
Author

nabeelio commented Jul 4, 2013

Sorry for all the comments, just trying to also see what is up:

this.readInt = function() {
        console.log('Buffer Length: ', buf.length, 'Offset: ', offset);
        var result = buf.readInt32BE(offset);
        offset += 4;
        return result;
    };
Buffer Length:  2712 Offset:  8
Buffer Length:  2712 Offset:  12
Buffer Length:  2712 Offset:  16
Buffer Length:  2712 Offset:  50
Buffer Length:  2712 Offset:  54
Buffer Length:  2712 Offset:  85
Buffer Length:  2712 Offset:  508
Buffer Length:  2712 Offset:  538
Buffer Length:  2712 Offset:  695
Buffer Length:  2712 Offset:  726
Buffer Length:  2712 Offset:  1123
Buffer Length:  2712 Offset:  1154
Buffer Length:  2712 Offset:  1312
Buffer Length:  2712 Offset:  1343
Buffer Length:  2712 Offset:  1501
Buffer Length:  2712 Offset:  1531
Buffer Length:  2712 Offset:  1953
Buffer Length:  2712 Offset:  1982
Buffer Length:  2712 Offset:  2366
Buffer Length:  2712 Offset:  2397
Buffer Length:  2712 Offset:  2811
[2013-07-04T18:13:38.845Z]ERROR: populate_cache/10743: Trying to access beyond buffer length
    RangeError: Trying to access beyond buffer length

@jorgebay
Copy link
Owner

jorgebay commented Jul 4, 2013

First conclusion: Error code 0x1000 (4096) is not a "fatal error".
Second: For sure there is something wrong with the FrameParser class for one of the datatypes in the table.

I will reproduce it tomorrow and try to fix it (you are more than welcome to send a pull request if you'ld like!)
Thanks for your great help!

@nabeelio
Copy link
Author

nabeelio commented Jul 4, 2013

No problem, I will try to look at it in some more detail

@jorgebay
Copy link
Owner

jorgebay commented Jul 5, 2013

Hi,
I created a unit test where I created the table filters, insert data and retrieve data and I wasn't able to reproduce (I had another issues and fixed them, but not this one).
The bug is raised when trying to parse specifically 1 of the rows inside your column family (Is it the row 30?). Could you try to LIMIT to 10 then to 20 ...etc?

I will continue to try to reproduce it...

@jorgebay
Copy link
Owner

jorgebay commented Jul 5, 2013

Hi,
I could reproduce it!!
When messages are large, data event on net.Sockets is chunked (d'oh!)

The parsing needs to be changed to support this...

@jorgebay
Copy link
Owner

jorgebay commented Jul 6, 2013

I implemented partial data handling, I tested using 100K messages and it looks good!

@nabeelio
Copy link
Author

nabeelio commented Jul 6, 2013

Sweet, thanks! I'm about to test this right now

@nabeelio
Copy link
Author

nabeelio commented Jul 7, 2013

All I'm seeing now is:

2013-07-06T20:00:09-04:00 'info' 'Waiting for pool to connect'

One for every query - I've got about 1200+ that go async, but I'm just getting the above.. I'm able to connect with Thrift (I have a generic "driver" so I can swap). Binary protocol is enabled server-side... it's the same settings I had as before

@jorgebay
Copy link
Owner

jorgebay commented Jul 8, 2013

I added a write queue for sending the packets to the socket.
I made some tests using something like and it looks good:

async.times(100, function(n, next){
  client.execute('select * from filters;', [], function (err, result) {/*...*/});
  //do not wait until it is callback'ed
  next();
});
//do not shutdown until all has been callback'ed :)

Let me know if it works as expected.
Thanks for your great help!!!

@nabeelio
Copy link
Author

Great, thanks! Appears to be working. I'm not sure if this is indicative of a leak.

screen shot 2013-07-10 at 3 46 14 am

@nabeelio
Copy link
Author

Same chain for every Buffer obj, I believe that originates from you

screen shot 2013-07-10 at 4 05 17 am

@jorgebay
Copy link
Owner

Yes, it's leaking fast! I was checking it the past few days.
If its OK to you, I will close this issue and open a new one for better tracking.

Thanks again for such great help!!!!!!

@nabeelio
Copy link
Author

Not a problem at all, glad I can help. I would try to be helping you fix
actual code, but I'm neck deep in other issues :/

Thanks for being so responsive!
On Jul 10, 2013 4:20 AM, "Jorge Bay Gondra" [email protected]
wrote:

Yes, it's leaking fast! I was checking it the past few days.
If its OK to you, I will close this issue and open a new one for better
tracking.

Thanks again for such great help!!!!!!


Reply to this email directly or view it on GitHubhttps://github.com//issues/4#issuecomment-20727765
.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants