Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Checkpoint speedup #1908

Merged
merged 1 commit into from
Jan 19, 2016
Merged

Checkpoint speedup #1908

merged 1 commit into from
Jan 19, 2016

Conversation

Amir-61
Copy link
Contributor

@Amir-61 Amir-61 commented Dec 22, 2015

@bajtos This is still a WIP. Please let me know if I'm going the right direction.

Problem:

The previous checkpoint implementation was storing all of the previous checkpoints in DB and every time we needed to get the current checkpoint we needed to query all the checkpoints, sort them in descending order and finally get the first checkpoint; in other words, the algorithm was naive and very slow, for every time we needed to query all the checkpoints, which are not indexed nor are they all needed.

Fix:

Now we just save one global variable in DB for checkpoint and update it whenever needed.

Connect to #1864

Other notes:

We obviously expect the test cases to fail since findOrCreate for memory connector is implemented in another PR under loopback-datasource-juggler - see loopbackio/loopback-datasource-juggler#815

Thanks!

}
});
],
cb);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This particular part is pretty simple and can be coded without async. Keep in mind that async adds a bit of overhead, so it's better to not overuse it.

Could you please rewrite this without async?

@Amir-61
Copy link
Contributor Author

Amir-61 commented Jan 6, 2016

Hey @bajtos,

Could you please review this patch?

Thanks a lot!

@Amir-61 Amir-61 assigned bajtos and unassigned Amir-61 Jan 6, 2016
@bajtos bajtos changed the title WIP: Checkpoint speedup Checkpoint speedup Jan 6, 2016
model.seq = seq + 1;
next();
/**
* Increase the current checkpoint If it already exists otherwise initialize it
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo, should be ... checkpoint if it ... (lower-case i)

@bajtos
Copy link
Member

bajtos commented Jan 6, 2016

@Amir-61 reviewed, see the comment above. You are heading in the right direction!

@bajtos
Copy link
Member

bajtos commented Jan 6, 2016

@Amir-61 could you please add a test to verify that current does not create more than one checkpoint, as I outlined in my earlier comment #1908 (comment)?

async.parallel([
  function(next) { Checkpoint.current(next); },
  function(next) { Checkpoint.current(next); }
], function(err, list) {
  if (err) return done(err);
  Checkpoint.find(function(err) {
    if (err) return done(err);
    expect(list).to.have.length(1);
    done();
  });
});

@bajtos
Copy link
Member

bajtos commented Jan 6, 2016

Possibly add a test to verify what happens when bump is called in parallel too. There should be still only one checkpoint instance, and the seq number should have been increased either by 1 or 2 (both outcomes are valid according the reasoning behind our implementation).

@bajtos bajtos assigned Amir-61 and unassigned bajtos Jan 6, 2016
@Amir-61
Copy link
Contributor Author

Amir-61 commented Jan 7, 2016

@Amir-61 could you please add a test to verify that current does not create more than one checkpoint, as I outlined in my earlier comment #1908 (comment)?

async.parallel([
  function(next) { Checkpoint.current(next); },
  function(next) { Checkpoint.current(next); }
], function(err, list) {
  if (err) return done(err);
  Checkpoint.find(function(err) {
    if (err) return done(err);
    expect(list).to.have.length(1);
    done();
  });
});

@bajtos I guess you meant this assertion: expect(data).to.have.length(1); in the following code. Am I right?

      async.parallel([
        function(next) { Checkpoint.current(next); },
        function(next) { Checkpoint.current(next); }
      ], function(err, list) {
        if (err) return done(err);
        Checkpoint.find(function(err, data) {
          if (err) return done(err);
          expect(data).to.have.length(1);
          done();
        });
      });

// We expect `seq` to be 2 since `checkpoint` does not exist and
// `bumpLastSeq` for the first time not only initializes it to one,
// but also increments the initialized value by one.
expect(cp.seq).to.equal(2);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bajtos, Please note I can check if the checkpoint is a new instance or not and if it is new I can just initialize it to 1 as opposed to initialize it and increment the initialized value by one; something like:

  Checkpoint.bumpLastSeq = function(cb) {
    var Checkpoint = this;
    Checkpoint._getSingleton(function(err, cp, isNew) {
      if (err) return cb(err);
      if (isNew) return cb(null, cp);

      var originalSeq = cp.seq;
      cp.seq++;
      // Update the checkpoint but only if it was not changed under our hands
      Checkpoint.updateAll({id: cp.id, seq: originalSeq}, {seq: cp.seq}, function(err, info) {
        if (err) return cb(err);
        cb(null, cp);
      });
    });
  };

But it depends what behavior your would expect in this case...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not all connectors report whether findOrCreate creates a new instance or not. Let's keep the current implementation as it simpler. I think the extra cost caused by the more complex version is not worth the little gain it brings.

@Amir-61
Copy link
Contributor Author

Amir-61 commented Jan 7, 2016

Hey @bajtos,

Thanks so much for the review, patience and constructive feedback! I applied your feedback and left a comment; I'm assigning this PR to you now. Please review.

Also it is worth mentioning that test cases pass on local machine using npm link for the support for findOrCreate in loopback-datasource-juggler and here we obviously expect the test cases to fail since findOrCreate for memory connector is implemented in another PR under loopback-datasource-juggler - see loopbackio/loopback-datasource-juggler#815

Thanks a lot!

@Amir-61 Amir-61 removed their assignment Jan 7, 2016
@Amir-61
Copy link
Contributor Author

Amir-61 commented Jan 7, 2016

Possibly add a test to verify what happens when bump is called in parallel too. There should be still only one checkpoint instance, and the seq number should have been increased either by 1 or 2 (both outcomes are valid according the reasoning behind our implementation).

I don't think if this test case is that much helpful; as you already mentioned in the code there is a possibility of race condition in this case for bumpLastSeq(), which should not be a big problem as you explained; in other words, the race condition may or may not happen; hence we can not expect a specific value in this case...But again for the reasons you illustrated in the code they should be OK for now...

Please correct me if I'm wrong...

Hmm, this will be tricky to implement in an atomic fashion. I wish we had a method like updateOrCreate(filter, { seq: { $inc: 1 } }, { seq: 1}).

👍

@Amir-61
Copy link
Contributor Author

Amir-61 commented Jan 7, 2016

Possibly add a test to verify what happens when bump is called in parallel too. There should be still only one checkpoint instance, and the seq number should have been increased either by 1 or 2 (both outcomes are valid according the reasoning behind our implementation).

I don't think if this test case is that much helpful; as you already mentioned in the code there is a possibility of race condition in this case for bumpLastSeq(), which should not be a big problem as you explained; in other words, the race condition may or may not happen; hence we can not expect a specific value in this case...But again for the reasons you illustrated in the code they should be OK for now...
Please correct me if I'm wrong...

Oh no I'm wrong; it is still safe to always expect seq to be 2 based on the current implementation if we have at most 2 parallel calls of bumpLastSeq where we are sure we never encounter the third outcome you mentioned in the code

  1. seq was bumped more than once, so we will be using a value that is behind the latest seq.

I believe if we have more than 2 parallel calls of bumpLastSeq there is a possibility of the third outcome; however the odds are low.

So I add the test with 2 parallel calls of bumpLastSeq to be safe.

Please correct me if I'm wrong...

Thanks!

if (err) return done(err);
Checkpoint.find(function(err, data) {
if (err) return done(err);
expect(data[0].seq).to.equal(2);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not entirely sure that the current implementation guarantees the value will be exactly 2, I think it may be possible to arrive at 3, depending on timing. Perhaps the memory connector guarantees 2, but I think it won't be the case with external databases.

Could you please add a comment to explain a bit more what we are asserting here?

I would also move the assertion of data.length at the top and add a comment too.

We should also check that the values returned by bumpLastSeq are all the same (or lower) than the new checkpoint seq.

Checkpoint.find(function(err, data) {
  if (err) return done(err);
  // The invariant "we have at most 1 checkpoint instance" is preserved
  // even when multiple calls are made in parallel
  expect(data).to.have.length(1);

  // There is a race condition here, we could end up with both 2 or 3 as the "seq".
  // The current implementation of the memory connector always yields 2 though.
  expect(data[0].seq).to.equal(2);

  // All checkpoints returned by bumpLastSeq must be 
  //   greater than the old last seq and
  //   lower-or-equal than the new last seq.
  // In this particular case, since the new last seq is always 2, both results
  // should be 2.
  expect(list.map(function(it) { return it.seq; })
    .to.eql([2, 2]);

@bajtos
Copy link
Member

bajtos commented Jan 8, 2016

@Amir-61 Two more minor comments to address, otherwise the patch LGTM.

I am proposing the following steps:

@bajtos bajtos assigned Amir-61 and unassigned bajtos Jan 8, 2016
@Amir-61 Amir-61 force-pushed the checkpoint_speedup branch from f830ee7 to 08a2786 Compare January 9, 2016 06:58
@Amir-61
Copy link
Contributor Author

Amir-61 commented Jan 9, 2016

Applied last feedback, squashed and repased; I will merge the code accordingly once loopbackio/loopback-datasource-juggler#815 is landed to both master and 2.x

@Amir-61 Amir-61 added #wip and removed #review labels Jan 11, 2016
@0candy 0candy assigned raymondfeng and unassigned Amir-61 Jan 12, 2016
@bajtos
Copy link
Member

bajtos commented Jan 19, 2016

@0candy why is this assigned to Raymond? I believe no further reviews are necessary, we are waiting for @Amir-61 to land the other patch in [email protected]. Can we assign this back to @Amir-61?

@Amir-61 Amir-61 assigned Amir-61 and unassigned raymondfeng Jan 19, 2016
@Amir-61
Copy link
Contributor Author

Amir-61 commented Jan 19, 2016

@bajtos Sorry I forgot to assign it back to myself; I land it to master and 2.x now

@Amir-61
Copy link
Contributor Author

Amir-61 commented Jan 19, 2016

@slnode test please

Amir-61 added a commit that referenced this pull request Jan 19, 2016
@Amir-61 Amir-61 merged commit c9be67e into master Jan 19, 2016
@Amir-61 Amir-61 removed the #wip label Jan 19, 2016
@Amir-61 Amir-61 deleted the checkpoint_speedup branch January 19, 2016 15:33
@superkhau
Copy link
Contributor

@Amir-61 This is the one causing CI to fail. http://ci.strongloop.com/job/loopback/2806/

@Amir-61
Copy link
Contributor Author

Amir-61 commented Jan 20, 2016

@Amir-61 This is the one causing CI to fail. http://ci.strongloop.com/job/loopback/2806/

I chatted with @superkhau and apparently failure is not due to this PR. This PR got merged on Jan 19th morning ET and before that the CI was already failing and talked to @bajtos and we figured out it is a known issue and after merging this patch we see some green lights...

@superkhau
Copy link
Contributor

Yes, it's related to some windows issues at #1946 after speaking with @rmg.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants