Atomic "find or create" Repository API #1956

bajtos · 2018-11-01T10:55:11Z

While implementing hasOne relation in #1879, we discovered a requirement for an atomic "find or create" operation that's needed to avoid race conditions in hasOne's implementation of create operation.

Cross-posting #1879 (comment)

It is crucial to leverage an atomic implementation of findOrCreate provided by our connectors. The current proposal is prone to race conditions, where to instances of the target "hasOne" model can be created.

Consider the following scenario: the LB4 server is handling two incoming HTTP requests to create a hasOne target and the code is executed in the following way by Node.js runtime:

Request 1 arrives, DefaultHasOneRepository#create is invoked
targetRepository.find is called. It's an async function so other stuff happens while the query is in progress.
Request 2 arrives, DefaultHasOneRepository#create is invoked.
targetRepository.find is called. It's an async function so other stuff happens while the query is in progress.
targetRepository.find for Request 1 returns, no model was found. targetRepository.create in called.
targetRepository.find for Request 2 returns, no model was found. targetRepository.create in called.
Both create requests eventually finish. We have two models that are a target of hasOne relation. 💥 💣 💥

Acceptance criteria

Add a new Repository-like interface describing atomic operations.

export interface FindOrCreateResult<T extends Entity> {
 entity: T;
 found: boolean;
}

export interface AtomicCrudRepository<T extends Entity, ID>
   extends EntityCrudRepository<T, ID> {
    findOrCreate(
     filter: Filter<T>,
     entity: DataObject<T>,
     options?: Options,
   ): Promise<FindOrCreateResult>
}

Add a new repository class implementing this new interface while extending DefaultCrudRepository. Either the constructor or the findOrCreate method should verify that the connector provides atomic implementation of findOrCreate and throw an error with a machine-readable code if it does not.
Modify the table mapping known error codes to HTTP status codes to map the new error code to 501 Not Implemented.

Nice to have but not required:

A new repository class implementing AtomicCrudRepository on top of CrudRepositoryImpl.

Out of scope

Support for other atomic operations like patchOrCreate, replaceOrCreate, etc.

The text was updated successfully, but these errors were encountered:

bajtos · 2018-11-01T11:05:09Z

Few thoughts:

(1)
In LoopBack 3.x, when a connector did not implement atomic findOrCreate, the DataAccessObject provided a non-atomic fallback implementation calling findandcreate`. While this may work well when building prototypes, it's creates a high risk of data corruption when deployed to production.

For LB4 and beyond, I am proposing to provide strong guarantees about robustness of this operation. If the connector and/or the database does not provide a solution that's atomic, then the repository method should throw a "Not Implemented" error (instead of falling-back to a non-atomic version).

If there is enough user demand, then we can provide a different repository class that implements operations like findOrCreate in an unsafe (non-atomic) way.

(2)
In order to allow repository consumers to better express whether they require atomic operations or not, and also in line with Relation refactoring: Split Repository interface into different interfaces #1356, I am proposing to introduce a new repository interface and an new implementation class.

For example:

interface AtomicCrudRepository<T extends Entity, ID> 
  extends EntityCrudRepository<T, ID> {

  // declare atomic operations like findOrCreate
}

class DefaultAtomicCrudRepository<T extends Entity, ID> 
  extends EntityCrudRepository<T, ID> 
  implements AtomicCrudRepository<T, ID> {

  // implement atomic operations like findOrCreate
}

/cc @strongloop/loopback-maintainers thoughts?

raymondfeng · 2018-11-06T19:36:07Z

I think the issue is more complex than just implementing the atomic findOrCreate. It depends on how hasOne is enforced:

By LoopBack using atomic findOrCreate from the connector
By the database with a unique constraint of the FK, for example, a user has a profile and the profile model has a unique index of userId. This can be potentially automated by automigrate.
For relational databases that don't have atomic findOrCreate, can we use transactional find then create with strong isolation level?

My understanding is that we only have mongodb and memory connector support atomic findOrCreate. If that's the default behavior, hasOne create will fail for most databases.

Maybe we have to introduce different options for hasOne.create method to honor users' policies, such as:

byAtomicFindCreate
byDatabase
...

jannyHou · 2018-11-06T19:48:15Z

based on Raymond's comment, Cloudant/Couchdb2 have a similar situation as sql databases.

bajtos · 2018-11-08T12:38:56Z

I think the issue is more complex than just implementing the atomic findOrCreate. It depends on how hasOne is enforced.

Good point!

This is loosely related to my ideas related to validation, see #1872 (comment)

By LoopBack using atomic findOrCreate from the connector

👍

By the database with a unique constraint of the FK, for example, a user has a profile and the profile model has a unique index of userId. This can be potentially automated by automigrate.

Makes sense. To make this option easy to use, I would like our implementation to have the following traits:

The database-specific error reporting violating of unique constraint is converted into a well-known LoopBack error code, so that the errors returned by REST APIs are consistent and independent on the underlying database.
The application/framework should check the database schema to verify the presence of the correct unique constraint. We should not blindly rely on the user to correctly configure the constraints. I see this as very important for any users that have multiple environments, where it's too easy to forget to add unique constraint to test/staging/production database. This would be of a lesser concern if we had a good database migration framework in place (see Database Migration Management Framework #487).

For relational databases that don't have atomic findOrCreate, can we use transactional find then create with strong isolation level?

Makes sense to me. In the current design, it's up to the connector how it implements findOrCreate. Opening a new transaction and doing two-step find + create is a perfectly valid solution IMO. A possible caveat I see: how to handle the case when findOrCreate is already executed within an (outer) transaction. For example, MySQL does not seem to support nested transactions.

My understanding is that we only have mongodb and memory connector support atomic findOrCreate. If that's the default behavior, hasOne create will fail for most databases.
based on Raymond's comment, Cloudant/Couchdb2 have a similar situation as sql databases.

Ouch! When I proposed to use findOrCreate for hasMany, I has incorrectly assumed that most of our connectors already support this operation in an atomic way. Mea culpa.

Maybe we have to introduce different options for hasOne.create method to honor users' policies, such as:

byAtomicFindCreate

byDatabase

I don't think it's a good idea to ask the users to decide this matter. In my experience, many LB users don't fully understand ramifications of different ways of enforcing has-one constraint and as a result, we can end up with many LB4 applications prone to race conditions.

What I would like to see instead, is a single Model/Repository-level API that allows:

the HasMany relation to express the has-one constraint
the connectors to provide a database-specific implementation enforcing the constraint in a reliable (atomic) way

An interesting approach for document databases like Cloudant/Couchdb, based on this StackOverflow discussion:

When a User has one Credentials instance, we can encode the user's id in credentials' id. Even better, Credentials id can be set to the same value as the User id it belongs to. Now

In that setup, when we attempt a PUT request of an existing document id (findOrCreate, the document already exists), it will be rejected with a HTTP 409 error - unless you supply the correct revision (_rev property) of the existing document.

It makes me wonder - would it make sense to merge the primary key with the foreign key into the same property in the target model of HasOne relation and use this as the solution to enforce uniqueness across all databases?

@model()
class User {
  @hasOne(() => Credentials)
  credentials: Credentials;
}

@model
class Credentials {
  @belongsTo(() => User)
  @property({id: true})
  userId: string;
}

Or even

@model()
class User {
  @hasOne(() => Credentials, {keyTo: 'id'})
  credentials: Credentials;
}

@model
class Credentials {
  @belongsTo(() => User)
  @property({id: true})
  id: string;
}

Thoughts?

bajtos · 2018-11-09T13:33:40Z

@b-admike and me have discussed this matter yesterday. We concluded there are two ways forward:

Look into transaction-based implementation of findOrCreate
Require the target of hasOne relation to use the foreign key as the primary key. I believe the unique constraint of primary keys is enforced by all databases (SQL and NoSQL), thus not atomic operation like findOrCreate is needed.

dhmlau · 2018-11-26T16:25:29Z

@b-admike, per our discussion last week, IIUC since this feature is only applicable for a small subset of connectors (or databases), this task is not a blocker for the hasOne task.
If this should not be part of the Nov milestone or top of backlog, please update accordingly. Thanks.

b-admike · 2018-11-27T01:25:11Z

Yes @dhmlau per #1956 (comment), we've decided to take approach 2) with #2005, and decided to close #1974 as rejected. I've removed this issue from the milestone.

stale · 2020-02-07T17:30:28Z

This issue has been marked stale because it has not seen activity within six months. If you believe this to be in error, please contact one of the code owners, listed in the CODEOWNERS file at the top-level of this repository. This issue will be closed within 30 days of being stale.

stale · 2020-03-08T17:35:13Z

This issue has been closed due to continued inactivity. Thank you for your understanding. If you believe this to be in error, please contact one of the code owners, listed in the CODEOWNERS file at the top-level of this repository.

bajtos added feature TOB Repository Issues related to @loopback/repository package labels Nov 1, 2018

bajtos mentioned this issue Nov 1, 2018

feat(repository): hasOne relation #1879

Closed

7 tasks

bajtos changed the title ~~Atomic "find or create" command~~ Atomic "find or create" Repository API Nov 1, 2018

dhmlau mentioned this issue Nov 1, 2018

Monthly Milestone - November 2018 🍁 🌾 #1961

Closed

24 tasks

This was referenced Nov 5, 2018

fix: update typescript declarations loopbackio/loopback-datasource-juggler#1654

Merged

feat(repository): initial AtomicCrudRepository implementation #1974

Closed

dhmlau added this to the November 2018 Milestone milestone Nov 5, 2018

bajtos mentioned this issue Nov 9, 2018

Hidden properties #1914

Open

b-admike mentioned this issue Nov 9, 2018

feat(repository): hasOne relation #2005

Merged

9 tasks

dhmlau mentioned this issue Nov 21, 2018

[Roadmap] 🚀1Q2019 🚀 #1839

Closed

20 tasks

b-admike removed this from the November 2018 Milestone milestone Nov 27, 2018

dhmlau removed the TOB label Feb 12, 2019

stale bot added the stale label Feb 7, 2020

stale bot closed this as completed Mar 8, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Atomic "find or create" Repository API #1956

Atomic "find or create" Repository API #1956

bajtos commented Nov 1, 2018 •

edited

Loading

bajtos commented Nov 1, 2018

raymondfeng commented Nov 6, 2018

jannyHou commented Nov 6, 2018

bajtos commented Nov 8, 2018

bajtos commented Nov 9, 2018

dhmlau commented Nov 26, 2018

b-admike commented Nov 27, 2018

stale bot commented Feb 7, 2020

stale bot commented Mar 8, 2020

Atomic "find or create" Repository API #1956

Atomic "find or create" Repository API #1956

Comments

bajtos commented Nov 1, 2018 • edited Loading

Acceptance criteria

Out of scope

bajtos commented Nov 1, 2018

raymondfeng commented Nov 6, 2018

jannyHou commented Nov 6, 2018

bajtos commented Nov 8, 2018

bajtos commented Nov 9, 2018

dhmlau commented Nov 26, 2018

b-admike commented Nov 27, 2018

stale bot commented Feb 7, 2020

stale bot commented Mar 8, 2020

bajtos commented Nov 1, 2018 •

edited

Loading