Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DatabaseFacade.CanConnect doesn't actually check anything #18330

Closed
roji opened this issue Oct 11, 2019 · 9 comments · Fixed by #18508
Closed

DatabaseFacade.CanConnect doesn't actually check anything #18330

roji opened this issue Oct 11, 2019 · 9 comments · Fixed by #18508
Assignees
Labels
closed-fixed The issue has been fixed and is/will be included in the release indicated by the issue milestone. type-bug
Milestone

Comments

@roji
Copy link
Member

roji commented Oct 11, 2019

DatabaseFacade.CanConnect is supposed to determine whether the database is alive and can be connected to. The default implementation in RelationalDatabaseCreator calls Exists, whose implementation in SqlServer (and Sqlite and Npgsql) is to simply open and close a connection. As far as I'm aware, assuming connection pooling is enabled this doesn't actually do any sort of communication to the database. We should consider implementing CanConnect by doing a raw SELECT 1 roundtrip, or possibly opening and closing a non-pooled connection.

Note that ASP.NET's EF health probe is built over this API.

@roji roji added the type-bug label Oct 11, 2019
@roji
Copy link
Member Author

roji commented Oct 11, 2019

Note that opening an unpooled connection could be problematic for a health check, as you may be hitting database connection limits... A simple SELECT query is probably safest...

@ajcvickers
Copy link
Contributor

@roji Have you verified that opening a connection can pass if the database cannot be contacted? It's the same mechanism we have been using for years on SQL Server and I haven't seen it give false positives. (Plenty of false negatives, but not positives.) Also, on SQLite opening the connection opens the file.

@roji
Copy link
Member Author

roji commented Oct 11, 2019

I will give this a proper test - wanted to log this first (conference time). As a general rule opening a pooled connection is not supposed to actually contact a database because of perf reasons (it would kind of defeat the purpose of the pool), so this is definitely a bug with npgsql, but I'll confirm this on SQL Server in the coming days.

@roji
Copy link
Member Author

roji commented Oct 13, 2019

Tested three scenarios with some very "interesting" results:

  1. If SQL Server is shut down, the health check times out (for a long time) and finally throws SqlException. This seems like the wrong behavior - the method should probably return false (opened DatabaseFacade.CanConnect throws instead of returning false #18355). It also seems problematic to have such a long timeout - we may want to consider reducing the timeout specifically for this check (we can discuss in triage).
  2. If the database is dropped (ALTER DATABASE [databasename] SET SINGLE_USER WITH ROLLBACK IMMEDIATE; DROP DATABASE [databasename];), the API works as expected, returning false.
  3. If I simply drop the network link (ip link set lo down under Linux), the API returns true as I feared, since SqlConnection.Open doesn't actually do any I/O.

From cases 1 and 2 above, I'm guessing that when SQL Server shuts down (or drops the database), it sends a message to SqlClient which makes it clear its pool. But if networking is cut this obviously doesn't happen.

Note that all this holds for RelationalDatabaseCreator.Exists as well (which is what CanConnect calls) - this also seems problematic. I'd suggest simply adding a simple SELECT 1 query roundtrip inside Exists (these don't seem to be perf-sensitive APIs).

@mmisztal1980
Copy link

So happy we’ve had this conversation :D

@roji
Copy link
Member Author

roji commented Oct 13, 2019

@mmisztal1980 me too, without your question I'd have never looked into that code...!

@mmisztal1980
Copy link

I’m gonna try repro the other issue again on 2.2 in the coming week, I’ll keep you posted

@roji
Copy link
Member Author

roji commented Oct 21, 2019

I did some basic research on executing SELECT 1 as part of the Exists and CanConnect checks, to see whether this would block any valid scenarios.

It is apparently possible to deny SELECT access on an entire database (DENY SELECT ON DATABASE...). While it's conceivable this could be used in certain write-only scenarios, various other EF Core features would break here as well (e.g. migrations). Such write-only scenarios can still be enabled by denying rights on a schema or table rather than at the database level. This seems like enough of an edge case that we can safely add SELECT 1 to the checks.

However, I really am not an SQL Server expert by any stretch, would be good to have other opinions. @AndriySvyryd I think you looked into this area in the past?

@AndriySvyryd
Copy link
Member

I don't have any significant pushback against SELECT 1.

roji added a commit that referenced this issue Oct 22, 2019
@roji roji added the closed-fixed The issue has been fixed and is/will be included in the release indicated by the issue milestone. label Oct 22, 2019
roji added a commit that referenced this issue Oct 23, 2019
roji added a commit that referenced this issue Oct 23, 2019
@ajcvickers ajcvickers modified the milestones: 3.1.0, 3.1.0-preview2 Oct 24, 2019
@ajcvickers ajcvickers modified the milestones: 3.1.0-preview2, 3.1.0 Dec 2, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
closed-fixed The issue has been fixed and is/will be included in the release indicated by the issue milestone. type-bug
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants