Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Expose sqlite3_db_config and verbs (or equivalent) #103489

Closed
numist opened this issue Apr 12, 2023 · 6 comments · Fixed by #103506
Closed

Expose sqlite3_db_config and verbs (or equivalent) #103489

numist opened this issue Apr 12, 2023 · 6 comments · Fixed by #103506
Assignees
Labels
topic-sqlite3 type-feature A feature request or enhancement

Comments

@numist
Copy link

numist commented Apr 12, 2023

Feature or enhancement

Python's SQLite bindings should expose sqlite3_db_config and at least SQLITE_DBCONFIG_DEFENSIVE (or an idiomatic version of the same)

Pitch

The libsqlite3.dylib built into Darwin enables defensive mode by default for all connections in processes linked on or after macOS 11 Big Sur as a mitigation layer against the general class of security vulnerabilities that can be exploited with pure SQL, but it's still useful to be able to disable it when using certain tools (like sqlite-utils). Conversely, developers may find it useful to be able to enable defensive mode on other platforms when opening a database with an untrusted schema, or executing untrusted SQL statements from remote sources.

Previous discussion

This was prompted by a brief discussion on Mastodon with @erlend-aasland

Linked PRs

@numist numist added the type-feature A feature request or enhancement label Apr 12, 2023
@simonw
Copy link
Contributor

simonw commented Apr 12, 2023

Here's the particular problem I'm encountering at the moment.

SQLite has a documented mechanism for applying "dangerous" SQL schema changes - things like adding a new foreign key to an existing table. It's this process here: https://www.sqlite.org/lang_altertable.html#otheralter

  1. Start a transaction.
  2. Run PRAGMA schema_version to determine the current schema version number. This number will be needed for step 6 below.
  3. Activate schema editing using PRAGMA writable_schema=ON.
  4. Run an UPDATE statement to change the definition of table X in the sqlite_schema table: UPDATE sqlite_schema SET sql=... WHERE type='table' AND name='X'; [...]
  5. If the change to table X also affects other tables or indexes or triggers are views within schema, then run UPDATE statements to modify those other tables indexes and views too. [...]
  6. Increment the schema version number using PRAGMA schema_version=X where X is one more than the old schema version number found in step 2 above.
  7. Disable schema editing using PRAGMA writable_schema=OFF.

I use this trick in my sqlite-utils Python library and CLI tool to power table.add_foreign_key() method, described here: https://sqlite-utils.datasette.io/en/stable/python-api.html#adding-foreign-key-constraints

Here's the problem: it turns out there it's increasingly common for Python environments to be linked against versions of SQLite that set the SQLITE_DBCONFIG_DEFENSIVE flag by default. This flag causes PRAGMA writable_schema=ON to silently be ignored... but then the attempt to update the sqlite_schema table fails with an error.

This breaks features in my software that use that feature... but there's no current way for me to provide a fix.

SQLite programs with the full C API can turn SQLITE_DBCONFIG_DEFENSIVE back off again - but since the Python sqlite3 module doesn't expose a mechanism for that I can't provide my users with a fix.

Related issue:

@erlend-aasland
Copy link
Contributor

Thoughts about the API? Currently, I've exposed only the boolean options, since those are the most interesting IMO. I think the lookaside and "main db name" options are special enough to warrant their own APIs, if we were to add them1. However, we don't know what kind of options SQLite will add in the future; there's a risk that my proposed API won't be a good fit in the future.

Footnotes

  1. IMO, those two are too low level to expose, unless there are real good use cases for them.

@erlend-aasland
Copy link
Contributor

Thoughts about the API?

Revisiting this, I'm quite ok with the proposed API. We don't need to lock it to boolean flags; it should be straight-forward to support for example the "set main db name" option using this API, and even the lookup cache hack if needed (not that I think we need that one #footgun).

Another route could be to add more specialised APIs; setflag(), setmaindbname(), etc. but that can get quite messy, so I'd rather not go down that road.

@erlend-aasland
Copy link
Contributor

FWIW, I'm going to put up a competing PR with support for all options, just to get a feel of what that looks like.

@erlend-aasland
Copy link
Contributor

FWIW, I'm going to put up a competing PR with support for all options, just to get a feel of what that looks like.

I added support for SQLITE_DBCONFIG_MAINDBNAME locally, just for fun, and it expanded the diff considerably:

 Lib/test/test_sqlite3/test_dbapi.py   |  16 +++
 Modules/_sqlite/clinic/connection.c.h |  62 +++++++-----
 Modules/_sqlite/connection.c          | 186 ++++++++++++++++++++++++----------
 Modules/_sqlite/connection.h          |   2 +
 Modules/_sqlite/module.c              |   3 +
 5 files changed, 192 insertions(+), 77 deletions(-)
PoC patch
diff --git a/Lib/test/test_sqlite3/test_dbapi.py b/Lib/test/test_sqlite3/test_dbapi.py
index 1bb0e13e35..989d242a83 100644
--- a/Lib/test/test_sqlite3/test_dbapi.py
+++ b/Lib/test/test_sqlite3/test_dbapi.py
@@ -601,6 +601,22 @@ def test_connection_config(self):
             with self.assertRaisesRegex(sqlite.IntegrityError, "constraint"):
                 cx.execute("insert into u values(0)")
 
+    def test_connection_config_dbname(self):
+        op = sqlite.SQLITE_DBCONFIG_MAINDBNAME
+        with memory_database() as cx:
+            self.assertEqual(cx.getconfig(op), "main")
+            cx.setconfig(op, "bar")
+            self.assertEqual(cx.getconfig(op), "bar")
+            cx.setconfig(op, "foo")
+            self.assertEqual(cx.getconfig(op), "foo")
+
+            with self.assertRaisesRegex(ValueError, "single arg"):
+                cx.setconfig(op)
+            with self.assertRaisesRegex(ValueError, "single arg"):
+                cx.setconfig(op, "a", 2)
+            with self.assertRaisesRegex(ValueError, "requires.*str"):
+                cx.setconfig(op, 42)
+
 
 class UninitialisedConnectionTests(unittest.TestCase):
     def setUp(self):
diff --git a/Modules/_sqlite/clinic/connection.c.h b/Modules/_sqlite/clinic/connection.c.h
index 3417d1f051..208506e15d 100644
--- a/Modules/_sqlite/clinic/connection.c.h
+++ b/Modules/_sqlite/clinic/connection.c.h
@@ -1514,7 +1514,7 @@ exit:
 }
 
 PyDoc_STRVAR(setconfig__doc__,
-"setconfig($self, op, enable=True, /)\n"
+"setconfig($self, /, op, *args)\n"
 "--\n"
 "\n"
 "\n"
@@ -1523,36 +1523,57 @@ PyDoc_STRVAR(setconfig__doc__,
 "    The configuration verb; one of the sqlite3.SQLITE_DBCONFIG codes.");
 
 #define SETCONFIG_METHODDEF    \
-    {"setconfig", _PyCFunction_CAST(setconfig), METH_FASTCALL, setconfig__doc__},
+    {"setconfig", _PyCFunction_CAST(setconfig), METH_FASTCALL|METH_KEYWORDS, setconfig__doc__},
 
 static PyObject *
-setconfig_impl(pysqlite_Connection *self, int op, int enable);
+setconfig_impl(pysqlite_Connection *self, int op, PyObject *args);
 
 static PyObject *
-setconfig(pysqlite_Connection *self, PyObject *const *args, Py_ssize_t nargs)
+setconfig(pysqlite_Connection *self, PyObject *const *args, Py_ssize_t nargs, PyObject *kwnames)
 {
     PyObject *return_value = NULL;
+    #if defined(Py_BUILD_CORE) && !defined(Py_BUILD_CORE_MODULE)
+
+    #define NUM_KEYWORDS 1
+    static struct {
+        PyGC_Head _this_is_not_used;
+        PyObject_VAR_HEAD
+        PyObject *ob_item[NUM_KEYWORDS];
+    } _kwtuple = {
+        .ob_base = PyVarObject_HEAD_INIT(&PyTuple_Type, NUM_KEYWORDS)
+        .ob_item = { &_Py_ID(op), },
+    };
+    #undef NUM_KEYWORDS
+    #define KWTUPLE (&_kwtuple.ob_base.ob_base)
+
+    #else  // !Py_BUILD_CORE
+    #  define KWTUPLE NULL
+    #endif  // !Py_BUILD_CORE
+
+    static const char * const _keywords[] = {"op", NULL};
+    static _PyArg_Parser _parser = {
+        .keywords = _keywords,
+        .fname = "setconfig",
+        .kwtuple = KWTUPLE,
+    };
+    #undef KWTUPLE
+    PyObject *argsbuf[2];
     int op;
-    int enable = 1;
+    PyObject *__clinic_args = NULL;
 
-    if (!_PyArg_CheckPositional("setconfig", nargs, 1, 2)) {
+    args = _PyArg_UnpackKeywordsWithVararg(args, nargs, NULL, kwnames, &_parser, 1, 1, 0, 1, argsbuf);
+    if (!args) {
         goto exit;
     }
     op = _PyLong_AsInt(args[0]);
     if (op == -1 && PyErr_Occurred()) {
         goto exit;
     }
-    if (nargs < 2) {
-        goto skip_optional;
-    }
-    enable = PyObject_IsTrue(args[1]);
-    if (enable < 0) {
-        goto exit;
-    }
-skip_optional:
-    return_value = setconfig_impl(self, op, enable);
+    __clinic_args = args[1];
+    return_value = setconfig_impl(self, op, __clinic_args);
 
 exit:
+    Py_XDECREF(__clinic_args);
     return return_value;
 }
 
@@ -1568,7 +1589,7 @@ PyDoc_STRVAR(getconfig__doc__,
 #define GETCONFIG_METHODDEF    \
     {"getconfig", (PyCFunction)getconfig, METH_O, getconfig__doc__},
 
-static int
+static PyObject *
 getconfig_impl(pysqlite_Connection *self, int op);
 
 static PyObject *
@@ -1576,17 +1597,12 @@ getconfig(pysqlite_Connection *self, PyObject *arg)
 {
     PyObject *return_value = NULL;
     int op;
-    int _return_value;
 
     op = _PyLong_AsInt(arg);
     if (op == -1 && PyErr_Occurred()) {
         goto exit;
     }
-    _return_value = getconfig_impl(self, op);
-    if ((_return_value == -1) && PyErr_Occurred()) {
-        goto exit;
-    }
-    return_value = PyBool_FromLong((long)_return_value);
+    return_value = getconfig_impl(self, op);
 
 exit:
     return return_value;
@@ -1611,4 +1627,4 @@ exit:
 #ifndef DESERIALIZE_METHODDEF
     #define DESERIALIZE_METHODDEF
 #endif /* !defined(DESERIALIZE_METHODDEF) */
-/*[clinic end generated code: output=84560376421204ce input=a9049054013a1b77]*/
+/*[clinic end generated code: output=93a9f8a9422547c9 input=a9049054013a1b77]*/
diff --git a/Modules/_sqlite/connection.c b/Modules/_sqlite/connection.c
index fe2f64d7b6..8f2345bdb3 100644
--- a/Modules/_sqlite/connection.c
+++ b/Modules/_sqlite/connection.c
@@ -306,6 +306,9 @@ pysqlite_connection_init_impl(pysqlite_Connection *self, PyObject *database,
     self->ProgrammingError      = state->ProgrammingError;
     self->NotSupportedError     = state->NotSupportedError;
 
+    PyMem_Free((void *)self->maindbname);
+    self->maindbname = NULL;
+
     if (PySys_Audit("sqlite3.connect/handle", "O", self) < 0) {
         return -1;  // Don't goto error; at this point, dealloc will clean up.
     }
@@ -428,6 +431,7 @@ connection_dealloc(pysqlite_Connection *self)
 
     /* Clean up if user has not called .close() explicitly. */
     connection_close(self);
+    PyMem_Free(self->maindbname);
 
     tp->tp_free(self);
     Py_DECREF(tp);
@@ -2342,52 +2346,89 @@ getlimit_impl(pysqlite_Connection *self, int category)
     return setlimit_impl(self, category, -1);
 }
 
-static inline bool
-is_int_config(const int op)
+static const char *
+config_verb(const int op)
 {
-    switch (op) {
-        case SQLITE_DBCONFIG_ENABLE_FKEY:
-        case SQLITE_DBCONFIG_ENABLE_TRIGGER:
+#define RETURN_IF(CONST) do {   \
+    if (op == CONST) {          \
+        return #CONST;          \
+    }                           \
+} while (0)
+
+    RETURN_IF(SQLITE_DBCONFIG_ENABLE_FKEY);
+    RETURN_IF(SQLITE_DBCONFIG_ENABLE_TRIGGER);
 #if SQLITE_VERSION_NUMBER >= 3012002
-        case SQLITE_DBCONFIG_ENABLE_FTS3_TOKENIZER:
+    RETURN_IF(SQLITE_DBCONFIG_ENABLE_FTS3_TOKENIZER);
 #endif
 #if SQLITE_VERSION_NUMBER >= 3013000
-        case SQLITE_DBCONFIG_ENABLE_LOAD_EXTENSION:
+    RETURN_IF(SQLITE_DBCONFIG_ENABLE_LOAD_EXTENSION);
 #endif
 #if SQLITE_VERSION_NUMBER >= 3016000
-        case SQLITE_DBCONFIG_NO_CKPT_ON_CLOSE:
+    RETURN_IF(SQLITE_DBCONFIG_NO_CKPT_ON_CLOSE);
 #endif
 #if SQLITE_VERSION_NUMBER >= 3020000
-        case SQLITE_DBCONFIG_ENABLE_QPSG:
+    RETURN_IF(SQLITE_DBCONFIG_ENABLE_QPSG);
 #endif
 #if SQLITE_VERSION_NUMBER >= 3022000
-        case SQLITE_DBCONFIG_TRIGGER_EQP:
+    RETURN_IF(SQLITE_DBCONFIG_TRIGGER_EQP);
 #endif
 #if SQLITE_VERSION_NUMBER >= 3024000
-        case SQLITE_DBCONFIG_RESET_DATABASE:
+    RETURN_IF(SQLITE_DBCONFIG_RESET_DATABASE);
 #endif
 #if SQLITE_VERSION_NUMBER >= 3026000
-        case SQLITE_DBCONFIG_DEFENSIVE:
+    RETURN_IF(SQLITE_DBCONFIG_DEFENSIVE);
 #endif
 #if SQLITE_VERSION_NUMBER >= 3028000
-        case SQLITE_DBCONFIG_WRITABLE_SCHEMA:
+    RETURN_IF(SQLITE_DBCONFIG_WRITABLE_SCHEMA);
 #endif
 #if SQLITE_VERSION_NUMBER >= 3029000
-        case SQLITE_DBCONFIG_DQS_DDL:
-        case SQLITE_DBCONFIG_DQS_DML:
-        case SQLITE_DBCONFIG_LEGACY_ALTER_TABLE:
+    RETURN_IF(SQLITE_DBCONFIG_DQS_DDL);
+    RETURN_IF(SQLITE_DBCONFIG_DQS_DML);
+    RETURN_IF(SQLITE_DBCONFIG_LEGACY_ALTER_TABLE);
 #endif
 #if SQLITE_VERSION_NUMBER >= 3030000
-        case SQLITE_DBCONFIG_ENABLE_VIEW:
+    RETURN_IF(SQLITE_DBCONFIG_ENABLE_VIEW);
 #endif
 #if SQLITE_VERSION_NUMBER >= 3031000
-        case SQLITE_DBCONFIG_LEGACY_FILE_FORMAT:
-        case SQLITE_DBCONFIG_TRUSTED_SCHEMA:
+    RETURN_IF(SQLITE_DBCONFIG_LEGACY_FILE_FORMAT);
+    RETURN_IF(SQLITE_DBCONFIG_TRUSTED_SCHEMA);
 #endif
-            return true;
-        default:
-            return false;
+
+#undef RETURN_IF
+    return NULL;
+}
+
+static inline bool
+is_int_config(const int op)
+{
+    return config_verb(op) != NULL;
+}
+
+static PyObject *
+set_db_flag(pysqlite_Connection *self, int op, int enable)
+{
+    int actual;
+    int rc = sqlite3_db_config(self->db, op, enable, &actual);
+    if (rc != SQLITE_OK) {
+        (void)_pysqlite_seterror(self->state, self->db);
+        return NULL;
+    }
+    if (enable != actual) {
+        PyErr_SetString(self->state->OperationalError, "unable to set config");
+        return NULL;
+    }
+    Py_RETURN_NONE;
+}
+
+static PyObject *
+set_db_str_config(pysqlite_Connection *self, int op, const char *str)
+{
+    int rc = sqlite3_db_config(self->db, op, str);
+    if (rc != SQLITE_OK) {
+        (void)_pysqlite_seterror(self->state, self->db);
+        return NULL;
     }
+    Py_RETURN_NONE;
 }
 
 /*[clinic input]
@@ -2395,37 +2436,69 @@ _sqlite3.Connection.setconfig as setconfig
 
     op: int
         The configuration verb; one of the sqlite3.SQLITE_DBCONFIG codes.
-    enable: bool = True
-    /
+    *args: object
 
 [clinic start generated code]*/
 
 static PyObject *
-setconfig_impl(pysqlite_Connection *self, int op, int enable)
-/*[clinic end generated code: output=c60b13e618aff873 input=01d77271ea8ca45f]*/
+setconfig_impl(pysqlite_Connection *self, int op, PyObject *args)
+/*[clinic end generated code: output=7147af2dd2d86d32 input=ee83aa72d4d61bb9]*/
 {
     if (!pysqlite_check_thread(self) || !pysqlite_check_connection(self)) {
         return NULL;
     }
-    if (!is_int_config(op)) {
-        return PyErr_Format(PyExc_ValueError, "unknown config 'op': %d", op);
-    }
-
-    int actual;
-    int rc = sqlite3_db_config(self->db, op, enable, &actual);
-    if (rc != SQLITE_OK) {
-        (void)_pysqlite_seterror(self->state, self->db);
-        return NULL;
-    }
-    if (enable != actual) {
-        PyErr_SetString(self->state->OperationalError, "Unable to set config");
-        return NULL;
+    if (is_int_config(op)) {
+        switch (PyTuple_GET_SIZE(args)) {
+        case 0:
+            return set_db_flag(self, op, 1);
+        case 1:
+            break;
+        default:
+            return PyErr_Format(PyExc_ValueError,
+                                "too many argument for 'op' %s (%d)",
+                                config_verb(op), op);
+        }
+        PyObject *flag = PyTuple_GET_ITEM(args, 0);
+        if (Py_IsTrue(flag)) {
+            return set_db_flag(self, op, 1);
+        }
+        if (Py_IsFalse(flag)) {
+            return set_db_flag(self, op, 0);
+        }
+        return PyErr_Format(PyExc_ValueError,
+                            "%s (%d) requires a boolean argument",
+                            config_verb(op), op);
+    }
+    else if (op == SQLITE_DBCONFIG_MAINDBNAME) {
+        if (PyTuple_GET_SIZE(args) != 1) {
+            PyErr_SetString(PyExc_ValueError,
+                    "SQLITE_DBCONFIG_MAINDBNAME requires a single argument");
+            return NULL;
+        }
+        PyObject *item = PyTuple_GET_ITEM(args, 0);
+        if (!PyUnicode_Check(item)) {
+            PyErr_SetString(PyExc_ValueError,
+                    "SQLITE_DBCONFIG_MAINDBNAME requires a 'str'");
+            return NULL;
+        }
+        const char *str = PyUnicode_AsUTF8(item);
+        if (str == NULL) {
+            return NULL;
+        }
+        size_t len = strlen(str);
+        PyMem_Free(self->maindbname);
+        self->maindbname = PyMem_Malloc(len+1);
+        if (self->maindbname == NULL) {
+            return NULL;
+        }
+        (void)strncpy(self->maindbname, str, len+1);
+        return set_db_str_config(self, op, self->maindbname);
     }
-    Py_RETURN_NONE;
+    return PyErr_Format(PyExc_ValueError, "unknown config 'op': %d", op);
 }
 
 /*[clinic input]
-_sqlite3.Connection.getconfig as getconfig -> bool
+_sqlite3.Connection.getconfig as getconfig
 
     op: int
         The configuration verb; one of the sqlite3.SQLITE_DBCONFIG codes.
@@ -2433,25 +2506,30 @@ _sqlite3.Connection.getconfig as getconfig -> bool
 
 [clinic start generated code]*/
 
-static int
+static PyObject *
 getconfig_impl(pysqlite_Connection *self, int op)
-/*[clinic end generated code: output=25ac05044c7b78a3 input=667d2ef05fff2f61]*/
+/*[clinic end generated code: output=bbf3069a373ba80c input=8eab0692c3fb9b39]*/
 {
     if (!pysqlite_check_thread(self) || !pysqlite_check_connection(self)) {
-        return -1;
+        return NULL;
     }
-    if (!is_int_config(op)) {
-        PyErr_Format(PyExc_ValueError, "unknown config 'op': %d", op);
-        return -1;
+    if (is_int_config(op)) {
+        int current;
+        int rc = sqlite3_db_config(self->db, op, -1, &current);
+        if (rc != SQLITE_OK) {
+            (void)_pysqlite_seterror(self->state, self->db);
+            return NULL;
+        }
+        return PyLong_FromLong(current);
     }
-
-    int current;
-    int rc = sqlite3_db_config(self->db, op, -1, &current);
-    if (rc != SQLITE_OK) {
-        (void)_pysqlite_seterror(self->state, self->db);
-        return -1;
+    else if (op == SQLITE_DBCONFIG_MAINDBNAME) {
+        if (self->maindbname == NULL) {
+            return PyUnicode_FromString("main");
+        }
+        return PyUnicode_FromString(self->maindbname);
     }
-    return current;
+    PyErr_Format(PyExc_ValueError, "unknown config 'op': %d", op);
+    return NULL;
 }
 
 static PyObject *
diff --git a/Modules/_sqlite/connection.h b/Modules/_sqlite/connection.h
index 1df92065a5..b5ca51ea87 100644
--- a/Modules/_sqlite/connection.h
+++ b/Modules/_sqlite/connection.h
@@ -104,6 +104,8 @@ typedef struct
     PyObject* InternalError;
     PyObject* ProgrammingError;
     PyObject* NotSupportedError;
+
+    char *maindbname;
 } pysqlite_Connection;
 
 int pysqlite_check_thread(pysqlite_Connection* self);
diff --git a/Modules/_sqlite/module.c b/Modules/_sqlite/module.c
index 9c42faa232..b08e92619c 100644
--- a/Modules/_sqlite/module.c
+++ b/Modules/_sqlite/module.c
@@ -542,6 +542,9 @@ add_integer_constants(PyObject *module) {
     ADD_INT(SQLITE_DBCONFIG_LEGACY_FILE_FORMAT);
     ADD_INT(SQLITE_DBCONFIG_TRUSTED_SCHEMA);
 #endif
+
+    // PoC set db config stuff
+    ADD_INT(SQLITE_DBCONFIG_MAINDBNAME);
 #undef ADD_INT
     return 0;
 }

I'd say that the added complexity is not worth it at the time. I do think we would be fine with adding support for this, if needed, sometime in the future.

@erlend-aasland
Copy link
Contributor

For SQLITE_DBCONFIG_LOOKASIDE, the diff is even bigger, as is the complexity of the feature.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
topic-sqlite3 type-feature A feature request or enhancement
Projects
Status: Done
Development

Successfully merging a pull request may close this issue.

3 participants