QT

Multithreaded database access with QtSql

Last time, we had a look at how to make the database access asynchronous with QFutures. Our solution was to use a one thread QThreadPool with QtConcurrent. That works pretty well for most of the use cases, but as mentioned previously there are some cases where you might want to use multiple threads in parallel to access the database. That case may be some sort of service, but it of course also works for all other kinds of applications.

However you should also be aware of the disadvantages of having multiple database threads:

  1. Creating many database connections is going to cost you a lot of memory. This might be a point on weaker desktop computers or mobile devices.
  2. If you’ve only got a modest amount of queries to be executed it might even be a bit slower since you can’t reuse prepared queries across threads. Also database connection creation doesn’t come without any costs.

In the end you should test whether the gained performance improvements are actually relevant and whether they outweigh the memory usage. You should also examine how well your database driver handles multithreading, e.g. sqlite can always only commit one transaction at a time. The good news is that at least the code doesn’t get complicated.

The solution

What we currently have got is a QThreadPool with only one thread. That is because the database connections aren’t able to handle queries from different threads, so we can’t just increase the maximum thread count.

So to fix this we need to create one database connection per thread. There’s just one problem: the thread pool doesn’t tell us when and which thread it’s creating or destructing, so we don’t know when we need to create the database connection and when to remove it again. Fortunately for us Qt has got a ready-made thread storage for exactly this kind of issue. It works like this:

  • There’s a global static QThreadStorage
  • On every thread you can check whether the storage contains a value for this thread and get or set it.
  • When any QThread exits the thread storage automatically destructs the content for the thread. The thread storage takes ownership of the objects, so this also works with heap-allocated objects.

We’re going to use an extra class for the thread storage since we manually need to create and delete the database connection.

class DatabaseConnection
{
    Q_DISABLE_COPY(DatabaseConnection)
public:
    DatabaseConnection()
        : m_name(QString::number(QRandomGenerator::global()->generate(), 36))
    {
        auto database = QSqlDatabase::addDatabase(QStringLiteral("QSQLITE"), m_name);
        database.setDatabaseName("/home/me/data.sqlite");
        database.open();
    }

    ~DatabaseConnection()
    {
        QSqlDatabase::removeDatabase(m_name);
    }

    QSqlDatabase database()
    {
        return QSqlDatabase::database(m_name);
    }

private:
    QString m_name;
};

An object of this class will automatically create a new database connection and remove it as soon as it’s destroyed. Two things are important here:

  • Each QSqlDatabase needs to have a unique name. In this case we generate a name consisting out of random alphanumerical characters, but there’re also other solutions. It’s just important that the name is unique.
  • The QSqlDatabase isn’t stored as this would otherwise cause trouble when calling QSqlDatabase::removeDatabase().

Now we’ll see how to execute queries with the new code, but first let’s have a look at our current code:

QFuture<QList<User>> Database::fetchUsers()
{
    return QtConcurrent::run(m_pool, [this]() {
        QSqlQuery query(QSqlDatabase::database("main-connection"));
        query.exec("SELECT * FROM users");

        QList<User> users;
        while (query.next()) {
            // ...
        }
        return users;
    });
}

We now need to use the correct database connection on each thread and also need to create them if necessary.

// the storage for the database connection for each thread
static QThreadStorage<DatabaseConnection *> databaseConnections;

QSqlDatabase Database::currentDatabase()
{
    if (!databaseConnections.hasLocalData()) {
        databaseConnections.setLocalData(new DatabaseConnection());
    }
    return databaseConnections.localData()->database();
}

QFuture<QList<User>> Database::fetchUsers()
{
    return QtConcurrent::run(m_pool, [this]() {
        QSqlQuery query(currentDatabase());
        query.exec("SELECT * FROM users");

        QList<User> users;
        while (query.next()) {
            // ...
        }
        return users;
    });
}

The open() function from our previous version isn’t needed anymore and can be removed, because this is now done in our DatabaseConnection class. What’s still left to do is to adjust the thread pool settings in the constructor of our Database class:

// thread limit (default: number of system threads)
m_pool.setMaxThreadCount(4);
// unused threads are destroyed after 2 minutes (default: 30s)
m_pool.setExpiryTimeout(120000);

Thread creation and deletion is time-consuming, so you might want to choose an even higher expiry timeout here. On the other hand you might still want to save memory when the application is inactive. If you don’t care about the memory consumption, you can also disable the thread expiry (setting it to -1), that will avoid any recreation of database connections.

And basically this is it! The queries are distributed on multiple threads and are executed in parallel now!

There’re just some smaller things that don’t work like before anymore. We will take a look at two important issues and see how we can solve them.

Table creation & database migrations

In many cases you want your application to take care of table creations and database migrations. The database migrations you might already have written are fine and don’t need to be touched, but there’s one thing we need to pay attention to now: no other query must be started until all tables have been created correctly, so no errors are produced.

There are probably many ways of fixing this problem straightforwardly, here is one way:

class Database
{
    // ...
private:
    // creates a query and ensures that the tables are ready
    QSqlQuery createQuery();
    // executes database migrations to create all tables
    void createTables();

    QMutex m_tableCreationMutex;
    bool m_tablesCreated = false;
};
QSqlQuery Database::createQuery()
{
    if (!m_tablesCreated) {
        createTables();
    }
    return QSqlQuery(currentDatabase());
}
void Database::createTables()
{
    QMutexLocker locker(&m_tableCreationMutex);
    if (m_tablesCreated) {
        return;
    }
    
    QSqlQuery query(currentDatabase());
    query.exec("CREATE TABLE IF NOT EXISTS ...");

    m_tablesCreated = true;
}

If you’re using createQuery() everywhere to create queries, it’s now always safe that all database migrations have been done already.

Reusing queries

Queries can be reused in order to avoid that the SQL query string needs to be parsed every time. In a single-threaded environment we could use static QSqlQueries like this:

static auto query = []() {
    // this is only executed once
    QSqlQuery query = createQuery();
    query.prepare("SELECT * FROM users WHERE id = :1");
    return query;
}();

query.addBindValue(userId);
query.exec();

// ...

This way the query is reused and the query is only parsed once. If this query is executed very frequently this significantly increases the performance, but also (guess what) consumes more memory.

And again, this needs some adjustments to work with multithreading since the queries can’t be shared between different threads of course.

C++11 to the rescue! C++11 introduced a new keyword thread_local and this at least makes it possible to easily reuse the query for each thread. The code remained exactly the same, except there’s now a little new thread_local, which ensures that each thread is going to have its own query:

thread_local static auto query = []() {
    // this is only executed once on each thread
    QSqlQuery query = createQuery();
    query.prepare("SELECT * FROM users WHERE id = :1");
    return query;
}();

query.addBindValue(userId);
query.exec();

// ...

But of course keep in mind that this is only useful if a query is executed frequently, otherwise you just need more memory, especially since the query is cached for each thread now.

I hope I could help you with that. Feel free to comment if you’ve got any other solutions or ideas in mind or in production!

Comments