Speed up 10.4+ timezone initialization #320

Merged
tianon merged 2 commits into MariaDB:master from grooverdan/MDEV-23326-issue262
Aug 11, 2020

Conversation

grooverdan
Member

MariaDB-10.4 defaulted to Aria for system tables.

This introduced crash safety under the name of "transactional"
that was not previously in MyISAM.

The Aria implementation of checkpointing incurs a significant
penalty on fuse-overlayfs, which is common in
container environments, especially those without a
/var/lib/mysql volume.

We work around this penalty by disabling the crash
safety of timezone tables for the period of timezone
initialization.

Analysis and timings are in https://jira.mariadb.org/browse/MDEV-23326
and local tests show that 10.4 is only 0.8 seconds slower
than 10.3 on startup (6.8 seconds total).

Version specific comments are used to ensure that ALTER TABLE
statements aren't run on < 10.4 server versions.

closes #262

I'm unconvinced I can get any significant fix into MariaDB before the next release so this should close off a major issue for the next release(s).

This won't be the end of the story. Let's see if we can do all of docker_setup_db under docker_init_database_dir with a little upstream help and improve the startup time again.
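
The workaround described above can be sketched roughly as follows. This is a minimal illustration, not the merged docker-entrypoint.sh change: the function names `emit_transactional` and `wrap_tz_sql` are hypothetical, and the table list and the Aria `TRANSACTIONAL` table option are assumptions drawn from the description. The `/*!100400 ... */` comment is the version-specific form, so servers older than 10.4 should treat the ALTER TABLE as a plain comment.

```shell
# Sketch only: names and table list are assumptions, not the merged change.
# Timezone tables assumed to be Aria system tables in 10.4+.
tz_tables="time_zone time_zone_leap_second time_zone_name time_zone_transition time_zone_transition_type"

# Print one version-gated ALTER TABLE per timezone table.
# $1 is 0 (disable crash safety) or 1 (re-enable it).
emit_transactional() {
  for table in $tz_tables; do
    # Only servers >= 10.4.0 execute the body of this comment.
    printf '/*!100400 ALTER TABLE %s TRANSACTIONAL=%s */;\n' "$table" "$1"
  done
}

# Read timezone SQL on stdin and print it wrapped in the ALTER statements.
wrap_tz_sql() {
  emit_transactional 0
  cat
  emit_transactional 1
}

# Placeholder statement standing in for real mysql_tzinfo_to_sql output.
echo 'SELECT 1;' | wrap_tz_sql
```

In an entrypoint, the input would presumably come from `mysql_tzinfo_to_sql /usr/share/zoneinfo` and the wrapped output would be piped to the client in a single run, so that the tables are transactional again before normal startup.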

@grooverdan
Member Author

FYI mariadb releases have occurred. I didn't get a fix into upstream before the release.

Further analysis indicates it's not just (fuse)overlayfs that is affected, per the upstream MDEV.

While disabling the crash safety during initialization has some risks, any errors will abort the starting of the container because of the SQL errors. I do have a crash safe performance optimization work in progress that will be ready for next release (and consumes ~3s for the tz initialization).

This change will help the default deployment of mariadb containers of the user base without penalty.

@tianon
Contributor

tianon commented Aug 10, 2020

The Aria implementation of checkpointing incurs significant
penalty on fuse-overlayfs that occurs significantly in
container environments, especially those without a
/var/lib/mysql volume.

This is the bit that has me confused -- this image defines /var/lib/mysql to be a volume, and the users reporting slowness are all thus using that default volume (there doesn't exist a way to "unvolume"), so none of them are using MariaDB on top of an overlay data directory (although I can see how/why that would cause a significant performance overhead, which is precisely why we define the VOLUME in the first place, even though it has downsides for more esoteric deployment methods).

The common thread we saw in the "slowness" discussions was spinning disks vs SSDs (or even SSDs with very low available IOPS), so I'd love to make sure we're testing the same thing before we merge a fix which is made assuming the two slowness tests are the same.

@grooverdan
Member Author

grooverdan commented Aug 10, 2020

I was wrong about overlayfs being the cause. I generally saw problems even on my local nvme. tmpfs as a VOLUME didn't seem to be an issue.

Catch me as @Daniel Black on https://mariadb.zulipchat.com because I'd like to make sure we understand each other fully on this and there's a lot of detail.

@tianon
Contributor

tianon commented Aug 10, 2020

Oh, I'm aware there's a lot that's gone into this (I've been following your adventures in https://jira.mariadb.org/browse/MDEV-23326 😄), I just want to make sure you've done some tests on a non-NVMe (preferably spinning disk) drive as well to ensure the change is still dramatic there before we consider #262 fully "fixed" / closed.

@yosifkit just ran a simple test on a spinning drive in his system with 10.5.4 and it took ~11s before it even started the temporary server, and it was a full three minutes later when the temporary server was stopped (doing nothing but loading timezone data and setting a root password), so it's significantly more dramatic on a spinning drive, and I just want to make sure your testing has covered that case (since that's the one that's the most common in #262).

He's going to test this change on that same drive to get a simple comparison. 👍

@tianon
Contributor

tianon commented Aug 10, 2020

He had to test this change against 10.5.5 (because 10.5.4 is no longer available thanks to the new version being published) but it went from ~3m down to ~7s, so I'd say that's pretty compelling. 😅

@grooverdan
Member Author

Model Family:     Western Digital Green
Device Model:     WDC WD40EZRX-00SPEB0
Serial Number:    WD-WCC4E5000UCH

ext4 mounted on /home/dan/datadir

The rest of the SMART output showed the drive to be in poor condition.

test script

for v in 10.3 10.4
do
  podman run -d --rm -e MYSQL_ROOT_PASSWORD=pass \
    --expose 3306 \
    --volume /home/dan/datadir/data$v:/var/lib/mysql:Z \
    --name maria$v mariadb_test:$v &
  sleep 1
  time grep -iq "ready for start up" <(podman logs -f maria$v 2>&1) 
  podman logs maria$v
  sleep 1
  podman kill maria$v
  sleep 1
done

10.3 result

+ podman run -d --rm -e MYSQL_ROOT_PASSWORD=pass --expose 3306 --volume /home/dan/datadir/data10.3:/var/lib/mysql:Z --name maria10.3 mariadb_test:10.3
b5e35dbf6783dc0fffd3b41d755ddfae8617260f68abcde196287569a1b619f3
+ grep -iq 'ready for start up' /dev/fd/63
++ podman logs -f maria10.3

real	0m7.789s
user	0m0.000s
sys	0m0.002s

10.4 result

+ podman run -d --rm -e MYSQL_ROOT_PASSWORD=pass --expose 3306 --volume /home/dan/datadir/data10.4:/var/lib/mysql:Z --name maria10.4 mariadb_test:10.4
0b86680f45cc7f8af3e0e96e136ab2c6799187767e3517cc59f50dd15e065a61
+ grep -iq 'ready for start up' /dev/fd/63
++ podman logs -f maria10.4

real	0m13.793s
user	0m0.000s
sys	0m0.002s
+ podman logs maria10.4

@grooverdan
Member Author

and before change:

+ podman run -d --rm -e MYSQL_ROOT_PASSWORD=pass --expose 3306 --volume /home/dan/datadir/data10.4:/var/lib/mysql:Z --name maria10.4 mariadb:10.4
7365495faf0f4767909ea1818b0290730a51f40db45011767ab5b34ab300b39e
+ grep -iq 'ready for start up' /dev/fd/63
++ podman logs -f maria10.4

real	1m36.864s
user	0m0.000s
sys	0m0.002s
+ podman run -d --rm -e MYSQL_ROOT_PASSWORD=pass --expose 3306 --volume /home/dan/datadir/data10.3:/var/lib/mysql:Z --name maria10.3 mariadb:10.3
c78d97c1889a0bdf37e87da7ef673046418bb5307cfab6c8265253445ecba2de
+ grep -iq 'ready for start up' /dev/fd/63
++ podman logs -f maria10.3

real	0m7.786s
user	0m0.002s
sys	0m0.000s

So the remaining question is whether you want to script in some Aria recovery (mysqlcheck --auto-repair) just in case. I'm putting together a test case for that now.

@grooverdan
Member Author

grooverdan commented Aug 11, 2020

On crash recovery: I managed to kill the startup of 10.3 (MyISAM) with a volume, and the restart detected errors in the tz tables. The same applies now in 10.4 (though I haven't got the timing right; from the MDEV it seems there's a ~1 s window). As such, I propose to leave that as is.

grooverdan and others added 2 commits August 11, 2020 11:02
@tianon tianon force-pushed the MDEV-23326-issue262 branch from 108b168 to 88ff4ee Compare August 11, 2020 18:03
@tianon
Copy link
Contributor

tianon commented Aug 11, 2020

Nice, thank you!! 🤘 ❤️

I did a rebase against master (and ran update.sh to apply the docker-entrypoint.sh change across all versions). Once CI is green, I plan to merge. 👍

@tianon tianon merged commit 83f552f into MariaDB:master Aug 11, 2020
docker-library-bot added a commit to docker-library-bot/official-images that referenced this pull request Aug 11, 2020
Changes:

- MariaDB/mariadb-docker@83f552f: Merge pull request MariaDB/mariadb-docker#320 from grooverdan/MDEV-23326-issue262
- MariaDB/mariadb-docker@88ff4ee: reduce docker_process_sql runs
- MariaDB/mariadb-docker@3a151d9: Speed up 10.4+ timezone initialization
- MariaDB/mariadb-docker@846fe2f: Update to 1:10.4.14+maria~focal
- MariaDB/mariadb-docker@b847957: Update to 1:10.3.24+maria~focal
- MariaDB/mariadb-docker@a43b52d: Update to 1:10.2.33+maria~bionic
- MariaDB/mariadb-docker@f44c127: Update to 1:10.1.46+maria-1~bionic
- MariaDB/mariadb-docker@7c75646: Update to 1:10.5.5+maria~focal
grooverdan added a commit that referenced this pull request Dec 7, 2021
Monty suggested after #320
was submitted that LOCK TABLES reduces the IO by making the fdatasync
occur at the UNLOCK TABLES. The advantage of this is that the timezone
tables are not written twice.

The two impediments are that TRUNCATE TABLE is in the output of
mysql_tzinfo_to_sql, which implicitly UNLOCKS the tables, and
START TRANSACTION, useful for Galera, which also implicitly UNLOCKS.

A comparison with the method prior to this commit (with TRUNCATE/START
TRANSACTION removed for a fair comparison):

real	0m1.865s
user	0m0.205s
sys	0m0.091s

To now:

real	0m1.254s
user	0m0.193s
sys	0m0.120s

Lower performing storage will show better gains.

Further improving mysql_tzinfo_to_sql remains server task MDEV-23326.
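
The LOCK TABLES approach from this commit message can be sketched as below. This is an illustration under stated assumptions rather than the committed code: the function names `emit_lock_tables` and `lock_wrap_tz_sql` are hypothetical, the table list and the `sed` patterns are guesses at what the filtering might look like, and dropping TRUNCATE TABLE assumes a freshly initialized data directory where the timezone tables are still empty.

```shell
# Sketch only: table list, sed filters, and function names are assumptions.
tz_tables="time_zone time_zone_leap_second time_zone_name time_zone_transition time_zone_transition_type"

# Print "LOCK TABLES t1 WRITE, t2 WRITE, ...;" covering every timezone table.
emit_lock_tables() {
  list=""
  for table in $tz_tables; do
    list="${list:+$list, }$table WRITE"
  done
  printf 'LOCK TABLES %s;\n' "$list"
}

# Read mysql_tzinfo_to_sql-style SQL on stdin, drop the statements that
# would implicitly release the locks, and wrap the remainder in a single
# LOCK TABLES / UNLOCK TABLES pair.
lock_wrap_tz_sql() {
  emit_lock_tables
  sed -e '/^TRUNCATE TABLE /d' -e '/^START TRANSACTION;/d'
  printf 'UNLOCK TABLES;\n'
}

# Placeholder input standing in for real mysql_tzinfo_to_sql output.
printf '%s\n' \
  'TRUNCATE TABLE time_zone;' \
  'START TRANSACTION;' \
  "INSERT INTO time_zone (Use_leap_seconds) VALUES ('N');" \
  'COMMIT;' \
  | lock_wrap_tz_sql
```

The intent, per the commit message, is that the sync happens once at UNLOCK TABLES instead of repeatedly during the load.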

The "Local time" output described in https://bugs.mysql.com/bug.php?id=20545
doesn't occur in MariaDB in combination with the bionic/focal images, so
remove the workaround for it.

As tested by:
$ podman run --rm mariadb:10.{2,6} mysql_tzinfo_to_sql /usr/share/zoneinfo | grep 'Local time'
@grooverdan grooverdan deleted the MDEV-23326-issue262 branch March 31, 2022 11:37
Labels
None yet
Development

Successfully merging this pull request may close these issues.

MySQL init process hangs when using image mariadb 10.1.42, 10.2.27, 10.3.18, 10.4.8
2 participants