Managing the GreatSQL Service with systemd: A Detailed Guide
1. The GreatSQL service file
The greatsql.service file from the official website:
[Unit]
Description=GreatSQL Server
Documentation=man:mysqld(8)
Documentation=http://dev.mysql.com/doc/refman/en/using-systemd.html
After=network.target
After=syslog.target
[Install]
WantedBy=multi-user.target
[Service]
# some limits are omitted in this article
User=mysql
Group=mysql
Type=notify
TimeoutSec=10
PermissionsStartOnly=true
ExecStartPre=/usr/local/GreatSQL-8.0.32-27-Linux-glibc2.28-x86_64/bin/mysqld_pre_systemd
ExecStart=/usr/local/GreatSQL-8.0.32-27-Linux-glibc2.28-x86_64/bin/mysqld $MYSQLD_OPTS
EnvironmentFile=-/etc/sysconfig/mysql
Restart=on-failure
RestartPreventExitStatus=1
Environment=MYSQLD_PARENT_PID=1
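PrivateTmp=false
What are MYSQLD_OPTS and MYSQLD_PARENT_PID in this service file for? How do Type and ExecStart relate to each other? What is the logic for stopping the service? What happens when TimeoutSec expires?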
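2. Environment variables
ExecStart=/data/svr/greatsql/bin/mysqld --defaults-file=/data/conf/greatsql4306.cnf $MYSQLD_OPTS
EnvironmentFile=-/data/conf/greatsql
2.1 MYSQLD_OPTS
MYSQLD_OPTS is an environment variable that the ExecStart line expands to pass extra command-line options to the mysqld process at startup. It suits cases where options need to be adjusted dynamically.
MYSQLD_OPTS can be set in the following ways:
[*]systemctl, as a global environment variable
# set
systemctl set-environment MYSQLD_OPTS="--general_log=1"
# unset
systemctl unset-environment MYSQLD_OPTS
[*]Service file, as an environment variable for a single service
Environment=MYSQLD_OPTS=--general_log=1
EnvironmentFile=-/data/conf/greatsql
2.2 Environment
[*]Setting Environment in the service file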
Environment=LD_PRELOAD=/usr/local/jemalloc-5.3.0/lib/libjemalloc.so
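# overrides the earlier variable of the same name
Environment=LD_PRELOAD=/data/svr/greatsql/lib/mysql/libjemalloc.so
# clears all environment variables set so far
Environment=
If the same variable is set more than once, the later assignment overrides the earlier one. Assigning the empty string to this option resets the list of environment variables, and all earlier assignments lose effect.
[*]Setting EnvironmentFile in the service file
# the leading "-" means a missing file is not treated as an error
EnvironmentFile=-/etc/sysconfig/mysql
EnvironmentFile=-/data/conf/greatsql
# clears the list of files to read
EnvironmentFile=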
$ cat /data/conf/greatsql
LD_PRELOAD=/data/svr/greatsql/lib/mysql/libjemalloc.so
LD_LIBRARY_PATH=/data/svr/greatsql/lib
TZ=CST
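MYSQLD_OPTS=--general_log=1 --port=4307
EnvironmentFile can be specified multiple times, and every listed file is read. Assigning the empty string to this option clears the list of files to read, and all earlier settings lose effect.
EnvironmentFile entries are read in the order listed. A variable loaded later overrides an earlier one, and variables from EnvironmentFile override same-named variables set with Environment.
Environment and EnvironmentFile are resolved before the service's commands run. The resulting variables go into the service's environment and are visible to all of its commands (ExecStartPre, ExecStart, ExecStartPost).
If the file named by EnvironmentFile is generated dynamically at run time, systemd still tries to read it. The file is read shortly before a command is executed, so if it was modified in the meantime (for example by an earlier ExecStartPre), systemd uses the latest content.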
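The precedence rules above can be made concrete with a small Python model. This is only a sketch under simplifying assumptions: the function names are hypothetical, each Environment= line is assumed to hold a single KEY=VALUE pair without quoting, and a dictionary stands in for the file system. It does not reproduce systemd's actual parser.

```python
from typing import Dict, List, Mapping, Sequence, Tuple


def parse_assignments(text: str) -> Dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and '#' or ';' comments."""
    env: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line[0] in "#;" or "=" not in line:
            continue
        key, value = line.split("=", 1)
        env[key.strip()] = value.strip()
    return env


def effective_environment(
    directives: Sequence[Tuple[str, str]],
    files: Mapping[str, str],
) -> Dict[str, str]:
    """Model the rules described in the text.

    directives: ordered (name, value) pairs as they appear in the unit file.
    files: path -> file content, a stand-in for the real file system.
    """
    env_pairs: Dict[str, str] = {}
    env_files: List[str] = []
    for name, value in directives:
        if name == "Environment":
            if value == "":
                env_pairs.clear()          # empty assignment resets the list
            else:
                key, val = value.split("=", 1)
                env_pairs[key] = val       # later assignment wins
        elif name == "EnvironmentFile":
            if value == "":
                env_files.clear()          # empty assignment resets the list
            else:
                env_files.append(value)

    result = dict(env_pairs)
    for path in env_files:                 # files are read in listed order
        optional = path.startswith("-")
        real = path[1:] if optional else path
        if real not in files:
            if optional:
                continue                   # "-" prefix: ignore a missing file
            raise FileNotFoundError(real)
        result.update(parse_assignments(files[real]))  # files override Environment=
    return result
```

Feeding it the directives from this section together with the content of /data/conf/greatsql yields the LD_PRELOAD value from the last Environment= line, while TZ and MYSQLD_OPTS come from the file.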
3. Startup
systemd creates service processes through fork-exec plus cgroups and manages them strictly, ensuring that every process of the service is its child.
[*]Uses fork() + execve() to spawn the new process:
[*]fork(): Creates a child process (a copy of the systemd parent).
[*]execve(): Overwrites the child process with the target binary.
[*]Assigns the process to a dedicated cgroup
[*]Ensures all child processes remain within the same cgroup.
[*]Enables resource limits and process tracking.
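The fork() + execve() part of the list above can be illustrated with a few lines of Python on a POSIX system. This is a minimal sketch: the function name is hypothetical, and the cgroup assignment that systemd performs is not shown.

```python
import os
from typing import List


def spawn_and_wait(argv: List[str]) -> int:
    """fork() a child, exec the target program in it, and return its exit status."""
    pid = os.fork()
    if pid == 0:
        # Child: replace this process image with the target binary.
        try:
            os.execv(argv[0], argv)
        finally:
            # Only reached when execv() itself failed.
            os._exit(127)
    # Parent: wait for the child, as a supervising process would.
    _, status = os.waitpid(pid, 0)
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    # Killed by a signal: report it as a negative number.
    return -os.WTERMSIG(status)
```

Because the child is created by the caller, the caller is the one that receives its exit status, which is what lets a supervisor tell a clean exit from a crash.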
3.1 Type=simple
Type=simple
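ExecStart=/data/svr/greatsql/bin/mysqld --defaults-file=/data/conf/greatsql4306.cnf $MYSQLD_OPTS
3.1.1 Behavior
[*]ExecStart must launch a foreground command, which becomes the service's main process (Main Process)
[*]The service counts as started as soon as the ExecStart process is created, even if it is still initializing or its listening port is not ready; if the process crashes or exits, systemd decides whether to restart it according to Restart=
[*]Suited to services that do not fork() and do not depend on other processes
3.1.2 Incorrect example
If the command launched by ExecStart runs in daemon mode, the daemon is created through an intermediate parent that exits almost immediately, and that intermediate parent is the child process systemd is watching. When it exits, systemd drops it from monitoring and kills all remaining processes of the service (how they are killed is controlled by KillMode).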
# KillMode=control-group
$ systemctl start db-4306
$ systemctl status db-4306
● db-4306.service - db-4306 Server
Loaded: loaded (/usr/lib/systemd/system/db-4306.service; enabled; vendor preset: disabled)
Active: inactive (dead) since Fri 2025-05-30 11:03:26 CST; 9s ago
Process: 1914 ExecStart=/data/svr/greatsql/bin/mysqld --defaults-file=/data/conf/greatsql4306.cnf --daemonize $MYSQLD_OPTS (code=exited, status=0/SUCCESS)
Main PID: 1914 (code=exited, status=0/SUCCESS)
Jun 05 11:03:22 dbcluster-165 systemd: Started db-4306 Server.
$ ps aux |grep 4306 |grep -v grep
With Type=simple and a daemonizing command, the service by default stops right after it starts.
3.2 Type=forking
Type=forking
PIDFile=/data/dbdata/data4306/data/mysql.pid
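ExecStart=/data/svr/greatsql/bin/mysqld --defaults-file=/data/conf/greatsql4306.cnf --daemonize $MYSQLD_OPTS
3.2.1 Behavior
[*]ExecStart must launch a command that runs in daemon mode; the service is expected to fork() on its own, and the launched process is expected to exit
[*]The process forked by the ExecStart process becomes the service's main process (Main Process). Setting PIDFile is recommended so that systemd monitors the correct main process; otherwise it is tracked through the cgroup
[*]PIDFile is only appropriate with Type=forking. When PIDFile is set, systemd reads it as soon as the ExecStart process exits; if the read succeeds the service is considered started, and if it fails the service is considered to have failed to start
[*]Suited to traditional Unix daemons
The following is a service that started normally in forking mode: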
$ systemctl status db-4306
● db-4306.service - db-4306 Server
Loaded: loaded (/usr/lib/systemd/system/db-4306.service; enabled; vendor preset: disabled)
Active: active (running) since Thu 2025-05-29 22:28:03 CST; 11s ago
Process: 24262 ExecStart=/data/svr/greatsql/bin/mysqld --defaults-file=/data/conf/greatsql4306.cnf --daemonize $MYSQLD_OPTS (code=exited, status=0/SUCCESS)
Main PID: 24342 (mysqld)
Tasks: 54
CGroup: /system.slice/db-4306.service
└─24342 /data/svr/greatsql/bin/mysqld --defaults-file=/data/conf/greatsql4306.cnf --daemonize
May 29 22:28:01 dbcluster-165 systemd: Starting db-4306 Server...
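May 29 22:28:03 dbcluster-165 systemd: Started db-4306 Server.
The process launched by ExecStart has PID 24262 and is shown as exited with status 0. It is the short-lived intermediate parent created while the daemon was being set up. Main PID: 24342 (mysqld) is the main process that systemd actually monitors.
3.2.2 Incorrect example
If ExecStart is a foreground command, systemd keeps waiting for the ExecStart process to exit as the intermediate parent. While it waits, systemctl start hangs until the start timeout expires and the start fails.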
$ systemctl status db-4306
● db-4306.service - db-4306 Server
Loaded: loaded (/usr/lib/systemd/system/db-4306.service; enabled; vendor preset: disabled)
Active: activating (start) since Fri 2025-05-30 17:25:01 CST; 52s ago
Main PID: 27683 (code=exited, status=0/SUCCESS); : 12646 (mysqld)
Tasks: 54
CGroup: /system.slice/db-4306.service
└─12646 /data/svr/greatsql/bin/mysqld --defaults-file=/data/conf/greatsql4306.cnf
May 30 17:25:01 dbcluster-165 systemd: Starting db-4306 Server...
$ ps axj |grep 4306 |grep -v grep
18266 12640 12640 18266 pts/1 12640 S+ 0 0:00 systemctl start db-4306
1 12646 12646 12646 ? -1 Ssl 986 0:02 /data/svr/greatsql/bin/mysqld --defaults-file=/data/conf/greatsql4306.cnf
$ tailf /var/log/messages |grep db-4306
May 30 17:25:01 dbcluster-165 systemd: Starting db-4306 Server...
May 30 17:26:31 dbcluster-165 systemd: db-4306.service start operation timed out. Terminating.
May 30 17:26:32 dbcluster-165 systemd: Failed to start db-4306 Server.
May 30 17:26:32 dbcluster-165 systemd: Unit db-4306.service entered failed state.
May 30 17:26:32 dbcluster-165 systemd: db-4306.service failed.
May 30 17:26:32 dbcluster-165 systemd: db-4306.service holdoff time over, scheduling restart.
May 30 17:26:32 dbcluster-165 systemd: Stopped db-4306 Server.
May 30 17:26:32 dbcluster-165 systemd: Starting db-4306 Server...
With Type=forking and a foreground command, the start timeout under Restart=on-failure makes the service restart over and over.
3.3 Type=notify
Type=notify
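ExecStart=/data/svr/greatsql/bin/mysqld --defaults-file=/data/conf/greatsql4306.cnf $MYSQLD_OPTS
3.3.1 Behavior
[*]Like simple, ExecStart must launch a foreground command, which becomes the service's main process (Main Process)
[*]The process must support sd_notify(), and NotifyAccess and the timeouts must be configured correctly
[*]The process has to call sd_notify() itself to tell systemd when startup has completed, when its status changes, and when it is stopping
[*]Suited to services that need more precise management of startup, running state, and shutdown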
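What a notification looks like on the wire can be sketched in Python: the service sends a short datagram such as READY=1 to the Unix socket named in the NOTIFY_SOCKET environment variable. The function below is an illustrative stand-in for sd_notify(), not the code mysqld uses.

```python
import os
import socket


def sd_notify(state: str) -> bool:
    """Send a notification such as "READY=1" to the socket in NOTIFY_SOCKET.

    Returns False when NOTIFY_SOCKET is unset, i.e. the process was not
    started by a service manager that expects notifications.
    """
    path = os.environ.get("NOTIFY_SOCKET")
    if not path:
        return False
    if path.startswith("@"):
        # A leading "@" denotes a Linux abstract-namespace socket.
        address = b"\0" + path[1:].encode("utf-8")
    else:
        address = path
    with socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM) as sock:
        sock.sendto(state.encode("utf-8"), address)
    return True
```

A service would typically send READY=1 once it can accept connections, STATUS=... lines while running, and STOPPING=1 when it begins to shut down. The Status: field in the systemctl status output is filled from the STATUS= messages.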
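3.4 Showing more options on the mysqld command line
When the database is started with mysqld_safe, ps shows the mysqld process with many options on its command line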
$ ps aux |grep 4306
mysql 8787 0.0 0.0 113316 1640 pts/1 S 08:59 0:00 /bin/sh /data/svr/greatsql/bin/mysqld_safe --defaults-file=/data/conf/greatsql4306.cnf
mysql 10424 0.6 3.0 1251912 499724 pts/1 Sl 08:59 0:16 /data/svr/greatsql/bin/mysqld --defaults-file=/data/conf/greatsql4306.cnf --basedir=/data/svr/greatsql --datadir=/data/dbdata/data4306/data --plugin-dir=/data/svr/greatsql/lib/plugin --log-error=/data/logs/error4306.log --open-files-limit=65535 --pid-file=/data/dbdata/data4306/data/mysql.pid --socket=/data/dbdata/data4306/data/mysql.sock --port=4306
mysqld_safe assembles the command line as follows:
cmd="`mysqld_ld_preload_text`$NOHUP_NICENESS"
for i in "$ledir/$MYSQLD" "$defaults" "--basedir=$MY_BASEDIR_VERSION" \
"--datadir=$DATADIR" "--plugin-dir=$plugin_dir" "$USER_OPTION"
do
cmd="$cmd "`shell_quote_string "$i"`
done
cmd="$cmd $args"
# Avoid 'nohup: ignoring input' warning
test -n "$NOHUP_NICENESS" && cmd="$cmd < /dev/null"
log_notice "Starting $MYSQLD daemon with databases from $DATADIR"
For a systemd service, an ExecStartPre can be added that extracts the options to display from the --defaults-file:
Type=notify
ExecStartPre=-/bin/bash -c "sed 's/_/-/g; s/ //g; s/#.*//' /data/conf/greatsql4306.cnf |grep -E '^(basedir|datadir|log-error|socket|port)=' |sed 's/^/--/' |tr '\n' ' ' |sed 's/^/MYSQLD_OPTS=/' > /data/conf/greatsql4306.env"
ExecStart=/data/svr/greatsql/bin/mysqld --defaults-file=/data/conf/greatsql4306.cnf $MYSQLD_OPTS
EnvironmentFile=-/data/conf/greatsql4306.env
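The sed pipeline in ExecStartPre above can also be written as a short Python function, which makes each step easier to follow. This is a sketch, not the author's method, and the function name is hypothetical. One behavioral difference is deliberate: the sed expression s/_/-/g rewrites underscores anywhere on the line, including inside values, which is harmless for the paths in this article but would alter a path that contains an underscore. The sketch normalizes only the option name. Like the sed version, it ignores [section] headers and therefore picks up matching keys from every section of the file.

```python
from typing import Iterable, List

WANTED = ("basedir", "datadir", "log-error", "socket", "port")


def build_mysqld_opts(cnf_text: str, wanted: Iterable[str] = WANTED) -> str:
    """Return a MYSQLD_OPTS=... line built from selected options of a .cnf file."""
    wanted_set = set(wanted)
    options: List[str] = []
    for raw in cnf_text.splitlines():
        line = raw.split("#", 1)[0].strip()      # drop comments and padding
        if "=" not in line:
            continue                             # section headers, bare flags
        key, value = line.split("=", 1)
        key = key.strip().replace("_", "-")      # normalize the option name only
        if key in wanted_set:
            options.append(f"--{key}={value.strip()}")
    return "MYSQLD_OPTS=" + " ".join(options)
```

Writing the returned line to /data/conf/greatsql4306.env would give ExecStart the same kind of MYSQLD_OPTS value that the sed pipeline produces.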
# result after startup
$ systemctl -l status db-4306
● db-4306.service - db-4306 Server
Loaded: loaded (/usr/lib/systemd/system/db-4306.service; enabled; vendor preset: disabled)
Active: active (running) since Fri 2025-06-06 10:27:16 CST; 7h ago
Main PID: 12020 (mysqld)
Status: "Server is operational"
Tasks: 53
CGroup: /system.slice/db-4306.service
└─12020 /data/svr/greatsql/bin/mysqld --defaults-file=/data/conf/greatsql4306.cnf --port=4306 --basedir=/data/svr/greatsql --datadir=/data/dbdata/data4306/data --pid-file=/data/dbdata/data4306/data/mysql.pid --socket=/data/dbdata/data4306/data/mysql.sock --log-error=/data/logs/error4306.log
Jun 06 10:27:14 dbcluster-165 systemd: Starting db-4306 Server...
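Jun 06 10:27:16 dbcluster-165 systemd: Started db-4306 Server.
3.5 mysqld starting before the disk is mounted
When the service starts at boot and the disk mount completes slowly, the database service may fail with errors. The database service can be configured to start after a delay
[Unit]
Description=GreatSQL Server
# in some environments After= and Requires=local-fs.target have no effect
After=network.target local-fs.target
Requires=local-fs.target
[Service]
Type=notify
Environment=LD_PRELOAD=/usr/local/jemalloc-5.3.0/lib/libjemalloc.so
# the approach used in this section
ExecStartPre=-/usr/bin/sleep 5
ExecStart=/data/svr/greatsql/bin/mysqld --defaults-file=/data/conf/greatsql4306.cnf $MYSQLD_OPTS
The service log looks like this: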
$ systemctl start db-4306
$ systemctl -l status db-4306
● db-4306.service - db-4306 Server
Loaded: loaded (/usr/lib/systemd/system/db-4306.service; enabled; vendor preset: disabled)
Active: active (running) since Fri 2025-06-13 23:16:09 CST; 15s ago
Process: 3920 ExecStartPre=/usr/bin/sleep 5 (code=exited, status=0/SUCCESS)
Main PID: 4004 (mysqld)
Status: "Server is operational"
Tasks: 54
CGroup: /system.slice/db-4306.service
└─4004 /data/svr/greatsql/bin/mysqld --defaults-file=/data/conf/greatsql4306.cnf
Jun 13 23:16:02 dbcluster-165 systemd: Starting db-4306 Server...
Jun 13 23:16:02 dbcluster-165 sleep: ERROR: ld.so: object '/usr/local/jemalloc-5.3.0/lib/libjemalloc.so' from LD_PRELOAD cannot be preloaded: ignored.
Jun 13 23:16:09 dbcluster-165 systemd: Started db-4306 Server.
$ lsof -p 4004 |grep -i jem
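mysqld 4004 mysql mem REG 8,2 10479400 846986 /usr/local/jemalloc-5.3.0/lib/libjemalloc.so.2
Why systemctl status shows the ERROR: ExecStartPre runs with the service's environment variables, and because the disk is not yet mounted at that point, the .so file cannot be preloaded (this ERROR can be ignored). As long as the delay is long enough for the mount to finish before ExecStart runs, the configured .so file loads normally.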
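4. Stopping
4.1 systemctl stop logic
[*]Step 1: run ExecStop (only if it is set; otherwise go to step 2)
[*]ExecStop terminates the process: systemd sees that the process has exited and sends no SIGTERM
[*]ExecStop does not terminate the process, for example because ExecStop timed out (TimeoutStopSec) or because the ExecStop command has nothing to do with stopping the process: go to step 2
[*]Step 2: send SIGTERM (the default, KillSignal=SIGTERM) for a graceful exit
[*]SIGTERM terminates the process: the service exits
[*]SIGTERM does not terminate the process, for example because it timed out (TimeoutStopSec): go to step 3
[*]Step 3: send SIGKILL (the default, SendSIGKILL=yes) to force termination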
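The three steps above amount to a small decision procedure, sketched here in Python. The function and parameter names are hypothetical; the model only records which actions systemd takes and ignores timing.

```python
from typing import List


def stop_sequence(
    exec_stop_set: bool,
    exec_stop_terminates: bool,
    sigterm_terminates: bool,
    send_sigkill: bool = True,
) -> List[str]:
    """Return the ordered actions taken for `systemctl stop`, per the steps above."""
    steps: List[str] = []
    # Step 1: ExecStop runs only when it is configured.
    if exec_stop_set:
        steps.append("ExecStop")
        if exec_stop_terminates:
            return steps                 # process is gone, no signal needed
    # Step 2: KillSignal (SIGTERM by default).
    steps.append("SIGTERM")
    if sigterm_terminates:
        return steps
    # Step 3: SIGKILL after TimeoutStopSec, unless SendSIGKILL=no.
    if send_sigkill:
        steps.append("SIGKILL")
    return steps
```

For a slow shutdown with an ExecStop that does not stop mysqld and a short TimeoutStopSec, the result is ExecStop, SIGTERM, SIGKILL.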
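4.2 systemctl stop timing out
For scenarios such as version upgrades, innodb_fast_shutdown=0 is usually set, which makes shutting down the database slow. If TimeoutStopSec is too small, ExecStop and SIGTERM may time out and trigger SIGKILL.
[*]The database runs as a systemd service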
Type=notify
ExecStart=/data/svr/greatsql/bin/mysqld --defaults-file=/data/conf/greatsql4306.cnf
Restart=on-failure
[*]ExecStop times out, then SIGTERM times out; the corresponding service log
$ date;systemctl stop db-4306
Tue Jun 3 15:37:19 CST 2025
$ systemctl status db-4306
● db-4306.service - db-4306 Server
Loaded: loaded (/usr/lib/systemd/system/db-4306.service; enabled; vendor preset: disabled)
Active: failed (Result: signal) since Tue 2025-06-03 15:37:21 CST; 36s ago
Process: 26645 ExecStop=/usr/bin/sleep 60 (code=killed, signal=TERM)
Process: 26322 ExecStart=/data/svr/greatsql/bin/mysqld --defaults-file=/data/conf/greatsql4306.cnf (code=killed, signal=KILL)
Main PID: 26322 (code=killed, signal=KILL)
Status: "Server shutdown in progress"
Jun 03 15:37:01 dbcluster-165 systemd: Starting db-4306 Server...
Jun 03 15:37:03 dbcluster-165 systemd: Started db-4306 Server.
Jun 03 15:37:19 dbcluster-165 systemd: Stopping db-4306 Server...
Jun 03 15:37:20 dbcluster-165 systemd: db-4306.service stopping timed out. Terminating.
Jun 03 15:37:21 dbcluster-165 systemd: db-4306.service stop-sigterm timed out. Killing.
Jun 03 15:37:21 dbcluster-165 systemd: db-4306.service: main process exited, code=killed, status=9/KILL
Jun 03 15:37:21 dbcluster-165 systemd: Stopped db-4306 Server.
Jun 03 15:37:21 dbcluster-165 systemd: Unit db-4306.service entered failed state.
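Jun 03 15:37:21 dbcluster-165 systemd: db-4306.service failed.
When systemctl stop times out (ExecStop, then SIGTERM), the service is killed with SIGKILL. systemd regards this as expected behavior, because the stop was requested through systemd, and does not trigger a restart.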
4.3 shutdown from the command line timing out
[*]The database runs as a systemd service (same as above)
[*]Run shutdown from the command line
greatsql> shutdown;
systemd applies full lifecycle control to every service it manages. When shutdown is executed, the chain is: shutdown --> database service --> systemd service manager. systemd monitors the whole stop process.
[*]Corresponding service log
Jun 3 17:32:23 dbcluster-165 systemd: db-4306.service stop-sigterm timed out. Killing.
Jun 3 17:32:24 dbcluster-165 systemd: db-4306.service: main process exited, code=killed, status=9/KILL
Jun 3 17:32:24 dbcluster-165 systemd: Unit db-4306.service entered failed state.
Jun 3 17:32:24 dbcluster-165 systemd: db-4306.service failed.
Jun 3 17:32:24 dbcluster-165 systemd: db-4306.service holdoff time over, scheduling restart.
Jun 3 17:32:24 dbcluster-165 systemd: Stopped db-4306 Server.
Jun 3 17:32:24 dbcluster-165 systemd: Starting db-4306 Server...
Jun 3 17:32:25 dbcluster-165 systemd: Started db-4306 Server.
SIGTERM times out, SIGKILL is sent to force termination, and the exit cause of the process is an unclean signal. With Restart=on-failure, the service restarts after waiting RestartSec (100ms by default).
When a command-line shutdown times out (SIGTERM), the service is killed with SIGKILL and a restart is triggered afterwards.
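4.4 Disabling systemd's default signals
To disable systemd's default stop behavior, settings such as the following can be used
# this script must make sure that all processes are terminated
ExecStop=/path/to/your-stop-script
# disable systemd's default kill behavior
KillMode=none
# keep the stop timeout from interfering
TimeoutStopSec=0
5. Restart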
Configures whether the service shall be restarted when the service process exits, is killed, or a timeout is reached. The service process may be the main service process, but it may also be one of the processes specified with ExecStartPre=, ExecStartPost=, ExecStop=, ExecStopPost=, or ExecReload=. When the death of the process is a result of systemd operation (e.g. service stop or restart), the service will not be restarted. Timeouts include missing the watchdog "keep-alive ping" deadline and a service start, reload, and stop operation timeouts.
5.1 Values of Restart
| Restart settings/Exit causes | no | always | on-success | on-failure | on-abnormal | on-abort | on-watchdog |
|---|---|---|---|---|---|---|---|
| Clean exit code or signal | | X | X | | | | |
| Unclean exit code | | X | | X | | | |
| Unclean signal | | X | | X | X | X | |
| Timeout | | X | | X | X | | |
| Watchdog | | X | | X | X | | X |
A clean exit means an exit code of 0, or one of the signals SIGHUP, SIGINT, SIGTERM or SIGPIPE, and additionally, exit statuses and signals specified in SuccessExitStatus=.
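The Restart matrix can be encoded as a lookup table, which makes it easy to check the cases discussed in section 4. This is a sketch: the cause labels and function names are hypothetical, and SuccessExitStatus= is not modeled.

```python
from typing import Dict, Optional, Set

# Exit causes for which each Restart= value triggers a restart.
RESTART_TABLE: Dict[str, Set[str]] = {
    "no": set(),
    "always": {"clean", "unclean-exit-code", "unclean-signal", "timeout", "watchdog"},
    "on-success": {"clean"},
    "on-failure": {"unclean-exit-code", "unclean-signal", "timeout", "watchdog"},
    "on-abnormal": {"unclean-signal", "timeout", "watchdog"},
    "on-abort": {"unclean-signal"},
    "on-watchdog": {"watchdog"},
}

CLEAN_SIGNALS = {"SIGHUP", "SIGINT", "SIGTERM", "SIGPIPE"}


def classify_exit(exit_code: Optional[int] = None, signal: Optional[str] = None) -> str:
    """Map an exit code or a terminating signal to a row of the table."""
    if signal is not None:
        return "clean" if signal in CLEAN_SIGNALS else "unclean-signal"
    return "clean" if exit_code == 0 else "unclean-exit-code"


def should_restart(policy: str, cause: str, stopped_by_systemd: bool = False) -> bool:
    """Apply the table; a stop requested through systemd never restarts."""
    if stopped_by_systemd:
        return False
    return cause in RESTART_TABLE[policy]
```

With Restart=on-failure, a SIGKILL after a command-line shutdown classifies as an unclean signal and leads to a restart, whereas the same SIGKILL during systemctl stop does not.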
5.2 restart from the command line fails
[*]The database runs as a systemd service (same as above)
[*]Run restart from the command line
greatsql> restart;
ERROR 3707 (HY000): Restart server failed (mysqld is not managed by supervisor process).
Cause: no monitoring process has been set up (https://dev.mysql.com/doc/refman/8.0/en/restart.html)
[*]Check MYSQLD_PARENT_PID of the running database process
$ ps aux |grep 4306 |grep -v grep
mysql 21055 0.5 3.1 2635940 505712 ? Ssl Jun03 7:31 /data/svr/greatsql/bin/mysqld --defaults-file=/data/conf/greatsql4306.cnf
$ cat /proc/21055/environ |tr '\0' '\n'
LANG=en_US.UTF-8
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin
NOTIFY_SOCKET=/run/systemd/notify
HOME=/home/mysql
LOGNAME=mysql
USER=mysql
SHELL=/bin/bash
[*]Set MYSQLD_PARENT_PID
[*]mysqld_safe: by default it sets its own PID as the monitoring process
#echo "Running mysqld: [$cmd]"
cmd="env MYSQLD_PARENT_PID=$$ $cmd"
eval "$cmd"
[*]systemd: add the following to the service file
Environment=MYSQLD_PARENT_PID=1
[*]Restart the database for the change to take effect
$ systemctl restart db-4306
$ ps aux |grep 4306 |grep -v grep
mysql 11050 0.9 3.0 2635676 500004 ? Ssl 15:53 0:04 /data/svr/greatsql/bin/mysqld --defaults-file=/data/conf/greatsql4306.cnf
$ cat /proc/11050/environ |tr '\0' '\n' |grep MYSQLD_PARENT_PID
MYSQLD_PARENT_PID=1
greatsql> restart;
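Query OK, 0 rows affected (0.00 sec)
If the restart command has to work without restarting the database first, the environment variable can be changed on the fly with gdb; gdb blocks the database for a short time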
$ gdb -p 21055
(gdb) call putenv("MYSQLD_PARENT_PID=1")
$2 = 0
(gdb) detach
Detaching from program: /data/svr/GreatSQL-8.0.32-26-Linux-glibc2.17-x86_64/bin/mysqld, process 21055
(gdb) quit
greatsql> restart;
Query OK, 0 rows affected (0.00 sec)
Clone likewise needs MYSQLD_PARENT_PID to be set before it can restart the server automatically.
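The check done above with cat /proc/<pid>/environ can be scripted. The sketch below parses the NUL-separated content of that file; the function names are hypothetical. One caveat: /proc/<pid>/environ shows the environment the process was started with, so a variable added later with putenv(), as in the gdb approach, is not expected to appear there.

```python
from typing import Dict


def parse_environ(raw: bytes) -> Dict[str, str]:
    """Parse the NUL-separated KEY=VALUE entries of /proc/<pid>/environ."""
    env: Dict[str, str] = {}
    for entry in raw.split(b"\0"):
        if b"=" not in entry:
            continue                      # empty trailing entry
        key, value = entry.split(b"=", 1)
        env[key.decode(errors="replace")] = value.decode(errors="replace")
    return env


def has_supervisor(raw: bytes) -> bool:
    """True when the process was started with MYSQLD_PARENT_PID set."""
    return "MYSQLD_PARENT_PID" in parse_environ(raw)
```

In practice the bytes would come from open(f"/proc/{pid}/environ", "rb").read(), run as a user that is allowed to read that file.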
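5.3 Related parameters
RestartSec: how long to wait before restarting automatically once the restart condition is met
StartLimitInterval, StartLimitBurst: limit the number of restarts (StartLimitBurst) within a given time window (StartLimitInterval)
RestartPreventExitStatus: exit status codes or signals that must not lead to a restart. Setting it to 1 is recommended for the GreatSQL service, so that on a serious error the instance is not restarted and someone has to step in
RestartForceExitStatus: exit status codes or signals that always lead to a restart. For example, with Restart=no and RestartForceExitStatus=16, the service does not restart automatically on other exits, but restart from the command line still restarts the instance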
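How the two exit-status lists interact with the Restart= policy can be sketched as follows. The function name is hypothetical, only numeric exit statuses are modeled, and checking the prevent list before the force list is an assumption about precedence rather than something stated in this article.

```python
from typing import Iterable


def restart_decision(
    policy_says_restart: bool,
    exit_status: int,
    prevent: Iterable[int] = (),
    force: Iterable[int] = (),
) -> bool:
    """Combine the Restart= outcome with the two exit-status override lists."""
    if exit_status in set(prevent):
        return False          # RestartPreventExitStatus: never restart
    if exit_status in set(force):
        return True           # RestartForceExitStatus: always restart
    return policy_says_restart
```

With Restart=no and RestartForceExitStatus=16, an exit with status 16 restarts the instance while any other exit does not. With Restart=on-failure and RestartPreventExitStatus=1, an exit with status 1 leaves the instance down.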
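6. Summary
[*]Set Type and ExecStart so that they match each other
[*]Set a generous TimeoutStopSec to keep ExecStop or SIGTERM from timing out
[*]Clone and command-line restart require MYSQLD_PARENT_PID
[*]Set RestartSec, StartLimitInterval, and StartLimitBurst to limit the restart frequency
[*]ExecStartPre and EnvironmentFile can be used to add the options that ps should display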
Enjoy GreatSQL