找回密码
 立即注册
首页 业界区 安全 PostgreSQL pg_auto_failover 高可用 1:pg_auto_failov ...

PostgreSQL pg_auto_failover 高可用 1:pg_auto_failover集群搭建

濮阳雅爱 昨天 19:40
 
0,pg_auto_failover架构

开始之前,先看一下pg_auto_failover的一个最基础架构原理如下,需要弄清楚几个节点的作用
1,monitor节点的身份是一个监控节点,仅存储元数据,负责监控primary和secondary的健康状况以及异常情况下的故障转移。
2,monitor不负责存储用户数据,负责存储用户数据的是primary和secondary节点。
3,monitor节点是一个单点,存在单点故障的可能性,这是pg_auto_failover的硬伤,但monitor节点故障后不影响primary和secondary的运行。
4,primary和secondary节点是用户数据库的存储节点,先注册到monitor中的节点为主节点,后注册到monitor的节点为从节点,正常注册后实现流复制,其身份可以互换。
1.png

 
 
 
 
1,环境准备

1.1 主机环境

Ubuntu 20,三台机器均已安装好PostgreSQL 16.4版本数据库
  1. monitor: 192.168.152.121 ubuntu11
  2. 主:      192.168.152.122 ubuntu12
  3. 从:      192.168.152.123 ubuntu13
复制代码
1.2 三台主机同步修改,操作一致
  1. vi /etc/hosts
  2. 192.168.152.121 ubuntu11
  3. 192.168.152.122 ubuntu12
  4. 192.168.152.123 ubuntu13
复制代码
1.3 设置环境变量
  1. --配置postgres的环境变量
  2. vi /etc/profile
  3. export LANG=en_US.utf8
  4. export PGDATA=/usr/local/pgsql16/pg9300/data/
  5. export PGHOME=/usr/local/pgsql16/server
  6. export PATH=/usr/local/pgsql16/server/bin:$PATH
  7. export LD_LIBRARY_PATH=/usr/local/pgsql16/server/lib
复制代码
 2,pg_auto_failover源码编译安装

2.1 下载源码包,编译安装

三个节点都要安装,这里仅以Ubuntu11上的安装为示例
  1. --下载pg_auto_failover源码包
  2. cd /usr/local/pg_auto_failover
  3. wget https://github.com/hapostgres/pg_auto_failover/archive/refs/tags/v2.2.zip
  4. apt install unzip
  5. unzip v2.2.zip
  6. --解压后路径如下
  7. drwxr-xr-x  3 root root    4096 Jun  4 16:27 ./
  8. drwxr-xr-x 14 root root    4096 Jun  4 16:26 ../
  9. drwxr-xr-x  8 root root    4096 Apr  3 20:05 pg_auto_failover-2.2/
  10. -rw-r--r--  1 root root 1364027 Jun  4 16:26 v2.2.zip
  11. --这里的安装,实际上将pg_auto_failover的编译文件,安装到上面配置postgres的环境变量的指定的路径中
  12. cd pg_auto_failover-2.2/
  13. make
  14. make install
  15. --安装完成后,重新将PGHOME相关目录授权给postgres用户,否则后续使用pg_autoctl的时候会报找不到命令的错误
  16. chown -R postgres:postgres /usr/local/pgsql16/
复制代码
2.2 编译安装的pg_auto_failover位置

实际上pg_auto_failover的编译安装之后,把文件存放在上述环境变量的PGHOME=/usr/local/pgsql16/server目录的bin和lib目录下
2.png

3.png
  3,monitor节点配置pg_auto_failover

3.1 pg_auto_failover初始化

原实例是自动安装的,初始化了data目录,这里需要停止原实例,并且需要将data目录清理干净,需要切换到postgres用户下执行pg_autoctl create monitor --pgdata /usr/local/pgsql16/pg9300/data/  --auth trust  --ssl-self-signed --hostname ubuntu11  --pgport 9300 --run1,这里加上pgdata和pgport等参数,不加的话pgdata从上面设置的环境变量中获取,端口号为默认的54322,加上 --run会非后台的方式自动运行pg_auto_failover实例,没必要加3,--hostname 要用主机名,不能用IP 遇到的问题:1,ERROR Failed to create state directory "/run/user/0/pg_autoctl": Permission denied        root@ubuntu11:/usr/local/pg_install_packgae# sudo chmod -R 777 /run/user/0        root@ubuntu11:/usr/local/pg_install_packgae# sudo chown -R postgres:postgres /run/user/02,pg_autoctl: command not found        source /etc/profile错误如下
  1. postgres@ubuntu11:/usr/local/pg_install_packgae$ pg_autoctl create monitor --pgdata /usr/local/pgsql16/pg9300/data/  --auth trust  --ssl-self-signed --hostname ubuntu11  --pgport 9300
  2. pg_autoctl: command not found
  3. postgres@ubuntu11:/usr/local/pg_install_packgae$
  4. postgres@ubuntu11:/usr/local/pg_install_packgae$
  5. postgres@ubuntu11:/usr/local/pg_install_packgae$ /usr/local/pgsql16/server/bin/pg_autoctl create monitor --pgdata /usr/local/pgsql16/pg9300/data/  --auth trust  --ssl-self-signed --hostname 127.0.0.1  --pgport 9300
  6. 10:47:45 2821 ERROR Failed to create state directory "/run/user/0/pg_autoctl": Permission denied
  7. 10:47:45 2821 ERROR Failed to build pg_autoctl pid file pathname, see above.
  8. 10:47:45 2821 FATAL Failed to set pid filename from PGDATA "/usr/local/pgsql16/pg9300/data/", see above for details.
  9. postgres@ubuntu11:/usr/local/pg_install_packgae$
  10. postgres@ubuntu11:/usr/local/pg_install_packgae$
  11. postgres@ubuntu11:/usr/local/pg_install_packgae$ exit
  12. exit
  13. root@ubuntu11:/usr/local/pg_install_packgae# sudo chmod -R 777 /run/user/0
  14. root@ubuntu11:/usr/local/pg_install_packgae# sudo chown -R postgres:postgres /run/user/0
  15. root@ubuntu11:/usr/local/pg_install_packgae#
复制代码
修复后正常初始化monitor节点
  1. root@ubuntu11:/usr/local/pg_install_packgae# su - postgres #切换到postgres用户下执行
  2. postgres@ubuntu11:~$
  3. postgres@ubuntu11:~$
  4. postgres@ubuntu11:~$
  5. postgres@ubuntu11:~$ pg_autoctl create monitor --pgdata /usr/local/pgsql16/pg9300/data/  --auth trust  --ssl-self-signed --hostname ubuntu11  --pgport 9300 --run
  6. 05:04:27 2216 INFO  Using default --ssl-mode "require"
  7. 05:04:27 2216 INFO  Using --ssl-self-signed: pg_autoctl will create self-signed certificates, allowing for encrypted network traffic
  8. 05:04:27 2216 WARN  Self-signed certificates provide protection against eavesdropping; this setup does NOT protect against Man-In-The-Middle attacks nor Impersonation attacks.
  9. 05:04:27 2216 WARN  See https://www.postgresql.org/docs/current/libpq-ssl.html for details
  10. 05:04:27 2216 INFO  Initialising a PostgreSQL cluster at "/usr/local/pgsql16/pg9300/data"
  11. 05:04:27 2216 INFO  /usr/local/pgsql16/server/bin/pg_ctl initdb -s -D /usr/local/pgsql16/pg9300/data --option '--auth=trust'
  12. 05:04:29 2216 INFO   /usr/bin/openssl req -new -x509 -days 365 -nodes -text -out /usr/local/pgsql16/pg9300/data/server.crt -keyout /usr/local/pgsql16/pg9300/data/server.key -subj "/CN=ubuntu11"
  13. 05:04:29 2216 INFO  Started pg_autoctl postgres service with pid 2238
  14. 05:04:29 2216 INFO  Started pg_autoctl listener service with pid 2239
  15. 05:04:29 2238 INFO   /usr/local/pgsql16/server/bin/pg_autoctl do service postgres --pgdata /usr/local/pgsql16/pg9300/data/ -v
  16. 05:04:29 2243 INFO   /usr/local/pgsql16/server/bin/postgres -D /usr/local/pgsql16/pg9300/data -p 9300 -h *
  17. 05:04:29 2238 INFO  Postgres is now serving PGDATA "/usr/local/pgsql16/pg9300/data" on port 9300 with pid 2243
  18. 05:04:29 2239 WARN  NOTICE:  installing required extension "btree_gist"
  19. 05:04:29 2239 INFO  Granting connection privileges on 192.168.152.0/24
  20. 05:04:29 2239 WARN  Skipping HBA edits (per --skip-pg-hba) for rule: hostssl "pg_auto_failover" "autoctl_node" 192.168.152.0/24 trust
  21. 05:04:29 2239 INFO  Your pg_auto_failover monitor instance is now ready on port 9300.
  22. 05:04:29 2239 INFO  Monitor has been successfully initialized.
  23. 05:04:29 2239 INFO   /usr/local/pgsql16/server/bin/pg_autoctl do service listener --pgdata /usr/local/pgsql16/pg9300/data/ -v
  24. 05:04:29 2239 INFO  Managing the monitor at postgres://autoctl_node@ubuntu11:9300/pg_auto_failover?sslmode=require
  25. 05:04:29 2239 INFO  Reloaded the new configuration from "/home/postgres/.config/pg_autoctl/usr/local/pgsql16/pg9300/data/pg_autoctl.cfg"
  26. 05:04:29 2239 INFO  Reloading Postgres configuration and HBA rules
  27. 05:04:30 2239 INFO  The version of extension "pgautofailover" is "2.2" on the monitor
  28. 05:04:30 2239 INFO  Contacting the monitor to LISTEN to its events.
复制代码
3.2 pg_auto_failover systemctl脚本配置

切换到root用户下执行pg_autoctl show systemd,会打印出来生成的systemctl命令以及自动启动命令,执行如下4条命令即可pg_autoctl -q show systemd --pgdata "/usr/local/pgsql16/pg9300/data" | tee /etc/systemd/system/pgautofailover.servicesystemctl daemon-reloadsystemctl enable pgautofailoversystemctl start pgautofailover
  1. postgres@ubuntu11:~$ exit #切换到root用户执行
  2. logout
  3. root@ubuntu11:/usr/local/pg_install_packgae# pg_autoctl show systemd
  4. 05:06:26 2333 INFO  HINT: to complete a systemd integration, run the following commands (as root):
  5. 05:06:26 2333 INFO  pg_autoctl -q show systemd --pgdata "/usr/local/pgsql16/pg9300/data" | tee /etc/systemd/system/pgautofailover.service
  6. 05:06:26 2333 INFO  systemctl daemon-reload
  7. 05:06:26 2333 INFO  systemctl enable pgautofailover
  8. 05:06:26 2333 INFO  systemctl start pgautofailover
  9. [Unit]
  10. Description = pg_auto_failover
  11. [Service]
  12. WorkingDirectory = /home/postgres
  13. Environment = 'PGDATA=/usr/local/pgsql16/pg9300/data'
  14. User = postgres
  15. ExecStart = /usr/local/pgsql16/server/bin/pg_autoctl run
  16. Restart = always
  17. StartLimitBurst = 0
  18. ExecReload = /usr/local/pgsql16/server/bin/pg_autoctl reload
  19. [Install]
  20. WantedBy = multi-user.target
  21. root@ubuntu11:/usr/local/pg_install_packgae#
  22. root@ubuntu11:/usr/local/pg_install_packgae#
  23. root@ubuntu11:/usr/local/pg_install_packgae# pg_autoctl -q show systemd --pgdata "/usr/local/pgsql16/pg9300/data" | tee /etc/systemd/system/pgautofailover.service
  24. [Unit]
  25. Description = pg_auto_failover
  26. [Service]
  27. WorkingDirectory = /home/postgres
  28. Environment = 'PGDATA=/usr/local/pgsql16/pg9300/data'
  29. User = postgres
  30. ExecStart = /usr/local/pgsql16/server/bin/pg_autoctl run
  31. Restart = always
  32. StartLimitBurst = 0
  33. ExecReload = /usr/local/pgsql16/server/bin/pg_autoctl reload
  34. [Install]
  35. WantedBy = multi-user.target
  36. root@ubuntu11:/usr/local/pg_install_packgae#
  37. root@ubuntu11:/usr/local/pg_install_packgae# systemctl daemon-reload
  38. root@ubuntu11:/usr/local/pg_install_packgae#
  39. root@ubuntu11:/usr/local/pg_install_packgae# systemctl enable pgautofailover
  40. Created symlink /etc/systemd/system/multi-user.target.wants/pgautofailover.service → /etc/systemd/system/pgautofailover.service.
  41. root@ubuntu11:/usr/local/pg_install_packgae#
  42. root@ubuntu11:/usr/local/pg_install_packgae# systemctl start pgautofailover
  43. root@ubuntu11:/usr/local/pg_install_packgae#
  44. root@ubuntu11:/usr/local/pg_install_packgae# systemctl status pgautofailover
  45. ● pgautofailover.service - pg_auto_failover
  46.      Loaded: loaded (/etc/systemd/system/pgautofailover.service; enabled; vendor preset: enabled)
  47.      Active: active (running) since Thu 2025-10-09 05:06:43 UTC; 5s ago
  48.    Main PID: 2421 (pg_autoctl)
  49.       Tasks: 14 (limit: 4550)
  50.      Memory: 29.2M
  51.      CGroup: /system.slice/pgautofailover.service
  52.              ├─2421 /usr/local/pgsql16/server/bin/pg_autoctl run
  53.              ├─2444 pg_autoctl: start/stop postgres
  54.              ├─2445 pg_autoctl: monitor listener
  55.              ├─2454 /usr/local/pgsql16/server/bin/postgres -D /usr/local/pgsql16/pg9300/data -p 9300 -h *
  56.              ├─2455 postgres: pg_auto_failover monitor: logger
  57.              ├─2456 postgres: pg_auto_failover monitor: checkpointer
  58.              ├─2457 postgres: pg_auto_failover monitor: background writer
  59.              ├─2459 postgres: pg_auto_failover monitor: walwriter
  60.              ├─2460 postgres: pg_auto_failover monitor: autovacuum launcher
  61.              ├─2461 postgres: pg_auto_failover monitor: pg_auto_failover monitor
  62.              ├─2462 postgres: pg_auto_failover monitor: logical replication launcher
  63.              ├─2463 postgres: pg_auto_failover monitor: pg_auto_failover monitor healthcheck worker postgres
  64.              ├─2464 postgres: pg_auto_failover monitor: pg_auto_failover monitor healthcheck worker pg_auto_failover
  65.              └─2466 postgres: pg_auto_failover monitor: autoctl_node pg_auto_failover [local] idle
  66. Oct 09 05:06:43 ubuntu11 pg_autoctl[2421]: 05:06:43 2421 INFO  Started pg_autoctl postgres service with pid 2444
  67. Oct 09 05:06:43 ubuntu11 pg_autoctl[2421]: 05:06:43 2421 INFO  Started pg_autoctl listener service with pid 2445
  68. Oct 09 05:06:43 ubuntu11 pg_autoctl[2445]: 05:06:43 2445 INFO   /usr/local/pgsql16/server/bin/pg_autoctl do service listener --pgdata /usr/local/pgsql16/pg9300/data -v
  69. Oct 09 05:06:43 ubuntu11 pg_autoctl[2444]: 05:06:43 2444 INFO   /usr/local/pgsql16/server/bin/pg_autoctl do service postgres --pgdata /usr/local/pgsql16/pg9300/data -v
  70. Oct 09 05:06:43 ubuntu11 pg_autoctl[2445]: 05:06:43 2445 INFO  Managing the monitor at postgres://autoctl_node@ubuntu11:9300/pg_auto_failover?sslmode=require
  71. Oct 09 05:06:43 ubuntu11 pg_autoctl[2445]: 05:06:43 2445 INFO  Reloaded the new configuration from "/home/postgres/.config/pg_autoctl/usr/local/pgsql16/pg9300/data/pg_autoctl.cfg"
  72. Oct 09 05:06:43 ubuntu11 pg_autoctl[2454]: 05:06:43 2454 INFO   /usr/local/pgsql16/server/bin/postgres -D /usr/local/pgsql16/pg9300/data -p 9300 -h *
  73. Oct 09 05:06:43 ubuntu11 pg_autoctl[2444]: 05:06:43 2444 INFO  Postgres is now serving PGDATA "/usr/local/pgsql16/pg9300/data" on port 9300 with pid 2454
  74. Oct 09 05:06:43 ubuntu11 pg_autoctl[2445]: 05:06:43 2445 INFO  The version of extension "pgautofailover" is "2.2" on the monitor
  75. Oct 09 05:06:43 ubuntu11 pg_autoctl[2445]: 05:06:43 2445 INFO  Contacting the monitor to LISTEN to its events.
  76. root@ubuntu11:/usr/local/pg_install_packgae#
复制代码
执行完上述命令后,已正常启动pg_auto_failover的monitor节点。 3.3 pg_auto_failover 初始化验证

monitor节点初始化的数据文件如下,此时monitor节点使用的配置文件是postgresql-auto-failover.conf,而不是单机安装使用的postgresql.conf
4.png
查看初始化的库和相关角色对象
  1. root@ubuntu11:/usr/local/pg_install_packgae# psql -h ubuntu11 -p 9300 postgres postgres
  2. psql (16.4)
  3. SSL connection (protocol: TLSv1.3, cipher: TLS_AES_256_GCM_SHA384, compression: off)
  4. Type "help" for help.
  5. #1,自动安装了pg_auto_failover库
  6. postgres=# \l
  7.                                                           List of databases
  8.        Name       |  Owner   | Encoding | Locale Provider |  Collate   |   Ctype    | ICU Locale | ICU Rules |   Access privileges
  9. ------------------+----------+----------+-----------------+------------+------------+------------+-----------+-----------------------
  10. pg_auto_failover | autoctl  | UTF8     | libc            | en_US.utf8 | en_US.utf8 |            |           |
  11. postgres         | postgres | UTF8     | libc            | en_US.utf8 | en_US.utf8 |            |           |
  12. template0        | postgres | UTF8     | libc            | en_US.utf8 | en_US.utf8 |            |           | =c/postgres          +
  13.                   |          |          |                 |            |            |            |           | postgres=CTc/postgres
  14. template1        | postgres | UTF8     | libc            | en_US.utf8 | en_US.utf8 |            |           | =c/postgres          +
  15.                   |          |          |                 |            |            |            |           | postgres=CTc/postgres
  16. (4 rows)
  17. #2,自动创建了autoctl和autoctl_node两个角色
  18. postgres=# \du
  19.                                List of roles
  20.   Role name   |                         Attributes
  21. --------------+------------------------------------------------------------
  22. autoctl      |
  23. autoctl_node |
  24. postgres     | Superuser, Create role, Create DB, Replication, Bypass RLS
  25. #3,自动安装了如下两个扩展
  26. postgres=# \c pg_auto_failover
  27. SSL connection (protocol: TLSv1.3, cipher: TLS_AES_256_GCM_SHA384, compression: off)
  28. You are now connected to database "pg_auto_failover" as user "postgres".
  29. pg_auto_failover=# \dx
  30.                              List of installed extensions
  31.       Name      | Version |   Schema   |                  Description
  32. ----------------+---------+------------+-----------------------------------------------
  33. btree_gist     | 1.7     | public     | support for indexing common datatypes in GiST
  34. pgautofailover | 2.2     | public     | pg_auto_failover
  35. plpgsql        | 1.0     | pg_catalog | PL/pgSQL procedural language
  36. (3 rows)
  37. pg_auto_failover=#
复制代码
 查看monitor节点的连接串,主从节点加入pg_auto_failover的时候会用到这个连接串
  1. root@ubuntu11:/usr/local/pg_install_packgae#
  2. root@ubuntu11:/usr/local/pg_install_packgae#
  3. root@ubuntu11:/usr/local/pg_install_packgae# pg_autoctl show uri
  4.         Type |    Name | Connection String
  5. -------------+---------+-------------------------------
  6.      monitor | monitor | postgres://autoctl_node@127.0.0.1:9300/pg_auto_failover?sslmode=require        #一开始我懵逼了,为什么连接串里是127.0.0.1
  7.    formation | default |
  8. root@ubuntu11:/usr/local/pg_install_packgae# su - postgres
  9. postgres@ubuntu11:~$
  10. postgres@ubuntu11:~$ source /etc/profile
  11. postgres@ubuntu11:~$
  12. postgres@ubuntu11:~$ pg_autoctl show uri
  13.         Type |    Name | Connection String
  14. -------------+---------+-------------------------------
  15.      monitor | monitor | postgres://autoctl_node@Ubuntu11:9300/pg_auto_failover?sslmode=require                #后来尝试切换到postgres用户下查看,竟然变成了主机名,搞什么鬼哦
  16.    formation | default |
  17. postgres@ubuntu11:~$
复制代码
 4,主节点和从节点依次加入pg_auto_failover集群

主从节点上1,因为PostgreSQL实例是自动化安装的,初始化了数据库,这里需要停止PostgreSQL实例的服务,并且删除相关的pgdata2,因为PostgreSQL服务已经交由pg_auto_failover托管,由pg_auto_failover来启动,所以务必禁用PostgreSQL自身的自动启动服务3,2中提到了,主从以及monitor节点都要安装pg_auto_failover,这里所有节点都已经安装了pg_auto_failover 4.1 注册主节点中遇到的异常处理

注册命令:pg_autoctl create postgres --hostname ubuntu12 --name ubuntu12 --auth trust --ssl-self-signed --pgdata /usr/local/pgsql16/pg9300/data/ --pgport 9300 --monitor 'postgres://autoctl_node@Ubuntu11:9300/pg_auto_failover?sslmode=require'1,注意,Ubuntu下一定要切换到postgres下执行2,执行注册主节点命令,但是不很不幸,报错了
5.png
解决办法:1,修改monitor节点的hba.conf文件,增加受信任链接配置:host     all             all              192.168.152.0/24          trust2,重启monitor节点,systemctl restart pgautofailover 然后又报错,切换到root下执行chmod -R 777 /run/user/0chown -R postgres:postgres /run/user/0
6.png
  1. postgres@ubuntu12:/root$ pg_autoctl create postgres --hostname ubuntu12 --name ubuntu12 --auth trust --ssl-self-signed --pgdata /usr/local/pgsql16/pg9300/data/ --pgport 9300 --monitor 'postgres://autoctl_node@Ubuntu11:9300/pg_auto_failover?sslmode=require'
  2. 03:09:00 58370 ERROR Failed to create state directory "/run/user/0/pg_autoctl": Permission denied
  3. 03:09:00 58370 ERROR Failed to build pg_autoctl pid file pathname, see above.
  4. 03:09:00 58370 FATAL Failed to set pid filename from PGDATA "/usr/local/pgsql16/pg9300/data/", see above for details.
  5. postgres@ubuntu12:/root$
  6. postgres@ubuntu12:/root$
  7. postgres@ubuntu12:/root$ exit
  8. exit
  9. root@ubuntu12:~# sudo chmod -R 777 /run/user/0
  10. root@ubuntu12:~# sudo chown -R postgres:postgres /run/user/0
复制代码
 对于错误清理,由于各种原因,第一次初始化失败的话,需要完成以下两步操作才能“重置”,否则会报各种各样的错误,这里批评一下pg_auto_failover,报错信息太乱了,笔者在这里卡了很久,尝试了很久弄清楚这个地方的坑。 1,直接从monitor节点的数据库pgautofailover.node中删除注册的节点,参考如下
  1. root@ubuntu11:/usr/local/pg_install_packgae# psql -h 127.0.0.1 -p 9300 postgres postgres
  2. psql (16.4)
  3. SSL connection (protocol: TLSv1.3, cipher: TLS_AES_256_GCM_SHA384, compression: off)
  4. Type "help" for help.
  5. postgres=# \c pg_auto_failover
  6. SSL connection (protocol: TLSv1.3, cipher: TLS_AES_256_GCM_SHA384, compression: off)
  7. You are now connected to database "pg_auto_failover" as user "postgres".
  8. pg_auto_failover=#
  9. pg_auto_failover=# SELECT nodeid, nodename, nodehost, nodeport, goalstate, reportedstate FROM pgautofailover.node;
  10. nodeid | nodename | nodehost | nodeport | goalstate | reportedstate
  11. --------+----------+----------+----------+-----------+---------------
  12.       1 | ubuntu12 | ubuntu12 |     9300 | single    | single
  13. (1 row)
  14. pg_auto_failover=# delete from  pgautofailover.node where nodeid = 1;
  15. DELETE 1
  16. pg_auto_failover=#
  17. pg_auto_failover=#
  18. root@ubuntu11:/usr/local/pg_install_packgae# su - postgres
  19. postgres@ubuntu11:~$
  20. postgres@ubuntu11:~$ source /etc/profile
  21. postgres@ubuntu11:~$
  22. postgres@ubuntu11:~$ pg_autoctl show uri                                                #这样monitor节点恢复成原始状态,解决错误后再重新注册主/从节点
  23.         Type |    Name | Connection String
  24. -------------+---------+-------------------------------
  25.      monitor | monitor | postgres://autoctl_node@Ubuntu11:9300/pg_auto_failover?sslmode=require
  26.    formation | default |
  27. postgres@ubuntu11:~$
复制代码
这一点非常扯淡,pg_autoctl在注册一个节点的时候,先把元数据写入monitor节点的pg_auto_failover数据库,然后(主节点)拉取数据,或者(从节点)备份数据,如果后一个东西失败,节点信息还是回注册到monitor库中,后续就需要手动删除节点信息。 2,清理 pg_autoctl 状态文件
  1. 本地清理 keeper state,再强制注册
  2. 本地 pg_autoctl 会维护一个状态文件(通常在 ~/.local/share/pg_autoctl/.../pg_autoctl.state)。
  3. 如果你只是清理 pgdata,但没有清理这个文件,再次注册会冲突。
  4. 解决:
  5. rm -rf /home/postgres/.local/share/pg_autoctl
  6. 然后重新运行 pg_autoctl create,这样它会认为是一个全新节点。
复制代码
 

4.2 注册主节点

再次注册主节点,这次可以了,注册主节点命令如下
/usr/local/pgsql16/server/bin/pg_autoctl create postgres \
--hostname ubuntu12 \
--name ubuntu12 \
--pgport 9300 \
--auth trust \
--ssl-self-signed \
--monitor 'postgres://autoctl_node@ubuntu11:9300/pg_auto_failover?sslmode=require'
  1. root@ubuntu12:~# su - postgres
  2. postgres@ubuntu12:~$
  3. postgres@ubuntu12:~$
  4. postgres@ubuntu12:~$
  5. postgres@ubuntu12:~$
  6. postgres@ubuntu12:~$ pg_autoctl create postgres --hostname ubuntu12 --name ubuntu12 --auth trust --ssl-self-signed --pgdata /usr/local/pgsql16/pg9300/data/ --pgport 9300 --monitor 'postgres://autoctl_node@Ubuntu11:9300/pg_auto_failover?sslmode=require'
  7. 05:17:27 2094 INFO  Using default --ssl-mode "require"
  8. 05:17:27 2094 INFO  Using --ssl-self-signed: pg_autoctl will create self-signed certificates, allowing for encrypted network traffic
  9. 05:17:27 2094 WARN  Self-signed certificates provide protection against eavesdropping; this setup does NOT protect against Man-In-The-Middle attacks nor Impersonation attacks.
  10. 05:17:27 2094 WARN  See https://www.postgresql.org/docs/current/libpq-ssl.html for details
  11. 05:17:27 2094 INFO  Started pg_autoctl postgres service with pid 2096
  12. 05:17:27 2094 INFO  Started pg_autoctl node-init service with pid 2097
  13. 05:17:27 2096 INFO   /usr/local/pgsql16/server/bin/pg_autoctl do service postgres --pgdata /usr/local/pgsql16/pg9300/data/ -v
  14. 05:17:27 2097 INFO  Registered node 52 "ubuntu12" (ubuntu12:9300) in formation "default", group 0, state "single"
  15. 05:17:27 2097 INFO  Writing keeper state file at "/home/postgres/.local/share/pg_autoctl/usr/local/pgsql16/pg9300/data/pg_autoctl.state"
  16. 05:17:27 2097 INFO  Writing keeper init state file at "/home/postgres/.local/share/pg_autoctl/usr/local/pgsql16/pg9300/data/pg_autoctl.init"
  17. 05:17:27 2097 INFO  Successfully registered as "single" to the monitor.
  18. 05:17:27 2097 INFO  FSM transition from "init" to "single": Start as a single node
  19. 05:17:27 2097 INFO  Initialising postgres as a primary
  20. 05:17:27 2097 INFO  Initialising a PostgreSQL cluster at "/usr/local/pgsql16/pg9300/data"
  21. 05:17:27 2097 INFO  /usr/local/pgsql16/server/bin/pg_ctl initdb -s -D /usr/local/pgsql16/pg9300/data --option '--auth=trust'
  22. 05:17:27 2097 INFO   /usr/bin/openssl req -new -x509 -days 365 -nodes -text -out /usr/local/pgsql16/pg9300/data/server.crt -keyout /usr/local/pgsql16/pg9300/data/server.key -subj "/CN=ubuntu12"
  23. 05:17:28 2122 INFO   /usr/local/pgsql16/server/bin/postgres -D /usr/local/pgsql16/pg9300/data -p 9300 -h *
  24. 05:17:28 2096 INFO  Postgres is now serving PGDATA "/usr/local/pgsql16/pg9300/data" on port 9300 with pid 2122
  25. 05:17:28 2097 INFO  The user "postgres" already exists, skipping.
  26. 05:17:28 2097 INFO  CREATE USER postgres
  27. 05:17:28 2097 INFO  CREATE DATABASE postgres;
  28. 05:17:28 2097 INFO  The database "postgres" already exists, skipping.
  29. 05:17:28 2097 INFO  CREATE EXTENSION pg_stat_statements;
  30. 05:17:28 2097 INFO  Disabling synchronous replication
  31. 05:17:28 2097 INFO  Reloading Postgres configuration and HBA rules
  32. 05:17:28 2097 WARN  Failed to resolve hostname "Ubuntu11" to an IP address that resolves back to the hostname on a reverse DNS lookup.
  33. 05:17:28 2097 WARN  Postgres might deny connection attempts from "Ubuntu11", even with the new HBA rules.
  34. 05:17:28 2097 WARN  Hint: correct setup of HBA with host names requires proper reverse DNS setup. You might want to use IP addresses.
  35. 05:17:28 2097 WARN  Using IP address "192.168.152.121" in HBA file instead of hostname "Ubuntu11"
  36. 05:17:28 2097 INFO  Reloading Postgres configuration and HBA rules
  37. 05:17:28 2097 INFO  Transition complete: current state is now "single"
  38. 05:17:28 2097 INFO  keeper has been successfully initialized.
  39. 05:17:28 2094 WARN  pg_autoctl service node-init exited with exit status 0
  40. 05:17:28 2096 INFO  Postgres controller service received signal SIGTERM, terminating
  41. 05:17:28 2096 INFO  Stopping pg_autoctl postgres service
  42. 05:17:28 2096 INFO  /usr/local/pgsql16/server/bin/pg_ctl --pgdata /usr/local/pgsql16/pg9300/data --wait stop --mode fast
  43. 05:17:28 2094 INFO  Stop pg_autoctl
  44. postgres@ubuntu12:~$
复制代码
初始化主节点systemctl服务,切换到root用户下执行pg_autoctl  show systemd然后依次执行如下命令即可pg_autoctl -q show systemd --pgdata "/usr/local/pgsql16/pg9300/data" | tee /etc/systemd/system/pgautofailover.servicesystemctl daemon-reloadsystemctl enable pgautofailoversystemctl start pgautofailover
  1. postgres@ubuntu12:~$
  2. postgres@ubuntu12:~$ exit
  3. logout
  4. root@ubuntu12:~# pg_autoctl  show systemd
  5. 05:18:06 2160 INFO  HINT: to complete a systemd integration, run the following commands (as root):
  6. 05:18:06 2160 INFO  pg_autoctl -q show systemd --pgdata "/usr/local/pgsql16/pg9300/data" | tee /etc/systemd/system/pgautofailover.service
  7. 05:18:06 2160 INFO  systemctl daemon-reload
  8. 05:18:06 2160 INFO  systemctl enable pgautofailover
  9. 05:18:06 2160 INFO  systemctl start pgautofailover
  10. [Unit]
  11. Description = pg_auto_failover
  12. [Service]
  13. WorkingDirectory = /home/postgres
  14. Environment = 'PGDATA=/usr/local/pgsql16/pg9300/data'
  15. User = postgres
  16. ExecStart = /usr/local/pgsql16/server/bin/pg_autoctl run
  17. Restart = always
  18. StartLimitBurst = 0
  19. ExecReload = /usr/local/pgsql16/server/bin/pg_autoctl reload
  20. [Install]
  21. WantedBy = multi-user.target
  22. root@ubuntu12:~# pg_autoctl -q show systemd --pgdata "/usr/local/pgsql16/pg9300/data" | tee /etc/systemd/system/pgautofailover.service
  23. [Unit]
  24. Description = pg_auto_failover
  25. [Service]
  26. WorkingDirectory = /home/postgres
  27. Environment = 'PGDATA=/usr/local/pgsql16/pg9300/data'
  28. User = postgres
  29. ExecStart = /usr/local/pgsql16/server/bin/pg_autoctl run
  30. Restart = always
  31. StartLimitBurst = 0
  32. ExecReload = /usr/local/pgsql16/server/bin/pg_autoctl reload
  33. [Install]
  34. WantedBy = multi-user.target
  35. root@ubuntu12:~# systemctl daemon-reload
  36. root@ubuntu12:~# systemctl enable pgautofailover
  37. Created symlink /etc/systemd/system/multi-user.target.wants/pgautofailover.service → /etc/systemd/system/pgautofailover.service.
  38. root@ubuntu12:~#  systemctl start pgautofailover
  39. root@ubuntu12:~#
  40. root@ubuntu12:~# systemctl status pgautofailover
  41. ● pgautofailover.service - pg_auto_failover
  42.      Loaded: loaded (/etc/systemd/system/pgautofailover.service; enabled; vendor preset: enabled)
  43.      Active: active (running) since Thu 2025-10-09 05:18:32 UTC; 10s ago
  44.    Main PID: 2250 (pg_autoctl)
  45.       Tasks: 11 (limit: 4550)
  46.      Memory: 26.8M
  47.      CGroup: /system.slice/pgautofailover.service
  48.              ├─2250 /usr/local/pgsql16/server/bin/pg_autoctl run
  49.              ├─2267 pg_autoctl: start/stop postgres
  50.              ├─2268 pg_autoctl: node active
  51.              ├─2278 /usr/local/pgsql16/server/bin/postgres -D /usr/local/pgsql16/pg9300/data -p 9300 -h *
  52.              ├─2279 postgres: logger
  53.              ├─2280 postgres: checkpointer
  54.              ├─2281 postgres: background writer
  55.              ├─2283 postgres: walwriter
  56.              ├─2284 postgres: autovacuum launcher
  57.              ├─2285 postgres: logical replication launcher
  58.              └─2313 postgres: postgres postgres [local] startup
  59. Oct 09 05:18:32 ubuntu12 pg_autoctl[2250]: 05:18:32 2250 INFO  Started pg_autoctl postgres service with pid 2267
  60. Oct 09 05:18:32 ubuntu12 pg_autoctl[2250]: 05:18:32 2250 INFO  Started pg_autoctl node-active service with pid 2268
  61. Oct 09 05:18:32 ubuntu12 pg_autoctl[2268]: 05:18:32 2268 INFO   /usr/local/pgsql16/server/bin/pg_autoctl do service node-active --pgdata /usr/local/pgsql16/pg9300/data -v
  62. Oct 09 05:18:32 ubuntu12 pg_autoctl[2267]: 05:18:32 2267 INFO   /usr/local/pgsql16/server/bin/pg_autoctl do service postgres --pgdata /usr/local/pgsql16/pg9300/data -v
  63. Oct 09 05:18:32 ubuntu12 pg_autoctl[2268]: 05:18:32 2268 INFO  Reloaded the new configuration from "/home/postgres/.config/pg_autoctl/usr/local/pgsql16/pg9300/data/pg_autoctl.cfg"
  64. Oct 09 05:18:32 ubuntu12 pg_autoctl[2268]: 05:18:32 2268 INFO  pg_autoctl service is running, current state is "single"
  65. Oct 09 05:18:32 ubuntu12 pg_autoctl[2278]: 05:18:32 2278 INFO   /usr/local/pgsql16/server/bin/postgres -D /usr/local/pgsql16/pg9300/data -p 9300 -h *
  66. Oct 09 05:18:32 ubuntu12 pg_autoctl[2268]: 05:18:32 2268 WARN  PostgreSQL was not running, restarted with pid 2278
  67. Oct 09 05:18:33 ubuntu12 pg_autoctl[2267]: 05:18:33 2267 INFO  Postgres is now serving PGDATA "/usr/local/pgsql16/pg9300/data" on port 9300 with pid 2278
  68. Oct 09 05:18:33 ubuntu12 pg_autoctl[2268]: 05:18:33 2268 INFO  New state for this node (node 52, "ubuntu12") (ubuntu12:9300): single ➜ single
  69. root@ubuntu12:~#
复制代码
 4.3 monitor 节点查看已注册的主节点

此时回到monitor节点上,1,查看pg_autoctl show uri,可以看到ubuntu12节点已经注册到monitor节点2,pg_autoctl show state,可以看到ubuntu12节点是read-write状态,处于single模式
  1. root@ubuntu11:/usr/local/pg_install_packgae#
  2. root@ubuntu11:/usr/local/pg_install_packgae# pg_autoctl show uri
  3.         Type |    Name | Connection String
  4. -------------+---------+-------------------------------
  5.      monitor | monitor | postgres://autoctl_node@127.0.0.1:9300/pg_auto_failover?sslmode=require
  6.    formation | default | postgres://ubuntu12:9300/postgres?target_session_attrs=read-write&sslmode=require
  7. root@ubuntu11:/usr/local/pg_install_packgae#
  8. root@ubuntu11:/usr/local/pg_install_packgae#
  9. root@ubuntu11:/usr/local/pg_install_packgae#
  10. root@ubuntu11:/usr/local/pg_install_packgae# pg_autoctl show state
  11.     Name |  Node |     Host:Port |       TLI: LSN |   Connection |      Reported State |      Assigned State
  12. ---------+-------+---------------+----------------+--------------+---------------------+--------------------
  13. ubuntu12 |     8 | ubuntu12:9300 |   1: 0/15596F8 |   read-write |              single |              single
  14. root@ubuntu11:/usr/local/pg_install_packgae#
复制代码
 
5,从节点加入pg_auto_failover

5.1 注册从节点Ubuntu13

注册命令:pg_autoctl create postgres --hostname ubuntu13 --name ubuntu13 --auth trust --ssl-self-signed --pgdata /usr/local/pgsql16/pg9300/data/ --pgport 9300 --monitor 'postgres://autoctl_node@Ubuntu11:9300/pg_auto_failover?sslmode=require'因为从节点ubuntu13是之前安装过PostgreSQL实例的,这里直接停掉服务,删掉数据文件
  1. root@ubuntu13:/usr/local/pg_install_package# systemctl stop postgresql9300
  2. root@ubuntu13:/usr/local/pg_install_package# systemctl disable postgresql9300
  3. root@ubuntu13:/usr/local/pg_install_package# rm -rf /usr/local/pgsql16/pg9300
复制代码
注册从节点成功
  1. postgres@ubuntu13:~$ pg_autoctl create postgres --hostname ubuntu13 --name ubuntu13 --auth trust --ssl-self-signed --pgdata /usr/local/pgsql16/pg9300/data/ --pgport 9300 --monitor 'postgres://autoctl_node@Ubuntu11:9300/pg_auto_failover?sslmode=require'
  2. 05:46:07 11100 INFO  Using default --ssl-mode "require"
  3. 05:46:07 11100 INFO  Using --ssl-self-signed: pg_autoctl will create self-signed certificates, allowing for encrypted network traffic
  4. 05:46:07 11100 WARN  Self-signed certificates provide protection against eavesdropping; this setup does NOT protect against Man-In-The-Middle attacks nor Impersonation attacks.
  5. 05:46:07 11100 WARN  See https://www.postgresql.org/docs/current/libpq-ssl.html for details
  6. 05:46:07 11100 INFO  Started pg_autoctl postgres service with pid 11102
  7. 05:46:07 11100 INFO  Started pg_autoctl node-init service with pid 11103
  8. 05:46:07 11102 INFO   /usr/local/pgsql16/server/bin/pg_autoctl do service postgres --pgdata /usr/local/pgsql16/pg9300/data/ -v
  9. 05:46:07 11103 INFO  Registered node 60 "ubuntu13" (ubuntu13:9300) in formation "default", group 0, state "wait_standby"
  10. 05:46:07 11103 INFO  Writing keeper state file at "/home/postgres/.local/share/pg_autoctl/usr/local/pgsql16/pg9300/data/pg_autoctl.state"
  11. 05:46:07 11103 INFO  Writing keeper init state file at "/home/postgres/.local/share/pg_autoctl/usr/local/pgsql16/pg9300/data/pg_autoctl.init"
  12. 05:46:07 11103 INFO  Successfully registered as "wait_standby" to the monitor.
  13. 05:46:07 11103 INFO  FSM transition from "init" to "wait_standby": Start following a primary
  14. 05:46:07 11103 INFO  Transition complete: current state is now "wait_standby"
  15. 05:46:07 11103 INFO  New state for node 52 "ubuntu12" (ubuntu12:9300): single ➜ wait_primary
  16. 05:46:07 11103 INFO  New state for node 52 "ubuntu12" (ubuntu12:9300): wait_primary ➜ wait_primary
  17. 05:46:07 11103 INFO  FSM transition from "wait_standby" to "catchingup": The primary is now ready to accept a standby
  18. 05:46:07 11103 INFO  Initialising PostgreSQL as a hot standby
  19. 05:46:07 11103 INFO   /usr/local/pgsql16/server/bin/pg_basebackup -w -d 'application_name=pgautofailover_standby_60 host=ubuntu12 port=9300 user=pgautofailover_replicator sslmode=require' --pgdata /usr/local/pgsql16/pg9300/backup/node_60 -U pgautofailover_replicator --verbose --progress --max-rate 100M --wal-method=stream --slot pgautofailover_standby_60
  20. 05:46:07 11103 INFO  pg_basebackup:
  21. 05:46:07 11103 INFO
  22. 05:46:07 11103 INFO  initiating base backup, waiting for checkpoint to complete
  23. 05:46:07 11103 INFO  pg_basebackup:
  24. 05:46:07 11103 INFO
  25. 05:46:07 11103 INFO  checkpoint completed
  26. 05:46:07 11103 INFO  pg_basebackup:
  27. 05:46:07 11103 INFO
  28. 05:46:07 11103 INFO  write-ahead log start point: 0/2000028 on timeline 1
  29. 05:46:07 11103 INFO  pg_basebackup:
  30. 05:46:07 11103 INFO
  31. 05:46:07 11103 INFO  starting background WAL receiver
  32. 05:46:07 11103 INFO  22591/22591 kB (100%), 0/1 tablespace (...backup/node_60/global/pg_control)
  33. 05:46:08 11103 INFO  22591/22591 kB (100%), 1/1 tablespace
  34. 05:46:08 11103 INFO  pg_basebackup: write-ahead log end point: 0/2000138
  35. 05:46:08 11103 INFO  pg_basebackup: waiting for background process to finish streaming ...
  36. 05:46:08 11103 INFO  pg_basebackup: syncing data to disk ...
  37. 05:46:08 11103 INFO  pg_basebackup: renaming backup_manifest.tmp to backup_manifest
  38. 05:46:08 11103 INFO  pg_basebackup: base backup completed
  39. 05:46:08 11103 INFO  Creating the standby signal file at "/usr/local/pgsql16/pg9300/data/standby.signal", and replication setup at "/usr/local/pgsql16/pg9300/data/postgresql-auto-failover-standby.conf"
  40. 05:46:08 11103 INFO   /usr/bin/openssl req -new -x509 -days 365 -nodes -text -out /usr/local/pgsql16/pg9300/data/server.crt -keyout /usr/local/pgsql16/pg9300/data/server.key -subj "/CN=ubuntu13"
  41. 05:46:08 11116 INFO   /usr/local/pgsql16/server/bin/postgres -D /usr/local/pgsql16/pg9300/data -p 9300 -h *
  42. 05:46:08 11102 INFO  Postgres is now serving PGDATA "/usr/local/pgsql16/pg9300/data" on port 9300 with pid 11116
  43. 05:46:08 11103 INFO  PostgreSQL started on port 9300
  44. 05:46:08 11103 INFO  Fetched current list of 1 other nodes from the monitor to update HBA rules, including 1 changes.
  45. 05:46:08 11103 INFO  Ensuring HBA rules for node 52 "ubuntu12" (ubuntu12:9300)
  46. 05:46:08 11103 INFO  Adding HBA rule: hostssl replication "pgautofailover_replicator" ubuntu12 trust
  47. 05:46:08 11103 INFO  Adding HBA rule: hostssl "postgres" "pgautofailover_replicator" ubuntu12 trust
  48. 05:46:08 11103 INFO  Writing new HBA rules in "/usr/local/pgsql16/pg9300/data/pg_hba.conf"
  49. 05:46:08 11103 INFO  Reloading Postgres configuration and HBA rules
  50. 05:46:08 11103 INFO  Transition complete: current state is now "catchingup"
  51. 05:46:08 11103 INFO  keeper has been successfully initialized.
  52. 05:46:08 11100 WARN  pg_autoctl service node-init exited with exit status 0
  53. 05:46:08 11102 INFO  Postgres controller service received signal SIGTERM, terminating
  54. 05:46:08 11102 INFO  Stopping pg_autoctl postgres service
  55. 05:46:08 11102 INFO  /usr/local/pgsql16/server/bin/pg_ctl --pgdata /usr/local/pgsql16/pg9300/data --wait stop --mode fast
  56. 05:46:08 11100 INFO  Stop pg_autoctl
  57. postgres@ubuntu13:~$
复制代码
设置systemctl启动服务
  1. root@ubuntu13:/usr/local/pg_install_package#  pg_autoctl  show systemd
  2. 05:48:11 12172 INFO  HINT: to complete a systemd integration, run the following commands (as root):
  3. 05:48:11 12172 INFO  pg_autoctl -q show systemd --pgdata "/usr/local/pgsql16/pg9300/data" | tee /etc/systemd/system/pgautofailover.service
  4. 05:48:11 12172 INFO  systemctl daemon-reload
  5. 05:48:11 12172 INFO  systemctl enable pgautofailover
  6. 05:48:11 12172 INFO  systemctl start pgautofailover
  7. [Unit]
  8. Description = pg_auto_failover
  9. [Service]
  10. WorkingDirectory = /home/postgres
  11. Environment = 'PGDATA=/usr/local/pgsql16/pg9300/data'
  12. User = postgres
  13. ExecStart = /usr/local/pgsql16/server/bin/pg_autoctl run
  14. Restart = always
  15. StartLimitBurst = 0
  16. ExecReload = /usr/local/pgsql16/server/bin/pg_autoctl reload
  17. [Install]
  18. WantedBy = multi-user.target
  19. root@ubuntu13:/usr/local/pg_install_package# pg_autoctl -q show systemd --pgdata "/usr/local/pgsql16/pg9300/data" | tee /etc/systemd/system/pgautofailover.service
  20. [Unit]
  21. Description = pg_auto_failover
  22. [Service]
  23. WorkingDirectory = /home/postgres
  24. Environment = 'PGDATA=/usr/local/pgsql16/pg9300/data'
  25. User = postgres
  26. ExecStart = /usr/local/pgsql16/server/bin/pg_autoctl run
  27. Restart = always
  28. StartLimitBurst = 0
  29. ExecReload = /usr/local/pgsql16/server/bin/pg_autoctl reload
  30. [Install]
  31. WantedBy = multi-user.target
  32. root@ubuntu13:/usr/local/pg_install_package# systemctl daemon-reload
  33. root@ubuntu13:/usr/local/pg_install_package# systemctl enable pgautofailover
  34. root@ubuntu13:/usr/local/pg_install_package# systemctl start pgautofailover
  35. root@ubuntu13:/usr/local/pg_install_package#
  36. root@ubuntu13:/usr/local/pg_install_package# systemctl status pgautofailover
  37. ● pgautofailover.service - pg_auto_failover
  38.      Loaded: loaded (/etc/systemd/system/pgautofailover.service; enabled; vendor preset: enabled)
  39.      Active: active (running) since Thu 2025-10-09 05:48:26 UTC; 5s ago
  40.    Main PID: 12381 (pg_autoctl)
  41.       Tasks: 9 (limit: 4550)
  42.      Memory: 42.0M
  43.      CGroup: /system.slice/pgautofailover.service
  44.              ├─12381 /usr/local/pgsql16/server/bin/pg_autoctl run
  45.              ├─12391 pg_autoctl: start/stop postgres
  46.              ├─12392 pg_autoctl: node active
  47.              ├─12402 /usr/local/pgsql16/server/bin/postgres -D /usr/local/pgsql16/pg9300/data -p 9300 -h *
  48.              ├─12403 postgres: logger
  49.              ├─12404 postgres: checkpointer
  50.              ├─12405 postgres: background writer
  51.              ├─12406 postgres: startup recovering 000000010000000000000003
  52.              └─12407 postgres: walreceiver streaming 0/3000110
  53. Oct 09 05:48:27 ubuntu13 pg_autoctl[12391]: 05:48:27 12391 INFO  Postgres is now serving PGDATA "/usr/local/pgsql16/pg9300/data" on port 9300 with pid 12402
  54. Oct 09 05:48:27 ubuntu13 pg_autoctl[12392]: 05:48:27 12392 WARN  PostgreSQL was not running, restarted with pid 12402
  55. Oct 09 05:48:28 ubuntu13 pg_autoctl[12392]: 05:48:28 12392 INFO  Updated the keeper's state from the local PostgreSQL instance, which is running
  56. Oct 09 05:48:28 ubuntu13 pg_autoctl[12392]: 05:48:28 12392 INFO  pg_autoctl managed to ensure current state "catchingup": PostgreSQL is running
  57. Oct 09 05:48:29 ubuntu13 pg_autoctl[12392]: 05:48:29 12392 INFO  Monitor assigned new state "secondary"
  58. Oct 09 05:48:29 ubuntu13 pg_autoctl[12392]: 05:48:29 12392 INFO  FSM transition from "catchingup" to "secondary": Convinced the monitor that I'm up and running, and eligible for promotion again
  59. Oct 09 05:48:29 ubuntu13 pg_autoctl[12392]: 05:48:29 12392 INFO  Reached timeline 1, same as upstream node 52 "ubuntu12" (ubuntu12:9300)
  60. Oct 09 05:48:29 ubuntu13 pg_autoctl[12392]: 05:48:29 12392 INFO  Creating replication slot "pgautofailover_standby_52"
  61. Oct 09 05:48:29 ubuntu13 pg_autoctl[12392]: 05:48:29 12392 INFO  Transition complete: current state is now "secondary"
  62. Oct 09 05:48:29 ubuntu13 pg_autoctl[12392]: 05:48:29 12392 INFO  New state for node 52 "ubuntu12" (ubuntu12:9300): primary ➜ primary
  63. root@ubuntu13:/usr/local/pg_install_package#
复制代码
 5.2 monitor 节点查看已注册的从节点

再次回到monitor节点查看,可看到ubuntu11作为monitor节点,ubuntu12作为主节点,ubuntu13作为从节点
  1. root@ubuntu11:~#
  2. root@ubuntu11:~# pg_autoctl show uri;
  3.         Type |    Name | Connection String
  4. -------------+---------+-------------------------------
  5.      monitor | monitor | postgres://autoctl_node@ubuntu11:9300/pg_auto_failover?sslmode=require
  6.    formation | default | postgres://ubuntu12:9300,ubuntu13:9300/postgres?target_session_attrs=read-write&sslmode=require
  7. root@ubuntu11:~#
  8. root@ubuntu11:~#
  9. root@ubuntu11:~# pg_autoctl show state
  10.     Name |  Node |     Host:Port |       TLI: LSN |   Connection |      Reported State |      Assigned State
  11. ---------+-------+---------------+----------------+--------------+---------------------+--------------------
  12. ubuntu12 |    52 | ubuntu12:9300 |   1: 0/3000148 |   read-write |             primary |             primary
  13. ubuntu13 |    60 | ubuntu13:9300 |   1: 0/3000148 |    read-only |           secondary |           secondary
  14. root@ubuntu11:~#
  15. root@ubuntu11:~#
复制代码
 
6,pg_auto_failover外部访问配置

1,修改postgres用户密码

登录主节点修改postgres用户密码
  1. ALTER USER postgres WITH PASSWORD 'a-strong-password';
复制代码
2,修改hba.conf

1,修改从节点ubuntu13的hba.conf,添加一下访问规则
hostssl postgres all 192.168.0.0/16 md5
2,重启主节点ubuntu12,此时故障转移,ubuntu13从节点提升为主节点,同时ubuntu13作为新的主节点,配置规则会覆盖从节点ubuntu12
pg_auto_failover的hba.conf覆盖规则是“新的主节点覆盖旧的主节点”,这一点有点绕,有兴趣的自己测试验证


然后从客户端连接至pg_auto_failover集群的主节点,查看复制状态,其实跟手动搭建的流复制就一样了,只不过是pg_auto_failover把整个postgresql集群的搭建过程给屏蔽掉了
  1. SELECT * FROM pg_replication_slots;
  2. slot_name                |plugin|slot_type|datoid|database|temporary|active|active_pid|xmin|catalog_xmin|restart_lsn|confirmed_flush_lsn|wal_status|safe_wal_size|two_phase|conflicting|
  3. -------------------------+------+---------+------+--------+---------+------+----------+----+------------+-----------+-------------------+----------+-------------+---------+-----------+
  4. pgautofailover_standby_60|      |physical |      |        |false    |true  |     11783|747 |            |0/5020868  |                   |reserved  |             |false    |           |
  5. select * from pg_stat_replication;
  6. pid  |usesysid|usename                  |application_name         |client_addr    |client_hostname|client_port|backend_start                |backend_xmin|state    |sent_lsn |write_lsn|flush_lsn|replay_lsn|write_lag      |flush_lag      |replay_lag     |sync_priority|sync_state|reply_time                   |
  7. -----+--------+-------------------------+-------------------------+---------------+---------------+-----------+-----------------------------+------------+---------+---------+---------+---------+----------+---------------+---------------+---------------+-------------+----------+-----------------------------+
  8. 11783|   16416|pgautofailover_replicator|pgautofailover_standby_60|192.168.152.123|ubuntu13       |      60426|2025-10-09 13:23:00.095 +0800|            |streaming|0/5020868|0/5020868|0/5020868|0/5020868 |00:00:00.001064|00:00:00.001766|00:00:00.001773|            1|quorum    |2025-10-09 13:51:42.416 +0800|
复制代码
可以看到,pg_auto_failover在一主一从的模式下,是同步复制
  1. select * from pg_settings where name like '%synchronous_commit%';
  2. name              |setting|unit|category                  |short_desc                                           |extra_desc|context|vartype|source            |min_val|max_val|enumvals                                |boot_val|reset_val|sourcefile                                                  |sourceline|pending_restart|
  3. ------------------+-------+----+--------------------------+-----------------------------------------------------+----------+-------+-------+------------------+-------+-------+----------------------------------------+--------+---------+------------------------------------------------------------+----------+---------------+
  4. synchronous_commit|on     |    |Write-Ahead Log / Settings|Sets the current transaction''s synchronization level.|          |user   |enum   |configuration file|       |       |{local,remote_write,remote_apply,on,off}|on      |on       |/usr/local/pgsql16/pg9300/data/postgresql-auto-failover.conf|        12|false          |
  5. select * from pg_settings where name like '%synchronous_standby_names%' ;
  6. name                     |setting                          |unit|category                    |short_desc                                                                     |extra_desc|context|vartype|source            |min_val|max_val|enumvals|boot_val|reset_val                        |sourcefile                                         |sourceline|pending_restart|
  7. -------------------------+---------------------------------+----+----------------------------+-------------------------------------------------------------------------------+----------+-------+-------+------------------+-------+-------+--------+--------+---------------------------------+---------------------------------------------------+----------+---------------+
  8. synchronous_standby_names|ANY 1 (pgautofailover_standby_60)|    |Replication / Primary Server|Number of synchronous standbys and list of names of potential synchronous ones.|          |sighup |string |configuration file|       |       |NULL    |        |ANY 1 (pgautofailover_standby_60)|/usr/local/pgsql16/pg9300/data/postgresql.auto.conf|         4|false          |
复制代码
 
 

来源:程序园用户自行投稿发布,如果侵权,请联系站长删除
免责声明:如果侵犯了您的权益,请联系站长,我们会及时删除侵权内容,谢谢合作!

相关推荐

您需要登录后才可以回帖 登录 | 立即注册