The ds state of the DRBD primary server changed to Diskless (caused by a disk I/O error)
Assuming the 136 network segment is equivalent to the 60 network segment, then:
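I. Before unplugging the network cable of the 60 segment:
The load on the primary server drbd1 was very high; %wa in top reached about 60. cat /proc/drbd showed that the disk state (ds) of this server had become Diskless (on-io-error detach; # policy: the node on which the I/O error occurs drops its backing device and keeps working in diskless mode).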
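The check described above (reading /proc/drbd and looking at the ds field) can be sketched in Python. This is a minimal sketch that assumes the DRBD 8.x /proc/drbd layout, where each resource has a status line of the form "N: cs:... ro:local/peer ds:local/peer". The function names and the sample text are hypothetical and were not captured from the affected server.

```python
import re

# Matches the per-device status line of DRBD 8.x /proc/drbd output, e.g.
# " 0: cs:Connected ro:Primary/Secondary ds:Diskless/UpToDate C r-----"
STATUS_RE = re.compile(
    r"^\s*(?P<minor>\d+):\s+cs:(?P<cs>\S+)\s+ro:(?P<ro>\S+)\s+ds:(?P<ds>\S+)"
)


def parse_proc_drbd(text):
    """Return {minor: state dict} for every status line found in text."""
    result = {}
    for line in text.splitlines():
        match = STATUS_RE.match(line)
        if match is None:
            continue  # version banner, counters line, etc.
        local_role, _, peer_role = match.group("ro").partition("/")
        local_disk, _, peer_disk = match.group("ds").partition("/")
        result[int(match.group("minor"))] = {
            "cs": match.group("cs"),
            "local_role": local_role,
            "peer_role": peer_role,
            "local_disk": local_disk,
            "peer_disk": peer_disk,
        }
    return result


def diskless_minors(status):
    """Minor numbers whose local backing device has been detached."""
    return [m for m, s in sorted(status.items()) if s["local_disk"] == "Diskless"]


# Illustrative sample only (hypothetical, written by hand for this sketch).
SAMPLE = """\
 0: cs:Connected ro:Primary/Secondary ds:Diskless/UpToDate C r-----
"""

if __name__ == "__main__":
    print(diskless_minors(parse_proc_drbd(SAMPLE)))
```

On a real node the same functions could be fed the contents of /proc/drbd instead of SAMPLE. A non-empty result means the node is serving I/O only through its peer.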
Checking the messages log showed that the disk state had already changed to Diskless on the 15th, with an I/O error reported.
Jun 15 08:56:06 drbd1 kernel: block drbd0: local WRITE IO error sector 1600487576+8 on dm-2
Jun 15 08:56:06 drbd1 kernel: block drbd0: disk( UpToDate -> Failed )
Jun 15 08:56:06 drbd1 kernel: block drbd0: Local IO failed in __req_mod. Detaching...
Jun 15 08:56:06 drbd1 kernel: block drbd0: receiver updated UUIDs to effective data uuid: C6B3D27C4098E93E
Jun 15 08:56:06 drbd1 kernel: block drbd0: 0 KB (0 bits) marked out-of-sync by on disk bit-map.
Jun 15 08:56:06 drbd1 kernel: block drbd0: disk( Failed -> Diskless )
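The state changes in this log follow the pattern "field( Old -> New )". As a rough sketch, they can be pulled out with a regular expression. The function name is hypothetical, and the sample lines are copied from the log excerpt above.

```python
import re

# DRBD kernel messages report state changes as "field( Old -> New )",
# for example "disk( UpToDate -> Failed )".
TRANSITION_RE = re.compile(r"(\w+)\( (\w+) -> (\w+) \)")


def state_transitions(log_text):
    """Return a list of (field, old_state, new_state) tuples in log order."""
    return TRANSITION_RE.findall(log_text)


LOG = """\
Jun 15 08:56:06 drbd1 kernel: block drbd0: disk( UpToDate -> Failed )
Jun 15 08:56:06 drbd1 kernel: block drbd0: Local IO failed in __req_mod. Detaching...
Jun 15 08:56:06 drbd1 kernel: block drbd0: disk( Failed -> Diskless )
"""

if __name__ == "__main__":
    for field, old, new in state_transitions(LOG):
        print(f"{field}: {old} -> {new}")
```

A line that reports several changes at once yields one tuple per field, which makes it easy to reconstruct the sequence that led to Diskless.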
Note: running drbdadm detach all simulates the error above. The ds state becomes Diskless, but the DRBD partition can still be mounted and accessed.
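Authoritative explanation (from the DRBD book and the DRBD Chinese user guide): if the backing disk device that a node provides to DRBD fails, DRBD can either pass the I/O error on to the upper layer (usually the file system) or mask the I/O error from the upper layer.
Passing on I/O errors: if DRBD is configured to pass on I/O errors, any error on the lower-level device is passed transparently to the upper I/O layers, which are then left to deal with it (this may cause the file system to be remounted read-only). This strategy does not ensure service continuity and is not recommended for most users.
Masking I/O errors: if DRBD is configured to detach on lower-level I/O errors, DRBD detaches from the backing device when the error occurs. The I/O error is masked from the upper layers, and DRBD transparently fetches the affected blocks from the peer node over the network. From then on DRBD is said to run in diskless mode: it still handles all I/O operations, but reads and writes actually happen on the peer, not locally. Performance suffers in diskless mode, but the service keeps running without interruption and can be moved to the peer node at a convenient time. (This is somewhat similar to software RAID1: when one mirrored disk fails, the application keeps running and there is a chance to recover.)
For information on configuring the I/O error handling policy, see "Configuring I/O error handling strategies".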
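The two strategies can be summarized as a small decision model. This is a deliberately simplified sketch and not DRBD's implementation. The function io_outcome and its return strings are hypothetical. Only the policy names "pass_on" and "detach" correspond to on-io-error settings, and other settings are left out.

```python
def io_outcome(policy, local_disk_ok, peer_connected):
    """Describe where a request on the primary ends up (simplified model)."""
    if local_disk_ok:
        return "served locally"
    if policy == "pass_on":
        # The error is handed to the upper layer, typically the file system.
        return "error passed to upper layer"
    if policy == "detach":
        # Diskless mode: all reads and writes go to the peer over the network.
        if peer_connected:
            return "served by peer (diskless mode)"
        # No local disk and no peer: nothing is left to serve the request.
        return "I/O error: neither local nor remote data"
    raise ValueError(f"policy not covered by this sketch: {policy}")


if __name__ == "__main__":
    for connected in (True, False):
        print(connected, "->", io_outcome("detach", False, connected))
```

The last branch is the reason a diskless primary depends entirely on its replication link: once that link is gone, masking is no longer possible and errors reach the file system anyway.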
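II. After unplugging the network cable of the 60 segment:
1. After the cable of the 60 segment (DRBD replication plus heartbeat link) was unplugged, HA did not fail over (checked in /var/log/ha-debug), i.e. heartbeat was still healthy.
However, the DRBD partition was inexplicably no longer mounted at this point. ls on the mount directory seemed to print an error, but I did not look at it closely (the floating IP address was still present). Under normal circumstances unplugging this cable does not break the DRBD mount (tested on virtual machines).
2. The messages log then reported a large number of I/O errors (which raises a question: why were there so many I/O errors after the cable was unplugged, when before that only a single I/O error had been reported and ds had then become Diskless):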
Jun 19 10:28:04 drbd1 kernel: e1000: ens34 NIC Link is Down
Jun 19 10:28:10 drbd1 NetworkManager[806]: <info> [1529375290.6437] device (ens34): state change: activated -> unavailable (reason 'carrier-changed', sys-iface-state: 'managed')
Jun 19 10:28:10 drbd1 dbus[801]: [system] Activating via systemd: service name='org.freedesktop.nm_dispatcher' unit='dbus-org.freedesktop.nm-dispatcher.service'
Jun 19 10:28:10 drbd1 systemd: Starting Network Manager Script Dispatcher Service...
Jun 19 10:28:10 drbd1 dbus[801]: [system] Successfully activated service 'org.freedesktop.nm_dispatcher'
Jun 19 10:28:10 drbd1 systemd: Started Network Manager Script Dispatcher Service.
Jun 19 10:28:10 drbd1 nm-dispatcher: req:1 'down' [ens34]: new request (3 scripts)
Jun 19 10:28:10 drbd1 nm-dispatcher: req:1 'down' [ens34]: start running ordered scripts...
Jun 19 10:28:23 drbd1 kernel: drbd r0: PingAck did not arrive in time.
Jun 19 10:28:23 drbd1 kernel: drbd r0: peer( Secondary -> Unknown ) conn( Connected -> NetworkFailure ) pdsk( UpToDate -> DUnknown )
Jun 19 10:28:23 drbd1 kernel: drbd r0: ack_receiver terminated
Jun 19 10:28:23 drbd1 kernel: drbd r0: Terminating drbd_a_r0
Jun 19 10:28:23 drbd1 kernel: drbd r0: error receiving DataReply, e: -5 l: 212992!
Jun 19 10:28:23 drbd1 kernel: block drbd0: helper command: /sbin/drbdadm pri-on-incon-degr minor-0
Jun 19 10:28:24 drbd1 kernel: block drbd0: helper command: /sbin/drbdadm pri-on-incon-degr minor-0 exit code 0 (0x0)
Jun 19 10:28:24 drbd1 kernel: EXT4-fs warning (device drbd0): ext4_end_bio:316: I/O error -5 writing to inode 50607824 (offset 0 size 0 starting block 150058741)
Jun 19 10:28:24 drbd1 kernel: Buffer I/O error on device drbd0, logical block 150058741
Jun 19 10:28:24 drbd1 kernel: EXT4-fs warning (device drbd0): ext4_end_bio:316: I/O error -5 writing to inode 50607811 (offset 1843200 size 0 starting block 209230140)
Jun 19 10:28:24 drbd1 kernel: Buffer I/O error on device drbd0, logical block 209230140
Jun 19 10:28:24 drbd1 kernel: Aborting journal on device drbd0-8.
Jun 19 10:28:24 drbd1 kernel: EXT4-fs warning (device drbd0): ext4_end_bio:316: I/O error -5 writing to inode 50607811 (offset 1843200 size 0 starting block 147366649)
Jun 19 10:28:24 drbd1 kernel: Buffer I/O error on device drbd0, logical block 147366649
Jun 19 10:28:24 drbd1 kernel: EXT4-fs warning (device drbd0): ext4_end_bio:316: I/O error -5 writing to inode 50607811 (offset 1843200 size 0 starting block 147366654)
Jun 19 10:28:24 drbd1 kernel: Buffer I/O error on device drbd0, logical block 147366654
Jun 19 10:28:24 drbd1 kernel: EXT4-fs warning (device drbd0): ext4_end_bio:316: I/O error -5 writing to inode 50607811 (offset 1843200 size 0 starting block 147366855)
Jun 19 10:28:24 drbd1 kernel: Buffer I/O error on device drbd0, logical block 147366855
Jun 19 10:28:24 drbd1 kernel: EXT4-fs warning (device drbd0): ext4_end_bio:316: I/O error -5 writing to inode 50607811 (offset 1843200 size 0 starting block 147366869)
Jun 19 10:28:24 drbd1 kernel: Buffer I/O error on device drbd0, logical block 147366869
Jun 19 10:28:24 drbd1 kernel: EXT4-fs warning (device drbd0): ext4_end_bio:316: I/O error -5 writing to inode 50607811 (offset 1843200 size 0 starting block 146704209)
Jun 19 10:28:24 drbd1 kernel: Buffer I/O error on device drbd0, logical block 146704209
Jun 19 10:28:24 drbd1 kernel: EXT4-fs warning (device drbd0): ext4_end_bio:316: I/O error -5 writing to inode 50607811 (offset 1843200 size 0 starting block 146704214)
Jun 19 10:28:24 drbd1 kernel: Buffer I/O error on device drbd0, logical block 146704214
Jun 19 10:28:24 drbd1 kernel: EXT4-fs warning (device drbd0): ext4_end_bio:316: I/O error -5 writing to inode 50607811 (offset 1843200 size 0 starting block 146704227)
Jun 19 10:28:24 drbd1 kernel: Buffer I/O error on device drbd0, logical block 146704227
Jun 19 10:28:24 drbd1 kernel: Buffer I/O error on device drbd0, logical block 146704228
Jun 19 10:28:24 drbd1 kernel: EXT4-fs warning (device drbd0): ext4_end_bio:316: I/O error -5 writing to inode 50607811 (offset 1843200 size 0 starting block 146704233)
Jun 19 10:28:24 drbd1 kernel: drbd r0: Connection closed
Jun 19 10:28:24 drbd1 kernel: block drbd0: IO ERROR: neither local nor remote data, sector 427170072+1024
Jun 19 10:28:24 drbd1 kernel: block drbd0: IO ERROR: neither local nor remote data, sector 1011089408+8
Jun 19 10:28:24 drbd1 kernel: Buffer I/O error on dev drbd0, logical block 126386176, lost sync page write
Jun 19 10:28:24 drbd1 kernel: drbd r0: conn( NetworkFailure -> Unconnected )
Jun 19 10:28:24 drbd1 kernel: drbd r0: receiver terminated
Jun 19 10:28:24 drbd1 kernel: drbd r0: Restarting receiver thread
Jun 19 10:28:24 drbd1 kernel: block drbd0: IO ERROR: neither local nor remote data, sector 0+8
Jun 19 10:28:24 drbd1 kernel: drbd r0: receiver (re)started
Jun 19 10:28:24 drbd1 kernel: block drbd0: IO ERROR: neither local nor remote data, sector 427170488+8
Jun 19 10:28:24 drbd1 kernel: block drbd0: IO ERROR: neither local nor remote data, sector 473052608+1024
Jun 19 10:28:24 drbd1 kernel: drbd r0: conn( Unconnected -> WFConnection )
Jun 19 10:28:24 drbd1 kernel: Buffer I/O error on dev drbd0, logical block 0, lost sync page write
Jun 19 10:28:24 drbd1 kernel: EXT4-fs error (device drbd0): ext4_journal_check_start:56: Detected aborted journal
Jun 19 10:28:24 drbd1 kernel: EXT4-fs (drbd0): Remounting filesystem read-only
Jun 19 10:28:24 drbd1 kernel: EXT4-fs (drbd0): previous I/O error to superblock detected
Jun 19 10:28:24 drbd1 kernel: Buffer I/O error on dev drbd0, logical block 0, lost sync page write
Jun 19 10:28:24 drbd1 kernel: drbd r0: bind before listen failed, err = -99
Jun 19 10:28:24 drbd1 kernel: drbd r0: create_listen_socket failed, err = -5
Jun 19 10:28:24 drbd1 kernel: drbd r0: conn( WFConnection -> Disconnecting )
Jun 19 10:28:24 drbd1 kernel: drbd r0: Connection closed
Jun 19 10:28:24 drbd1 kernel: drbd r0: conn( Disconnecting -> StandAlone )
Jun 19 10:28:24 drbd1 kernel: drbd r0: Not fencing peer, I'm not even Consistent myself.
Jun 19 10:28:24 drbd1 kernel: EXT4-fs (drbd0): I/O error while writing superblock
Jun 19 10:28:24 drbd1 kernel: EXT4-fs error (device drbd0): ext4_journal_check_start:56: Detected aborted journal
Jun 19 10:28:24 drbd1 kernel: drbd r0: Not fencing peer, I'm not even Consistent myself.
Jun 19 10:28:24 drbd1 kernel: EXT4-fs error (device drbd0) in ext4_reserve_inode_write:5173: Journal has aborted
Jun 19 10:28:24 drbd1 kernel: Buffer I/O error on dev drbd0, logical block 0, lost sync page write
Jun 19 10:28:24 drbd1 kernel: EXT4-fs error (device drbd0) in ext4_dirty_inode:5290: Journal has aborted
Jun 19 10:28:24 drbd1 kernel: EXT4-fs (drbd0): previous I/O error to superblock detected
Jun 19 10:28:24 drbd1 kernel: Buffer I/O error on dev drbd0, logical block 0, lost sync page write
Jun 19 10:28:24 drbd1 kernel: EXT4-fs error (device drbd0) in ext4_da_write_begin:2718: IO failure
Jun 19 10:28:24 drbd1 kernel: EXT4-fs (drbd0): previous I/O error to superblock detected
Jun 19 10:28:24 drbd1 kernel: Buffer I/O error on dev drbd0, logical block 0, lost sync page write
Jun 19 10:28:24 drbd1 kernel: JBD2: Error -5 detected when updating journal superblock for drbd0-8.
Jun 19 10:28:24 drbd1 kernel: JBD2: Detected IO errors while flushing file data on drbd0-8
Jun 19 10:28:24 drbd1 kernel: EXT4-fs error (device drbd0): __ext4_get_inode_loc:4180: inode #50629208: block 202377413: comm mysqld: unable to read itable block
Jun 19 10:28:24 drbd1 kernel: EXT4-fs (drbd0): previous I/O error to superblock detected
Jun 19 10:28:24 drbd1 kernel: Buffer I/O error on dev drbd0, logical block 0, lost sync page write
Jun 19 10:28:24 drbd1 kernel: EXT4-fs error (device drbd0): __ext4_get_inode_loc:4180: inode #50629208: block 202377413: comm mysqld: unable to read itable block
Jun 19 10:28:24 drbd1 kernel: EXT4-fs (drbd0): previous I/O error to superblock detected
Jun 19 10:28:24 drbd1 kernel: Buffer I/O error on dev drbd0, logical block 0, lost sync page write
Jun 19 10:28:24 drbd1 kernel: EXT4-fs error (device drbd0): ext4_find_entry:1312: inode #50331649: comm mysqld: reading directory lblock 0
Jun 19 10:28:24 drbd1 kernel: EXT4-fs (drbd0): previous I/O error to superblock detected
Jun 19 10:28:24 drbd1 kernel: EXT4-fs error (device drbd0): ext4_find_entry:1312: inode #52428801: comm postgres: reading directory lblock 0
Jun 19 10:28:24 drbd1 kernel: EXT4-fs (drbd0): previous I/O error to superblock detected
Jun 19 10:28:24 drbd1 kernel: Buffer I/O error on dev drbd0, logical block 0, lost sync page write
Jun 19 10:28:24 drbd1 kernel: Buffer I/O error on dev drbd0, logical block 0, lost sync page write
Jun 19 10:28:24 drbd1 kernel: EXT4-fs error (device drbd0): __ext4_get_inode_loc:4180: inode #50629208: block 202377413: comm mysqld: unable to read itable block
Jun 19 10:28:24 drbd1 kernel: EXT4-fs (drbd0): previous I/O error to superblock detected
Jun 19 10:28:24 drbd1 kernel: EXT4-fs error (device drbd0): ext4_wait_block_bitmap:497: comm postgres: Cannot read block bitmap - block_group = 1251, block_bitmap = 40894467
Jun 19 10:28:24 drbd1 kernel: EXT4-fs error (device drbd0): ext4_discard_preallocations:4006: comm postgres: Error loading buddy information for 1251
Jun 19 10:28:24 drbd1 systemd: postgresql-9.5.service: main process exited, code=killed, status=6/ABRT
Jun 19 10:28:24 drbd1 systemd: mongod.service: main process exited, code=exited, status=14/n/a
Jun 19 10:28:24 drbd1 pg_ctl: pg_ctl: directory "/store/pgsql" is not a database cluster directory
Jun 19 10:28:24 drbd1 systemd: postgresql-9.5.service: control process exited, code=exited status=1
Jun 19 10:28:24 drbd1 systemd: Unit postgresql-9.5.service entered failed state.
Jun 19 10:28:24 drbd1 systemd: postgresql-9.5.service failed.
Jun 19 10:28:24 drbd1 mongod: Stopping mongod: [FAILED]
Jun 19 10:28:24 drbd1 systemd: Unit mongod.service entered failed state.
Jun 19 10:28:24 drbd1 systemd: mongod.service failed.
Jun 19 10:28:25 drbd1 kernel: drbd r0: State change failed: Need a connection to start verify or resync
Jun 19 10:28:25 drbd1 kernel: drbd r0: mask = 0x1f0 val = 0x80
Jun 19 10:28:25 drbd1 kernel: drbd r0: old_conn:StandAlone wanted_conn:WFConnection
Jun 19 10:28:25 drbd1 kernel: drbd r0: receiver terminated
Jun 19 10:28:25 drbd1 kernel: drbd r0: Terminating drbd_r_r0
Jun 19 10:28:29 drbd1 kernel: block drbd0: 729 messages suppressed in /builddir/build/BUILD/drbd-8.4.11-1/drbd/drbd_req.c:1446.
Jun 19 10:28:29 drbd1 kernel: block drbd0: IO ERROR: neither local nor remote data, sector 1677722552+8
Jun 19 10:28:29 drbd1 kernel: buffer_io_error: 177 callbacks suppressed
Jun 19 10:28:29 drbd1 kernel: Buffer I/O error on dev drbd0, logical block 209715319, lost async page write
Jun 19 10:28:29 drbd1 kernel: Buffer I/O error on dev drbd0, logical block 209715482, lost async page write
Jun 19 10:28:29 drbd1 kernel: Buffer I/O error on dev drbd0, logical block 209715619, lost async page write
Jun 19 10:28:29 drbd1 kernel: Buffer I/O error on dev drbd0, logical block 209715647, lost async page write
Jun 19 10:28:29 drbd1 kernel: Buffer I/O error on dev drbd0, logical block 209715726, lost async page write
Jun 19 10:28:29 drbd1 kernel: Buffer I/O error on dev drbd0, logical block 209715801, lost async page write
Jun 19 10:28:29 drbd1 kernel: Buffer I/O error on dev drbd0, logical block 209715857, lost async page write
Jun 19 10:28:29 drbd1 kernel: Buffer I/O error on dev drbd0, logical block 209715879, lost async page write
Jun 19 10:28:29 drbd1 kernel: Buffer I/O error on dev drbd0, logical block 209715885, lost async page write
Jun 19 10:28:29 drbd1 kernel: Buffer I/O error on dev drbd0, logical block 209715898, lost async page write
Jun 19 10:28:33 drbd1 ipfail: [8426]: info: Link Status update: Link drbd2.db.com/ens34 now has status dead
Jun 19 10:28:33 drbd1 ipfail: [8426]: info: Asking other side for ping node count.
Jun 19 10:28:33 drbd1 ipfail: [8426]: info: Checking remote count of ping nodes.
Jun 19 10:28:35 drbd1 ipfail: [8426]: info: Ping node count is balanced.
Jun 19 10:28:35 drbd1 ipfail: [8426]: info: No giveup timer to abort.
Jun 19 10:30:02 drbd1 systemd-logind: Removed session 6994.
Jun 19 10:30:09 drbd1 kernel: block drbd0: 41 messages suppressed in /builddir/build/BUILD/drbd-8.4.11-1/drbd/drbd_req.c:1446.
Jun 19 10:30:09 drbd1 kernel: block drbd0: IO ERROR: neither local nor remote data, sector 74960+8
Jun 19 10:30:09 drbd1 kernel: EXT4-fs warning: 117 callbacks suppressed
Jun 19 10:30:09 drbd1 kernel: EXT4-fs warning (device drbd0): __ext4_read_dirblock:902: error reading directory block (ino 2, block 0)
Jun 19 10:30:32 drbd1 kernel: block drbd0: IO ERROR: neither local nor remote data, sector 74960+8
Jun 19 10:30:32 drbd1 kernel: EXT4-fs warning (device drbd0): __ext4_read_dirblock:902: error reading directory block (ino 2, block 0)
Jun 19 10:31:04 drbd1 kernel: block drbd0: IO ERROR: neither local nor remote data, sector 74960+8
Jun 19 10:31:04 drbd1 kernel: EXT4-fs warning (device drbd0): __ext4_read_dirblock:902: error reading directory block (ino 2, block 0)
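To get an overview of a log like this, the error lines can be counted by type. The sketch below assumes one log entry per line. The category names and function names are hypothetical, and the sample entries are copied from the log above.

```python
from collections import Counter

# (category name, substring that identifies it); the first match wins.
CATEGORIES = [
    ("drbd_no_data", "IO ERROR: neither local nor remote data"),
    ("buffer_io_error", "Buffer I/O error"),
    ("ext4_error", "EXT4-fs error (device"),
    ("ext4_warning", "EXT4-fs warning (device"),
]


def classify(line):
    """Return the category of a log line, or None if it is not an error line."""
    for name, needle in CATEGORIES:
        if needle in line:
            return name
    return None


def summarize(log_text):
    """Count error lines per category, assuming one log entry per line."""
    counts = Counter()
    for line in log_text.splitlines():
        name = classify(line)
        if name is not None:
            counts[name] += 1
    return counts


LOG = """\
Jun 19 10:28:24 drbd1 kernel: Buffer I/O error on dev drbd0, logical block 0, lost sync page write
Jun 19 10:30:09 drbd1 kernel: block drbd0: IO ERROR: neither local nor remote data, sector 74960+8
Jun 19 10:30:09 drbd1 kernel: EXT4-fs warning (device drbd0): __ext4_read_dirblock:902: error reading directory block (ino 2, block 0)
Jun 19 10:30:32 drbd1 kernel: block drbd0: IO ERROR: neither local nor remote data, sector 74960+8
"""

if __name__ == "__main__":
    for name, count in summarize(LOG).most_common():
        print(name, count)
```

Because the kernel rate-limits these messages (for example "729 messages suppressed"), such counts are only a lower bound on the number of failed requests.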
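3. Because drbd1 was in diskless mode and all of its I/O was being served by the peer's DRBD disk, the data became unreachable as soon as the cable was unplugged, which affected the business. At this point the heartbeat service was stopped so that the resources would switch over to the other node, drbd2.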
Jun 19 10:31:34 drbd1 systemd: Stopping Heartbeat High Availability Cluster Communication and Membership... Jun 19 10:31:35 drbd1 systemd: Starting MariaDB database server... Jun 19 10:31:35 drbd1 kernel: block drbd0: IO ERROR: neither local nor remote data, sector 1610678528+8 Jun 19 10:31:35 drbd1 kernel: EXT4-fs warning (device drbd0): __ext4_read_dirblock:902: error reading directory block (ino 50331649, block 0) Jun 19 10:31:35 drbd1 kernel: block drbd0: IO ERROR: neither local nor remote data, sector 1610678528+8 Jun 19 10:31:35 drbd1 kernel: EXT4-fs error: 174 callbacks suppressed Jun 19 10:31:35 drbd1 kernel: EXT4-fs error (device drbd0): ext4_find_entry:1312: inode #50331649: comm mariadb-prepare: reading directory lblock 0 Jun 19 10:31:35 drbd1 kernel: EXT4-fs: 176 callbacks suppressed Jun 19 10:31:35 drbd1 kernel: EXT4-fs (drbd0): previous I/O error to superblock detected Jun 19 10:31:35 drbd1 kernel: block drbd0: IO ERROR: neither local nor remote data, sector 0+8 Jun 19 10:31:35 drbd1 kernel: buffer_io_error: 32 callbacks suppressed Jun 19 10:31:35 drbd1 kernel: Buffer I/O error on dev drbd0, logical block 0, lost sync page write Jun 19 10:31:35 drbd1 mariadb-prepare-db-dir: Database MariaDB is not initialized, but the directory /store/mysql is not empty, so initialization cannot be done. Jun 19 10:31:35 drbd1 mariadb-prepare-db-dir: Make sure the /store/mysql is empty before running mariadb-prepare-db-dir. Jun 19 10:31:35 drbd1 systemd: mariadb.service: control process exited, code=exited status=1 Jun 19 10:31:35 drbd1 systemd: Failed to start MariaDB database server. Jun 19 10:31:35 drbd1 systemd: Unit mariadb.service entered failed state. Jun 19 10:31:35 drbd1 systemd: mariadb.service failed. Jun 19 10:31:35 drbd1 systemd: Starting PostgreSQL 9.5 database server... Jun 19 10:31:35 drbd1 postgresql95-check-db-dir: "/store/pgsql/" is missing or empty. 
Jun 19 10:31:35 drbd1 postgresql95-check-db-dir: Use "/usr/pgsql-9.5/bin/postgresql95-setup initdb" to initialize the database cluster. Jun 19 10:31:35 drbd1 postgresql95-check-db-dir: See /usr/share/doc/postgresql95-9.5.13/README.rpm-dist for more information. Jun 19 10:31:35 drbd1 systemd: postgresql-9.5.service: control process exited, code=exited status=1 Jun 19 10:31:35 drbd1 systemd: Failed to start PostgreSQL 9.5 database server. Jun 19 10:31:35 drbd1 systemd: Unit postgresql-9.5.service entered failed state. Jun 19 10:31:35 drbd1 systemd: postgresql-9.5.service failed. Jun 19 10:31:35 drbd1 systemd: Starting SYSV: Mongo is a scalable, document-oriented database.... Jun 19 10:31:35 drbd1 kernel: block drbd0: IO ERROR: neither local nor remote data, sector 1673593088+8 Jun 19 10:31:35 drbd1 kernel: EXT4-fs error (device drbd0): ext4_find_entry:1312: inode #52297729: comm mongod: reading directory lblock 0 Jun 19 10:31:35 drbd1 kernel: EXT4-fs (drbd0): previous I/O error to superblock detected Jun 19 10:31:35 drbd1 kernel: block drbd0: IO ERROR: neither local nor remote data, sector 0+8 Jun 19 10:31:35 drbd1 kernel: Buffer I/O error on dev drbd0, logical block 0, lost sync page write Jun 19 10:31:35 drbd1 mongod: Starting mongod: [FAILED] Jun 19 10:31:35 drbd1 systemd: mongod.service: control process exited, code=exited status=1 Jun 19 10:31:35 drbd1 systemd: Failed to start SYSV: Mongo is a scalable, document-oriented database.. Jun 19 10:31:35 drbd1 systemd: Unit mongod.service entered failed state. Jun 19 10:31:35 drbd1 systemd: mongod.service failed. Jun 19 10:31:36 drbd1 systemd: Starting MariaDB database server... Jun 19 10:31:36 drbd1 mariadb-prepare-db-dir: Database MariaDB is not initialized, but the directory /store/mysql is not empty, so initialization cannot be done. 
Jun 19 10:31:36 drbd1 kernel: EXT4-fs warning (device drbd0): __ext4_read_dirblock:902: error reading directory block (ino 50331649, block 0) Jun 19 10:31:36 drbd1 mariadb-prepare-db-dir: Make sure the /store/mysql is empty before running mariadb-prepare-db-dir. Jun 19 10:31:36 drbd1 systemd: mariadb.service: control process exited, code=exited status=1 Jun 19 10:31:36 drbd1 systemd: Failed to start MariaDB database server. Jun 19 10:31:36 drbd1 systemd: Unit mariadb.service entered failed state. Jun 19 10:31:36 drbd1 systemd: mariadb.service failed. Jun 19 10:31:36 drbd1 systemd: Starting PostgreSQL 9.5 database server... Jun 19 10:31:36 drbd1 postgresql95-check-db-dir: "/store/pgsql/" is missing or empty. Jun 19 10:31:36 drbd1 systemd: postgresql-9.5.service: control process exited, code=exited status=1 Jun 19 10:31:36 drbd1 postgresql95-check-db-dir: Use "/usr/pgsql-9.5/bin/postgresql95-setup initdb" to initialize the database cluster. Jun 19 10:31:36 drbd1 postgresql95-check-db-dir: See /usr/share/doc/postgresql95-9.5.13/README.rpm-dist for more information. Jun 19 10:31:36 drbd1 systemd: Failed to start PostgreSQL 9.5 database server. Jun 19 10:31:36 drbd1 systemd: Unit postgresql-9.5.service entered failed state. Jun 19 10:31:36 drbd1 systemd: postgresql-9.5.service failed. Jun 19 10:31:36 drbd1 systemd: Starting SYSV: Mongo is a scalable, document-oriented database.... Jun 19 10:31:36 drbd1 mongod: Starting mongod: [FAILED] Jun 19 10:31:36 drbd1 systemd: mongod.service: control process exited, code=exited status=1 Jun 19 10:31:36 drbd1 systemd: Failed to start SYSV: Mongo is a scalable, document-oriented database.. Jun 19 10:31:36 drbd1 systemd: Unit mongod.service entered failed state. Jun 19 10:31:36 drbd1 systemd: mongod.service failed. Jun 19 10:31:37 drbd1 systemd: Starting MariaDB database server... 
Jun 19 10:31:37 drbd1 kernel: EXT4-fs warning (device drbd0): __ext4_read_dirblock:902: error reading directory block (ino 50331649, block 0) Jun 19 10:31:37 drbd1 mariadb-prepare-db-dir: Database MariaDB is not initialized, but the directory /store/mysql is not empty, so initialization cannot be done. Jun 19 10:31:37 drbd1 mariadb-prepare-db-dir: Make sure the /store/mysql is empty before running mariadb-prepare-db-dir. Jun 19 10:31:37 drbd1 systemd: mariadb.service: control process exited, code=exited status=1 Jun 19 10:31:37 drbd1 systemd: Failed to start MariaDB database server. Jun 19 10:31:37 drbd1 systemd: Unit mariadb.service entered failed state. Jun 19 10:31:37 drbd1 systemd: mariadb.service failed. Jun 19 10:31:37 drbd1 systemd: Starting PostgreSQL 9.5 database server... Jun 19 10:31:37 drbd1 postgresql95-check-db-dir: "/store/pgsql/" is missing or empty. Jun 19 10:31:37 drbd1 postgresql95-check-db-dir: Use "/usr/pgsql-9.5/bin/postgresql95-setup initdb" to initialize the database cluster. Jun 19 10:31:37 drbd1 postgresql95-check-db-dir: See /usr/share/doc/postgresql95-9.5.13/README.rpm-dist for more information. Jun 19 10:31:37 drbd1 systemd: postgresql-9.5.service: control process exited, code=exited status=1 Jun 19 10:31:37 drbd1 systemd: Failed to start PostgreSQL 9.5 database server. Jun 19 10:31:37 drbd1 systemd: Unit postgresql-9.5.service entered failed state. Jun 19 10:31:37 drbd1 systemd: postgresql-9.5.service failed. Jun 19 10:31:37 drbd1 systemd: Starting SYSV: Mongo is a scalable, document-oriented database.... Jun 19 10:31:37 drbd1 mongod: Starting mongod: [FAILED] Jun 19 10:31:37 drbd1 systemd: mongod.service: control process exited, code=exited status=1 Jun 19 10:31:37 drbd1 systemd: Failed to start SYSV: Mongo is a scalable, document-oriented database.. Jun 19 10:31:37 drbd1 systemd: Unit mongod.service entered failed state. Jun 19 10:31:37 drbd1 systemd: mongod.service failed. 
Jun 19 10:31:38 drbd1 systemd: Starting MariaDB database server... Jun 19 10:31:38 drbd1 kernel: EXT4-fs warning (device drbd0): __ext4_read_dirblock:902: error reading directory block (ino 50331649, block 0) Jun 19 10:31:38 drbd1 mariadb-prepare-db-dir: Database MariaDB is not initialized, but the directory /store/mysql is not empty, so initialization cannot be done. Jun 19 10:31:38 drbd1 mariadb-prepare-db-dir: Make sure the /store/mysql is empty before running mariadb-prepare-db-dir. Jun 19 10:31:38 drbd1 systemd: mariadb.service: control process exited, code=exited status=1 Jun 19 10:31:38 drbd1 systemd: Failed to start MariaDB database server. Jun 19 10:31:38 drbd1 systemd: Unit mariadb.service entered failed state. Jun 19 10:31:38 drbd1 systemd: mariadb.service failed. Jun 19 10:31:38 drbd1 systemd: Starting PostgreSQL 9.5 database server... Jun 19 10:31:38 drbd1 postgresql95-check-db-dir: "/store/pgsql/" is missing or empty. Jun 19 10:31:38 drbd1 postgresql95-check-db-dir: Use "/usr/pgsql-9.5/bin/postgresql95-setup initdb" to initialize the database cluster. Jun 19 10:31:38 drbd1 systemd: postgresql-9.5.service: control process exited, code=exited status=1 Jun 19 10:31:38 drbd1 postgresql95-check-db-dir: See /usr/share/doc/postgresql95-9.5.13/README.rpm-dist for more information. Jun 19 10:31:38 drbd1 systemd: Failed to start PostgreSQL 9.5 database server. Jun 19 10:31:38 drbd1 systemd: Unit postgresql-9.5.service entered failed state. Jun 19 10:31:38 drbd1 systemd: postgresql-9.5.service failed. Jun 19 10:31:38 drbd1 systemd: Starting SYSV: Mongo is a scalable, document-oriented database.... Jun 19 10:31:38 drbd1 mongod: Starting mongod: [FAILED] Jun 19 10:31:38 drbd1 systemd: mongod.service: control process exited, code=exited status=1 Jun 19 10:31:38 drbd1 systemd: Failed to start SYSV: Mongo is a scalable, document-oriented database.. Jun 19 10:31:38 drbd1 systemd: Unit mongod.service entered failed state. 
Jun 19 10:31:38 drbd1 systemd: mongod.service failed. Jun 19 10:31:39 drbd1 systemd: Starting MariaDB database server... Jun 19 10:31:39 drbd1 kernel: EXT4-fs warning (device drbd0): __ext4_read_dirblock:902: error reading directory block (ino 50331649, block 0) Jun 19 10:31:39 drbd1 mariadb-prepare-db-dir: Database MariaDB is not initialized, but the directory /store/mysql is not empty, so initialization cannot be done. Jun 19 10:31:39 drbd1 mariadb-prepare-db-dir: Make sure the /store/mysql is empty before running mariadb-prepare-db-dir. Jun 19 10:31:39 drbd1 systemd: mariadb.service: control process exited, code=exited status=1 Jun 19 10:31:39 drbd1 systemd: Failed to start MariaDB database server. Jun 19 10:31:39 drbd1 systemd: Unit mariadb.service entered failed state. Jun 19 10:31:39 drbd1 systemd: mariadb.service failed. Jun 19 10:31:39 drbd1 systemd: Starting PostgreSQL 9.5 database server... Jun 19 10:31:39 drbd1 postgresql95-check-db-dir: "/store/pgsql/" is missing or empty. Jun 19 10:31:39 drbd1 postgresql95-check-db-dir: Use "/usr/pgsql-9.5/bin/postgresql95-setup initdb" to initialize the database cluster. Jun 19 10:31:39 drbd1 systemd: postgresql-9.5.service: control process exited, code=exited status=1 Jun 19 10:31:39 drbd1 postgresql95-check-db-dir: See /usr/share/doc/postgresql95-9.5.13/README.rpm-dist for more information. Jun 19 10:31:39 drbd1 systemd: Failed to start PostgreSQL 9.5 database server. Jun 19 10:31:39 drbd1 systemd: Unit postgresql-9.5.service entered failed state. Jun 19 10:31:39 drbd1 systemd: postgresql-9.5.service failed. Jun 19 10:31:39 drbd1 systemd: Starting SYSV: Mongo is a scalable, document-oriented database.... Jun 19 10:31:39 drbd1 mongod: Starting mongod: [FAILED] Jun 19 10:31:39 drbd1 systemd: mongod.service: control process exited, code=exited status=1 Jun 19 10:31:39 drbd1 systemd: Failed to start SYSV: Mongo is a scalable, document-oriented database.. 
Jun 19 10:31:39 drbd1 systemd: Unit mongod.service entered failed state. Jun 19 10:31:39 drbd1 systemd: mongod.service failed. Jun 19 10:31:40 drbd1 systemd: start request repeated too quickly for mariadb.service Jun 19 10:31:40 drbd1 systemd: Failed to start MariaDB database server. Jun 19 10:31:40 drbd1 systemd: mariadb.service failed. Jun 19 10:31:40 drbd1 systemd: start request repeated too quickly for postgresql-9.5.service Jun 19 10:31:40 drbd1 systemd: Failed to start PostgreSQL 9.5 database server. Jun 19 10:31:40 drbd1 systemd: postgresql-9.5.service failed. Jun 19 10:31:40 drbd1 systemd: start request repeated too quickly for mongod.service Jun 19 10:31:40 drbd1 systemd: Failed to start SYSV: Mongo is a scalable, document-oriented database.. Jun 19 10:31:40 drbd1 systemd: mongod.service failed. Jun 19 10:31:41 drbd1 systemd: start request repeated too quickly for mariadb.service Jun 19 10:31:41 drbd1 systemd: Failed to start MariaDB database server. Jun 19 10:31:41 drbd1 systemd: mariadb.service failed. Jun 19 10:31:41 drbd1 systemd: start request repeated too quickly for postgresql-9.5.service Jun 19 10:31:41 drbd1 systemd: Failed to start PostgreSQL 9.5 database server. Jun 19 10:31:41 drbd1 systemd: postgresql-9.5.service failed. Jun 19 10:31:41 drbd1 systemd: start request repeated too quickly for mongod.service Jun 19 10:31:41 drbd1 systemd: Failed to start SYSV: Mongo is a scalable, document-oriented database.. Jun 19 10:31:41 drbd1 systemd: mongod.service failed. Jun 19 10:31:43 drbd1 systemd: start request repeated too quickly for mariadb.service Jun 19 10:31:43 drbd1 systemd: Failed to start MariaDB database server. Jun 19 10:31:43 drbd1 systemd: mariadb.service failed. Jun 19 10:31:43 drbd1 systemd: start request repeated too quickly for postgresql-9.5.service Jun 19 10:31:43 drbd1 systemd: Failed to start PostgreSQL 9.5 database server. Jun 19 10:31:43 drbd1 systemd: postgresql-9.5.service failed. 
Jun 19 10:31:43 drbd1 systemd: start request repeated too quickly for mongod.service Jun 19 10:31:43 drbd1 systemd: Failed to start SYSV: Mongo is a scalable, document-oriented database.. Jun 19 10:31:43 drbd1 systemd: mongod.service failed. Jun 19 10:31:44 drbd1 systemd: start request repeated too quickly for mariadb.service Jun 19 10:31:44 drbd1 systemd: Failed to start MariaDB database server. Jun 19 10:31:44 drbd1 systemd: mariadb.service failed. Jun 19 10:31:44 drbd1 systemd: start request repeated too quickly for postgresql-9.5.service Jun 19 10:31:44 drbd1 systemd: Failed to start PostgreSQL 9.5 database server. Jun 19 10:31:44 drbd1 systemd: postgresql-9.5.service failed. Jun 19 10:31:44 drbd1 systemd: start request repeated too quickly for mongod.service Jun 19 10:31:44 drbd1 systemd: Failed to start SYSV: Mongo is a scalable, document-oriented database.. Jun 19 10:31:44 drbd1 systemd: mongod.service failed. Jun 19 10:31:45 drbd1 systemd: Starting MariaDB database server... Jun 19 10:31:45 drbd1 kernel: block drbd0: 4 messages suppressed in /builddir/build/BUILD/drbd-8.4.11-1/drbd/drbd_req.c:1446. Jun 19 10:31:45 drbd1 kernel: block drbd0: IO ERROR: neither local nor remote data, sector 1610678528+8 Jun 19 10:31:45 drbd1 kernel: EXT4-fs warning (device drbd0): __ext4_read_dirblock:902: error reading directory block (ino 50331649, block 0) Jun 19 10:31:45 drbd1 mariadb-prepare-db-dir: Database MariaDB is not initialized, but the directory /store/mysql is not empty, so initialization cannot be done. Jun 19 10:31:45 drbd1 mariadb-prepare-db-dir: Make sure the /store/mysql is empty before running mariadb-prepare-db-dir. Jun 19 10:31:45 drbd1 systemd: mariadb.service: control process exited, code=exited status=1 Jun 19 10:31:45 drbd1 systemd: Failed to start MariaDB database server. Jun 19 10:31:45 drbd1 systemd: Unit mariadb.service entered failed state. Jun 19 10:31:45 drbd1 systemd: mariadb.service failed. 
Jun 19 10:31:45 drbd1 systemd: start request repeated too quickly for postgresql-9.5.service Jun 19 10:31:45 drbd1 systemd: Failed to start PostgreSQL 9.5 database server. Jun 19 10:31:45 drbd1 systemd: postgresql-9.5.service failed. Jun 19 10:31:45 drbd1 systemd: start request repeated too quickly for mongod.service Jun 19 10:31:45 drbd1 systemd: Failed to start SYSV: Mongo is a scalable, document-oriented database.. Jun 19 10:31:45 drbd1 systemd: mongod.service failed. Jun 19 10:31:46 drbd1 systemd: Starting MariaDB database server... Jun 19 10:31:46 drbd1 kernel: block drbd0: IO ERROR: neither local nor remote data, sector 1610678528+8 Jun 19 10:31:46 drbd1 kernel: EXT4-fs warning (device drbd0): __ext4_read_dirblock:902: error reading directory block (ino 50331649, block 0) Jun 19 10:31:46 drbd1 mariadb-prepare-db-dir: Database MariaDB is not initialized, but the directory /store/mysql is not empty, so initialization cannot be done. Jun 19 10:31:46 drbd1 mariadb-prepare-db-dir: Make sure the /store/mysql is empty before running mariadb-prepare-db-dir. Jun 19 10:31:46 drbd1 systemd: mariadb.service: control process exited, code=exited status=1 Jun 19 10:31:46 drbd1 systemd: Failed to start MariaDB database server. Jun 19 10:31:46 drbd1 systemd: Unit mariadb.service entered failed state. Jun 19 10:31:46 drbd1 systemd: mariadb.service failed. Jun 19 10:31:46 drbd1 systemd: Starting PostgreSQL 9.5 database server... Jun 19 10:31:46 drbd1 postgresql95-check-db-dir: "/store/pgsql/" is missing or empty. Jun 19 10:31:46 drbd1 postgresql95-check-db-dir: Use "/usr/pgsql-9.5/bin/postgresql95-setup initdb" to initialize the database cluster. Jun 19 10:31:46 drbd1 postgresql95-check-db-dir: See /usr/share/doc/postgresql95-9.5.13/README.rpm-dist for more information. Jun 19 10:31:46 drbd1 systemd: postgresql-9.5.service: control process exited, code=exited status=1 Jun 19 10:31:46 drbd1 systemd: Failed to start PostgreSQL 9.5 database server. 
Jun 19 10:31:46 drbd1 systemd: Unit postgresql-9.5.service entered failed state. Jun 19 10:31:46 drbd1 systemd: postgresql-9.5.service failed. Jun 19 10:31:46 drbd1 systemd: Starting SYSV: Mongo is a scalable, document-oriented database.... Jun 19 10:31:46 drbd1 mongod: Starting mongod: [FAILED] Jun 19 10:31:46 drbd1 systemd: mongod.service: control process exited, code=exited status=1 Jun 19 10:31:46 drbd1 systemd: Failed to start SYSV: Mongo is a scalable, document-oriented database.. Jun 19 10:31:46 drbd1 systemd: Unit mongod.service entered failed state. Jun 19 10:31:46 drbd1 systemd: mongod.service failed. Jun 19 10:31:46 drbd1 systemd: Starting MariaDB database server... Jun 19 10:31:46 drbd1 mariadb-prepare-db-dir: Database MariaDB is not initialized, but the directory /store/mysql is not empty, so initialization cannot be done. Jun 19 10:31:46 drbd1 kernel: EXT4-fs warning (device drbd0): __ext4_read_dirblock:902: error reading directory block (ino 50331649, block 0) Jun 19 10:31:46 drbd1 mariadb-prepare-db-dir: Make sure the /store/mysql is empty before running mariadb-prepare-db-dir. Jun 19 10:31:46 drbd1 systemd: mariadb.service: control process exited, code=exited status=1 Jun 19 10:31:46 drbd1 systemd: Failed to start MariaDB database server. Jun 19 10:31:46 drbd1 systemd: Unit mariadb.service entered failed state. Jun 19 10:31:46 drbd1 systemd: mariadb.service failed. Jun 19 10:31:46 drbd1 systemd: Starting PostgreSQL 9.5 database server... Jun 19 10:31:46 drbd1 postgresql95-check-db-dir: "/store/pgsql/" is missing or empty. Jun 19 10:31:46 drbd1 postgresql95-check-db-dir: Use "/usr/pgsql-9.5/bin/postgresql95-setup initdb" to initialize the database cluster. Jun 19 10:31:46 drbd1 postgresql95-check-db-dir: See /usr/share/doc/postgresql95-9.5.13/README.rpm-dist for more information. 
Jun 19 10:31:46 drbd1 systemd: postgresql-9.5.service: control process exited, code=exited status=1 Jun 19 10:31:46 drbd1 systemd: Failed to start PostgreSQL 9.5 database server. Jun 19 10:31:46 drbd1 systemd: Unit postgresql-9.5.service entered failed state. Jun 19 10:31:46 drbd1 systemd: postgresql-9.5.service failed. Jun 19 10:31:46 drbd1 systemd: Starting SYSV: Mongo is a scalable, document-oriented database.... Jun 19 10:31:46 drbd1 mongod: Starting mongod: [FAILED] Jun 19 10:31:46 drbd1 systemd: mongod.service: control process exited, code=exited status=1 Jun 19 10:31:46 drbd1 systemd: Failed to start SYSV: Mongo is a scalable, document-oriented database.. Jun 19 10:31:46 drbd1 systemd: Unit mongod.service entered failed state. Jun 19 10:31:46 drbd1 systemd: mongod.service failed. Jun 19 10:31:48 drbd1 kernel: EXT4-fs error (device drbd0): ext4_wait_block_bitmap:497: comm umount: Cannot read block bitmap - block_group = 5640, block_bitmap = 184549384 Jun 19 10:31:48 drbd1 kernel: EXT4-fs (drbd0): previous I/O error to superblock detected Jun 19 10:31:48 drbd1 kernel: Buffer I/O error on dev drbd0, logical block 0, lost sync page write Jun 19 10:31:48 drbd1 kernel: EXT4-fs error (device drbd0): ext4_discard_preallocations:4013: comm umount: Error reading block bitmap for 5640 Jun 19 10:31:48 drbd1 kernel: EXT4-fs (drbd0): previous I/O error to superblock detected Jun 19 10:31:48 drbd1 kernel: Buffer I/O error on dev drbd0, logical block 0, lost sync page write Jun 19 10:31:49 drbd1 kernel: EXT4-fs error (device drbd0): ext4_wait_block_bitmap:497: comm umount: Cannot read block bitmap - block_group = 6142, block_bitmap = 200802318 Jun 19 10:31:49 drbd1 kernel: EXT4-fs (drbd0): previous I/O error to superblock detected Jun 19 10:31:49 drbd1 kernel: Buffer I/O error on dev drbd0, logical block 0, lost sync page write Jun 19 10:31:49 drbd1 kernel: EXT4-fs error (device drbd0): ext4_discard_preallocations:4013: comm umount: Error reading block bitmap for 6142 Jun 
19 10:31:49 drbd1 kernel: EXT4-fs (drbd0): previous I/O error to superblock detected Jun 19 10:31:49 drbd1 kernel: Buffer I/O error on dev drbd0, logical block 0, lost sync page write Jun 19 10:31:50 drbd1 kernel: EXT4-fs error (device drbd0): ext4_wait_block_bitmap:497: comm umount: Cannot read block bitmap - block_group = 1270, block_bitmap = 41418758 Jun 19 10:31:50 drbd1 kernel: EXT4-fs (drbd0): previous I/O error to superblock detected Jun 19 10:31:50 drbd1 kernel: Buffer I/O error on dev drbd0, logical block 0, lost sync page write Jun 19 10:31:50 drbd1 kernel: EXT4-fs error (device drbd0): ext4_discard_preallocations:4013: comm umount: Error reading block bitmap for 1270 Jun 19 10:31:50 drbd1 kernel: EXT4-fs (drbd0): previous I/O error to superblock detected Jun 19 10:31:50 drbd1 kernel: Buffer I/O error on dev drbd0, logical block 0, lost sync page write Jun 19 10:31:50 drbd1 kernel: EXT4-fs error (device drbd0): ext4_wait_block_bitmap:497: comm umount: Cannot read block bitmap - block_group = 1275, block_bitmap = 41418763 Jun 19 10:31:50 drbd1 kernel: EXT4-fs (drbd0): previous I/O error to superblock detected Jun 19 10:31:50 drbd1 kernel: Buffer I/O error on dev drbd0, logical block 0, lost sync page write Jun 19 10:31:50 drbd1 kernel: EXT4-fs error (device drbd0): ext4_discard_preallocations:4013: comm umount: Error reading block bitmap for 1275 Jun 19 10:31:50 drbd1 kernel: EXT4-fs (drbd0): previous I/O error to superblock detected Jun 19 10:31:50 drbd1 kernel: Buffer I/O error on dev drbd0, logical block 0, lost sync page write Jun 19 10:31:50 drbd1 kernel: VFS: Dirty inode writeback failed for block device drbd0 (err=-5). Jun 19 10:31:50 drbd1 kernel: block drbd0: role( Primary -> Secondary ) Jun 19 10:31:53 drbd1 systemd: Stopped Heartbeat High Availability Cluster Communication and Membership. Jun 19 10:31:57 drbd1 systemd: Started Heartbeat High Availability Cluster Communication and Membership. 
Jun 19 10:31:57 drbd1 systemd: Starting Heartbeat High Availability Cluster Communication and Membership... Jun 19 10:31:57 drbd1 heartbeat: Jun 19 10:31:57 drbd1.db.com heartbeat: [4628]: WARN: heartbeat: udp port 1112 reserved for service "icp". Jun 19 10:31:57 drbd1 heartbeat: heartbeat: udpport setting must precede media statementsheartbeat: baudrate setting must precede media statementsJun 19 10:31:57 drbd1.db.com heartbeat: [4628]: info: Pacemaker support: false Jun 19 10:31:57 drbd1 heartbeat: Jun 19 10:31:57 drbd1.db.com heartbeat: [4628]: WARN: Logging daemon is disabled --enabling logging daemon is recommended Jun 19 10:31:57 drbd1 heartbeat: Jun 19 10:31:57 drbd1.db.com heartbeat: [4628]: info: ************************** Jun 19 10:31:57 drbd1 heartbeat: Jun 19 10:31:57 drbd1.db.com heartbeat: [4628]: info: Configuration validated. Starting heartbeat 3.0.6 Jun 19 10:32:03 drbd1 ipfail: [4655]: info: Ping node count is balanced. Jun 19 10:32:20 drbd1 systemd: Stopping DRBD -- please disable. Unless you are NOT using a cluster manager.... Jun 19 10:32:21 drbd1 kernel: drbd r0: Terminating drbd_w_r0 Jun 19 10:32:21 drbd1 kernel: drbd: module cleanup done. Jun 19 10:32:21 drbd1 drbd: Stopping all DRBD resources: . Jun 19 10:32:21 drbd1 systemd: Starting DRBD -- please disable. Unless you are NOT using a cluster manager.... Jun 19 10:32:21 drbd1 kernel: Request for unknown module key 'The ELRepo Project (http://elrepo.org): ELRepo.org Secure Boot Key: f365ad3481a7b20e3427b61b2a26635b83fe427b' err -11 Jun 19 10:32:21 drbd1 kernel: drbd: initialized. Version: 8.4.11-1 (api:1/proto:86-101) Jun 19 10:32:21 drbd1 kernel: drbd: GIT-hash: 66145a308421e9c124ec391a7848ac20203bb03c build by mockbuild@, 2018-04-26 12:10:42 Jun 19 10:32:21 drbd1 kernel: drbd: registered as block device major 147 Jun 19 10:32:21 drbd1 drbd: Starting DRBD resources: drbd.d/db.res:18: in resource r0, on drbd1.db.com: Jun 19 10:32:21 drbd1 drbd: IP 192.168.60.54 not found on this host. 
Jun 19 10:32:21 drbd1 systemd: drbd.service: main process exited, code=exited, status=20/n/a Jun 19 10:32:21 drbd1 systemd: Failed to start DRBD -- please disable. Unless you are NOT using a cluster manager.. Jun 19 10:32:21 drbd1 systemd: Unit drbd.service entered failed state. Jun 19 10:32:21 drbd1 systemd: drbd.service failed. Jun 19 10:32:28 drbd1 systemd: Starting DRBD -- please disable. Unless you are NOT using a cluster manager.... Jun 19 10:32:28 drbd1 drbd: Starting DRBD resources: drbd.d/db.res:18: in resource r0, on drbd1.db.com: Jun 19 10:32:28 drbd1 drbd: IP 192.168.60.54 not found on this host. Jun 19 10:32:28 drbd1 systemd: drbd.service: main process exited, code=exited, status=20/n/a Jun 19 10:32:28 drbd1 systemd: Failed to start DRBD -- please disable. Unless you are NOT using a cluster manager.. Jun 19 10:32:28 drbd1 systemd: Unit drbd.service entered failed state. Jun 19 10:32:28 drbd1 systemd: drbd.service failed. Jun 19 10:47:00 drbd1 kernel: e1000: ens34 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None Jun 19 10:47:00 drbd1 kernel: IPv6: ADDRCONF(NETDEV_CHANGE): ens34: link becomes ready Jun 19 10:47:00 drbd1 NetworkManager[806]: <info> [1529376420.7939] device (ens34): carrier: link connected Jun 19 10:47:00 drbd1 NetworkManager[806]: <info> [1529376420.7946] device (ens34): state change: unavailable -> disconnected (reason 'carrier-changed', sys-iface-state: 'managed') Jun 19 10:47:00 drbd1 NetworkManager[806]: <info> [1529376420.7958] policy: auto-activating connection 'ens34' Jun 19 10:47:00 drbd1 NetworkManager[806]: <info> [1529376420.7975] device (ens34): Activation: starting connection 'ens34' (94aea789-efb3-ef4c-81b0-e8b18ecc9797) Jun 19 10:47:00 drbd1 NetworkManager[806]: <info> [1529376420.7978] device (ens34): state change: disconnected -> prepare (reason 'none', sys-iface-state: 'managed') Jun 19 10:47:00 drbd1 NetworkManager[806]: <info> [1529376420.7986] device (ens34): state change: prepare -> config (reason 'none', 
sys-iface-state: 'managed') Jun 19 10:47:00 drbd1 NetworkManager[806]: <info> [1529376420.8010] device (ens34): state change: config -> ip-config (reason 'none', sys-iface-state: 'managed') Jun 19 10:47:00 drbd1 NetworkManager[806]: <info> [1529376420.9347] device (ens34): state change: ip-config -> ip-check (reason 'none', sys-iface-state: 'managed') Jun 19 10:47:00 drbd1 NetworkManager[806]: <info> [1529376420.9363] device (ens34): state change: ip-check -> secondaries (reason 'none', sys-iface-state: 'managed') Jun 19 10:47:00 drbd1 NetworkManager[806]: <info> [1529376420.9367] device (ens34): state change: secondaries -> activated (reason 'none', sys-iface-state: 'managed') Jun 19 10:47:00 drbd1 NetworkManager[806]: <info> [1529376420.9390] device (ens34): Activation: successful, device activated. Jun 19 10:47:00 drbd1 dbus[801]: [system] Activating via systemd: service name='org.freedesktop.nm_dispatcher' unit='dbus-org.freedesktop.nm-dispatcher.service' Jun 19 10:47:00 drbd1 systemd: Starting Network Manager Script Dispatcher Service... Jun 19 10:47:00 drbd1 dbus[801]: [system] Successfully activated service 'org.freedesktop.nm_dispatcher' Jun 19 10:47:00 drbd1 systemd: Started Network Manager Script Dispatcher Service. Jun 19 10:47:00 drbd1 nm-dispatcher: req:1 'up' [ens34]: new request (3 scripts) Jun 19 10:47:00 drbd1 nm-dispatcher: req:1 'up' [ens34]: start running ordered scripts... Jun 19 10:47:01 drbd1 ipfail: [4655]: info: Link Status update: Link drbd2.db.com/ens34 now has status up Jun 19 10:48:22 drbd1 systemd: Starting DRBD -- please disable. Unless you are NOT using a cluster manager.... 
Jun 19 10:48:22 drbd1 drbd: Starting DRBD resources: [ Jun 19 10:48:22 drbd1 drbd: create res: r0 Jun 19 10:48:22 drbd1 drbd: prepare disk: r0 Jun 19 10:48:23 drbd1 kernel: drbd r0: Starting worker thread (from drbdsetup-84 [4920]) Jun 19 10:48:23 drbd1 kernel: block drbd0: disk( Diskless -> Attaching ) Jun 19 10:48:23 drbd1 kernel: drbd r0: Method to ensure write ordering: flush Jun 19 10:48:23 drbd1 kernel: block drbd0: max BIO size = 1048576 Jun 19 10:48:23 drbd1 kernel: block drbd0: Adjusting my ra_pages to backing device's (32 -> 1024) Jun 19 10:48:23 drbd1 kernel: block drbd0: drbd_bm_resize called with capacity == 2023943792 Jun 19 10:48:23 drbd1 kernel: block drbd0: resync bitmap: bits=252992974 words=3953016 pages=7721 Jun 19 10:48:23 drbd1 kernel: block drbd0: size = 965 GB (1011971896 KB) Jun 19 10:48:23 drbd1 kernel: block drbd0: recounting of set bits took additional 5 jiffies Jun 19 10:48:23 drbd1 kernel: block drbd0: 4948 MB (1266688 bits) marked out-of-sync by on disk bit-map. 
Jun 19 10:48:23 drbd1 kernel: block drbd0: disk( Attaching -> UpToDate ) Jun 19 10:48:23 drbd1 kernel: block drbd0: attached to UUIDs A918D4C3621EAB6C:0000000000000000:5A76D6F6AD549605:5A75D6F6AD549605 Jun 19 10:48:23 drbd1 drbd: adjust disk: r0 Jun 19 10:48:23 drbd1 drbd: adjust net: r0 Jun 19 10:48:23 drbd1 drbd: ] Jun 19 10:48:23 drbd1 kernel: drbd r0: conn( StandAlone -> Unconnected ) Jun 19 10:48:23 drbd1 kernel: drbd r0: Starting receiver thread (from drbd_w_r0 [4921]) Jun 19 10:48:23 drbd1 kernel: drbd r0: receiver (re)started Jun 19 10:48:23 drbd1 kernel: drbd r0: conn( Unconnected -> WFConnection ) Jun 19 10:48:23 drbd1 drbd: WARN: stdin/stdout is not a TTY; using /dev/consoleoutdated-wfc-timeout has to be shorter than degr-wfc-timeout Jun 19 10:48:23 drbd1 drbd: outdated-wfc-timeout implicitly set to degr-wfc-timeout (120s) Jun 19 10:48:23 drbd1 kernel: drbd r0: Handshake successful: Agreed network protocol version 101 Jun 19 10:48:23 drbd1 kernel: drbd r0: Feature flags enabled on protocol level: 0xf TRIM THIN_RESYNC WRITE_SAME WRITE_ZEROES. Jun 19 10:48:23 drbd1 kernel: drbd r0: conn( WFConnection -> WFReportParams ) Jun 19 10:48:23 drbd1 kernel: drbd r0: Starting ack_recv thread (from drbd_r_r0 [4926]) Jun 19 10:48:23 drbd1 kernel: block drbd0: drbd_sync_handshake: Jun 19 10:48:23 drbd1 kernel: block drbd0: self A918D4C3621EAB6C:0000000000000000:5A76D6F6AD549605:5A75D6F6AD549605 bits:1266688 flags:0 Jun 19 10:48:23 drbd1 kernel: block drbd0: peer C6B3D27C4098E93F:A918D4C3621EAB6C:5A76D6F6AD549604:5A75D6F6AD549605 bits:8839904 flags:0 Jun 19 10:48:23 drbd1 kernel: block drbd0: uuid_compare()=-1 by rule 50 Jun 19 10:48:23 drbd1 kernel: block drbd0: peer( Unknown -> Primary ) conn( WFReportParams -> WFBitMapT ) disk( UpToDate -> Outdated ) pdsk( DUnknown -> UpToDate ) Jun 19 10:48:23 drbd1 drbd: . Jun 19 10:48:23 drbd1 systemd: Started DRBD -- please disable. Unless you are NOT using a cluster manager.. 
Jun 19 10:48:23 drbd1 kernel: block drbd0: receive bitmap stats [Bytes(packets)]: plain 0(0), RLE 1081767(265), total 1081767; compression: 96.6% Jun 19 10:48:23 drbd1 kernel: block drbd0: send bitmap stats [Bytes(packets)]: plain 0(0), RLE 1041300(255), total 1041300; compression: 96.8% Jun 19 10:48:23 drbd1 kernel: block drbd0: conn( WFBitMapT -> WFSyncUUID ) Jun 19 10:48:24 drbd1 kernel: block drbd0: updated sync uuid A919D4C3621EAB6C:0000000000000000:5A76D6F6AD549605:5A75D6F6AD549605 Jun 19 10:48:24 drbd1 kernel: block drbd0: helper command: /sbin/drbdadm before-resync-target minor-0 Jun 19 10:48:24 drbd1 kernel: block drbd0: helper command: /sbin/drbdadm before-resync-target minor-0 exit code 0 (0x0) Jun 19 10:48:24 drbd1 kernel: block drbd0: conn( WFSyncUUID -> SyncTarget ) disk( Outdated -> Inconsistent ) Jun 19 10:48:24 drbd1 kernel: block drbd0: Began resync as SyncTarget (will sync 39533428 KB [9883357 bits set]).
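4. Finally, I resynced the data from drbd2 back to the original primary server, and the resources then switched back to it as well. It is still unclear whether the original primary will hit I/O errors again.
In the virtual machine test environment, I disconnected the 60 network segment link as a test. The DRBD logs in that normal case were: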
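Since it is unknown whether the I/O errors will come back, one option is to keep scanning the messages log for the error lines seen above. The following is a minimal sketch, not a finished monitoring tool: the patterns are copied from the kernel messages quoted in this post, while the category names, the function name `count_io_errors`, and the `READ` alternative in the first pattern are my own assumptions.

```python
import re
from collections import Counter

# Patterns taken from the kernel messages quoted in this post.
# The dictionary keys are hypothetical labels chosen for this sketch.
IO_ERROR_PATTERNS = {
    # The post only shows "local WRITE IO error"; READ is an assumed variant.
    "drbd_local_io_error": re.compile(r"block drbd\d+: local (?:READ|WRITE) IO error"),
    "drbd_no_data": re.compile(r"block drbd\d+: IO ERROR: neither local nor remote data"),
    "buffer_io_error": re.compile(r"Buffer I/O error on dev drbd\d+"),
    "ext4_error": re.compile(r"EXT4-fs error \(device drbd\d+\)"),
}


def count_io_errors(lines):
    """Count how many log lines match each I/O error pattern."""
    counts = Counter()
    for line in lines:
        for name, pattern in IO_ERROR_PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1
    return counts


if __name__ == "__main__":
    sample = [
        "Jun 15 08:56:06 drbd1 kernel: block drbd0: local WRITE IO error sector 1600487576+8 on dm-2",
        "Jun 19 10:31:45 drbd1 kernel: block drbd0: IO ERROR: neither local nor remote data, sector 1610678528+8",
        "Jun 19 10:31:48 drbd1 kernel: Buffer I/O error on dev drbd0, logical block 0, lost sync page write",
        "Jun 19 10:32:03 drbd1 ipfail: [4655]: info: Ping node count is balanced.",
    ]
    for name, total in sorted(count_io_errors(sample).items()):
        print(name, total)
```

In practice the `lines` argument could be an open file handle on the messages log. Any non-zero count after the resync would suggest the backing disk on the original primary is still unhealthy.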
messages log:
Jun 19 11:18:02 drbd1 kernel: e1000: eth1 NIC Link is Down
Jun 19 11:18:03 drbd1 kernel: d-con r0: PingAck did not arrive in time.
Jun 19 11:18:03 drbd1 kernel: d-con r0: peer( Secondary -> Unknown ) conn( Connected -> NetworkFailure ) pdsk( UpToDate -> DUnknown )
Jun 19 11:18:03 drbd1 kernel: d-con r0: asender terminated
Jun 19 11:18:03 drbd1 kernel: d-con r0: Terminating drbd_a_r0
Jun 19 11:18:03 drbd1 kernel: block drbd0: new current UUID 5BEECA7E9782384F:457559D613869C63:90760DB10A2C370F:90750DB10A2C370F
Jun 19 11:18:03 drbd1 kernel: d-con r0: Connection closed
Jun 19 11:18:03 drbd1 kernel: d-con r0: out of mem, failed to invoke fence-peer helper
Jun 19 11:18:03 drbd1 kernel: d-con r0: conn( NetworkFailure -> Unconnected )
Jun 19 11:18:03 drbd1 kernel: d-con r0: receiver terminated
Jun 19 11:18:03 drbd1 kernel: d-con r0: Restarting receiver thread
Jun 19 11:18:03 drbd1 kernel: d-con r0: receiver (re)started
Jun 19 11:18:03 drbd1 kernel: d-con r0: conn( Unconnected -> WFConnection )
Jun 19 11:18:30 drbd1 ipfail: [1729]: info: Link Status update: Link drbd2.gxm.com/eth1 now has status dead
Jun 19 11:18:31 drbd1 ipfail: [1729]: info: Asking other side for ping node count.
Jun 19 11:18:31 drbd1 ipfail: [1729]: info: Checking remote count of ping nodes.
Jun 19 11:18:33 drbd1 ipfail: [1729]: info: Ping node count is balanced.
Jun 19 11:18:34 drbd1 ipfail: [1729]: info: No giveup timer to abort.
ha-debug log:
Jun 19 11:18:30 drbd1.gxm.com heartbeat: [1680]: info: Link drbd2.gxm.com:eth1 dead.
Jun 19 11:18:30 drbd1.gxm.com ipfail: [1729]: info: Link Status update: Link drbd2.gxm.com/eth1 now has status dead
Jun 19 11:18:30 drbd1.gxm.com ipfail: [1729]: debug: Found ping node 192.168.1.1!
Jun 19 11:18:31 drbd1.gxm.com ipfail: [1729]: info: Asking other side for ping node count.
Jun 19 11:18:31 drbd1.gxm.com ipfail: [1729]: debug: Message [num_ping] sent.
Jun 19 11:18:31 drbd1.gxm.com ipfail: [1729]: info: Checking remote count of ping nodes.
Jun 19 11:18:32 drbd1.gxm.com ipfail: [1729]: debug: Got asked for num_ping.
Jun 19 11:18:32 drbd1.gxm.com ipfail: [1729]: debug: Found ping node 192.168.1.1!
Jun 19 11:18:33 drbd1.gxm.com ipfail: [1729]: info: Ping node count is balanced.
Jun 19 11:18:33 drbd1.gxm.com ipfail: [1729]: debug: Abort message sent.
Jun 19 11:18:34 drbd1.gxm.com ipfail: [1729]: info: No giveup timer to abort
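To compare the healthy test run with the production failure, it helps to pull only the DRBD state changes out of the log noise. The sketch below assumes the `name( Old -> New )` format seen in the kernel lines of this post; the function names `state_transitions` and `chain` are hypothetical helpers, not part of any DRBD tooling.

```python
import re

# Matches DRBD state changes such as "conn( Connected -> NetworkFailure )".
TRANSITION = re.compile(r"(\w+)\( (\w+) -> (\w+) \)")


def state_transitions(lines):
    """Return (field, old_state, new_state) tuples in log order."""
    found = []
    for line in lines:
        found.extend(TRANSITION.findall(line))
    return found


def chain(lines, field):
    """Collapse the transitions of one field (e.g. "conn") into a state sequence."""
    states = []
    for name, old, new in state_transitions(lines):
        if name != field:
            continue
        if not states:
            states.append(old)  # starting state comes from the first transition
        states.append(new)
    return states


if __name__ == "__main__":
    test_env = [
        "Jun 19 11:18:03 drbd1 kernel: d-con r0: peer( Secondary -> Unknown ) conn( Connected -> NetworkFailure ) pdsk( UpToDate -> DUnknown )",
        "Jun 19 11:18:03 drbd1 kernel: d-con r0: conn( NetworkFailure -> Unconnected )",
        "Jun 19 11:18:03 drbd1 kernel: d-con r0: conn( Unconnected -> WFConnection )",
    ]
    print("conn:", " -> ".join(chain(test_env, "conn")))
    print("disk:", chain(test_env, "disk"))
```

On the test environment lines the `disk` chain is empty, which matches what the log shows: the local disk never leaves UpToDate when only the replication link is cut. On the production log the same helper would surface the earlier `disk( Failed -> Diskless )` change, which is the real difference between the two cases.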