BeeGFS Development Environment Setup 1: Environment Configuration

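BeeGFS is widely used as distributed file storage in high-performance computing, and it handles very large numbers of small files considerably better than most other file systems. This post walks through the setup in detail. The environment uses three servers. One disk is split into two partitions (one for metadata, one for data storage), and there are two network interfaces (one for the management network, one for the storage network).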

Network configuration

Network allocation

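The management network carries SSH logins and similar traffic, all over gigabit ports. The storage network carries data traffic over two 10-gigabit ports bonded in mode 6 (balance-alb).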

Hostname  Management network  Storage network  Services deployed
sacd01    172.29.201.125      172.29.39.125    mgmt, meta(2), storage(401,402), client
sacd02    172.29.201.126      172.29.39.126    meta(3), storage(501,502), client
sacd03    172.29.201.133      172.29.39.113    client

The configuration below uses sacd01 as the example.

DNS and hostname

sacd01 $ cat /etc/hosts
172.29.201.125 sacd01
172.29.201.126 sacd02
172.29.201.133 sacd03

sacd01 $ hostname
sacd01

sacd01 $ cat /etc/hostname 
sacd01
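
The same three hosts entries have to exist on every node. A minimal sketch of a helper that adds a mapping only when it is missing is shown below; the function name ensure_host_entry and its optional file argument are hypothetical and exist only for this illustration.

```shell
#!/bin/sh
# ensure_host_entry IP NAME [FILE]
# Append "IP NAME" to FILE (default /etc/hosts) unless a line already maps
# IP to NAME. Hypothetical helper for illustration, not part of BeeGFS.
ensure_host_entry() {
    ip=$1
    name=$2
    file=${3:-/etc/hosts}
    # Look for a line whose first field is IP and that lists NAME.
    if awk -v ip="$ip" -v name="$name" '
            $1 == ip { for (i = 2; i <= NF; i++) if ($i == name) found = 1 }
            END { exit !found }' "$file" 2>/dev/null; then
        return 0
    fi
    printf '%s %s\n' "$ip" "$name" >> "$file"
}

# Example (run as root on each node):
#   ensure_host_entry 172.29.201.125 sacd01
#   ensure_host_entry 172.29.201.126 sacd02
#   ensure_host_entry 172.29.201.133 sacd03
```

Running it twice leaves the file unchanged, so it can be reused safely when another node is added later.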

NIC bonding

  • Check the bonding status:
sacd01 $ eths=$(lspci | grep Ethernet); for bond in `ls -d /sys/class/net/bond[0-9]`; do nics=$(ls -d $bond/slave* | awk -F_ -v bond=$bond '$1~bond {print $2}' | tr -d ":"); for nic in $nics; do addr=$(ethtool -i $nic | awk '$1~"bus-info" {print $2}' | awk -F: '{printf $2":"$3}'); echo -e $(echo $bond | awk -F/ '{print $5}') "\t" $nic "\t" $addr "\t" $(echo -e "$eths" | grep $addr | awk -F: '{printf $3}'); done; done
bond0  enp26s0f2  1a:00.2 Intel Corporation Ethernet Connection X722 for 1GbE (rev 09)
bond1  enp95s0f0  5f:00.0 Intel Corporation Ethernet Controller X710 for 10GbE SFP+ (rev 01)
bond1  enp95s0f1  5f:00.1 Intel Corporation Ethernet Controller X710 for 10GbE SFP+ (rev 01)

sacd01 $ ip a | grep bond1
6: enp95s0f0: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1600 qdisc mq master bond1 state UP group default qlen 1000
7: enp95s0f1: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1600 qdisc mq master bond1 state UP group default qlen 1000
8: bond1: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1600 qdisc noqueue state UP group default qlen 1000
9: bond1.1039@bond1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1600 qdisc noqueue state UP group default qlen 1000
    inet 172.29.39.125/24 brd 172.29.39.255 scope global noprefixroute bond1.1039

sacd01 $ ifconfig bond1.1039
bond1.1039: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1600
        inet 172.29.39.125  netmask 255.255.255.0  broadcast 172.29.39.255
        inet6 fe80::6e92:bfff:fe67:9bdc  prefixlen 64  scopeid 0x20<link>
        ether 6c:92:bf:67:9b:dc  txqueuelen 1000  (Ethernet)
        RX packets 10  bytes 528 (528.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 16  bytes 992 (992.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

sacd01 $ cat /sys/class/net/bond1/bonding/mode 
balance-alb 6
  • Check the bonding configuration:
sacd01 $ cat /etc/sysconfig/network-scripts/ifcfg-enp95s0f0
TYPE=Ethernet
NAME="enp95s0f0"
DEVICE=enp95s0f0
ONBOOT=yes
BOOTPROTO=none
MASTER=bond1
SLAVE=yes

sacd01 $ cat /etc/sysconfig/network-scripts/ifcfg-enp95s0f1
TYPE=Ethernet
NAME="enp95s0f1"
DEVICE=enp95s0f1
ONBOOT=yes
BOOTPROTO=none
MASTER=bond1
SLAVE=yes

sacd01 $ cat /etc/sysconfig/network-scripts/ifcfg-bond1
DEVICE=bond1
BOOTPROTO=none
ONBOOT=yes
TYPE=Bond
NM_CONTROLLED=no
IPV6INIT=no
USERCTL=no
BONDING_MASTER=yes
BONDING_OPTS="miimon=100 mode=6"
#BONDING_OPTS="xmit_hash_policy=layer2+3 mode=4 miimon=100"
MTU=1600

sacd01 $ cat /etc/sysconfig/network-scripts/ifcfg-bond1.1039 
DEVICE=bond1.1039
ONBOOT=yes
BOOTPROTO=static
IPADDR=172.29.39.125
NETMASK=255.255.255.0
VLAN=yes
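
The manual checks above (bond mode and member NICs) can be wrapped in a small function so that the same test runs on every node. This is only a sketch: check_bond is a hypothetical name, and the optional fourth argument exists solely so the function can be pointed at a fake directory tree instead of the real /sys/class/net.

```shell
#!/bin/sh
# check_bond BOND EXPECTED_MODE EXPECTED_SLAVES [SYSFS_ROOT]
# Return 0 when the bond reports the expected mode name and slave count.
# Hypothetical helper; SYSFS_ROOT defaults to /sys/class/net.
check_bond() {
    bond=$1
    want_mode=$2
    want_slaves=$3
    root=${4:-/sys/class/net}
    dir=$root/$bond/bonding

    [ -r "$dir/mode" ] || { echo "$bond: not a bonding device" >&2; return 1; }

    # The mode file holds "<name> <number>", for example "balance-alb 6".
    read -r mode_name mode_num < "$dir/mode" || :
    # The slaves file holds a space-separated list of member interfaces.
    slave_count=$(wc -w < "$dir/slaves" | tr -d ' ')

    echo "$bond: mode=$mode_name($mode_num) slaves=$slave_count"
    [ "$mode_name" = "$want_mode" ] && [ "$slave_count" -eq "$want_slaves" ]
}

# Example: check_bond bond1 balance-alb 2
```

On sacd01 the call in the example comment should succeed, since bond1 reports balance-alb and has two member ports.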

Storage configuration

Disk information

sacd01 $ lsblk
NAME            MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
sda               8:0    0 222.6G  0 disk 
├─sda1            8:1    0   200M  0 part /boot
└─sda2            8:2    0 222.4G  0 part 
  ├─centos-root 253:0    0 122.4G  0 lvm  /
  └─centos-home 253:1    0   100G  0 lvm  /home
sdb               8:16   0   7.3T  0 disk

sacd01 $ lsblk -d | awk '$1!~"sda" && ($1~"^sd" || $1~"^nv") {print $1}' | xargs -I % sh -c 'echo -e "/dev/%\t"; smartctl -i /dev/% | grep -e "Product" -e "Device Model" -e "Model Number" -e "Rotation Rate" -e "Capacity" | awk -F\: "{print $2}"' | awk -F: 'NF==1{printf "\n"$1} NF==2{sub(/^[ ]+/,"",$2);printf $2"\t"} END {printf "\n"}'

/dev/sdb	Logical Volume	8,000,999,784,448 bytes [8.00 TB]

sacd01 $ smartctl -i /dev/sdb
smartctl 6.5 2016-05-07 r4318 [x86_64-linux-3.10.0-693.el7.x86_64] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Vendor:               LSI
Product:              Logical Volume
Revision:             3000
Compliance:           SPC-4
User Capacity:        8,000,999,784,448 bytes [8.00 TB]
Logical block size:   512 bytes
Physical block size:  4096 bytes
Logical Unit id:      0x600508e000000000fadfaab948205506
Device type:          disk
Local Time is:        Wed May  8 09:56:05 2019 CST
SMART support is:     Unavailable - device lacks SMART capability.

Disk partitioning

  • Create two 2 TiB partitions:
$ sgdisk /dev/sdb --zap-all --clear --mbrtogpt

$ sgdisk /dev/sdb -n 1:0:+2T -c 1:"beegfs meta" -t 1:8300 -n 2:0:+2T -c 2:"beegfs data" -t 2:8300
Setting name!
partNum is 0
REALLY setting name!
Setting name!
partNum is 1
REALLY setting name!
The operation has completed successfully.

$ sgdisk /dev/sdb -p
Disk /dev/sdb: 15626952704 sectors, 7.3 TiB
Logical sector size: 512 bytes
Disk identifier (GUID): A3A34A32-C0FE-4133-BDDB-05A28E2F5455
Partition table holds up to 128 entries
First usable sector is 34, last usable sector is 15626952670
Partitions will be aligned on 2048-sector boundaries
Total free space is 7037018045 sectors (3.3 TiB)

Number  Start (sector)    End (sector)  Size       Code  Name
   1            2048      4294969343   2.0 TiB     8300  beegfs meta
   2      4294969344      8589936639   2.0 TiB     8300  beegfs data

$ lsblk
NAME            MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
sda               8:0    0 222.6G  0 disk 
├─sda1            8:1    0   200M  0 part /boot
└─sda2            8:2    0 222.4G  0 part 
  ├─centos-root 253:0    0 122.4G  0 lvm  /
  └─centos-home 253:1    0   100G  0 lvm  /home
sdb               8:16   0   7.3T  0 disk 
├─sdb1            8:17   0     2T  0 part 
└─sdb2            8:18   0     2T  0 part
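
The sector numbers printed by sgdisk -p can be checked by hand. The sketch below redoes the arithmetic, assuming only the values visible in the output above (512-byte logical sectors, first partition at sector 2048, usable range 34 to 15626952670) and reading "+2T" as 2 TiB.

```shell
#!/bin/sh
# Reproduce the partition boundaries reported by `sgdisk -p`.
sector_size=512                                  # logical sector size in bytes
first_usable=34                                  # from the sgdisk output
last_usable=15626952670                          # from the sgdisk output
first_start=2048                                 # first 2048-sector boundary
part_bytes=$((2 * 1024 * 1024 * 1024 * 1024))    # "+2T" means 2 TiB
part_sectors=$((part_bytes / sector_size))

p1_start=$first_start
p1_end=$((p1_start + part_sectors - 1))
p2_start=$((p1_end + 1))                         # already 2048-aligned
p2_end=$((p2_start + part_sectors - 1))

# Free space: the gap before partition 1 plus everything after partition 2.
free_sectors=$(( (first_start - first_usable) + (last_usable - p2_end) ))

echo "sectors per partition: $part_sectors"
echo "partition 1: $p1_start - $p1_end"
echo "partition 2: $p2_start - $p2_end"
echo "free sectors: $free_sectors"
```

The results line up with the table above: partition 1 ends at 4294969343, partition 2 spans 4294969344 to 8589936639, and 7037018045 sectors (about 3.3 TiB) remain unallocated.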

Disk formatting

  • The metadata partition uses the ext4 file system:
$ mkfs.ext4 -i 2048 -I 512 -J size=400 -Odir_index,filetype /dev/sdb1
mke2fs 1.42.9 (28-Dec-2013)
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
Stride=0 blocks, Stripe width=0 blocks
1073741824 inodes, 536870912 blocks
26843545 blocks (5.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=1610612736
32768 block groups
16384 blocks per group, 16384 fragments per group
32768 inodes per group
Superblock backups stored on blocks: 
	16384, 49152, 81920, 114688, 147456, 409600, 442368, 802816, 1327104, 
	2048000, 3981312, 5619712, 10240000, 11943936, 35831808, 39337984, 
	51200000, 107495424, 256000000, 275365888, 322486272

Allocating group tables: done                            
Writing inode tables: done                            
Creating journal (102400 blocks): done
Writing superblocks and filesystem accounting information: done       
  • The storage partition uses the XFS file system:
$ mkfs.xfs /dev/sdb2
meta-data=/dev/sdb2              isize=512    agcount=4, agsize=134217728 blks
         =                       sectsz=4096  attr=2, projid32bit=1
         =                       crc=1        finobt=0, sparse=0
data     =                       bsize=4096   blocks=536870912, imaxpct=5
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0 ftype=1
log      =internal log           bsize=4096   blocks=262144, version=2
         =                       sectsz=4096  sunit=1 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
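
The mkfs.ext4 options used for the metadata partition trade capacity for inodes: -i 2048 allocates one inode per 2048 bytes, -I 512 makes each inode 512 bytes, and -J size=400 requests a 400 MiB journal. The sketch below works out what that means for the 2 TiB partition, using only numbers that appear in the command and its output.

```shell
#!/bin/sh
# Arithmetic behind the mkfs.ext4 output for /dev/sdb1.
part_bytes=$((2 * 1024 * 1024 * 1024 * 1024))   # 2 TiB metadata partition
bytes_per_inode=2048                            # mkfs.ext4 -i 2048
inode_size=512                                  # mkfs.ext4 -I 512
journal_mib=400                                 # mkfs.ext4 -J size=400
block_size=4096                                 # "Block size=4096" in the output

inode_count=$((part_bytes / bytes_per_inode))
inode_table_gib=$((inode_count * inode_size / 1024 / 1024 / 1024))
journal_blocks=$((journal_mib * 1024 * 1024 / block_size))
total_blocks=$((part_bytes / block_size))

echo "inodes:          $inode_count"
echo "inode tables:    $inode_table_gib GiB"
echo "journal blocks:  $journal_blocks"
echo "total blocks:    $total_blocks"
```

The inode, journal, and block counts match the mkfs.ext4 output (1073741824 inodes, 102400 journal blocks, 536870912 blocks). The inode tables alone take 512 GiB, a quarter of the partition, which is worth keeping in mind when reading df output for an ext4 metadata target.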

Mounting the disks

$ mkdir -pv /beegfs_meta /beegfs_data

$ mount /dev/sdb1 /beegfs_meta
$ mount /dev/sdb2 /beegfs_data

$ echo "/dev/sdb1 /beegfs_meta   ext4    defaults    0   0" | tee -a /etc/fstab
$ echo "/dev/sdb2 /beegfs_data   xfs     defaults    0   0" | tee -a /etc/fstab

$ df -h | grep beegfs
/dev/sdb1                2.0T   33M  2.0T   1% /beegfs_meta
/dev/sdb2                2.0T   33M  2.0T   1% /beegfs_data

$  mount | grep beegfs
/dev/sdb1 on /beegfs_meta type xfs (rw,relatime,attr2,inode64,noquota)
/dev/sdb2 on /beegfs_data type xfs (rw,relatime,attr2,inode64,noquota)
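
The file system type in /etc/fstab has to match what mkfs created, so the entry for /dev/sdb1 needs ext4 (it was formatted with mkfs.ext4 above) and the entry for /dev/sdb2 needs xfs. Appending with echo also adds a duplicate line every time the step is repeated. A minimal sketch of a safer helper follows; add_fstab_entry and its optional file argument are hypothetical names introduced for this example.

```shell
#!/bin/sh
# add_fstab_entry DEVICE MOUNTPOINT FSTYPE [FILE]
# Append an fstab line unless MOUNTPOINT is already listed in FILE
# (default /etc/fstab). Hypothetical helper for illustration.
add_fstab_entry() {
    dev=$1
    mnt=$2
    fstype=$3
    file=${4:-/etc/fstab}
    # Skip when a non-comment line already uses this mount point.
    if awk -v mnt="$mnt" '
            $1 !~ /^#/ && $2 == mnt { found = 1 }
            END { exit !found }' "$file" 2>/dev/null; then
        echo "$mnt already present in $file" >&2
        return 0
    fi
    printf '%s\t%s\t%s\tdefaults\t0\t0\n' "$dev" "$mnt" "$fstype" >> "$file"
}

# Example (run as root):
#   add_fstab_entry /dev/sdb1 /beegfs_meta ext4
#   add_fstab_entry /dev/sdb2 /beegfs_data xfs
```

If the type of a partition is in doubt, blkid -s TYPE -o value /dev/sdb1 prints what is actually on the device.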

CPU configuration

$ lscpu
Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                72
On-line CPU(s) list:   0-71
Thread(s) per core:    2
Core(s) per socket:    18
Socket(s):             2
NUMA node(s):          2
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 85
Model name:            Intel(R) Xeon(R) Gold 6140 CPU @ 2.30GHz
Stepping:              4
CPU MHz:               1499.941
CPU max MHz:           3700.0000
CPU min MHz:           1000.0000
BogoMIPS:              4600.00
Virtualization:        VT-x
L1d cache:             32K
L1i cache:             32K
L2 cache:              1024K
L3 cache:              25344K
NUMA node0 CPU(s):     0-17,36-53
NUMA node1 CPU(s):     18-35,54-71
Flags:                 fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 ds_cpl vmx smx est tm2 ssse3 fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch epb cat_l3 cdp_l3 intel_pt tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm cqm mpx rdt_a avx512f avx512dq rdseed adx smap clflushopt clwb avx512cd avx512bw avx512vl xsaveopt xsavec xgetbv1 cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local dtherm ida arat pln pts hwp hwp_act_window hwp_epp hwp_pkg_req
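
The CPU(s) count in lscpu is the product of sockets, cores per socket, and threads per core, here 2 x 18 x 2 = 72. The sketch below recomputes it from the lscpu text; logical_cpus is a hypothetical function name and it assumes the English field labels shown above.

```shell
#!/bin/sh
# logical_cpus: read `lscpu` text on stdin and print
# sockets x cores-per-socket x threads-per-core.
logical_cpus() {
    awk -F: '
        /^Thread\(s\) per core/ { t = $2 + 0 }
        /^Core\(s\) per socket/ { c = $2 + 0 }
        /^Socket\(s\)/          { s = $2 + 0 }
        END { print s * c * t }'
}

# Example: lscpu | logical_cpus
```

Comparing the result against nproc is a quick way to spot nodes where cores have been taken offline or Hyper-Threading is configured differently.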

Memory configuration

$ dmidecode -t memory | grep "Size: [0-9]* " -B7 -A12 | grep -e "^Handle" -e "Size:" -e "Type:" -e "Speed:" -e "Manufacturer:" -e "Part Number"
Handle 0x0038, DMI type 17, 40 bytes
	Size: 32 GB
	Type: DDR4
	Speed: 2666 MT/s
	Manufacturer: Hynix
	Part Number: HMA84GR7AFR4N-VK    
Handle 0x003B, DMI type 17, 40 bytes
	Size: 32 GB
	Type: DDR4
	Speed: 2666 MT/s
	Manufacturer: Hynix
	Part Number: HMA84GR7AFR4N-VK    
Handle 0x003E, DMI type 17, 40 bytes
	Size: 32 GB
	Type: DDR4
	Speed: 2666 MT/s
	Manufacturer: Hynix
	Part Number: HMA84GR7AFR4N-VK    
Handle 0x0043, DMI type 17, 40 bytes
	Size: 32 GB
	Type: DDR4
	Speed: 2666 MT/s
	Manufacturer: Hynix
	Part Number: HMA84GR7AFR4N-VK    
Handle 0x0046, DMI type 17, 40 bytes
	Size: 32 GB
	Type: DDR4
	Speed: 2666 MT/s
	Manufacturer: Hynix
	Part Number: HMA84GR7AFR4N-VK    
Handle 0x0049, DMI type 17, 40 bytes
	Size: 32 GB
	Type: DDR4
	Speed: 2666 MT/s
	Manufacturer: Hynix
	Part Number: HMA84GR7AFR4N-VK    
Handle 0x004E, DMI type 17, 40 bytes
	Size: 32 GB
	Type: DDR4
	Speed: 2666 MT/s
	Manufacturer: Hynix
	Part Number: HMA84GR7AFR4N-VK    
Handle 0x0051, DMI type 17, 40 bytes
	Size: 32 GB
	Type: DDR4
	Speed: 2666 MT/s
	Manufacturer: Hynix
	Part Number: HMA84GR7AFR4N-VK    
Handle 0x0054, DMI type 17, 40 bytes
	Size: 32 GB
	Type: DDR4
	Speed: 2666 MT/s
	Manufacturer: Hynix
	Part Number: HMA84GR7AFR4N-VK    
Handle 0x0059, DMI type 17, 40 bytes
	Size: 32 GB
	Type: DDR4
	Speed: 2666 MT/s
	Manufacturer: Hynix
	Part Number: HMA84GR7AFR4N-VK    
Handle 0x005C, DMI type 17, 40 bytes
	Size: 32 GB
	Type: DDR4
	Speed: 2666 MT/s
	Manufacturer: Hynix
	Part Number: HMA84GR7AFR4N-VK    
Handle 0x005F, DMI type 17, 40 bytes
	Size: 32 GB
	Type: DDR4
	Speed: 2666 MT/s
	Manufacturer: Hynix
	Part Number: HMA84GR7AFR4N-VK

$ dmidecode -t memory | grep "Size: [0-9]* " | wc -l
12

$ free -h
              total        used        free      shared  buff/cache   available
Mem:           376G        6.8G        368G         10M        983M        368G
Swap:            0B          0B
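
Twelve 32 GB DIMMs add up to 384 GB installed, while free reports 376G in total; the difference is presumably memory that the firmware and kernel reserve before it reaches the allocator. The sketch below sums the populated slots directly from the dmidecode text. total_dimm_gb is a hypothetical name, and the function assumes sizes are reported in GB as in the output above (some dmidecode versions print MB instead).

```shell
#!/bin/sh
# total_dimm_gb: read `dmidecode -t memory` text on stdin and report the
# number of populated DIMMs and their combined size. Empty slots
# ("Size: No Module Installed") are ignored because field 3 is not "GB".
total_dimm_gb() {
    awk '$1 == "Size:" && $3 == "GB" { n++; total += $2 }
         END { printf "%d DIMMs, %d GB\n", n, total }'
}

# Example (run as root): dmidecode -t memory | total_dimm_gb
```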