Wednesday, February 7, 2018

Cluster 5. Nodes naming, Configuring Interfaces, Linux bonds and bridge or Open vSwitch.

Node naming convention:

  1. a four-letter code of the cluster owner's name (e.g. AIST Group becomes agrp)
  2. plus "c" + the cluster number (c01 - the first cluster in a company)
  3. plus "n" + 01 or 02 (the node number within the cluster)
  4. so the names will be: agrp-c01n01 & agrp-c01n02
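The convention above can be sketched as a small shell loop (a toy illustration, not part of the original setup; variable names are mine):

```shell
# Build node names from owner code, cluster number and node number;
# printf's %02d zero-pads the numbers to two digits
owner=agrp
cluster=1
for node in 1 2; do
  printf '%s-c%02dn%02d\n' "$owner" "$cluster" "$node"
done
# prints agrp-c01n01 and agrp-c01n02
```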
Change host-names on both nodes:
  • hostnamectl set-hostname agrp-c01n01 --static
  • hostnamectl status
  • logout
  • login
  • verify that hostname is displayed properly both on login screen and on CLI prompt:
    • agrp-c01n01 login:
    • [root@agrp-c01n01 ~]#
All commands below must be executed on both nodes (with the proper IP addresses; only the agrp-c01n01 commands are shown here).

A bond is the same as a LAG (Link Aggregation Group) - essentially RAID 1 for network interfaces: if one link goes down, the other keeps working.

Linux bonding and bridging 

Node names and IP addresses:

Node        | IP & BCN dev            | IP & SN dev            | IP & IFN dev
agrp-c01n01 | 10.10.53.1 on bcn_bond1 | 10.10.52.1 on sn_bond1 | 172.16.51.1 on ifn_bridge1 (ifn_bond1 slaved)
agrp-c01n02 | 10.10.53.2 on bcn_bond1 | 10.10.52.2 on sn_bond1 | 172.16.51.2 on ifn_bridge1 (ifn_bond1 slaved)

In other articles, the table looked like this:

Subnet | VID | NIC 1 | Link 1    | NIC 2 | Link 2        | Bond      | Net IP
BCN    | 100 | eno1  | bcn_link1 | eno4  | back_link.100 | bcn_bond1 | 10.10.53.0/24
SN     | 200 | eno2  | sn_link1  | eno4  | back_link.200 | sn_bond1  | 10.10.52.0/24
IFN    | 51  | eno3  | ifn_link1 | eno4  | back_link.51  | ifn_bond1 | 172.16.51.0/24

That was for simplicity. We will be using VLANs on all physical interfaces, so the actual table is:

Subnet | VID | NIC 1 | Link 1        | NIC 2 | Link 2        | Bond      | Net IP
BCN    | 100 | eno1  | bcn_link1.100 | eno4  | back_link.100 | bcn_bond1 | 10.10.53.0/24
SN     | 200 | eno2  | sn_link1.200  | eno4  | back_link.200 | sn_bond1  | 10.10.52.0/24
IFN    | 51  | eno3  | ifn_link1.51  | eno4  | back_link.51  | ifn_bond1 | 172.16.51.0/24

ifn_bridge1 will be used as a virtual switch for our servers (VMs): it will give the VMs access to VLAN 51 (our IFN). ifn_bond1 is slaved to ifn_bridge1 and connects it to the real world.

BCN setup

Set up back_link as a member of VLAN 100 (BCN):
back_link (only the lines below must be in its config file):

DEVICE=back_link
NAME=back_link
BOOTPROTO=none
ONBOOT=yes
HWADDR=proper_MAC_here

back_link.100

DEVICE=back_link.100
NAME=back_link.100
BOOTPROTO=none
ONBOOT=yes
VLAN=yes
SLAVE=yes
MASTER=bcn_bond1

Set up bcn_link1 and its VLAN subinterface:
bcn_link1

DEVICE=bcn_link1
NAME=bcn_link1
BOOTPROTO=none
ONBOOT=yes
HWADDR=proper_MAC_here


bcn_link1.100

DEVICE=bcn_link1.100
NAME=bcn_link1.100
BOOTPROTO=none
ONBOOT=yes
VLAN=yes
SLAVE=yes
MASTER=bcn_bond1

Bonding options:
  1. mode=1 => Active/Passive
  2. miimon=100 => test interfaces every 100 ms (MII - Media Independent Interface - means the media type can be anything: fiber, copper, etc.; "mon" stands for monitoring)
  3. downdelay=0 => when a link goes down, immediately switch to the other interface in the bond
  4. updelay=120000 => switch back to the primary interface after 2 minutes
  5. use_carrier=1 => check the link state
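As a sanity check on these timers (per the kernel bonding documentation, updelay should be a multiple of miimon, otherwise it is rounded down), a bit of pure shell arithmetic confirms the values used below:

```shell
# Sanity-check the bonding timers (pure arithmetic, safe to run anywhere)
miimon=100       # link check interval, ms
updelay=120000   # delay before re-activating a recovered link, ms
echo $(( updelay % miimon ))   # 0 => updelay is a clean multiple of miimon
echo $(( updelay / 60000 ))    # 2 => i.e. the advertised 2 minutes
```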
Set up bcn_bond1:

vi /etc/sysconfig/network-scripts/ifcfg-bcn_bond1
DEVICE="bcn_bond1"
BOOTPROTO="none"
ONBOOT="yes"
BONDING_OPTS="mode=1 miimon=100 use_carrier=1 updelay=120000 downdelay=0 primary=bcn_link1.100"
IPADDR=10.10.53.1
NETMASK=255.255.255.0

systemctl restart network.service

After setting up agrp-c01n02, verify that the nodes can ping each other:

agrp-c01n01# ping 10.10.53.2
agrp-c01n02# ping 10.10.53.1

SN setup

Set up back_link as a member of VLAN 200 (SN):

back_link.200

DEVICE=back_link.200
NAME=back_link.200
BOOTPROTO=none
ONBOOT=yes
VLAN=yes
SLAVE=yes
MASTER=sn_bond1

Set up sn_link1 and its VLAN subinterface:
sn_link1

DEVICE=sn_link1
NAME=sn_link1
BOOTPROTO=none
ONBOOT=yes
HWADDR=proper_MAC_here


sn_link1.200

DEVICE=sn_link1.200
NAME=sn_link1.200
BOOTPROTO=none
ONBOOT=yes
VLAN=yes
SLAVE=yes
MASTER=sn_bond1

Set up sn_bond1:

vi /etc/sysconfig/network-scripts/ifcfg-sn_bond1
DEVICE="sn_bond1"
BOOTPROTO="none"
ONBOOT="yes"
BONDING_OPTS="mode=1 miimon=100 use_carrier=1 updelay=120000 downdelay=0 primary=sn_link1.200"
IPADDR=10.10.52.1
NETMASK=255.255.255.0

systemctl restart network.service

After setting up agrp-c01n02, verify that the nodes can ping each other:

agrp-c01n01# ping 10.10.52.2
agrp-c01n02# ping 10.10.52.1

IFN setup

Set up back_link as a member of VLAN 51 (IFN):

back_link.51

DEVICE=back_link.51
NAME=back_link.51
BOOTPROTO=none
ONBOOT=yes
VLAN=yes
SLAVE=yes
MASTER=ifn_bond1

Set up ifn_link1 and its VLAN subinterface:
ifn_link1

DEVICE=ifn_link1
NAME=ifn_link1
BOOTPROTO=none
ONBOOT=yes
HWADDR=proper_MAC_here


ifn_link1.51

DEVICE=ifn_link1.51
NAME=ifn_link1.51
BOOTPROTO=none
ONBOOT=yes
VLAN=yes
SLAVE=yes
MASTER=ifn_bond1

Set up ifn_bond1:

vi /etc/sysconfig/network-scripts/ifcfg-ifn_bond1
DEVICE="ifn_bond1"
BOOTPROTO="none"
ONBOOT="yes"
BONDING_OPTS="mode=1 miimon=100 use_carrier=1 updelay=120000 downdelay=0 primary=ifn_link1.51"
BRIDGE=ifn_bridge1

Set up ifn_bridge1:
DEFROUTE=yes allows this interface to be used as the node's default route to the outside world

vi /etc/sysconfig/network-scripts/ifcfg-ifn_bridge1
DEVICE=ifn_bridge1
TYPE=Bridge
BOOTPROTO=none
IPADDR=172.16.51.1
NETMASK=255.255.255.0
GATEWAY=172.16.51.254
DNS1=8.8.8.8
DNS2=8.8.4.4
DEFROUTE=yes

systemctl restart network.service

Ping the default gateway:
ping 172.16.51.254

After setting up agrp-c01n02, verify that the nodes can ping each other:

agrp-c01n01# ping 172.16.51.2
agrp-c01n02# ping 172.16.51.1

Verifying

On both nodes, verify masters, slaves, and interface states (UP/DOWN):
ip link | grep ifn
ip link | grep sn
ip link | grep bcn
ip link | grep back

Verify bonds (settings, slave status, failures count):
cat /proc/net/bonding/ifn_bond1
cat /proc/net/bonding/sn_bond1
cat /proc/net/bonding/bcn_bond1

Verify the bridge (ifn_bridge1 must be listed, and "STP enabled" must be "no"):
brctl show

PS: if you encounter a MAC flapping error on a Cisco stack, like:
%SW_MATM-4-MACFLAP_NOTIF: Host aaaa.bbbb.cccc in vlan 51 is flapping between port Po1 and port Gi2/0/4
then add the MACADDR parameter to all bond interfaces. This MACADDR must equal the MAC of a NIC other than back_link, because back_link is in 3 VLANs, which can cause the bonds to choose the back_link NIC's MAC for all bond and VLAN interfaces (by default, a bond uses the first added slave's MAC as its own).
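As an illustration, the extra line in each bond's ifcfg file might look like this (the MAC below is a placeholder - substitute the real hardware address of the bcn_link1/sn_link1/ifn_link1 NIC as appropriate):

```
# appended to e.g. /etc/sysconfig/network-scripts/ifcfg-bcn_bond1
# (aa:bb:cc:dd:ee:01 is a placeholder, not a real address)
MACADDR=aa:bb:cc:dd:ee:01
```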

Open vSwitch

We will bond all 4 interfaces (eno1 through eno4) into the OvS bond ovs_bond, then create OvS internal ports and assign them IPs:

Subnet | VID | OvS internal port | Net IP
BCN    | 100 | bcn-bond1         | 10.10.53.0/24
SN     | 200 | sn-bond1          | 10.10.52.0/24
IFN    | 51  | ifn-bond1         | 172.16.51.0/24

Node names and IP addresses:

Node        | IP & BCN dev            | IP & SN dev            | IP & IFN dev
agrp-c01n01 | 10.10.53.1 on bcn-bond1 | 10.10.52.1 on sn-bond1 | 172.16.51.1 on ifn-bond1
agrp-c01n02 | 10.10.53.2 on bcn-bond1 | 10.10.52.2 on sn-bond1 | 172.16.51.2 on ifn-bond1

The commands below must be executed on both nodes (with parameters appropriate to each node).


Create OvS bridge and bonds

Create OvS switch:
ovs-vsctl add-br ovs_kvm_bridge

Disable STP on this bridge:
ovs-vsctl set bridge ovs_kvm_bridge stp_enable=false

Add the bond to ovs_kvm_bridge:
ovs-vsctl add-bond ovs_kvm_bridge ovs_bond eno1 eno2 eno3 eno4 trunks=100,200,51
In the future, if you need to add new VLANs to the trunk, execute the command below with the proper VLAN list, for example:
ovs-vsctl set port ovs_bond trunks=100,200,300,400

Enable the LACP LAG protocol (note: no space is allowed around the ":" in the "other_config:..." parts):
ovs-vsctl set port ovs_bond lacp=active bond_mode=balance-slb bond-updelay=120000 bond-downdelay=0 other_config:lacp-time=fast other_config:lacp-fallback-ab=true

To view bond interface configuration:
ovs-vsctl list Port ovs_bond

If you made a mistake while configuring (e.g. wrote "lacp_time" instead of "lacp-time"):
ovs-vsctl remove port ovs_bond other_config lacp_time fast

lacp-time - either slow or fast - defines whether LACP packets are sent every 1 second (fast) or every 30 seconds (slow).
lacp-fallback-ab - if LACP fails, active-backup bonding will be used.
balance-slb - source load balancing (this is the default on Cisco LACP bonds - "sh run all | incl load-balance" will show src-mac):

  1. The source MAC address is extracted, and a hashing algorithm is used to map it to a hash number 0-255. 
  2. Each hash is assigned to one of the NICs on the bond, which means packets with the same hash are always sent through the same NIC. 
  3. If a new hash is found, it is assigned to the NIC that currently has the lowest utilization. 
  4. In practice, this means that when virtual machines (VMs) are set up on a bond, packets from one VM (with the same source MAC) will always be sent through the same NIC.
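The hashing idea in steps 1-3 can be illustrated with a toy bucket computation (OvS's real hash function differs; md5 here is only for demonstration):

```shell
# Toy illustration of SLB hashing: map a source MAC to one of 256 buckets.
# The same MAC always lands in the same bucket (and hence the same NIC).
mac="52:54:00:12:34:56"
# take the first two hex chars of md5(mac) as a number in 0..255
bucket=$(( 0x$(printf '%s' "$mac" | md5sum | cut -c1-2) ))
echo "$bucket"   # deterministic value in 0..255
```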


To remove ports and bridge (if something went wrong):
ovs-vsctl del-port ovs_kvm_bridge ovs_bond
ovs-vsctl del-br ovs_kvm_bridge

To view bond configuration:
ovs-appctl bond/show ovs_bond # lacp_status must be negotiated and all slaves eno{1..4} enabled; bond_mode shows active-backup only if LACP fell back
ovs-appctl lacp/show ovs_bond | head -n 7 # status: active negotiated / lacp_time: fast  


To view MAC address table:
ovs-appctl fdb/show ovs_kvm_bridge

The command below can be used to verify the overall OvS bridge configuration (including STP status); the -S option makes the output scrollable with the keyboard's left/right arrow keys:
ovsdb-client dump | less -S

Create OvS internal ports for the node and assign them IPs:

Set up the IFN port ifn-bond1, make it internal, and assign it VLAN ID 51:
ovs-vsctl add-port ovs_kvm_bridge ifn-bond1 -- set interface ifn-bond1 type=internal -- set port ifn-bond1 tag=51

Assign an IP to ifn-bond1:

vi /etc/sysconfig/network-scripts/ifcfg-ifn-bond1
DEVICE=ifn-bond1
NAME=ifn-bond1
ONBOOT=yes
BOOTPROTO=none
IPADDR=172.16.51.1
NETMASK=255.255.255.0
GATEWAY=172.16.51.254
DNS1=8.8.8.8
DNS2=8.8.4.4
DEFROUTE=yes

Set up the BCN port bcn-bond1, make it internal, and assign it VLAN ID 100:
ovs-vsctl add-port ovs_kvm_bridge bcn-bond1 -- set interface bcn-bond1 type=internal -- set port bcn-bond1 tag=100

vi /etc/sysconfig/network-scripts/ifcfg-bcn-bond1
DEVICE="bcn-bond1"
BOOTPROTO="none"
ONBOOT="yes"
IPADDR=10.10.53.1
NETMASK=255.255.255.0

Set up the SN port sn-bond1, make it internal, and assign it VLAN ID 200:
ovs-vsctl add-port ovs_kvm_bridge sn-bond1
ovs-vsctl set interface sn-bond1 type=internal
ovs-vsctl set port sn-bond1 tag=200

vi /etc/sysconfig/network-scripts/ifcfg-sn-bond1
DEVICE="sn-bond1"
BOOTPROTO="none"
ONBOOT="yes"
IPADDR=10.10.52.1
NETMASK=255.255.255.0

systemctl restart network.service

To list all ports that Open vSwitch sees:
ovs-vsctl list-ports ovs_kvm_bridge # will show:
bcn-bond1
ifn-bond1
ovs_bond
sn-bond1

To listen to a port's traffic:
yum install tcpdump
tcpdump -i port_name # port name is one of the ports seen by OvS

Verifying

IFN test:
From agrp-c01n01 ping 172.16.51.2
From agrp-c01n01 ping 172.16.51.254
From agrp-c01n02 ping 172.16.51.1
From agrp-c01n02 ping 172.16.51.254
BCN test:
From agrp-c01n01 ping 10.10.53.2
From agrp-c01n02 ping 10.10.53.1
SN test:
From agrp-c01n01 ping 10.10.52.2
From agrp-c01n02 ping 10.10.52.1

After-setup steps

Back up the configs once the setup is done (whether you used Linux bonding and bridging or Open vSwitch):
rsync -av /etc/sysconfig/network-scripts /root/backups

These tutorials were used to understand and set up clustering:
AN!Cluster
brezular.com
citrix.com
