Implement bonding in RHEL 5

Bonding is the process of combining two or more NICs on a system into a single logical device. For example, if you have two network cards on a machine, eth0 and eth1, you can combine them into a bond device, bond0, and then configure an IP address on that bond device.

Why do we have to do that, you may ask. In the case I mentioned, if I configure bond0 as 192.168.1.5, both eth0 and eth1 can send or receive packets meant for the bond device’s IP (192.168.1.5). It’s like having two paths to reach the same destination. Bond devices can be configured in different modes, which provide fault tolerance, greater performance, or both, depending on the mode.

Bonding is covered in greater detail in /usr/share/doc/kernel-doc-<kernel version>/Documentation/networking/bonding.txt.

As usual, all my experiments are on either Xen or VMware guests, and this one is no different. The steps below worked for me on a RHEL 5.3 Xen guest. To start with, eth0 was configured as 192.168.122.118 while eth1 remained unassigned. I am about to create a bond device, bond0, from eth0 and eth1, and assign this IP to it.

1. Add the below lines to /etc/modprobe.conf

alias bond0 bonding
options bond0 mode=1 miimon=100

These lines tell the system to load the bonding kernel module for bond0, along with some options. mode=1 means that I have opted for an active-backup setup, in which only one slave in the bond device is active at any given moment. If the active slave goes down, the other slave becomes active and all traffic then flows through the newly active slave. If this sounds a bit confusing, just read on. The value of miimon specifies how often MII link monitoring occurs, in milliseconds. For a complete list of the available options, feel free to check the kernel documentation.

2. Create the bond0 configuration file, /etc/sysconfig/network-scripts/ifcfg-bond0, with the following content:

DEVICE=bond0
BOOTPROTO=none
ONBOOT=yes
NETWORK=192.168.122.0
NETMASK=255.255.255.0
IPADDR=192.168.122.118
USERCTL=no

The lines are self-explanatory: they define the device name and then specify its IP address, netmask and network.
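The NETWORK value in this file follows from IPADDR and NETMASK, so it can be cross-checked rather than typed by hand. Below is a minimal sketch using Python's standard ipaddress module; the variable names are mine, and the values are copied from the file above.

```python
import ipaddress

# Values taken from the ifcfg-bond0 file above.
ipaddr = "192.168.122.118"
netmask = "255.255.255.0"

# ip_interface accepts "address/netmask" and derives the enclosing network.
iface = ipaddress.ip_interface(f"{ipaddr}/{netmask}")

network = str(iface.network.network_address)
broadcast = str(iface.network.broadcast_address)

print("NETWORK   =", network)    # 192.168.122.0
print("BROADCAST =", broadcast)  # 192.168.122.255
```

The computed network address should match the NETWORK line in ifcfg-bond0.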

3. Create /etc/sysconfig/network-scripts/ifcfg-eth0 with content:

DEVICE=eth0
MASTER=bond0
SLAVE=yes
USERCTL=no
BOOTPROTO=none
IPV6INIT=yes
IPV6_AUTOCONF=yes
ONBOOT=yes

4. Create /etc/sysconfig/network-scripts/ifcfg-eth1 with content:

DEVICE=eth1
MASTER=bond0
SLAVE=yes
USERCTL=no
BOOTPROTO=none
IPV6INIT=yes
IPV6_AUTOCONF=yes
ONBOOT=yes

The important lines to note here are “MASTER=bond0” and “SLAVE=yes”, which tell the system that both eth0 and eth1 are now part of the bond0 device.
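If you want to confirm that a given ifcfg file really marks an interface as a slave of a bond, the check comes down to those two keys. Here is a small sketch that assumes the file contains simple KEY=VALUE lines; the function names parse_ifcfg and is_slave_of are hypothetical, and the sample text is trimmed to the keys that matter for bonding.

```python
def parse_ifcfg(text):
    """Parse KEY=VALUE lines from an ifcfg-style file into a dict."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            settings[key.strip()] = value.strip().strip('"')
    return settings


def is_slave_of(settings, bond):
    """True if the settings mark the interface as a slave of `bond`."""
    return (settings.get("MASTER") == bond
            and settings.get("SLAVE", "").lower() == "yes")


# Trimmed sample: only the bonding-related keys from the files above.
sample = """\
DEVICE=eth1
MASTER=bond0
SLAVE=yes
ONBOOT=yes
"""

cfg = parse_ifcfg(sample)
print(cfg["DEVICE"], "slave of bond0:", is_slave_of(cfg, "bond0"))
```

On a real system you would read the text from /etc/sysconfig/network-scripts/ifcfg-eth0 or ifcfg-eth1 instead of the inline sample.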

5. Restart network and you are done!

[root@localhost ~]# service network restart
Shutting down interface eth0:                              [  OK  ]
Shutting down loopback interface:                          [  OK  ]
Bringing up loopback interface:                            [  OK  ]
Bringing up interface bond0:                               [  OK  ]
[root@localhost ~]# ifconfig
bond0     Link encap:Ethernet  HWaddr 00:16:3E:1C:C5:A7
          inet addr:192.168.122.118  Bcast:192.168.122.255  Mask:255.255.255.0
          inet6 addr: fe80::216:3eff:fe1c:c5a7/64 Scope:Link
          UP BROADCAST RUNNING MASTER MULTICAST  MTU:1500  Metric:1
          RX packets:80 errors:0 dropped:0 overruns:0 frame:0
          TX packets:54 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:11210 (10.9 KiB)  TX bytes:11630 (11.3 KiB)

eth0      Link encap:Ethernet  HWaddr 00:16:3E:1C:C5:A7
          UP BROADCAST RUNNING SLAVE MULTICAST  MTU:1500  Metric:1
          RX packets:45 errors:0 dropped:0 overruns:0 frame:0
          TX packets:63 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:3288 (3.2 KiB)  TX bytes:13120 (12.8 KiB)

eth1      Link encap:Ethernet  HWaddr 00:16:3E:1C:C5:A7
          UP BROADCAST RUNNING SLAVE MULTICAST  MTU:1500  Metric:1
          RX packets:42 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:8384 (8.1 KiB)  TX bytes:0 (0.0 b)

lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:16436  Metric:1
          RX packets:10 errors:0 dropped:0 overruns:0 frame:0
          TX packets:10 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:764 (764.0 b)  TX bytes:764 (764.0 b)

As you can see from the output of ifconfig, device bond0 is listed as MASTER while devices eth0 and eth1 are listed as SLAVE. Also, bond0 and its underlying devices eth0 and eth1 all show the same hardware address (00:16:3E:1C:C5:A7). In case you have multiple bond devices, comparing the hardware address of a bond device with that of a network device (ethX) will tell you whether the device is part of that particular bond or not.
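The hardware-address comparison can be done mechanically when there are many interfaces. The sketch below groups interface names by HWaddr; it assumes the ifconfig output format shown above, and the helper name group_by_hwaddr is my own. It only works where the slaves take the bond's address, as they do in this setup.

```python
import re
from collections import defaultdict

# Matches the first line of an interface stanza, for example:
# "eth0      Link encap:Ethernet  HWaddr 00:16:3E:1C:C5:A7"
STANZA = re.compile(
    r"^(\S+)\s+Link encap:\S+\s+HWaddr\s+([0-9A-Fa-f:]{17})",
    re.MULTILINE,
)


def group_by_hwaddr(ifconfig_text):
    """Map each hardware address to the interfaces that report it."""
    groups = defaultdict(list)
    for name, mac in STANZA.findall(ifconfig_text):
        groups[mac.upper()].append(name)
    return dict(groups)


# Shortened copy of the ifconfig output above.
sample = """\
bond0     Link encap:Ethernet  HWaddr 00:16:3E:1C:C5:A7
          UP BROADCAST RUNNING MASTER MULTICAST  MTU:1500  Metric:1
eth0      Link encap:Ethernet  HWaddr 00:16:3E:1C:C5:A7
          UP BROADCAST RUNNING SLAVE MULTICAST  MTU:1500  Metric:1
eth1      Link encap:Ethernet  HWaddr 00:16:3E:1C:C5:A7
          UP BROADCAST RUNNING SLAVE MULTICAST  MTU:1500  Metric:1
lo        Link encap:Local Loopback
          UP LOOPBACK RUNNING  MTU:16436  Metric:1
"""

groups = group_by_hwaddr(sample)
for mac, names in groups.items():
    print(mac, "->", ", ".join(names))
```

Interfaces that end up in the same group as a bond device are its likely slaves; the loopback interface has no HWaddr and is skipped.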

Now, the current status of the bond device bond0 is present in /proc/net/bonding/bond0. Time to fool around with bonding now… :-)

[root@localhost ~]# cat /proc/net/bonding/bond0
Ethernet Channel Bonding Driver: v3.2.4 (January 28, 2008)

Bonding Mode: fault-tolerance (active-backup)
Primary Slave: None
Currently Active Slave: eth0
MII Status: up
MII Polling Interval (ms): 100
Up Delay (ms): 0
Down Delay (ms): 0

Slave Interface: eth0
MII Status: up
Link Failure Count: 0
Permanent HW addr: 00:16:3e:1c:c5:a7

Slave Interface: eth1
MII Status: up
Link Failure Count: 0
Permanent HW addr: 00:16:3e:58:02:c7
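
The status file is plain "Key: value" text, so it is easy to read from a script, for example in a monitoring check. This is a rough sketch that assumes the layout shown above (bond-wide fields first, then one block per slave); parse_bond_status is a hypothetical helper, not part of any bonding tool.

```python
def parse_bond_status(text):
    """Return (bond_info, slaves) parsed from /proc/net/bonding/<bond> text."""
    bond_info = {}
    slaves = []
    current = bond_info  # fields before the first slave belong to the bond
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank line
        key, value = key.strip(), value.strip()
        if key == "Slave Interface":
            # Start a new per-slave block.
            current = {"name": value}
            slaves.append(current)
        else:
            current[key] = value
    return bond_info, slaves


# Copy of the status output above.
sample = """\
Ethernet Channel Bonding Driver: v3.2.4 (January 28, 2008)

Bonding Mode: fault-tolerance (active-backup)
Primary Slave: None
Currently Active Slave: eth0
MII Status: up
MII Polling Interval (ms): 100
Up Delay (ms): 0
Down Delay (ms): 0

Slave Interface: eth0
MII Status: up
Link Failure Count: 0
Permanent HW addr: 00:16:3e:1c:c5:a7

Slave Interface: eth1
MII Status: up
Link Failure Count: 0
Permanent HW addr: 00:16:3e:58:02:c7
"""

bond_info, slaves = parse_bond_status(sample)
print("active:", bond_info["Currently Active Slave"])
print("slaves:", [s["name"] for s in slaves])
```

On a live system you would pass in the contents of /proc/net/bonding/bond0 instead of the inline sample.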

As the output above shows, the bonding mode is active-backup (since I used mode=1 to configure it in modprobe.conf). Also, both interfaces are up, but the currently active slave is eth0. Now, what happens when I bring eth0 down? Normally, when we bring down an interface, the IP associated with it also goes down (becomes unreachable). With bonding, however, traffic just switches over to the next slave, eth1, keeping the connection and the IP active:

[root@localhost ~]# ifdown eth0
[root@localhost ~]# cat /proc/net/bonding/bond0
Ethernet Channel Bonding Driver: v3.2.4 (January 28, 2008)

Bonding Mode: fault-tolerance (active-backup)
Primary Slave: None
Currently Active Slave: eth1
MII Status: up
MII Polling Interval (ms): 100
Up Delay (ms): 0
Down Delay (ms): 0

Slave Interface: eth1
MII Status: up
Link Failure Count: 0
Permanent HW addr: 00:16:3e:58:02:c7
[root@localhost ~]# ifup eth0
[root@localhost ~]# cat /proc/net/bonding/bond0 
Ethernet Channel Bonding Driver: v3.2.4 (January 28, 2008)

Bonding Mode: fault-tolerance (active-backup)
Primary Slave: None
Currently Active Slave: eth1
MII Status: up
MII Polling Interval (ms): 100
Up Delay (ms): 0
Down Delay (ms): 0

Slave Interface: eth1
MII Status: up
Link Failure Count: 0
Permanent HW addr: 00:16:3e:58:02:c7

Slave Interface: eth0
MII Status: up
Link Failure Count: 0
Permanent HW addr: 00:16:3e:1c:c5:a7

Notice that when I brought eth0 up again (ifup eth0), it was added back to the bond device automatically, although eth1 remained the active slave. Also, in the above output, even though the permanent HW addresses of eth0 and eth1 are different, both report the HW address of the bond device in the ifconfig output:

[root@localhost ~]# ifconfig
bond0     Link encap:Ethernet  HWaddr 00:16:3E:1C:C5:A7 
          inet addr:192.168.122.118  Bcast:192.168.122.255  Mask:255.255.255.0
          inet6 addr: fe80::216:3eff:fe1c:c5a7/64 Scope:Link
          UP BROADCAST RUNNING MASTER MULTICAST  MTU:1500  Metric:1
          RX packets:87 errors:0 dropped:0 overruns:0 frame:0
          TX packets:35 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:6340 (6.1 KiB)  TX bytes:4950 (4.8 KiB)

eth0      Link encap:Ethernet  HWaddr 00:16:3E:1C:C5:A7 
          UP BROADCAST RUNNING SLAVE MULTICAST  MTU:1500  Metric:1
          RX packets:85 errors:0 dropped:0 overruns:0 frame:0
          TX packets:44 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:6360 (6.2 KiB)  TX bytes:6424 (6.2 KiB)

eth1      Link encap:Ethernet  HWaddr 00:16:3E:1C:C5:A7 
          UP BROADCAST RUNNING SLAVE MULTICAST  MTU:1500  Metric:1
          RX packets:11 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:574 (574.0 b)  TX bytes:0 (0.0 b)

lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:16436  Metric:1
          RX packets:10 errors:0 dropped:0 overruns:0 frame:0
          TX packets:10 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:764 (764.0 b)  TX bytes:764 (764.0 b)
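
To recap the failover seen in this walkthrough, here is a toy model of active-backup selection with no primary slave set. It is only an illustration of what the outputs above showed and not the bonding driver's real logic; the class and method names are made up.

```python
class ActiveBackupBond:
    """Toy model of active-backup selection with no primary slave configured."""

    def __init__(self, slaves):
        self.slaves = list(slaves)  # slaves currently in the bond, in join order
        self.active = self.slaves[0] if self.slaves else None

    def remove(self, name):
        """A slave leaves the bond (as with `ifdown eth0` above)."""
        self.slaves.remove(name)
        if self.active == name:
            # Fail over to the next remaining slave, if any.
            self.active = self.slaves[0] if self.slaves else None

    def add(self, name):
        """A slave rejoins; with no primary set, the active slave is unchanged."""
        self.slaves.append(name)
        if self.active is None:
            self.active = name


bond = ActiveBackupBond(["eth0", "eth1"])
bond.remove("eth0")
after_down = bond.active  # eth1 takes over
bond.add("eth0")
after_up = bond.active    # eth1 stays active
print(after_down, after_up, bond.slaves)
```

This mirrors the two status listings: eth1 becomes the active slave after ifdown eth0, and it stays active after ifup eth0, with eth0 listed second.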

That’s it..!! Happy bonding.. :)

 

DNS Terms you need to know

During my early years of system administration, I was pretty much confused by the terms ‘primary nameserver’, ‘secondary nameserver’, ‘master/slave nameserver’ etc. Different websites have different views on these terms. I am sure many system admins (even experienced ones) have got the wrong idea about these terms.

My confusion ended the day I stumbled upon O’Reilly’s DNS and BIND. This book is without doubt the best DNS-related book in the world, and I would recommend it to anyone who wishes to know more about DNS and how it works. So, here we go:

A name server keeps the information needed to translate computer names to IP addresses (and, for reverse translations, IP addresses to names). Each name server takes care of a certain part of the space of all computer names. This part is called a zone (at minimum, a name server takes care of the zone 0.0.127.in-addr.arpa). A zone is made up of a domain or a part of one. With the help of NS records in its configuration, a name server can delegate administration of a subdomain to a subordinate name server. The name server is a program that performs the translation at the request of a resolver or another name server. On UNIX, the name server is implemented by the named program. The name BIND (Berkeley Internet Name Domain) is also used for this name server.
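
To see where a zone name like 0.0.127.in-addr.arpa comes from, it helps to build the reverse-lookup name of an address by hand. The sketch below uses Python's standard ipaddress module; the helper reverse_zone_24 is hypothetical and assumes the reverse zone is cut at a /24 boundary, which is only one possible way to delegate it.

```python
import ipaddress


def reverse_name(ip):
    """Name that is queried for a reverse (PTR) lookup of `ip`."""
    return ipaddress.ip_address(ip).reverse_pointer


def reverse_zone_24(ip):
    """Reverse zone for the /24 containing an IPv4 address (assumed boundary)."""
    # Drop the first label, which is the host octet.
    return reverse_name(ip).split(".", 1)[1]


print(reverse_name("127.0.0.1"))     # 1.0.0.127.in-addr.arpa
print(reverse_zone_24("127.0.0.1"))  # 0.0.127.in-addr.arpa
```

The octets are reversed so that the most general part of the address sits next to in-addr.arpa, just as the most general part of a hostname sits next to the root.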

Types of name servers differ according to the way in which they save data:

> Primary name server/primary master is the main data source for the zone. It is the authoritative server for the zone, and it acquires the data about its zone from databases saved on a local disk. The name of this type of server depends on the version of BIND in use: the term primary name server was used in version 4.x, while the term primary master is used in version 8. The administrator creates the databases for this server manually. The primary server must be published as an authoritative name server for the domain in the SOA resource record, while the primary master server does not need to be published. There is only one server of this type for each zone.

> Master name server is an authoritative server for the zone. The master server is always published as an authoritative server for the domain in NS records. The master server is the source of zone data for the subordinate servers (slave/secondary servers). There can be several master servers. This type of server is used in BIND version 8 and later.

> Secondary name server/slave name server acquires data about the zone by copying it from the primary name server (or from a master server) at regular intervals. It makes no sense to edit these databases on the secondary name servers, even though they are saved on the local disk, because they will be overwritten by the next copy. This type of name server is also an authority for its zones, i.e., its data for the particular zone is considered authoritative. The name of this type of server again depends on the version of BIND in use. In version 4, only the term secondary was used; the term slave server referred to a completely different type of server. In version 8 you can come across both names.

> Caching-only name server is neither a primary nor a secondary name server (it is not an authority) for any zone. However, it shares the general behaviour of name servers, i.e., it saves the data that passes through it in its cache. This data is called non-authoritative. Every name server is a caching server, but by the term caching-only we mean that it is neither a primary nor a secondary name server for any zone. (Of course, even a caching-only server is a primary name server for zone 0.0.127.in-addr.arpa, but that does not count.)

> Root name server is an authoritative name server for the root domain (for the dot). Each root name server is a primary server, which differentiates it from other name servers.

> Slave name server (in BIND version 4 terminology) forwards translation queries to other name servers; it does not perform any iteration itself.

> Stealth name server is a secret server. This type of name server is not published anywhere. It is only known to the servers that have its IP address statically listed in their configuration. It is an authoritative server. It acquires the data for the zone with the help of a zone transfer. It can be the main server for the zone. Stealth servers can be used as a local backup if the local servers are unavailable.