September 5, 2006
DHCP, VoIP & Updated Kernel
Yesterday, I was working on a new DNS server I was setting up for someone when I noticed that I hadn't set up rndc on my home DNS server. After fixing that, I added a couple of zones to my home DNS server so it could serve (for now) as the secondary server for some new domains, but the zone transfers weren't working. It was immediately obvious that I needed to alter the firewall configuration to allow the DNS server to initiate zone transfers, so I ended up looking at a couple of parts of my firewall configuration.
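As a rough sketch of the kind of change involved (not my exact rules), letting a secondary initiate zone transfers might look like the following. It assumes the DNS server runs on the firewall host itself, and 192.0.2.53 is a made-up address standing in for the primary server.

```
# 192.0.2.53 is a hypothetical address for the primary (master) DNS server
# SOA refresh queries from the secondary go out over UDP
-A OUTPUT -p udp -d 192.0.2.53 --dport 53 -m state --state NEW -j ACCEPT
# The zone transfer itself normally runs over TCP
-A OUTPUT -p tcp -d 192.0.2.53 --dport 53 -m state --state NEW -j ACCEPT
# Let replies to those outbound connections back in
-A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT
```

If the DNS server sat behind the firewall instead, the same matches would belong in the FORWARD chain.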
It was an easy change, but in the course of all of this, I made a typo that caused a problem with one of my VLAN connections. When I tried to bring the link down so I could reinitialize it with a fixed configuration, it hung. Running kill -9 didn't work, so I decided to just reboot. It had been several months since my last boot and there was a newer kernel that I wanted to be using anyway, so it wasn't a big deal.
But then, DHCP (which is on the same box) no longer worked. I double-checked that dhcpd was running, did a little sniffing and discovered that the DHCP packets weren't actually getting past the firewall configuration. This is odd, since a Netfilter firewall configuration that is completely locked down (i.e. a policy of DROP on all built-in chains and no rules) has always permitted DHCP traffic to get through. However, I decided to add a couple of rules to the firewall, such as:
-A INPUT -p udp --sport 68 --dport 67 -m state --state NEW -j ACCEPT
With those new DHCP rules loaded up, it started working again. It would appear that the latest kernel for Red Hat Enterprise Linux 4 and CentOS 4 (which that server runs) has made a change in Netfilter so that it now affects DHCP renewals.
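For anyone wanting to do the same, here is a minimal sketch of what such a pair of rules might look like on the DHCP server. The second rule is an assumption on my part about what a locked-down OUTPUT chain would need; adjust to taste.

```
# Requests and renewals from clients (port 68) to the server (port 67)
-A INPUT -p udp --sport 68 --dport 67 -m state --state NEW -j ACCEPT
# Replies from the server (port 67) back to the clients (port 68)
-A OUTPUT -p udp --sport 67 --dport 68 -j ACCEPT
```

The OUTPUT rule only matters if that chain has a DROP policy too, as in the locked-down setup described above.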
Perhaps a little explanation of DHCP is in order here (skip down a little if you already know this part, in order to get to the rest of the story).
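When a device has no Layer 3 configuration, a DHCP client can't use an ordinary UDP socket, so it builds the whole packet itself (the UDP datagram that carries the DHCP request, wrapped in an IP header with a source address of 0.0.0.0 and a broadcast destination) and writes it directly into a Layer 2 frame (usually Ethernet) through a raw socket. This is (obviously, to those of you who know how the network stack works) odd, as you almost never bypass the normal stack this way when forming a packet to send out. The DHCP server handles this traffic with raw sockets as well, which is why it has never cared about the Netfilter rules.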
OK. So, this initial DHCP setup works fine, even under the newer kernel; the DHCP server's firewall was allowing it in. What happened was that my notebook had reached the end of its DHCP lease, but I didn't have any rules to allow DHCP over ordinary UDP on IP, which is what gets used when your DHCP client tries to renew the lease for the IP address it is already using. What I experienced was that my notebook suddenly had no working network configuration, but if I stopped the DHCP client and had it start again from scratch, it worked.
The fix for all of that was to add rules in the firewall that allow DHCP through, much like I allow web browsing, email or World of Warcraft.
"OK, Lamont; I'm following you so far, but what does that have to do with VoIP? Remember, you mentioned that in the title to this article?" "Why, yes; I remember and we're now going to connect that dot."
Back to our regularly scheduled programming.
So, in changing my firewall configuration, I did not add any rules to allow DHCP renewals from the outside interface of my firewall. That one is connected to my DSL router, which has a 4-port switch built in. My Vonage VoIP box is also connected to the DSL router's built-in switch. Previously, it was being configured by my DHCP server, but no more. In the early evening yesterday, I noticed that my phones were getting no dial-tone. As I was working on some other things, I only gave it a cursory look.
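To illustrate how rules end up covering only one side, here is a sketch of DHCP rules scoped to the inside interface. The name eth1 is an assumption for illustration, not my actual interface naming.

```
# eth1 is a hypothetical name for the inside (LAN) interface
-A INPUT -i eth1 -p udp --sport 68 --dport 67 -m state --state NEW -j ACCEPT
-A OUTPUT -o eth1 -p udp --sport 67 --dport 68 -j ACCEPT
# Nothing here matches the outside interface, so DHCP renewals arriving
# there fall through to the DROP policy
```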
This morning, it finally dawned on me what I had done to kill the Vonage box: I had taken away its IP configuration. Another notebook that I plugged into the Vonage box's 4-port switch in order to configure it was also unable to do DNS lookups.
Well, after getting home this evening (Dax and I went to the Utah Asterisk User Group meeting), I spent a few minutes configuring static routes on my firewall, the DSL router and the Vonage device, plus setting up static networking and turning off NAT on the Vonage box, and everything is working again. Perhaps you'll think I have too many subnets here at home (what? you don't use 4 subnets for 2 people with 8 computers? :) hehe), but I like it. Oh, and in case you're wondering, the only SNAT I'm doing is on the DSL router, so I'm really using routed subnets with all these devices with built-in 4-port switches.
To make a long story short (too late!), I learned that the newest RHEL4/C4 kernels have a change to Netfilter that affects DHCP, and I refined the configuration of my networks. Hopefully, you won't end up pulling the trigger while this particular gun is pointed at your foot. :) Good luck!
Posted by lamontp at September 5, 2006 10:46 PM