To Hell and Back: Why I've come to despise Cisco

It's been an eventful week full of homework and some last touches on my Summer of Code application, but mostly this week has been filled with lots of IT work. On Tuesday afternoon, the internet stopped functioning at my workplace. That is bad news, especially since we have around 200 employees, all of whom depend on the internet to get work done. And so we begin troubleshooting.

First we bypass our main switch to see if it is the problem. We are able to bypass it using a couple of 8-port Belkin switches, and it becomes clear that the switch is not at fault. Still, it's pretty interesting to move a network over from a 24-port managed switch to a couple of crappy Belkin ones and have things keep working. We throw everything back on the managed switch and continue troubleshooting.

The next thing is to try to bypass our firewall, our gateway to the internet, and this is where the problems really begin. Unfortunately we don't have any backup routing hardware, but we need something to stand in for our firewall/router. So we manage to wrangle up a Linksys WRT-54G and route the internet traffic for a 200-computer network through that. As it turns out, this works, which means that our firewall has likely had some kind of hardware failure (note: we had not been configuring it in the timeframe of the failure, so no chance of misconfiguration). For about six hours a $60 consumer-grade wireless router was more effective than a $1000 Cisco PIX firewall.

So we've found the problem, but we've managed to create a new one. We hadn't realized our main switch was a managed switch, and that certain computers had teamed links (this is where two NICs are linked into one interface for higher reliability and throughput). This caused our internal network to come to a grinding halt. Thankfully, a simple login to the web interface of the switch was all that was needed to correct this. By this time I have been working for 12 hours and it is 2am, so I head home and get some much-needed rest.

Overnight, my coworkers manage to get the internet working on a different router that is slightly more beefy. Internet access is still a little slow, but it isn't unbearable. However, there is still a major problem: our employees cannot access the computer that runs our website (and that also hosts a database server, which is now inaccessible). This computer contains a large amount of data that people need to get to. Because all of the workstations now go through a separate router, this computer is on a different network from all of the others (by a beautiful miracle, the PIX firewall has another network interface which did not go down, leaving our website up). I investigate for a little bit and find that the computer has an unused network interface. I plug it into our main switch, configure the network interface, and update our DNS information to point to the new IP. It works! Just a couple of changes need to be made to some scripts with hard-coded IPs, and everything is working like a charm.
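
For anyone facing the same hard-coded IP cleanup, here is a minimal sketch of how the search-and-replace could be scripted. It is not what we actually ran. The function name `replace_ip` is hypothetical and the addresses in the usage note are made-up documentation-range examples. The one thing it is careful about is not touching longer addresses that merely start with the old one.

```python
import re

# Dotted-quad pattern. The lookarounds keep it from matching inside a longer
# run of digits and dots (for example, a version string like 1.2.3.4.5).
IPV4_PATTERN = re.compile(
    r"(?<!\d)(?<!\d\.)\d{1,3}(?:\.\d{1,3}){3}(?!\d)(?!\.\d)"
)


def replace_ip(text, old_ip, new_ip):
    """Swap every exact occurrence of old_ip for new_ip.

    Returns a (new_text, count) tuple so the caller can see how many
    substitutions were made before writing anything back to disk.
    """
    count = 0

    def swap(match):
        nonlocal count
        if match.group(0) == old_ip:
            count += 1
            return new_ip
        # Some other address: leave it exactly as it was.
        return match.group(0)

    return IPV4_PATTERN.sub(swap, text), count
```

Calling `replace_ip(script_text, "192.0.2.10", "198.51.100.7")` would rewrite `192.0.2.10` but leave `192.0.2.100` alone, which a plain string replace would not.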

After this, we work towards replacing the PIX firewall. Thankfully, we have an extra server on hand which isn't really doing much, and we decide to press it into service. It's a good thing that pfSense, a FreeBSD-based firewall/routing platform, exists. We install it onto the server and configure it. When we make the switch from the consumer-grade router we have been running all our network traffic through to the new pfSense box, the internet downtime lasts only about 10 seconds. I liked pfSense before; I start loving it now.

Still more trouble: the email server on our network cannot receive incoming SMTP connections because of the move to a new router. I configure the virtual IP on pfSense and all of the NAT and firewall rules, but none of it seems to work. We spend around 2 hours banging our heads against the wall trying to figure out why it isn't working. It turns out that our external switch, a Cisco Catalyst 2900, doesn't like to update its routing table too frequently and is still sending the traffic for that IP to the now-defunct PIX firewall. Disabling the LAN interface (or what had been the LAN interface at least) of the PIX causes the switch to clear its routing table, and suddenly everything with email is working like a charm. By this time I've spent another 10 hours straight working to get things up and running again. The IS manager and Assistant IS manager both worked for over 20 hours straight to get things working.
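
A quick reachability check would not have told us where the traffic was getting lost, but it does make it easy to confirm whether a fix worked. What follows is only a sketch, assuming Python on a machine outside the network. The function name `tcp_port_open` and the hostname in the usage note are hypothetical.

```python
import socket


def tcp_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to (host, port) succeeds in time.

    A refused connection, a timeout, or a name lookup failure all count
    as "not reachable" and return False.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For example, `tcp_port_open("mail.example.org", 25)` run from outside the firewall would show whether incoming SMTP connections are getting through.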

The moral of the story: don't trust Cisco. Its products are complicated, in many ways far too complicated for what they are meant to do. For instance, why does every single network port on a switch need to be turned off by default? Typing something like "configure interface e0 no shutdown" to turn it on is completely inane. That's right, to turn something on you type "no shutdown". Clever, isn't it? Additionally, anyone who actually wants to configure a large network like this by hand must be crazy. Is it that hard to create a relatively simple web interface? I suggest anything but Cisco unless your setup needs to be really complicated and highly specific (which for 99% of situations it probably doesn't).

On the other hand, I have fallen in love with pfSense. It just works. Its web interface is powerful and intuitive. Being a BSD system, it has an easy-to-understand command line (for instance, turning on a network card is as easy as "ifconfig em1 up", but you probably won't need to do that in the first place), and it also has installable packages so that you can easily add functionality to your router at any time. Additionally, it is very fast (it can run our entire network without problems), and changes made to any of the settings take effect almost instantly. While we still need to switch some of our network over to this new router, I think that the transition will go very smoothly and I'm almost looking forward to making it.

It's been a long week thus far. Sometimes going to hell and back takes a little bit of work and the determination to fix any problems that come up.

Don't blame Cisco

Hardware failures can affect any brand of hardware. If you had the firewall under a service contract, you could have had a replacement within a few hours, instead of having to start building a new network architecture with all the problems you experienced down the way.

Plan for failure, build redundancy into your design and ensure when hardware does fail, someone else has already been paid to take the responsibility of fixing it - not you. 200 people who need the Net to do their work should not be exposed to the interruptions of a couple of guys messing around for days on the network just because one device has an issue.

I agree 100%. That was your

I agree 100%. That was your fault, not Cisco's. How could you not have redundancy? ALL devices will fail at some point, count on it!!

we didn't have redundancy...

Because even though I worked at a place with 200 people, it was a non-profit and there was no money for redundant hardware, etc. So in this case we had no choice but to attempt to fix the problem with the only tool we had open to us: our heads.

Exactly! Not all us are lucky

Exactly! Not all of us are lucky enough to have the kind of budgets that allow for redundancy, especially of the expensive Cisco kind. Cisco is perhaps more reliable in the long haul than less expensive products, but when something does happen, and it will, good luck unraveling the inane Cisco tangle. Their products are overcomplicated, their service is poor, and they are not improving in either area. I'm convinced their market share is due to an aging IT population that is stuck on the “name”.