Routing Problem
Posted in Tech
First dog watch, 1 bell (4:57 pm)

I've been fighting this weird routing issue in the office today. Apparently our web developer had been seeing it since he started working this morning, but I didn't notice it until he brought it up with me. I didn't notice because I hadn't ssh'd into one of our servers yet today.

What I saw was about a 20 second lag, every minute or two. I could work for a while, then all of a sudden I'd get no response from the server for about 20 seconds (without losing the connection), then everything would catch up and I'd be good for another couple of minutes.

I ran a traceroute while the "outage" was happening. At the router just before my server I got no reverse DNS name resolution, followed by 6 hops of timeouts before the trace reached my server (server ping time: 3.301ms, 2.825ms, and 2.803ms; yes, I have a fast connection there). As soon as the problem resolved itself, a subsequent traceroute showed a resolved router name and a direct hop to my server, but with higher (80+ms) latency.
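
If you want to compare a "bad" trace against a "good" one without eyeballing it, something like the rough Python sketch below could tally the symptoms. It assumes the common Linux/BSD traceroute text layout (a hop number, then either `name (ip)` with timings or `* * *` for a timed-out hop). The function name, the sample hostnames, and the sample addresses are all made up for illustration; they are not from my actual trace.

```python
import re

# A hop line starts with the hop number, e.g. " 3  * * *".
HOP_RE = re.compile(r"^\s*(\d+)\s+(.*)$")

# Hypothetical sample output (documentation-range IPs, invented names).
SAMPLE = """\
traceroute to server.example.com (203.0.113.10), 30 hops max, 60 byte packets
 1  gw.example.net (192.0.2.1)  0.512 ms  0.498 ms  0.471 ms
 2  198.51.100.7 (198.51.100.7)  1.904 ms  1.877 ms  1.850 ms
 3  * * *
 4  * * *
 5  server.example.com (203.0.113.10)  3.301 ms  2.825 ms  2.803 ms
"""


def summarize_traceroute(output: str) -> dict:
    """Count hops, fully timed-out hops, and hops with no reverse DNS name."""
    hops = timeouts = unresolved = 0
    for line in output.splitlines():
        match = HOP_RE.match(line)
        if not match:
            continue  # header line or blank line
        fields = match.group(2).split()
        if not fields:
            continue
        hops += 1
        if all(field == "*" for field in fields):
            timeouts += 1  # every probe for this hop timed out
            continue
        # traceroute prints "name (ip)"; when the reverse lookup fails,
        # the name column is just the IP address repeated.
        host = fields[0]
        addr = fields[1].strip("()") if len(fields) > 1 else ""
        if host == addr:
            unresolved += 1
    return {"hops": hops, "timeouts": timeouts, "unresolved": unresolved}


if __name__ == "__main__":
    print(summarize_traceroute(SAMPLE))
```

This only looks at the first responder on each hop line, so a hop that answers from several different routers would need more careful parsing.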

My gut reaction was that it was a problem with our colocation facility's router, so I called them about it. They weren't seeing anything on their network. The technician tried from another connection outside their normal network and it worked fine, too. He never saw the latency issue.

Just as I was starting to suspect something weird with our switch being overloaded and/or dropping packets, I got a call back from the technician. He told me he had just found out there was a known issue with a new router affecting incoming connections through XO Communications (our Internet provider here). I was relieved to know it wasn't any of my equipment, and about 30 minutes later they discarded the route that was going through the router in question.

So now the problem is gone, and not many people in the outside world even experienced it, but on the downside my route is now 17 hops long instead of 7, and my latency is about 60ms. Still, I'll take it over a broken route any day.

Don’t Panic
Forenoon watch, 6 bells (11:07 am)

It hasn't happened yet.
