Monthly Archives: November 2014

Time to Give Thanks

It’s that time of the year here in America when we give thanks for the blessings in our lives. I hope you take just a bit of time, away from the TV and craziness of life, to do just that.

I want to thank the Lord for his many blessings in my life…a wonderful wife, great kids, abilities and talents in the IT field, and a great job with a great company. And most importantly, I want to thank the Lord for his son Jesus and his sacrifice on the cross. To Him be the Glory, for Ever and Ever. Amen.

Have a great and safe Thanksgiving holiday.

Security News – Regin and WordPress

Folks, here is one nasty piece of malware: Regin. Symantec has a fascinating and rather detailed write-up on Regin here. Very scary stuff. Most reports show that Regin has been in the wild since 2008, but I’ve seen a report or two that points further back to 2003. Due to the incredible complexity of Regin, the consensus is that a nation state is the author, and the most likely candidates are the USA, Great Britain, China, or Israel. (Notice that no infections have been reported in the USA or China.)

If you run a blog or website on WordPress (like I do), then note that WordPress has issued an update of their software which fixes a number of bugs and security vulnerabilities, including a critical flaw that could be used in an XSS (Cross-Site Scripting) attack. Exploits for this are most likely already out in the wild, so it is highly recommended that you apply the updates. You can view the security notice here.

Book Recommendation – Newton’s Telecom Dictionary

A Great First Book for Your IT Library

If you don’t already have Newton’s Telecom Dictionary, then you need to get it. Yes, it’s that good. If you are new to the IT field or a student working towards a career in IT, then this is a must-have book. Why?

Put simply, this book is full of the answers you need when you need them. It’s crammed full of definitions for every technical term and abbreviation you will come across, and believe me, the IT field is FULL of abbreviations! More importantly, the information is presented in a very readable format. Plus, the author has sprinkled his unique humor throughout the book in just the right amount. A couple of examples…

Betazed –  A planet in the second Star Trek TV series, inhabited by Betazoids, beings with great powers of empathy and telepathy.

Bunny Suit –  A layered, hooded outfit that covers every part of your body, except your eyes. Bunny suits are worn by people who work in places where cleanliness is absolute. The human skin sheds about 30,000 particles of skin a second. If one of these particles made it into a semiconductor or a piece of optical fiber it could seriously impair the usability of the device.

This book is a must-have for any IT person. It is full of helpful information and humor. Grab a copy, pick a random page, and start reading…you will be hooked!!

Another Busy Week…but Very Successful

Hello again. As a follow-up to last week’s post (read it here), this past week was once again way too busy. I spent the entire week at the new office, working 12-hour days, getting it ready for move-in and go-live. The following tasks were accomplished…

  • All network cabling was installed, labeled and tested. This consisted of about 60 workstations, running two data cables to each workstation. The cabling vendor is a company I’ve worked with for many years…they know what they are doing and it shows in the final product. No worries here.
  • A solid wall of backboard was installed in the MPOE, and on that was mounted a swing-out Chatsworth rack. A bit pricey but worth the extra money…the whole rack can pivot to the side, giving access to the rear of the equipment. (Check out Chatsworth’s Swing Gate if you are interested.)
  • Network router (Cisco 2851), Cisco switch stack (3750’s), and several Cisco Access Points were installed, configured and tested.
  • The new PRI circuit was tested and 100 DIDs (Direct Inward Dialing numbers) were ported over from our old PRI circuit.
  • The wireless broadband is working well, but I am still keeping my eye on it. Not sure if the vendor fine-tuned it or not, but I am seeing better performance.
  • Security cameras and a key-fob access system were installed.

It was a long, but successful week. I am also glad this type of project does not occur often.

I hope you had a great weekend!!

The Busy Life of a Network Engineer

Sorry about the lack of posts this week…I have just been way too busy, and working some long hours.  I will get back on track this weekend. Here is a quick summary of my week…

  • Suffered a network outage at one of our busiest District Offices. I had to travel to the location and work with the carrier (a major fiber and Internet carrier), and troubleshoot with them over the phone. As always, they said the issue was with my equipment. (Carriers almost ALWAYS say the issue is with your equipment.) And like always, I have to prove to them that it’s their issue…which it was. Somehow, the VLAN carrying my traffic was changed which brought my network down. We finally got the circuit back up at 2 AM, twelve hours after it went down. Ugh.  And 3 days later, they still cannot tell me how that happened. I’m like “Is there really that many people that can make those types of changes? Don’t you track your changes?”  I guess they don’t.
  • I’m the PM (Project Manager) for the IT part of a new District Office which is going live in a couple of weeks. Yes, this is the location in which we have had major issues with the LEC (Local Exchange Carrier). Check out some earlier posts (Part 1 and Part 2) which talk about these challenges. We did finally get a PRI circuit installed, but no fiber Internet. I ended up using a vendor that offers high speed wireless broadband. I was onsite for a couple of days, bringing this up and testing. The circuit is 15 Mbps, up and down. It’s working relatively well, but I’m seeing a bit of an issue with large packets (over 1100 bytes)…I have a consistent packet loss of between 1-2%. I know that does not sound like much, but when you are moving large files around, that ends up pushing your throughput way down to around 6 Mbps. I will say this…the vendor is very easy to work with, and they have already agreed to work with me next week to resolve this.
  • I had an MPLS T1 circuit at a very remote site giving me fits all week long. It was taking errors pretty much 24×7, and even going down for several hours at a time almost every day. The carrier dispatched out multiple times before finally getting the issue resolved. (They had to replace multiple jumpers, and redo some splices.) It’s now been running clean for almost 48 hours straight. My thanks to the technician who hung in there and got this fixed.
  • We recently opened up a temporary site out in the boonies…like way out. This site has no copper facilities at all…no phones, no network circuits…nada. However, it is located right next to a major Interstate, and there is a Verizon tower nearby. I was tasked with getting a Cradlepoint router (with a 4G Verizon card attached) to run DMVPN (Dynamic Multipoint VPN), and connect with a Cisco router at my Data Center. This was a challenge, especially since Verizon likes to run double NAT’ing in their 4G networks. Yep, the 4G card gets a valid public IP address, but that’s not what’s seen on the Internet. Somewhere upstream, still within Verizon’s network, it gets NAT’d again with a different public IP. (Way to go Verizon.) Well, I did get DMVPN to work after much trial and error. We are testing now to see how stable it is, and hope to install it at the site in the next week or two.

As you can see, it was a busy week. And next week will be just as busy.  I’m going to be down at the new District Office most of the week, overseeing all of the cabling, cutting over to the new PRI, installing the network equipment, and working on resolving the packet loss issue. Wish me luck.

And, have a great weekend!!

How to Stress Test a T1 Circuit

Almost all networks have T1 circuits, the most common being either an MPLS T1 or an Internet T1. There will be times when one of your T1 circuits acts up in a sporadic manner, causing “slowness” for your end users, and you will need to be more proactive in troubleshooting the root cause. This post will talk about how to stress test a T1 using PING.

First off, understand that using a PING command with the default parameters will tell you if the circuit is up or down, and it may show problems such as large latency or excessive drops. But to really test a T1, you need to modify the use of PING to perform a more thorough test. Common actions are to increase the packet size and frequency of pings to better test throughput, and to use specific data patterns to better test the operation of the T1.

You can use Cisco’s PING which is part of IOS.  Here is an example of an extended PING where we increase the packet size to the max MTU of 1500 bytes, and run all 1’s (which will provide additional stress on the circuit)…

Using Cisco's IOS ping command

A much better option, though, is the Linux ping with the “flood” option, as this will allow you to really hammer the T1 circuit. (Note…you need to be root to use the flood option.) The difference is this…Cisco’s PING sends an echo-request, then waits for the echo-reply before it sends the next echo-request. This greatly reduces the amount of ping traffic IOS can send across the T1. Linux, however, sends packets as fast as the replies come back, or 100 times per second, whichever is more. For each echo-request packet it sends, it prints a “.” (dot) on the screen. For each echo-reply it receives, it prints a backspace. So if you only see a couple of dots, then the circuit is handling your ping flood easily. However, if you start seeing dots race across the screen, then there are problems. Here is a Linux PING flood example with 1500-byte packets and running all 1’s…

Using Linux ping -f (flood) option to stress a T1 circuit
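
Since the screenshot is not reproduced here, this is a rough sketch of how such a flood ping command could be put together. It assumes the iputils ping found on most Linux distributions, where -f is flood, -s is the ICMP payload size, and -p is the fill pattern in hex (ff = all 1’s). It also assumes that “1500 byte packets” means a 1500-byte IP packet, which works out to a 1472-byte payload once the IP and ICMP headers are subtracted. The helper name flood_ping_command and the address 192.0.2.1 (a documentation address) are made up for illustration.

```python
IP_HEADER_BYTES = 20    # IPv4 header without options
ICMP_HEADER_BYTES = 8   # ICMP echo header


def flood_ping_command(host, ip_packet_bytes=1500, pattern="ff", count=None):
    """Build the argument list for a Linux (iputils) flood ping.

    ip_packet_bytes is the size of the whole IP packet; ping's -s flag
    wants only the ICMP payload, so the two headers are subtracted.
    """
    payload = ip_packet_bytes - IP_HEADER_BYTES - ICMP_HEADER_BYTES
    if payload < 0:
        raise ValueError("packet size is smaller than the IP and ICMP headers")
    cmd = ["ping", "-f", "-s", str(payload), "-p", pattern]
    if count is not None:
        cmd += ["-c", str(count)]   # stop after this many packets
    cmd.append(host)
    return cmd


if __name__ == "__main__":
    # Only prints the command; run it yourself as root, and off-hours.
    print(" ".join(flood_ping_command("192.0.2.1")))
```

Printing the command instead of running it is deliberate here…a flood ping is something you want to start by hand, as root, and outside of business hours.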

As you can see, there are only three dots…7356 packets were sent and 7353 were received, which leaves 3 missing packets. This T1 easily handled this test. A Linux ping flood will typically load up a T1 in the range of 700-900 Kbps (about half of a T1 circuit). If you really want to fully load up a T1, run two different instances of ping flood, and you will see a T1 circuit fully saturated (or near so). Of course, do NOT do this during normal business operations…you will heavily impact the end users, and they will not be happy. When running the Linux ping flood shown above, the resulting bandwidth impact on the T1 was…

Using "show interface" to see bandwidth impact of ping flood
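
To put numbers on the test above, here is a small sketch of the arithmetic. The packet counts (7356 sent, 7353 received) and the 700-900 Kbps range come from this post; the function names are made up for illustration, and the 1544 Kbps figure is the nominal T1 line rate.

```python
T1_LINE_RATE_KBPS = 1544  # nominal T1 line rate (1.544 Mbps)


def packet_loss(sent, received):
    """Return (lost_packets, loss_percent) for a ping run."""
    if sent <= 0:
        raise ValueError("sent must be a positive packet count")
    lost = sent - received
    return lost, 100.0 * lost / sent


def t1_utilization_percent(observed_kbps):
    """Share of the nominal T1 line rate used by the observed traffic."""
    return 100.0 * observed_kbps / T1_LINE_RATE_KBPS


if __name__ == "__main__":
    lost, percent = packet_loss(7356, 7353)
    print(f"lost {lost} packets ({percent:.2f}% loss)")
    for kbps in (700, 900):
        print(f"{kbps} Kbps is {t1_utilization_percent(kbps):.0f}% of a T1")
```

Three lost packets out of 7356 works out to about 0.04% loss, and 700-900 Kbps is roughly 45% to 58% of the line rate, which is where the “about half of a T1” comes from.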

In my next post I will give an example of how I used ping flood to troubleshoot a T1 circuit whose performance was impacted by a unique problem.

Killing Those Pesky Child Processes in Linux

Linux is awesome! It is solid and dependable, and you can do most anything you want with it. I use it every day…mainly for network management purposes (Nagios, MRTG, SWATCH, SYSLOG, NMAP, etc). If you have not used Linux yet, I would highly encourage you to do so. I will post a feature soon on how Linux can play a large role in helping you manage your network. But for now, let’s kill some pesky child processes.

Although I use Linux a lot, I am in no way a Linux guru. I write simple scripts and hack my way through stuff. However, when I kill a process, I sometimes find one or more child processes that remain. So I researched different ways to take care of these, and there are many ways to accomplish this. For me, this works best…

To help troubleshoot an ISP issue with one of my Internet fiber links, I’m running a ping against the public IP address once per second (very granular), and adding a time/date stamp to each ping reply. Here is the script…(IP address has been changed to protect the guilty)…

Simple Ping Script with Date/Time Stamp
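
The original script was a screenshot and is not reproduced here, so the following is only a sketch of the same idea in Python, not the script I actually ran. It assumes the Linux ping command is on the PATH and accepts -i for the interval in seconds. The function names, the log file name, and the 192.0.2.1 address (a documentation address) are placeholders.

```python
import subprocess
import sys
from datetime import datetime


def stamp(line, now=None):
    """Prefix one line of ping output with a date/time stamp."""
    now = now or datetime.now()
    return f"{now:%Y-%m-%d %H:%M:%S} {line.rstrip()}"


def timestamped_ping(host, out=sys.stdout, interval_seconds=1):
    """Run ping against host and write each output line with a timestamp.

    Runs until ping exits or the process is killed.
    """
    cmd = ["ping", "-i", str(interval_seconds), host]
    with subprocess.Popen(cmd, stdout=subprocess.PIPE, text=True) as proc:
        for line in proc.stdout:
            out.write(stamp(line) + "\n")
            out.flush()  # keep the log current in case the script is killed


# Example use (placeholder address and file name):
#   with open("ping.log", "a") as log:
#       timestamped_ping("192.0.2.1", out=log)
```

Flushing after every line matters for this kind of logging…if the script gets killed, you still want the last few replies in the file.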

Here is a snippet from the log file showing what the ping replies look like…

Ping results from log file

When I’m notified of a network bounce, I’m able to dig through the file and see whether the Internet circuit did indeed take a hit, or whether it was just my Virtual Tunnel interface bouncing. Here is an example showing an outage that lasted a bit over 2 minutes…take a look at the timestamp and also the gap in the sequence numbers…

Quick circuit outage lasting a bit over 2 minutes
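
Digging through a big log by eye gets old, so here is a sketch of how the sequence-number gaps could be pulled out automatically. It assumes each reply line contains “icmp_seq=N” (some older ping versions print “icmp_req=N”, so both are matched). The function name is hypothetical, and the sample lines in the usage note are invented, not taken from my log.

```python
import re

# Matches the sequence number in a Linux ping reply line.
SEQ_PATTERN = re.compile(r"icmp_(?:seq|req)=(\d+)")


def find_sequence_gaps(lines):
    """Return a list of (last_seq_before, first_seq_after, missing_count).

    Lines without a sequence number (headers, errors) are skipped.
    """
    gaps = []
    previous = None
    for line in lines:
        match = SEQ_PATTERN.search(line)
        if not match:
            continue
        seq = int(match.group(1))
        if previous is not None and seq > previous + 1:
            gaps.append((previous, seq, seq - previous - 1))
        previous = seq
    return gaps
```

With one ping per second, the missing count is roughly the outage length in seconds. For example, a log that jumps from icmp_seq=3 straight to icmp_seq=130 is missing 126 replies, or a bit over 2 minutes.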

Anyway, when I kill the script, I end up with two child processes remaining. I found out that I need to kill the PGID (Process Group ID) to properly take care of any child processes. To find the PGID, you can run “ps -ejH”, which shows you a process tree where you can find the PGID (in column two). Then you can kill the whole group using “kill -- -PGID” (two hyphens, a space, then the PGID with a minus sign in front of it). Here is an example…

Finding the PGID and killing it
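
For anyone who prefers to do this from a script, here is a rough Python equivalent of “kill -- -PGID”. It is a sketch for Linux and other POSIX systems, not what the screenshot showed. os.getpgid() and os.killpg() are standard library calls; the two function names are made up, and starting the job with start_new_session=True is an assumption about how you launch it, so that the job leads its own process group.

```python
import os
import signal
import subprocess


def start_in_own_group(argv):
    """Start a command as the leader of a new session and process group."""
    return subprocess.Popen(argv, start_new_session=True)


def kill_process_group(pid, sig=signal.SIGTERM):
    """Send sig to every process in the group that pid belongs to.

    Returns the PGID that was signalled.
    """
    pgid = os.getpgid(pid)
    if pgid == os.getpgrp():
        # Safety check: never signal the group this program itself is in.
        raise RuntimeError("refusing to signal our own process group")
    os.killpg(pgid, sig)
    return pgid


# Example use (the script name is a placeholder):
#   job = start_in_own_group(["sh", "ping-log.sh"])
#   ...
#   kill_process_group(job.pid)
```

Because the job is started in its own group, the signal reaches the script and every child it spawned, but not the program doing the killing.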

This works well for me. And as for Linux, give it a try.