Killing Those Pesky Child Processes in Linux

Linux is awesome! It is solid and dependable, and you can do almost anything you want with it. I use it every day, mainly for network management (Nagios, MRTG, SWATCH, syslog, Nmap, etc.). If you have not used Linux yet, I would highly encourage you to do so. I will soon post a feature on how Linux can play a large role in helping you manage your network. But for now, let's kill some pesky child processes.

Although I use Linux a lot, I am in no way a Linux guru. I write simple scripts and hack my way through stuff. However, when I kill a process, I sometimes find that one or more of its child processes are still running. I researched this and found there are many ways to clean them up. For me, this works best…

To help troubleshoot an ISP issue with one of my Internet fiber links, I’m pinging the public IP address once per second (very granular) and adding a date/time stamp to each ping reply. Here is the script…(IP address has been changed to protect the guilty)…

Simple Ping Script with Date/Time Stamp
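
For readers who cannot see the screenshot, here is a minimal sketch of how a script like this could be written. It is not the original script: the function name `stamp_lines`, the address 192.0.2.1 (a documentation-only address), and the log path are all placeholders. A pipeline of this shape also hints at why killing only the script can leave children behind, since `ping` and the reading loop each run as separate processes.

```shell
#!/bin/bash
# Sketch only: stamp_lines is a hypothetical helper, not taken from the original script.
# It reads lines on stdin and prefixes each one with the current date and time.
stamp_lines() {
    while IFS= read -r line; do
        printf '%s %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$line"
    done
}

# Example use (placeholder address and log path; left commented out so that
# sourcing this file does not start a ping):
#   ping -i 1 192.0.2.1 | stamp_lines >> "$HOME/ping_isp.log"
```

The `-i 1` option asks `ping` for one probe per second, which is also its usual default on Linux.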

Here is a snippet from the log file showing what the ping replies look like…

Ping results from log file

When I’m notified of a network bounce, I’m able to dig through the file and see whether the Internet circuit did indeed take a hit, or whether it was just my Virtual Tunnel interface bouncing. Here is an example showing an outage that lasted a bit over 2 minutes…take a look at the timestamps and also the gap in the sequence numbers…

Quick circuit outage lasting a bit over 2 minutes

Anyway, when I kill the script, I end up with two child processes still running. I found out that I need to kill the PGID (Process Group ID) to properly take care of any child processes. To find the PGID, you can run “ps -ejH”, which shows you a process tree where you can find the PGID (in column two). Then you can kill the whole group using “kill -- -PGID”. The “--” tells kill that no more options follow, and the minus sign in front of the PGID tells it to signal every process in that group rather than a single PID. Here is an example…

Finding the PGID and killing it

This works well for me. And as for Linux, give it a try.