Saturday, March 30, 2013

Not So Simple tools - Ping

One of the conversations that I've had many times over the years is about one of the most commonly used and misused tools ... ping. Each network operating system seems to handle ping differently, causing frustration for many operators. That said, there are a few things about ping that will make the tool more useful for you in your Juniper journey.

The first is size ... a major difference between Windows (Windows 7 in this case), Cisco and Juniper is what you're specifying when you type the word size. In troubleshooting, knowing the difference can help isolate a link with a misconfigured MTU.

In Windows, the length specified is the ICMP Data, so the command "ping -n 1 -l 1000" would result in an ICMP Data field of 1000 Bytes, 8 Bytes of ICMP Header, 20 Bytes of IPv4 header, and 14 Bytes of Ethernet header, giving a total of 1042 Bytes on the wire.

With Juniper (and most variants of Linux), the command "ping count 1 size 1000" would also yield 1042 bytes on the wire, but in a slightly different way. The Ethernet header and the IPv4 header remain the same, 34 Bytes in total, and the ICMP header is still 8 Bytes. The difference is that the first 8 bytes of the ICMP Data carry a timestamp, which reduces the remaining payload by the same amount and leaves 992 bytes of pad data.

With Cisco, when you specify the size with the command "ping size 1000" you wind up with 1014 Bytes on the wire, as the 1000 specified is the combination of the IPv4 header (20 Bytes), ICMP Header (8 Bytes), and the ICMP data field (972 Bytes).
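The arithmetic above can be captured in a few lines of Python. This is only a sketch: the platform labels ("windows", "junos", "cisco") and the function names are made up for illustration, and it assumes untagged Ethernet, an IPv4 header without options, and no FCS in the byte count, as in the figures above.

```python
# Header sizes in bytes (assumes untagged Ethernet, IPv4 without options, FCS not counted).
ETHERNET_HEADER = 14
IPV4_HEADER = 20
ICMP_HEADER = 8


def wire_bytes(platform: str, size: int) -> int:
    """Return the bytes on the wire for a ping with the given size argument."""
    if platform in ("windows", "junos"):
        # The size argument counts ICMP data only.
        return size + ICMP_HEADER + IPV4_HEADER + ETHERNET_HEADER
    if platform == "cisco":
        # The size argument counts the whole IPv4 datagram.
        return size + ETHERNET_HEADER
    raise ValueError(f"unknown platform: {platform}")


def size_for_mtu(platform: str, ip_mtu: int) -> int:
    """Return the largest size argument that fits in ip_mtu without fragmentation."""
    if platform in ("windows", "junos"):
        return ip_mtu - IPV4_HEADER - ICMP_HEADER
    if platform == "cisco":
        return ip_mtu
    raise ValueError(f"unknown platform: {platform}")


for platform in ("windows", "junos", "cisco"):
    print(platform, wire_bytes(platform, 1000), size_for_mtu(platform, 1500))
```

With a size of 1000 this gives 1042 bytes for Windows and Juniper and 1014 bytes for Cisco. For a 1500 byte IP MTU with fragmentation disallowed, the largest size that should pass is 1472 on Windows and Juniper and 1500 on Cisco.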

Now that we have the confusion about size cleared up, the next thing is source. One of Juniper's best practices involves configuring a system option known as default-address-selection. The purpose of this setting is to have all traffic sourced from the routing engine (syslog, NTP, SNMP, ping etc) use the loopback address. This is great (most of the time) but can have some unintended consequences when it comes to testing and troubleshooting a network. As you can see from the examples below, failing to source from the appropriate interface can cause traffic to take a less optimal route around the network, perhaps bypassing the link that you are trying to test. If I were concerned about packet loss on the Ethernet link where the network resides, and ran the first ping without record-route, I would reasonably conclude that everything was fine, assuming the rest of the network was operating normally.

cstewart@router> ping record-route count 1                         
PING ( 56 data bytes
64 bytes from icmp_seq=0 ttl=60 time=7.789 ms

--- ping statistics ---
1 packets transmitted, 1 packets received, 0% packet loss
round-trip min/avg/max/stddev = 7.789/7.789/7.789/0.000 ms

cstewart@router> ping source record-route count 1 
PING ( 56 data bytes
64 bytes from icmp_seq=0 ttl=64 time=5.337 ms

--- ping statistics ---
1 packets transmitted, 1 packets received, 0% packet loss
round-trip min/avg/max/stddev = 5.337/5.337/5.337/0.000 ms
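One quick way to spot the difference between the two results is the ttl value in the reply. The sketch below pulls it out of a reply line and estimates how many routers the reply crossed on its way back. It assumes the replying host starts its TTL at 64, which is a guess that happens to fit these outputs and will not hold for every device; the function name is hypothetical.

```python
import re
from typing import Optional

# Assumption: the replying host sends its echo reply with an initial TTL of 64.
ASSUMED_INITIAL_TTL = 64


def hops_from_reply(line: str, initial_ttl: int = ASSUMED_INITIAL_TTL) -> Optional[int]:
    """Estimate the routers crossed by a reply, or return None if no ttl is present."""
    match = re.search(r"\bttl=(\d+)", line)
    if match is None:
        return None
    return initial_ttl - int(match.group(1))


# Reply lines copied from the two pings above.
default_source = "64 bytes from icmp_seq=0 ttl=60 time=7.789 ms"
explicit_source = "64 bytes from icmp_seq=0 ttl=64 time=5.337 ms"

print(hops_from_reply(default_source))   # 4
print(hops_from_reply(explicit_source))  # 0
```

Under that assumption, the reply to the loopback-sourced ping crossed four routers, while the reply to the ping sourced from the connected interface came straight back over the link being tested.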

The final ping option that I want to cover in this post is rapid, which is a slightly less aggressive version of the Linux "ping -f" option. The rapid option sends only five requests by default, but combined with a large count it allows an operator to generate a significant amount of ICMP traffic. Because of that, due caution should be exercised when using this option, especially on low bandwidth or otherwise constrained links.
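To get a feel for how much traffic that can be, here is a rough back-of-the-envelope sketch. It assumes that only one request is outstanding at a time and that the next one is sent as soon as the previous reply arrives, so the rate is bounded by the round-trip time. That is a simplification made for illustration, not a statement of exactly how the router paces its requests, and the function name is made up.

```python
def rapid_ping_bps(frame_bytes: int, rtt_ms: float) -> float:
    """Approximate load in bits per second, in each direction, for back-to-back pings.

    Assumes one request in flight at a time, sent as soon as the last reply arrives.
    """
    if rtt_ms <= 0:
        raise ValueError("rtt_ms must be positive")
    pings_per_second = 1000.0 / rtt_ms
    return frame_bytes * 8 * pings_per_second


# 1042 byte frames (size 1000 on Juniper) at the 5.337 ms round trip seen earlier.
print(round(rapid_ping_bps(1042, 5.337)))
```

With those numbers the estimate comes to about 1.56 Mbit/s in each direction, which is enough to matter on a T1 or a busy low speed circuit.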



Friday, March 29, 2013

JunOS Script Health Check

For my first technical post, I figured I'd share a bit of newly acquired know-how with regard to JunOS script. Often, upon logging into a router, it's good to know a bit of information, whether that login is for troubleshooting or normal configuration and provisioning. For me, the things that I care about knowing first are:

  • How long has the router been up
  • Who is logged in now
  • When was the last commit
  • Is anything broken
    • Chassis/System Alarms
    • Physical Interfaces down
    • LDP neighbors down
    • OSPF neighbors down
    • BGP Peers down
    • L2Circuits Down
I thought to myself, this is the perfect time for a login-script, and an easy opportunity to break into JunOS Scripting. So how do we get to the point where, on login, the script outputs the health check? It's actually not too difficult. First, we need to load the script onto the router, into /var/db/scripts/op/. This can be accomplished by copying the script over using SCP or FTP. Next, we need to add it as an op script:

cstewart@router# set system scripts op file login-script.slax 

cstewart@router# show system scripts 
op {
    file login-script.slax;
}

Finally, we need to add the script to the login class of the users who should see this information:
cstewart@router# set system login class network-manager login-script login-script.slax permissions all 

After these changes are committed, we can test the script by calling the op script directly. I'll caveat this with a warning that I am by no means a programmer. For revision 1, this was my target output when a lot is broken:

cstewart@router# run op login-script                                   
Minor ALARM - Rescue configuration is not set
LDP Configured and down on fe-0/0/7.0
OSPF Configured and down on fe-0/0/7.0
BGP Peer Down Peer@ is down
L2Circuit to on fe-0/0/7.0(vc 123) is down due to status NC
Physical Interface fe-0/0/7 is Admin Up and Operationally Down to MPLS Network
System Uptime is 7 days, 15:05
cstewart is currently logged in from since 10:39AM
Last commit was 2013-03-29 10:45:21 UTC by: cstewart

Now, when things are going well, you would see an output like this:
cstewart@router# run op login-script                             
System Uptime is 7 days, 15:15
cstewart is currently logged in from since 10:39AM
Last commit was 2013-03-29 10:56:07 UTC by: cstewart

So, without further ado, here is the script, free as in beer, for you to use and modify. Hopefully it helps you and your team.

version 1.0;
/* Version 1.0 of the login-script that does a quick health check */

ns junos = "*/junos";
ns xnm = "";
ns jcs = "";

import "../import/junos.xsl";

match / {
    <op-script-results> {

        /* Chassis alarms: one line per active alarm */
        var $query0 = { <command> 'show chassis alarms'; }
        var $result0 = jcs:invoke($query0);
        <alarm-information> {
            for-each ($result0/alarm-detail[alarm-class != '']) {
                <output> alarm-class _ ' ALARM - ' _ alarm-description;
            }
        }

        /* System alarms: one line per active alarm */
        var $query1 = { <command> 'show system alarms'; }
        var $result1 = jcs:invoke($query1);
        <alarm-information> {
            for-each ($result1/alarm-detail[alarm-class != '']) {
                <output> alarm-class _ ' ALARM - ' _ alarm-description;
            }
        }

        /* LDP interfaces that have no neighbors */
        var $query2 = { <command> 'show ldp interface'; }
        var $result2 = jcs:invoke($query2);
        <ldp-interface-information> {
            for-each ($result2/ldp-interface[ldp-neighbor-count == '0']) {
                <output> 'LDP Configured and down on ' _ interface-name;
            }
        }

        /* OSPF interfaces that have no neighbors */
        var $query3 = { <command> 'show ospf interface'; }
        var $result3 = jcs:invoke($query3);
        <ospf-interface-information> {
            for-each ($result3/ospf-interface[neighbor-count == '0']) {
                <output> 'OSPF Configured and down on ' _ interface-name;
            }
        }

        /* BGP peers that are not Established */
        var $query4 = { <command> 'show bgp summary'; }
        var $result4 = jcs:invoke($query4);
        <bgp-information> {
            for-each ($result4/bgp-peer[peer-state != 'Established']) {
                <output> 'BGP Peer ' _ description _ '@' _ peer-address _ ' is down';
            }
        }

        /* L2 circuits that are not Up; neighbor-address lives on the parent element */
        var $query5 = { <command> 'show l2circuit connections down'; }
        var $result5 = jcs:invoke($query5);
        <l2circuit-connection-information> {
            for-each ($result5/l2circuit-neighbor/connection[connection-status != 'Up']) {
                <output> 'L2Circuit to ' _ ../neighbor-address _ ' on ' _ connection-id _ ' is down due to status ' _ connection-status;
            }
        }

        /* Physical interfaces that are admin up but operationally down */
        var $query6 = { <command> 'show interfaces terse'; }
        var $result6 = jcs:invoke($query6);
        <interface-information> {
            for-each ($result6/physical-interface) {
                if ((admin-status == 'up') && (oper-status == 'down')) {
                    <output> 'Physical Interface ' _ name _ ' is Admin Up and Operationally Down to ' _ description;
                }
            }
        }

        /* Uptime and currently logged in users */
        var $query7 = { <command> 'show system users'; }
        var $result7 = jcs:invoke($query7);
        <system-users-information> {
            <output> 'System Uptime is ' _ $result7/uptime-information/up-time;
            for-each ($result7/uptime-information/user-table/user-entry) {
                <output> user _ ' is currently logged in from ' _ from _ ' since ' _ login-time;
            }
        }

        /* Most recent commit (the first commit-history entry) */
        var $query8 = { <command> 'show system commit'; }
        var $result8 = jcs:invoke($query8);
        <commit-information> {
            <output> 'Last commit was ' _ $result8/commit-history/date-time _ ' by: ' _ $result8/commit-history/user;
        }
    }
}
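The pattern the script leans on, selecting elements with a predicate and then reading fields from each match, can be tried off-box as well. The Python sketch below does the same thing for the LDP check. The XML is a hand-written, hypothetical sample that reuses the element names from the script; it was not captured from a router, and real output also carries namespaces that are left out here to keep the example short.

```python
import xml.etree.ElementTree as ET
from typing import List

# Hypothetical sample data, written by hand for illustration only.
SAMPLE = """
<ldp-interface-information>
  <ldp-interface>
    <interface-name>fe-0/0/6.0</interface-name>
    <ldp-neighbor-count>1</ldp-neighbor-count>
  </ldp-interface>
  <ldp-interface>
    <interface-name>fe-0/0/7.0</interface-name>
    <ldp-neighbor-count>0</ldp-neighbor-count>
  </ldp-interface>
</ldp-interface-information>
"""


def ldp_down_messages(xml_text: str) -> List[str]:
    """Return one message per LDP interface that has zero neighbors."""
    root = ET.fromstring(xml_text)
    messages = []
    # Same idea as the predicate [ldp-neighbor-count == '0'] in the SLAX script.
    for interface in root.findall("ldp-interface[ldp-neighbor-count='0']"):
        # Read the name relative to the matched interface, not from the whole document.
        name = interface.findtext("interface-name", default="")
        messages.append(f"LDP Configured and down on {name}")
    return messages


for message in ldp_down_messages(SAMPLE):
    print(message)
```

Reading interface-name relative to each matched element is what keeps every line of output tied to the interface that actually failed the check.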



Thursday, March 28, 2013

First post in the Journey

I always wonder where a blog starts. Perhaps with an introduction, and a little context, back story, and my goal. I decided to start this blog almost a year ago, and found myself in this same position, unable to figure out what that first step was. My goal was simple: to share the story of my journey from JNCIA-JUNOS all the way through my JNCIE-SP, and to provide others with that which I have had great difficulty finding. A place where the tips, tricks, and knowledge that I have, and will continue to acquire, can help to guide and encourage others.

All of this said, in the past year, I've completed a large portion of my JNCIE-SP journey, and learned a lot. I'm fortunate to work for an extraordinarily supportive company that has provided me with a small lab environment, on a program where I am heavily exposed to Juniper equipment, and surrounded by a few friends who are wicked smart.

While many folks start a blog to share their lives with the world or simply to express themselves, my goal in this is quite simple. I just want to share everything that I struggle to learn, and move the ball forward for anyone using the Juniper platforms.

Seems simple enough.