Does decoupling the WLAN = no more maintenance windows?
That’s a good question and one that the team over at Cisco is attempting to answer with new features coming in the Cisco 9800 IOS XE code.
Upgrading the code on your WLAN controller and APs has always been one of those things that people simply don’t want to do. Why, you ask? Most network administrators find solace in code that is relatively stable, or at the very least are happy with the code they have running, since “the enemy you know is better than the one you don’t”. Even so, code upgrades have become commonplace these days. Yet for something like supporting a new AP model, why must one take the entire production network down just to do some basic coverage testing before rolling out a new AP? The folks over at Cisco have been working on a way to avoid this very thing. While rolling AP upgrades are definitely not “new” to the industry, Cisco has decided to take things a step further by totally decoupling the controller and access point software. Simply put, there is no longer a valid reason to interrupt the network in order to support a new piece of hardware that might only be operating in a small test area. The great thing about the Cisco Catalyst 9800 code (IOS XE) is that it has wide support for APs and controllers such as the 5520, 8540, 3504, and 9800 series appliances running in Local, Flex, or Fabric mode, while still focusing on the core tenets of resiliency, security, and intelligence.
Within the IOS XE code, the engineers have created multiple upgrade options that limit the impact on end users by cutting down on the number of network-wide interruptions an upgrade causes. The three upgrade types introduced are the controller upgrade, the AP Service Pack (for PSIRTs and AP bugs), and the AP Device Pack. The AP Device Pack is a small package of code that is loaded on the end user’s existing WLAN controller (running IOS XE) to support new AP hardware. That’s right: you want to run that shiny new Cisco XXXX AP on code version 16.12 that might not support it? Not a problem! The caveat is that while the AP will operate using this new device pack software on the older code, not all features of the new AP may be supported. Yet for testing things like coverage patterns, the reduced feature set may very well be all you need.
Another feature on the way is the ability to do per-site rolling AP upgrades with a good amount of control over many of the tasks involved. Using standard wireless features such as 802.11k and 802.11v, the network can gracefully move users over to a different access point in order to free up other access points for the code upgrade. Granular control over how many APs to upgrade at once, which ones can be taken out of service without totally killing wireless coverage in an area, and the ability to quickly roll back in case of an issue are just some of the features that help keep administrators from having to wait for those dreaded 2am maintenance windows. Let’s face it, the days of maintenance windows are nearing an end.
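To make the batching idea a bit more concrete, here is a rough sketch in Python of how a per-site rolling upgrade could be planned. To be clear, this is not Cisco's algorithm or any IOS XE API; the function name `plan_rolling_upgrade`, the `neighbors` map, and the `max_fraction` knob are all made up for illustration. It simply assumes you know which APs neighbor each other (and that the relationship is symmetric), and it never takes two neighboring APs down in the same batch.

```python
from math import ceil


def plan_rolling_upgrade(neighbors, max_fraction=0.25):
    """Group APs into upgrade batches (hypothetical helper, not a Cisco API).

    neighbors: dict mapping an AP name to the set of AP names adjacent to it.
               Assumed symmetric: if A lists B, then B lists A.
    max_fraction: largest share of the site's APs allowed offline at once.
    """
    # Always allow at least one AP per batch so the plan can make progress.
    batch_limit = max(1, ceil(len(neighbors) * max_fraction))
    pending = sorted(neighbors)  # sorted for a deterministic plan
    batches = []

    while pending:
        batch = []
        blocked = set()  # neighbors of APs already picked for this batch
        for ap in pending:
            if len(batch) == batch_limit:
                break
            if ap in blocked:
                continue  # a neighbor is already going down in this batch
            batch.append(ap)
            blocked |= neighbors[ap]
        batches.append(batch)
        pending = [ap for ap in pending if ap not in batch]

    return batches


# Four APs down a hallway, each hearing only its immediate neighbors.
site = {
    "ap1": {"ap2"},
    "ap2": {"ap1", "ap3"},
    "ap3": {"ap2", "ap4"},
    "ap4": {"ap3"},
}
print(plan_rolling_upgrade(site, max_fraction=0.5))
```

With half the site allowed offline at a time, the hallway above gets upgraded in two passes of alternating APs, so every AP being upgraded has a neighbor still up for clients to roam to. A real controller would obviously lean on actual RF neighbor data and client load rather than a hand-built map.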
Now... what could go wrong? While this all sounds great, a part of me wonders to what extent this could create issues for customers. Recently, many customers testing new 11ax APs have discovered that their Intel clients simply won’t connect to an AP broadcasting an SSID that is 11ax compatible. The issue is that those clients can’t understand the information element (IE) being broadcast, and the fix is to disable 11ax for the time being (while we *patiently* await new drivers). How does this AP device pack solve that issue? Well, it doesn’t. That’s why I mentioned the caveat above. 🙂 But for customers such as healthcare, where it is nearly impossible to find an acceptable time to do an upgrade, this will help immensely in easing the transition over to new hardware.
Be sure to see what industry veteran Lee Badman thinks about this code revolution over at his blog.
The other piece I’m interested to see is how TAC handles issues with customers running software version X plus an AP device or service pack. While we know this feature has been developed with good intentions, we all know how tricky code issues have been in the past. Having an issue should not draw an initial response from TAC that you need to be on code version X (to ensure a stable position for troubleshooting). However, the reality is that software is error prone for the simple reason that we (humans) make mistakes. No matter how much regression testing vendors do these days, they can’t account for EVERY single corner case the product may be used for. Maybe this is something AI/ML can help us with... but that’s for another blog post. 🙂
Just remember, these new upgrade options aren’t a way to keep running old code forever. You still have to upgrade, but the folks at Cisco have just made life a bit easier. After all, the inevitable always occurs.
If you want to watch the video in its entirety, check it out below.