Just as with anything else in the IT world, there are no one-size-fits-all solutions, and no “best practices” that apply to every situation. But there are some common tips and tricks to get clarity around automation decisions and reduce any friction that may be inhibiting (further) adoption of network automation.
1. Choose Whether You Want Flexibility or Simplicity
Automating your network requires treating your network as code. It’s literally programming your network, and when it comes to programming, there are several ways to accomplish the same objective.
Think of flexibility and simplicity as sitting at opposite ends of a spectrum. At the simplicity end of the spectrum, you can automate a task in a way that’s quick and simple, but not very scalable. At the flexibility end of the spectrum, you can automate a task in a way that’s initially difficult and requires a lot of careful thought and testing, but is massively scalable. Whether you go for flexibility or simplicity depends on how comfortable you are with automation and programming in general.
Consider who’s going to be looking at the code. Especially in small environments, there’s a good chance a non-network person will need to diagnose a problem. Simple, straightforward code makes it easier for them to understand what’s going on. However, the simple approach isn’t all roses. If you build your automation by copying and pasting, you end up with a lot of duplication, and this doesn’t scale.
For example, suppose you have a few switches, each with its own unique configuration. You can create a single Ansible Playbook that contains what amounts to a verbatim copy of each of these configurations, and this works fine. But this gets unwieldy when adding more switches or making a sweeping network-wide change. As your network grows and changes, you’ll end up having to refactor your code, which means potentially breaking things that work today.
In a larger environment where you may have a full-time network team, flexibility is more important. It’s also more complicated. You have to think less like a network engineer and more like a programmer. That means separating the logic of your code from the data that’s unique to each individual device. Most automation platforms support this natively, although again, there are many ways to go about it. Regardless of the platform you use, you generally break your code across multiple files. While more difficult to manage, the trade-off is that it’s much more scalable.
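To make the idea of separating logic from data concrete, here is a minimal sketch in plain Python rather than in any particular automation platform. The template text, device names, and values are all hypothetical and do not follow any vendor's exact syntax; the point is only that the shared logic lives in one place and each device contributes nothing but its own data.

```python
from string import Template

# Logic: one shared template describing how a switch is configured.
# The configuration lines are illustrative, not vendor-exact syntax.
SWITCH_TEMPLATE = Template(
    "hostname $hostname\n"
    "interface $uplink\n"
    " description uplink\n"
    "vlan $vlan\n"
)

# Data: only the values unique to each device (hypothetical names).
DEVICES = {
    "sw1": {"hostname": "sw1", "uplink": "Ethernet1", "vlan": 10},
    "sw2": {"hostname": "sw2", "uplink": "Ethernet2", "vlan": 20},
}


def render_config(name: str) -> str:
    """Render the shared template with one device's data."""
    return SWITCH_TEMPLATE.substitute(DEVICES[name])


if __name__ == "__main__":
    for name in DEVICES:
        print(render_config(name))
```

Adding a third switch means adding one entry to the data, and a network-wide change means editing the template once. In a real platform the data and the template would typically live in separate files.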
Can you settle somewhere between flexibility and simplicity, perhaps enjoying a little bit of both? It’s not that you can’t, but it’s not a good idea. Although combining the ease of copy-and-paste with powerful programming logic gives you the best of both worlds, it also gives you the worst. It becomes much more difficult to understand and predict how your automation platform will actually configure your devices. Mixing and matching approaches is more trouble than it’s worth.
2. Build One-Offs into Automation
One of the main barriers to network automation is the inevitable presence of ad-hoc or one-off configurations. Rather than trying to eliminate these, embrace them and make them a part of your automation solution. Adopt the mindset that if it’s not in the automation code, it doesn’t exist in the running configuration.
Going to the trouble of automating a single statement on one device does take time and effort, which may seem wasted, but it’s actually quite the opposite. Failing to adopt one-offs into your automation family will inevitably result in a broken network. You’ll eventually encounter a situation where either the automation platform has overwritten a one-off, or the one-off has created a conflict with some new configuration you pushed out via automation.
Such an ugly event has to happen only once before management declares that automation is off the table, and that all changes must be done manually. Investing extra time up front to automate one-offs is preferable to continuing to do everything manually.
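One way to picture building one-offs into automation is to record them as per-device data alongside the shared baseline, so the automation always knows about them. The sketch below is plain Python with hypothetical device names and illustrative configuration lines; it is an assumption about how such data might be organised, not a description of any specific platform.

```python
# Baseline lines every switch receives (illustrative, not vendor-exact syntax).
BASELINE = [
    "ntp server 192.0.2.1",
    "logging host 192.0.2.2",
]

# One-off lines recorded per device instead of being typed in by hand.
# "sw3" and its static route are hypothetical.
ONE_OFFS = {
    "sw3": ["ip route 198.51.100.0/24 192.0.2.254"],
}


def intended_config(device: str) -> list[str]:
    """Return the baseline plus any one-off lines recorded for this device."""
    return BASELINE + ONE_OFFS.get(device, [])


if __name__ == "__main__":
    for device in ("sw1", "sw3"):
        print(device, intended_config(device))
```

Because the one-off is part of the intended configuration, a later push neither overwrites it nor collides with it unexpectedly.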
3. Use a Single Automation Platform
Automation requires treating infrastructure as code, and every automation platform has its own chosen language. Ansible playbooks are written in YAML (Ansible itself is built on Python), Puppet uses its own declarative language, and Chef recipes are written in a Ruby-based DSL. It’s therefore important that everyone using the platform agrees on a common language.
If you have a DevOps team that already uses automation, ask them for recommendations. Find out how they’re using it. If they’re automating only a handful of servers using ad-hoc configurations, they may not be in the best position to advise you on how to automate the network.
Also, be cautious about choosing a platform just because it’s someone’s favourite. The people who are going to use it must like it. At the end of the day, if they don’t like the automation platform, they’re not going to use it.
4. Use Version Control
All automated device configurations should be kept in a centralised repository using a version control system such as Git. This has a couple of advantages.
First, the repository is the authoritative source for all configurations. Although it takes a while to get to this point, ultimately the goal is that if the configuration is not in the repository, it doesn’t exist on any device. This is the ideal and not a rule, because the reality is that if you’re going to introduce automation bit by bit, what’s in your repo will be only a small portion of the actual device configurations.
The other advantage is that version control lets you keep a record of changes so you can roll back easily. If you add one too many spaces or inadvertently delete a line of code, a version control system can tell you exactly what changed. Better yet, correcting the change doesn’t require manually fixing the code. You simply revert to a previous version, and everything is back to the way it was before the mistake.
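The kind of report a version control system gives you can be sketched with Python's standard difflib module, which stands in here for what a tool such as Git would show. The two configuration snippets are hypothetical: the newer one has one space too many and is missing a line, matching the mistakes described above.

```python
import difflib

# Two versions of the same (hypothetical) configuration fragment.
previous = ["interface Ethernet1", " description uplink", " no shutdown"]
current = ["interface Ethernet1", "  description uplink"]


def changed_lines(old: list[str], new: list[str]) -> list[str]:
    """Return only the removed ('-') and added ('+') lines between versions."""
    diff = list(
        difflib.unified_diff(old, new, fromfile="previous", tofile="current", lineterm="")
    )
    body = diff[2:]  # skip the two '---' / '+++' file header lines
    return [line for line in body if line.startswith(("+", "-"))]


if __name__ == "__main__":
    for line in changed_lines(previous, current):
        print(line)
```

Lines starting with "-" were in the previous version only, and lines starting with "+" are in the current version only, so both the extra space and the deleted line show up immediately.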
5. Validate and Monitor Your Network
Regardless of where you are on your automation journey, it’s a smart idea to make sure you have a network operations toolset in place that ensures everything is behaving as it was intended to.
Whereas version control tracks changes to your network configurations, a network operations toolset tracks changes to the state of the network itself, telling you when the state of the network has changed and why. Even if your network is only partially automated, these tools can still track every state change – even the manual ones – eliminating the blind spot left by partial automation. They can also help validate that changes had the expected effect.
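The core of validating that the network behaves as intended is a comparison between the state you expect and the state you observe. The sketch below shows that comparison in plain Python. The interface names and states are hypothetical and hard-coded; a real network operations toolset would collect the observed state from the devices themselves.

```python
def find_drift(intended: dict, observed: dict) -> dict:
    """Return {key: (intended_value, observed_value)} for every mismatch."""
    keys = intended.keys() | observed.keys()
    return {
        key: (intended.get(key), observed.get(key))
        for key in sorted(keys)
        if intended.get(key) != observed.get(key)
    }


# Hypothetical interface states: what automation intended vs. what is seen.
intended_state = {"Ethernet1": "up", "Ethernet2": "up"}
observed_state = {"Ethernet1": "up", "Ethernet2": "down", "Ethernet3": "up"}

if __name__ == "__main__":
    for name, (want, got) in find_drift(intended_state, observed_state).items():
        print(f"{name}: intended={want} observed={got}")
```

A value of None on the intended side marks state that exists on the network but not in the automation, which is exactly the blind spot that manual changes create.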
Winning Back Time
Implementing automation is a manual process that requires careful thought and planning. It involves a learning curve, but it’s well worth it. If done correctly, you’ll end up with a more stable and predictable network. And in the long run, you’ll get back hours that you can devote to other things. After all, the whole point of automation is to let a machine do the work so you don’t have to.