Helping Chester Zoo IT staff ferret out network and application issues fast

The network and application infrastructure at Chester Zoo is just as complex and vital to operations as the enclosure environments that house its wildlife. When problems arise, IT staff are responsible for quickly isolating and remediating them before they impact staff or visitors. Unfortunately, the team's existing tools delivered limited visibility, were extremely complex and difficult to use, and provided no proactive network or application analysis.

Posted Monday, 14th September 2015 by Phil Alsop

Chester Zoo is a registered conservation charity located in Cheshire and welcomes more than 1.4 million visitors a year. It was established in 1931 and is the largest zoo in the UK. It is home to more than 11,000 animals and 500 different species, many of which are endangered.

Challenges
Chester Zoo's network serves more than 400 users and consists of more than 175 PCs, 300 IP phones, 100 PC-based EPOS tills and 50 printers. All of the locations are connected over an extensive network of single-mode and multi-mode fiber sites. The zoo also runs a virtual server environment off a 10 Gbps core. Together, these systems provide the infrastructure not only to manage the zoo, but also to service and sell products to visitors. When a problem happens, it can have a significant impact on employee productivity, zoo operations and visitor revenue.

"The first challenge we face as an IT department is that we're constantly in a reactive state when monitoring and troubleshooting the network," said Martin King, IT manager at Chester Zoo. "When an issue does emerge, our team is typically notified via an end-user complaint. Unfortunately, this means the damage is probably already done and our team spins into a reactive mode. We then either send a high-level engineer into the field to isolate the problem, or we leverage a consultant. Neither approach is optimal for running a high-performance network."

When Chester Zoo does use internal IT staff to troubleshoot, they face a second challenge: highly technical tools that lack the analysis capabilities needed to simplify isolation and remediation.

"Most of our tools require significant training and give detailed packet information or are very one-dimensional in functionality. This means the engineer can spend days hunting down a problem and has to cycle through mountains of data that they have to manually correlate to find a root cause. It can be an extremely tedious and frustrating process that leaves you feeling helpless," said King.

Finally, the IT team has limited network and application monitoring in place. As a conservation charity, the zoo's main focus is on the care of its animals and the safety of its visitors. For performance monitoring, the team has therefore relied heavily on low-cost solutions. In practice, this means the team is constantly probing and checking key network and application devices by hand to help ensure performance, instead of using the latest automated tools that can streamline the process.

"The solutions we have in place give us some really basic 'up or down' insight into key technologies we have deployed, but no real granular readings that can give us foresight into problems that might be building," said King. "This plays heavily into our reactive troubleshooting and planning. We address issues when they become problems and make infrastructure or bandwidth changes once we see the end result of poor performance. To meet the expectations the business has for the network, we need better visibility and a solution that allows us to quickly identify, isolate and resolve problems."
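The gap King describes between "up or down" insight and granular readings can be illustrated with a minimal Python sketch. It is not how OptiView XG or the zoo's earlier tools work; the function names, thresholds and sample values are all hypothetical and chosen only to show why a device can pass a basic availability check while a problem is building.

```python
from statistics import mean


def is_up(samples_ms, timeout_ms=1000.0):
    """Basic up/down check: 'up' if the latest probe answered within the timeout."""
    return samples_ms[-1] < timeout_ms


def is_degrading(samples_ms, window=3, ratio=1.5):
    """Trend check: compare the newest `window` samples with the earlier baseline."""
    if len(samples_ms) < 2 * window:
        return False  # not enough history to judge a trend
    baseline = mean(samples_ms[:-window])
    recent = mean(samples_ms[-window:])
    return recent > ratio * baseline


# Hypothetical response times in milliseconds for one server (assumed data).
samples = [20.0, 22.0, 21.0, 19.0, 40.0, 55.0, 70.0]

print(is_up(samples))         # the server still answers, so it counts as "up"
print(is_degrading(samples))  # but recent response times are well above baseline
```

With these sample values, the availability check reports the server as up while the trend check flags it, which is the kind of early warning a pure up/down view cannot give.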

Solution
To speed network and application analysis and troubleshooting, and get better visibility into performance, Chester Zoo selected Fluke Networks' OptiView XG. It delivers automated network and application analysis in a portable handheld solution. Team members can use it to easily find problems or gather data remotely, or take it into the field to troubleshoot at the source of an issue. Its unique troubleshooting system is based on proactive analysis, path analysis, and application-centric analysis, which provides expert guidance to Chester Zoo engineers and helps them automatically identify the root cause of problems fast.

"The OptiView XG has already paid for itself many times over and continues to save us a significant amount of time and money on a daily basis," said King. "It gives us proactive insight into information that previously we could only see by reacting with complex tools. Its built-in intelligence helps us fix the problems fast, and the mobility of the tablet means we can quickly head into the field if we need to get more detailed information from the source. I'm so impressed with the ease of use and the depth of information it provides – it's now our go-to networking guru."

Results
The OptiView XG gives Chester Zoo IT staff the ability to monitor network and application performance from a central location, and when problems do emerge, either remotely troubleshoot the problem, or take the tablet into the field.


"Last year, we had a major broadcast storm on our network and the cause was impossible to pinpoint. It had massive repercussions for the entire zoo society. We had to bring in our network consultant to help fix the problem. The first thing he did was pull out an OptiView XG and run a few quick diagnostics. He then instantly identified the offending device – a faulty module – and we replaced it and restored order. It was at that point that I added the OptiView XG to my wish list, and since we got one, we've used it every day," said King.

King and his team leave the OptiView XG plugged into the network most of the time. This allows them to quickly review status each morning to ensure performance. The review previously took several hours each day with a variety of other tools, but now with XG, it takes minutes, and it helps them proactively monitor for problems. Recently, for example, the team noticed a decrease in response times on one of the main SQL servers. The event logs showed a hung backup, and memory usage on the server was growing rapidly. The team quickly ended the backup cycle, and server performance normalized before other employees began their workday.

"When we got the OptiView XG, the goal was to allow all the members of my support team to be able to plug this device in and get instant feedback without the need for years of network training. It might not solve every problem you encounter, but it points you in the right direction and gives you suggestions on how to fix the issues," said King. "It helped turn our reactive support structure into a proactive one, and now we can fix problems before our users are affected, meaning better uptime, better user experience and more time to focus on other support activities."

Chester Zoo also uses the OptiView XG for 10 Gbps packet capture, path analysis and inline analysis of the application layer of its in-house databases. "In the past, when problems occurred it was a fight over whether the network or database team was to blame. Now with XG, we can prove if it's bandwidth, end the argument, and upgrade the pipe. It's that simple," said King.

Chester Zoo built its 10 Gbps virtual environment with help from a major multinational technology company. This network supports more than 50 virtual servers that house the majority of the zoo's key application systems, including DNS, email, EPOS and more. When the network started to experience poor performance and loss of connectivity three to four times a day, King and his team looked to the OptiView XG for answers.

"Our network core was specifically designed to support our key applications, but it was really sluggish and having a lot of problems. We plugged in the XG and were amazed to find out that we were getting only about 25 percent of the performance we should have been getting. We were able to isolate the problem to an issue with the bus on the network cards of the servers. Since the vendor had helped design the core to match up with our existing technology, we went back to them with the categorical proof of the issue and they helped us upgrade and fix the problem at cost," said King.

King and his team use the OptiView XG on a daily basis, and it continues to save his staff time and money. As in other organizations, the systems team debates with the networking team over the source of problems. With OptiView XG, King can now defuse these battles and has a solution that gives his team the information they need to solve problems quickly.

"In the end, it's about identifying and solving problems fast so the organization is not impacted. I'm amazed at how simple OptiView XG is to use, and the depth of information it provides to my team and me. When problems happen, we can find the source and prove why it happened. This gives me the ammunition I need to resolve the problem and move on," said King.