AN OPERATIONS & MAINTENANCE (O&M) programme determines to a large degree how well a data centre lives up to its design intent. A comprehensive data centre facility operations maturity model is a useful method for determining how effective that programme is, what it might be lacking, and for benchmarking performance to drive continuous improvement throughout the life cycle of the facility. This understanding enables on-going concrete actions that make the data centre safer, more reliable, and operationally more efficient.
It is important to monitor, measure, and report on the performance of the data centre so that performance, efficiency, and resource-related problems can be avoided or, at least, identified early. Beyond problem prevention, assessments are necessary to benchmark performance, determine whether changes are needed, and identify the specific steps required to reach the next desired performance or maturity level. A maturity model offers a framework for assessing the completeness and thoroughness of an O&M programme.
Ideally, an organization would perform the first assessment during commissioning for a new data centre, or as soon as possible for an existing one. Next, results should be compared against the data centre’s goals for criticality, efficiency, and budget. Gaps should be identified and decisions made as to whether any changes need to be made in the programme. Once the level of maturity has been benchmarked in this way, periodic assessments using the model should be conducted at regular intervals (perhaps annually) or whenever there is a major change in personnel, process, budget, or goals for the facility that might warrant a significant change in the O&M programme.
Developing a maturity model
Schneider Electric proposes a data centre facility operations maturity model (FOMM) whose form and function are based on the IT Governance Institute’s maturity model structure. The model is built around seven core disciplines, each with several operations-related elements associated with it. Each element is further divided into several sub-elements, ranked on a scale of 1 to 5 (where 1 represents the least mature programme and 5 the most mature).
For each of these programme sub-elements, each of the five maturity levels is defined in terms of the specific criteria needed to achieve that particular score. The score criteria, and the model they support, have been tested and vetted with real data centres and their owners. The criteria represent a realistic view of the spectrum and depth of O&M programme elements that owners have in place today, ranging from poorly managed data centres to highly evolved, forward-thinking data centres with proactive, measurable programmes.
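To make the hierarchy concrete, the structure described above (disciplines containing elements, elements containing sub-elements, each sub-element scored 1 to 5) can be sketched in a few lines of Python. The discipline, element, and sub-element names below are hypothetical placeholders for illustration, not content from the actual model:

```python
# Illustrative sketch of a FOMM score hierarchy. Each leaf value is a
# sub-element maturity score on the 1-5 scale; all names are invented.
from statistics import mean

fomm_scores = {
    "Maintenance": {
        "Preventive maintenance": {"Scheduling": 3, "Work orders": 2},
        "Spare parts management": {"Inventory tracking": 4},
    },
    "Emergency preparedness": {
        "Emergency procedures": {"EOP documentation": 2, "Drills": 1},
    },
}

def discipline_mean(discipline: dict) -> float:
    """Mean maturity score across every sub-element within a discipline."""
    scores = [s for element in discipline.values() for s in element.values()]
    return round(mean(scores), 2)

for name, discipline in fomm_scores.items():
    print(f"{name}: {discipline_mean(discipline)}")
```

A per-discipline mean like this is one simple way to feed the spidergrams and mean-score graphics discussed later in the scoring section.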
Maturity level characteristics
In order to further clarify the meaning and differences between the maturity levels, the following characteristics are suggested:
A. Level 1: Initial / ad hoc
- No awareness of the importance of issues related to the activity.
- No documentation exists.
- No monitoring is performed.
- No activity improvement actions take place.
- No training is taking place on the activity.
B. Level 2: Repeatable, but intuitive
- Some awareness of the importance of issues related to the activity.
- No documentation exists.
- No monitoring is performed.
- No activity improvement actions take place.
- No formal training is taking place on the activity.
C. Level 3: Defined process
- Affected personnel are trained in the means and goals of the activity.
- Documentation is present.
- No monitoring is performed.
- No activity improvement actions take place.
- Formal training has been developed for the activity.
D. Level 4: Managed and measurable
- Affected personnel are trained in the means and goals of the activity.
- Documentation is present.
- Monitoring is performed.
- The activity is under constant improvement.
- Formal training on the activity is routinely performed and tracked.
- Automated tools are employed, but in a limited and fragmented way.
E. Level 5: Optimized
- Affected personnel are trained in the means and goals of the activity.
- Documentation is present.
- Monitoring is performed.
- The activity is under constant improvement.
- Formal training on the activity is routinely performed and tracked.
- Automated tools are employed in an integrated way to improve the quality and effectiveness of the activity.
Who should perform the assessment?
It is important for the person or team conducting the assessment to be objective and thorough, with an “eye” for detail. Accurately determining to what degree a sub-element exists, and how consistently it is used and maintained for the facility, can be a challenge.
Organizations with little O&M experience may also have difficulty determining the best path forward once the initial score baseline has been established. While the model’s five defined maturity levels for each sub-element are specifically designed to help guide “where to go next”, some may not know the most effective steps to get there.
Those who determine they lack the required time, expertise, or objectivity would be best served by hiring a third-party service provider with solid facility operations experience. A third party is more likely to play an independent and objective role in the process, having no investment in the way things “have always been done”. There is also value in having a “new set of eyes” judge the programme: a fresh viewpoint might yield more insightful and actionable analysis.
Experienced service vendors offer the benefit of knowledge gained through the repeated performance of data centre assessments throughout the industry. This broad experience makes a third party more efficient and capable.
It also makes it possible, for example, to give customers an understanding of how their O&M programme compares with those of peers or other data centres with similar business requirements. Beyond performing the assessment and helping to set goals, experienced third parties can also provide implementation oversight, which might lead to a faster return on investment, especially when resources are already constrained.
Scoring the assessment
Useful methods used by Schneider Electric for scoring and reporting elements include spidergrams, mean-score graphics, and matrices. It may also be useful to develop and utilise a Risk Identification Chart, which visually depicts the threat of overall system disruption.
A simple matrix can be used to highlight sub-elements deemed to have unacceptable scores and to rank them by how easy they are to improve (or implement) versus their impact on operations. This is an effective way to help organizations prioritise based on FOMM goals, business objectives, time, and available resources. “Quick wins” can be easily identified and separated from items that serve longer-term, strategic objectives and might require significant changes in staff competencies and behaviours. Baselining the current implementation of the O&M programme against the organisation’s desired levels should then lead to a concrete action plan with defined goals and owners.
Conclusions
Preventing or reducing the impact of human error and system failures, as well as managing the facility efficiently, all require an effective and well-maintained O&M programme. Ensuring such a programme exists and persists over time requires periodic reviews and effort to reconcile assessment results with business objectives.
With an orientation towards reducing risk, the facility operations maturity model presented here is a useful framework for evaluating and grading an existing programme. Use of an assessment tool will enable facilities teams to thoroughly understand their programme, including:
- Whether, and to what degree, the facility complies with statutory regulations and safety requirements
- How responsive and capable staff are at handling and mitigating critical events and emergencies
- The level of risk of system interruption from day-to-day operations and maintenance activities
- Levels of staff knowledge and capabilities
Grading and assessment of results is best done by an experienced, unbiased assessor. Third-party vendors such as Schneider Electric offer facility operations assessment services.