Resilient Utility Infrastructure – Part 1: Planning and Design

by Matt Woo, P.E., RCDD, LEED AP BD+C and Steve Grayson, P.E., LEED AP

This article is part of Wood Harbinger’s newsletter series.

Resiliency in the built environment, simply put, is the ability to recover from difficult conditions that have compromised essential utility/infrastructure services. That could mean recovering from a power outage, an earthquake, a system failure, equipment failure, process interruption, and so on. Different types of buildings or campuses require different levels of resiliency: think hospitals, research labs, data centers, industrial freezers, airports, defense facilities, and the like. For some facilities, resilience may mean recovering in a matter of days. Others must be completely fault tolerant by design, because even a moment’s downtime is unacceptable due to great cost, lost production, or risk to life safety. It’s inconvenient if the power goes down for half a day at a general office, but it’s unthinkable that a critical care hospital or data center could suffer the same.

Keeping the lights on, the life-saving equipment running, the servers online, and so forth is the purpose of utility infrastructure generation and distribution systems. This two-part article will explore the key aspects of creating resilient utility infrastructure: thoughtful planning, design that addresses a facility’s current and future needs, quality construction, and ongoing attention through operations, maintenance, and training. With engaged facility stakeholders, an experienced multidiscipline design team, and diligent contractors and operations and maintenance staff, utility infrastructure can be resilient to the appropriate level.


Planning for resilient utility infrastructure and systems must address key factors including reliability, availability, maintainability, and flexibility. Reliability and availability relate to the fault tolerance of equipment and systems. Basically, through what kinds of conditions do you need your infrastructure to continue operating without interruption? Do you need it to withstand a two-hour power outage in normal weather conditions? Do you need it to withstand a two-day power outage and hurricane-force winds? Do you have distribution over water that must remain absolutely dry?

Resilient infrastructure and systems come at a cost, which is why it’s important to prioritize resilience by facility type. It’s a balancing act that weighs the first and life cycle costs of the infrastructure against the cost of system failure or downtime. In a hospital’s most critical areas, like an operating room, the cost of system failure is potentially death, so anything less than a highly resilient system is not an option. In a data center, keeping the servers online is the critical mission. The cost of downtime can be thousands to millions of dollars per second to the owners or operators of the data center itself, with countless repercussions for whatever entities rely on the data hosted by the data center. There’s a lot on the line.
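The balancing act described above can be sketched as simple arithmetic. The following is a minimal illustration, not a full life cycle cost analysis: the function name, the two options, and every dollar and outage figure are hypothetical, and the calculation ignores discounting and escalation.

```python
def total_expected_cost(first_cost, annual_om_cost, outage_hours_per_year,
                        downtime_cost_per_hour, years):
    """First cost plus O&M plus expected downtime losses over the study period.

    Simplified on purpose: no discount rate, no escalation, and a constant
    expected number of outage hours each year.
    """
    annual_downtime_cost = outage_hours_per_year * downtime_cost_per_hour
    return first_cost + years * (annual_om_cost + annual_downtime_cost)


# Hypothetical figures for illustration only.
basic = total_expected_cost(
    first_cost=500_000, annual_om_cost=20_000,
    outage_hours_per_year=8, downtime_cost_per_hour=50_000, years=20,
)
resilient = total_expected_cost(
    first_cost=2_000_000, annual_om_cost=60_000,
    outage_hours_per_year=0.5, downtime_cost_per_hour=50_000, years=20,
)

print(f"Basic system:     ${basic:,.0f}")
print(f"Resilient system: ${resilient:,.0f}")
```

With these made-up inputs, the more expensive system comes out ahead once downtime is priced in. For a facility where an hour of downtime costs very little, the same arithmetic would favor the basic option, which is the point of prioritizing resilience by facility type.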

We also factor in maintainability and flexibility as they relate to reliability and availability. If we have to take a system offline to perform maintenance, we need redundancy measures in place so that availability isn’t compromised. Similarly, infrastructure must be flexible enough to accommodate expansion and renovation without complete reconfiguration, with no or minimal service interruptions and without lengthy downtime for construction. We also must work through all the “what if X happens?” questions so that the essential resilience scenarios are addressed by design.

Once we define the required levels of reliability and availability and what systems contribute to achieving them, we can identify and minimize single points of failure and select infrastructure options that prioritize quality, durability, maintainability, and redundancy.

Codes and Standards for Resilient Design

Resilient design is regulated by codes with which certain facilities must comply and by standards that guide the decision-making process. Understanding each of them is a crucial element of the design process, one made more challenging because multiple codes and standards apply to the same facility and must be integrated.

Examples of such codes are National Fire Protection Association (NFPA) 70, the National Electrical Code, Articles 700, 701, and 702, and NFPA 70 Article 517. Article 700, Emergency Systems, describes requirements for emergency electrical systems that are essential for human life and are legally required. Article 701, Legally Required Standby Systems, describes requirements for electrical systems that are required to aid in firefighting, rescue operations, control of health hazards, and similar operations. Article 702, Optional Standby Systems, describes requirements for electrical systems that supply an optional source of power to loads not covered by Articles 700 and 701. Article 517 provides requirements specific to hospital/healthcare facilities that involve patient treatment and examination. It addresses measures such as separation of electrical services, where facilities must have separate emergency power branches for life safety systems like medical gas, critical systems (e.g., systems directly involved in patient care), and equipment systems for HVAC/smoke control, etc. The U.S. Department of Health and Human Services, state-specific Departments of Health, and the Joint Commission (healthcare accreditation organization) have their own codes, criteria, and standards for healthcare facilities as well. Each jurisdiction has either its own codes or, more commonly, adopted versions of national codes such as the International Building Code (IBC) and NFPA codes. Cities may also have very stringent fire codes that must be followed for all conditions and facility types.

Another example is ANSI/TIA-942, the standard for data center design. It identifies four ratings/tiers that define different levels of redundancy and fault tolerance for achieving reliable and available facilities. For example, a Rated-3/Tier 3 data center is “concurrently maintainable,” meaning servicing and replacement can be accomplished without affecting any end-user capabilities. This means the equipment and systems serving the data center are all redundant and there are multiple distribution paths.

The Department of Defense uses the Unified Facilities Criteria (UFC), which governs design for military installations and includes extensive provisions for system resiliency, due to the high-security and sensitive nature of many spaces in military and defense facilities. The UFC includes specialized measures like Anti-Terrorism and Force Protection, Lightning and Static Electricity Protection Systems, Physical Security Measures for High-Risk Personnel, and others that have a unique impact on designs. Typically, designs for public buildings must achieve resiliency without compromising life safety, but in military installations, that may not be the case.

Designing for Redundancy

The real challenge comes in defining the “right” infrastructure options and configurations that meet the codes and best enable resiliency for the specific facility type. Redundancy is the common theme for designing resilient utility generation systems for chilled water, steam, hot water, and power. It’s commonly referenced as N+1 or N-1, and which term applies depends on the service being discussed. N+1 describes the normal utility or system service plus one redundant unit; N-1 describes the loss of the normal utility or system service, leaving one level of backup or alternate source or system to carry the load. A more robust design topology provides two fully redundant, mirrored systems. In this design, any component in one system can be taken down for maintenance without affecting the operation of the other system. Determining the level of system redundancy needed requires a clear definition of the risks involved with the loss of a utility, including lost production, lost revenue, and possible loss of life. Some facilities are known to require N+4 or N-4, which involves a great deal of planning and funding.
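To get a feel for what each added redundant unit buys, the sketch below estimates the probability that enough units are available at a given moment. It is an illustration under strong assumptions that the article does not make: units are identical, they fail independently of one another, and each has the same steady-state availability. The function name and the 99% figure are hypothetical.

```python
from math import comb


def system_availability(units_required, units_installed, unit_availability):
    """Probability that at least `units_required` of `units_installed`
    identical, independent units are available (binomial model)."""
    a = unit_availability
    return sum(
        comb(units_installed, k) * a**k * (1 - a) ** (units_installed - k)
        for k in range(units_required, units_installed + 1)
    )


# Hypothetical plant: N = 2 units needed to carry the load,
# each unit assumed available 99% of the time.
n = 2
for spares in (0, 1, 2):
    availability = system_availability(n, n + spares, 0.99)
    print(f"N+{spares}: {availability:.6f}")
```

Real systems rarely fail independently (a shared fuel supply, control system, or flood can take out several units at once), so a result like this is best read as an upper bound that motivates the search for single points of failure.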

For the electrical power system, providing redundancy through multiple service sources can be accomplished with radial distribution served from separate substations, or grid services from separate utilities and substations. Redundant power systems require the designer to have a full understanding of all utility service sources, any possible transfer schemes between these utility sources, and the protective relaying and automatic controls, including how those could affect the stability and reliability of the full system. Some installations use renewable power sources like photovoltaic systems, hydropower, wind turbines, and fuel cells to supply alternate sources of service. Cogeneration, or a combined heat and power plant, is another option for redundant service. A combined heat and power plant integrates steam/hot water and electricity production. Either electricity is a by-product of hot water/steam generation (thermal lead) and is used by the facility or pushed out to the grid, or thermal energy is a by-product of electricity production (electricity lead) and is used by the facility or blown off to the atmosphere.

Backup emergency power systems include generators, uninterruptible power supplies (UPS), fuel cells, and battery banks. The different options for UPS systems, battery and flywheel, each have their pros and cons. Check out Matt Woo’s blog for a detailed comparison.

Resilience in mechanical infrastructure is a particular challenge because systems are multi-layered. There are many different pieces of equipment that make up the whole system, and for it to reach a certain level of redundancy, each component of the system must be resilient, too. A redundant chilled water system, for example, requires multiple chillers and pumps provided in parallel to accommodate variances in load conditions and redundancy for failure, as well as an abundance of isolation valves upstream and downstream of each device so that equipment can be shut down for maintenance or construction without disrupting operations. We’ve never heard someone complain that there were too many isolation valves, but we have heard complaints when there weren’t enough. Once a section of the system is locked out electrically, it can be isolated mechanically, and service rerouted to the redundant equipment to maintain operations.
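One quick way to sanity-check a parallel equipment lineup like the chiller plant described above is to ask whether the remaining units can carry the peak load with the largest unit out of service. The sketch below assumes that simple interpretation of N+1; the function name and the capacities in tons are hypothetical, and it says nothing about pumps, valves, or controls, which need the same scrutiny.

```python
def meets_n_plus_1(unit_capacities_tons, peak_load_tons):
    """Return True if the plant still meets peak load with its
    largest single unit out of service."""
    if not unit_capacities_tons:
        return False
    remaining = sum(unit_capacities_tons) - max(unit_capacities_tons)
    return remaining >= peak_load_tons


# Hypothetical plants serving a 1,000-ton peak load.
print(meets_n_plus_1([500, 500, 500], peak_load_tons=1000))  # three chillers
print(meets_n_plus_1([500, 500], peak_load_tons=1000))       # two chillers
```

Removing the largest unit, rather than an average one, is the conservative choice when units differ in size.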

Loop-grids, primary-select, main-tie-main, and zone-loop grids are examples of resilient options for normal distribution of utilities including power, water, steam, and compressed air. They are beneficial for flexibility because they enable isolating segments for maintenance or for constructing new buildings without interrupting service to any other or existing buildings. For example, Wood Harbinger recently proposed a zone-loop-grid electrical distribution system for a college campus that will support redundant capacity to handle campus loads and also enable maintenance and campus expansion without compromising day-to-day operations. We also used a looped configuration for a compressed air system at a Boeing manufacturing plant. The factory building had an underground, 6-inch diameter compressed air distribution system to provide appropriate pressure to the tools at each plane build position. With multiple feeds into the loop, the system provided consistent pressure throughout. We have also designed fire suppression systems for piers and dry-docks that have the option of utilizing fresh water, salt water, or both.
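The benefit of a loop over a radial layout can be shown with a small connectivity check: isolate one segment at a time and see whether every building can still be reached from a source. This is a simplified sketch that treats the distribution system as a plain graph and ignores capacity, pressure, and protection settings. The node names and the two example layouts are hypothetical.

```python
from collections import deque


def served_nodes(segments, sources, isolated=None):
    """Return the set of nodes reachable from any source when the
    `isolated` segment (a pair of node names) is taken out of service."""
    adjacency = {}
    for a, b in segments:
        if isolated is not None and {a, b} == set(isolated):
            continue  # this segment is valved off / switched open
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)

    seen = set(sources)
    queue = deque(sources)
    while queue:  # breadth-first search outward from the sources
        node = queue.popleft()
        for neighbor in adjacency.get(node, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return seen


def segments_that_drop_load(segments, sources, loads):
    """List the segments whose isolation leaves at least one load unserved."""
    return [
        seg for seg in segments
        if not set(loads) <= served_nodes(segments, sources, isolated=seg)
    ]


# Hypothetical campus: one plant feeding buildings A, B, and C.
radial = [("plant", "A"), ("A", "B"), ("B", "C")]
loop = radial + [("C", "plant")]  # close the loop back to the plant
loads = ["A", "B", "C"]

print("Radial:", segments_that_drop_load(radial, ["plant"], loads))
print("Loop:  ", segments_that_drop_load(loop, ["plant"], loads))
```

In the radial layout every segment is a single point of failure for something downstream, while in the loop any one segment can be isolated and all three buildings stay served.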

In addition to providing backup power for equipment in critical care and high-availability facilities, backup power must also be provided for the supporting cooling systems, control and automation systems, fire detection and suppression systems, lighting and control systems, and any other systems required for facility operations, such as security and access control, nurse call, etc.

Coordinated Design

The resilience challenge increases when systems must also be protected from events like earthquakes, lightning, or explosions/serious impacts. This requires highly coordinated design between mechanical and electrical engineers like us and our structural engineering partners. Not only must MEP systems be resilient and survivable in order to maintain service and building operation, but other building infrastructure must be resilient and survivable as well. Structural considerations such as the risk category of a building may affect the component importance factor needed for the seismic restraint design of mechanical and electrical nonstructural components. Some nonstructural components, such as those required to function for life-safety purposes after an earthquake (fire protection sprinkler systems and fire alarm systems, for example), must meet the higher component importance factor in the seismic restraint design even in a lower risk category building. Mechanical and electrical equipment for designated seismic systems requires manufacturer’s certification to show that the equipment will remain operable following an earthquake.

Other examples include fire-stopping of MEP penetrations through fire-resistive floors, walls, and partitions that must remain in place and maintain the floor, wall, and partition fire rating during a fire event. We don’t want any critical system failing as a result of a weak link in some related system.

Additional design considerations include thinking about what happens after an emergency. If systems will need to be shut down, for example, can they be shut down easily? To enable this, things like valves, control panels, or disconnects need to be in accessible locations for easy shutdown or switchover. Connection points may also be needed for temporary equipment brought in after an event.

From comprehensive planning to codes and standards knowledge to applying all of that information in designing the “right” system, there’s a lot to think about when it comes to resilient utility infrastructure planning and design for critical facilities. Planning and design are only half of the equation. Successfully implementing the design during construction and operating the systems effectively over the life of the facility are multi-step processes unto themselves. Check out Part 2 of this article series to continue exploring how resilient utility infrastructure comes to fruition.
