What is the most resilient system you can think of? The Internet? A utilities provider? First-responder services?
Behind each robust system is a delicate balance of responsiveness, efficiency and highly trained individuals working together to keep the show on the road.
In uncertain times, it is important to think about such examples in order to understand the underlying principles that make some systems more resilient than others.
The Internet, electrical utilities providers and first-responder services are reliable-first: they are built to function when the unexpected occurs or when demand spikes.
So, for business leaders the challenge here is conceptually very simple: how can we develop a best-of-both-worlds flexible solution, where we build reliability into our operations without giving up on efficiency? In graphical terms, are there tactics we can deploy to reach the top-right corner of the matrix in the figure below?
Of course, there is no silver-bullet solution. But there are many steps you can take to resolve the trade-off to some degree. Building on insights from some of these reliable-first settings and from academic studies, here are five specific suggestions.
1. Focus on the weak links
Every system can be divided into subsystems or components, and with some simple analysis you can identify which subsystems are creating the biggest threat to reliability, and then focus your attention on those. It turns out that often it’s not the most expensive or elaborate components that create the biggest problems. An academic study at Ford Motor found that the biggest potential performance impact (of failure) came from its smaller suppliers. In the context of the pandemic, we can see something similar going on – for example, the problems in the UK healthcare response centred on a lack of Personal Protective Equipment (PPE) and the unavailability of hospital beds.
2. Create transparency and trust through the system
If you don’t know anything about a supplier – for example its financial situation or its own internal resilience – you tend to assume the worst and build contingency plans around the possibility of it failing. Moreover, lack of visibility in supply chains often creates huge problems, for example the ‘bullwhip’ effect where small demand fluctuations downstream create huge fluctuations upstream in the chain. These problems can all be reduced with greater transparency – information sharing – among parties, because problems higher up or lower down the chain can then be anticipated and planned for.
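The bullwhip effect described above can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not a model taken from the studies mentioned here: each stage in the chain sees only the orders of the stage below it, forecasts with a simple moving average, and orders up to a target based on that forecast. The function names, the lead time, the forecast window and the demand figures are all hypothetical choices made for illustration.

```python
import random
from statistics import pvariance


def stage_orders(demand, lead_time=2, window=4):
    """Orders placed by one stage that only observes `demand` from below.

    Assumed policy (hypothetical): moving-average forecast plus an
    order-up-to target covering the lead time and the current period.
    """
    orders = []
    prev_target = None
    for t in range(len(demand)):
        history = demand[max(0, t - window + 1): t + 1]
        forecast = sum(history) / len(history)
        target = (lead_time + 1) * forecast  # order-up-to level
        if prev_target is None:
            order = demand[t]  # first period: simply pass demand through
        else:
            # Replace what was sold, plus any change in the target level.
            order = demand[t] + (target - prev_target)
        orders.append(max(0.0, order))  # orders cannot be negative
        prev_target = target
    return orders


def simulate_chain(n_periods=500, n_stages=3, seed=0):
    """Return [customer demand, stage 1 orders, stage 2 orders, ...]."""
    rng = random.Random(seed)
    # Assumed customer demand: stable around 100 units with small noise.
    customer = [rng.gauss(100, 3) for _ in range(n_periods)]
    flows = [customer]
    for _ in range(n_stages):
        flows.append(stage_orders(flows[-1]))
    return flows


flows = simulate_chain()
variances = [pvariance(f) for f in flows]
names = ["customer", "retailer", "wholesaler", "factory"]
for name, v in zip(names, variances):
    print(f"{name:10s} variance of orders: {v:10.1f}")
```

Under these assumptions, the variance of orders grows at each step upstream even though customer demand barely moves. If every stage could see the customer demand directly (the information sharing discussed above), each would forecast from the same stable signal rather than from the already-amplified orders of its neighbour.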
Greater transparency also goes together with trust. In 1997, one of Toyota’s suppliers called Aisin suffered a factory fire that threatened to halt production of all Toyota’s vehicles. Within days, Aisin mobilised dozens of firms in its network to manufacture replacement ‘P Valves’ to Toyota’s exacting standards. Knowledge sharing, trust, and mutual dependency among these firms all played their part in keeping the just-in-time production system on track, and they also pay off in normal times when these firms collaborate to improve quality or reduce costs.
3. Simplify your interfaces for quicker coordination
One key reason why the Internet works so well is the ‘interoperability’ of the systems and components that feed into it. There are internationally recognised protocols for coding and sharing data, which reduces the reliance on any one part of the system. The electricity grid, likewise, takes in energy from multiple sources and it has standardised ways of distributing and storing energy. Contrast this with the automotive industry where there are few common standards across manufacturers for electronic components – which means relatively few suppliers and greater risk of disruption. This applies to people and teams, too: many organisations designed for high reliability (such as hospital emergency departments) have standardised roles (e.g., paramedics, nurses, junior doctors, consultants) and team structures (shifts, wards). As a result, individuals and teams can be easily substituted or supplemented when the need arises.
It’s not enough just to have simple and well-defined interfaces – you also need a way to coordinate things so that supply and demand can be matched quickly. Consider Uber’s scheduling system – drivers are matched with customers in seconds, and if one driver rejects the match, another steps in. Uber may not be 100% reliable, but it’s a vast improvement on what we used to put up with. More broadly, many companies have opportunities to use their staffing and planning processes to allocate resources dynamically as demand changes.
4. Invest in fungible (general purpose) resources
When there is a shock to a system, there is often a huge spike in demand for certain resources. If those resources are highly specialised – for example ventilators – you need to stockpile them to cope. But if they are fungible, meaning that they can be deployed in several ways, capacity planning is much easier. Hospital beds were mostly full at the start of the COVID-19 crisis, but they were quickly made available (by sending less-unwell patients home) to cope with it. The same logic applies in other settings. Consider the world of business education: some universities had stronger digital capabilities than others before the pandemic hit, and this allowed them to scale up online learning quickly. But these investments were typically made with a general view of digital learning becoming more important, rather than specific concern over the risk of a pandemic.
There is a human side to this point as well. Investing in the general development of your employees – and cross-training them in multiple activities – is a good thing in general but particularly so as a way of coping with uncertainty. Firefighters aren’t just trained to fight fires – they are skilled at fire prevention and a range of other emergency services as well.
5. Keep people fit and alert
It’s not enough just to have equipment and people that are available to be redeployed at a moment’s notice; they must also be willing and able. People working in the fire service, where active fire-duty is a small part of the job, spend lots of time in training and doing drills. And people working in high-reliability settings like power stations, air traffic control or mines are trained to think holistically about the work they are doing (rather than in narrow silos). They are drilled in safety and security techniques, and they take learning from near-failures (e.g. a lost time injury) very seriously.
How do you get your employees to behave in these ways? A huge part of operational resilience is cultural, so you can encourage these behaviours through what you say and how you say it. You also need good metrics and aligned incentives. Is there a risk, for example, that a buyer in your organisation might select a low-price supplier in a dangerous location or in poor financial health because of an aggressive spend reduction target? High-reliability organisations are very careful about what types of behaviours get rewarded, to ensure that profit-seeking doesn’t drive out safety.
Operational resilience in your organisation
There is a range of things you can do to improve the reliability of your operations without spending a fortune on stockpiles and alternative sources of supply. Sometimes it is about homing in on the weak points in your operations and making sure they don’t fail. Sometimes it is about designing the overall system better (greater transparency, standardised interfaces). And at other times it is about investing in the human side of the system to make sure things work as they are supposed to.
Obviously, the mix of tactics depends on your specific circumstances. But hopefully the ideas and examples provided here will help you think more creatively about this important challenge.
Jérémie Gallien is Professor of Management Science and Operations and Chair of the Management Science and Operations Faculty at London Business School. Julian Birkinshaw is Professor of Strategy and Entrepreneurship and Deputy Dean for Executive Education and Learning Innovation. We are grateful to our colleague Alex Yang, Associate Professor of Management Science and Operations, for his input.