Software has become a major feature of rolling stock design, maintenance, and operation. Even in the 1990s, your writer counted over 200 microprocessors fitted to systems and sub-systems on each new eight-car Central line train. That was a comparatively simple train by today’s standards, which might see as many as 200 microprocessors per car. With all these systems needing to communicate, the challenge is to make sure that they are all integrated and work together properly.
Another challenge arises from modern trains communicating with the outside world, leading to the risk that bad actors might try to take control of the train and render it unsafe or unusable. To explore this, at the end of February 2024 Rail Partners hosted a seminar at the UK Rail Research and Innovation Network’s facility at the University of Birmingham.
Dr George Bearfield, director of health, safety and cyber security at Rock Rail and chair of the RSSB Asset Integrity Group, gave the keynote address. To illustrate the emerging challenges in rail arising from a lack of visibility and understanding of software-driven systems, he used the example of a recent, high-profile rolling stock software issue. A regional train operator in Poland had bought some new trains from a domestic supplier. After a nominal 1,000,000km of operation, they needed overhaul, which the operator sub-contracted to a third party. When the first train had been completed, they found that it would not work; systems that had worked previously no longer did so. The same thing happened on the second train. In one failure, the driver’s display (HMI) indicated that the train was ready to move and the brakes would release, but the train would not motor.
The manufacturer blamed the overhauler. The overhauler thought there was something odd in the software and called in a group of so-called ethical hackers (investigators). The train’s main system controller was an off-the-shelf PLC, and the investigators used the PLC supplier’s debugging tools to compare the data traffic of working and non-working trains and so identify areas of code meriting further investigation. They found numerous issues including: every train had different software; trains would lock up if they hadn’t operated at over 60km/h for a length of time; there were several locations, including, allegedly, the overhauler’s premises, that were geofenced to lock the software; sometimes the power inverter was disabled.
In addition, the secondary compressor (necessary to raise the pantograph) was disabled; the emergency stop was sometimes triggered; and there were ‘cheat codes’ that could be entered in the HMI to unlock the software. Needless to say, the manufacturer vehemently denied that it had implemented these features.
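One of the findings above, that every train carried different software, is the kind of thing a client could check routinely by comparing fingerprints of the software images across a fleet. The following is a minimal sketch only, assuming each unit’s controller image can be exported as raw bytes. The unit names and image contents are made up for illustration, and this is not the method the investigators are reported to have used.

```python
import hashlib
from collections import defaultdict


def fingerprint(image: bytes) -> str:
    """Return the SHA-256 hex digest of a software image."""
    return hashlib.sha256(image).hexdigest()


def group_by_fingerprint(images: dict[str, bytes]) -> dict[str, list[str]]:
    """Map each distinct fingerprint to the sorted list of units carrying it."""
    groups: dict[str, list[str]] = defaultdict(list)
    for unit, image in images.items():
        groups[fingerprint(image)].append(unit)
    return {digest: sorted(units) for digest, units in groups.items()}


def outliers(images: dict[str, bytes]) -> list[str]:
    """Return units whose image differs from the most common one in the fleet."""
    groups = group_by_fingerprint(images)
    if not groups:
        return []
    # Treat the largest group as the baseline; everything else is flagged.
    baseline = max(groups.values(), key=len)
    return sorted(
        unit
        for units in groups.values()
        if units is not baseline
        for unit in units
    )


# Hypothetical fleet: two units share an image, one does not.
fleet_images = {
    "unit-001": b"controller build A",
    "unit-002": b"controller build A",
    "unit-003": b"controller build A with extra logic",
}
print(outliers(fleet_images))
```

A mismatch shows only that images differ, not why. Deciding whether a difference is a legitimate configuration change or something else still needs the configuration records and the supplier’s evidence.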
George said that this example was extreme and that software issues are usually “more cock-up than conspiracy”, but he used it to promote a culture that encourages people to do the right thing or, as he put it, “culture is what people do when no one is looking”.
Cyber security was next, and George suggested that there is often insufficient technical competence to be assured that assets are safe and secure, something that has to be continuously reviewed over asset lives. In cyber security people talk about Information Technology (IT) and Operational Technology (OT), terms that will pop up again in this article. There is an increasingly grey area between IT and OT as digitalisation and the internet of things continue to develop.
Operational technology, the use of computer and software technology in the factory and in transportation, is using more and more internet connectivity to gather monitoring data and support maintenance. Managing and protecting such connected systems is normal for IT departments but is quite new for OT. Moreover, IT and OT increasingly overlap, especially where trains transmit huge datasets to the cloud, where they are analysed on IT business systems.
George said that the only way to ensure that safety-critical software is acceptably safe is to work to EN 50128 ‘Railway applications – Communication, signalling and processing systems – Software for railway control and protection systems’ (now being replaced by EN 50716 ‘Railway applications – Requirements for software development’).
To comply with the standard, the supplier must have a software quality assurance plan detailing its full validation & verification and testing activity. So, he said, “insist on seeing it and, when you review a change, make sure that the evidence you see confirms that the standard has been applied and that the software validator – the key competent party providing assurance – has signed it”. If enforced, this will allow clients to ensure a degree of rigour in software assurance, will drive the correct behaviours, and will help keep liabilities in the right place.
George issued a fairly gloomy warning: we have been through a period of ransomware attacks, the current unstable geopolitical situation increases the likelihood of attacks, and clients are unlikely to have clear visibility of the supply chain for all aspects of systems. As he put it, “you’re going to need to say what you need in contracts to be secure, and insist on getting it”.
His final comments were about Artificial Intelligence (AI) and he suggested that it should be kept well away from safety systems. He thought, though, that it might be useful in driving cyber security defence as we need to keep ahead of our adversaries who will also use AI to help drive their offensives.
Standards
RSSB’s Darren Fitzgerald expanded on standards and assurance. He said that software itself is only part of the issue that organisations have to consider and he looked at it as a multilayered approach. Clearly assurance is intended to show that software will perform as expected, doesn’t introduce vulnerabilities, doesn’t reverse previous fixes and, of course, operates consistently well. Darren outlined RIS-0745-CCS ‘Client safety assurance of high integrity software-based systems for railway applications’ which includes a great deal of useful guidance and refers to over 30 other standards and other relevant documents. It is intended to help clients be curious about what, hitherto, they might have considered to be black boxes.
Darren said that it is generally not possible to confirm the behaviour of software-based systems by testing all possible combinations of input. The risk of hazardous software failures is therefore controlled using rigorous techniques for specifying, designing, coding, analysing, and testing the software. Such an approach will introduce fewer defects and find more of those that are introduced, so that they can be removed. However, he added, this might not be efficient or necessary for systems that are less critical to safety.
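To give a feel for why exhaustive testing is out of reach, here is a rough calculation in Python. The input counts and the test rate of one million tests per second are illustrative assumptions, not figures from the seminar, and the sketch only counts simple on/off inputs.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60


def input_combinations(binary_inputs: int) -> int:
    """Number of distinct input vectors for a given count of on/off inputs."""
    return 2 ** binary_inputs


def years_to_test_all(binary_inputs: int, tests_per_second: float) -> float:
    """Years needed to try every input vector once at the given test rate."""
    return input_combinations(binary_inputs) / tests_per_second / SECONDS_PER_YEAR


# Illustrative input counts; the rate of 1,000,000 tests/s is an assumption.
for n in (16, 32, 64):
    print(
        f"{n} inputs: {input_combinations(n):,} combinations, "
        f"{years_to_test_all(n, 1_000_000):.3g} years at 1,000,000 tests/s"
    )
```

With 64 on/off inputs, a single pass would take more than half a million years at that rate. Real systems are harder still, because their behaviour also depends on timing, analogue values, and internal state built up from earlier inputs.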
Darren emphasised the importance of rigour in developing the software. This includes describing the software functions in words and evaluating risks in an event tree that explains what might happen if any of the functions should fail. The results could be fed into Bowtie analysis, with different forms of testing applied as barriers that mitigate an operational hazard by minimising the impact of its threats and consequences.
Operational view
Stacy Thundercliffe and Alex Saxton from Avanti West Coast gave an operator’s experience with new and existing fleets, or “super computers on wheels” as Alex described them. Avanti operates the 20-year-old Alstom Pendolino fleet, which is soon to be joined by Hitachi bi-mode Class 805 and electric Class 807 fleets. All are, or will be, maintained by Alstom with appropriate support from Hitachi. The Alstom trains are owned by Angel Trains and the Hitachi trains by Rock Rail. As an example of changing technology, Pendolinos were originally fitted with an innovative system – 3.5mm stereo jack sockets to connect to an on-board audio service. Almost as the trains entered service, this was rendered obsolete by smartphones and demand for on-board Wi-Fi. Wi-Fi provided the opportunity to connect more and more onboard systems to the shore IT systems.
Alex said that the IT world is some way ahead of the physical train world and the trains are having to catch up. This is not helped by the fact that the Pendolino is old in computer terms; Avanti has inherited legacy onboard systems running updated versions of software that was originally coded 25 years ago. Alex said that some of these systems are being updated every day and that this is no easy task. The organisation regularly assesses the risks it faces, and adapts its approach to cyber strategy appropriately to consider IT and OT. He added that visual “heatmaps” have been an effective tool – illustrating GDPR, Compliance with the Network and Information Systems Directive (NIS), Penetration Tests, Vulnerabilities & Management, Safety, Maintenance, Manufacturing and Maintenance contract issues and tasks.
The cyber risks are real. Avanti is a partnership between First Group and Trenitalia. Stacy remarked that in 2022, Trenitalia suffered a ransomware attack on its IT retail systems as a result of someone opening a link in a genuine-looking email. This sort of attack leads to the very serious prospect that management might operate a kill switch on systems, with much more impact on the operation than on the area affected. With the gradual connection of more and more train systems to the shore, there is a risk that any of those new and improved systems could be hacked, leading to misinformation, loss of revenue, loss of personal data, impact on services, safety, penalties, and/or reputational damage.
Stacy said that her organisation is reinforcing its fleet and onboard system technology team with expertise to match that of its IT experts. She added that the new Hitachi Rail AT300 fleet will enter service with state-of-the-art cyber monitoring devices linked to First Group’s Security Operations Centre, and work is ongoing with Angel Trains to develop best-in-class cyber threat protection on the Pendolino 390 fleet.
Network Rail and the wider system
Peter Gibbons, Network Rail’s chief security officer, expressed the firm view that managing cyber security is just another part of health and safety management, part of an operator’s duty under the Railways and Other Guided Transport Systems (Safety) Regulations (ROGS). He added that NIS has similar requirements for cyber security and applies to two groups of organisations: operators of essential services (OES) and relevant digital service providers (RDSPs). OES are organisations that operate services deemed critical to the economy and wider society, e.g., duty holders involved in railway operation. In summary, NIS requires:
- An OES must take appropriate measures to manage risks posed to the security of information systems on which their service relies.
- An OES must take measures to prevent and minimise the impact of incidents affecting the security of the information systems, with a view to ensuring the continuity of those services.
The measures must take account of the state of the art, to ensure a level of security of information systems appropriate to the risk posed.
Peter said that these requirements are similar to those in ROGS, where an operator is required to hold a safety certificate confirming that it has a safety management system describing how it runs its transport system safely, which includes identifying risks and managing them to as low as reasonably practicable. Not only that, he added, but managing cyber security risk is also required to meet contractual obligations. In extremis, a successful cyberattack could put an operator out of business.
Peter suggested three things operators should do about cyber security:
- Stop pretending it’s different/special/difficult/specialist (i.e. not your job).
- Review existing business practices and make sure that managing cyber threats is baked in.
- Normalise talking about cyber risk management, balancing the need to keep secrets with the need to be transparent and to collaborate.
Digital maintenance
Dr Emma Taylor from RazorSecure concluded the event with a talk about digital maintenance. Digital maintenance was described by Emma as “any corrective or preventative maintenance carried out on electronic systems on board the train. It is characterised by the use of electronic equipment (e.g., service laptops, USB devices and digital media) to connect to an on-board system to view data, diagnose performance issues, and test and/or update systems.”
Typical digital maintenance activities might include: updating software; management of onboard network equipment; repair, replacement or upgrade of equipment; management of data files; management, update and configuration of service laptops, test equipment, and digital media; and finally, of course, cyber security.
Cyber protection techniques should include:
- Control of user credentials and privileges.
- Management of anonymous access and insider threats.
- Central management of files and applications.
- Isolation of the train from service laptops and potential malware.
- Secure remote access.
- Tracking maintenance activity including session logging, monitoring, and configuration control.
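The last item in the list, tracking maintenance activity with session logging, can be illustrated with a tamper-evident log in which each entry is chained to the one before it by a hash. This is a minimal sketch in Python under stated assumptions: the class, field names, and example entries are hypothetical, timestamps are left out for brevity, and it is not a description of any supplier’s product.

```python
import hashlib
import json
from dataclasses import dataclass, field

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(prev_hash: str, record: dict) -> str:
    """Hash the previous entry's hash together with this record's content."""
    payload = prev_hash + json.dumps(record, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


@dataclass
class MaintenanceLog:
    """Append-only log in which each entry is chained to the one before it."""

    entries: list = field(default_factory=list)

    def append(self, user: str, system: str, action: str) -> None:
        """Record who did what to which system, linked to the previous entry."""
        prev_hash = self.entries[-1]["hash"] if self.entries else GENESIS
        record = {"user": user, "system": system, "action": action}
        self.entries.append({"record": record, "hash": _digest(prev_hash, record)})

    def verify(self) -> bool:
        """Recompute the chain; an edited or reordered entry, or one removed
        from anywhere but the end, makes the stored hashes stop matching."""
        prev_hash = GENESIS
        for entry in self.entries:
            if entry["hash"] != _digest(prev_hash, entry["record"]):
                return False
            prev_hash = entry["hash"]
        return True


# Hypothetical maintenance sessions.
log = MaintenanceLog()
log.append("technician-a", "door-controller", "software update")
log.append("technician-b", "passenger-wifi", "configuration change")
print(log.verify())  # chain is intact

log.entries[0]["record"]["action"] = "no change"  # simulate tampering
print(log.verify())  # chain no longer matches
```

The chain on its own does not detect entries trimmed from the end of the log. In practice the latest hash would also need to be held somewhere the maintainer’s laptop cannot alter, which ties back to the central management and secure remote access items in the list.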
Concluding, Emma said that different projects have different requirements, and are deployed in different ways. Some have a cyber security focus, others an operational focus. The digital systems’ lifecycles can be complex, multi-step and iterative. All require some form of onboard installation (software, hardware) and active monitoring & management of digital systems (including networks). Moreover, there is a greater use of cloud-based systems and rail businesses are increasingly dependent on the data such systems generate.
Summary
This seminar made it clear that any view that software, once created and working well, is maintenance free is somewhat naïve. Increasingly complex systems and continual change to meet more demanding requirements mean that continual effort is needed to ensure that trains continue to function as required. Moreover, software design, safety and cyber security are intertwined, and the latter is an ever-evolving issue.
During the discussion, the wealth of useful guidance from the National Cyber Security Centre (NCSC), available on its website, was mentioned. Designers, suppliers, operators, and maintainers need to be aware of the threats and organisations need to employ people whose job it is to test the defences of the various systems. As one speaker said, when it comes to cyber security it’s best to work on the basis that the glass is half empty.
With thanks to Rail Partners’ Neil Ovenden and Susie Beevor for the opportunity to attend this event.
Figure: Software suddenly not working (from the keynote address).
Figure: RazorSecure Digital Maintenance Project.
Lead image credit: Malcolm Dobell