2019 Assessment of the Reserve Bank Information and Transfer System

3. Material Developments

This section draws out material developments relevant to RITS that have occurred since the 2018 Assessment. This Assessment covers the period from April 2018 to March 2019. Over this period, there have been material developments that are relevant to the Principles concerning operational risk (Principle 17), legal basis (Principle 1), tiered participation arrangements (Principle 19) and communication procedures and standards (Principle 22). To complement this section, background information on how RITS operates, activity and participation in RITS, and the operational performance of RITS over the assessment period is set out in Appendix A. A detailed assessment of how RITS meets the Principles (incorporating developments discussed in this section) is presented in Appendix B.

3.1 Operational Risk Management

3.1.1 30 August Power Outage

Shortly before 11 am on Thursday 30 August, the Bank experienced a disruption to the power supplying the data centre at one of its sites. The outage was caused by the incorrect execution by an external party of routine fire control systems testing in that data centre, which initiated an unplanned shutdown of all primary and back-up power supplies supporting the data centre. The power loss abruptly cut off all technology systems operating from that data centre, including those supporting RITS.

While payments in the 9 am LVSS batch and the eftpos and Mastercard batches had settled, all other RITS services and connectivity to feeder systems were affected. Payment and settlement systems were gradually restored throughout the afternoon using infrastructure at the Bank's other site. FSS, which settles transactions on the NPP, recommenced normal operations after three hours. Key RITS systems and connectivity to feeder systems were restored progressively throughout the afternoon, with the final connection (with Austraclear) re-established around seven and a half hours after the power failure. Settlement systems were fully operational from this point, but back-up processing capability for RITS at the affected site was not fully restored until the following Saturday. All transactions submitted to RITS on 30 August were settled by the end of the day.

Impact on RITS and feeder systems

The PFMI note that a financial market infrastructure (FMI) should aim for its critical IT systems to be able to resume operations within two hours following disruptive events and to complete settlement on the day of the disruption in extreme circumstances. RITS services took longer to recover than this recovery time objective (RTO) because of the scale of the event, the loss of all ancillary support systems and difficulties in technicians gaining privileged access to systems. Loss of access to documentation systems that store support procedures also impeded the effectiveness of staff working to re-establish RITS.

Since the FSS is required to settle real-time payments via the NPP on a 24/7 basis, the Bank has set the availability target for FSS at 99.995 per cent (compared with 99.95 per cent for RITS), which equates to an average of around 26 minutes of allowable downtime per year. Consistent with this, on 30 August the Bank's executive management gave priority to recovering FSS before commencing the recovery of RITS. Ordinarily, this would not cause a material delay to the recovery of RITS, since FSS is designed to recover automatically when one of the sites becomes unavailable. However, due to a combination of factors, not all systems recovered as expected. The large-scale loss of supporting technology services and related delays in gaining immediate access to highly secure systems to diagnose the issue and restore services meant that full recovery of FSS took three hours. Clearing of NPP payments was able to continue during the outage, and some NPP participants implemented contingency plans to make funds available in beneficiaries' accounts for lower-value payments ahead of FSS settlement resuming or took steps to re-route the clearing of customer payments through the direct entry system.
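The downtime figure quoted above follows directly from the availability target. The following is a minimal sketch of that arithmetic, assuming continuous 24/7 operation over a 365-day year; the function and variable names are illustrative and are not drawn from any Bank system.

```python
# Assumption: FSS is scheduled to operate 24/7, so a (non-leap) year
# contains 365 * 24 * 60 = 525,600 scheduled minutes.
MINUTES_PER_YEAR = 365 * 24 * 60


def allowable_downtime_minutes(availability_pct: float,
                               scheduled_minutes: float = MINUTES_PER_YEAR) -> float:
    """Return the downtime budget implied by an availability target."""
    return scheduled_minutes * (1 - availability_pct / 100)


# FSS availability target stated in the text: 99.995 per cent.
fss_budget = allowable_downtime_minutes(99.995)
print(f"FSS annual downtime budget: {fss_budget:.1f} minutes")

# The FSS recovery on 30 August took about three hours (180 minutes).
outage_minutes = 3 * 60
years_of_budget = outage_minutes / fss_budget
print(f"Outage as a multiple of the annual budget: {years_of_budget:.1f}")
```

The same function could be applied to the 99.95 per cent RITS target only if `scheduled_minutes` were set to the scheduled RITS operating time, which this section does not state (the measurement approach is described in Appendix A, section A.2).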

At the time of the outage, RITS was operating from the affected site as were the servers that automate aspects of the failover of the RITS database. The prioritisation of recovery of FSS caused a delay in commencing work to restore RITS, while the loss of access to RITS monitoring services meant that Bank staff were initially unable to identify the state of RITS operations. Four hours after the power failure, the RITS queue at the alternate site was brought back online and queued transactions began to be settled at this time. With the exception of Austraclear, connectivity with most RITS feeder systems was brought back online over the next two hours.

  • SWIFT Payment Delivery Service (PDS). Connectivity to the SWIFT PDS was restored just under five hours after the outage occurred.
  • CHESS. The CHESS batch settled in RITS just after 3.30 pm and the associated securities movements were completed before 4 pm, a little over three hours behind the usual schedule.
  • Austraclear. Disruption to the Austraclear debt securities settlement system was more prolonged due to a problem re-routing ASX's link to the Bank's alternate site. Connectivity was re-established and Austraclear transactions involving payment began settling just after 6.15 pm (around 7½ hours after the power failure suspended settlement).[12] The Austraclear day session and RITS daily settlement session were extended by 3½ hours until 8 pm to allow all outstanding trades to settle.
  • CLS. AUD settlement members of the CLS foreign exchange (FX) settlement system were unable to submit their pay-ins for AUD settlement via SWIFT until 3.30 pm. This was an hour after the pay-ins would normally commence and half an hour after the start of the normal AUD CLS settlement window. Although delayed, AUD settlement in CLS was able to proceed because RITS and the SWIFT PDS were recovered in time for members to make pay-ins before their final pay-in deadlines.
  • Other feeder systems. PEXA, an electronic settlement system for property transactions, was able to resume its batch settlement process at around 4.45 pm, after Community of Interest Network (COIN) connectivity relied on by PEXA users had been fully restored. While the majority of PEXA settlements were completed before the standard cut-off time of 6.30 pm, an extension was agreed to 7.45 pm for the settlement of remaining transactions.[13] Restoration of COIN connectivity also allowed settlement of direct entry payments via the Low Value Settlement Service to recommence in time for the scheduled 4.45 pm multilateral run.

Despite the Bank's communication channels being impacted, communication with the industry was ongoing throughout the day of the incident and continued until Monday, 3 September. Regular updates were sent to RITS members and other stakeholders, with the first external notification issued via SMS and email at 11.20 am. The Bank also participated in conference calls arranged under the incident response frameworks of AusPayNet and NPP Australia (NPPA).

As a result of the extended outage on 30 August, RITS recorded average system availability of 99.83 per cent in 2018. This fell short of the Bank's key operational availability target for RITS to be available to its members in excess of 99.95 per cent of the time. The Bank last failed to meet its availability target in 2015, and has fallen short of the target on only one other occasion since 2007.[14] Further details on the availability performance of RITS, and how this is measured, are included in Appendix A, section A.2.

The Bank's response

Departments across the Bank have contributed to a broad review of lessons learned from the incident, delivering a number of recommendations to the Bank's Risk Management Committee and the Reserve Bank Board Audit Committee. This included an incident report prepared by Payments Settlements Department, drawing out lessons learned from the outage and initiating follow-up actions. Overall, the report indicates that, although the two-hour RTO was missed, the incident was well managed by the Bank. The potential impact on participants and the broader financial system was greatly diminished by the recovery of systems and completion of settlement on the day of the outage. The Bank identified a number of actions arising from the incident, with all of the initial actions now complete. A detailed incident report was provided to RITS members (including the feeder systems and the FMIs that are members of RITS) and an initial summary of the cause of the incident and its key impacts was sent to members after operations closed on the day of the incident.

The key themes among these lessons learned and follow-up actions are summarised below.

  • Review of testing and maintenance arrangements. The Bank has conducted a review of maintenance arrangements for critical infrastructure across all sites (including the data centres) and adequacy of fire safety system testing procedures and controls. This has addressed the root cause of the outage, and reduces the risk of another maintenance incident impacting the availability of RITS and other critical services during core operating hours. The Bank has also expanded the range of its technical contingency testing scenarios in order to better simulate events in which several system components lose power simultaneously.
  • Arrangements for system restoration. As noted above, because FSS did not recover automatically as designed, staff were diverted from recovering RITS to work on restoring FSS. The prioritisation of FSS over RITS took into account a range of factors, including the Bank's relative tolerance for downtime in both systems and the time of day when the incident occurred. While the relative tolerance for downtime in both systems is documented, the process for determining system prioritisation is not currently documented. Because the power outage occurred after overnight batches had settled and well before the end of the day session, settlement of RITS payments could be delayed without serious impact to the financial system. This prioritisation decision did, however, contribute to RITS taking longer to recover than the two-hour RTO set out in the PFMI. The Bank has implemented or plans to implement a number of actions that support the ability of RITS to recover within two hours of a disruption, even in circumstances similar to those of 30 August.
    • The Bank plans to move a server that supports the automated failover of the RITS database to a third site, to remove the risk that this server is also impacted by the same contingency that affects systems at a production site.
    • The Bank has identified the issue that prevented the automatic failover of FSS on 30 August, and has implemented a software update that addresses this issue. Prior to this the Bank had put in place an updated manual procedure to allow IT staff to quickly respond if a similar circumstance arises again.[15]
    • The Bank has developed a decision tree to supplement the detailed documentation in its existing standard operating procedures covering recovery of FSS. This will enable staff to quickly reference sections dealing with recovery processes in the event of an incident.
    • Some IT services that support RITS and FSS were unavailable in the initial stages of responding to the outage.[16] The Bank is continuing to make improvements to the resilience of these services.
  • Readiness to use contingency arrangements. The extent of the outage revealed a lack of preparedness for contingency arrangements among some industry participants, and has prompted the Bank to renew its focus on the potential for invoking fall-back arrangements in cases where recovery on the day is not possible.[17] The Bank has since commenced a process of engagement with RITS members on the contingency arrangements for individual clearing systems in the event of a prolonged RITS outage. This includes a review of the current fall-back arrangements in the AusPayNet High Value Clearing System (HVCS). The Bank has also been working with ASX to explore potential improvements to contingency plans for Austraclear in the event that settlement in RITS is not available, and has confirmed its contingency arrangements for CLS in the event that RITS is not recovered in time to process CLS payments and an AUD payments holiday is declared.
  • Communications. The extent of the failure of Bank systems resulted in the loss of communications equipment for a period of time on the day, which hampered some channels for internal and external communications, including via the RITS Helpdesk. Despite these challenges, the Bank's post-incident review indicated that communication arrangements worked well during the incident: on 30 August, 14 service notifications were made via SMS and email, 14 teleconference meetings were held under the incident response frameworks of AusPayNet and NPPA, and regular bilateral communications were undertaken with the individual feeder systems.

3.1.2 Bank-wide resilience review

The Bank is reviewing its broader IT operational practices in light of a number of new systems coming into production and some recent incidents that affected usual operations. The aim of this review is to ensure the reliability of technology services and, in turn, the Bank's business operations. The scope of the review will include processes used to manage applications, software development, infrastructure, changes, configuration, third party risks, releases and testing. The Bank will incorporate any identified improvements to ensure its operational practices remain at a high standard and promote operational resilience and reliability consistent with the criticality of its payment and settlement systems.

3.1.3 Cyber resilience

The Bank has continued work to further strengthen its cyber resilience over the assessment period. This builds on work in the 2016/17 assessment period to review RITS's cyber-security controls, operational resilience, and options to improve the ability to detect and recover from a disruption of service in RITS, or loss of software or data integrity.

The previous Assessment noted that Payments Policy Department would continue to monitor progress in two areas related to this work:

  • the implementation of the remaining recommendations arising out of the completed reviews of RITS's cyber security and cyber resilience
  • continued exploration of ‘non-similar’ technology that could enable further enhancements to the ability to recover RITS from cyber attacks in a timely manner.

In addition, the Bank has carried out work to verify its compliance with security standards established by SWIFT, and has started work to identify and implement relevant aspects of the CPMI's strategy for reducing the risk of wholesale payments fraud related to endpoint security.

Implementation of additional security measures

The reviews carried out by the Bank in 2016 included a stocktake of existing security controls, a program of penetration testing and a review of recovery capabilities. The highest-priority recommendations from these reviews were addressed in early 2017, and most of the remaining lower-priority recommendations were implemented in 2018. A small number of lower-priority recommendations are being carried forward via related projects and initiatives that are currently ongoing.

In 2018, RITS, together with the Bank's associated payments systems infrastructure, was certified as being compliant with the ISO 27001 information security standard.

SWIFT-related security controls

As a user of the SWIFT messaging network, the Bank is required to meet security standards established by SWIFT, including the SWIFT Customer Security Controls Framework (CSCF) that was launched in 2017. The CSCF is a set of mandatory and advisory controls for users of the SWIFT messaging network and provides a baseline security standard across the network. All customers are required to annually attest to their compliance with these controls.

During the assessment period, the Bank commissioned an external firm to conduct an independent assessment of its compliance with the SWIFT controls. The assessment found the Bank to be fully compliant with the mandatory SWIFT controls. After the assessment period, in April 2019, the Bank updated its SWIFT attestation to identify that it will not comply with a mandatory control related to software support for a period of two months, while awaiting the implementation of a large infrastructure refresh project.

Evaluating current and emerging technologies to improve recovery times

Consistent with cyber resilience guidance developed by CPMI and IOSCO, the Bank has undertaken to evaluate current and emerging technology options that may further enhance the capability of RITS to meet the CPMI-IOSCO two-hour RTO; that is, to be able to safely resume critical operations within two hours of a cyber disruption. The first stage of this evaluation was completed in late 2017, with the Bank deciding to further explore a technology option that is ‘non-similar’ to RITS but could provide an additional recovery option. The Bank expects exploration of this recovery option to continue over the next assessment period.

Plans to enhance RITS business continuity procedures and testing

As noted above, the fall-back arrangements in place for HVCS, including in the event of a RITS outage, are being reviewed. An industry working group has now been formed to review the challenges with the arrangements, and to analyse how they can be enhanced or whether a more practical option can be implemented. Members had previously provided feedback during the Bank's consultation on the resiliency of RITS in 2017, and in subsequent industry-level discussions. Once the working group has recommended its preferred approach, the Bank will provide leadership and coordination of the work required to implement any changes, including testing and documentation.

The Bank also plans to undertake a cyber-related business continuity exercise during 2019. This will likely be a table-top exercise that simulates a cyber-event affecting the RITS ecosystem, and will involve selected RITS members and industry stakeholders.

CPMI Wholesale Payments Endpoint Security Strategy

In May 2018, CPMI released its report Reducing the Risk of Wholesale Payments Fraud Related to Endpoint Security.[18] The report describes a strategy for reducing the risk of wholesale payments fraud related to endpoint security (i.e. security arrangements between wholesale payment systems, messaging networks and their participants).

The Bank is in the process of implementing the CPMI strategy and sees enhancements to endpoint security as an ongoing process of continuous improvement. Over the next assessment period, the Bank plans to progress work to:

  • review the range of RITS endpoint security risks and how these are currently mitigated
  • where appropriate, strengthen the security requirements that apply to RITS members and other users of RITS (such as batch administrators)
  • review the tools and information used by the Bank and provided to RITS members to assist in preventing and detecting wholesale payments fraud.

3.2 Legal Basis

The Bank requires all overseas-domiciled RITS members to provide an independent legal opinion that the RITS Membership Agreement is enforceable in their home jurisdiction. Following the signing of new RITS Membership Agreements in 2017, the Bank has continued to work with foreign members on the provision of legal opinions that meet the Bank's requirements, in cases where members had not provided a legal opinion previously or their previous opinion required updating. The Bank has received and accepted legal opinions from the majority of foreign members and expects to complete this process during the next assessment period.

3.3 Access and Participation

3.3.1 Tiered participation

The Bank's policy on direct participation in RITS requires authorised deposit-taking institutions (ADIs) with RITS RTGS transactions that are at or above 0.25 per cent of total RITS RTGS transactions to settle their wholesale RTGS transactions using their own ESA. The Bank collects data on the use of indirect settlement arrangements by ADIs to ensure that it is aware of any ADIs that exceed the 0.25 per cent threshold and can take steps to transition these ADIs to settlement via their own ESA. Since December 2018 the Bank has collected this information from RITS members that are settlement agents rather than from indirect-settling ADIs that held an ESA. The new reporting arrangements follow changes made to the Bank's ESA policy in response to amendments made to the Banking Act 1959 in May 2018 that allow all ADIs to call themselves banks. The Bank decided to remove the requirement for indirect-settling banks to hold an ESA for contingency purposes rather than extend this requirement to all ADIs.
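The threshold test described above can be sketched as a simple share calculation. This is an illustrative sketch only: the section does not state whether an ADI's share is measured by value or by number of transactions, so the inputs are treated as generic amounts in a common unit, and the function name and sample figures are hypothetical.

```python
# Threshold stated in the text: 0.25 per cent of total RITS RTGS transactions.
THRESHOLD_PCT = 0.25


def must_settle_via_own_esa(adi_rtgs: float, total_rtgs: float) -> bool:
    """Return True if the ADI's share of RTGS activity is at or above the threshold.

    Both arguments must be in the same unit (assumption: the measurement
    basis is not specified in this section).
    """
    if total_rtgs <= 0:
        raise ValueError("total_rtgs must be positive")
    # Cross-multiply instead of dividing so the boundary case
    # (exactly 0.25 per cent) is not affected by floating-point rounding.
    return adi_rtgs * 100 >= THRESHOLD_PCT * total_rtgs


# Hypothetical figures for illustration only.
print(must_settle_via_own_esa(25, 10_000))   # share is exactly 0.25 per cent
print(must_settle_via_own_esa(24, 10_000))   # share is 0.24 per cent
```

Because the policy applies "at or above" the threshold, the comparison is inclusive; an ADI sitting exactly on 0.25 per cent would be expected to settle using its own ESA.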

3.3.2 Non-ADI Exchange Settlement Accounts

The Bank allows non-ADIs to hold ESAs for the purpose of settling payments on behalf of third parties, subject to meeting operational and liquidity standards, in order to ensure that non-traditional payment service providers are not at a competitive disadvantage by being dependent on an institution that would otherwise be a competitor (for details of participation in RITS see Appendix A, section A.1).

During the assessment period, one third-party transaction processor for card payments opened an ESA.

3.3.3 LCH Protected Payments System arrangements

LCH Limited (LCH) is a UK-based central counterparty (CCP) that offers central clearing for a range of products, including over-the-counter interest rate derivatives and inflation swaps through its SwapClear service. LCH settles AUD transactions, typically variation margin payments, across its ESA in RITS. LCH also operates an Australian ‘Protected Payments System’ (PPS) that enables the settlement of AUD obligations directly between ESAs held at the Bank.[19] During the assessment period, a fifth Australian bank began to use the Australian PPS, allowing the settlement of AUD obligations with LCH directly using its ESA. Going forward, the Bank expects any direct participants of LCH's SwapClear service with an active ESA (and which joined since LCH was licensed to offer the SwapClear service in Australia) to settle their AUD obligations using the Australian PPS.

3.3.4 CLSClearedFX service

CLS is an international payment system for settling FX transactions in 18 currencies, including the AUD. The CLSClearedFX service is a payment-versus-payment settlement service for cleared deliverable FX products and operates in a separate session to the CLS core service. It is designed to eliminate the principal risk associated with the settlement of obligations in each currency arising from FX derivatives. LCH commenced using this service in July 2018 for FX options and currently settles five currencies including AUD.

3.4 Communication procedures and standards

3.4.1 Strategy for ISO 20022 Payment Messaging Migration

SWIFT has recently announced plans to cease ongoing support of some categories of MT messages, used in cross-border and correspondent banking payments, after November 2025 and migrate them to the International Organization for Standardization (ISO) 20022 standard. SWIFT's end goal is to fully migrate all payments and reporting traffic to ISO 20022, allowing the community to use the same standard for all payments flows. No date has yet been announced for when the MT messages used in closed user groups, including those used for HVCS, will cease to be supported. Nonetheless, given SWIFT's end goal for full migration to ISO 20022 and the number of international projects underway, it is an appropriate time to consider adoption of the ISO 20022 standard in Australian payment systems.

The Payments System Board has identified this as a key strategic issue for the industry and endorsed the Bank conducting a consultation with industry on the migration to the new standard. In April 2019, the Bank published a joint consultation paper with the Australian Payments Council (APC) setting out the key strategic issues involved in a migration from MT messages to the ISO 20022 messaging standard. The Bank and APC intend to release two further consultation papers as part of a consultation program during 2019 and 2020. The second paper will provide a summary of initial consultation responses and suggest options for the key strategic issues. The third paper will present final conclusions from the consultation program and an implementation plan. The Bank and APC have targeted the end of 2024 to complete ISO 20022 migration, ahead of the end of support for the affected MT message types.

Footnotes

Austraclear remained operational throughout the outage and free-of-payment transactions continued to settle. [12]

A small number of property refinancing transactions were deferred. [13]

In 2015 RITS availability was 99.831 per cent, while in 2012 availability was 99.948 per cent. Availability prior to 2017 was calculated using a different methodology, although the impact of this change is not material. [14]

A repeat of the combination of factors that obstructed automatic failover on 30 August, which resulted from the complete loss of power at the precise moment that a particular process was running, is considered highly unlikely. [15]

Including documentation services that store support procedures. [16]

Such arrangements involve processing transactions prior to interbank settlement, which would then occur on a deferred-net basis. [17]

For an overview of the report, see Box A in the 2018 RITS assessment, available at <https://www.rba.gov.au/payments-and-infrastructure/rits/self-assessments/2018/pdf/2018-assess-rits.pdf>. The full report is available at <https://www.bis.org/cpmi/publ/d178.htm>. [18]

For more information on LCH's SwapClear service and LCH's PPS, see the 2018 Assessment of LCH Limited's SwapClear Service, available at <https://www.rba.gov.au/payments-and-infrastructure/financial-market-infrastructure/clearing-and-settlement-facilities/assessments/lch/2018/pdf/lch-assess-2018-12.pdf>. [19]