November 14, 2018 - Advancements in data science are presenting new opportunities for shipowners looking to improve fleet utilization by combining advanced analytics with lessons learned from operations. It is now possible to quantify the reliability of maritime assets, improve decision-making for fleet operations, identify emerging risks and, ultimately, improve vessel availability and scheduling flexibility.
With the use of advanced data analytics, operators can move beyond calendar-based regimes for vessel maintenance into condition-based models, where maintenance and classification schedules are driven by the current condition of equipment.
Central to this new model is the detection of ‘anomalies’ that help to identify the early onset of the conditions that lead to component and system failures. Detection of these ‘early warnings’ can reduce operating costs and maximize the service life of assets and their components.
Transitioning to condition-based maintenance
Shipowners, many of whom operate fleets of high-value assets, have historically struggled to gain meaningful visibility of the working condition of their ships in order to consistently reduce the expense of operational failures.
To help them assure the safety of life, property and the environment (and now data), they have traditionally turned to the rigors of classification, which require their assets to obtain periodic renewals and undergo detailed surveys and inspections, the timing of which has been dictated by set schedules.
Calendar-driven maintenance models are broadly based on the preventive recommendations of original equipment manufacturers (OEM), who set inspection schedules after testing components during production. Periodic schedules have served a purpose in that they offer certainty about the time and scope of maintenance activities and the associated costs; they have also been responsible for the high fixed costs of maintenance, a lack of warning about operational failures and the inefficient replacement of parts.
However, advances in data science and technology now support a transition to condition-based maintenance (CBM) models, wherein maintenance interventions are performed when needed, and not according to what can be arbitrary factory dates set by OEMs.
Today’s onboard equipment has hundreds of sensors measuring parameters such as temperature and pressure; combined with high-speed connectivity, these allow large quantities of data to be continuously generated and assessed.
All this will have an impact on designs, as advanced data analysis provides asset owners with unprecedented visibility into the causes of failure.
Traditionally, marine equipment has largely been used within the original intent of its design. However, the operational insights brought by data analytics have many OEMs and operators predicting significant shifts in component usage, operating conditions and operator skills.
In this new data-enabled world, where calendar-based maintenance models will be found wanting, demand will grow for CBM models. It is simply the next logical step as fleet strategies evolve from corrective to preventive and now condition-based regimes.
However, while CBM models will advance maintenance practices, they will not prevent machines from degrading or failing.
The ability to analyze multiple types of data to reveal the real condition of equipment is also promoting a condition-based class model, in which surveys and audits are becoming more efficient and less intrusive while still ensuring high safety standards for classed assets (see graphic below).
The transition is allowing survey planning to be tailored to a specific asset and improving the efficiency of the classification process, while maintaining overall risk and safety metrics.
Condition-based class model
Building this class capability requires a data model that can capture, aggregate and integrate the divergent data types that are extracted from an asset throughout its design, operation and service history. These data types include:
- sensor data: time series, e.g., temperature
- maintenance logs: transactional
- digitized inspection reports
- design changes, and their impact
- survey reports
- data from wearable inspection devices and drones
These are some of the data lenses through which modern class can now identify and analyze the anomalies that signal the potential for component failure.
Anomaly detection and interpretation
The aim of anomaly detection is to pinpoint unusual patterns of behavior. If abnormal conditions are identified, further analyses can confirm findings such as equipment damage, changes in operating conditions and modes, or simply a degraded sensor or other issues related to data quality.
Below is an illustration of the basic workflow for detecting an anomaly. Data from the equipment is fed to an anomaly-detection ‘engine’, which includes the definition of a ‘normal’ pattern. ‘Normal’ conditions are ‘learned’ from the data by simultaneously analyzing the correlations and relations between multiple variables or single parameters, and their various states under multiple operating conditions.
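The ‘learning’ step in this workflow can be sketched in a few lines of Python. This is an illustrative simplification rather than the engine described here: it assumes a single parameter, a hypothetical set of exhaust-temperature readings tagged with an operating mode, and a simple three-standard-deviation rule. All names and values are invented for the example.

```python
from collections import defaultdict
from statistics import mean, stdev


def learn_normal(history):
    """Learn a per-mode baseline (mean, standard deviation) from healthy data.

    `history` is an iterable of (operating_mode, value) pairs.
    """
    by_mode = defaultdict(list)
    for mode, value in history:
        by_mode[mode].append(value)
    return {mode: (mean(vals), stdev(vals)) for mode, vals in by_mode.items()}


def is_anomalous(baseline, mode, value, max_sigma=3.0):
    """Flag a reading more than `max_sigma` deviations from its mode's mean."""
    mu, sigma = baseline[mode]
    return abs(value - mu) / sigma > max_sigma


# Hypothetical exhaust-temperature readings (deg C) in two operating modes.
history = [
    ("cruise", 352.0), ("cruise", 348.0), ("cruise", 350.0),
    ("cruise", 351.0), ("cruise", 349.0),
    ("harbor", 208.0), ("harbor", 212.0), ("harbor", 210.0),
    ("harbor", 209.0), ("harbor", 211.0),
]
baseline = learn_normal(history)

# The same 350 deg C reading is normal at cruise but abnormal in harbor.
print(is_anomalous(baseline, "cruise", 350.0))  # False
print(is_anomalous(baseline, "harbor", 350.0))  # True
```

Keeping a separate baseline for each operating mode is what allows the same reading to be judged normal in one condition and abnormal in another.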
The next step is choosing a technique to detect anomalies, with most methods falling into two broad categories: ‘supervised’ or ‘unsupervised’.
Unsupervised methods find patterns in data by identifying commonalities among sub-groups of the data that are unlabeled; supervised methods usually require labeled historical data in which past anomalies are identified and categorized into root causes under specific operating conditions.
To identify anomalies in operational data, single- and multi-variable approaches are used. For complex equipment such as engines and pumps, a multivariate method is more robust, as it accounts for different operating modes and the interaction between parameters.
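A small Python sketch shows why the multivariate view matters. It assumes two hypothetical parameters, shaft speed and fuel flow, that rise together in healthy operation. The joint check here is a least-squares line with a residual score, chosen only for brevity; the data, names and the idea of a cut-off around three deviations are assumptions for illustration.

```python
from statistics import mean, stdev

# Hypothetical healthy data: shaft speed (rpm) and fuel flow (kg/h).
rpm = [60.0, 70.0, 80.0, 90.0, 100.0, 110.0, 120.0]
fuel = [610.0, 695.0, 805.0, 900.0, 1010.0, 1095.0, 1205.0]


def z_score(value, sample):
    """Single-variable check: deviations of `value` from the sample mean."""
    return abs(value - mean(sample)) / stdev(sample)


def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for y as a function of x."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, my - slope * mx


slope, intercept = fit_line(rpm, fuel)
residual_sd = stdev(y - (slope * x + intercept) for x, y in zip(rpm, fuel))


def joint_score(r, f):
    """Two-variable check: how far fuel flow is from what the speed predicts."""
    return abs(f - (slope * r + intercept)) / residual_sd


# A reading with low speed but high fuel flow.
new_rpm, new_fuel = 70.0, 1100.0
print(z_score(new_rpm, rpm))          # unremarkable on its own
print(z_score(new_fuel, fuel))        # unremarkable on its own
print(joint_score(new_rpm, new_fuel)) # far from the learned relationship
```

Each value sits comfortably inside its own historical range, so a single-variable check passes both; only the check on their relationship exposes the reading.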
A model for the ‘normal’ state must be constructed, as well as a measure for the ‘distance’ to normal. Most methods therefore calculate an ‘outlier’ score for each data point, which estimates how far the point lies from ‘normal’ and from which a determination is made. The methods used to detect anomalies can include:
Model-based methods: if a data point does not fit a model of the known data, it is considered abnormal. Models that summarize data, such as regression models, probability-distribution models or cluster models, are employed to detect anomalies. For example, to test whether two sets of data came from the same probability model, a test for anomalies can be constructed using a likelihood ratio test. Even if the data prove not to come from the assumed distribution, these tests are still effective in pointing to regions of interest.
Density-based methods: methods that find natural ‘clusters’ of related data also detect data points that are not part of known clusters. Regions of the data space with sparse density often point to potential anomalies.
Distance-based methods: various techniques to determine the distance between two data points or sets of data have been used to develop methods for detecting anomalies. For example, to examine whether a test data point occurs at the extreme edge of a probability distribution, a measure of the distance from a known distribution can be used.
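One possible instance of each of the three families above is sketched below in Python, using only the standard library. The choices are assumptions made for illustration: a Gaussian likelihood ratio test with a common variance for the model-based case, a neighbor count within a fixed radius for the density-based case, and the Mahalanobis distance for the distance-based case. The data and all function names are hypothetical.

```python
import math
from statistics import mean


def likelihood_ratio_stat(reference, recent):
    """Model-based: -2 log likelihood ratio that both samples share one mean.

    Assumes Gaussian data with a common variance. Large values suggest the
    recent window no longer follows the model of the reference window.
    """
    pooled = reference + recent
    n = len(pooled)
    m_all, m_ref, m_new = mean(pooled), mean(reference), mean(recent)
    var_same = sum((x - m_all) ** 2 for x in pooled) / n
    var_diff = (sum((x - m_ref) ** 2 for x in reference)
                + sum((x - m_new) ** 2 for x in recent)) / n
    return n * math.log(var_same / var_diff)


def neighbor_count(point, data, radius):
    """Density-based: number of known points within `radius` of `point`."""
    return sum(1 for p in data if math.dist(point, p) <= radius)


def mahalanobis_2d(point, data):
    """Distance-based: Mahalanobis distance of a 2-D point from the data."""
    n = len(data)
    mx = mean(p[0] for p in data)
    my = mean(p[1] for p in data)
    sxx = sum((p[0] - mx) ** 2 for p in data) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in data) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in data) / (n - 1)
    det = sxx * syy - sxy ** 2
    dx, dy = point[0] - mx, point[1] - my
    # Quadratic form d^T S^-1 d, using the closed-form inverse of a 2x2 matrix.
    return math.sqrt((syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det)


# Hypothetical normal operating points: (pressure in bar, temperature in deg C).
normal = [(4.0, 80.0), (4.2, 82.0), (4.1, 81.5), (3.9, 79.0), (4.3, 83.0),
          (4.0, 80.5), (4.1, 81.0), (3.8, 78.5), (4.2, 82.5), (4.0, 79.5)]
suspect = (4.3, 78.5)  # high pressure paired with low temperature

reference = [p[1] for p in normal]
shifted = [84.5, 85.0, 86.0, 85.5, 84.0]  # recent temperatures, drifted upward
steady = [80.0, 81.0, 80.5, 81.5, 80.0]   # recent temperatures, unchanged

# 3.841 is the 95% point of a chi-square distribution with one degree of freedom.
print(likelihood_ratio_stat(reference, shifted) > 3.841)
print(likelihood_ratio_stat(reference, steady) > 3.841)
print(neighbor_count(suspect, normal, radius=0.4))
print(mahalanobis_2d(suspect, normal))
```

In practice the variables would normally be scaled before a radius is applied, since raw units (bar against degrees) make the Euclidean distance used by the density check depend on the unit of measurement.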
In the graphic below, data have been taken from three sensors measuring the parameters of interest (shown below as parameters 1, 2 and 3) from a ship’s propulsion system. These parameters were chosen because experience of normal operations and several failures has shown that they affect the performance of the system over time.
A combination of anomaly-detection methods, including probability models and distance-based methods, was used to detect anomalies in multiple variables simultaneously. The outputs were then combined using a weighted scheme to identify an anomaly with confidence.
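A weighted scheme of this kind could look like the Python sketch below. The weights, the per-method scores and the 0.7 alert threshold are hypothetical, and the sketch assumes each method’s output has already been scaled to the range 0 to 1; the article does not specify the actual scheme used.

```python
def combine_scores(scores, weights):
    """Weighted average of per-method scores, each already scaled to [0, 1]."""
    total_weight = sum(weights[name] for name in scores)
    return sum(scores[name] * weights[name] for name in scores) / total_weight


# Hypothetical weights reflecting how much each method is trusted.
weights = {"probability_model": 0.6, "distance_based": 0.4}

# Hypothetical outputs of the two methods for one time step.
scores = {"probability_model": 0.9, "distance_based": 0.6}

combined = combine_scores(scores, weights)
is_anomaly = combined >= 0.7  # hypothetical alert threshold
print(combined, is_anomaly)
```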
As highlighted in the elliptical area, data movements such as these can point to a potential event (shown by the vertical line). The time on the x-axis between the detection of the event (shown by the ellipse) and the event itself (the vertical line) illustrates the ‘lead time’ available for corrective action to prevent a failure and the resulting downtime for the asset.
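The lead time itself is a simple difference between two timestamps. The Python sketch below assumes hypothetical alert times and a hypothetical event time, and treats the earliest alert before the event as the moment of detection.

```python
from datetime import datetime


def lead_time(alert_times, event_time):
    """Time from the earliest alert preceding the event to the event itself.

    Returns None if no alert was raised before the event.
    """
    prior = [t for t in alert_times if t <= event_time]
    return event_time - min(prior) if prior else None


# Hypothetical alert timestamps and the time of the eventual event.
alerts = [
    datetime(2018, 3, 2, 6, 0),
    datetime(2018, 3, 2, 18, 0),
    datetime(2018, 3, 4, 9, 0),
]
event = datetime(2018, 3, 5, 6, 0)

print(lead_time(alerts, event))  # 3 days, 0:00:00
```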
There are several important lessons to be learned in developing anomaly-detection processes, broadly related to the availability of data from the sensors; the design of algorithms for anomaly detection; and consumption of the output from the process.
Sensor variation: the units of measurement and the installation locations of sensors on components usually vary across the fleet. Corrections accounting for this must be deployed.
False positive and false negative errors: the balance between the two bears directly on the risk assumed from missed anomaly alerts (false negatives) and on the effort needed to interpret and respond to false alarms (false positives). The methods must be tuned to the acceptable levels of risk.
Selecting the parameters: in a typical operational marine asset, there could be several thousand parameters being measured. Deciding which parameters to include for anomaly-detection processing for specific equipment poses a data-dimensionality challenge. This can be addressed using the historical knowledge of the equipment’s design and operations.
Algorithm deployment: deploying anomaly-detection algorithms at a central location helps to gain insights from across the fleet. However, deploying at the edge can provide earlier threshold-based alerts to onboard personnel.
Anomaly consumption: a deliberate process to consume the output of the algorithms must be developed. These processes include: characterizing actual alerts vs. sensor issues; the feedback cycle from on-board personnel; and the operating procedures to respond to specific alerts for effective anomaly detection.
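The trade-off between false positives and false negatives noted above can be made concrete with a small Python sketch. It assumes a hypothetical set of historical outlier scores with known outcomes and hypothetical relative costs for each type of error, then picks the alert threshold with the lowest total cost.

```python
def count_errors(scores, labels, threshold):
    """Return (false_positives, false_negatives) when alerting at >= threshold."""
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and not y)
    fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y)
    return fp, fn


def best_threshold(scores, labels, cost_fp, cost_fn):
    """Pick the candidate threshold with the lowest total cost on labeled history."""
    def total_cost(threshold):
        fp, fn = count_errors(scores, labels, threshold)
        return fp * cost_fp + fn * cost_fn
    return min(sorted(set(scores)), key=total_cost)


# Hypothetical outlier scores and whether each was a real anomaly.
scores = [0.10, 0.20, 0.35, 0.40, 0.55, 0.60, 0.75, 0.90]
labels = [False, False, False, True, False, True, True, True]

# When a missed anomaly is ten times as costly as a false alarm,
# the chosen threshold is lower, so more alerts are raised.
print(best_threshold(scores, labels, cost_fp=1, cost_fn=10))  # 0.4

# When false alarms are the expensive error, the threshold moves up.
print(best_threshold(scores, labels, cost_fp=10, cost_fn=1))  # 0.6
```

The costs stand in for the operator’s acceptable level of risk: changing them moves the threshold without changing the detection method itself.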
Advances in data science are already helping shipowners and operators to improve their maintenance practices. They hold many of the keys to speeding the transition from calendar-based to condition-based maintenance strategies. This, in turn, will reduce the cost of operations and the uncertainty of sudden downtime for high-value assets, further assuring their availability.
Fundamental to this transition is the use of data to detect the anomalies that serve as early warnings of component failure.
To improve the practical benefits of the science, more work is needed to understand how advanced data-driven methods, data acquisition and business operations connect with one another.
After that, the next step will be to explore the relationship between data-driven methods and ‘soft’ factors such as the human element, and their impact on the overall success of the condition-based process.
By Subrat Nanda, Chief Data Scientist at the American Bureau of Shipping (ABS)
* An edited version of this piece appears in the November 2018 issue of Marine Log.