Catching Faults with Centralized Condition Monitoring

In 2007, Exelon Corp. began the Centralized Performance Monitoring (CPM) pilot program. The goal was to reduce downtime costs and lost revenue associated with the 25% of unplanned forced losses across its fleet of 17 nuclear power units without additionally taxing existing personnel or adding new personnel. Exelon determined that at least 2% of these losses could be caught with a centralized monitoring program.

Exelon selected InStep’s PRiSM online condition-monitoring software and interfaced it to Exelon’s existing real-time plant data historian infrastructure. Together, they provide a fleetwide centralized solution requiring only two corporate-level individuals.

Within two months Exelon had developed 500 different models with the InStep solution and identified $540,000 in avoided faults. According to Chris Demars, Exelon corporate engineering CPM project manager, a conservative annualized estimate of failure avoidance is $3.3 million. This program also won the 2008 NEI Top Industry Practice "TIP" award. Here is the story behind this success.

1. Quantify the failure. The top graph indicates a 25% deviation from normal condensate pump operation, as determined by the multidimensional PRiSM model analytics. The second graph indicates that sensor readings BRW01V_T2560 (inboard bearing temperature) and BRW01B_T2561 (outboard bearing temperature) are the main contributors to the deviation. Source: InStep Software

Selecting the Solution

Several factors drove Exelon to look for an alternative to the decentralized monitoring model. As the staff resources and senior expertise available to monitor its plants shrank, the value of intelligent monitoring grew. To increase Exelon’s efficiency and meet business goals, the company needed a solution that lessened the burden on personnel, extracted more data, found more faults, and could be installed and scaled quickly. Additionally, early detection of equipment failures helps prevent the hazardous conditions that accompany rotating equipment failures or the release of industrial gases and process fluids; it also improves nuclear and radiological safety.

The Exelon CPM pilot focused on monitoring a 17-unit fleet without hiring additional personnel. The software vendor selection process lasted three months, from March to May. Eight vendor products using a variety of different technologies were reviewed and assessed on 35 different factors.

InStep Software’s PRiSM online condition-monitoring software was chosen for the pilot project. PRiSM was interfaced to Exelon’s existing real-time plant data historian infrastructure, which includes InStep’s eDNA and another historian software application. According to Demars, "InStep’s experience in the nuclear industry combined with their data historian specialization, allowed them to develop an effective, intelligent, easy to install and use anomaly detection tool that fit our selection criteria better than any other on the market."

InStep’s PRiSM software is a self-learning analytic application for real-time online monitoring of critical assets for condition-based maintenance. The software uses pattern recognition and advanced data-mining technologies to provide for advanced early warning of equipment problems and failures. PRiSM learns from an asset’s individual operating history and develops a series of normal operational profiles for that piece of equipment. PRiSM then compares the known operational profiles with real-time operating data to detect the subtle changes in system behavior that are often the early warning signs of pending equipment failure. "One of the hallmarks of PRiSM is its ability to quickly develop models; this made it a strong candidate for Exelon early on," says Demars.
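The article does not describe PRiSM's internal algorithms, so the following is only a minimal Python sketch of the general idea in the paragraph above: store normal operating profiles from history, compare a live reading with the closest stored profile, and report both an overall deviation and each sensor's contribution. The class name, sensor names, sample values, and alert threshold are all hypothetical, and nearest-profile matching is a generic stand-in technique, not InStep's method.

```python
import math


def fit_scaler(history):
    """Per-sensor mean and standard deviation from historical normal data."""
    n = len(history)
    columns = list(zip(*history))
    means = [sum(col) / n for col in columns]
    # Fall back to 1.0 when a sensor never varies, to avoid dividing by zero.
    stds = [
        math.sqrt(sum((x - m) ** 2 for x in col) / n) or 1.0
        for col, m in zip(columns, means)
    ]
    return means, stds


def scale(row, means, stds):
    """Express one reading in standard deviations from the historical mean."""
    return [(x - m) / s for x, m, s in zip(row, means, stds)]


class ProfileModel:
    """Hypothetical nearest-profile anomaly model (not PRiSM's algorithm)."""

    def __init__(self, sensor_names, history):
        self.sensor_names = list(sensor_names)
        self.means, self.stds = fit_scaler(history)
        # The "knowledge base": scaled snapshots of normal operation.
        self.profiles = [scale(row, self.means, self.stds) for row in history]

    def score(self, observation):
        """Return (overall deviation, per-sensor contribution) for a reading."""
        z = scale(observation, self.means, self.stds)
        nearest = min(self.profiles, key=lambda p: math.dist(p, z))
        residuals = [abs(a - b) for a, b in zip(z, nearest)]
        deviation = math.sqrt(sum(r * r for r in residuals))
        return deviation, dict(zip(self.sensor_names, residuals))


# Made-up sensor names and values, for illustration only.
SENSORS = ("inboard_bearing_temp", "outboard_bearing_temp", "motor_current")
HISTORY = [
    (60.0, 58.0, 100.0),
    (62.0, 60.0, 110.0),
    (64.0, 62.0, 120.0),
    (66.0, 64.0, 130.0),
]
ALERT_THRESHOLD = 3.0  # assumed value; a real threshold would be tuned per model

model = ProfileModel(SENSORS, HISTORY)
normal_dev, _ = model.score((63.0, 61.0, 115.0))
fault_dev, fault_contrib = model.score((75.0, 72.0, 115.0))
top_contributors = sorted(fault_contrib, key=fault_contrib.get, reverse=True)[:2]

print(f"normal deviation: {normal_dev:.2f}")
print(f"fault deviation:  {fault_dev:.2f}, alert={fault_dev > ALERT_THRESHOLD}")
print("main contributors:", top_contributors)
```

In this toy data the bearing temperatures rise while motor current stays at a normal value, so the two temperature sensors dominate the contribution ranking, much as the two bearing temperature points do in the figure caption.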

Deploying the Solution

One of the big savings of the InStep solution involved minimizing the number of plant personnel needed to monitor the fleet. The original calculation for a decentralized program manned by 300+ part-time corporate-level individuals included an allowance of time for training, individual software installations, and periodic updates and refresher courses. This would have led to a commitment of seven man-years every 12 months to reduce the 2% of catchable unplanned forced losses. The InStep solution allows a user to quickly assemble and train a group of related plant process computer points in a model that, when deployed, will constantly monitor those points for behavior that deviates from the norm.

InStep’s PRiSM learns nominal system behavior from archived system data and automatically develops a model of nominal operation that is stored in a knowledge base. This system knowledge base is continuously compared to the online system to alert personnel of pending equipment problems. The system alerts an individual to parameter relationship changes that should be investigated for potential adverse equipment conditions that could lead to equipment failure. The system features an online graphical user interface, web-based alarm management, and easily read anomaly visualization.

"One of the critical elements for success with this project was the speed at which we are able to model. It was literally a half an hour or less and a model would be ready. We put together 500 models and the pilot began," says Demars. "PRiSM was developed for installations like Exelon’s with many distributed, complex assets and a premium on personnel. The PRiSM technology provides for the ability to best capture and use the knowledge of a few key senior personnel, allowing their experience to be applied across a fleet of assets in place of just a single plant," says Sean Gregerson of InStep Software, who worked with Exelon on the project.

Due to the design of the technology infrastructure and the integration of project functions within an existing corporate engineering group, the centralized performance monitoring concept was implemented with comparatively little staffing.

Fault Catching and Cost Savings

Personnel savings were just the beginning. Within two months, the catching of one major fault and two smaller faults saved Exelon over $500,000. The following descriptions of these faults were provided by Demars.

First Catch. We discovered that a condensate pump motor’s bearing oil temperatures were not within the allowable range as defined by the multidimensional PRiSM model. The cause was found to be an improperly assembled coupling that was seizing and approaching mechanical failure. Had this gone undetected, the coupling failure would have damaged both the motor and the pump, requiring a replacement time of four to six weeks.

Replacement cost, expediting fees, and craft overtime were estimated at $700,000. The probability of this failure was estimated as 0.70, for a risk-weighted cost of $490,000. Online loss of the pump with a failure of the standby pump to start would have resulted in a power reduction of 34% for 12 hours, or about $100,000. The probability of this failure scenario was estimated as 0.10, for a risk-weighted cost of $10,000. Additionally, potential fatalities or injuries resulting from the ejection of coupling material were completely avoided.

Second Catch. The second significant catch was a service water temperature controller failure that would have resulted in a $30,000 loss. The main turbine vibration model went into alert on a small step change on the number 11 bearing. The vibration level itself was not significant enough to cause an alarm in any of the normal plant-monitoring systems. The step change was caused by a change in the generator hydrogen temperature, which is controlled by stator water cooling and then by service water. The stator water turbine trip function had not been blocked, as it had been at other plants. At most plants, the temperature/flow control valve was gagged to limit its travel so that it could not induce large swings in temperature and potentially cause a turbine trip.

Although staff were planning to make those changes during the upcoming refueling outage, Peach Bottom Atomic Power Station was still susceptible to that type of turbine trip. In other words, acting on the conditions identified by the software may have prevented a turbine trip, which results in a reactor plant shutdown as well as a couple of days of lost revenue. A trip of the main turbine would have resulted in a loss of generation for 24 hours, or $600,000. The probability of a turbine trip was estimated as 0.05, for a risk-weighted cost of $30,000.

Third Catch. The third catch was a reactor feed pump (RFP) lube oil cooler temperature controller failure that would have been a $20,000 loss. A nuclear unit was recovering from the effects of a voltage transient, induced by a transformer failure, that caused some system isolations and momentary power losses. Shortly after the transient, the RFP bearing models for all three pumps went into alert. The plant was notified the following day that one of the controllers had not recovered from the initial transient and was continuing to cycle significantly. The station determined that the controller for the RFP oil cooler had failed; staff were able to stabilize temperatures manually until the controller was replaced.

The worst-case scenario is bearing damage due to rapid overheating and loss of the RFP. The physical damage was estimated at $100,000 with a probability of 0.10, and the lost generation of 33% for 24 hours at $200,000 with a probability of 0.05.
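The avoided-cost figures for the three catches follow a simple pattern: each failure scenario's estimated cost is multiplied by its estimated probability. The Python sketch below reproduces that arithmetic from the dollar amounts and probabilities given above. The data structure and names are illustrative only and are not Exelon's actual calculation tool. Note that the itemized figures as stated sum to $550,000, slightly more than the $540,000 total quoted elsewhere in the article; the article does not explain the difference.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scenario:
    description: str
    cost_usd: int
    probability: float

    @property
    def expected_usd(self) -> int:
        # Risk-weighted cost, rounded to whole dollars to hide float noise.
        return round(self.cost_usd * self.probability)


# Costs and probabilities as stated in the article.
CATCHES = {
    "First catch (condensate pump coupling)": [
        Scenario("Replacement, expediting, craft overtime", 700_000, 0.70),
        Scenario("34% power reduction for 12 hours", 100_000, 0.10),
    ],
    "Second catch (service water temperature controller)": [
        Scenario("Main turbine trip, 24 hours lost generation", 600_000, 0.05),
    ],
    "Third catch (RFP lube oil cooler controller)": [
        Scenario("RFP bearing damage", 100_000, 0.10),
        Scenario("33% lost generation for 24 hours", 200_000, 0.05),
    ],
}

totals = {
    name: sum(s.expected_usd for s in scenarios)
    for name, scenarios in CATCHES.items()
}
grand_total = sum(totals.values())

for name, value in totals.items():
    print(f"{name}: ${value:,}")
print(f"Sum of itemized estimates: ${grand_total:,}")
```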

Avoided Costs Add Up

According to Demars, "The total avoided costs for the two-month period was $540,000. If detected failures of a similar magnitude continue to be revealed by the InStep CPM solution, we expect an annualized avoidance of $3.3 million. Avoidance of a failure of a generation critical component could also easily exceed this amount, but the cost avoidance calculation methods are conservative."
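The article does not state how the two-month figure was annualized. One plausible reading, assumed here, is plain linear scaling from two months to twelve. The short Python sketch below applies that assumption to the quoted $540,000 and, for comparison, to the $550,000 obtained by summing the per-catch estimates given earlier; the function name is illustrative.

```python
PILOT_MONTHS = 2  # length of the period over which the faults were caught


def annualize(avoided_usd: float, months: int = PILOT_MONTHS) -> float:
    """Scale an avoided-cost figure linearly to 12 months (assumed method)."""
    return avoided_usd * 12 / months


quoted_total = annualize(540_000)    # total quoted by Demars
itemized_total = annualize(550_000)  # sum of the three itemized catches

print(f"Annualized from quoted total:   ${quoted_total:,.0f}")
print(f"Annualized from itemized total: ${itemized_total:,.0f}")
```

Under this assumption the itemized sum scales to exactly $3.3 million, while the quoted $540,000 scales to about $3.2 million, so the annualized estimate appears to rest on the itemized per-catch figures.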

The PRiSM software application has now become a critical part of Exelon’s fleet-monitoring solution. According to Demars, "The use of this intelligent monitoring technology within a centralized group monitoring a fleet of generating stations would apply across the industry."

— Contributed by Steve Lundin (slundin@bigfrontier.org) BIGfrontier Communications Group.