We can tune PID controllers, but what about tuning the operator?
by Stephen E. Rubin, President & CEO, Longwatch, Inc.
The purpose of tuning loops is to reduce errors and thus provide more efficient operation that returns quickly to steady-state efficiency after upsets, errors or changes in load. State-of-the-art manufacturers in process and discrete industries have invested in advanced control software, manufacturing execution software and modeling software to “tune” everything from control loops to supply chains, thus driving higher quality and productivity.
The “forgotten loop” has been the operator, who is typically trained to “average” parameters to run adequately under most steady-state conditions. “Advanced tuning” of the operator could yield even better outputs, with higher quality, fewer errors and a wider response to fluctuating operating conditions. This paper explores the issue of improving operator actions, and a method for doing so.
Over the past decade we’ve spent, as an industry, billions of dollars and millions of man-hours automating our factories and plants. The solutions have included adding sensors, networks and software that can measure, analyze and either act or recommend action to help production get to “Six Sigma” efficiency. However, few, if any, plants are totally automated. Despite a continuing effort to remove personnel costs and drive repeatability through automation, all plants and factories have human operators. These important human assets are responsible for monitoring the control systems, either to act on system recommendations, or override automated actions if circumstances warrant.
Most of the time, operators let the system do what it was designed and programmed to do. Sometimes, operators make errors of commission, with causes ranging from misinterpretation of data to poor training, or errors of omission, attributed to a lack of attention or a slow response. An operator’s job has often been described as hours of boredom interrupted by moments of sheer panic. What the operator does during panic situations often depends on how well he or she has been trained, or “tuned.”
The focus on reducing human error isn’t trivial: multiple studies by the US Department of Energy (DOE) and the Electric Power Research Institute (EPRI), based in Palo Alto, California, have identified the probability of errors in power plants and nuclear facilities. System availability, quality of output and operator safety will become even more important as industry restructuring (such as the Smart Grid Initiative) takes hold. We can expect similar issues across the broad spectrum of other manufacturing and process industries.
We know from life experience (and management training) that “a chain is only as strong as its weakest link.” The irony in our factory automation strategy is that while we’ve invested heavily to improve our data sensing and automation systems, we haven’t made similar investments to strengthen and improve the human element (the operators) in our systems.
There are instances where time is wasted just figuring out “what happened” before remedial action is taken. Often, the plant is down while troubleshooting occurs, or the problem can’t be determined until the engineers are allowed in to do the troubleshooting. In the meantime, the plant must operate at reduced output or quality.
While we can view the conditions that occurred during an incident with process historians, we don’t really know what the operator response was. Operator console monitoring software, which records what the operator saw on the HMI and what his or her response was, makes it possible to reconstruct exactly what happened for diagnostic, operator training or process improvement purposes.
Figure 1: One way to “tune” operators is to observe and critique their actions during a plant alarm or incident. Modern video and historian tools allow engineers and operators to “go back in time” to view what happened, and what actions the operator took at the HMI during the incident. This photo shows a Wonderware ActiveFactory screen (center left), with historical data on the top and video from the plant floor on the bottom. The center right screen shows a video recording of the HMI screen, and what actions the operator took during the incident. With all this information, it is possible to reconstruct exactly what happened during an incident, and what the operator did about it.
For example, years ago I was doing a plant start-up involving a new distributed control system at a cement plant. We received a call saying some “ductwork had collapsed” and the control system was suspected as the culprit. After arriving at the plant with a colleague, we spent a full day poring over alarm printouts and talking with operators, even decoding register stacks in the software – looking for the problem (the “ductwork” was a 15-foot diameter 30-foot long steel duct connecting the preheat tower with the kiln). Ultimately, we discovered that the induced draft fan damper’s hydraulics were connected backward (when the operator commanded “open,” the damper closed and thus the duct collapsed like a plugged drinking straw). Had we only had a video recording of the operator’s console display, we could have seen the command, the damper position indicator and the vacuum measurement. It would have saved man-hours of engineering time, and would have had the plant up and running many hours sooner!
Figure 2: In this oil spill, the operator was not notified until the floor in the pumping station was inches deep in fuel oil, and the leak detector finally went off. Observing the video absolved the operator, because it became obvious that the leak detector was in the wrong place.
Another example involves an oil leak at a pumping station (Figure 2). While the remote video camera recorded the number 6 fuel oil spilling onto the floor, the leak detector didn’t go off until the floor was flooded. When it did, the operator quickly switched to the video feed, saw the leak, and called in a cleanup crew. If a console recorder had been installed, it would have helped diagnose the problem immediately: the leak detector was in the wrong place. By putting the remote video on one screen and the operator console video on another screen, engineers could have seen exactly what conditions existed in the pump house when the operator was finally alerted.
A console recorder can clear an operator of possible errors. In the case of the oil spill, it would show that the operator did exactly what was necessary given the information available at the time. If regulations require an operator to perform certain functions at certain times or after certain activities, the console recorder can document that the procedures were done properly. It would give operators confidence to know that they are being “backed up” by the console recorder.
The Approach and the Obstacles
The simplest way to provide operators with a means for improving their performance is to give them visual feedback of their activities. Professional sports players, coaches and teams watch “game films” and “scouting films” on a regular basis. Films of their own game performance provide an efficient means of reviewing game conditions, plays, actions, errors, and “what we want to do differently next time.” Scouting films help teams develop strategies for winning, and feed information into practice sessions so that “muscle memory” can be called upon readily and confidently. (This same technique is also used for initial and recurrent training of airline pilots.)
Until now, the use of operator console playback to support better operations has been extremely limited. A few systems allow the playback of historically archived data through the human machine interface (HMI) display. This technique is cumbersome because it requires the historical archive to have collected all necessary data prior to playback. This method re-creates the display at the sample interval of the historian, not necessarily at the frequency of the display, thus raising the possibility of aliasing display data. Worse yet, it doesn’t answer the question, “what was the operator actually looking at when the event of interest occurred?”
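The aliasing risk can be illustrated with a minimal Python sketch. It assumes a display that refreshes once per second and a historian that archives every 10 seconds; both rates, the trace values and the function name are hypothetical, chosen only for illustration.

```python
def sample_at_interval(signal, interval):
    """Return (index, value) pairs taken every `interval` samples."""
    return [(t, signal[t]) for t in range(0, len(signal), interval)]


# Hypothetical 60-second trace of a reading as drawn on the display (1 Hz).
display_trace = [0] * 60

# A 3-second excursion that the operator would have seen on screen.
for t in range(31, 34):
    display_trace[t] = 100

# Assumed historian archive rate: one sample every 10 seconds.
historian = sample_at_interval(display_trace, 10)

peak_on_display = max(display_trace)
peak_in_historian = max(value for _, value in historian)

print("peak on display:  ", peak_on_display)
print("peak in historian:", peak_in_historian)
```

Under these assumptions the excursion is visible in the display trace but falls between archive samples, so a replay rebuilt from the historian would show a flat line where the operator actually saw a spike.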
Another method involves putting a video camera over the shoulder of an operator to record the screen. Unfortunately, the video camera typically does not have enough resolution to see everything on the screen and, if the operator has multiple screens, it requires a camera for each screen. Also, there is no way to coordinate this video with other plant systems, such as the historian and video from the plant floor. In some venues, work rules discourage the use of cameras in the control room.
The ideal solution is a system that can coordinate what the operator actually saw, video from the plant area where a situation occurred, and data from an historian. Such a system can recreate the exact conditions that occurred during an incident.
For example, “alarm floods” are an ongoing problem. Studies show that 90% of alarms are due to incorrect system configuration and poor alarm strategies. Alarm floods usually occur at the worst possible time for a control room operator, such as start-up, shutdown and trips. Eliminating alarm floods requires a complete analysis of alarm priorities. This would be much easier if the analysts could see what the operator was presented with during an alarm flood.
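As a rough illustration of where such an analysis could begin, the sketch below scans a list of alarm timestamps and marks the moments at which a flood starts. The threshold of more than 10 alarms in any 10-minute window is a commonly quoted rule of thumb and is an assumption here, not a figure taken from the studies mentioned above; the function and variable names are hypothetical.

```python
from collections import deque


def find_flood_starts(alarm_times, window_s=600, threshold=10):
    """Return the timestamps at which the number of alarms in the trailing
    window first exceeds `threshold` (assumed flood definition)."""
    window = deque()
    in_flood = False
    starts = []
    for t in sorted(alarm_times):
        window.append(t)
        # Drop alarms that have aged out of the trailing window.
        while window[0] <= t - window_s:
            window.popleft()
        if len(window) > threshold:
            if not in_flood:
                starts.append(t)
                in_flood = True
        else:
            in_flood = False
    return starts


# Hypothetical alarm log, in seconds: a quiet period, then a burst of
# twelve alarms one second apart, then a single late alarm.
alarms = [0, 300, 900] + list(range(1000, 1012)) + [3000]
print(find_flood_starts(alarms))
```

Each returned timestamp could then serve as a cue point into a recording of what the operator was shown when the flood began.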
A 21st Century Solution
With the advent of inexpensive computers, digital video recording and specialized software, there are new ways to improve operator training to reduce errors and improve uptime and quality. The Longwatch Console Recorder software is a small module that is loaded into the computer running the HMI or DCS/SCADA console display software. This “software camera” takes the image generated by all programs using the computer’s display (including HMI software) and presents it to the Longwatch Video Engine where it is recorded as a digital video stream. The result is a video file like one recorded by a real camera but, in this case, it contains the image of the HMI display, as well as the mouse movements as the operator moves and clicks around on the screen. (See also our November 2009 article, “The next HMI revolution!”)
Because the Console Recorder is recording exactly what is being shown on the operator’s display, there is no ambiguity about what the operator is seeing. Thus, ex-post-facto analysis of the display might offer insight as to whether the sensed data (measurements and statuses) were displayed properly, and whether the operator took appropriate action. The Console Recorder’s video will also show if the operator was looking at the appropriate display or was otherwise distracted.
The Longwatch Video Historian expands the analysis capability further. The unique “message mapping” capability of the Video Historian enables the user to see exactly what was being displayed on a variety of recorded consoles when particular events occurred. Many plants use multiple HMI displays, typically the “three CRT” configuration pioneered by DCSes years ago. Multiple HMIs can display events and data ranging from plant alarms to messages generated by other applications (such as workflow tracking or quality/lab testing subsystems). Being able to see what was on all the displays at the time of an event can be very useful for reinforcing good manufacturing and safety practices, and for training operators on which screens to watch under various conditions.
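The idea of mapping an event to recorded video can be sketched in a few lines of Python. This is not the Longwatch implementation; the Segment record, the file names and the locate function are hypothetical, and the sketch assumes that each console's recording is stored as timestamped segments.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass
class Segment:
    console: str     # which operator console was recorded
    start: float     # time at which the segment begins, in seconds
    duration: float  # length of the segment, in seconds
    path: str        # hypothetical file name of the recording


def locate(segments, event_time):
    """For each console, return (path, offset) of the segment that
    covers event_time; consoles with no coverage are omitted."""
    by_console = {}
    for seg in sorted(segments, key=lambda s: s.start):
        by_console.setdefault(seg.console, []).append(seg)

    hits = {}
    for console, segs in by_console.items():
        starts = [s.start for s in segs]
        # Index of the last segment that starts at or before the event.
        i = bisect_right(starts, event_time) - 1
        if i >= 0 and event_time < segs[i].start + segs[i].duration:
            hits[console] = (segs[i].path, event_time - segs[i].start)
    return hits


segments = [
    Segment("console-1", 0, 600, "c1_0000.vid"),
    Segment("console-1", 600, 600, "c1_0600.vid"),
    Segment("console-2", 0, 600, "c2_0000.vid"),
]

# An alarm message stamped at t = 725 s.
hits = locate(segments, 725)
print(hits)
```

In this example only console-1 has a recording that covers the event, so the lookup returns its second segment with an offset of 125 seconds. The same lookup run against every recorded console is what would let a reviewer line up all displays at the moment of an alarm.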
Recording the console displays is feasible at last due to the advances in networking and video management technology. The foundation of the system is digital video recorders, called Longwatch Video Engines (LVEs). These collect video from the consoles or real cameras, archive the video in distributed data stores and monitor for real-time events that indicate areas of interest in the video.
The HMIs or DCS consoles are connected to the LVEs with a small piece of software (the actual Console Recorder) that captures the video on its way to the computer’s graphics adapter and sends a copy to the LVE via the network (Figure 3).
Figure 3: The Longwatch Console Recorder can play back what operators were watching at the time of an incident. Each HMI or DCS workstation is connected to Longwatch Video Engines using a small “Screen Grabber Applet.” The SGA takes a copy of the screen image of the HMI console (including mouse movements), compresses it and sends it efficiently over the network to the Longwatch Video Engine, where it is processed and recorded. Video from the operator consoles can be combined with historian data and video from plant cameras to reconstruct an event. See Figure 1.
Being able to play back what the operator was actually seeing – along with a video recording of the operator’s actions and even combined with actual video from the plant – can be valuable for other purposes. Consider the classic experience of an engineer getting the phone call in the middle of the night saying that something isn’t working right. After arriving in the control room, the first series of questions is, “what happened, what did you do, and did you change anything?” The answers to the last two questions are often “nothing” and “no.” A video playback of the display might corroborate the answers, or it might shed light on things forgotten.
Flight data recorders and cockpit voice recorders have become standard equipment in U.S. airliners. They have had a material effect in helping reconstruct accident scenarios and, in doing so, offer real information for improvements in airplane and airport design, flight control systems, and pilot/crew training. As a result, the safety and efficiency of air travel have improved in the United States to the point where it is safer than and cost-competitive with automobile travel.
The same opportunities exist in the operations of our plants. Large, complex facilities (think: nuclear power plants) have simulators to help train operators. But with the Console Recorder, every plant, regardless of its size or complexity, can collect real information about operations and operator actions – so that improvement in operations, design and training can be achieved.
The Console Recorder could be combined with simulator training, so the instructor can play back what the operator did during a simulation and critique the operator’s actions.
Every plant has one or more excellent operators who always know what to do in every situation. With the Console Recorder, it’s possible to analyze an expert operator’s response to various process conditions, and use it to train new operators.
System integrators (and plant engineers) often put together HMI displays based on their own designs and experience. The same holds true for computer programmers building programs that require input from and display results to “mere mortals.” In the “old days,” there were two ways to build HMI displays: one was to have the engineer guess what the operator needed and wanted. The other was to have the engineer stand over the operator and observe the operator’s actions to see if the HMI display was working properly. In the former scenario, the chances that the operators would embrace the output of an engineer who “wasn’t one of us” were slim. In the latter scenario, a lot of time-consuming observation yielded marginal results.
For example, many HMI displays use high resolution graphics, 3D icons, flashing colors and other modern techniques to display information. But do operators actually use these displays? Or are they only used when VIPs are touring the control room or an engineer is watching? Analysis of operator actions might reveal that operators actually prefer a simple faceplate display to exotic graphics and, as soon as the design engineers leave the control room, they switch over to the tried-and-true displays. If so, then the display engineers should concentrate on improving those displays instead of developing “VIP displays.”
With a Console Recorder, the engineer can zero in on particular operations, alarms and events and see what was displayed and how the operator reacted. In this learning environment, the designer can quickly determine if the man-machine interface display and command sequence is clear and unambiguous. In the spirit of Kaizen, or continuous improvement, the engineer can use these console recordings to refine the data display and command entry techniques. Better yet, the recordings can show the designer what happened during unusual events: things that would be difficult to observe unless you were living in the control room, but easy to observe because the Console Recorder captured the “game highlight.”
Integrators who use console recorders to demonstrate their commitment to continuous improvement can develop a competitive advantage and additional value for their clients.