HOWTO: How to document downtime and maintenance for OEE improvement?

To improve your OEE, you need to know which issues in your operation keep you from reaching your target score. This article looks at how to document the downtime and waste that impact your OEE, and provides a common pattern/template that serves as the basis for any subsequent Pareto analysis.

What is OEE?

Manufacturers around the world strive each day to keep up with their planned schedule and to make their products on time and within the allotted budget. This can become very complicated, as unforeseen machine breakdowns or other factors such as labor shortages can affect the finished product in terms of both time and money, preventing manufacturers from reaching maximum OEE.

For manufacturers who want to understand where losses occur in their business and to increase productivity over time in a standardized way, OEE is one way to track that. Overall Equipment Effectiveness, or OEE for short, is a measure of how much of a machine’s potential is actually used; it brings to light opportunities and areas of concern so that they can be improved.

OEE is used in manufacturing industries to identify, monitor, and reduce production losses. As such, it has become a universal Key Performance Indicator (KPI) for manufacturers and a lean manufacturing best practice. This is why steps should be taken to improve it.

OEE is most helpful for machine-led processes, such as automated MES-driven manufacturing lines. Production processes with a large share of manual assembly are often driven more by the worker’s pace than by the machine’s takt.

So how can we implement OEE in the manufacturing industry?

Simply put: by documenting downtimes, maintenance work, and losses, because without those records you do not have the data needed to study your operation.

Why should we document downtime or maintenance work?

You should document your downtime and losses to see which problems arise regularly and at what intervals. That way, you can carry out preventive maintenance before an issue arises, or tackle it at an early stage, so that a small problem does not grow into a big one that consumes extra resources such as time, labor, or material.

Good documentation matters because, without records that show the problems each machine is having, you cannot answer questions such as:

  • What occurs?
  • How often does it occur?
  • What was the solution that was applied to it?
  • What can be done to prevent it from happening again?
  • Or what can be done to learn at an early stage that the problem has arisen?

OEE is based on three factors: Availability, Performance, and Quality.

  • Availability: covers all events that halt planned production long enough to affect the overall production schedule (typically several minutes or more). These events should be noted in the documentation.
  • Performance: covers anything that causes the manufacturing process to run at less than the maximum possible speed, including both slow cycles and small stops.
  • Quality: covers all manufactured parts that do not meet quality standards and need to be reworked or remanufactured.
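
The three factors are conventionally multiplied to give the OEE score. The sketch below assumes the common textbook definitions (availability = run time / planned production time, performance = ideal cycle time × total count / run time, quality = good count / total count). The function name `oee` and the shift figures are hypothetical and only illustrate the arithmetic.

```python
def oee(planned_min, downtime_min, ideal_cycle_s, total_count, good_count):
    """Return (availability, performance, quality, oee) as fractions of 1."""
    run_min = planned_min - downtime_min                           # time the machine actually ran
    availability = run_min / planned_min
    performance = (ideal_cycle_s * total_count) / (run_min * 60)   # run time converted to seconds
    quality = good_count / total_count
    return availability, performance, quality, availability * performance * quality


# Hypothetical shift (assumed numbers): 480 planned minutes, 60 minutes of stops,
# 1.0 s ideal cycle time, 21,000 parts produced, 20,000 of them good.
a, p, q, score = oee(480, 60, 1.0, 21_000, 20_000)
print(f"A={a:.1%}  P={p:.1%}  Q={q:.1%}  OEE={score:.1%}")
```

Because the score is a product, a loss in any one factor lowers the overall result, which is why each kind of loss needs to be documented separately.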

Importance of Downtime Documentation

If you do not use documentation, you rely on memory about the major issues in the production. If you have a small operation with a handful of people, that might be good enough. If you have multiple shifts and larger teams, it’s important to write things down to share knowledge and to remember later.

With documentation you can determine what issues happen with what frequency and you can prioritize your efforts to improve your OEE.

Methods to track Downtime and Waste

So, the question now arises “How to track downtime?”

It can be tracked by two means:

  • Manual Tracking
  • Automated Tracking

Manual tracking

Manual tracking means that the team records stops and waste by hand, either in a paper shift book or in a software tool. There is no automation to remind operators to write things down, only a convention to document each “exception”.

This method is prone to human error: it is easy to forget to record data, and downtime is often not measured correctly. There is a high risk of a skewed view of the plant’s actual production performance, typically because of:

  • Unreported downtime periods
  • Downtime durations based on a guess

Operators might not bother reporting downtimes that are common, such as tooling changeovers. Even when downtime is reported, critical details might be forgotten or left out of a report. Busy operators might simply document that the machine was down with a gut feeling about the duration.


Figure: Sample of manual downtime tracking after entry into an Excel spreadsheet (by Smartsheet).

However, manual tracking is much cheaper than automated tracking and requires no special equipment.

It’s a good way to start tracking and the data can still be used to get some insights.

Automated machine tracking

Automated machine tracking can provide a more accurate picture of what’s going on in your factory. By tracking data points such as equipment utilization, work order completion rates, and downtime events, you can get a clear understanding of where your bottlenecks are and what needs to be done to fix them. Usually a sensor or machine status will report a stoppage at the line electronically. The operator is typically prompted on a tablet, computer or HMI to qualify the stop:

  • Why the machine stopped
  • What actions were taken
  • Which category applies, since common tasks are often categorized (“setup”, “weekly cleaning”, …)

The software will record the duration of the stop and, if available, the waste, along with the operator’s notes.

The automation of data collection via machine monitoring addresses many of the problems encountered with manual methods.

Software linked to a machine’s control system can track accurate start and stop times automatically for all downtime events. Some systems even prevent restarting the machine until operators enter a reason for the downtime, and the choice of reasons is limited to a standardized list to provide exact information. In this way, all stops are recorded, whether they last just a few minutes or longer than an hour.

Commonly used downtime tracking systems include:

  • A company-wide integrated management system (IMS)
  • Manufacturing execution systems (MES)
  • Computerized maintenance management systems (CMMS)
  • Specialized downtime tracking software connected to a machine’s PLC
  • Machine data loggers with periodic data dumps or telemetry

Organizations normally do not restrict themselves to a single mode of automated downtime tracking or a single tracking tool. Most of the options listed above can be combined to provide better and more exact downtime data. As the saying goes, “What gets measured gets managed.” Calculating downtime might seem elementary, but it should still be done. All stops and losses should therefore be tracked, whether they are unplanned stops, planned stops, small stops, slow cycles, production rejects, or startup rejects.

The kind of downtime and maintenance stops that occur

Whenever a downtime or maintenance stop occurs, a comment should be added to the record of the stop explaining what type of stop it was and what was done to solve the problem (labor, spare parts, etc.).

This is very valuable information, as the text provides a brief summary of what kind of downtime occurred, its causes, and the maintenance work done. These comments can be studied later to improve OEE. The stops and losses a machine faces usually fall under the following categories.

Planned Stops are downtimes in which equipment is scheduled for production but is not running due to a planned event. Examples include changeovers, tooling adjustments, cleaning, planned maintenance, and quality inspections. Breaks and meetings are also categorized as Planned Stops by some companies.

Unplanned Stops are those downtimes in which equipment is scheduled for production but is not running due to an unplanned event. Examples include equipment breakdowns, tool failures, unplanned maintenance, lack of labor/operator, lack of materials, being starved by upstream equipment or being blocked by downstream equipment.

“Microstops” happen on automatically tracked lines. There can be thousands of small stops where the machine waits for an upstream process to forward work-in-progress pieces. The line isn’t stopped; the intake is just waiting for a few seconds or milliseconds. It’s not practical to annotate each of these stops with a comment, but together they can be significant and should be measured as a sum. This data counts against the “Performance” part of the OEE.
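
Measuring microstops as a sum can be sketched as follows. The example assumes the line controller exports each stop as a duration in seconds and that anything shorter than a chosen threshold counts as a microstop. The function name, the 5-second threshold, and the sample durations are all assumptions for illustration; pick a threshold that fits your line.

```python
def summarize_microstops(stop_durations_s, threshold_s=5.0):
    """Sum up microstops and return the longer stops that still need a comment."""
    micro = [d for d in stop_durations_s if d < threshold_s]
    longer = [d for d in stop_durations_s if d >= threshold_s]
    return {
        "microstop_count": len(micro),
        "microstop_total_s": sum(micro),   # reported as one figure, not annotated one by one
        "stops_to_annotate": longer,       # these get an operator comment each
    }


# Hypothetical stop durations in seconds, as exported by a line controller.
durations = [0.4, 0.2, 1.5, 300.0, 0.9, 2.0, 45.0]
print(summarize_microstops(durations))
```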

Slow Cycles happen when equipment runs slower than the Ideal Cycle Time (the theoretically fastest possible time to manufacture a single piece). Examples include dirty or worn-out equipment, poor lubrication, substandard materials, poor environmental conditions, operator inexperience, startup, and shutdown. This can be related to “Microstops” as the result is also a lower throughput.

Production Rejects are defective parts produced during stable (steady-state) production. This includes parts that can be reworked since OEE measures quality from a First Pass Yield perspective. Examples include incorrect equipment settings, operator or equipment handling errors, or lot expiration (e.g., pharmaceutical).

Startup Rejects are defective parts produced from startup until stable production is reached. They can occur after any equipment startup but are most commonly tracked after changeovers. Examples include suboptimal changeovers, equipment that needs “warm-up” cycles, or equipment that inherently creates waste after startup (e.g., a web press).

Who should document a stop?

The first person on the line when a stop occurs is usually the operator. The operator is well suited to make these entries, as he/she works closely with the machine and knows its ins and outs.

If the maintenance engineer gets involved in the troubleshooting, he/she should also document his/her work. Ideally, the maintenance work is linked to the original event documented by the operator.

What to document?

We have covered this in the preceding sections, but to summarize:

For each event (stop/waste) you should record:

  • A simple classification as mentioned above (unplanned, planned, changeover, break, …)
  • Free text comments (can be 2 fields for operator and maintenance engineer)
  • Waste amount (if not logged automatically)
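
The fields above could be captured in a record like the following. This is a minimal sketch, not a prescribed schema: the class and field names, the `StopType` values, and the sample event are hypothetical and should be adapted to your own classification.

```python
from dataclasses import dataclass
from datetime import datetime
from enum import Enum
from typing import Optional


class StopType(Enum):
    # Assumed classification list; extend it to match your plant.
    PLANNED = "planned"
    UNPLANNED = "unplanned"
    CHANGEOVER = "changeover"
    BREAK = "break"


@dataclass
class DowntimeEvent:
    machine: str
    stop_type: StopType
    started_at: datetime
    ended_at: Optional[datetime] = None   # None while the stop is still open
    operator_comment: str = ""            # free text from the operator
    maintenance_comment: str = ""         # free text from the maintenance engineer
    waste_units: Optional[int] = None     # only needed if not logged automatically

    def duration_minutes(self) -> Optional[float]:
        """Duration of the stop in minutes, or None if it is still open."""
        if self.ended_at is None:
            return None
        return (self.ended_at - self.started_at).total_seconds() / 60


# Hypothetical example event.
event = DowntimeEvent(
    machine="Line 1 filler",
    stop_type=StopType.UNPLANNED,
    started_at=datetime(2024, 1, 8, 9, 15),
    ended_at=datetime(2024, 1, 8, 9, 40),
    operator_comment="Conveyor jammed, cleared blockage.",
    waste_units=12,
)
print(event.stop_type.value, event.duration_minutes())
```

Restricting the classification to a fixed list keeps the later analysis consistent, while the two free-text fields preserve the detail that a category alone cannot carry.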

Comments are very important because they explain the work done on a problem, its cause, and the solution applied.

In this way, you can see what is really happening by reading the documentation logs in detail, identify which problems recur most often using text analysis software, and define specific measures for that particular machine so that the most frequent issues are minimized.

What’s the process of documenting downtime?

An issue occurs (say a stop)

  • The operator works on the issue
  • The operator documents his/her work on a tablet/HMI
  • If the issue is resolved, the stop is confirmed and production restarted
  • If the issue is not resolved, maintenance is called
  • Maintenance tries to rectify the issue and also documents their work on the same log entry
  • The stop is confirmed as soon as the machine restarts
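
The steps above can be sketched as a small workflow around a single log entry. The function names, the dictionary layout, and the sample stop are hypothetical; a real tablet or HMI application would store the entry in its own database.

```python
from datetime import datetime


def open_stop(machine, now):
    """Create one log entry when the stop begins."""
    return {"machine": machine, "start": now, "end": None, "notes": [], "resolved_by": None}


def add_note(entry, role, text):
    """Operator and maintenance both write to the same entry."""
    entry["notes"].append((role, text))


def confirm_stop(entry, role, now):
    """Close the entry once the machine is running again; return minutes of downtime."""
    entry["end"] = now
    entry["resolved_by"] = role
    return (entry["end"] - entry["start"]).total_seconds() / 60


# Hypothetical stop: the operator cannot resolve it, so maintenance is called.
entry = open_stop("Press 3", datetime(2024, 1, 8, 14, 0))
add_note(entry, "operator", "Reset after overload alarm, fault came back.")
add_note(entry, "maintenance", "Replaced worn drive belt.")
minutes_down = confirm_stop(entry, "maintenance", datetime(2024, 1, 8, 14, 50))
print(minutes_down, entry["notes"])
```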

Such a process creates one record per event, and both operator and maintenance work are logged on that record. If the stop is recorded automatically, you also get downtime and

What not to do

Do not combine several events in one log entry. Record each event on its own line in Excel. Some companies keep shift books and make one summary log entry per shift. If multiple events are mixed together in one record, it is difficult for the analysis software to figure out later which action was taken for which event.

Keeping each event in a separate record also makes it clearer for you as a reader what belongs together.

Pareto Analysis

One of the best ways to identify major problems is through Pareto charts drawn by compiling the data in a downtime document.

Pareto analysis is used to identify problems within an organization. Because a large share of the impact usually comes from a relatively small number of causes, Pareto analysis strives to identify the few issues most worth resolving, or the most successful aspects of a business.
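
The data behind a Pareto chart can be sketched as follows, assuming each downtime record has already been reduced to a reason and a duration in minutes. The function name and the sample log are made up for illustration.

```python
from collections import defaultdict


def pareto_table(events):
    """events: iterable of (reason, downtime_minutes).

    Returns rows of (reason, total_minutes, cumulative_percent),
    sorted from the largest contributor to the smallest.
    """
    totals = defaultdict(float)
    for reason, minutes in events:
        totals[reason] += minutes
    grand_total = sum(totals.values())
    rows, cumulative = [], 0.0
    for reason, minutes in sorted(totals.items(), key=lambda kv: kv[1], reverse=True):
        cumulative += minutes
        rows.append((reason, minutes, round(100 * cumulative / grand_total, 1)))
    return rows


# Hypothetical downtime log: one (reason, minutes) pair per event.
log = [
    ("changeover", 45), ("jam", 20), ("changeover", 50),
    ("no material", 15), ("jam", 30), ("sensor fault", 10),
    ("changeover", 30),
]
for reason, minutes, cum_pct in pareto_table(log):
    print(f"{reason:<14}{minutes:>6.0f} min {cum_pct:>6.1f}% cumulative")
```

Reading the cumulative column from the top shows how few reasons account for most of the downtime, which is where improvement efforts should start.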

See our article about how to create your Pareto charts.

There is also an Excel add-in that can help you create Pareto charts from your downtime documentation.

Learn about the process, roles, and organizational setup needed to turn analysis into action.