Tech Writers at PF
Discover how Root Cause Analysis (RCA) helped us learn from our mistakes and reduce our bugs by 37%.
Hello “Bug Hunters”!
Why did we decide to go for Root Cause Analysis?
Working with an agile methodology, we were implementing new features at the speed of light! The quality was good, but we did not analyse the bugs we had. As a team, we realised that fixing bugs without understanding their origin is a big waste of time. By saving this time, we could focus on implementing new features instead of blindly fixing bugs. That’s when we came up with the idea of trying Root Cause Analysis.
NOTE: You can find all the details about RCA itself in various articles, such as here or on Wiki. In this blog I will explain exactly how we implemented it and what benefits it gave us, so you might find it useful to introduce into your team as well.
Long story short: we reduced our bugs by 37% simply by using RCA in our process. Quite impressive, right?
Our Journey
1. Identified “Bug Categories” as a team.
As a team, we gathered and discussed how we could divide our bugs into categories. The team included the developers, the product owner and me, the QA.
Note: Initially we had 10 categories to keep things simple at the start, but later we came up with additional ones. So feel free to start with fewer categories and add more as needed in future.
The categories can differ a lot from team to team depending on the project, the technology and many other factors. We did this for the Property Finder Mobile App Team, and here are the categories we came up with, as an example:
- Basic Scenario: Spec — when the bug is due to not looking in the spec or user story
- Basic Scenario: Design — when the bug is due to not looking at the design
- Basic Scenario: Test Case — when the bug is due to not looking into test cases (happy path)
- Basic Scenario: Localization — when the bug is due to not looking at the translations
- Older Version of OS
- Legacy — when the bug came from old code that was reused
- 3rd-Party Library — bug in any of the 3rd-party libraries that are used
- Migration
- Undefined Scenario — A bug due to a scenario that was missed on grooming by everyone, or edge cases that are not documented
- Lack of Test Cases — The bug appeared for a scenario that did not have written requirements covered in the test cases
- Lack of Design — Bug appeared on the design that was missed
- Missing Unit Tests — Could a unit test have helped to avoid this bug?
- Integration — Bug due to side effect of another feature/bug or after integration of several components
- Non-Clean Code — bug appeared due to wrong pattern choice or lack of good code practices
- Other — specify if none of the categories above match
2. Documented it on Confluence so that everyone is on the same page. Documentation is key!
3. Made the “Defect Root Cause” field mandatory for each defect before it can be moved to the QA column in JIRA*
*we use JIRA but it can be any other similar tool!
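If your tracker cannot enforce a mandatory field out of the box, the same rule can be expressed as a small check. The sketch below only illustrates the rule itself: the `Defect` class, its field names and `can_move_to_qa` are hypothetical and are not part of JIRA or of our actual setup.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Defect:
    """Hypothetical, minimal representation of a defect ticket."""
    key: str
    root_cause: Optional[str] = None


def can_move_to_qa(defect: Defect) -> bool:
    """Allow the move to the QA column only if a non-blank root cause is set."""
    return bool(defect.root_cause and defect.root_cause.strip())


# Hypothetical tickets: one with a root cause, one without.
ready = Defect(key="APP-1", root_cause="Legacy")
not_ready = Defect(key="APP-2")
```

With these example tickets, `can_move_to_qa(ready)` is true and `can_move_to_qa(not_ready)` is false, so the second ticket would stay where it is until someone fills in the field.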
4. Each sprint the QA creates a diagram for the team to discuss and analyse the data. We chose to do this during the Retrospective meeting.
Important: For RCA, the QA should consider only valid defects that were actually fixed. Defects resolved as “Not a Bug”, “Not Valid”, “Not Reproducible” or “Won’t Fix” are not considered here.
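For anyone who prefers to prepare the diagram data with a script, here is a minimal sketch of the filtering and counting. It assumes the defects were exported as simple dictionaries; the keys `resolution` and `root_cause` and the sample tickets are made up for illustration and are not real data from our project.

```python
from collections import Counter

# Resolutions that are not counted for RCA, as listed above.
EXCLUDED_RESOLUTIONS = {"Not a Bug", "Not Valid", "Not Reproducible", "Won't Fix"}


def count_root_causes(defects):
    """Count defects per root cause, skipping the excluded resolutions."""
    counts = Counter()
    for defect in defects:
        if defect["resolution"] in EXCLUDED_RESOLUTIONS:
            continue
        # An empty field is counted under its own label so it shows up too.
        counts[defect.get("root_cause") or "Missing Root Cause"] += 1
    return counts


# Hypothetical sprint export, for illustration only.
sprint_defects = [
    {"key": "APP-1", "resolution": "Fixed", "root_cause": "Legacy"},
    {"key": "APP-2", "resolution": "Fixed", "root_cause": "Legacy"},
    {"key": "APP-3", "resolution": "Not a Bug", "root_cause": "Other"},
    {"key": "APP-4", "resolution": "Fixed", "root_cause": None},
    {"key": "APP-5", "resolution": "Won't Fix", "root_cause": "Integration"},
]

root_cause_counts = count_root_causes(sprint_defects)
```

The resulting counts can be pasted into any charting tool to produce the per-sprint diagram.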
Example of Diagram:

5. During the Retrospective meeting with the team, or in an RCA-specific discussion, we discuss the top 3 categories with the highest number of defects and define actions to prevent them from appearing in future.
Of course you can pick any number of categories to talk about, depending on the complexity of your project and how much time you want to spend in meetings.
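Picking the categories to discuss is a simple "top N" selection over the per-category counts. A minimal sketch, assuming the counts are already available as a dictionary; the numbers below are invented for illustration and are not our real figures, and `top_categories` is a hypothetical helper name.

```python
from collections import Counter


def top_categories(counts, n=3):
    """Return the n categories with the most defects, highest first."""
    return Counter(counts).most_common(n)


# Invented counts, for illustration only.
example_counts = {
    "Version of OS": 6,
    "Legacy": 5,
    "Missing Root Cause": 4,
    "Integration": 2,
    "Other": 1,
}

top_three = top_categories(example_counts)
# top_three == [("Version of OS", 6), ("Legacy", 5), ("Missing Root Cause", 4)]
```

Changing `n` is all it takes if your team wants to discuss more or fewer categories per meeting.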
Example based on diagram above:
The top 3 categories were:
- Version of OS — This happened because the team did not take into account that a new version was released for iOS devices which would contain a lot of UI kit changes. That affected our app after the OS release.
- Legacy — As many of you will empathise with, we have a lot of legacy code that we wish to replace with something nicer one day!
- Missing Root Cause — people were forgetting to fill in the Root Cause. We included this in the diagram although it is not really the root cause of a defect. Because team members kept forgetting to fill it in at the beginning, we had to measure that too! It is worth measuring in the early stages, until the team gets used to the new process and the gap is actually fixed.
Actions taken to prevent these:
1. Version of OS
- We decided to use a Beta version of Xcode on one of our machines as soon as it is available, together with a Beta version of iOS, because Apple always provides them in advance. This way we can build and see whether our app is affected by OS changes
- Documented the steps on Confluence with a plan of what to do once iOS Beta is available
2. Legacy
- Have a plan to refactor the parts of the legacy code that are most affected
- In future, state clearly in the comments exactly which component was affected by a bug
3. Missing Root Cause
- Make this field mandatory so that we won’t have the issue of people forgetting to put it in future
We noted these actions on Confluence as well, to make sure we won’t forget!
To Conclude
- First of all, we reduced defects by 37% by having the RCA mechanism in place and using this PR Template, which clearly helps avoid bugs in future. I wonder what percentage of bugs RCA can reduce for you and your team?
- The team is aware of where bugs are occurring, which gives a clear picture of our vulnerabilities.
- This data can be good evidence for upper management to re-prioritize some tech work such as refactoring, because these types of activities often lag behind creating fancy new features in terms of priorities.
- This data can also be used for personal mentoring, so that managers can work better with their employees. It is not finger-pointing; it is just a way to measure and improve your personal work and guide you towards becoming better.
That’s it! Thanks for reading!
I would be happy if you could share some feedback in the comments. If you already use RCA in your team, please share how it helps and how you use it. And please do share the percentage of bugs you reduced!
If you liked this article and want to be part of our brilliant team of engineers that produced it, then have a look at our latest vacancies here.