Understanding the PPDAC Framework

Version: 1.0.0


I have been reading ‘The Art of Statistics: Learning from Data’ and came across the PPDAC framework fairly early on in the book. David Spiegelhalter, the author, is a big proponent of the framework. I have outlined it below from my perspective as a data leader.

What is PPDAC?

A structured framework is important in statistical analysis because it helps bring order to complex problems. It does this by bringing clarity, a step-by-step approach, consistency and effective solutions, as well as supporting documentation and communication.

PPDAC stands for Problem, Plan, Data, Analysis and Conclusion. The framework was originally developed by R. J. MacKay and R. W. Oldford, and it is useful for teaching people about statistical literacy and data-driven problem solving.

Problem

Define the problem you are trying to solve.

Questions to help define the problem:

Plan

Questions to help define the plan:

Data

Data is at the centre of any statistical or data science problem. Here you need to understand how the data you need will be collected, managed, cleaned and prepared.
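To make the cleaning and preparation step a little more concrete, here is a minimal sketch in Python using only the standard library. The records, the field names (`customer_id`, `age`, `spend`) and the choice to fill missing ages with the median are all assumptions made up for illustration; a real project would pick its own rules based on the problem and the plan.

```python
from statistics import median

# Hypothetical raw records, invented for illustration only.
raw_records = [
    {"customer_id": "001", "age": "34", "spend": "120.50"},
    {"customer_id": "002", "age": "", "spend": "80.00"},       # missing age
    {"customer_id": "002", "age": "", "spend": "80.00"},       # duplicate row
    {"customer_id": "003", "age": "forty", "spend": "95.25"},  # unparseable age
    {"customer_id": "004", "age": "51", "spend": "210.00"},
]


def to_number(value):
    """Return value as a float, or None if it is blank or not numeric."""
    try:
        return float(value)
    except (TypeError, ValueError):
        return None


def clean(records):
    """Drop repeated customers, parse numbers, and fill missing ages with the median."""
    seen = set()
    cleaned = []
    for row in records:
        if row["customer_id"] in seen:
            continue  # keep only the first row per customer
        seen.add(row["customer_id"])
        cleaned.append({
            "customer_id": row["customer_id"],
            "age": to_number(row["age"]),
            "spend": to_number(row["spend"]),
        })

    # Assumed imputation rule: replace unknown ages with the median known age.
    known_ages = [r["age"] for r in cleaned if r["age"] is not None]
    fallback_age = median(known_ages) if known_ages else None
    for r in cleaned:
        if r["age"] is None:
            r["age"] = fallback_age
    return cleaned


cleaned_records = clean(raw_records)
for record in cleaned_records:
    print(record)
```

Whatever rules you choose, writing them down as code like this also documents how the data was prepared, which helps when you revisit the step later.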

Questions to help understand the data:

Analysis

Here you explore the data, apply statistical techniques, and interpret and visualise the results.
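As a small illustration of the exploration part of this step, the sketch below computes basic descriptive statistics for two groups and the difference between their means. The sales figures and the group names (`layout_a`, `layout_b`) are invented for the example, and a real analysis would go further, for instance by checking whether a difference like this could plausibly be down to chance.

```python
from statistics import mean, median, stdev

# Hypothetical weekly sales figures for two store layouts (invented data).
sales = {
    "layout_a": [120, 135, 128, 140, 132],
    "layout_b": [150, 142, 160, 155, 148],
}


def summarise(values):
    """Return basic descriptive statistics for a list of numbers."""
    return {
        "n": len(values),
        "mean": mean(values),
        "median": median(values),
        "stdev": stdev(values),  # sample standard deviation
    }


summary = {group: summarise(values) for group, values in sales.items()}
difference_in_means = summary["layout_b"]["mean"] - summary["layout_a"]["mean"]

for group, stats in summary.items():
    print(group, stats)
print("difference in means:", difference_in_means)
```

Simple summaries like these are usually the first thing to look at before reaching for more involved techniques or charts.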

Questions to help with your analysis:

Conclusion

The final step is to take your analysis and visualisations and communicate the answer to the problem to your stakeholders.

Some questions to consider at this stage:

Conclusion

The PPDAC framework is an iterative process. Depending on the problem you are trying to solve and the data you have, you may need to go through the cycle multiple times. You may find that the problem needs to be reassessed, the plan tweaked, or more and better data collected, and these are just a few of the challenges that may arise. The key is that the framework provides clarity and consistency, giving you a structure for dealing with complex problems and for making data-driven decisions.