It is often remarked that 80% of the work of data science is data cleaning, while only 20% is analysis (Browne-Anderson, 2018). Despite this, what data cleaning actually entails is largely obscured, and it is often dismissed as a tedious and laborious yet necessary exercise (Rawson and Muñoz, 2019). While
Data broadly, and particularly ‘big data’, is increasingly leveraged to develop a wealth of digital projects and products across academia, government and the commercial sector. While often celebrated for being ‘frictionless’ and enabling efficiency through automation, these processes of computation and digitisation implicate human labour at every stage of the