This section introduces and evaluates my personal experience of collecting data from different sources and cleaning different types of data. Text is an important information carrier for human beings (Dragulanescu, 2002; Yang et al., 2018), and text analysis is an important tool for
It is often said that 80% of the work of data science is data cleaning, while only 20% is analysis (Browne-Anderson, 2018). Despite this, what data cleaning actually entails remains largely obscured, and it is often dismissed as a tedious and laborious yet necessary exercise (Rawson and Muñoz, 2019). While