From the course: Cleaning Bad Data in R
Suspicious multiples
- [Instructor] Another common source of suspicion in datasets is unusual recurring multiples. The most common example is when all of the values in a dataset end in several zeros. This may be the result of rounding, or it may come from extrapolation. For example, earlier in this course we used this dataset containing the number of acres of public land in each state. Did you notice anything suspicious about this dataset when we first looked at it? Well, all of the values here end in three zeros, and there's a good reason for that. I built this file using a government source document. Let's take a look at that document. The data in our file comes from the first and third columns of this page. Take a look at the third column. It contains the total area of national forest system land, but it's showing the data in thousands of acres. So Alabama is listed as 665, which represents 665,000 acres. Back here in the dataset we have the round number 665,000, that's not the exact…
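The check described in the transcript, whether every value shares a suspicious multiple, can be sketched in a few lines of base R. This is a minimal illustration, not the instructor's code: the `acres` vector and the `share_multiple()` helper are hypothetical names, and only Alabama's 665,000 comes from the transcript. The other values are invented to mirror the pattern.

```r
# Hypothetical sample: 665000 (Alabama) is from the transcript;
# the remaining values are made up for illustration only.
acres <- c(665000, 1234000, 87000, 402000)

# Hypothetical helper: proportion of values that are exact multiples of `base`.
# `%%` is R's modulo operator; mean() of a logical vector gives the share of TRUEs.
share_multiple <- function(x, base) {
  mean(x %% base == 0, na.rm = TRUE)
}

# Every value is a multiple of 1,000, which hints at data recorded in thousands.
share_multiple(acres, 1000)

# Not every value is a multiple of 10,000, so the rounding unit is likely 1,000.
share_multiple(acres, 10000)
```

A share of 1 for a large base such as 1,000 does not prove the data is wrong. It is a prompt to check the source document for a unit note such as "thousands of acres".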