From the course: Learning Amazon SageMaker (2019)
Data summary tools
- [Instructor] In the previous example, we walked through a number of simple ways to create visualizations in a Jupyter Notebook, but you can also create a number of data summary outputs. These are raw summaries of the pandas DataFrame. Of the two main tools, the first is the describe function. Again using the churn DataFrame that was imported earlier, you can run the describe function straight from it. That takes all the numerical columns and generates a count and some basic statistics: the mean, standard deviation, min and max, and the quartiles. You can review these and see how the data is shaped. Depending on the number of columns, I sometimes find it quite difficult to see what's happening from this view, so after the describe function you can add .T (capital T), which transposes the output so that you can scroll down through it instead. The first check that I like to look at is: are the counts all the same? Usually if the counts are…
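As a rough illustration of the describe and transpose steps mentioned in the transcript, here is a minimal sketch. The course's actual churn data is not shown here, so the small DataFrame below, including its column names and values, is a hypothetical stand-in. In pandas, the count row reports the number of non-null values per column, so comparing it with the number of rows is one way to run the "are the counts all the same?" check.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the course's churn DataFrame.
# Column names and values are invented for illustration only.
churn = pd.DataFrame({
    "account_length": [128, 107, 137, 84, 75],
    "day_mins": [265.1, 161.6, np.nan, 299.4, 166.7],  # one missing value
    "cust_serv_calls": [1, 1, 0, 2, 3],
    "state": ["KS", "OH", "NJ", "OH", "OK"],  # non-numeric, skipped by default
})

# describe() summarises the numeric columns: count, mean, std, min,
# the 25%/50%/75% quartiles, and max.
summary = churn.describe()

# .T transposes the summary so each original column becomes a row,
# which is easier to scroll through when there are many columns.
summary_t = summary.T

# "count" is the number of non-null values, so any column whose count
# is below the number of rows contains missing data.
counts = summary_t["count"]
columns_with_gaps = counts[counts < len(churn)].index.tolist()

print(summary_t)
print("Columns with missing values:", columns_with_gaps)
```

In a notebook, ending a cell with churn.describe().T displays the transposed table directly, without the print calls.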