Title: Computational Thinking and Big Data

Author for citation: Lewis Mitchell, Markus Wagner, Simon Tuke, Gavin Meredith, and Ian Knight of the University of Adelaide

License for content: Unknown

Publication date: 2020

This advanced course, created by the University of Adelaide and released on the edX platform, is part of the university's Big Data MicroMasters program. The scheduled 10-week course is designed to help learners better "apply computational thinking in data science" using tools such as "mathematical representations, probabilistic and statistical models, dimension reduction and Bayesian models." The course is free; a certificate of completion costs $249. The course requires an average of eight to ten hours of effort per week. Access to the class is on-demand.

The edX course description:

"Computational thinking is an invaluable skill that can be used across every industry, as it allows you to formulate a problem and express a solution in such a way that a computer can effectively carry it out.

In this course, part of the Big Data MicroMasters program, you will learn how to apply computational thinking in data science. You will learn core computational thinking concepts including decomposition, pattern recognition, abstraction, and algorithmic thinking.

You will also learn about data representation and analysis and the processes of cleaning, presenting, and visualizing data. You will develop skills in data-driven problem design and algorithms for big data.

The course will also explain mathematical representations, probabilistic and statistical models, dimension reduction and Bayesian models.

You will use tools such as R and Java data processing libraries in associated language environments."

By the end of this course, you will be able to:

"Understand and apply advanced core computational thinking concepts to large-scale data sets
Use industry-level tools for data preparation and visualisation, such as R and Java
Apply methods for data preparation to large data sets
Understand mathematical and statistical techniques for attracting information from large data sets and illuminating relationships between data sets"

About the authors

Five instructors from the University of Adelaide are affiliated with this course. To learn more about an instructor, go to the edX course page and click on that instructor's name.

General layout and contents of the course

The opening week provides an overview of the R language and RStudio. The second week covers visualizing relationships between variables, and week three covers manipulating and joining those variables and other data. In week four, students learn various methods for transforming data and reducing a dataset's dimensions. Week five turns to tools for summarizing datasets and their characteristics. Week six steps back to examine how Java relates to computational thinking and big data analysis. Weeks seven, eight, and nine address graphs, probability, and hashing, respectively. The final week pulls everything together in a practical manner.