Dear Student,

This course is scheduled to be retired on Aug. 30, 2024. You may continue to work on this course until then. We are not replacing this course at this time.  Please browse this subject to find other comparable courses.

 

NOTICE: This is an older course recorded with Adobe Connect and/or Vimeo recordings. We are currently working to replace the recordings with new Zoom recordings.  Please don't hesitate to email us at homeschoolconnections@gmail.com with any questions.

How to get the most out of Data Science, Part Two with Domenico Ruggiero:

  • First, have a notebook ready and available for class notes each live session.  Although much of the time will be spent programming the computer, you may still want to make notes for yourself.
  • Have the Python Jupyter Notebook installed and up and running before we get to that portion of the lecture.
  • Do the assignments, quizzes, and any extra work assigned for that week.  Assignments typically consist of submitting a copy of the work performed during the lectures.  Parents may request a copy of what the assignment should look like or sign up for "Instructor Access" for personalized grading and feedback.  Extra work includes watching some videos related to Data Science.  One comprehensive final exam at the end of the course covers all of the semester's content.
  • Be willing to practice what you learn on other datasets and to explore what is possible.  Your learning should focus on what is delivered in class, but should not be limited to that.  Practice makes perfect! 
  • Have fun while you learn!
  • Once the course is completed to the parent's and professor’s satisfaction, there is a Certificate of Completion at the end to be filled in for your records.

Total Classes: 15

Duration:  90 minutes

Prerequisite:

  1. An understanding of algebra is recommended, since the course relies on polynomial equations, algebraic reasoning, and problem-solving.

  2. An understanding of matrix mathematics and statistics is helpful but NOT required – they will be discussed in the lectures.

  3. Previous computer programming experience -- Python programming preferred but other programming languages are acceptable.  Computer Programming 101 (available as a recorded course through Unlimited Access) and/or Introduction to Computer Science (also available as a recorded course through Unlimited Access) would provide sufficient prerequisite experience.  Much of the analysis will take place using Python-based computer programs.

  4. General familiarity with computers, including the ability to open applications, use menu-driven commands, and type using the keyboard, so that the emphasis of the lessons is on specific programming assignments and related data-science topics.

Suggested Grade Level: 9th to 12th grade.

Suggested Credit: One full semester Computer Science or Math

Instructor: Domenico Ruggiero, MS-EM

Course Description: This valuable course is the second in a two-semester exploration of many topics associated with data science.  Data is available almost everywhere: in many industries (agriculture, medical fields, cyber-security, manufacturing, and more) and in organizations ranging from the small-scale family business to big-data corporations like Google.  Data science covers the ability to work with that data to gain insights into correlations, to visualize it in a variety of charts and plots, to identify data that appears to be an outlier from the larger dataset and/or from the trends, and to predict future outcomes based upon variable inputs.  These are just some of the ways that data is used to help people find valuable insights in otherwise chaotic and disconnected pieces of information.

Because data science can be applied to so many working environments, the study of it is no longer just limited to those who are interested in a career in Information Technology (IT).  Data science is becoming one of the fastest growing professional careers available because of its ability to find a “home” in so many industries.

Course Outline:  Topics subject to minor changes.  Topics will be interspersed throughout lectures and will span multiple weeks.

  • Data Modeling
    • Data classification
    • Linear Regression (covered in Part 1 of this course)
    • Logistic Regression
    • Bias - Variance
    • ...and more
  • Machine Learning
    • Natural Language (text mining)
    • Decision Trees
  • Getting data from external website APIs
  • Data Analysis
    • Exploring data sets of various types (sports data, traffic data, feedback reviews, etc.)
    • Working with relational datasets (one-to-one, one-to-many, and many-to-many)
    • Review of (or introduction to) statistical math methods
    • Data visualization in Python and spreadsheet applications

Course Materials:  All course materials are to be provided by the professor.  Software to be installed: Anaconda (https://www.anaconda.com) with a Python 3.x version, which is available for Windows, Mac, and Linux operating systems.  Within Anaconda, ensure that the Jupyter Notebook and Spyder add-in applications are installed.  The open source Anaconda Distribution is a convenient way to do Python data science and machine learning.
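
One possible way to confirm the installation before the first class is a short Python check like the sketch below.  It is not part of the official course materials.  The package names in the list (notebook, spyder, numpy, pandas, matplotlib) are assumptions about what a typical Anaconda setup for this kind of course includes, and check_environment is a hypothetical helper name chosen for illustration.

```python
import sys
from importlib.util import find_spec

# Assumed package list: adjust to whatever the professor actually requires.
PACKAGES = ["notebook", "spyder", "numpy", "pandas", "matplotlib"]


def check_environment(packages=PACKAGES):
    """Return a dict mapping each package name to True if Python can find it."""
    return {name: find_spec(name) is not None for name in packages}


if __name__ == "__main__":
    # Show the interpreter version; the course asks for Python 3.x.
    print("Python version:", sys.version.split()[0])
    if sys.version_info.major < 3:
        print("Warning: this course expects Python 3.x")

    # Report each package as found or missing without importing it.
    for name, found in check_environment().items():
        status = "found" if found else "MISSING"
        print(f"{name:12s} {status}")
```

Running this from the Anaconda Prompt, Spyder, or a Jupyter Notebook cell prints one line per package.  Anything reported as MISSING can usually be added through Anaconda Navigator or the conda command line.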

Homework:  Computer-generated quizzes, at-home analytical exercises, and exploration of methodologies applied towards items of personal interest.  Spreadsheet applications like Microsoft Excel and/or Open Office (https://www.openoffice.org) may also be utilized.  Students can expect 2-6 hours of study outside of class, depending upon their proficiency with programming in Python and their previous familiarity with algebra, matrix mathematics, and statistics.  If some of the math is new, extra time will naturally be needed to learn it before it can be effectively programmed.

Answer Key: An answer key book is not included with this course, though assignment guidance is provided where appropriate.