Final Thoughts

I hope you’ve enjoyed the semester. My goal was to introduce you all to the process of doing data science. I think, though please tell me if this is not true, that if we think of “Introduction to Data Science” as we would any other “Introduction to …” course (e.g., Psychology, Rhetoric, Art History), the class has been generally successful. I believe that you can all talk meaningfully about concepts such as:

  • grabbing data through API queries
  • working with JSON data
  • working with CSV data
  • building and using simple interactive graphics
  • the basics of coding with Python notes
  • functions, for loops, if statements, and other Intro CS concepts
  • using regular expressions
  • hands-on knowledge of how to document, test, lint, and debug code
  • working with network data (vertices, degree, centrality measures, visualization)
  • building and evaluating basic predictive models
  • building and evaluating unsupervised learning models
  • working with textual data (TF-IDF, Topic Models, the elastic net)
  • understanding the concepts behind neural networks
  • working with image data (displaying images, computing similarity scores)
  • parsing and manipulating HTML pages
  • thinking critically about data and communicating the results

If, like many students taking an introductory course, what you wanted was a general hands-on understanding of the ideas, I think you’re now in great shape. We need many more biologists, lawyers, ecologists, physicians, public policy experts, teachers, etc., to understand the process of data science as these ideas and their output continue to pervade our lives.

On the other hand, if I’ve just whetted your appetite for this material and you want to learn more, I have both good and bad news. The bad news is that we do not have a larger data science curriculum/major/minor/concentration. I am happy to direct you to one-off courses that may be of interest (note: most of these are in departments other than mathematics and computer science); these can be great, but unfortunately won’t fit together into a cohesive whole. The good news is that there are many opportunities for internships, full-time jobs, and graduate studies in data science and applied statistics. I am also more than happy to discuss these options.