Due Date: 22 March 2023
This page outlines the instructions for the first project. You should
have a file project03.Rmd
in your RStudio workspace where
you can work on the project. I find that students prefer having a
consistent format for the projects, so I will attempt to keep the format
the same throughout the semester.
Your group is responsible for completing two elements:
The slides must be uploaded by the first day of presentations regardless of when you present. As described on the syllabus, the project will be graded using a rubric, which can be found here.
The data for this project is similar to the previous one, but this time reviews come from Yelp. I created the data set based on what is provided by the Yelp Open Dataset project. As with before, I have selected reviewers with a large number of reviews. However, this time you have a number of different variables to work with. Variables that are available to focus on (either as a prediction or as an unsupervised grouping variable) are:
As with the first two projects, you are encouraged to take the data in whatever direction you find most interesting. The only thing I require is that you focus a non-trivial amount of your work on integrating the unsupervised approaches covered in Notebooks 8 and 9 into your analysis. Note that gender has too few categories to use unsupervised on its own and business names likely have too many to do supervised learning.
If you need some questions to get started, I am happy to provide some of these. However, there are so many directions to go in, I only want to do this if you are struggling with an idea.
Each group is working with a different city’s data. You should be
able to download your data set from within the
project03.Rmd
file.
Group 1: charlotte
Group 2: calgary
Group 3: pittsburgh
Group 4: phoenix
Group 5: las-vegas
Group 6: madison
Group 7: cleveland
Group 8: toronto
While working through the project, I typically find that many groups ask for help writing the same bits of code. Any notes that I want to share about how to do specific tasks will be added here as we work through the project.