Due Date: Noon, 22 October 2020 (Thursday)

Total Points: 60

This page outlines the instructions for the second project. You should have a file project02.Rmd in your RStudio Cloud workspace where you can work on the project. Note that the form of this assignment differs from the first.


For this project you will generate a thesis statement and a supporting data-driven argument using the movies datasets that were introduced in class. Note that a thesis statement is not the same as a hypothetical thesis statement, also known as a hypothesis. You will write your analysis in an RMarkdown file; the only thing to hand in is your knit HTML file. There should be enough writing (in full, proofread sentences) intermixed with the output datasets and plots to understand your argument and the meaning of the output without having to understand the code.

We will be working on this project in class in your groups, but each student will submit their own copy of the assignment. Submissions can range from carbon copies of the work of everyone else in your group to a completely redone version of the project. If feasible, you’re welcome to work together with your group or a subset of your group outside of class. However, you are not allowed to work directly with, or share code with, students outside of your assigned class group. Asynchronous students should work on and submit the project on their own.

Your data-driven argument should involve a thesis statement that requires putting together two or more data tables in the movies dataset but can be addressed without any additional external data. Here are several directions to start thinking about where you may want to go with this project:

These are just suggestions to get you started. Feel free to consider other relationships that you can investigate with this data. Note that you will need to start by looking at these questions and then formulate a thesis statement based on what the data shows.

Some advice that you may find helpful (note: these are requirements):

I will be happy to answer general questions about the project in class, and am always happy to help with R code questions (e.g., “We want to make a plot that does X, but are not sure how because of Y.”). However, coming up with an interesting thing to look at is generally your responsibility. I have already provided a number of possibilities above to get everyone started.


The project will be graded out of 60 points, according to the following rubric:

As noted above, please submit your knit Rmd file as an HTML document on Box. Note that you will not be able to properly preview the file on Box, but you should be able to view it locally on your machine. Everyone must submit their own copy of the project, even if it is exactly the same as others in your group. You will receive a grade for your work through the shared Box folder. A current participation grade will also be included.