Storm dataset again

Today, we are going to look at a NASA weather dataset. This particular one contains information about Atlantic storms. Read it in with the following:

library(tidyverse)  # provides read_csv(), the pipe, the dplyr verbs, and ggplot()
storms <- read_csv("https://statsmaths.github.io/stat_data/storms.csv")

It may also be useful to have a dataset giving the borders of countries:

borders <- read_csv("https://statsmaths.github.io/stat_data/nasa_borders.csv")

Using the arrange function, sort the dataset in descending order of wind speed. Which two storms had the largest wind speeds?

storms %>%
  arrange(desc(wind))
## # A tibble: 10,010 x 13
##    name   year month   day  hour   lat  long status category  wind pressure
##    <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <chr>     <dbl> <dbl>    <dbl>
##  1 Gilb…  1988     9    14     0  19.7 -83.8 hurri…        5   160      888
##  2 Wilma  2005    10    19    12  17.3 -82.8 hurri…        5   160      882
##  3 Gilb…  1988     9    14     6  19.9 -85.3 hurri…        5   155      889
##  4 Mitch  1998    10    26    18  16.9 -83.1 hurri…        5   155      905
##  5 Mitch  1998    10    27     0  17.2 -83.8 hurri…        5   155      910
##  6 Rita   2005     9    22     3  24.7 -87.3 hurri…        5   155      895
##  7 Rita   2005     9    22     6  24.8 -87.6 hurri…        5   155      897
##  8 Anita  1977     9     2     6  24.2 -97.1 hurri…        5   150      926
##  9 David  1979     8    30    18  16.6 -66.2 hurri…        5   150      924
## 10 David  1979     8    31    18  17.9 -69.7 hurri…        5   150      926
## # … with 10,000 more rows, and 2 more variables: ts_diameter <lgl>,
## #   hu_diameter <lgl>
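Each storm appears in many rows of the sorted output (one per observation time), so the top of the table repeats names. A minimal sketch of one way to get a single row per storm is to take the peak wind within each storm before sorting. The toy_storms data frame and the max_wind column below are hypothetical and their values are made up; grouping by both name and year is an assumption that guards against a name being reused in a later year.

```r
library(dplyr)

# Hypothetical toy data standing in for storms; the values are made up.
toy_storms <- data.frame(
  name = c("Alpha", "Alpha", "Beta", "Beta", "Gamma"),
  year = c(2001, 2001, 2001, 2001, 2002),
  wind = c(90, 120, 150, 140, 60)
)

# Collapse to one row per storm holding its peak wind,
# then sort from the strongest storm to the weakest.
peak_winds <- toy_storms %>%
  group_by(name, year) %>%
  summarize(max_wind = max(wind), .groups = "drop") %>%
  arrange(desc(max_wind))

peak_winds
```

Replacing toy_storms with storms should give the same kind of per-storm ranking for the real data.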

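Create a new variable in the dataset called doy (day of year), defined as month + day / 32 - 1: the month number, plus the day divided by 32, minus 1. Despite the name, this is a position within the year measured in months (just above 0 in early January, just under 12 at the end of December), not a literal count of days.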

storms %>%
  mutate(doy = month + day / 32 - 1)
## # A tibble: 10,010 x 14
##    name   year month   day  hour   lat  long status category  wind pressure
##    <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <chr>     <dbl> <dbl>    <dbl>
##  1 Amy    1975     6    27     0  27.5 -79   tropi…       -1    25     1013
##  2 Amy    1975     6    27     6  28.5 -79   tropi…       -1    25     1013
##  3 Amy    1975     6    27    12  29.5 -79   tropi…       -1    25     1013
##  4 Amy    1975     6    27    18  30.5 -79   tropi…       -1    25     1013
##  5 Amy    1975     6    28     0  31.5 -78.8 tropi…       -1    25     1012
##  6 Amy    1975     6    28     6  32.4 -78.7 tropi…       -1    25     1012
##  7 Amy    1975     6    28    12  33.3 -78   tropi…       -1    25     1011
##  8 Amy    1975     6    28    18  34   -77   tropi…       -1    30     1006
##  9 Amy    1975     6    29     0  34.4 -75.8 tropi…        0    35     1004
## 10 Amy    1975     6    29     6  34   -74.8 tropi…        0    40     1002
## # … with 10,000 more rows, and 3 more variables: ts_diameter <lgl>,
## #   hu_diameter <lgl>, doy <dbl>
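To see what the formula produces, it can help to apply it to a few hand-picked dates. The sketch below uses a small hypothetical data frame called check, which is not part of the storms data. January 1 gives 1 + 1/32 - 1 = 0.03125, June 27 gives 6 + 27/32 - 1 = 5.84375, and December 31 gives 12 + 31/32 - 1 = 11.96875.

```r
library(dplyr)

# Three hand-picked dates: January 1, June 27, and December 31.
check <- data.frame(month = c(1, 6, 12), day = c(1, 27, 31))

# Same formula as above; assigning the result keeps the new column.
check <- check %>%
  mutate(doy = month + day / 32 - 1)

check
```

Dividing by 32 rather than by the actual month length keeps the fractional part below 1 for every month, so values from different months never overlap. Note also that mutate only returns a new table; the doy column persists only if the result is assigned, as done for check here.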

Create a scatter plot of wind speed as a function of doy. The doy column created above was not saved back to storms, so use a pipe to pass the mutated data directly to ggplot.

storms %>%
  mutate(doy = month + day / 32 - 1) %>%
  ggplot(aes(doy, wind)) +
    geom_point()
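With about ten thousand rows, many points in this plot will sit on top of one another. One possible refinement, sketched here on a handful of rows copied from the output above, is to make the points semi-transparent and to label the axes. The names toy_storms and p are hypothetical, and the alpha value of 0.3 is an arbitrary choice rather than a recommendation from the original exercise.

```r
library(dplyr)
library(ggplot2)

# A few month/day/wind values copied from the printed output above.
toy_storms <- data.frame(
  month = c(6, 8, 9, 9, 10),
  day   = c(27, 30, 14, 22, 19),
  wind  = c(25, 150, 160, 155, 160)
)

# The pipe hands the mutated data to ggplot() as its first argument;
# layers and labels are then added with +, not with %>%.
p <- toy_storms %>%
  mutate(doy = month + day / 32 - 1) %>%
  ggplot(aes(doy, wind)) +
    geom_point(alpha = 0.3) +
    labs(x = "doy (month + day / 32 - 1)", y = "wind")

p
```

Swapping toy_storms for storms applies the same plot to the full dataset.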