Logistic regression

Bird dataset

To briefly switch things up, I want to look at a new dataset giving details of nearly 2000 bird species.

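set.seed(1)
# Read the tab-separated file; as.is=TRUE keeps text columns as character
x <- read.csv("../../data/avian_ssd_jan07.txt", as.is=TRUE, sep="\t")
# Keep only rows where Clutch_size, M_mass, and Egg_mass are all positive
x <- x[x$Clutch_size > 0,]
x <- x[x$M_mass > 0,]
x <- x[x$Egg_mass > 0,]
# Rows and columns remaining after filtering
dim(x)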
## [1] 1685   44
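
The three filtering steps can also be written as a single logical mask. Here is a minimal sketch on a toy data frame. The species names, the values, and the -999 placeholder for missing measurements are all assumptions made up for illustration, not taken from the bird file.

```r
# Toy data: every value here is invented for illustration only.
# The -999 sentinel for "missing" is an assumption, not a documented fact.
toy <- data.frame(
  English_name = c("sp_a", "sp_b", "sp_c", "sp_d"),
  Clutch_size  = c(2, -999, 4, 3),
  M_mass       = c(10, 20, -999, 30),
  Egg_mass     = c(1.5, 2.0, 3.0, -999),
  stringsAsFactors = FALSE
)

# One logical mask: TRUE only where all three measurements are positive.
keep <- toy$Clutch_size > 0 & toy$M_mass > 0 & toy$Egg_mass > 0

# which() turns the mask into row numbers and also drops any NA entries,
# so rows with a missing comparison are not carried along as all-NA rows.
toy_clean <- toy[which(keep), ]
dim(toy_clean)
```

With no NA values in the columns, this keeps the same rows as filtering one column at a time. The step-by-step version is arguably easier to read when you want to inspect how many rows each condition removes.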

As a sanity check, do the birds with the largest egg mass make sense to us?

x$English_name[order(x$Egg_mass,decreasing=TRUE)[1:10]]
##  [1] "Emu"                 "Wandering Albatross" "Royal Albatross"    
##  [4] "Brown Kiwi"          "Great Spotted Kiwi"  "Emperor Penguin"    
##  [7] "Mute Swan"           "Whooper Swan"        "King Penguin"       
## [10] "Little Spotted Kiwi"
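
The indexing in that check works because order() returns the positions that would sort a vector, not the sorted values themselves. Here is a small sketch with invented numbers and hypothetical names:

```r
# Invented egg masses for four hypothetical species (not real data).
egg_mass <- c(1.3, 340, 2.7, 600)
name     <- c("sp_a", "sp_b", "sp_c", "sp_d")

# Positions of the elements from largest to smallest.
idx <- order(egg_mass, decreasing = TRUE)

# Use those positions to pull the matching names; head() is a safe
# alternative to [1:n] because it never indexes past the end.
top2 <- name[head(idx, 2)]
top2
```

Indexing with [1:10] on a vector shorter than ten elements would pad the result with NA, which is why head() can be the more forgiving choice on small subsets.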

[Image: Emu]