– use a multiple linear regression. You can do this stepwise if you'd like, to describe how each predictor improves the model fit (see the sketch just below).
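A minimal sketch of a stepwise fit, assuming the same built-in cherry tree data used in the lab example below and Volume as the response (your own question may use different variables):
data(trees)
full <- lm(Volume ~ Girth + Height, data = trees) #start from the full model
step(full, direction = "backward") #step() drops terms whenever doing so lowers AIC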
Model Fitting using the cherry trees example from Lab 7
How does each predictor variable influence cherry tree volume?
Calculate a series of linear regressions to explain it.
library(boot) #note: the trees data set actually ships with base R's datasets package, so boot is not strictly required here
library(ggplot2)
library(dplyr)
data(trees)
#compute a girth:height ratio
trees <- trees %>% mutate(GH_ratio = Girth/Height )
head(trees)
## Girth Height Volume GH_ratio
## 1 8.3 70 10.3 0.1185714
## 2 8.6 65 10.3 0.1323077
## 3 8.8 63 10.2 0.1396825
## 4 10.5 72 16.4 0.1458333
## 5 10.7 81 18.8 0.1320988
## 6 10.8 83 19.7 0.1301205
#before running the model, check the predictors for collinearity to decide which model to start with.
#then fit candidate models and compare their AIC values, along these lines
#(Height must be capitalized; Girth stands in for the undefined "lc" -- runnable version below):
#model1 <- lm(Volume ~ Height + GH_ratio + Girth, data = trees)
#AIC(model1)
#model2 <- lm(Volume ~ Height + GH_ratio, data = trees)
#AIC(model2)
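A runnable version of the comparison sketched above. This assumes Girth is the intended third predictor (the original "lc" is not a column in trees), and uses a simple correlation matrix as the collinearity check.
cor(trees[, c("Girth", "Height", "GH_ratio")]) #pairwise correlations among the candidate predictors
model1 <- lm(Volume ~ Height + GH_ratio + Girth, data = trees) #full model
model2 <- lm(Volume ~ Height + GH_ratio, data = trees) #reduced model
AIC(model1, model2) #lower AIC = better fit after penalizing the extra parameter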
This question involves negative log likelihood.
Check out Josh Starmer's StatQuest channel on YouTube for an explanation of maximum likelihood. Maximum likelihood begins to take us into Bayesian territory, where we calculate the probability of a model given the data (rather than the frequentist approach of finding the probability of the data given a model, typically the null model).
Maximum likelihood is similar to bootstrapping in that the analysis involves fitting many candidate models (often with multiple parameters) and seeing which one fits best, so it can use up a good deal of computing power.
What is negative log likelihood (NLL)?
Most optimization routines are set up to search for minimum values rather than maximum values, so we flip the sign: minimizing the negative log likelihood is the same as maximizing the likelihood.
Raw likelihood values also become vanishingly close to zero, too small for computers to handle reliably, so we work with the log of the likelihood, which turns a tiny product of probabilities into a manageable sum. Note that this underflow problem gets worse as your sample size increases, because you are multiplying more probabilities together (see the small demo below).
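A tiny demo of the underflow problem (the data and the standard-normal model here are made up purely for illustration):
set.seed(1)
z <- rnorm(2000) #2000 made-up observations
prod(dnorm(z, mean = 0, sd = 1)) #raw likelihood: underflows to 0
sum(dnorm(z, mean = 0, sd = 1, log = TRUE)) #log likelihood: a large negative but perfectly usable number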
library(stats4) #package containing mle() for maximum likelihood estimation
#simulate dummy data: true slope 5, true intercept 3, standard normal noise
set.seed(1001)
N <- 100
x <- runif(N)
y <- 5*x + 3 + rnorm(N)
#fit our data using OLS first for comparison
fit<- lm(y~x)
summary(fit)
##
## Call:
## lm(formula = y ~ x)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.91541 -0.75171 0.02844 0.75250 2.17175
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 3.0332 0.1808 16.78 <2e-16 ***
## x 4.8772 0.3117 15.65 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.9501 on 98 degrees of freedom
## Multiple R-squared: 0.7142, Adjusted R-squared: 0.7112
## F-statistic: 244.9 on 1 and 98 DF, p-value: < 2.2e-16
#plot it.
plot(x,y)
abline(fit,col="red")
#Now do it by minimizing the negative log likelihood (NLL)
#first write the negative log likelihood as a function (it uses x and y from the workspace)
NLL <- function(a, b, sigma) {
  y.pred <- a + b*x #predicted y for a given intercept a and slope b
  LL <- dnorm(y, mean = y.pred, sd = sigma, log = TRUE) #log likelihood of each observation
  -sum(LL) #return the negative of the total log likelihood
}
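As a quick sanity check (not part of the original code), evaluating NLL at the OLS estimates should give a value close to 135.77, the negative log likelihood that mle() reaches below:
NLL(a = coef(fit)[1], b = coef(fit)[2], sigma = summary(fit)$sigma)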
#now we run it using mle
?mle
#run the MLE, but note that this can take a lot of computing power in some cases.
#Nelder-Mead is one of several optimization methods you can use to converge on a solution.
fit2 <- mle(NLL, start=list(a=mean(y), b=0, sigma=sqrt(var(y))), method="Nelder-Mead")
summary(fit2)
## Maximum likelihood estimation
##
## Call:
## mle(minuslogl = NLL, start = list(a = mean(y), b = 0, sigma = sqrt(var(y))),
## method = "Nelder-Mead")
##
## Coefficients:
## Estimate Std. Error
## a 3.0332596 0.17897472
## b 4.8768778 0.30854689
## sigma 0.9405478 0.06650392
##
## -2 log L: 271.5342
#we can look at the NLL estimate
logLik(fit2)
## 'log Lik.' -135.7671 (df=3)
#Now we can compare the two models.
fit$coefficients
## (Intercept) x
## 3.033213 4.877196
fit2@coef
## a b sigma
## 3.0332596 4.8768778 0.9405478
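We can also put the two fits on the same scale with AIC (this works because logLik() is defined for both lm and mle objects); the values should be nearly identical, since both fits estimate an intercept, a slope, and sigma on the same data.
AIC(fit)  #OLS fit
AIC(fit2) #maximum likelihood fit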
Check out datacamp.com/community/tutorials/pca-analysis-r for a walkthrough of principal component analysis (PCA) in R.
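A minimal PCA sketch on the trees data loaded above (centering and scaling the variables here is an assumption worth revisiting for your own data):
pca <- prcomp(trees[, c("Girth", "Height", "Volume")], center = TRUE, scale. = TRUE)
summary(pca) #proportion of variance explained by each principal component
biplot(pca)  #quick look at loadings and scores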
For Q4 - logistic regression is needed (see the minimal sketch below).
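A minimal logistic regression sketch on made-up data (the variables dose and alive are invented here; your Q4 data will differ): glm() with family = binomial models a 0/1 response.
set.seed(42)
dose <- runif(50, 0, 10) #made-up predictor
alive <- rbinom(50, size = 1, prob = plogis(-2 + 0.5*dose)) #made-up 0/1 response
logit_fit <- glm(alive ~ dose, family = binomial)
summary(logit_fit) #coefficients are on the log-odds scale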
Bayesian Priors example
Nothing to do with the HW, but if you want to learn more about this topic, check out: