STAT 339 (Spring 2020): Homework

- HW7: Clustering
- The final grade will be calculated based on the highest of the following three calculations:
- 50% of the average of the top five out of seven homework grades + 20% of the project grade + 20% of the midterm grade + 10% of the participation grade (closest to the syllabus: two lowest homeworks dropped, and relative weight of homework preserved)
- 60% of the average of the top five out of seven homework grades + 15% of the project grade + 15% of the midterm grade + 10% of the participation grade (gives more weight to homework, which helps if your homework average is higher than the average of your midterm and project grades)
- 50% of the average of the top four out of six homework grades + 20% of the project grade + 20% of the midterm grade + 10% of the participation grade (equivalent to this homework not existing)
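
If it helps to see the three calculations side by side, here is a rough sketch in Python. The function names are made up for illustration, and it assumes that "this homework" means HW7 and that HW7 is the last entry in the list of homework grades; treat it as an illustration of the arithmetic, not the official grade computation.

```python
def mean_of_top(grades, k):
    """Average of the k highest grades."""
    return sum(sorted(grades, reverse=True)[:k]) / k


def final_grade(homeworks, project, midterm, participation):
    """Return the highest of the three weighted calculations.

    Assumption for this sketch: `homeworks` holds all seven homework
    grades on a 0-100 scale, with HW7 as the last entry.
    """
    top5_of_7 = mean_of_top(homeworks, 5)
    top4_of_6 = mean_of_top(homeworks[:-1], 4)  # as if HW7 did not exist
    schemes = [
        0.50 * top5_of_7 + 0.20 * project + 0.20 * midterm + 0.10 * participation,
        0.60 * top5_of_7 + 0.15 * project + 0.15 * midterm + 0.10 * participation,
        0.50 * top4_of_6 + 0.20 * project + 0.20 * midterm + 0.10 * participation,
    ]
    return max(schemes)
```

For example, with homework grades of 90, 90, 90, 90, 50, 50 and a 0 on HW7, plus 90 on the project and midterm and 100 for participation, the third calculation is the highest because dropping HW7 entirely lets the top four of six average to 90.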

- You don’t need to do a formal proposal writeup, but make sure you’ve had a conversation with me about your proposed topic

- Join the class Slack workspace here
- I encourage you to fill out your profile with your preferred name and pronouns, as well as a photo of your face (this is optional but will help me match names and faces more quickly)

- Fill out the Background Survey

- Warmup Assignment for coding, calculus, and linear algebra

- The pdf of the assignment is here
- You will be implementing KNN as well as a cross-validation algorithm to select the value of K. This is a coding-heavy assignment, and pretty much a direct implementation of what we went over in class on Wednesday 10/6 and Friday 10/8.
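
To give a sense of the overall shape of the task, here is a minimal sketch of KNN classification with cross-validation over K, using only the Python standard library. The function names, the Euclidean distance, the majority-vote tie-breaking, and the fold construction are all assumptions made for this illustration; the assignment PDF is the authority on the required interface and details.

```python
import random
from collections import Counter
from math import dist


def knn_predict(train_X, train_y, query, k):
    """Majority vote among the k training points closest to `query`."""
    order = sorted(range(len(train_X)), key=lambda i: dist(train_X[i], query))
    # On a tied vote, Counter returns the label encountered first,
    # i.e. the one belonging to the nearer neighbor.
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]


def cv_select_k(X, y, candidate_ks, n_folds=5, seed=0):
    """Return (best_k, mean_errors) from n_folds-fold cross-validation."""
    indices = list(range(len(X)))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into n_folds roughly equal folds.
    folds = [indices[i::n_folds] for i in range(n_folds)]
    mean_errors = []
    for k in candidate_ks:
        fold_errors = []
        for val in folds:
            held_out = set(val)
            train = [i for i in indices if i not in held_out]
            train_X = [X[i] for i in train]
            train_y = [y[i] for i in train]
            wrong = sum(knn_predict(train_X, train_y, X[i], k) != y[i] for i in val)
            fold_errors.append(wrong / len(val))
        mean_errors.append(sum(fold_errors) / n_folds)
    best = min(range(len(candidate_ks)), key=lambda j: mean_errors[j])
    return candidate_ks[best], mean_errors
```

The same folds are reused for every candidate K, so the error estimates are compared on identical splits. The sketch assumes there are at least as many data points as folds.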

- The pdf of the assignment is here
- You will be implementing a linear regression solver that takes arbitrarily many input features and finds either the OLS weights or regularized weights. For the special case of polynomial regression, you will implement cross-validation to select the polynomial order K. The last two problems are linear algebra and calculus exercises with no coding.
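
As a rough picture of the coding part, here is a minimal sketch using NumPy. It assumes the regularized weights are ridge (L2-penalized) weights computed from the normal equations, that the intercept column is penalized along with the other features, and that cross-validation scores each polynomial order by mean squared error; the names `fit_linear`, `poly_features`, and `cv_select_order` are made up for this illustration. Check the assignment PDF for the exact formulation expected.

```python
import numpy as np


def fit_linear(X, y, lam=0.0):
    """Weights minimizing ||Xw - y||^2 + lam * ||w||^2 (lam = 0 gives OLS)."""
    d = X.shape[1]
    # Solve (X^T X + lam I) w = X^T y rather than forming an explicit inverse.
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)


def poly_features(x, order):
    """Design matrix with columns 1, x, x^2, ..., x^order for 1-D input x."""
    return np.vander(x, order + 1, increasing=True)


def cv_select_order(x, y, max_order, n_folds=5, seed=0):
    """Return (best_order, mean_mse) from n_folds-fold cross-validation."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(x)), n_folds)
    mean_mse = []
    for order in range(max_order + 1):
        fold_mse = []
        for i, val in enumerate(folds):
            train = np.concatenate([f for j, f in enumerate(folds) if j != i])
            w = fit_linear(poly_features(x[train], order), y[train])
            resid = poly_features(x[val], order) @ w - y[val]
            fold_mse.append(np.mean(resid ** 2))
        mean_mse.append(np.mean(fold_mse))
    return int(np.argmin(mean_mse)), mean_mse
```

Because `fit_linear` takes any design matrix, it handles arbitrarily many input features; polynomial regression is just the special case where the columns are powers of a single input. With `lam=0` the solve can fail if the columns of `X` are linearly dependent.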