MATH 3823 Survival Rate Analysis of Adelaide Dataset Project


MATH 3823/5824M Page 1
MATH3823/5824M Practical Coursework
Year 2020/21
General assessment information:
• This practical coursework is included in the assessment for MATH3823/5824M. It
comprises 20% of your overall mark for the module.
• Questions on this sheet that are only relevant to MATH5824M students are indicated
by “[Level 5 only]”.
• The written report (which must be typed, not handwritten) must be clearly marked
with your name, student ID and module code, and should be no more than 8 pages
for MATH3823 students and no more than 10 pages for MATH5824M students.
• Your report should be handed in on, or before, Wednesday 31 March 2021 by 17:00
in Minerva (go to Learning resources → “Assessment Coursework” folder, and then
look for MATH3823 submission or MATH5824M submission). Late submissions will
carry a 5% penalty for each day up to 14 days.
• We have set up a drop-in session on Monday 22 March, 10:00 – 12:00 to offer help and
advice on this practical. Links for the drop-in session will be available in Minerva
(Learning resources → “Assessment Coursework” folder).
• The analyses in the practical should be performed using the statistics program R.
Background to the problem:
The dataset adelaide.txt has been taken from Dobson and Barnett (2008), pp. 145–146.
It contains information on the numbers of graduates surviving for 50 years, as a function
of the following explanatory variables,
• year — year of graduation, 1938–1947
• faculty — M = Medicine, A = Arts, S = Science, E = Engineering
• sex — M=Male, F = Female
with two response variables,
• survive — the number of graduates surviving for 50 years, and
• total — the total number of graduates with this combination of explanatory variables.
Tasks you need to perform:
1. Read the data into R using the commands
adelaide = read.table("adelaide.txt", header = TRUE)
attach(adelaide)
2. Fit a model for which the survival probability depends on year+faculty+sex.
3. To begin with, ignore questions of statistical significance, and explain what each
parameter represents (i.e., how it affects the survival probabilities). In particular,
what is the fitted probability of survival for each of the following combinations:
sex=M, year=1941, faculty=M
sex=F, year=1938, faculty=E
4. However, you might have noted that there are no women recorded doing Engineering!
So what does the second fitted probability mean?
5. Comment on whether or not the model fitted in point 2 is appropriate. Investigate
whether a more complicated model (e.g., one with interactions) or a simpler model would
be more appropriate to describe the data. You might use parameter estimates and their
p-values, deviances and residuals. If you do include any interactions, you should count
the degrees of freedom carefully.
6. Next analyze the male and female data separately. Compare the outputs to the model
involving the same explanatory variables but fitted with the original data. Do you get
any differences in interpretation? [This question is motivated by the fact that women did
not do the full range of subjects.]
7. [Level 5 only] Now compute a new variable, say y, which is the total survival proportion
in year t, including both male and female students from all faculties. For example, in
1938 there were 102 students of whom 68 survived for 50 years, so we have t1 = 1938 and
y1 = 68/102 ≈ 0.667. Fit a smoothing spline f(t) to these proportions using smoothing
parameter λ = 1, and fit a model which uses the smoothing spline as an explanatory
variable in place of the year; also include sex and faculty in your model. Compare this
model to the model from task 2 above, and comment on any similarities and/or differences.
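As a rough orientation for tasks 2, 3 and 5 to 7, the calls below sketch one way the analysis could be set up in R. This is a sketch under stated assumptions, not a model answer. It assumes a binomial GLM with the default logit link and year treated as a numeric covariate (the sheet specifies neither). It runs on a simulated stand-in data frame called toy, whose counts are invented and are not the Adelaide figures. All object names (toy, fit0, fit1, fit_m, fit_f, fit_s and so on) are hypothetical. Note also that smooth.spline() defines lambda on its own internal scale, which may differ from the λ used in the module notes.

```r
# Stand-in data with the same columns as adelaide.txt.
# ASSUMPTION: the counts are simulated and are NOT the real Adelaide figures.
set.seed(1)
toy <- expand.grid(year = 1938:1947,
                   faculty = c("M", "A", "S", "E"),
                   sex = c("M", "F"))
toy <- toy[!(toy$sex == "F" & toy$faculty == "E"), ]  # no women in Engineering
toy$total <- rpois(nrow(toy), 10) + 1
toy$survive <- rbinom(nrow(toy), size = toy$total, prob = 0.8)

# Task 2: binomial GLM (logit link assumed), year treated as numeric.
fit0 <- glm(cbind(survive, total - survive) ~ year + faculty + sex,
            family = binomial, data = toy)
print(summary(fit0))

# Task 3: fitted survival probabilities for the two requested combinations.
# The second row (sex = F, faculty = E) is a cell with no observations, so
# its prediction rests entirely on the additive structure of the model.
new <- data.frame(sex = factor(c("M", "F"), levels = levels(toy$sex)),
                  year = c(1941, 1938),
                  faculty = factor(c("M", "E"), levels = levels(toy$faculty)))
p <- predict(fit0, newdata = new, type = "response")
print(round(p, 3))

# Task 5: add a faculty:sex interaction and compare by change in deviance.
fit1 <- glm(cbind(survive, total - survive) ~ year + faculty * sex,
            family = binomial, data = toy)
print(anova(fit0, fit1, test = "Chisq"))
# The interaction nominally has 3 parameters, but the empty E/F cell makes
# one of them inestimable (reported as NA), so only 2 degrees of freedom
# are actually used.
df_used <- df.residual(fit0) - df.residual(fit1)

# Task 6: separate fits by sex; glm drops unused faculty levels itself.
fit_m <- glm(cbind(survive, total - survive) ~ year + faculty,
             family = binomial, data = toy[toy$sex == "M", ])
fit_f <- glm(cbind(survive, total - survive) ~ year + faculty,
             family = binomial, data = toy[toy$sex == "F", ])

# Task 7: overall yearly survival proportion, smoothed and used as a covariate.
yearly <- aggregate(cbind(survive, total) ~ year, data = toy, FUN = sum)
yearly$y <- yearly$survive / yearly$total
sp <- smooth.spline(yearly$year, yearly$y, lambda = 1)
toy$f_year <- predict(sp, x = toy$year)$y
fit_s <- glm(cbind(survive, total - survive) ~ f_year + faculty + sex,
             family = binomial, data = toy)
```

With adelaide.txt to hand, replacing toy by the data frame read in task 1 should be most of what is needed. R would then order the factor levels alphabetically by default, so the baseline faculty becomes A rather than M as in this sketch.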
Writing up:
You should take some care with the presentation of your results. This includes using a
clear structure and layout, careful explanation of your conclusions and how you arrived
at them, meaningful plots with appropriate labels, etc. You should start your report
with a short (1 paragraph) summary of your findings, written in a style suitable for a
non-statistician.
The aim of this practical is for you to explain the analyses that you have performed;
simply giving R commands does not do this. Any R output should be carefully explained, with
extracts of output inserted in the text. It should be clear what R commands have been
used to produce the output.
In addition to the report, attach an appendix that includes all of the R commands that
you have used. The appendix does not count towards the number of pages in your report.
