MHA610 Introduction to Biostatistics Assignments and DQs

Course Guide

This course explores the application of fundamental statistical methods to the healthcare environment. Course content includes both descriptive and inferential methods including: data analysis, statistical estimation, regression analysis, analysis of variance, hypothesis testing, and analysis of longitudinal data.

Note: This course uses software that is not Mac OS compatible. Access to a Windows PC or a Windows-based platform is required.

Table of Contents

Course at a Glance

Course Description

This course explores the application of fundamental statistical methods to the healthcare environment. Course content includes both descriptive and inferential methods including: data analysis, statistical estimation, regression analysis, analysis of variance, hypothesis testing, and analysis of longitudinal data.

Course Design

The purpose of this course is to provide an introduction to statistics relating to health care research. Students will be introduced to various health data sources, and will proceed to analyze, assess, and evaluate these data using basic statistical concepts and methodology. Students will learn how to use statistical software in order to obtain and interpret descriptive and inferential statistical results.

Students will also learn basic principles of probability theory and how to draw conclusions from available data utilizing statistical tools, assessing whether observed statistics could occur by chance alone. Notions of probability are of fundamental importance, and students will utilize both frequentist and Bayesian probability concepts in their evaluations.

Statistics not only plays a crucial role in undertaking and interpreting research in the health sciences, but also arises in quotidian settings. For example, what does it mean to have a 20% chance of precipitation today? At the conclusion of this course, the students’ increased understanding of statistics and probability will empower them in their studies, their work, and their daily lives.

Prerequisites

There are no prerequisites for MHA610.

Course Learning Outcomes

Upon successful completion of this course, students will be able to

  1. Apply basic statistical principles for describing, analyzing, and interpreting health
  2. Apply statistical methods of estimation and hypothesis testing in biostatistics and
  3. Analyze relationships between quantitative variables using correlation and linear
  4. Evaluate health care delivery and services using epidemiological data and appropriate statistical
  5. Communicate the findings and implications from statistical analyses to health care

Course Materials

Required Text

Triola, M. M., & Triola, M. F. (2006). Biostatistics for the biological and health sciences. Boston, MA: Pearson Education, Inc.

Triola, M. M., & Triola, M. F. (2006). [Student companion website].

Boston, MA: Pearson Education, Inc. Retrieved from:

  • The software and data sets for the course may be accessed through

Note: This course uses software that is compatible with both Mac and Windows-based platforms. In addition, Microsoft Excel and Word will be used extensively throughout the course.

Required Resources

Supplemental Materials

Koziol, J. (2014). [PDF]. College of Health, Ashford University: San Diego, CA.

Websites

Centers for Disease Control. (2014). Retrieved from

Centers for Disease Control and Prevention. (2014). Retrieved from

(2012). Retrieved from by-age-and-gender

Recommended Resources

Multimedia

Koziol, J. (Producer). (2014). [Video file]. Retrieved from

Koziol, J. (Producer). (2014). [Video file]. Retrieved from

Koziol, J. (Producer). (2014). [Video file]. Retrieved from

Koziol, J. (Producer). (2014). [Video file]. Retrieved from

Koziol, J. (Producer). (2014). [Video file]. Retrieved from

Koziol, J. (Producer). (2014). [Video file]. Retrieved from

Koziol, J. (Producer). (2014). [Video file]. Retrieved from

Koziol, J. (Producer). (2014). [Video file]. Retrieved from

Koziol, J. (Producer). (2014). [Video file]. Retrieved from

Koziol, J. (Producer). (2014). [Video file]. Retrieved from

Koziol, J. (Producer). (2014). [Video file]. Retrieved from

+births%29/0_zipwy4i7

Koziol, J. (Producer). (2014). [Video file]. Retrieved from

+births%29/0_ignho54w

Koziol, J. (Producer). (2014). [Video file]. Retrieved from

+brainsize%29/0_qhhcxu1d

Koziol, J. (Producer). (2014). [Video file]. Retrieved from

+crossover%29/0_srbv0wj1

Koziol, J. (Producer). (2014). [Video file]. Retrieved from

+HCCtest%29/0_4qwt7r8z

Koziol, J. (Producer). (2014). [Video file]. Retrieved from mortality%29/0_snyfubtn

Course Grading

Multiple measures of assessment are used in the course, allowing students opportunities to demonstrate their learning in more than one way and giving consideration to individual learning styles. Course components that will be assessed include:

Discussions

Each week students will participate in online discussions with classmates, which are related to the week’s readings. These discussions replace the interactive dialogue that occurs in the traditional classroom setting. Each week, students’ initial discussion posts are due by 11:59 p.m. (in the time zone in which each student resides) on Day 3 (Thursday). Students will have until 11:59 p.m. on Day 7 (the following Monday) to make the required minimum number of response posts to classmates. Discussions represent 26% of the overall course grade.

Quizzes

In Weeks Three and Six, students will demonstrate and reinforce their understanding of the week’s content by taking open-book quizzes. There is no time limit to complete a quiz, and each quiz can be taken two times. Each quiz must be completed in one sitting, by Day 6 of the week in which it is due. The questions are multiple choice and true/false. Each quiz is worth 8 percent; together, the quizzes represent 16% of the overall course grade.

Assignments

There are written assignments due in Weeks One through Five of this course. These assignments must reflect college-level writing. Assignments represent 40% of the overall course grade.

Final Project

The final assignment for this course is a Final Project. The purpose of the Final Project is for you to consolidate the learning achieved in the course by taking a new approach to the datasets that you have looked at throughout the course. The Final Project represents 18% of the overall course grade.

Grading Percent Breakdown

 

Activity Grading Percent
Discussions 26
Quizzes 16
Assignments 40
Final Project 18
Total 100

 

 

Week One

 

 

Course Content

To be completed during the first week of class

 

Overview

 

Activity Due Date Format Grading Percent
Post Your Introduction Day 1 Discussion 2
Hospital Data Day 3 (1st post) Discussion 4
U.S. Mortality Rates Histogram Day 7 Assignment 8

 

Weekly Learning Outcomes

This week students will

  1. Calculate summary statistics from
  2. Create appropriate graphs and charts for nominal and ordinal

 

 

Introduction

During Week One, you will be introduced to quantitative (continuous) and qualitative (discrete or categorical) data. You will learn appropriate graphical techniques for displaying and summarizing both types of data. You will learn about descriptive statistics for location (e.g., mean, median) and scale (e.g., standard deviation, range), for reporting purposes. You will begin to learn fundamentals of probability theory.

Required Resources

Text

Triola, M.M., & Triola, M.F. (2006). Biostatistics for the Biological and Health Sciences. Boston, MA: Pearson Education, Inc.

  • Chapter 1: Introduction
    • After reading the chapter, review your grasp of the material in Chapter 1 by solving the odd-numbered questions in the Review Exercises and the Cumulative Review Exercises at the end of Chapter 1. Solutions to these problems are given at the end of the
  • Chapter 2: Describing, Exploring, and Comparing Data
    • After reading the chapter, review your grasp of the material in Chapter 2 by solving the odd-numbered questions in the Review Exercises and the Cumulative Review Exercises at the end of Chapter 2. Solutions to these problems are given at the end of the

Triola, M. M., & Triola, M. F. (2006). [Student companion website].

Boston, MA: Pearson Education, Inc. Retrieved from:

 

Supplemental Materials

  1. Koziol, J. (2014). MHA610_Week 1_Discussion_Hospital data [Excel file].
  2. Koziol, J. (2014). MHA610_Week 1_Discussion_Hospital data [Statdisk file].

 

 

Website

World Life Expectancy. (2012). Retrieved from

  • This website houses the data that will be used for the U.S. Mortality Rates histogram assignment for this week.

 

 

Recommended Resources

Multimedia

Koziol, J. (Producer). (2014). MHA610 Week 1 Assignment (Part 1) [Video file]. Retrieved from

Koziol, J. (Producer). (2014). [Video file]. Retrieved from

Koziol, J. (Producer). (2014). [Video file]. Retrieved from

 

 

Koziol, J. (Producer). (2014). [Video file]. Retrieved from

  • These screencasts help explain the Week One

 

 

Discussions

Participate in the following discussions:

 

 

  1. Post Your Introduction. 1st Post Due by Day 1. Post a brief introduction on the first day of class. Share any past experiences (academic or professional) that you have had with epidemiology, biostatistics, or health data analysis. What topic are you most interested in as it relates to epidemiology and biostatistics? Briefly explain why you are interested in this topic. Additionally, describe what you are looking forward to learning in this course.

 

Guided Response: Review several of your classmates’ posts. Welcome at least three of your peers to this course. What similarities in experience did you note between you and your classmate? Did your colleague’s description of his or her topic of interest differ from your own? If so, did the description spark your interest in that topic as well? If so, how? Introduction should be at least 250 words in APA format.

 

 

  2. Hospital Data. 1st Post Due by Day 3. The MHA610_Week 1_Discussion_Hospital Data Excel file (available in the classroom) and MHA610_Week 1_Discussion_Hospital Data Statdisk file (available in the classroom) contain basic demographic information on 250 patients admitted to a community hospital over a two-week period. The first row of the worksheet indicates the variable names:

 

 

Gender Male (M) or female (F)
Ethnicity
SevIllnessCode These are All Patient Refined Diagnosis Related Groups (APR-DRG) categories of severity of illness, ranging from:
SevIllnessDescr Mild (Category 1) to extreme (Category 4)
Age In years

 

 

Wt Patient weight in kilograms
Ht Patient height in centimeters
BMI Patient body mass index (BMI), where $\mathrm{BMI} = \mathrm{wt}/\mathrm{ht}^2$, with weight in kilograms and height in meters
APR-DRG Denotes All Patient Refined Diagnosis Related Group, a widely used inpatient classification system.
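
As a check on the BMI column, the definition above can be sketched in a few lines of Python. This is only an illustration: the function name and the example patient are hypothetical, and the one assumption carried over from the variable list is that Ht is recorded in centimeters, so it must be converted to meters before squaring.

```python
def bmi_from_kg_cm(weight_kg: float, height_cm: float) -> float:
    """Return body mass index (kg/m^2) from weight in kg and height in cm."""
    if weight_kg <= 0 or height_cm <= 0:
        raise ValueError("weight and height must be positive")
    height_m = height_cm / 100.0  # the Ht column is in centimeters
    return weight_kg / height_m ** 2


# Hypothetical patient, not taken from the course data set.
print(round(bmi_from_kg_cm(70.0, 175.0), 1))  # 22.9
```

Comparing a few recomputed values against the BMI column in the worksheet is a quick way to catch unit mistakes before summarizing the data.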

 

 

 

For this discussion, describe and summarize the demographic information on these patients. You may use tables or graphs (or both) for this purpose. Your goal is to convey to the reader an accurate snapshot of these patients. Support your response with appropriate scholarly sources. Your initial post must be 250-500 words.

 

Guided Response: Respond to at least two of your peers by Day 7, 11:59PM. Review your colleague’s summary of the data. Did the method of presentation provide you with any new insights? If so, what are they? If not, what suggestions might you make to your colleague that could improve his or her representation of the data? All initial and peer postings should be at least 250-500 words in APA format supported by scholarly sources.

Assignment

U.S. Mortality Rates. Due by Day 7. Examining the burden of disease in the United States provides important information on which to base decisions about public health priorities.

To do this, we will utilize mortality data for the United States. In the first part of this assignment, you will download and examine mortality data for your home state.

Go to

  • Choose your home state under the Choose State option (panel on left hand side)
  • Select BOTH under the Choose gender option in the middle
  • Scroll down to the bottom of the page, and read the fine print to learn for which year the mortality data have been
  • Copy and paste the relevant mortality data into
    • Drag your mouse over all of the Cause of Death rows (50 rows), right-click, and select Copy.
    • Open Excel and paste your selection into Excel. You should have a spreadsheet with 50 rows and 19 columns (Columns A-S).

For the first part of the assignment, you will prepare a histogram of the leading causes of death (regardless of age) in your state. Follow the steps below in order to prepare your histogram:

  • Sort the Data
    • The numbers of deaths, all ages, are given in Column
    • Select all the
    • Then, select Data>Sort>Sort by Column C, Values, largest to smallest. (Make sure that my data has headers is not
    • You now have the leading causes of death in your state in Column A (cause) and Column C (frequency).
  • If you already know how to draw a histogram in Excel, proceed to do so with Columns A and C, making sure to truncate the data to the 30 leading
  • If you do not know how to draw a histogram in Excel, here’s one method:
    • Choose the Chart Wizard, chart type column, chart sub-type clustered column (Step 1).
    • Click Next for Step
    • At Step 2, Click the Series button, which will open a new
    • Click Add under Series,
      • Enter Causes of Death in the Name box;
      • Clear the Values box, then
      • Drag your mouse over the 30 largest frequencies for the Values; and,
      • Drag your mouse over the first 30 causes of death (Column A) for the Category (X) axis labels box.
    • Click Next, and you’ll be brought to the Chart Options
      • Add a suitable title (e.g., Leading Causes of Death, 2010, your state)
      • Label the Y axis Frequency.
    • Click Next, and place the histogram in a new

You now have a histogram with the leading causes of death for your state. This presents one picture of the burden of disease in your state, but it isn’t the only picture. We shall now look at a different metric: years of life lost due to each cause.

To do this, we will assume that the average life span is 80 years, and we will calculate how many years of life are lost for each cause of death, according to the age at death.

Please note that the ages are in categories (0 – 14, 15 – 24, 25 – 34, …, 65 – 74, and 75+). For this exercise, we will assume that the average age at death is the midpoint of each of these intervals (e.g., 7.5, 19.5, 29.5, …, 69.5, and 80 for the last age category, respectively). For example, an individual death in the 15-24 age group (midpoint 19.5) incurs 60.5 years of life lost (80 - 19.5 = 60.5).

To make this histogram, we will compute a new column of values, years of life lost for each cause of death. (This entails writing a simple formula in Excel for the calculation corresponding to the first row of data, then dragging the formula down that column. If you have never done this calculation before in Excel, consult the screencast for detailed instructions.)

  • Go back to the original Excel spreadsheet that contained your
  • Using the formula above, create a column that calculates the years of life lost.
  • Now, sort the data by the years of life lost column, in descending order, before drawing a histogram of the results.
  • Finally, create a histogram of the 30 leading causes of death, in decreasing order of years of life
    • Do not forget to label the y-axis and provide a title for the
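
The years-of-life-lost calculation described above can also be sketched outside Excel. The Python below is a minimal illustration, not the screencast's method. The intermediate age brackets (35-44, 45-54, 55-64) and their midpoints are assumptions filled in by following the stated pattern, and the cause names and death counts are made up for the example.

```python
LIFE_SPAN = 80.0  # assumed average life span, as stated in the assignment

# Assumed midpoint age at death for each bracket. The 35-44, 45-54, and
# 55-64 brackets are inferred from the pattern and are an assumption.
AGE_MIDPOINTS = {
    "0-14": 7.5, "15-24": 19.5, "25-34": 29.5, "35-44": 39.5,
    "45-54": 49.5, "55-64": 59.5, "65-74": 69.5, "75+": 80.0,
}


def years_of_life_lost(deaths_by_age):
    """Sum deaths * (life span - midpoint age) over the age brackets."""
    return sum(
        count * (LIFE_SPAN - AGE_MIDPOINTS[bracket])
        for bracket, count in deaths_by_age.items()
    )


def rank_causes(table, top=30):
    """Return (cause, years lost) pairs, largest first, truncated to `top`."""
    totals = {cause: years_of_life_lost(d) for cause, d in table.items()}
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)[:top]


# Hypothetical counts, for illustration only.
example = {
    "Cause A": {"15-24": 10, "75+": 500},
    "Cause B": {"65-74": 100},
}
print(rank_causes(example))  # [('Cause B', 1050.0), ('Cause A', 605.0)]
```

In this made-up example Cause A has far more deaths but ranks lower, because deaths in the 75+ bracket contribute zero years of life lost under the 80-year assumption.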

You now have two histograms representing the burden of disease in your state. The first histogram orders the causes of death in terms of overall mortality, and the second orders causes of death in terms of years of life lost.

Create a report of your findings that contains both of the histograms. The report should be at least 250-500 words supported by scholarly resources and in APA format. Assume that your task is to assess and prioritize public health needs in your state, and you need to inform and persuade policy makers for improving the well-being of your state’s constituents. Describe which findings are most relevant for this.

You should also explain any methodological or data limitations that exist in either histogram. In particular, describe how your conclusions would be altered if you were to refine your findings by reanalyzing mortality rates based on gender and race in addition to age. The assignment should be at least 500 words in APA format supported by scholarly sources.

Week Two

Course Content

To be completed during the second week of class

Overview

 

Activity Due Date Format Grading Percent
Game of Chance Day 3 (1st post) Discussion 4
Sex Ratios Day 7 Assignment 8

 

Weekly Learning Outcomes

This week students will

  1. Calculate probabilities of events using fundamental notions and rules of probability
  2. Apply the binomial distribution to discrete data
  3. Apply the Poisson distribution to discrete data

 

 

Introduction

In Week One, you were introduced to some fundamentals of probability. You will continue your exploration of probability theory in Week Two, including Bayes' theorem for determination of posterior probabilities on the basis of prior and marginal probabilities. You will examine properties and parameterization of the two basic discrete probability distributions, the binomial and the Poisson. You will begin to use the binomial and Poisson distributions for inferential procedures.

 

 

Required Resources

Text

Triola, M.M., & Triola, M.F. (2006). Biostatistics for the Biological and Health Sciences. Boston, MA: Pearson Education, Inc.

 

 

  • Chapter 3: Probability
    • After reading the chapter, review your grasp of the material in Chapter 3 by solving the odd-numbered questions in the Review Exercises and the Cumulative Review Exercises at the end of Chapter 3. Solutions to these problems are given at the end of the
  • Chapter 4: Discrete Probability Distributions
    • After reading the chapter, review your grasp of the material in Chapter 4 by solving the odd-numbered questions in the Review Exercises and the Cumulative Review Exercises at the end of Chapter 4. Solutions to these problems are given at the end of the

Triola, M. M., & Triola, M. F. (2006). [Student companion website].

Boston, MA: Pearson Education, Inc. Retrieved from:

 

Supplemental Material

Koziol, J. (2014). [PDF]. College of Health, Ashford University: San Diego, CA.

  • This document provides an example that will be used in the Game of Chance discussion for this

Website

Centers for Disease Control. (2014). Retrieved from

  • This website houses the data that will be used for the Sex Ratios assignment for this

 

 

Recommended Resources

Multimedia

Koziol, J. (Producer). (2014). MHA610 Week 2 Assignment (Part 1) [Video file]. Retrieved from

Koziol, J. (Producer). (2014). MHA610 Week 2 Assignment (Part 2) [Video file]. Retrieved from

  • These screencasts help explain the Week Two

 

Discussion

Participate in the following discussion:

 

 

Game of Chance. 1st Post Due by Day 3. For this discussion, select a game of chance, explain it briefly if it is likely to be unfamiliar to your classmates, then calculate probabilities of various outcomes like winning or losing in this game. For example, you might choose your state lottery, a scratch card game, a card game like poker, or a dice game like Craps or Yahtzee, as your game of chance.
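
As one small illustration of the kind of calculation involved, the sketch below enumerates all 36 equally likely outcomes of two fair dice and counts those that settle a Craps pass-line bet on the first roll (a total of 7 or 11 wins; a total of 2, 3, or 12 loses). The function name is hypothetical, and the example covers only the first roll, not the full game.

```python
from fractions import Fraction
from itertools import product


def two_dice_sum_probability(targets):
    """Probability that two fair six-sided dice sum to a value in `targets`."""
    outcomes = list(product(range(1, 7), repeat=2))  # 36 equally likely rolls
    hits = sum(1 for a, b in outcomes if a + b in targets)
    return Fraction(hits, len(outcomes))


win_first_roll = two_dice_sum_probability({7, 11})
lose_first_roll = two_dice_sum_probability({2, 3, 12})
print(win_first_roll, lose_first_roll)  # 2/9 1/9
```

Using exact fractions rather than decimals makes it easier to check the counting by hand.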

 

As illustration, read a lottery analysis in

 

 

Guided Response: Respond to at least two of your classmates who chose a different game of chance than you by Day 7 at 11:59PM. Did your colleague provide enough explanation of the game to allow you to understand the analysis? Was the analysis provided by your classmate correct? If so, what optimal strategy for playing that particular game was described? If not, what suggestions would you make to your colleague to amend any issues?

 

 

Assignment

Sex Ratios. Due by Day 7. The normal male to female live birth sex ratio ranges from about 1.03 to 1.07. The sex ratio is defined as the ratio of male births to female births. You might expect boy and girl births to be equally likely, but in fact, baby boys are somewhat more common than baby girls.

 

Higher sex ratios are thought to reflect prenatal sex selection, especially among cultures where sons are prized more heavily than daughters. We will review sex ratios in the United States as a whole, as well as in individual states, to determine whether sex ratios vary significantly among various ethnic and racial groups.

 

To do this analysis, we will utilize natality data for the United States, provided by the Centers for Disease Control.

 

In the first part of the assignment, we will look at sex ratios for your home state, over the time period 1995 to 2002, by race. To obtain this information:

  • Go to the
  • Click on Births under the WONDER Online Databases to bring you to the Natality Information screen
  • On this screen, click Natality for 1995-2002.
  • On the following screen, click I Agree in order to agree to abide by the government rules for data use (primarily, concerning confidentiality).
    • This will bring us to the Natality, 1995-2002 Request
    • In the block Organize table layout, group results by year, followed by race, and then gender.
    • In the block Select maternal residence, choose your state.

 

 

  • You can leave blocks 3 through 6 at their default values (i.e., All).
  • Click Send.
  • A new screen will open, with data (births) tabulated by Year, Race, and Gender.
  • Click Export, click Save, and a text file named Natality, _1995-2002 .txt or something similar will be downloaded onto your computer.

 

We can now process the downloaded data in Excel.

  • Load the text file into Excel. This will probably open the Text Import
    • Accept the defaults, and you should have a spreadsheet with the natality data
  • We will need to edit the data slightly before calculating sex ratios and drawing graphs of the sex ratios. To do this:
    • Scroll down to the end of the spreadsheet, and delete the rows with the extraneous information about the dataset. (This starts on or about row )
    • You may also delete the columns with headings Year Code, Race Code, and Gender Code since we will not be using them, however this is not
    • Next, sort the data, in order to delete some extraneous rows. Select the remaining columns, choose Data > Sort, then sort by Race in ascending
    • Scroll down to the end of the worksheet, and delete all rows with blanks for Race.
    • We will now add a new column to the worksheet for
      • Go to the first blank column in the worksheet: this column should be immediately to the right of a column labeled Births.
      • In the first row of this column, type Ratios.
    • Now, we will calculate different proportions of births, using formulas in Excel. It is important to use Excel to do the calculation, because it will allow you to quickly complete all of the
      • First, calculate the ratio of female births to total births for the American Indian race (female births/total births).
      • Next, calculate the ratio of male births to total births for the American Indian race (male births/total births).
      • Finally, calculate the ratio of male births to female births (male births/female births).
      • If you don’t know how to do this calculation easily in Excel, please check out the screencast, which reviews

 

 

  • Once you have completed the first three cells in the ratio column, you can select them and copy
  • Select the remaining cells in the column and
  • You have now completed calculating all of the ratios, however, you may wish to double check to ensure that the formulas have adjusted for each
  • Once you have the Ratio column filled out, select that column, then Copy.
  • With the column still selected, click Paste Special and then Values. This will convert the formulas you entered to numbers, so they do not change when you do the next
  • Select all the columns, then Data>Sort>Notes in ascending order. We will be graphing the sex ratios for the years 1995 to 2002, by
    • Feel free to drop the two to four races that have the fewest numbers of births in your
  • Draw a line chart with markers with the year along the X-axis (we are looking at 1995 through 2002) and sex ratio along the Y-axis (with sex ratios typically between 1 and 1.1, though this may vary in your state).
    • If your version of Excel has the Chart Wizard:
      • In step two of the Chart Wizard, choose the Series tab; in this window you’ll be adding all the information for the various
      • Under category (X) axis labels, drag your mouse over the cells 1995, 1996…
      • For values, drag your mouse over the eight successive sex ratios (1995 through 2002) for the particular racial group you chose; in the name box, enter the racial group; do this for each of the groups you want to display.
      • Select Next when you have finished with all the racial groups, and you will be brought to the Chart Options
      • Here, you can customize your graph, with a title and X and Y axis labels (i.e., your state births, year, and sex ratio respectively).
      • Continue with Next, and finish the
    • If your version of Excel does not have the Chart Wizard, you will need to do some reformatting of your data before you can create a line chart. It is good practice to create a new worksheet in order to preserve your original
      • Your data should mimic the way you want your line chart to look. In this case, you want to create horizontal labels for each of the years (1995 through 2002) and vertical labels for each of the races. It should follow this format:

 

 

 

 

Year 1 Year 2 Year 3
Race A Ratio for Race A in Year 1 Ratio for Race A in Year 2 Ratio for Race A in Year 3
Race B Ratio for Race B in Year 1 Ratio for Race B in Year 2 Ratio for Race B in Year 3

 

  • After you have reformatted your data, select all of the data, then select Insert, then Line, then Line with Markers.
  • You should now have a line chart with each race having its own line, the ratios on the Y-axis, and the years on the X-axis.
  • You may wish to modify the Y-axis by right-clicking on it. Your upper and lower values on the axis should be just above and below your highest and lowest ratio
  • In a Word document, paste the graph you created (or, alternatively, submit your Excel workbook along with the Word document) and describe your findings, making sure to:
    • Summarize the sex ratios for each of the racial
    • Explain whether the sex ratios are relatively constant through the 1995 to 2002 period for all of the racial groups, or whether there are trends.
    • Explain any racial groups that have noticeably higher or lower sex ratios than other
    • Explain the conclusions you are drawing from your
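
The three ratios computed in the Ratios column for the first part can be sketched as follows. This is an illustrative Python version of the spreadsheet formulas, not a replacement for the Excel steps. The function name, dictionary keys, and birth counts are hypothetical.

```python
def birth_ratios(male_births: int, female_births: int) -> dict:
    """Return the female share, male share, and male-to-female sex ratio."""
    if male_births <= 0 or female_births <= 0:
        raise ValueError("birth counts must be positive")
    total = male_births + female_births
    return {
        "female_share": female_births / total,     # female births / total births
        "male_share": male_births / total,         # male births / total births
        "sex_ratio": male_births / female_births,  # male births / female births
    }


# Hypothetical counts for one race in one year, for illustration only.
ratios = birth_ratios(male_births=1050, female_births=1000)
print(round(ratios["sex_ratio"], 3))   # 1.05
print(round(ratios["male_share"], 3))  # 0.512
```

Note that the sex ratio divides by female births, while the two shares divide by total births, so the shares always sum to 1 and the sex ratio does not.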

 

In the second part of this assignment, you will undertake some formal statistical procedures with the natality data. We will repeat the previous steps, with some slight modifications.

  • Return to the
  • Click on Births under the WONDER Online Databases to get to the Natality Information
  • Select Natality for 2007 – 2012.
  • On the next screen, click I Agree in order to agree to abide by the government rules for data use (primarily, concerning confidentiality).
  • This will bring us to the Natality, 2007-2012 Request
    • In block Organize table layout, group results by race and then gender (not year).
    • In block Select maternal residence, choose your state.
    • You can leave block 3 at its default values (typically, All).

 

 

  • In block Select birth characteristics; select All Years under Year, and 1st child born alive to mother under Live Birth Order.
  • Blocks 5 and 6 can be left at their default
  • Click Send. A new screen will open, with data (births) tabulated by race and
  • Click Export, click Save, and a text file named Natality 2007-2012.txt (or something similar) will be downloaded onto your computer.

 

We have only four racial groups in this dataset: American Indians or Alaska Natives, Asian or Pacific Islanders, Black or African Americans, and Whites.

 

Using the normal approximation to the binomial distribution (without continuity correction), calculate z statistics for assessing whether the proportion of boys is .51 in each of the 4 racial groups, where n is the total number of births in a particular cohort, p = .51, q = 1 – p = .49, and x is the number of boy births: $z = (x - np)/\sqrt{npq}$.

 

Under the null hypothesis that the proportion of boys should be 0.51, and under the normal approximation to the binomial distribution, the z statistics should have (approximately) standard normal distributions (mean 0, standard deviation 1). Do any of the z statistics suggest that the proportion of boy births in any particular racial group differs significantly from .51?
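
A minimal Python sketch of this z statistic is given below, assuming the formula z = (x - np)/sqrt(npq) with no continuity correction. The function names and the birth counts are hypothetical. The two-sided p-value is an addition for convenience, computed from the standard normal distribution with the standard library's `math.erfc`.

```python
import math


def z_statistic(boys: int, total_births: int, p0: float = 0.51) -> float:
    """Normal-approximation z for H0: proportion of boys equals p0."""
    expected = total_births * p0                             # np
    std_dev = math.sqrt(total_births * p0 * (1.0 - p0))      # sqrt(npq)
    return (boys - expected) / std_dev


def two_sided_p_value(z: float) -> float:
    """P(|Z| >= |z|) for a standard normal Z."""
    return math.erfc(abs(z) / math.sqrt(2.0))


# Hypothetical cohort, not real natality data.
z = z_statistic(boys=5200, total_births=10000)
print(round(z, 2), round(two_sided_p_value(z), 3))  # 2.0 0.045
```

A common convention is to call a result significant at the 5% level when |z| exceeds about 1.96. With four racial groups tested at once, some caution about multiple comparisons is reasonable.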

 

Comment on your findings in your written report. Describe whether you think your results would change if we hadn’t limited consideration to the first-born. This assignment should be at least 250-500 words in APA format supported by scholarly sources.

 

 

Week Three

 

 

Course Content

To be completed during the third week of class

 

Overview

 

Activity Due Date Format Grading Percent
Confidence Intervals Day 3 (1st post) Discussion 4
Week Three Quiz Day 6 Quiz 8
Immune Responses Day 7 Assignment 8

 

Weekly Learning Outcomes

This week students will

  1. Apply the normal distribution to continuous data
  2. Explain the use of statistical estimators in practice.
  3. Construct confidence intervals for sample

 

 

Introduction

In Week Two, you examined two fundamental discrete probability distributions, the binomial and the Poisson. In this week, you will be introduced to the fundamental continuous probability distribution, the normal or Gaussian. You will learn how the normal distribution is parameterized by the mean and the variance, and how to undertake probability calculations based on the normal distribution. You will be introduced to the central limit theorem, and how it relates to the normal distribution.

 

You will also learn about sampling distributions (especially, the t distribution), and properties of estimators. Estimation is a key concept in statistics, and you will learn how to construct confidence intervals for sample estimators. You will learn about planning of experiments, for which sample size and power are fundamental notions.

 

 

Required Resources

Text

Triola, M.M., & Triola, M.F. (2006). Biostatistics for the Biological and Health Sciences. Boston, MA: Pearson Education, Inc.

  • Chapter 5: Normal Probability
    • After reading the chapter, review your grasp of the material in Chapter 5 by solving the odd-numbered questions in the Review Exercises and the Cumulative Review Exercises at the end of Chapter 5. Solutions to these problems are given at the end of the
  • Chapter 6: Estimates and Sample Sizes with One
    • Review your grasp of the material in Chapter 6 by solving the odd-numbered questions in the Review Exercises and the Cumulative Review Exercises at the end of Chapter 6. Solutions to these problems are given at the end of the

Triola, M. M., & Triola, M. F. (2006). [Student companion website].

Boston, MA: Pearson Education, Inc. Retrieved from:

 

Supplemental Materials

  1. Koziol, J. (2014). MHA610_Week 3_Assignment_Data [Excel file].
  2. Koziol, J. (2014). MHA610_Week 3_Assignment_Data [Statdisk file].

 

 

Recommended Resources

Multimedia

Koziol, J. (Producer). (2014). MHA610 Week 3 Assignment (Part 1) [Video file]. Retrieved from

Koziol, J. (Producer). (2014). MHA610 Week 3 Assignment (Part 2) [Video file]. Retrieved from

  • These screencasts help explain the Week Three assignment.

 

Discussion

Participate in the following discussion:

 

Confidence Intervals. 1st Post Due by Day 3. In this discussion, we will investigate confidence intervals for binomial probabilities. The discussion is in two parts.

  • Return to the data you generated in the second part of the Week Two assignment. You should have total numbers of first-born boys and girls in your state between the years 2007 and 2012, separately by racial group: American Indians or Alaska Natives, Asian or Pacific Islanders, Black or African Americans, and Whites. For the first part of this discussion, construct and report the 95% confidence intervals for the proportions of first-born boys, separately for each racial group. (Use the normal approximation to the binomial distribution.) Comment on the confidence intervals: can you infer from the confidence intervals that the proportions of first-born boys differ among the racial groups? Explain what the widths of the confidence intervals tell you.
  • Leading up to elections, you often hear results of polls of voters’ preferences, with statements such as: “This poll was taken from a random sample of 600 potential voters, and has an accuracy exceeding 96%.” You may want to interpret the accuracy statement in terms of “margin of error”, as explained in the text, Section 6-2. Remember, the width of a confidence interval is a measure of the precision of the estimate.
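
As a rough illustration of both parts of this discussion, the following Python sketch computes a normal-approximation (Wald) confidence interval for a proportion and the worst-case margin of error for a poll. The function names and the birth counts are assumptions made up for this example; they are not taken from the Week Two data, and Statdisk or Excel can be used to obtain the same quantities.

```python
from math import sqrt
from statistics import NormalDist


def wald_ci(successes, n, level=0.95):
    """Normal-approximation (Wald) confidence interval for a binomial proportion."""
    p_hat = successes / n
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for a 95% interval
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


def worst_case_margin(n, level=0.95):
    """Largest possible margin of error for sample size n (attained at p = 0.5)."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return z * sqrt(0.25 / n)


# Hypothetical counts, for illustration only: 5,120 boys among 10,000 first births.
low, high = wald_ci(5120, 10000)
print(f"95% CI for the proportion of boys: ({low:.4f}, {high:.4f})")

# A poll of 600 voters: worst-case 95% margin of error.
moe_600 = worst_case_margin(600)
print(f"margin of error for n = 600: {moe_600:.3f}")
```

With 600 respondents the worst-case margin of error comes out close to four percentage points, which is one way to read the "accuracy exceeding 96%" statement. Note also that the interval narrows as the sample size grows, so groups with fewer births will tend to have wider intervals.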

Guided Response: Respond to at least two of your peers by Day 7, 11:59PM. Consider the 95% confidence intervals your colleague presented. Do all the intervals overlap with those you presented in your initial post? Did the inferences presented by your colleague match with yours? Compare the proportion of boy births in his or her state with those in your state. What statistically significant differences can you note? Do you concur with your colleague’s interpretation of the polling statement? What suggestions might you make to aid your colleague in evaluating this type of polling result? All initial and peer postings should be at least 250-500 words in APA format supported by scholarly sources.

Quiz

Week Three Quiz. Due by Day 6. Complete the 10-question quiz on the readings from Weeks One through Three. You may wish to review all of the odd-numbered questions from the text that you have completed in Weeks One, Two, and Three. There is no time limit for this quiz. You will have two attempts to take the quiz. If multiple attempts are made, eCollege will take the last grade earned, not the highest grade earned.

 

 

Assignment

Immune Responses. Due by Day 7. Background: Abnormal immune responses can trigger a range of autoimmune diseases, in which an individual’s immune system attacks normal tissues in the body. Well-known examples of autoimmune diseases are type 1 diabetes mellitus, lupus, and multiple sclerosis.

 

Ideally, one would like to harness the immune system to attack abnormal substances or tissues like cancer, while sparing the normal (unaffected) tissue. Many tumor cells produce antigens (proteins) that theoretically ought to trigger an immune response: that is, one’s immune system ought to recognize cancer cells as somehow foreign or abnormal, and thereafter eliminate these cells from the body. The field of cancer immunotherapy is actively pursuing this study.

 

Tumor antigens may also be useful for diagnostic tests; high levels of tumor antigens could be taken as markers or indicators of cancer. In this assignment, you will be examining levels of tumor-associated antigens (TAAs) as determined from immunoassays (i.e., biochemical tests that measure the concentrations of the tumor-associated antigens in serum samples).

  • Download the Excel file MHA610_Week 3_Assignment_Data.xls (available in the classroom), and open it.
  • The spreadsheet contains data on 250 individuals: 90 normal individuals from San Diego (the controls), and 160 individuals from Korea and China, all of whom were diagnosed with hepatocellular carcinoma (HCC).
    • Serum samples were taken from the controls and from the cases at time of diagnosis of HCC. Levels of a panel of 12 tumor-associated antigens (TAAs) were assessed via immunoassays in all individuals;
      • The levels are given in the columns with headings Ab14, HCC1, IMP1, KOC, MDM2, NPM1, P16, P53, P90, RaIA, and Survivin. (These are the designations of the 12 TAAs, all of which were thought to be potentially predictive of HCC.)
    • The underlying question is whether we can effectively discriminate between the cases and controls on the basis of the levels of these TAAs. This is sometimes termed a classification problem in the statistics and biostatistics literature: we wish to classify individuals as normal or cancer patients on the basis of their TAA levels.
    • We will examine these data in Statdisk. Use the MHA610_Week 3_Assignment_Data.CSV file (available in the classroom) to upload this information into Statdisk.

 

 

  • If you choose the latter option, start Statdisk, then choose File>Open and select the .csv file you created (unless you changed the name, it ought to be csv).
  • Check the box that specifies the data contains column titles or headers, select Comma separated for how the data are delimited, click Finish, and the dataset will have been successfully imported into Statdisk.
  • NOTE: you may want to read through the remainder of the assignment first, before proceeding with this step. This may save you some work afterwards!
  • Note that Statdisk operates on columns of data, and that both cases and controls are contained in each column of TAA levels. It will be necessary to separate the cases and controls for further analyses. This can be accomplished either by copying within Statdisk or by reverting to the original Excel workbook, copying in Excel, exporting as a .csv file, and then importing into Statdisk. (Don’t say you weren’t warned!)
  • Explain whether you would characterize any or all of the TAA levels as approximately normally distributed for the controls and for the cases.
    • Provide plots and statistics in support of your conclusions.
  • Explain whether any of the TAAs are useful for discriminating between the cases and controls.
    • Provide plots and statistics in support of your conclusions.
  • All writing assignments should be at least 250-500 words in APA format supported by scholarly sources.
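
The separation of cases and controls, and a first numerical look at normality, can also be sketched outside Statdisk. The Python example below is only an illustration under stated assumptions: the group labels, the helper names, and the ten TAA values are invented for this sketch and do not come from the classroom file. It splits one column by group and reports the mean, median, standard deviation, and sample skewness. For roughly normal data the mean and median should be close and the skewness near zero.

```python
from statistics import mean, median, stdev


def skewness(values):
    """Sample skewness (third standardized moment); near 0 for symmetric data."""
    m, s, n = mean(values), stdev(values), len(values)
    return sum(((v - m) / s) ** 3 for v in values) / n


def split_by_group(levels, groups):
    """Split one column of TAA levels into a dict keyed by group label."""
    out = {}
    for level, group in zip(levels, groups):
        out.setdefault(group, []).append(level)
    return out


# Made-up values for illustration; the real column would come from the .csv file.
levels = [0.11, 0.14, 0.09, 0.12, 0.10, 0.35, 0.18, 0.92, 0.25, 0.41]
groups = ["control"] * 5 + ["case"] * 5

by_group = split_by_group(levels, groups)
for label, vals in by_group.items():
    print(
        f"{label}: n={len(vals)} mean={mean(vals):.3f} median={median(vals):.3f} "
        f"sd={stdev(vals):.3f} skew={skewness(vals):.2f}"
    )
```

Summary numbers like these complement, but do not replace, the histograms, boxplots, or normal quantile plots the assignment asks for.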

 

BONUS. In the above, we pooled all cases together. Summarize whether you think this is legitimate or whether the levels of any of the TAAs appear to differ significantly between the cases from China and the cases from Korea. Provide evidence in support of your conclusion.

 

 

Week Four

 

 

Course Content

To be completed during the fourth week of class

 

Overview

 

Activity | Due Date | Format | Grading Percent
Exploring t-Tests and Confidence Intervals for Continuous Data | Day 3 (1st post) | Discussion | 4
A Crossover Clinical Trial | Day 7 | Assignment | 8

 

Weekly Learning Outcomes

This week students will

  1. Explain general principles of hypothesis testing.
  2. Calculate z tests and t tests.

 

 

Introduction

This week, you will learn general principles of hypothesis testing, including type I and type II errors, significance level, and power. You will then be introduced to inferential statistics arising in hypothesis testing, notably z tests and t tests. You will learn how to undertake z tests and t tests for single populations and for comparisons of two populations. You will explain and give examples of their use.

 

 

Required Resource

Text

Triola, M. M., & Triola, M. F. (2006). Biostatistics for the Biological and Health Sciences. Boston, MA: Pearson Education, Inc.

  • Chapter 7: Hypothesis Testing with One Sample
    • After reading the chapter, review your grasp of the material in Chapter 7 by solving the odd-numbered questions in the Review Exercises and the Cumulative Review Exercises at the end of Chapter 7. Solutions to these problems are given at the end of the text.
  • Chapter 8: Inferences from Two Samples
    • After reading the chapter, review your grasp of the material in Chapter 8 by solving the odd-numbered questions in the Review Exercises and the Cumulative Review Exercises at the end of Chapter 8. Solutions to these problems are given at the end of the text.

Triola, M. M., & Triola M. F. (2006). [Student companion website].

Boston, MA: Pearson Education, Inc. Retrieved from:

 

Supplemental Materials

  1. Koziol, J. (2014). MHA610_Week 4_Assignment_Crossover_Trial_Data [Excel file].
  2. Koziol, J. (2014). MHA610_Week 4_Assignment_Crossover_Trial_Data [Statdisk file].

 

 

Recommended Resource

Multimedia

Koziol, J. (Producer). (2014). MHA610 Week 4 Assignment [Video file]. Retrieved from

  • This screencast helps explain the Week Four assignment.

 

Discussion

Participate in the following discussion:

 

 

Exploring t-Tests and Confidence Intervals for Continuous Data. 1st Post Due by Day 3. In this discussion, we will investigate t-tests and confidence intervals for continuous data. To do this, we will revisit the TAA data that you studied in the Week Three assignment.

 

You may recall from the Week Three assignment that you have available data on 12 TAAs, from 90 normal individuals (controls) and 160 hepatocellular carcinoma patients (cases). These data are in the Excel file MHA610_Week 3_Assignment_data.xls (available in the classroom); the levels of the 12 TAAs are given in the columns with headings Ab14, HCC1, IMP1, KOC, MDM2, NPM1, P16, P53, P90, RaIA, and Survivin.

 

 

 

  • First, randomly select three of the 12 TAAs for further analysis.

  • Next, perform two sample t-tests for comparing the levels of each of your three TAAs between the cases and the controls.

  • Then, use the t-tests to order the TAAs in terms of relative ability to discriminate between the cases and controls, from best to worst discriminator. Is this ordering helpful if you want to select a subset of TAAs to discriminate between cases and controls? Assume for now that you can judge the relative merits of your three TAAs by the magnitudes of their respective two-sided p-values from the two sample t-tests, so that your best discriminator is the TAA with the smallest p-value.

  • Lastly, construct and report 95% confidence intervals for the mean level of your best TAA discriminator in the controls, the mean level of your best TAA discriminator in the cases, and the difference in mean levels (cases – controls). Discuss whether your confidence intervals are concordant with the t-tests.
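
The steps above can be sketched in Python as follows. This is a minimal illustration, not the required method: it uses the unequal-variance (Welch) form of the two sample t statistic, ranks the TAAs by the size of |t| as a stand-in for ranking by p-value, and builds the confidence interval for the difference with a normal critical value, which is a large-sample approximation that is reasonable for 90 controls and 160 cases but not for the tiny toy samples shown. The names TAA_1 and TAA_2 and all numeric values are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def welch_t(a, b):
    """Welch two-sample t statistic and Welch-Satterthwaite degrees of freedom."""
    va, vb = variance(a) / len(a), variance(b) / len(b)
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


def diff_ci_large_sample(a, b, level=0.95):
    """CI for mean(a) - mean(b) using a normal critical value (large-sample approximation)."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    d = mean(a) - mean(b)
    return d - z * se, d + z * se


# Made-up levels for two hypothetical TAAs (not the classroom data).
cases = {"TAA_1": [0.9, 1.1, 1.4, 1.0, 1.3, 1.2], "TAA_2": [0.5, 0.7, 0.4, 0.6, 0.8, 0.5]}
controls = {"TAA_1": [0.4, 0.5, 0.6, 0.5, 0.4, 0.6], "TAA_2": [0.5, 0.6, 0.4, 0.5, 0.7, 0.6]}

# Order the TAAs from best to worst discriminator by the magnitude of t.
ranked = sorted(cases, key=lambda k: abs(welch_t(cases[k], controls[k])[0]), reverse=True)
print("ranking:", ranked)

best = ranked[0]
ci_low, ci_high = diff_ci_large_sample(cases[best], controls[best])
print(f"{best}: 95% CI for (cases - controls) = ({ci_low:.3f}, {ci_high:.3f})")
```

A confidence interval for the difference that excludes zero agrees with a two-sided t-test that rejects equality of means at the corresponding significance level, which is the concordance the last step asks you to discuss.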

 

Guided Response: Respond to at least two of your peers by Day 7, 11:59PM. Do your t-tests and ordering coincide with those of your colleague? If not, why? Do you agree with your colleague’s assessment of the usefulness of the ordering to discriminate between cases and controls? Why? Did your best TAA discriminator agree with that of your colleague? If not, why not? Are your confidence intervals identical to those of your colleague? If not, can you determine where a mistake was made? All initial and peer postings should be at least 250-500 words in APA format and supported by scholarly sources.

 

 

Assignment

A Crossover Clinical Trial. Due by Day 7. Background: Randomized controlled trials are the gold standard for clinical research. Biostatisticians are heavily involved in such trials, from the planning stage (e.g., sample size and power considerations) through the analysis of findings (e.g., estimation of treatment effects). In this assignment, we will examine treatment outcomes in a two treatment, two period (two-by-two) crossover design.

 

In the two-by-two crossover design, subjects are randomly assigned to one of two groups. The first group initially receives treatment A in the first period of the trial followed by treatment B in the second period of the trial, and the other group initially receives treatment B in the first period of the trial followed by treatment A in the second period. The response, or primary endpoint of the trial, is measured at least twice in each patient, at the end of the first period and again at the end of the second period. Each patient is his or her own control for comparison of treatment A and treatment B.

 

Crossover designs are used when the treatments alleviate a condition, rather than effect a cure. After the response to the treatment administered in the first period is measured, there is a washout period in which any lingering effect of the treatment administered in the first period dissipates, and then the response to the second treatment is measured.

 

An advantage of a crossover design is increased precision afforded by comparison of both treatments on the same subject, compared to a parallel group clinical trial (in which patients are randomized onto different treatment arms). Disadvantages of crossover trials are complex statistical analyses of findings (typically, by complex analyses of variance), potential difficulties in separating the treatment effects from the time effect (patients may respond differently in the first period and the second period), and the carryover effect (the effect of the treatment given in the first period may not totally wash out, but may carry over onto the second period).

 

We will give a simple example of a two-by-two crossover trial, and undertake analyses of the trial results via t tests. The trial was meant to assess the efficacy of a new experimental therapy for interstitial cystitis (IC). Interstitial cystitis is a chronic bladder condition affecting primarily women; symptoms include bladder pressure and pain, urgency, and occasionally pelvic pain. The new experimental therapy was meant to reduce pain and urgency relative to standard therapy. A total of 24 patients were enrolled in the trial; trial results are given in the Excel workbook titled MHA610_Week 4_Assignment_Crossover_Trial_Data.xls (available in the classroom).

 

Open the workbook, and examine the worksheet. The first row contains column headings, and the next 24 rows represent the 24 patients entered into the trial. The group one patients received experimental therapy in the first period of the trial followed by standard therapy in the second period of the trial. The group two patients received standard therapy in the first period of the trial followed by experimental therapy in the second period.

 

The primary outcome of the trial was an area under the curve (AUC) calculation of relative pain and urgency the patient experienced following therapy: the smaller the AUC, the less severe the patient’s pain and urgency.

AUC_period1 denotes each patient’s AUC during the first period of the trial, and AUC_period2 denotes the patient’s AUC during the second period of the trial. The column headed Rx denotes the treatment each patient received during the first period of the trial.

  • We will first test for carryover effects.
    • The t test formulation for the test for carryover proceeds as follows: calculate the total (sum) of the AUC_period1 and AUC_period2 values for each patient in group one (12 patients) and separately for each patient in group two (12 patients).
    • The test for carryover is the two sample t test for assessing whether these AUC totals differ significantly between group one and group two, under the assumption that the variances of the AUC totals in the two groups are equal.
    • Calculate the sample means and standard deviations for the AUC totals for each group, and perform the two sample t test. Analyze whether there is a significant carryover effect in this clinical trial.

 

  • We will next test for treatment effects.
    • The t test formulation for assessing treatment effects proceeds as follows:
      • Calculate the difference of the AUC values for each patient in group one, that is, the 12 individual AUC_period1 – AUC_period2 values, and similarly calculate for each patient in group two.
      • If there is no treatment effect, one would expect the AUC_period1 and AUC_period2 values to be similar, except perhaps for an offset due to period effects; we need to account for potential period effects when we compare the group one and group two AUC differences.
      • It turns out that the t test for a treatment effect is the two sample t test for assessing whether these AUC_period1 – AUC_period2 differences differ significantly between group one and group two, under the assumption that the variances of the AUC differences are the same in the two groups.
    • Calculate the sample means and standard deviations for the AUC differences as defined above in each group, and perform the two sample t test. Analyze whether there is a significant treatment effect in this clinical trial.
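
The two tests just described can be sketched in Python. This is an illustrative sketch under stated assumptions, not the course's prescribed procedure: the AUC values are invented (four patients per group instead of twelve), the helper names are hypothetical, and the two-sided p-value is obtained by numerically integrating the Student t density with Simpson's rule rather than by calling a statistics package.

```python
from math import exp, lgamma, pi, sqrt
from statistics import mean, variance


def pooled_t(a, b):
    """Equal-variance (pooled) two-sample t statistic and its degrees of freedom."""
    na, nb = len(a), len(b)
    sp2 = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    t = (mean(a) - mean(b)) / sqrt(sp2 * (1 / na + 1 / nb))
    return t, na + nb - 2


def two_sided_p(t, df, steps=2000):
    """Two-sided p-value via Simpson's rule on the Student t density from 0 to |t|."""
    c = exp(lgamma((df + 1) / 2) - lgamma(df / 2)) / sqrt(df * pi)

    def pdf(x):
        return c * (1 + x * x / df) ** (-(df + 1) / 2)

    h = abs(t) / steps  # steps must be even for Simpson's rule
    area = pdf(0) + pdf(abs(t))
    for i in range(1, steps):
        area += (4 if i % 2 else 2) * pdf(i * h)
    return max(0.0, 1 - 2 * area * h / 3)


# Made-up (AUC_period1, AUC_period2) pairs, four patients per group.
group1 = [(10, 14), (12, 15), (9, 13), (11, 16)]  # experimental, then standard
group2 = [(15, 11), (14, 9), (16, 12), (13, 10)]  # standard, then experimental

totals1 = [p1 + p2 for p1, p2 in group1]
totals2 = [p1 + p2 for p1, p2 in group2]
diffs1 = [p1 - p2 for p1, p2 in group1]
diffs2 = [p1 - p2 for p1, p2 in group2]

t_carry, df = pooled_t(totals1, totals2)  # carryover test uses the totals
t_treat, _ = pooled_t(diffs1, diffs2)     # treatment test uses the differences

print(f"carryover: t = {t_carry:.3f}, df = {df}, p = {two_sided_p(t_carry, df):.4f}")
print(f"treatment: t = {t_treat:.3f}, df = {df}, p = {two_sided_p(t_treat, df):.6f}")
```

With the actual trial data each group has 12 patients, so both tests have 22 degrees of freedom. A nonsignificant carryover test is usually taken as support for interpreting the treatment test.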

 

Here’s an informal explanation of this t test. Consider the following schematic representation of the two-by-two crossover trial.

 

 

Group | Period One | Period Two
1. AB Sequence | Treatment A + Period One | Treatment B + Period Two
2. BA Sequence | Treatment B + Period One | Treatment A + Period Two

 

In this representation, Treatment A is the direct effect of treatment A on each patient’s response (AUC value) and similarly for Treatment B; Period One is the effect of period one on each patient’s response and similarly for Period Two. (We are assuming there are no carryover effects.)

 

Now, consider first the individuals in group one. During Period One, their responses (i.e., AUC_period1 values) are estimating effects due to treatment A and period one. During Period Two, their responses (i.e., AUC_period2 values) are estimating effects due to treatment B and period two. So when we take the average of the group one AUC_period1 – AUC_period2 values (let’s call this average x), we have a combined estimate of the effects (Treatment A – Treatment B) + (Period 1 – Period 2).

 

Next, consider the individuals in group two. When we take the average of the group two AUC_period1 – AUC_period2 values (let’s call this average y), we have a combined estimate of the effects (Treatment B – Treatment A) + (Period 1 – Period 2).

 

Lastly, consider the random variable Z = x – y, that is, the group one average difference minus the group two average difference. This random variable estimates solely a treatment quantity, namely twice (Treatment A – Treatment B); the period effects (Period 1 – Period 2) cancel out. Under the null hypothesis of no treatment effects, (Treatment A – Treatment B) = 0, so the mean of Z should be zero. The two sample t test for treatment effects outlined above is equivalent to the t test of whether the mean of Z equals zero. Note that since we have equal numbers of patients in group one and group two, there was no need to take sample means when we constructed our t test; but in general, with unequal sample sizes, you should work with sample means when performing the t tests.
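
The cancellation of the period effects can be written out explicitly. In the sketch below, the symbols τ_A and τ_B (direct treatment effects) and π_1 and π_2 (period effects) are notation introduced here for illustration only; x̄ and ȳ stand for the group one and group two averages of the AUC_period1 – AUC_period2 differences, and carryover effects are assumed to be absent.

```latex
\begin{aligned}
E[\bar{x}] &= (\tau_A - \tau_B) + (\pi_1 - \pi_2), \\
E[\bar{y}] &= (\tau_B - \tau_A) + (\pi_1 - \pi_2), \\
E[Z] = E[\bar{x}] - E[\bar{y}] &= 2\,(\tau_A - \tau_B).
\end{aligned}
```

The factor of 2 does not affect the hypothesis test, since 2(τ_A – τ_B) is zero exactly when τ_A – τ_B is zero; dividing Z by 2 gives an estimate of the treatment difference itself.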

 

Briefly summarize your findings from this trial. Explain whether the new treatment appears promising in 500 words in APA format supported by scholarly sources.

 

BONUS. Graphical representations of the findings can be quite illuminating. As a bonus, you are asked to prepare graphical representation(s) of the data. For example, you might prepare a simple plot of mean responses (mean AUC values) for each treatment arm and for each period. Or, you could give patient profile plots of individual AUC values by period and treatment. Describe whether histograms, boxplots, or scatter plots would work with these data. If you assume that there are no significant carryover or period effects in this trial, explain how you would display the treatment effects in 250 words in APA format supported by scholarly sources.

 

 

Week Five

 

 

Course Content

To be completed during the fifth week of class

 

Overview

 

Activity | Due Date | Format | Grading Percent
Graphs | Day 3 (1st post) | Discussion | 4
Brain Size and Intelligence | Day 7 | Assignment | 8

 

Weekly Learning Outcomes

This week students will

  1. Distinguish between correlation and regression with multivariable data.
  2. Apply univariate and multiple regression analyses to data.
  3. Evaluate chi-square tests for goodness of fit with multinomial data.
  4. Evaluate chi-square tests for contingency table data.

 

 

Introduction

You will be introduced to the notions of correlation and regression: you will utilize these techniques, and then you will evaluate and interpret results of your analyses. You will learn that correlation does not imply causation; whereas in regression, the notions of dependent variable and independent variable connote more of a cause-and-effect association.

 

You will learn about goodness-of-fit tests for multinomial distributions, and chi-square tests for contingency tables. These are statistical tests for discrete data and are also meant to reinforce the concept that different statistical procedures need to be utilized in research, depending on what study questions are asked, and what data are available to you for analysis and assessment.

 

 

Required Resources

Text

Triola, M. M., & Triola, M. F. (2006). Biostatistics for the Biological and Health Sciences. Boston, MA: Pearson Education, Inc.

  • Chapter 9: Correlation and Regression
    • After reading the chapter, review your grasp of the material in Chapter 9 by solving the odd-numbered questions in the Review Exercises and the Cumulative Review Exercises at the end of Chapter 9. Solutions to these problems are given at the end of the text.
  • Chapter 10: Multinomial Experiments and Contingency Tables
    • After reading the chapter, review your grasp of the material in Chapter 10 by solving the odd-numbered questions in the Review Exercises and the Cumulative Review Exercises at the end of Chapter 10. Solutions to these problems are given at the end of the text.

Triola, M. M., & Triola M. F. (2006). [Student companion website].

Boston, MA: Pearson Education, Inc. Retrieved from:

 

Supplemental Materials

  1. Koziol, J. (2014). MHA610_Week 5_Discussion_regression_data [Excel file].
  2. Koziol, J. (2014). MHA610_Week 5_Discussion_regression_data [Statdisk file].
  3. Koziol, J. (2014). MHA610_Week 5_Assignment_Brain_Data [Excel file].
  4. Koziol, J. (2014). MHA610_Week 5_Assignment_Brain_Data [Statdisk file].

 

 

Recommended Resource

Multimedia

Koziol, J. (Producer). (2014). MHA610 Week 5 Assignment [Video file]. Retrieved from

  • The video helps explain the Week Five assignment.

 

Discussion

Participate in the following discussion:

 

Graphs. 1st Post Due by Day 3. It is important to look at data in a graphical form. Patterns are the essence of data exploration, and the eye’s ability to discern forms and patterns makes visual display integral to the process. The visual display of quantitative information can help us see connections and relationships in the data, which are oftentimes difficult to detect in tables of numbers. We should look at data in a graphical form, and not rely solely on computational or statistical metrics.

 

In this discussion, we will explore graphs in linear regression. Our data are taken from a 1973 article by Frank Anscombe in The American Statistician, which discusses scatterplots in relation to regression analyses.

 

First, download the dataset MHA610_Week 5_Discussion_regression_data.xls (available in the classroom). This is a simple Excel workbook, with data on one sheet. There are eight columns of data, with headings X1, Y1, X2, Y2, X3, Y3, X4, Y4. Import the data into Statdisk using the MHA610_Week 5_Discussion_regression_data.csv file (available in the classroom), and perform the following analyses.

 

  • Calculate the regressions of Y1 on X1, Y2 on X2, Y3 on X3, and Y4 on X4, and compare the results (summary statistics). Explain what, if anything, you find unusual about these results.
  • Plot each set of data, along with the fitted regression line. Describe what the graphs tell you about the relationships between the X’s and the Y’s.
  • Explain what lessons you draw from this exercise.
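
For readers who want to cross-check Statdisk's regression output, here is a minimal Python sketch of ordinary least squares applied to two of the four data sets. The values below are the first two sets as commonly reproduced from Anscombe (1973); treat them as an assumption, since the classroom file may be arranged differently, and rely on the classroom file for your own post. The function name is hypothetical.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x, plus Pearson's r."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx, sxy / (sxx * syy) ** 0.5


# First two data sets as commonly reproduced from Anscombe (1973); they share X values.
x = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
y1 = [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]
y2 = [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]

fits = {name: fit_line(x, y) for name, y in (("set 1", y1), ("set 2", y2))}
for name, (slope, intercept, r) in fits.items():
    print(f"{name}: slope={slope:.3f} intercept={intercept:.3f} r={r:.3f}")
```

The numerical summaries alone cannot tell you whether a straight line is an appropriate description of each data set, which is why the discussion also asks for plots.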

 

Place the summary statistics and the plots in a separate Word document and attach that document to your initial post. Address the questions in the body of your initial discussion post.

 

Guided Response: Respond to at least two of your peers by Day 7, 11:59PM. Do your summary statistics and plots agree with those of your colleague? If not, how and why do they differ? Did your colleague’s conclusions broaden your perspective on linear regression? All initial and peer postings should be at least 250-500 words in APA format supported by scholarly sources.

 

 

Assignment

Brain Size and Intelligence. Due by Day 7. Background: Is brain size a measure of intelligence? Brain size tends to vary with body size: for example, sperm whales and elephants have brains up to five times as massive as human brains. So across species, brain size is not a perfect measure of intelligence. And within species, the underlying organization (complexity of connections) and molecular activity of the brain are likely to be more directly associated with intelligence than mere size.

 

In this assignment, we will investigate relationships between physiological measures of the brain, and intelligence. Download and open the Excel workbook, MHA610_Week 5_Assignment_Brain_Data.xls (available in the classroom). The workbook contains data on 20 youths, in rows two through 21. Eight variables (the columns) were recorded on each individual; the column headings are given in row one. The column headings are as follows:

 

IQ        the individual’s IQ
ORDER     the birth order (1 = firstborn, 2 = not firstborn)
PAIR      marker for genotype
SEX       gender, 1 = male, 2 = female
CCSA      corpus callosum surface area (in cm^2)
HC        head circumference (in cm)
TOTSA     total brain surface area (in cm^2)
TOTVOL    total brain volume (in cm^3)
WEIGHT    body weight (in kg)

 

The neuroanatomical measures CCSA, TOTSA, and TOTVOL were determined from magnetic resonance imaging (MRI) of the brains, followed by automated image analyses of the scans. The corpus callosum is a bundle of neural fibers beneath the cortex, connecting the left and right cerebral hemispheres of the brain; it is the communication highway between the two hemispheres. (The more lanes to the highway, the faster the traffic ought to flow.)

 

The following questions can be answered in Excel, StatDisk, or other statistics software you may have available.

  • Examine all of the pairwise correlations among the physiological measures CCSA, HC, TOTSA, TOTVOL, and WEIGHT. Which two variables have the strongest correlation? Report the correlation, and plot the scattergram for these two variables. Also, report the correlation and plot the scattergram for the two variables that have the weakest correlation.

 

 

  • Determine whether the physiological parameters CCSA, HC, TOTSA, TOTVOL, and WEIGHT are significant predictors of IQ. That is, run a sequence of univariate regressions, with IQ as the dependent variable, and the physiological parameters as the independent variables. Report the best univariate regression with statistics and a graph of the regression. Describe whether IQ can be accurately predicted from any of these brain measures individually or in combination.
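
The search for the strongest and weakest pairwise correlations can be sketched in Python as below. This is only an illustration under stated assumptions: the measurements are made up for five hypothetical subjects and three of the variables, so the resulting pairs say nothing about the real data set; strength is judged by the absolute value of Pearson's r.

```python
from itertools import combinations
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


# Made-up measurements for five hypothetical subjects (not the classroom data).
data = {
    "HC": [54.0, 55.5, 53.2, 56.1, 54.8],
    "TOTVOL": [1005, 1100, 960, 1150, 1040],
    "WEIGHT": [57, 61, 70, 52, 66],
}

# Correlation for every unordered pair of variables.
pairs = {(a, b): pearson_r(data[a], data[b]) for a, b in combinations(data, 2)}
strongest = max(pairs, key=lambda p: abs(pairs[p]))
weakest = min(pairs, key=lambda p: abs(pairs[p]))

for pair, r in pairs.items():
    print(pair, f"r = {r:.3f}")
print("strongest:", strongest, "weakest:", weakest)
```

The same `pearson_r` helper squared gives the R-squared of a univariate regression, so it can also be used to compare the five single-predictor regressions of IQ.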

 

BONUS. Power law distributions, that is, functional relationships between two variables in which one variable is roughly a power of the other, are often used to model physiological data. One of the oldest power laws, the square-cube law, was introduced by Galileo in the 1600s: empirically, the square-cube law states that as a shape grows in size, its volume grows faster than its surface area. We shall investigate the square-cube law with two variables from our dataset, CCSA and TOTVOL. If CCSA varies with some power of TOTVOL, for example, CCSA = k * (TOTVOL)^b (here k is an unknown constant and b is the exponent), then a simple way of estimating the exponent is via linear regression: take log(CCSA) as the dependent variable and log(TOTVOL) as the independent variable; the fitted regression coefficient (slope) is an estimate of the exponent. (Do you see why this is true?) Perform this linear regression, and report your results. Describe whether the regression coefficient is significantly different from 2/3. (The 2/3rd power law occurs often in nature.)
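
As a hedged illustration of the log-log approach: taking logarithms of a power law y = k * x^b gives log(y) = log(k) + b * log(x), which is a straight line in log(x) with slope b. The Python sketch below checks this on synthetic values generated exactly from a 2/3 power law; the constant 0.05, the volumes, and the function name are all assumptions for the example, not estimates from the brain data.

```python
from math import exp, log
from statistics import mean


def loglog_fit(x, y):
    """Least-squares fit of log(y) = intercept + slope * log(x)."""
    lx = [log(v) for v in x]
    ly = [log(v) for v in y]
    mx, my = mean(lx), mean(ly)
    sxx = sum((a - mx) ** 2 for a in lx)
    sxy = sum((a - mx) * (b - my) for a, b in zip(lx, ly))
    slope = sxy / sxx
    return slope, my - slope * mx


# Synthetic data generated exactly from y = 0.05 * x ** (2/3); not real measurements.
totvol = [900.0, 1000.0, 1100.0, 1200.0, 1300.0]
ccsa = [0.05 * v ** (2 / 3) for v in totvol]

slope, intercept = loglog_fit(totvol, ccsa)
k_hat = exp(intercept)
print(f"estimated exponent = {slope:.4f}, estimated k = {k_hat:.4f}")
```

With real, noisy data the slope will not equal 2/3 exactly. One common way to judge the difference is the statistic (slope – 2/3) divided by the standard error of the slope, referred to a t distribution with n – 2 degrees of freedom.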

 

 

Week Six

 

 

Course Content

To be completed during the sixth week of class

 

Overview

 

Activity Due Date Format Grading Percent
Health and Nutritional Status | Day 3 (1st post) | Discussion | 4
Week Six Quiz | Day 6 | Quiz | 8
Final Project | Day 7 | Assignment | 18

 

Weekly Learning Outcomes

This week students will

  1. Explain analysis of variance as a generalization of two sample z and t tests.
  2. Apply one-way and two-way analyses of variance to data.

 

 

Introduction

In this week, you will be introduced to a class of statistical procedures known collectively as analysis of variance (ANOVA). In its basic form, ANOVA is a statistical procedure for assessing whether the means of several groups are equal. So, for example, a one-way ANOVA is a straightforward generalization of t tests when you are presented with more than two groups of observations. Two-way and higher-order ANOVAs allow simultaneous inference on separate groupings and are qualitatively similar to multivariable linear regression in concept and aims.

 

 

Required Resources

Text

 

 

Triola, M. M., & Triola, M. F. (2006). Biostatistics for the Biological and Health Sciences. Boston, MA: Pearson Education, Inc.

  • Chapter 11: Analysis of Variance
    • After reading the chapter, review your grasp of the material in Chapter 11 by solving the odd-numbered questions in the Review Exercises and the Cumulative Review Exercises at the end of Chapter 11. Solutions to these problems are given at the end of the text.

Triola, M. M., & Triola M. F. (2006). [Student companion website].

Boston, MA: Pearson Education, Inc. Retrieved from:

 

Supplemental Materials

  1. Koziol, J. (2014). MHA610_Week 6_Discussion_NNYFS_workingdata [Statdisk file].
  2. Koziol, J. (2014). MHA610_Week 6_Discussion_NNYFS_workingdata [Excel file].

 

 

Websites

Centers for Disease Control and Prevention. (2014). Retrieved from

  • This website houses data that is useful in the Health and Nutritional Status discussion for this week.

Centers for Disease Control and Prevention. (2014). Retrieved from

  • This website will assist you with the final project for this course.

 

 

Recommended Resources

Multimedia

Koziol, J. (Producer). (2014). [Video file]. Retrieved from

+births%29/0_zipwy4i7

Koziol, J. (Producer). (2014). [Video file]. Retrieved from

+births%29/0_ignho54w

Koziol, J. (Producer). (2014). [Video file]. Retrieved from

+brainsize%29/0_qhhcxu1d

 

 

Koziol, J. (Producer). (2014). [Video file]. Retrieved from

+crossover%29/0_srbv0wj1

Koziol, J. (Producer). (2014). [Video file]. Retrieved from

+HCCtest%29/0_4qwt7r8z

Koziol, J. (Producer). (2014). [Video file]. Retrieved from

 

Discussion

Participate in the following discussion:

 

 

Health and Nutritional Status. 1st Post Due by Day 3. Since 1971, the National Center for Health Statistics has been assessing the health and nutritional status of both children and adults in the United States through the periodic National Health and Nutrition Examination Survey (NHANES). These surveys are an invaluable resource for epidemiological and public health research; the surveys can be used to determine the prevalence of major diseases and risk factors, to assess nutrition and health promotion, and to guide public health policy.

All initial and peer postings should be at least 250-500 words in APA format supported by scholarly sources.

 

 

In 2012, the NHANES National Youth Fitness Survey (NNYFS) was conducted in conjunction with NHANES to obtain physical activity and fitness levels of U.S. youths aged 3 through 15. Initial data from the NNYFS were released in 2013 and serve as the basis for this discussion problem.

 

Begin by downloading the Excel file MHA610_Week 6_Discussion_NNYFS_workingdata.xls (available in the classroom). This workbook was created by merging two datasets from the NNYFS. For the purposes of this discussion, many variables were eliminated from the original datasets, as were observations with missing data on height and weight. The Excel workbook thus consists of one worksheet, with 1576 rows (the first row contains headers, and the next 1575 rows are observed values for the participants) and 11 columns of variables. The columns in the Excel file are the following:

 

SEQN                           respondent sequence number (index for all the files)

RIAGENDR                 gender of the participant: 1 = male, 2 = female

RIDRETH1                  race/Hispanic origin: 1 = Mexican American, 2 = other Hispanic, 3 = non-Hispanic white, 4 = non-Hispanic black, 5 = other

RIDEXAGY                 age in years at time of physical exam

INDHHIN2                   annual household income, categorized

INDFMIN2                   annual family income, categorized

INDFMPIR                   ratio of family income to poverty, 0 to 5

BMXWT                      weight, in kg

BMXHT                       height, in cm

BMXBMI                     body mass index (kg/m^2)

BMDBMIC                   BMI category: 1 = underweight, 2 = normal weight, 3 = overweight, 4 = obese, . = missing

 

 

For purposes of this discussion, you are asked to answer the following three questions:

  1. Does BMI vary significantly between boys and girls?
  2. Does BMI vary significantly among the racial/ethnic groups?
  3. Is there any trend to BMI with age?

 

Comments:

There are several ways to address these questions. For example, you might take BMXBMI as your outcome variable of interest: it is continuous, so you could then perform a two-sample t test for question (1), a one-way analysis of variance for question (2), and a simple regression analysis (with age as the predictor variable) for question (3).
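As a rough illustration of the three continuous-outcome analyses just described, the following Python sketch computes a two-sample t statistic, a one-way ANOVA F statistic, and a least-squares regression line using only the standard library. It is not the course's Statdisk or Excel workflow. The function names and the tiny data lists are hypothetical, and the t test is written in Welch's unequal-variance form, which is one common choice that the assignment does not prescribe. The statistics would still need to be compared with the appropriate t or F distribution to obtain p-values.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(x, y):
    """Welch two-sample t statistic and its approximate degrees of freedom."""
    vx, vy = variance(x) / len(x), variance(y) / len(y)
    t = (mean(x) - mean(y)) / sqrt(vx + vy)
    df = (vx + vy) ** 2 / (vx ** 2 / (len(x) - 1) + vy ** 2 / (len(y) - 1))
    return t, df


def one_way_anova_f(groups):
    """F statistic for a one-way ANOVA, with (between, within) degrees of freedom."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, (df_between, df_within)


def simple_regression(x, y):
    """Least-squares slope and intercept for regressing y on x."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


# Hypothetical toy values, NOT taken from the NNYFS workbook.
boys_bmi = [16.0, 18.0, 20.0, 22.0]
girls_bmi = [17.0, 19.0, 21.0, 23.0]
t_stat, t_df = welch_t(boys_bmi, girls_bmi)

bmi_by_group = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [3.0, 4.0, 5.0]]
f_stat, f_df = one_way_anova_f(bmi_by_group)

ages = [3, 6, 9, 12, 15]
bmis = [15.0, 16.0, 17.0, 18.0, 19.0]
slope, intercept = simple_regression(ages, bmis)
```

With the real workbook, the lists would be built from the BMXBMI, RIAGENDR, RIDRETH1, and RIDEXAGY columns.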

 

 

Alternatively, you might reduce the problem to consideration of binomial probabilities: for example, you could classify everyone as obese or not obese (or maybe, overweight/obese vs underweight/normal), then compare binomial outcomes for (1) and (2) (z tests with the normal approximation or contingency tables), and conduct a t test on ages for (3).
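The binomial alternative can be sketched in the same spirit. The Python below recodes the BMDBMIC category into an obese/not obese indicator and computes a pooled two-proportion z test with a two-sided p-value from the normal approximation. The function names and counts are hypothetical, chosen only to show the arithmetic.

```python
from math import erfc, sqrt


def is_obese(bmi_category):
    """Recode BMDBMIC: category 4 means obese, anything else counts as not obese."""
    return bmi_category == 4


def two_proportion_z(successes_a, n_a, successes_b, n_b):
    """Pooled two-proportion z statistic and two-sided normal p-value."""
    p_a, p_b = successes_a / n_a, successes_b / n_b
    pooled = (successes_a + successes_b) / (n_a + n_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / std_err
    p_value = erfc(abs(z) / sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return z, p_value


# Hypothetical counts: 30 of 100 boys and 20 of 100 girls classified as obese.
z_stat, p_value = two_proportion_z(30, 100, 20, 100)
```

Whether "overweight" is grouped with "obese" is a modeling choice; changing `is_obese` to test for categories 3 and 4 would implement the other split mentioned above.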

 

Neither approach is wrong—the key is interpreting your findings!

 

 

If you prefer to do the analyses in Statdisk, there is a file, MHA610_Week 6_Discussion_NNYFS_workingdata.csv (available in the classroom), ready to be read into Statdisk. (It’s the original Excel workbook, saved as csv.) No need to go through any additional steps, unless you wish to restructure the data in Excel.

 

Incidentally, the income variables are not needed for these questions, but as a bonus, you might want to investigate whether obesity is related to socioeconomic status (as reflected by family income).

 

Guided Response: By Day 7, 11:59 PM, respond to at least two of your peers who chose a different method of analysis than you did. Did you arrive at the same conclusions as your colleague even though you chose different methods? If so, which method do you think is preferable and why? If not, which method do you believe produces more credible results and why? (You might consult the text to support your argument.)

 

 

 

 

Quiz

 

Week Six Quiz. Due by Day 6. Complete this quiz on the readings from Weeks Four through Six. It may be helpful to review the odd-numbered questions from your text that you completed in Weeks Four, Five, and Six. There is no time limit for this quiz. You will have two attempts to take the quiz. If you make multiple attempts, eCollege will record the last grade earned, not the highest grade earned.

 

 

Final Project

Final Project. Due by Day 7. In this final assignment, we will revisit datasets that we have utilized in previous assignments, but with new objectives.

 

  • In the Week One assignment, you looked at mortality in your particular state, with two different metrics: the first was numbers of deaths, and the second was years of life lost. For this question, return to the original dataset, but this time first pool all cancer causes of death together, so that cancer constitutes the only category for cause of death. Then, repeat your analyses from Week One. How do your conclusions change?
  • In the Week Two assignment, you looked at sex ratios for births in your state.
    • Take the data you have assembled from the second part of your Week Two assignment, namely, numbers of first-born boy and girl births in your state between 2007 and 2012, separately by racial group (i.e., American Indians, Asians, Blacks, and Whites). Form a two-by-four contingency table from these data: the two row categories are female (girl) and male (boy), and the four column categories are the four racial groups. Calculate the chi-square statistic from this contingency table, and interpret the results.
    • Return to the website, and obtain the numbers of births in your state between 2007 and 2012, by month. (Disregard gender, race, and birth order; you want all births.) Calculate a chi-square statistic to assess whether there is any seasonality to births. (Your null hypothesis is that births should be equally likely to occur in any of the 12 months. We are ignoring the varying lengths of the months to simplify calculations.) How would you interpret your findings? Explain in 500 words in APA format supported by scholarly sources.

 

BONUS: Give a graphical representation of your findings for this portion highlighting what you consider significant.
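The two chi-square calculations described for the Week Two data follow the same recipe: sum (observed - expected)^2 / expected over all cells. The Python sketch below shows both the goodness-of-fit version (equal expected counts across 12 months) and the contingency-table version (expected counts from row and column totals). The function names and all counts are hypothetical placeholders, not real birth data. The 5% critical values quoted in the comments are the standard chi-square table values for 11 and 3 degrees of freedom.

```python
def chi_square_equal_months(observed):
    """Goodness-of-fit statistic when every category has the same expected count."""
    expected = sum(observed) / len(observed)
    return sum((o - expected) ** 2 / expected for o in observed)


def chi_square_table(table):
    """Chi-square statistic and degrees of freedom for an r-by-c contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df


# Hypothetical monthly birth counts (January through December).
monthly_births = [110, 90, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100]
seasonality_stat = chi_square_equal_months(monthly_births)  # df = 12 - 1 = 11
# Reject equal monthly rates at the 5% level if the statistic exceeds 19.675.

# Hypothetical 2 x 4 table: rows are girl/boy, columns are the four racial groups.
births_by_sex_and_race = [[20, 30, 25, 25],
                          [30, 20, 25, 25]]
table_stat, table_df = chi_square_table(births_by_sex_and_race)
# With df = 3, the 5% critical value is 7.815.
```

With several years of state-level births the counts will be very large, so even small departures from the null hypothesis can produce a large statistic; that is worth keeping in mind when interpreting the result.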

 

  • In the Week Three assignment, you were given levels of tumor-associated antigens in a sample of 90 normal (non-cancer) individuals, and 160 hepatocellular carcinoma (HCC) patients. Here is a proposed diagnostic test for HCC:
    • For each individual, calculate a numerical score:
      • score = -3.95 + 10.7 * HCC1 - 4.14 * P16 + 13.95 * P53 + 28.92 * P90 + 6.48 * survivin
      • (This equation was derived from logistic regression.)
    • If this score is positive (i.e., > 0), diagnose this individual as an HCC patient; if this score is negative (i.e., < 0), diagnose this individual as normal (i.e., non-cancer).
    • Apply this rule to the entire cohort of 250 individuals. Report the sensitivity of this rule, the specificity, the false positive rate, the false negative rate, and the overall accuracy. Do you think the score function provides a good diagnostic test for HCC?
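A minimal Python sketch of the scoring rule and the requested summary measures is given below. The coefficients come from the score equation above; everything else (function names, the dictionary layout, and the six toy scores) is hypothetical. The false positive and false negative rates are computed as 1 - specificity and 1 - sensitivity, which is the usual convention, though your text may define them differently. A score of exactly zero is treated as "not HCC" here, which is an assumption, since the rule above does not cover that case.

```python
def hcc_score(hcc1, p16, p53, p90, survivin):
    """Numerical score from the proposed diagnostic rule."""
    return (-3.95 + 10.7 * hcc1 - 4.14 * p16 + 13.95 * p53
            + 28.92 * p90 + 6.48 * survivin)


def diagnostic_summary(scores, has_hcc):
    """Tally the 2 x 2 outcomes of the rule 'score > 0 means HCC' and summarize them."""
    tp = fp = tn = fn = 0
    for score, diseased in zip(scores, has_hcc):
        predicted_hcc = score > 0  # assumption: a score of exactly 0 counts as normal
        if predicted_hcc and diseased:
            tp += 1
        elif predicted_hcc and not diseased:
            fp += 1
        elif not predicted_hcc and diseased:
            fn += 1
        else:
            tn += 1
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "false_positive_rate": fp / (fp + tn),
        "false_negative_rate": fn / (fn + tp),
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }


# Hypothetical scores for three HCC patients followed by three normal individuals.
toy_scores = [2.1, 0.4, -0.3, -1.2, 0.8, -0.5]
toy_status = [True, True, True, False, False, False]
summary = diagnostic_summary(toy_scores, toy_status)
```

For the assignment, `hcc_score` would be applied to each of the 250 rows of antigen levels before calling `diagnostic_summary`.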

 

  • In the Week Four assignment, we considered a simple two-by-two crossover trial of a new experimental treatment for interstitial cystitis. We calculated t tests for carryover and treatment effects, but we have not yet considered period effects. It is unlikely that there are any period effects in this trial, but we may want to test this formally. If there were a period effect, then patient responses under either treatment would likely be systematically higher in one period than the other. (Here’s an analogy: Think of taking the same test twice. You would likely perform better on the test the second time, since you have learned from your experience of taking the first test.) Explain how you would devise a t test for assessing a period effect in this trial. (Hint: look at the explanation of the t test for treatment effects given in the Week Four assignment. There, we based the test on the random variable X – Y. Suppose we look instead at X + Y?)

 

  • In the Week Five assignment, you investigated measures of brain size and intelligence in a sample of 20 youths. A potential shortcoming of your prior analyses is that you did not take into account all available information in the dataset, in particular, gender. Answer the following questions and explain your answers:
    • Do any of the physiologic variables CCSA, HC, TOTSA, TOTVOL, and WEIGHT differ significantly between males and females?
    • Do IQs differ significantly by gender?
    • Undertake a paired analysis of IQs, in order to assess whether firstborns have higher IQs than non-firstborns. In this regard, there are 10 pairs of related youths, as denoted by the variable PAIR.
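For the paired analysis of IQs, one possible approach is a paired t test on the within-pair differences. The sketch below assumes the data have already been arranged so that each firstborn IQ lines up with the IQ of the related non-firstborn from the same PAIR; the function name and the four toy pairs are hypothetical (the real dataset has 10 pairs).

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(first, second):
    """Paired t statistic and degrees of freedom for matched observations."""
    diffs = [a - b for a, b in zip(first, second)]
    std_err = stdev(diffs) / sqrt(len(diffs))
    return mean(diffs) / std_err, len(diffs) - 1


# Hypothetical IQs, matched by pair: firstborn versus non-firstborn.
firstborn_iq = [105, 110, 98, 120]
later_born_iq = [100, 108, 99, 112]
t_paired, df_paired = paired_t(firstborn_iq, later_born_iq)
```

Because the question asks whether firstborns have higher IQs, a one-sided comparison of the t statistic against the t distribution with n - 1 degrees of freedom would match the stated hypothesis.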

 

Completing the Final Project

The Final Project:

  1. Must include a title page with the following:
    1. Title of paper
    2. Student’s name
    3. Course name and number
    4. Instructor’s name
    5. Date submitted
  2. Must contain 5 sections, each starting on a new page; the section headings can be called Question 1, Question 2, Question 3, Question 4, Question 5.
  3. Each section must have two subsections, with headings Results and Conclusions.
  4. The Results subsections must include your analyses of that particular question. Your results may include figures, tables, and statistical analyses, laid out in a logical order.
  5. The Conclusions subsections must contain your inferences relative to that question based on your results, and any discussion points you wish to make.
  6. Length of the Results subsections may vary by question, but should encompass all of your relevant analyses.
  7. Length of the Conclusions subsections typically will not exceed one
  8. If you have used any external references (e.g., the text), you should include a separate reference page, formatted according to APA style as outlined in the Ashford Writing Center.

Course Map

The course map illustrates the careful design of the course through which each learning outcome is supported by one or more specific learning activities in order to create integrity and pedagogical depth in the learning experience.

 

Learning outcomes, with the week and activity that support each:

1. Apply basic statistical principles for describing, analyzing, and interpreting health data.

  • Week 1: U.S. Mortality Rates - Assignment
  • Week 1: Hospital Patient Data - Discussion
  • Week 2: Sex Ratios - Assignment
  • Week 2: Games of Chance - Discussion
  • Week 3: Immune Responses - Assignment
  • Week 3: Confidence Intervals - Discussion
  • Week 3: Week Three Quiz
  • Week 4: A Crossover Clinical Trial - Assignment
  • Week 4: t-tests and Confidence Intervals for Continuous Data - Discussion
  • Week 5: Brain Size and Intelligence - Assignment
  • Week 5: Graphs with Linear Regression - Discussion
  • Week 6: Final Project - Assignment
  • Week 6: Week Six Quiz
  • Week 6: Health and Nutritional Status - Discussion

2. Apply statistical methods of estimation and hypothesis testing in biostatistics and epidemiology.

  • Week 2: Sex Ratios - Assignment
  • Week 2: Games of Chance - Discussion
  • Week 3: Confidence Intervals - Discussion
  • Week 3: Immune Responses - Assignment
  • Week 3: Week Three Quiz
  • Week 4: A Crossover Clinical Trial - Assignment
  • Week 4: t-tests and Confidence Intervals for Continuous Data - Discussion
  • Week 5: Graphs with Linear Regression - Discussion
  • Week 6: Final Project - Assignment
  • Week 6: Week Six Quiz
  • Week 6: Health and Nutritional Status - Discussion

3. Analyze relationships between quantitative variables using correlation and linear regression.

  • Week 5: Brain Size and Intelligence - Assignment
  • Week 5: Graphs with Linear Regression - Discussion
  • Week 6: Final Project - Assignment
  • Week 6: Week Six Quiz
  • Week 6: Health and Nutritional Status - Discussion

4. Evaluate health care delivery and services using epidemiological data and appropriate statistical methods.

  • Week 1: U.S. Mortality Rates - Assignment
  • Week 2: Sex Ratios - Assignment
  • Week 3: Immune Responses - Assignment
  • Week 4: A Crossover Clinical Trial - Assignment
  • Week 4: t-tests and Confidence Intervals for Continuous Data - Discussion
  • Week 5: Brain Size and Intelligence - Assignment
  • Week 6: Health and Nutritional Status - Discussion
  • Week 6: Final Project - Assignment

5. Communicate the findings and implications from statistical analyses to health care administration.

  • Week 1: U.S. Mortality Rates - Assignment
  • Week 1: Hospital Patient Data - Discussion
  • Week 2: Sex Ratios - Assignment
  • Week 3: Immune Responses - Assignment
  • Week 3: Confidence Intervals - Discussion
  • Week 4: t-tests and Confidence Intervals for Continuous Data - Discussion
  • Week 4: A Crossover Clinical Trial - Assignment
  • Week 5: Brain Size and Intelligence - Assignment
  • Week 6: Final Project - Assignment
  • Week 6: Health and Nutritional Status - Discussion

 

MHA610 Alignment Map

The alignment map illustrates the careful design of the course through the alignment of course learning outcomes with both program learning outcomes and professional standards, to ensure integrity and pedagogical depth in the learning experience.

Course Learning Outcome 1. Apply basic statistical principles for describing, analyzing, and interpreting health data.

  • Program Learning Outcome: PLO8: Apply problem-solving approaches in the resolution of health care services.
  • Professional Standards (ACHE): Knowledge of the Healthcare Environment. The understanding of the healthcare system and the environment in which healthcare managers and providers function.

Course Learning Outcome 2. Apply statistical methods of estimation and hypothesis testing in biostatistics and epidemiology.

  • Program Learning Outcome: PLO4: Utilize health care information technology and statistical reasoning in organizational planning and decision-making.
  • Professional Standards (ACHE): Business Skills and Knowledge. The ability to apply business principles, including systems thinking, to the healthcare environment.

Course Learning Outcome 3. Analyze relationships between quantitative variables using correlation and linear regression.

  • Program Learning Outcome: PLO4: Utilize health care information technology and statistical reasoning in organizational planning and decision-making.
  • Professional Standards (ACHE): Business Skills and Knowledge. The ability to apply business principles, including systems thinking, to the healthcare environment.

Course Learning Outcome 4. Evaluate health care services using epidemiological data and appropriate statistical methods.

  • Program Learning Outcome: PLO4: Utilize health care information technology and statistical reasoning in organizational planning and decision-making.
  • Professional Standards (ACHE): Knowledge of the Healthcare Environment. The understanding of the healthcare system and the environment in which healthcare managers and providers function.

Course Learning Outcome 5. Communicate the findings and implications from statistical analyses to health care administration.

  • Program Learning Outcome: PLO8: Apply problem-solving approaches in the resolution of health care services.
  • Professional Standards (ACHE): Communication and Relationship Management. The ability to communicate clearly and concisely with internal and external customers, establish and maintain relationships, and facilitate constructive interactions with individuals and groups.

 
