From 0b8b9ec1f271bd731ec3bae38b8876a213c906f6 Mon Sep 17 00:00:00 2001 From: Brian Caffo Date: Fri, 23 May 2014 00:12:38 -0400 Subject: [PATCH 01/15] Added hw3 --- 06_StatisticalInference/homework/hw3.Rmd | 207 ++++++++++ 06_StatisticalInference/homework/hw3.html | 480 ++++++++++++++++++++++ 06_StatisticalInference/homework/hw3.md | 211 ++++++++++ 3 files changed, 898 insertions(+) create mode 100644 06_StatisticalInference/homework/hw3.Rmd create mode 100644 06_StatisticalInference/homework/hw3.html create mode 100644 06_StatisticalInference/homework/hw3.md diff --git a/06_StatisticalInference/homework/hw3.Rmd b/06_StatisticalInference/homework/hw3.Rmd new file mode 100644 index 000000000..25fdc736f --- /dev/null +++ b/06_StatisticalInference/homework/hw3.Rmd @@ -0,0 +1,207 @@ +--- +title : Homework 3 for Stat Inference +subtitle : Extra problems for Stat Inference +author : Brian Caffo +job : Johns Hopkins Bloomberg School of Public Health +framework : io2012 +highlighter : highlight.js +hitheme : tomorrow +#url: +# lib: ../../librariesNew #Remove new if using old slidify +# assets: ../../assets +widgets : [mathjax, quiz, bootstrap] +mode : selfcontained # {standalone, draft} +--- +```{r setup, cache = F, echo = F, message = F, warning = F, tidy = F, results='hide'} +# make this an external chunk that can be included in any file +library(knitr) +options(width = 100) +opts_chunk$set(message = F, error = F, warning = F, comment = NA, fig.align = 'center', dpi = 100, tidy = F, cache.path = '.cache/', fig.path = 'fig/') + +options(xtable.type = 'html') +knit_hooks$set(inline = function(x) { + if(is.numeric(x)) { + round(x, getOption('digits')) + } else { + paste(as.character(x), collapse = ', ') + } +}) +knit_hooks$set(plot = knitr:::hook_plot_html) +``` + +## About these slides +- These are some practice problems for Statistical Inference Quiz 3 +- They were created using slidify interactive which you will learn in +Creating Data Products +- Please help improve this 
with pull requests here +(https://github.com/bcaffo/courses) + + + +--- &multitext +Load the data set `mtcars` in the `datasets` R package. Calculate a +95% confidence interval to the nearest MPG. + +1. What is the lower endpoint of the interval? +2. What is the upper endpoint of the interval? + +*** .hint +Do `library(datasets)` and then `data(mtcars)` to get the data. +Consider `t.test` for calculations. You may have to install +the datasets package. + + +*** .explanation +```{r} +library(datasets); data(mtcars) +round(t.test(mtcars$mpg)$conf.int) +``` + +`r round(min(t.test(mtcars$mpg)$conf.int))` +`r round(max(t.test(mtcars$mpg)$conf.int))` + +--- &multitext +Suppose that data of 9 paired differences has a standard error of $1$. What value would the average difference have to be for the lower endpoint of a 95% +Student's t confidence interval to touch zero? + +1. Give the number here to two decimal places + +*** .hint +The t interval is $\bar x \pm t_{.95, 8} s / \sqrt{n}$ + +*** .explanation +`r round(qt(.95, df = 3) * 1 / 3, 2)` + +We want $\bar x = t_{.95} s / \sqrt{n}$ +```{r} +round(qt(.95, df = 3) * 1 / 3, 2) +``` + + +--- &radio +An independent group Student's T interval is used over +a paired T interval when: + +1. The observations are paired between the groups. +2. _The observations within the groups are naturally assumed to be statistically independent_ +3. As long as you do it correctly, either is fine. +4. More details are needed to answer this question + +*** .hint +A paired interval is for paired observations. + +*** .explanation +If the groups are independent, the independent group interval is the correct interval. + + +--- &multitext +Consider the `mtcars` dataset. Construct a 95% T interval for MPG comparing +4 to 6 cylinder cars (subtracting in the order of 4 - 6), +assuming a constant variance. + +1. What is the lower endpoint of the interval to 1 decimal place? +2. What is the upper endpoint of the interval to 1 decimal place? 
+ +*** .hint +Use `t.test` with `var.equal=TRUE` + +*** .explanation + +```{r} +m4 <- mtcars$mpg[mtcars$cyl == 4] +m6 <- mtcars$mpg[mtcars$cyl == 6] +#this does 4 - 6 +confint <- as.vector(t.test(m4, m6, var.equal = TRUE)$conf.int) +``` + +`r round(min(confint), 1)` +`r round(max(confint), 1)` + + +--- &radio +If someone put a gun to your head and said "Your confidence interval +must contain what it's estimating or I'll pull the trigger", what would +be the smart thing to do? + +1. _Make your interval as wide as possible_ +2. Make your interval as small as possible +3. Call the authorities + +*** .hint +C'mon. You don't need a hint + +*** .explanation +This is just an example of what happens to confidence intervals as you +increase the confidence level. You want to be quite sure in your interval (i.e. +have a large confidence level) and so you would increase the interval's width. + +--- &radio + +Refer back to comparing MPG for 4 versus 6 cylinders. What do you conclude? + +1. The interval is above zero, suggesting 6 is better than 4 in terms of MPG +2. _The interval is above zero, suggesting 4 is better than 6 in terms of MPG_ +3. The interval does not tell you anything about the hypothesis test; you have to do the test. +4. The interval contains 0 suggesting no difference. + +*** .hint +Refer back to the problem, consider the implications of the interval being +larger than 0, double check the order in which things were subtracted and +make sure the results make sense in the context of the problem. + +*** .explanation +The interval was constructed by subtracting in the order 4 - 6 and was entirely above zero. + +--- &multitext +Suppose that 18 obese subjects were randomized, 9 each, to a new diet pill and a placebo. Subjects' body mass indices (BMIs) were measured at a baseline and again after having received the treatment or placebo for four weeks. 
The average difference from follow-up to the baseline (followup - baseline) was ???3 kg/m2 for the treated group and 1 kg/m2 for the placebo group. The corresponding standard deviations of the differences was 1.5 kg/m2 for the treatment group and 1.8 kg/m2 for the placebo group. Does the change in BMI over the four week period appear to differ between the treated and placebo groups? + +1. Calculate the pooled variance estimate to 2 decimal places + + +*** .hint +The sample sizes are equal, so the pooled variance is the average of the +individual variances + + +*** .explanation +`r round(min(confint), 1)` +```{r} +n1 <- n2 <- 9 +x1 <- -3 ##treated +x2 <- 1 ##placebo +s1 <- 1.5 ##treated +s2 <- 1.8 ##placebo +spsq <- ( (n1 - 1) * s1^2 + (n2 - 1) * s2^2) / (n1 + n2 - 2) +``` +`r round(spsq, 2)` + + +--- &radio + +For Binomial data the maximum likelihood estimate for the probability of +a success is + +1. _The proportion of successes_ +2. The proportion of failures +3. A shrunken version of the proportion of successes +4. A shrunken version of the proportion of failures + +*** .hint +Look back at the notes about likelihood. + +*** .explanation +The MLE for binomial data is always the proportion of successes. + +--- &radio + +Bayesian inference requires + +1. A type I error rate +2. Setting your confidence level +3. _Assigning a prior probability distribution_ +4. Evaluating frequency error rates + +*** .explanation +All of the other answers discuss frequentist concepts. All Bayesian analyses requiring setting a prior. + + diff --git a/06_StatisticalInference/homework/hw3.html b/06_StatisticalInference/homework/hw3.html new file mode 100644 index 000000000..99aca5837 --- /dev/null +++ b/06_StatisticalInference/homework/hw3.html @@ -0,0 +1,480 @@ + + + + Homework 3 for Stat Inference + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+

Homework 3 for Stat Inference

+

Extra problems for Stat Inference

+

Brian Caffo
Johns Hopkins Bloomberg School of Public Health

+
+
+
+ + + + +
+

About these slides

+
+
+
    +
  • These are some practice problems for Statistical Inference Quiz 3
  • +
  • They were created using slidify interactive which you will learn in +Creating Data Products
  • +
  • Please help improve this with pull requests here +(https://github.com/bcaffo/courses)
  • +
+ +
+ +
+ + +
+ +
+

Load the data set mtcars in the datasets R package. Calculate a +95% confidence interval to the nearest MPG.

+ +
    +
  1. What is the lower endpoint of the interval?
  2. +
  3. What is the upper endpoint of the interval?
  4. +
+ + + + + + +
+

Do library(datasets) and then data(mtcars) to get the data. +Consider t.test for calculations. You may have to install +the datasets package.

+ +
+
+
library(datasets); data(mtcars)
+round(t.test(mtcars$mpg)$conf.int)
+
+ +
[1] 18 22
+attr(,"conf.level")
+[1] 0.95
+
+ +

18 +22
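
As a cross-check on the `t.test` call above, the same interval can be rebuilt from the usual formula (sample mean plus or minus a t quantile times the standard error). This is only a minimal sketch; the names `n`, `se` and `manual_ci` are illustrative and do not appear in the original homework.

```r
library(datasets); data(mtcars)

# Two-sided 95% t interval for mean MPG, built by hand
n  <- length(mtcars$mpg)
se <- sd(mtcars$mpg) / sqrt(n)          # standard error of the mean
manual_ci <- mean(mtcars$mpg) + c(-1, 1) * qt(.975, df = n - 1) * se
round(manual_ci)
```

The two-sided 95% interval uses the `.975` quantile with `n - 1` degrees of freedom, which is what `t.test` does internally by default.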

+ +
+
+
+ +
+ + +
+ +
+

Suppose that data of 9 paired differences has a standard error of \(1\). What value would the average difference have to be for the lower endpoint of a 95% +Student's t confidence interval to touch zero?

+ +
    +
  1. Give the number here to two decimal places
  2. +
+ + + + + + +
+

The t interval is \(\bar x \pm t_{.95, 8} s / \sqrt{n}\)

+ +
+
+

0.78

+ +

We want \(\bar x = t_{.95} s / \sqrt{n}\)

+ +
round(qt(.95, df = 3) * 1 / 3, 2)
+
+ +
[1] 0.78
+
+ +
+
+
+ +
+ + +
+ +
+

An independent group Student's T interval is used over +a paired T interval when:

+ +
    +
  1. The observations are paired between the groups.
  2. +
  3. The observations within the groups are naturally assumed to be statistically independent
  4. +
  5. As long as you do it correctly, either is fine.
  6. +
  7. More details are needed to answer this question
  8. +
+ + + + + + +
+

A paired interval is for paired observations.

+ +
+
+

If the groups are independent, the independent group interval is the correct interval.

+ +
+
+
+ +
+ + +
+ +
+

Consider the mtcars dataset. Construct a 95% T interval for MPG comparing +4 to 6 cylinder cars (subtracting in the order of 4 - 6), +assuming a constant variance.

+ +
    +
  1. What is the lower endpoint of the interval to 1 decimal place?
  2. +
  3. What is the upper endpoint of the interval to 1 decimal place?
  4. +
+ + + + + + +
+

Use t.test with var.equal=TRUE

+ +
+
+
m4 <- mtcars$mpg[mtcars$cyl == 4]
+m6 <- mtcars$mpg[mtcars$cyl == 6]
+#this does 4 - 6
+confint <- as.vector(t.test(m4, m6, var.equal = TRUE)$conf.int)
+
+ +

3.2 +10.7
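
The `var.equal = TRUE` interval above can also be reproduced by hand from the pooled variance, which may make the "constant variance" assumption more concrete. A sketch under that assumption; `sp2`, `se_diff` and `manual_int` are names chosen here for illustration.

```r
library(datasets); data(mtcars)
m4 <- mtcars$mpg[mtcars$cyl == 4]
m6 <- mtcars$mpg[mtcars$cyl == 6]
n4 <- length(m4); n6 <- length(m6)

# Pooled variance: degrees-of-freedom weighted average of the two sample variances
sp2 <- ((n4 - 1) * var(m4) + (n6 - 1) * var(m6)) / (n4 + n6 - 2)
se_diff <- sqrt(sp2 * (1 / n4 + 1 / n6))

# 95% interval for mean(4 cyl) - mean(6 cyl)
manual_int <- mean(m4) - mean(m6) + c(-1, 1) * qt(.975, df = n4 + n6 - 2) * se_diff
round(manual_int, 1)
```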

+ +
+
+
+ +
+ + +
+ +
+

If someone put a gun to your head and said "Your confidence interval +must contain what it's estimating or I'll pull the trigger", what would +be the smart thing to do?

+ +
    +
  1. Make your interval as wide as possible
  2. +
  3. Make your interval as small as possible
  4. +
  5. Call the authorities
  6. +
+ + + + + + +
+

C'mon. You don't need a hint

+ +
+
+

This is just an example of what happens to confidence intervals as you +increase the confidence level. You want to be quite sure in your interval (i.e. +have a large confidence level) and so you would increase the interval's width.
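
The relationship between confidence level and width can be seen directly on the `mtcars` MPG data used earlier. This is an illustrative sketch only; the particular confidence levels and the names `conf_levels` and `widths` are choices made here, not part of the homework.

```r
library(datasets); data(mtcars)

# Width of the t interval for mean MPG at several confidence levels
conf_levels <- c(.80, .90, .95, .99)
widths <- sapply(conf_levels, function(cl) {
  diff(t.test(mtcars$mpg, conf.level = cl)$conf.int)
})
names(widths) <- conf_levels
round(widths, 2)
```

Each step up in confidence level yields a strictly wider interval for the same data.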

+ +
+
+
+ +
+ + +
+ +
+

Refer back to comparing MPG for 4 versus 6 cylinders. What do you conclude?

+ +
    +
  1. The interval is above zero, suggesting 6 is better than 4 in terms of MPG
  2. +
  3. The interval is above zero, suggesting 4 is better than 6 in terms of MPG
  4. +
  5. The interval does not tell you anything about the hypothesis test; you have to do the test.
  6. +
  7. The interval contains 0 suggesting no difference.
  8. +
+ + + + + + +
+

Refer back to the problem, consider the implications of the interval being +larger than 0, double check the order in which things were subtracted and +make sure the results make sense in the context of the problem.

+ +
+
+

The interval was constructed by subtracting in the order 4 - 6 and was entirely above zero.

+ +
+
+
+ +
+ + +
+ +
+

Suppose that 18 obese subjects were randomized, 9 each, to a new diet pill and a placebo. Subjects' body mass indices (BMIs) were measured at a baseline and again after having received the treatment or placebo for four weeks. The average difference from follow-up to the baseline (followup - baseline) was -3 kg/m2 for the treated group and 1 kg/m2 for the placebo group. The corresponding standard deviations of the differences were 1.5 kg/m2 for the treatment group and 1.8 kg/m2 for the placebo group. Does the change in BMI over the four week period appear to differ between the treated and placebo groups?

+ +
    +
  1. Calculate the pooled variance estimate to 2 decimal places
  2. +
+ + + + + + +
+

The sample sizes are equal, so the pooled variance is the average of the +individual variances

+ +
+
+

3.2

+ +
n1 <- n2 <- 9
+x1 <- -3  ##treated
+x2 <- 1  ##placebo
+s1 <- 1.5  ##treated
+s2 <- 1.8  ##placebo
+spsq <- ( (n1 - 1) * s1^2 + (n2 - 1) * s2^2) / (n1 + n2 - 2)
+
+ +

2.75
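
The hint's shortcut (equal group sizes make the pooled variance a plain average) can be checked numerically, and the pooled variance can then be carried through to an interval that addresses the question in the problem statement. A hedged sketch: it assumes equal variances and a two-sided 95% level, and `se_diff` and `interval` are illustrative names.

```r
n1 <- n2 <- 9
x1 <- -3;  x2 <- 1       # mean changes: treated, placebo
s1 <- 1.5; s2 <- 1.8     # SDs of the changes: treated, placebo

spsq <- ((n1 - 1) * s1^2 + (n2 - 1) * s2^2) / (n1 + n2 - 2)

# With equal group sizes the pooled variance is the plain average of the variances
stopifnot(isTRUE(all.equal(spsq, (s1^2 + s2^2) / 2)))

# 95% interval for treated - placebo, assuming equal variances
se_diff  <- sqrt(spsq * (1 / n1 + 1 / n2))
interval <- x1 - x2 + c(-1, 1) * qt(.975, df = n1 + n2 - 2) * se_diff
round(interval, 2)
```

An interval lying entirely below zero would suggest a larger BMI decrease in the treated group than in the placebo group.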

+ +
+
+
+ +
+ + +
+ +
+

For Binomial data the maximum likelihood estimate for the probability of +a success is

+ +
    +
  1. The proportion of successes
  2. +
  3. The proportion of failures
  4. +
  5. A shrunken version of the proportion of successes
  6. +
  7. A shrunken version of the proportion of failures
  8. +
+ + + + + + +
+

Look back at the notes about likelihood.

+ +
+
+

The MLE for binomial data is always the proportion of successes.
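
A numerical check can make this concrete: maximizing the binomial log-likelihood over the success probability should land on the sample proportion. The data below (7 successes in 20 trials) are hypothetical and chosen only for illustration, as are the names `loglik` and `p_hat`.

```r
# Hypothetical data: 7 successes in 20 trials
x <- 7; n <- 20

# Binomial log-likelihood as a function of the success probability p
loglik <- function(p) dbinom(x, size = n, prob = p, log = TRUE)

# Maximize numerically over (0, 1)
p_hat <- optimize(loglik, interval = c(0, 1), maximum = TRUE, tol = 1e-8)$maximum
c(numerical = p_hat, proportion = x / n)
```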

+ +
+
+
+ +
+ + +
+ +
+

Bayesian inference requires

+ +
    +
  1. A type I error rate
  2. +
  3. Setting your confidence level
  4. +
  5. Assigning a prior probability distribution
  6. +
  7. Evaluating frequency error rates
  8. +
+ + + + + + +
+

All of the other answers discuss frequentist concepts. All Bayesian analyses require setting a prior.
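
To illustrate what assigning a prior looks like in practice, here is a minimal sketch using the conjugate Beta prior for binomial data. Everything in it is an assumption made for illustration: the data (7 successes in 20 trials), the uniform Beta(1, 1) prior, and the variable names.

```r
# Hypothetical data: 7 successes in 20 trials
x <- 7; n <- 20

# Beta(a, b) prior on the success probability (a = b = 1 is the uniform prior)
a <- 1; b <- 1

# Conjugate update: the posterior is Beta(a + x, b + n - x)
post_a <- a + x
post_b <- b + n - x
post_mean <- post_a / (post_a + post_b)
cred_int  <- qbeta(c(.025, .975), post_a, post_b)   # 95% equal-tail credible interval
round(c(mean = post_mean, lower = cred_int[1], upper = cred_int[2]), 3)
```

Changing `a` and `b` changes the posterior, which is the sense in which the analysis cannot proceed without a prior.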

+ +
+
+
+ +
+ + +
+ + + + + + + + + + + + + + + + + + + + + \ No newline at end of file diff --git a/06_StatisticalInference/homework/hw3.md b/06_StatisticalInference/homework/hw3.md new file mode 100644 index 000000000..42919061a --- /dev/null +++ b/06_StatisticalInference/homework/hw3.md @@ -0,0 +1,211 @@ +--- +title : Homework 3 for Stat Inference +subtitle : Extra problems for Stat Inference +author : Brian Caffo +job : Johns Hopkins Bloomberg School of Public Health +framework : io2012 +highlighter : highlight.js +hitheme : tomorrow +#url: +# lib: ../../librariesNew #Remove new if using old slidify +# assets: ../../assets +widgets : [mathjax, quiz, bootstrap] +mode : selfcontained # {standalone, draft} +--- + + + +## About these slides +- These are some practice problems for Statistical Inference Quiz 3 +- They were created using slidify interactive which you will learn in +Creating Data Products +- Please help improve this with pull requests here +(https://github.com/bcaffo/courses) + + + +--- &multitext +Load the data set `mtcars` in the `datasets` R package. Calculate a +95% confidence interval to the nearest MPG. + +1. What is the lower endpoint of the interval? +2. What is the upper endpoint of the interval? + +*** .hint +Do `library(datasets)` and then `data(mtcars)` to get the data. +Consider `t.test` for calculations. You may have to install +the datasets package. + + +*** .explanation + +```r +library(datasets); data(mtcars) +round(t.test(mtcars$mpg)$conf.int) +``` + +``` +[1] 18 22 +attr(,"conf.level") +[1] 0.95 +``` + + +18 +22 + +--- &multitext +Suppose that data of 9 paired differences has a standard error of $1$, what value would the average difference have to be to have the lower endpoint of a 95% +students t confidence interval touch zero? + +1. 
Give the number here to two decimal places + +*** .hint +The t interval is $\bar x t_{.95, 8}\pm s /sqrt{n}$ + +*** .explanation +0.78 + +We want $\bar x = t_{.95} s / sqrt{n}$ + +```r +round(qt(.95, df = 3) * 1 / 3, 2) +``` + +``` +[1] 0.78 +``` + + + +--- &radio +An independent group Student's T interval is used over +a paired T interval when: + +1. The observations are paired between the groups. +2. _The observations within the groups are natually assumed to be statistically independent_ +3. As long as you do it correctly, either is fine. +4. More details are needed to answer this question + +*** .hint +A paired interval is for paired observations. + +*** .explanation +If the groups are independent is the correct interval. + + +--- &multitext +Consider the `mtcars` dataset. Construct a 95% T interval for MPG comparing +4 to 6 cylinder cars (subtracting in the order of 4 - 6) +assume a constant variance. + +1. What is the lower endpoint of the interval to 1 decimal place? +2. What is the upper endpoint of the interval to 1 decimal place? + +*** .hint +Use `t.test` with `var.equal=TRUE` + +*** .explanation + + +```r +m4 <- mtcars$mpg[mtcars$cyl == 4] +m6 <- mtcars$mpg[mtcars$cyl == 6] +#this does 4 - 6 +confint <- as.vector(t.test(m4, m6, var.equal = TRUE)$conf.int) +``` + + +3.2 +10.7 + + +--- &radio +If someone put a gun to your head and said "Your confidence interval +must contain what it's estimating or I'll pull the trigger", what would +be the smart thing to do? + +1. _Make your interval as wide as possible_ +2. Make your interval as small as possible +3. Call the authorities + +*** .hint +C'mon. You don't need a hint + +*** .explanation +This is just an example of what happens to confidence intervals as you +increas the confidence level. You want to be quite sure in your interval (i.e. +have a large confidence level) and so you would increase the interval's width + +--- &radio + +Refer back to comparing MPG for 4 versus 6 cylinders. What do you conclude? 
+ +1. The interval is above zero, suggesting 6 is better than 4 in the terms of MPG +2. _The interval is above zero, suggesting 4 is better than 6 in the terms of MPG_ +3. The interval does not tell you anything about the hypothesis test; you have to do the test. +4. The interval contains 0 suggesting no difference. + +*** .hint +Refer back to the problem, consider the implications of the interval being +larger than 0, double check the order in which things were subtracted and +make sure the results make sense in the context of the problem. + +*** .explanation +The interval was conducted subtracting 4 - 6 and was entirely above zero. + +--- &multitext +Suppose that 18 obese subjects were randomized, 9 each, to a new diet pill and a placebo. Subjects' body mass indices (BMIs) were measured at a baseline and again after having received the treatment or placebo for four weeks. The average difference from follow-up to the baseline (followup - baseline) was ???3 kg/m2 for the treated group and 1 kg/m2 for the placebo group. The corresponding standard deviations of the differences was 1.5 kg/m2 for the treatment group and 1.8 kg/m2 for the placebo group. Does the change in BMI over the four week period appear to differ between the treated and placebo groups? + +1. Calculate the pooled variance estimate to 2 decimal places + + +*** .hint +The sample sizes are equal, so the pooled variance is the average of the +individual variances + + +*** .explanation +3.2 + +```r +n1 <- n2 <- 9 +x1 <- -3 ##treated +x2 <- 1 ##placebo +s1 <- 1.5 ##treated +s2 <- 1.8 ##placebo +spsq <- ( (n1 - 1) * s1^2 + (n2 - 1) * s2^2) / (n1 + n2 - 2) +``` + +2.75 + + +--- &radio + +For Binomial data the maximum likelihood estimate for the probability of +a success is + +1. _The proportion of successes_ +2. The proportion of failures +3. A shrunken version of the proportion of successes +4. A shrunken version of the proportion of failures + +*** .hint +Look back at the notes about likelihood. 
+ +*** .explanation +The MLE for binomial data is always the proportion of successes. + +--- &radio + +Bayesian inference requires + +1. A type I error rate +2. Setting your confidence level +3. _Assigning a prior probability distribution_ +4. Evaluating frequency error rates + +*** .explanation +All of the other answers discuss frequentist concepts. All Bayesian analyses requiring setting a prior. + + From d7327b991ddc24cc4d0153164e270a7b1f6a45ff Mon Sep 17 00:00:00 2001 From: Brian Caffo Date: Fri, 23 May 2014 00:22:29 -0400 Subject: [PATCH 02/15] Added a fourth hw --- 06_StatisticalInference/homework/hw4.Rmd | 37 +++++++ 06_StatisticalInference/homework/hw4.html | 112 ++++++++++++++++++++++ 06_StatisticalInference/homework/hw4.md | 23 +++++ 3 files changed, 172 insertions(+) create mode 100644 06_StatisticalInference/homework/hw4.Rmd create mode 100644 06_StatisticalInference/homework/hw4.html create mode 100644 06_StatisticalInference/homework/hw4.md diff --git a/06_StatisticalInference/homework/hw4.Rmd b/06_StatisticalInference/homework/hw4.Rmd new file mode 100644 index 000000000..bf5a8da3b --- /dev/null +++ b/06_StatisticalInference/homework/hw4.Rmd @@ -0,0 +1,37 @@ +--- +title : Homework 4 for Stat Inference +subtitle : Extra problems for Stat Inference +author : Brian Caffo +job : Johns Hopkins Bloomberg School of Public Health +framework : io2012 +highlighter : highlight.js +hitheme : tomorrow +#url: +# lib: ../../librariesNew #Remove new if using old slidify +# assets: ../../assets +widgets : [mathjax, quiz, bootstrap] +mode : selfcontained # {standalone, draft} +--- +```{r setup, cache = F, echo = F, message = F, warning = F, tidy = F, results='hide'} +# make this an external chunk that can be included in any file +library(knitr) +options(width = 100) +opts_chunk$set(message = F, error = F, warning = F, comment = NA, fig.align = 'center', dpi = 100, tidy = F, cache.path = '.cache/', fig.path = 'fig/') + +options(xtable.type = 'html') 
+knit_hooks$set(inline = function(x) { + if(is.numeric(x)) { + round(x, getOption('digits')) + } else { + paste(as.character(x), collapse = ', ') + } +}) +knit_hooks$set(plot = knitr:::hook_plot_html) +``` + +## About these slides +- These are some practice problems for Statistical Inference Quiz 4 +- They were created using slidify interactive which you will learn in +Creating Data Products +- Please help improve this with pull requests here +(https://github.com/bcaffo/courses) diff --git a/06_StatisticalInference/homework/hw4.html b/06_StatisticalInference/homework/hw4.html new file mode 100644 index 000000000..565621a6d --- /dev/null +++ b/06_StatisticalInference/homework/hw4.html @@ -0,0 +1,112 @@ + + + + Homework 4 for Stat Inference + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+

Homework 4 for Stat Inference

+

Extra problems for Stat Inference

+

Brian Caffo
Johns Hopkins Bloomberg School of Public Health

+
+
+
+ + + + +
+

About these slides

+
+
+
    +
  • These are some practice problems for Statistical Inference Quiz 4
  • +
  • They were created using slidify interactive which you will learn in +Creating Data Products
  • +
  • Please help improve this with pull requests here +(https://github.com/bcaffo/courses)
  • +
+ +
+ +
+ + +
+ + + + + + + + + + + + + + + + + + + + + \ No newline at end of file diff --git a/06_StatisticalInference/homework/hw4.md b/06_StatisticalInference/homework/hw4.md new file mode 100644 index 000000000..a22e64543 --- /dev/null +++ b/06_StatisticalInference/homework/hw4.md @@ -0,0 +1,23 @@ +--- +title : Homework 4 for Stat Inference +subtitle : Extra problems for Stat Inference +author : Brian Caffo +job : Johns Hopkins Bloomberg School of Public Health +framework : io2012 +highlighter : highlight.js +hitheme : tomorrow +#url: +# lib: ../../librariesNew #Remove new if using old slidify +# assets: ../../assets +widgets : [mathjax, quiz, bootstrap] +mode : selfcontained # {standalone, draft} +--- + + + +## About these slides +- These are some practice problems for Statistical Inference Quiz 4 +- They were created using slidify interactive which you will learn in +Creating Data Products +- Please help improve this with pull requests here +(https://github.com/bcaffo/courses) From 482ea7973f07e9023398aa290f96d1a83eb499cd Mon Sep 17 00:00:00 2001 From: Troy Date: Sat, 24 May 2014 14:04:32 -0400 Subject: [PATCH 03/15] fix homework 1 question 1, which basically was missing a negative sign --- 06_StatisticalInference/homework/hw1.Rmd | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/06_StatisticalInference/homework/hw1.Rmd b/06_StatisticalInference/homework/hw1.Rmd index 7f31c63a0..7d92d4486 100644 --- a/06_StatisticalInference/homework/hw1.Rmd +++ b/06_StatisticalInference/homework/hw1.Rmd @@ -40,9 +40,9 @@ Creating Data Products --- &radio -Consider influenza epidemics for two parent heterosexual families. Suppose that the probability is 15% that at least one of the parents has contracted the disease. The probability that the father has contracted influenza is 6% while that the mother contracted the disease is 5%. What is the probability that both contracted influenza expressed as a whole number percentage? 
+Consider influenza epidemics for two parent heterosexual families. Suppose that the probability is 9% that at least one of the parents has contracted the disease. The probability that the father has contracted influenza is 6% while that the mother contracted the disease is 5%. What is the probability that both contracted influenza expressed as a whole number percentage? -1. 15% +1. 1% 2. 6% 3. 5% 4. _2%_ @@ -53,9 +53,9 @@ $P(A\cup B) = .15$, *** .explanation $P(A\cup B) = P(A) + P(B) - 2 P(AB)$ thus -$$.15 = .06 + .05 - 2 P(AB)$$ +$$.09 = .06 + .05 - 2 P(AB)$$ ```{r} -(0.15 - .06 - .05) / 2 +(0.09 - .06 - .05) / (-2) ``` --- &radio From 15080cd9481577c01ed6af1b3afc3829a732f5ed Mon Sep 17 00:00:00 2001 From: Troy Date: Sat, 24 May 2014 14:36:54 -0400 Subject: [PATCH 04/15] fixed homework 1 question 1 - another problem --- 06_StatisticalInference/homework/hw1.Rmd | 18 +++++++++--------- 1 file changed, 9 insertions(+), 9 deletions(-) diff --git a/06_StatisticalInference/homework/hw1.Rmd b/06_StatisticalInference/homework/hw1.Rmd index 7d92d4486..f5476f5c7 100644 --- a/06_StatisticalInference/homework/hw1.Rmd +++ b/06_StatisticalInference/homework/hw1.Rmd @@ -40,22 +40,22 @@ Creating Data Products --- &radio -Consider influenza epidemics for two parent heterosexual families. Suppose that the probability is 9% that at least one of the parents has contracted the disease. The probability that the father has contracted influenza is 6% while that the mother contracted the disease is 5%. What is the probability that both contracted influenza expressed as a whole number percentage? +Consider influenza epidemics for two parent heterosexual families. Suppose that the probability is 15% that at least one of the parents has contracted the disease. The probability that the father has contracted influenza is 10% while that the mother contracted the disease is 9%. What is the probability that both contracted influenza expressed as a whole number percentage? -1. 1% -2. 6% -3. 
5% -4. _2%_ +1. 15% +2. 10% +3. 9% +4. _4%_ *** .hint -$A = Father$, $P(A) = .06$, $B = Mother$, $P(B) = .05$ +$A = Father$, $P(A) = .10$, $B = Mother$, $P(B) = .09$ $P(A\cup B) = .15$, *** .explanation -$P(A\cup B) = P(A) + P(B) - 2 P(AB)$ thus -$$.09 = .06 + .05 - 2 P(AB)$$ +$P(A\cup B) = P(A) + P(B) - P(AB)$ thus +$$.15 = .10 + .09 - P(AB)$$ ```{r} -(0.09 - .06 - .05) / (-2) +.10 + .09 - .15 ``` --- &radio From e2ce444fe372afb1730e7ece181a2bf6e01eed89 Mon Sep 17 00:00:00 2001 From: Troy Date: Sat, 24 May 2014 16:52:02 -0400 Subject: [PATCH 05/15] fix homework 2 slide 8 typo. lower.tail=TRUE changed to FALSE --- 06_StatisticalInference/homework/hw2.Rmd | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/06_StatisticalInference/homework/hw2.Rmd b/06_StatisticalInference/homework/hw2.Rmd index 3a568425c..c84934812 100644 --- a/06_StatisticalInference/homework/hw2.Rmd +++ b/06_StatisticalInference/homework/hw2.Rmd @@ -157,10 +157,10 @@ Let $p=.5$ and $X$ be binomial *** .explanation -`r round(pbinom(4, prob = .5, size = 6, lower.tail = TRUE) * 100, 1)` +`r round(pbinom(4, prob = .5, size = 6, lower.tail = FALSE) * 100, 1)` ```{r} -round(pbinom(4, prob = .5, size = 6, lower.tail = TRUE) * 100, 1) +round(pbinom(4, prob = .5, size = 6, lower.tail = FALSE) * 100, 1) ``` --- &multitext From 385e9a1f63928f0782c257ff2d08ec9b5a847f87 Mon Sep 17 00:00:00 2001 From: Troy Date: Sat, 24 May 2014 18:07:20 -0400 Subject: [PATCH 06/15] fix homework 2 slide 12 question - typo with parens and s/100/10 --- 06_StatisticalInference/homework/hw2.Rmd | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/06_StatisticalInference/homework/hw2.Rmd b/06_StatisticalInference/homework/hw2.Rmd index c84934812..b38eab299 100644 --- a/06_StatisticalInference/homework/hw2.Rmd +++ b/06_StatisticalInference/homework/hw2.Rmd @@ -210,9 +210,9 @@ If you roll ten standard dice, take their average, then repeat this process over $$Var(\bar X) = \sigma^2 /n$$ *** 
.explanation -The answer will be `r round( mean(1 : 6 - 3.5) ^2 / 100, 3)` -since the variance of the sampling distribution of the mean is $\sigma^2/12$ -and the variance of a die roll is +The answer will be `r round( mean( (1 : 6 - 3.5) ^2) / 10, 3)` +since the variance of the sampling distribution of the mean is $\sigma^2/10$ +where $\sigma^2$ is the variance of a single die roll, which is ```{r} mean((1 : 6 - 3.5)^2) From fb57d59f5c514a920b091a93c62e364f6bbc90cb Mon Sep 17 00:00:00 2001 From: Brian Caffo Date: Sat, 24 May 2014 23:38:13 -0400 Subject: [PATCH 07/15] test --- .../rStudioPresent/index.Rpres | 264 +++++++++--------- .../rStudioPresent/index.md | 264 +++++++++--------- 2 files changed, 264 insertions(+), 264 deletions(-) diff --git a/09_DevelopingDataProducts/rStudioPresent/index.Rpres b/09_DevelopingDataProducts/rStudioPresent/index.Rpres index a237721f7..00c9487c7 100644 --- a/09_DevelopingDataProducts/rStudioPresent/index.Rpres +++ b/09_DevelopingDataProducts/rStudioPresent/index.Rpres @@ -1,132 +1,132 @@ -RStudio Presenter -=== -author: Brian Caffo, Jeff Leek Roger Peng -date: `r format(Sys.Date(), format="%B %d %Y")` -transition: rotate - - -Department of Biostatistics -Bloomberg School of Public Health -Johns Hopkins University -Coursera Data Science Specialization - - - -RStudio Presentation -=== -- RStudio created a presentation authoring tool within their -development environment. 
-- If you are familiar with slidify, you will also be familiar with this tool - - Code is authored in a generalized markdown format that allows for code chunks - - The output is an html5 presentation - - The file index for the presenter file is .Rpres, which gets converted to an .md file and then to an html file if desired - - There's a preview tool in RStudio and GUIs for publishing to Rpubs or viewing/creating an html file - -Authoring content -=== -- This is a fairly complete guide - - http://www.rstudio.com/ide/docs/presentations/overview -- Quick start is - - `file` then `New File` then `R Presentation` - - (`alt-f` then `f` then `p` if you want key strokes) - - Use basically the same R markdown format for authoring as slidify/knitr - - Single quotes for inline code - - Tripple qutoes for block code - - Same options for code evaluation, caching, hiding etcetera - -Compiling and tools -=== -- R Studio auto formats and runs the code when you save the document -- Mathjax JS library is loaded by default so that `$x^2$` yields $x^2$ -- Slide navigation button on the preview; clicking on the notepad icon takes you to that slide in the deck -- Clicking on `more` yields options for - - Clearning the knitr cache - - Viewing in a browser (creates a temporay html file in `AppData/local/temp` for me) - - Create a html file to save where you want) -- A refresh button -- A zoom button that brings up a full window - -Visuals -=== -transition: linear - -- R Studio has made it easy to get some cool html5 effects, like cube transitions -with simple options in YAML-like code after the first slide such as -`transition: rotate` -- You can specify it in a slide-by-slide basis - -Here's the option "linear" -=== -transition: linear - -- Just put `transition: linear` right after the slide creation (three equal signs or more in a row) -- Tansition options - - http://www.rstudio.com/ide/docs/presentations/slide_transitions_and_navigation - -Hierarchical organization -=== -type: section 
-- If you want a hierarchical organization structure, just add a `type: typename` option after the slide -- This changes the default appearance - - http://www.rstudio.com/ide/docs/presentations/slide_transitions_and_navigation -- This is of type `section` - -Here's a subsection -=== -type: subsection - -Two columns -=== -- Do whatever for column one -- Then put `***` on a line by itself with blank lines before and after - -*** - -- Then do whatever for column two - - -Changing the slide font -========================================================== -font-import: http://fonts.googleapis.com/css?family=Risque -font-family: 'Risque' - -- Add a `font-family: fontname` option after the slide - - http://www.rstudio.com/ide/docs/presentations/customizing_fonts_and_appearance -- Specified in the same way as css font families - - http://www.w3schools.com/cssref/css_websafe_fonts.asp -- Use `font-import: url` to import fonts -- Important caveats - - Fonts must be present on the system that you're presenting on, or it will go to a fallback font - - You have to be connected to the internet to use an imported font (so don't rely on this for offline presentations) -- This is the `Risque` - - http://fonts.googleapis.com/css?family=Risque - -Really changing things -=== -- If you know html5 and CSS well, then you can basically change whatever you want -- A css file with the same names as your presentation will be autoimported -- You can use `css: file.css` to import a css file -- You have to create named classes and then use `class: classname` to get slide-specific style control from your css - - (Or you can apply then within a ``) -- Ultimately, you have an html file, that you can edit as you wish - - This should be viewed as a last resort, as the whole point is to have reproducible presentations, but may be the easiest way to get the exact style control you want for a final product - -Slidify versus R Studio Presenter -=== -**Slidify** -- Flexible control from the R MD file -- 
Under rapid ongoing development -- Large user base -- Lots and lots of styles and options -- Steeper learning curve -- More command-line oriented - -*** -**R Studio Presenter** -- Embedded in R Studio -- More GUI oriented -- Very easy to get started -- Smaller set of easy styles and options -- Default styles look very nice -- Ultimately as flexible as slidify with a little CSS and HTML knowledge - +RStudio Presenter +=== +author: Brian Caffo, Jeff Leek Roger Peng +date: `r format(Sys.Date(), format="%B %d %Y")` +transition: rotate + + +Department of Biostatistics +Bloomberg School of Public Health +Johns Hopkins University +Coursera Data Science Specialization + + + +RStudio Presentation +=== +- RStudio created a presentation authoring tool within their +development environment. +- If you are familiar with slidify, you will also be familiar with this tool + - Code is authored in a generalized markdown format that allows for code chunks + - The output is an html5 presentation + - The file index for the presenter file is .Rpres, which gets converted to an .md file and then to an html file if desired + - There's a preview tool in RStudio and GUIs for publishing to Rpubs or viewing/creating an html file + +Authoring content +=== +- This is a fairly complete guide + - http://www.rstudio.com/ide/docs/presentations/overview +- Quick start is + - `file` then `New File` then `R Presentation` + - (`alt-f` then `f` then `p` if you want key strokes) + - Use basically the same R markdown format for authoring as slidify/knitr + - Single quotes for inline code + - Tripple qutoes for block code + - Same options for code evaluation, caching, hiding etcetera + +Compiling and tools +=== +- R Studio auto formats and runs the code when you save the document +- Mathjax JS library is loaded by default so that `$x^2$` yields $x^2$ +- Slide navigation button on the preview; clicking on the notepad icon takes you to that slide in the deck +- Clicking on `more` yields options for + - 
Clearning the knitr cache + - Viewing in a browser (creates a temporay html file in `AppData/local/temp` for me) + - Create a html file to save where you want) +- A refresh button +- A zoom button that brings up a full window + +Visuals +=== +transition: linear + +- R Studio has made it easy to get some cool html5 effects, like cube transitions +with simple options in YAML-like code after the first slide such as +`transition: rotate` +- You can specify it in a slide-by-slide basis + +Here's the option "linear" +=== +transition: linear + +- Just put `transition: linear` right after the slide creation (three equal signs or more in a row) +- Tansition options + - http://www.rstudio.com/ide/docs/presentations/slide_transitions_and_navigation + +Hierarchical organization +=== +type: section +- If you want a hierarchical organization structure, just add a `type: typename` option after the slide +- This changes the default appearance + - http://www.rstudio.com/ide/docs/presentations/slide_transitions_and_navigation +- This is of type `section` + +Here's a subsection +=== +type: subsection + +Two columns +=== +- Do whatever for column one +- Then put `***` on a line by itself with blank lines before and after + +*** + +- Then do whatever for column two + + +Changing the slide font +========================================================== +font-import: http://fonts.googleapis.com/css?family=Risque +font-family: 'Risque' + +- Add a `font-family: fontname` option after the slide + - http://www.rstudio.com/ide/docs/presentations/customizing_fonts_and_appearance +- Specified in the same way as css font families + - http://www.w3schools.com/cssref/css_websafe_fonts.asp +- Use `font-import: url` to import fonts +- Important caveats + - Fonts must be present on the system that you're presenting on, or it will go to a fallback font + - You have to be connected to the internet to use an imported font (so don't rely on this for offline presentations) +- This is the `Risque` + - 
http://fonts.googleapis.com/css?family=Risque + +Really changing things +=== +- If you know html5 and CSS well, then you can basically change whatever you want +- A css file with the same names as your presentation will be autoimported +- You can use `css: file.css` to import a css file +- You have to create named classes and then use `class: classname` to get slide-specific style control from your css + - (Or you can apply then within a ``) +- Ultimately, you have an html file, that you can edit as you wish + - This should be viewed as a last resort, as the whole point is to have reproducible presentations, but may be the easiest way to get the exact style control you want for a final product + +Slidify versus R Studio Presenter +=== +**Slidify** +- Flexible control from the R MD file +- Under rapid ongoing development +- Large user base +- Lots and lots of styles and options +- Steeper learning curve +- More command-line oriented + +*** +**R Studio Presenter** +- Embedded in R Studio +- More GUI oriented +- Very easy to get started +- Smaller set of easy styles and options +- Default styles look very nice +- Ultimately as flexible as slidify with a little CSS and HTML knowledge + diff --git a/09_DevelopingDataProducts/rStudioPresent/index.md b/09_DevelopingDataProducts/rStudioPresent/index.md index 399fb071a..b998542ae 100644 --- a/09_DevelopingDataProducts/rStudioPresent/index.md +++ b/09_DevelopingDataProducts/rStudioPresent/index.md @@ -1,132 +1,132 @@ -RStudio Presenter -=== -author: Brian Caffo, Jeff Leek Roger Peng -date: April 24 2014 -transition: rotate - - -Department of Biostatistics -Bloomberg School of Public Health -Johns Hopkins University -Coursera Data Science Specialization - - - -RStudio Presentation -=== -- RStudio created a presentation authoring tool within their -development environment. 
-- If you are familiar with slidify, you will also be familiar with this tool - - Code is authored in a generalized markdown format that allows for code chunks - - The output is an html5 presentation - - The file index for the presenter file is .Rpres, which gets converted to an .md file and then to an html file if desired - - There's a preview tool in RStudio and GUIs for publishing to Rpubs or viewing/creating an html file - -Authoring content -=== -- This is a fairly complete guide - - http://www.rstudio.com/ide/docs/presentations/overview -- Quick start is - - `file` then `New File` then `R Presentation` - - (`alt-f` then `f` then `p` if you want key strokes) - - Use basically the same R markdown format for authoring as slidify/knitr - - Single quotes for inline code - - Tripple qutoes for block code - - Same options for code evaluation, caching, hiding etcetera - -Compiling and tools -=== -- R Studio auto formats and runs the code when you save the document -- Mathjax JS library is loaded by default so that `$x^2$` yields $x^2$ -- Slide navigation button on the preview; clicking on the notepad icon takes you to that slide in the deck -- Clicking on `more` yields options for - - Clearning the knitr cache - - Viewing in a browser (creates a temporay html file in `AppData/local/temp` for me) - - Create a html file to save where you want) -- A refresh button -- A zoom button that brings up a full window - -Visuals -=== -transition: linear - -- R Studio has made it easy to get some cool html5 effects, like cube transitions -with simple options in YAML-like code after the first slide such as -`transition: rotate` -- You can specify it in a slide-by-slide basis - -Here's the option "linear" -=== -transition: linear - -- Just put `transition: linear` right after the slide creation (three equal signs or more in a row) -- Tansition options - - http://www.rstudio.com/ide/docs/presentations/slide_transitions_and_navigation - -Hierarchical organization -=== -type: section 
-- If you want a hierarchical organization structure, just add a `type: typename` option after the slide -- This changes the default appearance - - http://www.rstudio.com/ide/docs/presentations/slide_transitions_and_navigation -- This is of type `section` - -Here's a subsection -=== -type: subsection - -Two columns -=== -- Do whatever for column one -- Then put `***` on a line by itself with blank lines before and after - -*** - -- Then do whatever for column two - - -Changing the slide font -========================================================== -font-import: http://fonts.googleapis.com/css?family=Risque -font-family: 'Risque' - -- Add a `font-family: fontname` option after the slide - - http://www.rstudio.com/ide/docs/presentations/customizing_fonts_and_appearance -- Specified in the same way as css font families - - http://www.w3schools.com/cssref/css_websafe_fonts.asp -- Use `font-import: url` to import fonts -- Important caveats - - Fonts must be present on the system that you're presenting on, or it will go to a fallback font - - You have to be connected to the internet to use an imported font (so don't rely on this for offline presentations) -- This is the `Risque` - - http://fonts.googleapis.com/css?family=Risque - -Really changing things -=== -- If you know html5 and CSS well, then you can basically change whatever you want -- A css file with the same names as your presentation will be autoimported -- You can use `css: file.css` to import a css file -- You have to create named classes and then use `class: classname` to get slide-specific style control from your css - - (Or you can apply then within a ``) -- Ultimately, you have an html file, that you can edit as you wish - - This should be viewed as a last resort, as the whole point is to have reproducible presentations, but may be the easiest way to get the exact style control you want for a final product - -Slidify versus R Studio Presenter -=== -**Slidify** -- Flexible control from the R MD file -- 
Under rapid ongoing development -- Large user base -- Lots and lots of styles and options -- Steeper learning curve -- More command-line oriented - -*** -**R Studio Presenter** -- Embedded in R Studio -- More GUI oriented -- Very easy to get started -- Smaller set of easy styles and options -- Default styles look very nice -- Ultimately as flexible as slidify with a little CSS and HTML knowledge - +RStudio Presenter +=== +author: Brian Caffo, Jeff Leek Roger Peng +date: May 21 2014 +transition: rotate + + +Department of Biostatistics +Bloomberg School of Public Health +Johns Hopkins University +Coursera Data Science Specialization + + + +RStudio Presentation +=== +- RStudio created a presentation authoring tool within their +development environment. +- If you are familiar with slidify, you will also be familiar with this tool + - Code is authored in a generalized markdown format that allows for code chunks + - The output is an html5 presentation + - The file index for the presenter file is .Rpres, which gets converted to an .md file and then to an html file if desired + - There's a preview tool in RStudio and GUIs for publishing to Rpubs or viewing/creating an html file + +Authoring content +=== +- This is a fairly complete guide + - http://www.rstudio.com/ide/docs/presentations/overview +- Quick start is + - `file` then `New File` then `R Presentation` + - (`alt-f` then `f` then `p` if you want key strokes) + - Use basically the same R markdown format for authoring as slidify/knitr + - Single quotes for inline code + - Tripple qutoes for block code + - Same options for code evaluation, caching, hiding etcetera + +Compiling and tools +=== +- R Studio auto formats and runs the code when you save the document +- Mathjax JS library is loaded by default so that `$x^2$` yields $x^2$ +- Slide navigation button on the preview; clicking on the notepad icon takes you to that slide in the deck +- Clicking on `more` yields options for + - Clearning the knitr cache + - Viewing 
in a browser (creates a temporay html file in `AppData/local/temp` for me) + - Create a html file to save where you want) +- A refresh button +- A zoom button that brings up a full window + +Visuals +=== +transition: linear + +- R Studio has made it easy to get some cool html5 effects, like cube transitions +with simple options in YAML-like code after the first slide such as +`transition: rotate` +- You can specify it in a slide-by-slide basis + +Here's the option "linear" +=== +transition: linear + +- Just put `transition: linear` right after the slide creation (three equal signs or more in a row) +- Tansition options + - http://www.rstudio.com/ide/docs/presentations/slide_transitions_and_navigation + +Hierarchical organization +=== +type: section +- If you want a hierarchical organization structure, just add a `type: typename` option after the slide +- This changes the default appearance + - http://www.rstudio.com/ide/docs/presentations/slide_transitions_and_navigation +- This is of type `section` + +Here's a subsection +=== +type: subsection + +Two columns +=== +- Do whatever for column one +- Then put `***` on a line by itself with blank lines before and after + +*** + +- Then do whatever for column two + + +Changing the slide font +========================================================== +font-import: http://fonts.googleapis.com/css?family=Risque +font-family: 'Risque' + +- Add a `font-family: fontname` option after the slide + - http://www.rstudio.com/ide/docs/presentations/customizing_fonts_and_appearance +- Specified in the same way as css font families + - http://www.w3schools.com/cssref/css_websafe_fonts.asp +- Use `font-import: url` to import fonts +- Important caveats + - Fonts must be present on the system that you're presenting on, or it will go to a fallback font + - You have to be connected to the internet to use an imported font (so don't rely on this for offline presentations) +- This is the `Risque` + - 
http://fonts.googleapis.com/css?family=Risque + +Really changing things +=== +- If you know html5 and CSS well, then you can basically change whatever you want +- A css file with the same names as your presentation will be autoimported +- You can use `css: file.css` to import a css file +- You have to create named classes and then use `class: classname` to get slide-specific style control from your css + - (Or you can apply then within a ``) +- Ultimately, you have an html file, that you can edit as you wish + - This should be viewed as a last resort, as the whole point is to have reproducible presentations, but may be the easiest way to get the exact style control you want for a final product + +Slidify versus R Studio Presenter +=== +**Slidify** +- Flexible control from the R MD file +- Under rapid ongoing development +- Large user base +- Lots and lots of styles and options +- Steeper learning curve +- More command-line oriented + +*** +**R Studio Presenter** +- Embedded in R Studio +- More GUI oriented +- Very easy to get started +- Smaller set of easy styles and options +- Default styles look very nice +- Ultimately as flexible as slidify with a little CSS and HTML knowledge + From 33a0fc9a80a393e1336c7a2df9e2897cd1daf17a Mon Sep 17 00:00:00 2001 From: Troy Date: Mon, 2 Jun 2014 00:37:30 -0400 Subject: [PATCH 08/15] fix typos and adjust wording in hw3, knit html with slidify --- 06_StatisticalInference/homework/hw1.html | 20 ++++++++++---------- 06_StatisticalInference/homework/hw1.md | 20 ++++++++++---------- 06_StatisticalInference/homework/hw2.html | 17 ++++++++--------- 06_StatisticalInference/homework/hw2.md | 15 +++++++-------- 06_StatisticalInference/homework/hw3.Rmd | 15 +++++++-------- 06_StatisticalInference/homework/hw3.html | 22 +++++++++------------- 06_StatisticalInference/homework/hw3.md | 17 ++++++++--------- 7 files changed, 59 insertions(+), 67 deletions(-) diff --git a/06_StatisticalInference/homework/hw1.html 
b/06_StatisticalInference/homework/hw1.html index e2e14b4a6..3db02f1ae 100644 --- a/06_StatisticalInference/homework/hw1.html +++ b/06_StatisticalInference/homework/hw1.html @@ -63,13 +63,13 @@

About these slides

-Consider influenza epidemics for two parent heterosexual families. Suppose that the probability is 15% that at least one of the parents has contracted the disease. The probability that the father has contracted influenza is 6% while that the mother contracted the disease is 5%. What is the probability that both contracted influenza expressed as a whole number percentage?
+Consider influenza epidemics for two parent heterosexual families. Suppose that the probability is 15% that at least one of the parents has contracted the disease. The probability that the father has contracted influenza is 10% while that the mother contracted the disease is 9%. What is the probability that both contracted influenza expressed as a whole number percentage?

 1. 15%
-2. 6%
-3. 5%
-4. 2%
+2. 10%
+3. 9%
+4. 4%
@@ -78,18 +78,18 @@

About these slides

-\(A = Father\), \(P(A) = .06\), \(B = Mother\), \(P(B) = .05\)
+\(A = Father\), \(P(A) = .10\), \(B = Mother\), \(P(B) = .09\)
 \(P(A\cup B) = .15\),

-\(P(A\cup B) = P(A) + P(B) - 2 P(AB)\) thus
-\[.15 = .06 + .05 - 2 P(AB)\]
+\(P(A\cup B) = P(A) + P(B) - P(AB)\) thus
+\[.15 = .10 + .09 - P(AB)\]

-(0.15 - .06 - .05) / 2
+.10 + .09 - .15

-[1] 0.02
+[1] 0.04
 
@@ -107,7 +107,7 @@

About these slides

 1. 1.00
 2. 0.75
-3. 0.50
+3. 0.50
 4. 0.25
diff --git a/06_StatisticalInference/homework/hw1.md b/06_StatisticalInference/homework/hw1.md index 4a7add5f0..0025d9fc3 100644 --- a/06_StatisticalInference/homework/hw1.md +++ b/06_StatisticalInference/homework/hw1.md @@ -25,27 +25,27 @@ Creating Data Products --- &radio -Consider influenza epidemics for two parent heterosexual families. Suppose that the probability is 15% that at least one of the parents has contracted the disease. The probability that the father has contracted influenza is 6% while that the mother contracted the disease is 5%. What is the probability that both contracted influenza expressed as a whole number percentage? +Consider influenza epidemics for two parent heterosexual families. Suppose that the probability is 15% that at least one of the parents has contracted the disease. The probability that the father has contracted influenza is 10% while that the mother contracted the disease is 9%. What is the probability that both contracted influenza expressed as a whole number percentage? 1. 15% -2. 6% -3. 5% -4. _2%_ +2. 10% +3. 9% +4. _4%_ *** .hint -$A = Father$, $P(A) = .06$, $B = Mother$, $P(B) = .05$ +$A = Father$, $P(A) = .10$, $B = Mother$, $P(B) = .09$ $P(A\cup B) = .15$, *** .explanation -$P(A\cup B) = P(A) + P(B) - 2 P(AB)$ thus -$$.15 = .06 + .05 - 2 P(AB)$$ +$P(A\cup B) = P(A) + P(B) - P(AB)$ thus +$$.15 = .10 + .09 - P(AB)$$ ```r -(0.15 - .06 - .05) / 2 +.10 + .09 - .15 ``` ``` -[1] 0.02 +[1] 0.04 ``` @@ -55,7 +55,7 @@ A random variable, $X$, is uniform, a box from $0$ to $1$ of height $1$. (So tha 1. 1.00 2. 0.75 -3. 0.50 +3. _0.50_ 4. 0.25 *** .hint diff --git a/06_StatisticalInference/homework/hw2.html b/06_StatisticalInference/homework/hw2.html index 5bab28043..e7a3d6a50 100644 --- a/06_StatisticalInference/homework/hw2.html +++ b/06_StatisticalInference/homework/hw2.html @@ -48,12 +48,11 @@

About these slides

-  • These are some practice problems for Statistical Inference Quiz 1
+  • These are some practice problems for Statistical Inference Quiz 2
   • They were created using slidify interactive which you will learn in Creating Data Products
   • Please help improve this with pull requests here
-(https://github.com/bcaffo/courses)
-runif(1)
+(https://github.com/bcaffo/courses)
@@ -288,12 +287,12 @@

About these slides

-89.1
+10.9

-round(pbinom(4, prob = .5, size = 6, lower.tail = TRUE) * 100, 1)
+round(pbinom(4, prob = .5, size = 6, lower.tail = FALSE) * 100, 1)

-[1] 89.1
+[1] 10.9
 
@@ -388,9 +387,9 @@

About these slides

-The answer will be 0
-since the variance of the sampling distribution of the mean is \(\sigma^2/12\)
-and the variance of a die roll is
+The answer will be 0.292
+since the variance of the sampling distribution of the mean is \(\sigma^2/10\)
+where \(\sigma^2\) is the variance of a single die roll, which is

mean((1 : 6 - 3.5)^2)
 
diff --git a/06_StatisticalInference/homework/hw2.md b/06_StatisticalInference/homework/hw2.md index 32ef6b25f..44ecbe56b 100644 --- a/06_StatisticalInference/homework/hw2.md +++ b/06_StatisticalInference/homework/hw2.md @@ -16,12 +16,11 @@ mode : selfcontained # {standalone, draft} ## About these slides -- These are some practice problems for Statistical Inference Quiz 1 +- These are some practice problems for Statistical Inference Quiz 2 - They were created using slidify interactive which you will learn in Creating Data Products - Please help improve this with pull requests here (https://github.com/bcaffo/courses) -runif(1) --- &radio The probability that a manuscript gets accepted to a journal is 12% (say). However, @@ -182,15 +181,15 @@ Let $p=.5$ and $X$ be binomial *** .explanation -89.1 +10.9 ```r -round(pbinom(4, prob = .5, size = 6, lower.tail = TRUE) * 100, 1) +round(pbinom(4, prob = .5, size = 6, lower.tail = FALSE) * 100, 1) ``` ``` -[1] 89.1 +[1] 10.9 ``` @@ -247,9 +246,9 @@ If you roll ten standard dice, take their average, then repeat this process over $$Var(\bar X) = \sigma^2 /n$$ *** .explanation -The answer will be 0 -since the variance of the sampling distribution of the mean is $\sigma^2/12$ -and the variance of a die roll is +The answer will be 0.292 +since the variance of the sampling distribution of the mean is $\sigma^2/10$ +where $\sigma^2$ is the variance of a single die roll, which is ```r diff --git a/06_StatisticalInference/homework/hw3.Rmd b/06_StatisticalInference/homework/hw3.Rmd index 25fdc736f..df1866fc0 100644 --- a/06_StatisticalInference/homework/hw3.Rmd +++ b/06_StatisticalInference/homework/hw3.Rmd @@ -70,11 +70,11 @@ students t confidence interval touch zero? 
The t interval is $\bar x t_{.95, 8}\pm s /sqrt{n}$ *** .explanation -`r round(qt(.95, df = 3) * 1 / 3, 2)` +`r round(qt(.95, df = 8) * 1 / 3, 2)` We want $\bar x = t_{.95} s / sqrt{n}$ ```{r} -round(qt(.95, df = 3) * 1 / 3, 2) +round(qt(.95, df = 8) * 1 / 3, 2) ``` @@ -83,7 +83,7 @@ An independent group Student's T interval is used over a paired T interval when: 1. The observations are paired between the groups. -2. _The observations within the groups are natually assumed to be statistically independent_ +2. _The observations between the groups are natually assumed to be statistically independent_ 3. As long as you do it correctly, either is fine. 4. More details are needed to answer this question @@ -91,7 +91,7 @@ a paired T interval when: A paired interval is for paired observations. *** .explanation -If the groups are independent is the correct interval. +We can't pair them if the groups are independent of each other as well as independent within themselves. --- &multitext @@ -132,7 +132,7 @@ C'mon. You don't need a hint *** .explanation This is just an example of what happens to confidence intervals as you -increas the confidence level. You want to be quite sure in your interval (i.e. +increase the confidence level. You want to be quite sure in your interval (i.e. have a large confidence level) and so you would increase the interval's width --- &radio @@ -153,9 +153,9 @@ make sure the results make sense in the context of the problem. The interval was conducted subtracting 4 - 6 and was entirely above zero. --- &multitext -Suppose that 18 obese subjects were randomized, 9 each, to a new diet pill and a placebo. Subjects' body mass indices (BMIs) were measured at a baseline and again after having received the treatment or placebo for four weeks. The average difference from follow-up to the baseline (followup - baseline) was ???3 kg/m2 for the treated group and 1 kg/m2 for the placebo group. 
The corresponding standard deviations of the differences was 1.5 kg/m2 for the treatment group and 1.8 kg/m2 for the placebo group. Does the change in BMI over the four week period appear to differ between the treated and placebo groups? +Suppose that 18 obese subjects were randomized, 9 each, to a new diet pill and a placebo. Subjects' body mass indices (BMIs) were measured at a baseline and again after having received the treatment or placebo for four weeks. The average difference from follow-up to the baseline (followup - baseline) was 3 kg/m2 for the treated group and 1 kg/m2 for the placebo group. The corresponding standard deviations of the differences was 1.5 kg/m2 for the treatment group and 1.8 kg/m2 for the placebo group. The study aims to answer whether the change in BMI over the four week period appear to differ between the treated and placebo groups. -1. Calculate the pooled variance estimate to 2 decimal places +What is the pooled variance estimate? (to 2 decimal places) *** .hint @@ -164,7 +164,6 @@ individual variances *** .explanation -`r round(min(confint), 1)` ```{r} n1 <- n2 <- 9 x1 <- -3 ##treated diff --git a/06_StatisticalInference/homework/hw3.html b/06_StatisticalInference/homework/hw3.html index 99aca5837..6e54ea85e 100644 --- a/06_StatisticalInference/homework/hw3.html +++ b/06_StatisticalInference/homework/hw3.html @@ -122,14 +122,14 @@

About these slides

-0.78
+0.62

 We want \(\bar x = t_{.95} s / sqrt{n}\)

-round(qt(.95, df = 3) * 1 / 3, 2)
+round(qt(.95, df = 8) * 1 / 3, 2)

-[1] 0.78
+[1] 0.62
 
@@ -147,7 +147,7 @@

About these slides

 1. The observations are paired between the groups.
-2. The observations within the groups are natually assumed to be statistically independent
+2. The observations between the groups are natually assumed to be statistically independent
 3. As long as you do it correctly, either is fine.
 4. More details are needed to answer this question
@@ -162,7 +162,7 @@

About these slides

-If the groups are independent is the correct interval.
+We can't pair them if the groups are independent of each other as well as independent within themselves.

@@ -233,7 +233,7 @@

About these slides

 This is just an example of what happens to confidence intervals as you
-increas the confidence level. You want to be quite sure in your interval (i.e.
+increase the confidence level. You want to be quite sure in your interval (i.e.
 have a large confidence level) and so you would increase the interval's width

@@ -279,11 +279,9 @@

About these slides

-Suppose that 18 obese subjects were randomized, 9 each, to a new diet pill and a placebo. Subjects' body mass indices (BMIs) were measured at a baseline and again after having received the treatment or placebo for four weeks. The average difference from follow-up to the baseline (followup - baseline) was ???3 kg/m2 for the treated group and 1 kg/m2 for the placebo group. The corresponding standard deviations of the differences was 1.5 kg/m2 for the treatment group and 1.8 kg/m2 for the placebo group. Does the change in BMI over the four week period appear to differ between the treated and placebo groups?
+Suppose that 18 obese subjects were randomized, 9 each, to a new diet pill and a placebo. Subjects' body mass indices (BMIs) were measured at a baseline and again after having received the treatment or placebo for four weeks. The average difference from follow-up to the baseline (followup - baseline) was 3 kg/m2 for the treated group and 1 kg/m2 for the placebo group. The corresponding standard deviations of the differences was 1.5 kg/m2 for the treatment group and 1.8 kg/m2 for the placebo group. The study aims to answer whether the change in BMI over the four week period appear to differ between the treated and placebo groups.

-1. Calculate the pooled variance estimate to 2 decimal places
+What is the pooled variance estimate? (to 2 decimal places)

@@ -296,9 +294,7 @@

About these slides

-3.2
-
-n1 <- n2 <- 9
+n1 <- n2 <- 9
 x1 <- -3  ##treated
 x2 <- 1  ##placebo
 s1 <- 1.5  ##treated
diff --git a/06_StatisticalInference/homework/hw3.md b/06_StatisticalInference/homework/hw3.md
index 42919061a..93859ed5b 100644
--- a/06_StatisticalInference/homework/hw3.md
+++ b/06_StatisticalInference/homework/hw3.md
@@ -64,16 +64,16 @@ students t confidence interval touch zero?
 The t interval is $\bar x t_{.95, 8}\pm s /sqrt{n}$
 
 *** .explanation
-0.78
+0.62
 
 We want $\bar x = t_{.95} s / sqrt{n}$
 
 ```r
-round(qt(.95, df = 3) * 1 / 3, 2)
+round(qt(.95, df = 8) * 1 / 3, 2)
 ```
 
 ```
-[1] 0.78
+[1] 0.62
 ```
 
 
@@ -83,7 +83,7 @@ An independent group Student's T interval is used over
 a paired T interval when:
 
 1. The observations are paired between the groups.
-2. _The observations within the groups are natually assumed to be statistically independent_
+2. _The observations between the groups are natually assumed to be statistically independent_
 3. As long as you do it correctly, either is fine.
 4. More details are needed to answer this question
 
@@ -91,7 +91,7 @@ a paired T interval when:
 A paired interval is for paired observations.
 
 *** .explanation
-If the groups are independent is the correct interval.
+We can't pair them if the groups are independent of each other as well as independent within themselves.
 
 
 --- &multitext
@@ -134,7 +134,7 @@ C'mon. You don't need a hint
 
 *** .explanation
 This is just an example of what happens to confidence intervals as you
-increas the confidence level. You want to be quite sure in your interval (i.e.
+increase the confidence level. You want to be quite sure in your interval (i.e.
 have a large confidence level) and so you would increase the interval's width
 
 --- &radio
@@ -155,9 +155,9 @@ make sure the results make sense in the context of the problem.
 The interval was conducted subtracting 4 - 6 and was entirely above zero.
 
 --- &multitext
-Suppose that 18 obese subjects were randomized, 9 each, to a new diet pill and a placebo. Subjects' body mass indices (BMIs) were measured at a baseline and again after having received the treatment or placebo for four weeks. The average difference from follow-up to the baseline (followup - baseline) was ???3 kg/m2 for the treated group and 1 kg/m2 for the placebo group. The corresponding standard deviations of the differences was 1.5 kg/m2 for the treatment group and 1.8 kg/m2 for the placebo group. Does the change in BMI over the four week period appear to differ between the treated and placebo groups?  
+Suppose that 18 obese subjects were randomized, 9 each, to a new diet pill and a placebo. Subjects' body mass indices (BMIs) were measured at a baseline and again after having received the treatment or placebo for four weeks. The average difference from follow-up to the baseline (followup - baseline) was 3 kg/m2 for the treated group and 1 kg/m2 for the placebo group. The corresponding standard deviations of the differences was 1.5 kg/m2 for the treatment group and 1.8 kg/m2 for the placebo group. The study aims to answer whether the change in BMI over the four week period appear to differ between the treated and placebo groups. 
 
-1. Calculate the pooled variance estimate to 2 decimal places
+What is the pooled variance estimate? (to 2 decimal places)
 
 
 *** .hint
@@ -166,7 +166,6 @@ individual variances
 
 
 *** .explanation
-3.2
 
 ```r
 n1 <- n2 <- 9

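Note on the hw3 pooled-variance question in the hunks above: the explanation chunk is cut off after `n1 <- n2 <- 9`, so the following is only a minimal sketch of the calculation, assuming the usual pooled estimator $s_p^2 = \{(n_1 - 1) s_1^2 + (n_2 - 1) s_2^2\} / (n_1 + n_2 - 2)$ and the group sizes and standard deviations stated in the question. The name `sp2` is illustrative and may not match the name used in the actual homework file.

```r
# Sketch only: values taken from the question text, variable names are assumptions
n1 <- n2 <- 9          # subjects per group
s1 <- 1.5              # SD of BMI differences, treated group
s2 <- 1.8              # SD of BMI differences, placebo group

# Pooled variance: degrees-of-freedom weighted average of the two variances
sp2 <- ((n1 - 1) * s1^2 + (n2 - 1) * s2^2) / (n1 + n2 - 2)
sp2
```

Because the two groups have the same size, the weights are equal and the pooled variance reduces to the simple average of the two variances (2.25 and 3.24).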
From f65be39d555d2cec74af97f024d1a49cd5527ee3 Mon Sep 17 00:00:00 2001
From: "Roger D. Peng [amelia]" 
Date: Thu, 9 Jul 2015 13:30:25 -0400
Subject: [PATCH 09/15] Fix complex coercion

---
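Note: a short sketch of the behaviour this patch documents, assuming `x` on the slide is a character vector of non-numeric strings such as `c("a", "b", "c")`. That definition is not visible in this hunk, so treat it as an assumption. The names `z` and `w` are illustrative only.

```r
# Assumption: x holds strings that cannot be parsed as numbers
x <- c("a", "b", "c")

# Coercion fails element-wise; R returns NA and warns
# "NAs introduced by coercion" (suppressed here so the sketch runs quietly)
z <- suppressWarnings(as.complex(x))
z

# The output removed by the patch looks like what an integer vector gives
w <- as.complex(0:6)
w
```

This suggests the stale `0+0i 1+0i ...` line was left over from an earlier example that coerced `0:6`, not from the character vector.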
 02_RProgramming/DataTypes/index.Rmd | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/02_RProgramming/DataTypes/index.Rmd b/02_RProgramming/DataTypes/index.Rmd
index 65eb1ce54..56394c9df 100644
--- a/02_RProgramming/DataTypes/index.Rmd
+++ b/02_RProgramming/DataTypes/index.Rmd
@@ -200,7 +200,9 @@ NAs introduced by coercion
 > as.logical(x)
 [1] NA NA NA
 > as.complex(x)
-[1] 0+0i 1+0i 2+0i 3+0i 4+0i 5+0i 6+0i
+[1] NA NA NA
+Warning message:
+NAs introduced by coercion 
 ```
 
 ---
@@ -472,4 +474,4 @@ Data Types
 
 - data frames
 
-- names
\ No newline at end of file
+- names

From 3ddf31bf1048973f471a939556b27bbfa7335428 Mon Sep 17 00:00:00 2001
From: "Roger D. Peng [amelia]" 
Date: Thu, 9 Jul 2015 13:31:52 -0400
Subject: [PATCH 10/15] Use librariesNew

---
 02_RProgramming/DataTypes/index.Rmd  |   2 +-
 02_RProgramming/DataTypes/index.html | 357 +++++++++++++++++++--------
 02_RProgramming/DataTypes/index.md   |   6 +-
 3 files changed, 263 insertions(+), 102 deletions(-)

diff --git a/02_RProgramming/DataTypes/index.Rmd b/02_RProgramming/DataTypes/index.Rmd
index 56394c9df..19f8f1af4 100644
--- a/02_RProgramming/DataTypes/index.Rmd
+++ b/02_RProgramming/DataTypes/index.Rmd
@@ -8,7 +8,7 @@ framework   : io2012        # {io2012, html5slides, shower, dzslides, ...}
 highlighter : highlight.js  # {highlight.js, prettify, highlight}
 hitheme     : tomorrow      # 
 url:
-  lib: ../../libraries
+  lib: ../../librariesNew
   assets: ../../assets
 widgets     : [mathjax]            # {mathjax, quiz, bootstrap}
 mode        : selfcontained # {standalone, draft}
diff --git a/02_RProgramming/DataTypes/index.html b/02_RProgramming/DataTypes/index.html
index 9b50617cb..00c65c081 100644
--- a/02_RProgramming/DataTypes/index.html
+++ b/02_RProgramming/DataTypes/index.html
@@ -8,46 +8,46 @@
   
   
   
-  
-  
+  
-  
-  
-   
-  
+   
+  
   
-    
-
+  
 
 
 
   
     
     
-    
+        
+  
+  
+

Introduction to the R Language

+

Data Types and Basic Operations

+

Roger Peng, Associate Professor
Johns Hopkins Bloomberg School of Public Health

+
+
+
- - - - -
-

Introduction to the R Language

-

Data Types and Basic Operations

-

Roger Peng, Associate Professor
Johns Hopkins Bloomberg School of Public Health

-
-
- - +

Objects

-
+

R has five basic or “atomic” classes of objects: