carlosstack
diff --git a/Analysis.Rmd b/Analysis.Rmd
Lines changed: 119 additions & 0 deletions
diff --git a/Analysis.html b/Analysis.html
Lines changed: 2066 additions & 0 deletions
diff --git a/Analysis.pdf b/Analysis.pdf
45.7 KB
diff --git a/Report.Rmd b/Report.Rmd
Lines changed: 37 additions & 20 deletions
diff --git a/Report.html b/Report.html
Lines changed: 1368 additions & 99 deletions
diff --git a/Report.pdf b/Report.pdf
80.2 KB
diff --git a/Report_files/figure-html/unnamed-chunk-3-1.png b/Report_files/figure-html/unnamed-chunk-3-1.png
17.6 KB
diff --git a/Report_files/figure-html/unnamed-chunk-4-1.png b/Report_files/figure-html/unnamed-chunk-4-1.png
16.3 KB
diff --git a/rsconnect/documents/Analysis.Rmd/rpubs.com/rpubs/Document.dcf b/rsconnect/documents/Analysis.Rmd/rpubs.com/rpubs/Document.dcf
Lines changed: 10 additions & 0 deletions
diff --git a/rsconnect/documents/Report.Rmd/rpubs.com/rpubs/Document.dcf b/rsconnect/documents/Report.Rmd/rpubs.com/rpubs/Document.dcf
Lines changed: 10 additions & 0 deletions
diff --git a/statistical-inference.Rproj b/statistical-inference.Rproj
Lines changed: 1 addition & 1 deletion
@@ -0,0 +1,119 @@
+---
+title: "Analysis of ToothGrowth data in the R datasets package"
+author: "Carlos Hernández"
+date: "25/9/2020"
+output: 
+  pdf_document:
+    latex_engine: xelatex
+    highlight: espresso
+    toc: true
+    toc_depth: 4
+---
+
+```{r setup, include=FALSE}
+knitr::opts_chunk$set(echo = TRUE)
+```
+
+## ToothGrowth Dataset
+
+The ToothGrowth data set contains the results of an experiment studying the effect of vitamin C on tooth growth in 60 guinea pigs. Each animal received one of three dose levels of vitamin C (0.5, 1, or 2 mg/day) by one of two delivery methods: orange juice (coded as OJ) or ascorbic acid (a form of vitamin C, coded as VC).
+
+```{r ToothGrowth }
+library(kableExtra)
+head(ToothGrowth,5) %>%
+  kbl() %>%
+  kable_material(c("striped", "hover"))
+
+```
+ 
+1. len: Tooth length (numeric).
+2. supp: Supplement type (VC or OJ).
+3. dose: Dose in milligrams/day (numeric).
+
+## Basic exploratory data analysis
+
+```{r}
+library(ggplot2)
+ggplot(ToothGrowth, aes(x = dose, y = len, fill = supp)) + 
+  geom_col() +
+  facet_grid(~supp, scales = "free")
+```
+
+At first glance, orange juice appears to be the more effective delivery method: tooth growth is greater when the dose is given as orange juice than when it is given as ascorbic acid. At a dose of 2 mg, however, growth appears to be about the same regardless of the delivery method.
+
+These are only initial impressions, which we can support or reject by performing a hypothesis test.
+
+
+## Hypothesis Testing
+
+### Assumptions
+
+- The observations are independent and identically distributed (i.i.d.).
+- The variance of tooth growth may differ between supplement and dose groups, so equal variances are not assumed (`var.equal = FALSE`).
+- Tooth growth follows a normal distribution.
+
+### Hypothesis 1: Variation of tooth length when using OJ or VC
+
+Our null hypothesis is that mean tooth length is the same for both delivery methods (VC and OJ).
+
+Our alternative hypothesis is that mean tooth length is greater with OJ than with VC. The test is therefore one-sided (`alternative = "greater"`).
+
+```{r}
+oj_len <- ToothGrowth[ToothGrowth$supp=="OJ",]$len
+vc_len <- ToothGrowth[ToothGrowth$supp=="VC",]$len
+
+t.test(oj_len,vc_len, paired = FALSE, var.equal = FALSE, alternative = "greater") 
+```
+
+The p-value is less than 0.05, so we reject the null hypothesis at the 5% significance level and conclude that mean tooth length is greater with OJ than with VC.
+
+The sample means reported by the test point the same way: on average, tooth length is greater with OJ than with VC.
+
+### Hypothesis 2: Variation of tooth length when using different doses
+
+Our null hypothesis is that, at a given dose, mean tooth length is the same for both delivery methods.
+
+Our alternative hypothesis is that, at that dose, mean tooth length is greater with OJ than with VC.
+
+```{r}
+OJDoseHalf <- ToothGrowth[ToothGrowth$supp=="OJ" & ToothGrowth$dose==0.5,]$len
+OJDoseOne <- ToothGrowth[ToothGrowth$supp=="OJ" & ToothGrowth$dose==1.0,]$len
+OJDoseTwo <- ToothGrowth[ToothGrowth$supp=="OJ" & ToothGrowth$dose==2.0,]$len
+
+VCDoseHalf <- ToothGrowth[ToothGrowth$supp=="VC" & ToothGrowth$dose==0.5,]$len
+VCDoseOne <- ToothGrowth[ToothGrowth$supp=="VC" & ToothGrowth$dose==1.0,]$len
+VCDoseTwo <- ToothGrowth[ToothGrowth$supp=="VC" & ToothGrowth$dose==2.0,]$len
+```
+
+For dose equal to 0.5 mg:
+
+```{r}
+t.test(OJDoseHalf, VCDoseHalf, paired = FALSE, var.equal = FALSE, alternative = "greater") 
+```
+
+For dose equal to 1 mg:
+
+```{r}
+t.test(OJDoseOne,VCDoseOne, paired = FALSE, var.equal = FALSE, alternative = "greater") 
+
+```
+
+For dose equal to 2 mg:
+
+```{r}
+t.test(OJDoseTwo, VCDoseTwo, paired = FALSE, var.equal = FALSE, alternative = "greater") 
+```
+
+For doses of 0.5 mg and 1 mg we obtain results similar to those for hypothesis 1. In both cases the p-value is less than 0.05, so we reject the null hypothesis and conclude that, at these doses, tooth length is greater when the vitamin C is administered with OJ than with VC.
+
+For a dose of 2 mg, however, the p-value is greater than 0.05, so we fail to reject the null hypothesis. At this dose we have no evidence that OJ yields greater tooth length than VC.
+
+## Conclusion
+
+In conclusion, this brief analysis indicates that for doses of 0.5 mg and 1 mg, orange juice results in greater tooth length than ascorbic acid. For a dose of 2 mg, we found no evidence of a difference in tooth length between OJ and VC.
+
+
+
+
+
+
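Note on the Analysis.Rmd hunk above: the three per-dose comparisons repeat the same `t.test()` call, so they could be condensed into one loop. The following is a minimal sketch, not the author's code. It assumes the same one-sided Welch test used in the report, and the names `doses`, `p_values`, and `grp` are hypothetical.

```r
# Sketch: one-sided Welch t-test of OJ vs VC at each dose level.
# ToothGrowth ships with base R (datasets package).
doses <- sort(unique(ToothGrowth$dose))

p_values <- sapply(doses, function(d) {
  grp <- ToothGrowth[ToothGrowth$dose == d, ]   # rows for this dose only
  t.test(grp$len[grp$supp == "OJ"],
         grp$len[grp$supp == "VC"],
         paired = FALSE, var.equal = FALSE,
         alternative = "greater")$p.value
})
names(p_values) <- doses

# Compare each p-value with the 0.05 significance level
print(round(p_values, 4))
print(p_values < 0.05)
```

Collecting the p-values in one vector makes it easier to compare them against the 0.05 threshold side by side.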
@@ -1,8 +1,14 @@
 ---
 title: "Simulation of Exponential Distribution using R"
+
 author: "Carlos Hernández"
 date: "25/9/2020"
-output: html_document
+output:
+  pdf_document:
+    latex_engine: xelatex
+    highlight: espresso
+    toc: true
+    toc_depth: 4
 ---
 
 ```{r setup, include=FALSE}
@@ -50,8 +56,9 @@ data <- data.frame(value = c(t(rexp(1000, rate = 1))))
 
 ggplot(data, aes(x=value)) + 
   geom_histogram(aes(y=..density..),binwidth=.25, col="black", fill="lightblue")+
-    labs(title= "Exponential distribution with mean = 1", caption="Produced by Carlos Hernández") +
-   xlab("x") +
+  labs(title= "Exponential distribution with mean = 1", 
+       caption="Produced by Carlos Hernández") +
+  xlab("x") +
   ylab("y")
 
 ```
@@ -73,7 +80,9 @@ expData <- data.frame(value = c(t(expData))) # convert to data frame
 # plot
 ggplot(expData, aes(x=value)) + 
   geom_histogram(aes(y=..density..), binwidth=.8,colour="black", fill="lightblue") +
-   labs(title= "Exponential distribution with lambda = 0.2 and 40 observations", subtitle = "Replicated 1000 times", caption="Produced by Carlos Hernández") +
+   labs(title= "Exponential distribution with lambda = 0.2 and 40 observations", 
+        subtitle = "Replicated 1000 times", 
+        caption="Produced by Carlos Hernández") +
   xlab("x") +
   ylab("exp(x)")
 
@@ -89,7 +98,9 @@ data <- data.frame(value = c(t(data)), size = 40)
 
 ggplot(data, aes(x=value)) + 
   geom_histogram(aes(y=..density..),binwidth=.25, col="black", fill="lightblue") +
-  labs(title= "Average of 40 random exponential distribution", subtitle = "Replicated 1000 times", caption="Produced by Carlos Hernández") +
+  labs(title= "Average of 40 random exponential distribution", 
+       subtitle = "Replicated 1000 times", 
+       caption="Produced by Carlos Hernández") +
   xlab("x") +
   ylab("mean")
 
@@ -105,15 +116,19 @@ theoretical_mu <- 1/lambda # calculate theoretical mean
 sample_mu <-mean(data$value) # calculate experimental mean
 
 ggplot(data, aes(x=value)) + 
-  stat_function(fun=dnorm,
-                         color="black",
-                         args=list(mean=mean(data$value), 
-                                  sd=sd(data$value)))+
+  stat_function(fun=dnorm, 
+                color="black", 
+                args=list(mean=mean(data$value), 
+                sd=sd(data$value)))+
   geom_vline(xintercept = theoretical_mu, colour="red") +
-  geom_text(aes(x=theoretical_mu-.25, label="\nTheoretical mean", y=.2), colour="red", angle=90, text=element_text(size=11)) +
+  geom_text(aes(x=theoretical_mu-.25, 
+                label="\nTheoretical mean", y=.2), 
+            colour="red", angle=90, text=element_text(size=11)) +
   geom_vline(xintercept = sample_mu, colour="green")+
-  geom_text(aes(x=sample_mu+.05, label="\nSample mean", y=.2), colour="green", angle=90, text=element_text(size=11)) +
-  labs(title= "Theoretical mean vs sample mean", caption="Produced by Carlos Hernández") +
+  geom_text(aes(x=sample_mu+.05, label="\nSample mean", y=.2), 
+            colour="green", angle=90) + 
+  labs(title= "Theoretical mean vs sample mean", 
+       caption="Produced by Carlos Hernández") +
   xlab("x") +
   ylab("y")
 
@@ -131,15 +146,15 @@ theoretical_variance <- 1/(n * lambda^2)
 sample_variance <- round(var(data$value),3)
 
 ggplot(data, aes(x=value)) + 
-  stat_function(fun=dnorm,
-                         color="black",
-                         args=list(mean=mean(data$value), 
-                                  sd=sd(data$value)))+
+  stat_function(fun=dnorm, color="black", args=list(mean=mean(data$value), sd=sd(data$value)))+
   geom_vline(xintercept = sample_mu, colour="gray", linetype="dashed")+
   geom_vline(xintercept = theoretical_mu, colour="gray", linetype="dashed")+
-  geom_segment(aes(x = sample_mu, y = 0.36, xend =sample_mu + sample_variance, yend = 0.36), colour="green") +
-  geom_segment(aes(x = theoretical_mu - theoretical_variance, y = 0.35, xend =theoretical_mu, yend = 0.35), colour="red") +
-  labs(title= "Theoretical variance vs sample variance", caption="Produced by Carlos Hernández") +
+  geom_segment(aes(x = sample_mu, y = 0.36, xend =sample_mu + 
+                     sample_variance, yend = 0.36), colour="green") +
+  geom_segment(aes(x = theoretical_mu - theoretical_variance, y = 0.35, 
+                   xend =theoretical_mu, yend = 0.35), colour="red") +
+  labs(title= "Theoretical variance vs sample variance", 
+       caption="Produced by Carlos Hernández") +
   geom_text(aes(x=sample_mu+.55, label="\nSample variance", y=.42), colour="green") +
   geom_text(aes(x=theoretical_mu-.65, label="\nTheoretical variance", y=.33), colour="red") +
   xlab("x") +
@@ -160,7 +175,9 @@ ggplot(data, aes(x=value)) +
                          color="blue",
                          args=list(mean=mean(data$value), 
                                   sd=sd(data$value)))+
-  labs(title= "Average of 40 random exponential distribution", subtitle = "Replicated 1000 times", caption="Produced by Carlos Hernández") +
+  labs(title= "Average of 40 random exponential distribution", 
+       subtitle = "Replicated 1000 times", 
+       caption="Produced by Carlos Hernández") +
   xlab("x") +
   ylab("y")
 ```
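Note on the Report.Rmd hunks above: the plots compare the sample mean and variance of the averages with `theoretical_mu <- 1/lambda` and `theoretical_variance <- 1/(n * lambda^2)`. The sketch below checks those two quantities numerically under the settings named in the plot titles (lambda = 0.2, 40 observations, 1000 replications). It is illustrative only: the seed and the names `n_sims` and `sample_means` are assumptions, not taken from the report.

```r
# Sketch: compare simulated and theoretical mean/variance of the
# average of n exponential draws.
set.seed(1)            # hypothetical seed, for reproducibility only
lambda <- 0.2
n      <- 40
n_sims <- 1000

# One average of n exponential observations per replication
sample_means <- replicate(n_sims, mean(rexp(n, rate = lambda)))

theoretical_mu       <- 1 / lambda            # mean of the sample mean
theoretical_variance <- 1 / (n * lambda^2)    # variance of the sample mean

print(c(sample = mean(sample_means), theoretical = theoretical_mu))
print(c(sample = var(sample_means),  theoretical = theoretical_variance))
```

With 1000 replications the simulated values should land close to 5 and 0.625 respectively, though the exact figures depend on the seed.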
@@ -0,0 +1,10 @@
+name: Document
+title:
+username:
+account: rpubs
+server: rpubs.com
+hostUrl: rpubs.com
+appId: https://api.rpubs.com/api/v1/document/666390/732703cd577a43f6bc78b51c6ea04308
+bundleId: https://api.rpubs.com/api/v1/document/666390/732703cd577a43f6bc78b51c6ea04308
+url: http://rpubs.com/publish/claim/666390/1284b297f0dd4bf78e340a41469d0358
+when: 1601097026.03967
@@ -0,0 +1,10 @@
+name: Document
+title:
+username:
+account: rpubs
+server: rpubs.com
+hostUrl: rpubs.com
+appId: https://api.rpubs.com/api/v1/document/666328/609244398d1340bf99c4cfae58bb879a
+bundleId: https://api.rpubs.com/api/v1/document/666328/609244398d1340bf99c4cfae58bb879a
+url: http://rpubs.com/publish/claim/666328/7f37464851904714b83eea439bb2e2f7
+when: 1601083212.76761
@@ -10,4 +10,4 @@ NumSpacesForTab: 2
 Encoding: UTF-8
 
 RnwWeave: Sweave
-LaTeX: pdfLaTeX
+LaTeX: XeLaTeX