Commit 5212c79

Merge pull request #217 from pymc-labs/glossary
Convert static glossary into a proper interlinked sphinx glossary
2 parents 106041f + 889c0d4 commit 5212c79

8 files changed: +103 -84 lines changed

docs/source/glossary.md

-66
This file was deleted.

docs/source/glossary.rst

+83
@@ -0,0 +1,83 @@
+Glossary
+========
+
+.. glossary::
+    :sorted:
+
+    ANCOVA
+        Analysis of covariance is a simple linear model, typically with one continuous predictor (the covariate) and a categorical variable (which may correspond to treatment or control group). In the context of this package, ANCOVA could be useful in pre-post treatment designs, either with or without random assignment. This is similar to the approach of difference in differences, but only applicable with a single pre- and post-treatment measure.
+
+    Average treatment effect
+    ATE
+        The average treatment effect across all units.
+
+    Average treatment effect on the treated
+    ATT
+        The average effect of the treatment on the units that received it. Also called Treatment on the treated.
+
+    Change score analysis
+        A statistical procedure where the outcome variable is the difference between the posttest and pretest scores.
+
+    Comparative interrupted time-series
+    CITS
+        An interrupted time series design with added comparison time series observations.
+
+    Confound
+        Anything besides the treatment which varies across the treatment and control conditions.
+
+    Counterfactual
+        A hypothetical outcome that could or will occur under specific hypothetical circumstances.
+
+    Difference in differences
+    DiD
+        Analysis where the treatment effect is estimated as a difference between treatment conditions in the differences between pre-treatment and post-treatment observations.
+
+    Interrupted time series design
+    ITS
+        A quasi-experimental design to estimate a treatment effect where a series of observations are collected before and after a treatment. No control group is present.
+
+    Non-equivalent group designs
+    NEGD
+        A quasi-experimental design where units are assigned to conditions non-randomly, and not according to a running variable (see Regression discontinuity design).
+
+    One-group posttest-only design
+        A design where a single group is exposed to a treatment and assessed on an outcome measure. There is no pretest measure or comparison group.
+
+    Panel data
+        Time series data collected on multiple units where the same units are observed at each time point.
+
+    Pretest-posttest design
+        A quasi-experimental design where the treatment effect is estimated by comparing an outcome measure before and after treatment.
+
+    Quasi-experiment
+        An empirical comparison used to estimate the effects of a treatment where units are not assigned to conditions at random.
+
+    Random assignment
+        Where units are assigned to conditions at random.
+
+    Randomized experiment
+        An empirical comparison used to estimate the effects of treatments where units are assigned to treatment conditions randomly.
+
+    Regression discontinuity design
+        A quasi-experimental comparison to estimate a treatment effect where units are assigned to treatment conditions based on a cut-off score on a quantitative assignment variable (aka running variable).
+
+    Sharp regression discontinuity design
+        A Regression discontinuity design where allocation to treatment or control is determined by a sharp threshold / step function.
+
+    Synthetic control
+        The synthetic control method is a statistical method used to evaluate the effect of an intervention in comparative case studies. It involves the construction of a weighted combination of groups used as controls, to which the treatment group is compared.
+
+    Treatment on the treated effect
+    TOT
+        The average effect of the treatment on the units that received it. Also called the average treatment effect on the treated (ATT).
+
+    Treatment effect
+        The difference in outcomes between what happened after a treatment is implemented and what would have happened (see Counterfactual) if the treatment had not been implemented, assuming everything else had been the same.
+
+    Wilkinson notation
+        A notation for describing statistical models :footcite:p:`wilkinson1973symbolic`.
+
+
+References
+----------
+.. footbibliography::
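
For readers unfamiliar with Sphinx glossaries, here is a minimal sketch of how entries defined in a `.. glossary::` directive are usually cross-referenced from a reStructuredText page. The sentences and the link text are hypothetical and are not taken from the CausalPy documentation. The `:term:` role is the standard Sphinx mechanism for linking to a glossary entry. The notebooks changed in this commit use the MyST Markdown form of the same role, written `{term}` followed by the term in backticks.

```rst
.. Hypothetical sentences in an .rst page, shown for illustration only.

The :term:`DiD` approach compares changes over time between groups.

.. Custom link text, with the glossary term given in angle brackets.

We report the :term:`effect on treated units <ATT>`.
```

Because `DiD` and `ATT` are listed as additional terms above their definitions, each reference should resolve to the same entry as the corresponding full name.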

docs/source/index.rst

+1-1
@@ -126,7 +126,7 @@ Documentation outline
 .. toctree::
    :titlesonly:
 
-   glossary.md
+   glossary
 
 .. toctree::
    :caption: Examples

docs/source/notebooks/ancova_pymc.ipynb

+1-1
@@ -11,7 +11,7 @@
     "This is a preliminary example based on synthetic data. It will hopefully soon be updated with data from a real study.\n",
     ":::\n",
     "\n",
-    "In cases where there is just one pre and one post treatment measurement, we can analyse data from NEGD experiments using an ANCOVA type approach. The basic model is:\n",
+    "In cases where there is just one pre and one post treatment measurement, we can analyse data from {term}`NEGD` experiments using an {term}`ANCOVA` type approach. The basic model is:\n",
     "\n",
     "$$\n",
     "post_i = \\beta_0 + (\\beta_1 \\cdot T_i) + (\\beta_2 \\cdot pre_i) + \\epsilon_i\n",

docs/source/notebooks/did_pymc_banks.ipynb

+3-8
@@ -513,7 +513,7 @@
    "source": [
     "## Analysis 2 - DiD with multiple pre/post observations\n",
     "\n",
-    "Now we'll do a difference in differences analysis of the full dataset. This approach has similarities to CITS (Comparative Interrupted Time Series) with a single control over time. Although slightly arbitrary, we distinguish between the two techniques on whether there is enough time series data for CITS to capture the time series patterns."
+    "Now we'll do a difference in differences analysis of the full dataset. This approach has similarities to {term}`CITS` (Comparative Interrupted Time-Series) with a single control over time. Although slightly arbitrary, we distinguish between the two techniques on whether there is enough time series data for CITS to capture the time series patterns."
    ]
   },
   {
@@ -721,7 +721,7 @@
   "kernelspec": {
    "display_name": "CausalPy",
    "language": "python",
-   "name": "causalpy"
+   "name": "python3"
   },
   "language_info": {
    "codemirror_mode": {
@@ -733,12 +733,7 @@
    "name": "python",
    "nbconvert_exporter": "python",
    "pygments_lexer": "ipython3",
-   "version": "3.11.4"
-  },
-  "vscode": {
-   "interpreter": {
-    "hash": "46d31859cc45aa26a1223a391e7cf3023d69984b498bed11e66c690302b7e251"
-   }
+   "version": "3.10.8"
   }
  },
  "nbformat": 4,

docs/source/notebooks/geolift1.ipynb

+1-1
@@ -7,7 +7,7 @@
    "source": [
     "# Bayesian geolift with CausalPy\n",
     "\n",
-    "This notebook covers how to use `CausalPy`'s Bayesian synthetic control functionality to assess 'geolift'. Our hypothetical scenario is:\n",
+    "This notebook covers how to use `CausalPy`'s Bayesian {term}`synthetic control` functionality to assess 'geolift'. Our hypothetical scenario is:\n",
     "\n",
     "> We are a data scientist within a company that operates over Europe. We have been given a historical dataset of sales volumes, in units of 1000's. The data is broken down by country and was collected at weekly frequency. We have data for the past 4 years. \n",
     "\n",

docs/source/notebooks/sc_pymc_brexit.ipynb

+3-7
@@ -7,7 +7,7 @@
    "source": [
     "# The effects of Brexit\n",
     "\n",
-    "The aim of this notebook is to estimate the causal impact of Brexit upon the UK's GDP. This will be done using the synthetic control approach. As such, it is similar to the policy brief \"What can we know about the cost of Brexit so far?\" {cite:p}`brexit2022policybrief` from the Center for European Reform. That approach did not use Bayesian estimation methods however.\n",
+    "The aim of this notebook is to estimate the causal impact of Brexit upon the UK's GDP. This will be done using the {term}`synthetic control` approach. As such, it is similar to the policy brief \"What can we know about the cost of Brexit so far?\" {cite:p}`brexit2022policybrief` from the Center for European Reform. That approach did not use Bayesian estimation methods however.\n",
     "\n",
     "I did not use the GDP data from the above report however as it had been scaled in some way that was hard for me to understand how it related to the absolute GDP figures. Instead, GDP data was obtained courtesy of Prof. Dooruj Rambaccussing. Raw data is in units of billions of USD."
    ]
@@ -215,13 +215,9 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "<div class=\"alert alert-info\">\n",
-    "\n",
-    "Note:\n",
-    "\n",
+    ":::{note}\n",
     "The `random_seed` keyword argument for the PyMC sampler is not necessary. We use it here so that the results are reproducible.\n",
-    "\n",
-    "</div>"
+    ":::"
    ]
   },
   {

docs/source/references.bib

+11
@@ -40,3 +40,14 @@ @article{carpenter2009effect
   year={2009},
   publisher={American Economic Association}
 }
+
+@article{wilkinson1973symbolic,
+  title={Symbolic description of factorial models for analysis of variance},
+  author={Wilkinson, GN and Rogers, CE},
+  journal={Journal of the Royal Statistical Society Series C: Applied Statistics},
+  volume={22},
+  number={3},
+  pages={392--399},
+  year={1973},
+  publisher={Oxford University Press}
+}
