Lecture 05 - Elemental Confounds
Rose / Thorn
Rose: completely changed my approach to science
Thorn: why set M and not A in intervention?
Correlation
- correlation is common in nature, causation is sparse
- correlation over time series especially common
- scientific question/estimand -> recipe/estimator -> result/estimate
Association & Causation
- we must defend against confounding in our stats
- confounds mislead us - feature of the sample + how we use it
- four elemental confounds
- the causes of confounds are extremely diverse, but at their core they are all built from relationships among three variables
The Fork
- X and Y are associated \(Y \not\!\perp\!\!\!\perp X\) (not independent)
- Share a common cause Z
- Once stratified by Z, no association \(Y \perp\!\!\!\perp X | Z\)
library(ggdag)  # provides dagify(), ggdag(), theme_dag()
dag <- dagify(
  X ~ Z,
  Y ~ Z
)
ggdag(dag) +
  theme_dag()
- simulation:
library(rethinking)  # provides rbern() and inv_logit()
n <- 1000
Z <- rbern(n, 0.5)
X <- rbern(n, (1-Z)*0.01 + Z*0.9)
Y <- rbern(n, (1-Z)*0.01 + Z*0.9)
# correlated
cor(X, Y)
[1] 0.7962541
# no longer correlated when we pull out Z
cor(X[Z==0], Y[Z==0])
[1] -0.00895323
cor(X[Z==1], Y[Z==1])
[1] -0.08114448
cols <- c(4,2)
n <- 300
Z <- rbern(n, 0.5)
X <- rnorm(n, (1-Z)*0.01 + Z*0.9)
Y <- rnorm(n, (1-Z)*0.01 + Z*0.9)
# base graphics calls are separate statements, not chained with +
plot(X, Y, col=cols[Z+1])
abline(lm(Y[Z==1] ~ X[Z==1]), col = 2)
abline(lm(Y[Z==0] ~ X[Z==0]), col = 4)
abline(lm(Y ~ X))
example:
why do regions of the USA with higher rates of marriage also have higher rates of divorce?
estimand: causal effect of marriage rate on divorce rate
dag <- dagify(
  D ~ A,
  M ~ A,
  D ~ M
)
ggdag(dag) +
  theme_dag()
causes are not in the data
is effect of marriage rates on divorce rates just a symptom of common cause age?
need to break the fork to test direct effect – stratify by A
stratifying by a continuous variable essentially means letting that variable add to the intercept
stratifying means for every level of the stratified value, what is the association between the other two variables?
what does this mean for continuous variables?
every value of A produces a different intercept for the relationship between M and D
\[ \mu_i = (\alpha + \beta_A A_i) + \beta_M M_i \]
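A minimal sketch of what stratifying by a continuous variable does, using base R `lm()` rather than the `quap()` model used in these notes. The data are simulated and the effect sizes (-0.7, -0.6) are assumptions chosen for illustration, not estimates from the divorce data.

```r
# Hypothetical fork: A -> M and A -> D, with no arrow from M to D
set.seed(1)
n <- 5000
A <- rnorm(n)
M <- rnorm(n, mean = -0.7 * A, sd = 0.7)
D <- rnorm(n, mean = -0.6 * A, sd = 0.8)

# Without A, the common cause shows up as a slope on M
b_unadjusted <- unname(coef(lm(D ~ M))["M"])

# With A in the model, each value of A shifts the intercept,
# and the slope on M is estimated within levels of A
b_adjusted <- unname(coef(lm(D ~ M + A))["M"])

round(c(unadjusted = b_unadjusted, adjusted = b_adjusted), 2)
```

Under these assumptions the unadjusted slope should be clearly positive, while the adjusted slope should sit near zero because M has no direct effect in the simulation.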
standardizing variables is almost always helpful in linear regression
- transform measurement scales to the scale we want, as long as we remember how we changed them
- make the mean zero and divide by sd
- alpha is the expected divorce rate when the predictors are at their means; with standardized variables it should be close to 0, hence the tight prior
- beta should fall between -1 and 1 for standardized variables, because a slope of 1 or -1 would already mean the predictor explains all of the variation
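A small sketch of the two points above: standardizing by hand, and checking how often a Normal(0, 0.5) slope prior produces slopes beyond ±1. The raw values are made up, and `standardize_vec` is a hypothetical helper standing in for `standardize()` from rethinking.

```r
# Standardize: subtract the mean, then divide by the standard deviation
standardize_vec <- function(x) (x - mean(x)) / sd(x)

set.seed(2)
raw <- rnorm(50, mean = 10, sd = 2)  # made-up rates on the raw scale
z <- standardize_vec(raw)
c(mean = mean(z), sd = sd(z))        # mean ~ 0, sd = 1

# Draw intercepts and slopes from the priors
n_draws <- 1000
a <- rnorm(n_draws, 0, 0.2)
b <- rnorm(n_draws, 0, 0.5)

# Share of prior slopes that would be implausibly strong
frac_extreme <- mean(abs(b) > 1)
frac_extreme
```

Only a small fraction of prior slopes should land outside ±1, which is the sense in which these priors keep the lines within the plausible range of standardized data.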
to stratify by A, include as a term in the linear model:
\(D_i \sim \text{Normal}(\mu_i, \sigma)\)
\(\mu_i = \alpha + \beta_M M_i + \beta_A A_i\)
\(\alpha \sim \text{Normal}(0, 0.2)\)
\(\beta_M \sim \text{Normal}(0, 0.5)\)
\(\beta_A \sim \text{Normal}(0, 0.5)\)
\(\sigma \sim \text{Exponential}(1)\)
library(rethinking)
data(WaffleDivorce)
d <- WaffleDivorce

# model
dat <- list(
  D = standardize(d$Divorce),
  M = standardize(d$Marriage),
  A = standardize(d$MedianAgeMarriage)
)
m_DMA <- quap(
  alist(
    D ~ dnorm(mu,sigma),
    mu <- a + bM*M + bA*A,
    a ~ dnorm(0,0.2),
    bM ~ dnorm(0,0.5),
    bA ~ dnorm(0,0.5),
    sigma ~ dexp(1)
  ) , data=dat )
plot(precis(m_DMA))
a causal effect is a manipulation of the generative model, an intervention
distribution of D when we intervene (“do”) M \(p(D|do(M))\)
- implies deleting all arrows into M and simulating D
- set values of M while holding A constant
post <- extract.samples(m_DMA)

# sample A from data
n <- 1e3
As <- sample(dat$A, size = n, replace = T)

# simulate D for M=0 (sample mean)
DM0 <- with(post, rnorm(n, a + bM*0 + bA*As, sigma))

# simulate D for M = 1 (+1 standard deviation)
# use the same A values
DM1 <- with(post, rnorm(n, a + bM*1 + bA*As, sigma))

# contrast
M10_contrast <- DM1 - DM0
dens(M10_contrast, lwd=4, col=2, xlab="effect of increase in M")
The Pipe
X and Y are associated \(Y \not\!\perp\!\!\!\perp X\)
Influence of X on Y transmitted through Z
Once stratified by Z, no association \(Y \perp\!\!\!\perp X | Z\)
dag <- dagify(
  Y ~ Z,
  Z ~ X
)
ggdag(dag) +
  theme_dag()
n <- 1000
X <- rbern(n, 0.5)
Z <- rbern(n, (1-X)*0.01 + X*0.9)
Y <- rbern(n, (1-Z)*0.01 + Z*0.9)
cor(X,Y)
[1] 0.7876689
everything that Y knows about X, is already known by Z
- once you learn Z, there is nothing more to learn about the association
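A sketch of the stratified check for the pipe, in base R. `rbern01` is a hypothetical stand-in for `rbern()` from rethinking, and the probabilities 0.1 and 0.9 are assumptions (softer than the 0.01 used above) so that both strata of Z contain enough variation in X and Y.

```r
# Bernoulli draws via base R
rbern01 <- function(n, p) rbinom(n, size = 1, prob = p)

set.seed(5)
n <- 1000
X <- rbern01(n, 0.5)
Z <- rbern01(n, (1 - X) * 0.1 + X * 0.9)  # X -> Z
Y <- rbern01(n, (1 - Z) * 0.1 + Z * 0.9)  # Z -> Y

# Association before stratifying
cor_all <- cor(X, Y)

# Association within each level of the mediator Z
cor_z0 <- cor(X[Z == 0], Y[Z == 0])
cor_z1 <- cor(X[Z == 1], Y[Z == 1])

round(c(all = cor_all, Z0 = cor_z0, Z1 = cor_z1), 2)
```

The overall correlation should be strong, while the correlations within each level of Z should be close to zero, since Z already carries everything Y knows about X.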
cols <- c(4,2)
n <- 300
X <- rbern(n)
Z <- rbern(n, inv_logit(X))
# Y is continuous: 2*Z-1 is a mean, not a probability, so use rnorm() rather than rbern()
Y <- rnorm(n, (2*Z-1))
# plot
- including the mediator at the wrong time can lead to incorrect inference
dag <- dagify(
  H1 ~ H0 + Treat,
  H1 ~ Fungus,
  Fungus ~ Treat
)
ggdag(dag) +
  theme_dag()
what is the total causal effect of treatment?
Treat -> Fungus -> H1 is a pipe, should not stratify by F
post-treatment bias
if you stratify by a consequence of the treatment, you can induce post-treatment bias, which gives a misleading estimate of the effect you are after
consequences of treatment should not usually be included in the estimator
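A simulated sketch of post-treatment bias for the plant example, using base R `lm()`. All numbers (growth of 5, a fungus penalty of 3, fungus probabilities 0.5 and 0.1) are assumptions for illustration; under them the true total effect of treatment on H1 is 3 * 0.4 = 1.2.

```r
set.seed(71)
n <- 2000
H0 <- rnorm(n, 10, 2)                        # initial height
Treat <- rep(0:1, each = n / 2)              # antifungal treatment
Fungus <- rbinom(n, size = 1, prob = 0.5 - 0.4 * Treat)  # Treat -> Fungus
H1 <- H0 + rnorm(n, mean = 5 - 3 * Fungus)   # Fungus -> H1

# Total effect: leave the mediator out of the estimator
b_total <- unname(coef(lm(H1 ~ H0 + Treat))["Treat"])

# Stratifying by Fungus blocks the pipe and hides the effect
b_blocked <- unname(coef(lm(H1 ~ H0 + Treat + Fungus))["Treat"])

round(c(total = b_total, with_fungus = b_blocked), 2)
```

With Fungus in the model the treatment coefficient should collapse toward zero, even though the treatment works in this simulation, because its whole effect runs through Fungus.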
The Collider
X and Y are not associated (share no causes) \(Y \perp\!\!\!\perp X\)
X and Y both influence Z
Once stratified by Z, X and Y are associated \(Y \not\!\perp\!\!\!\perp X | Z\)
dag <- dagify(
  Z ~ X + Y
)
ggdag(dag) +
  theme_dag()
n <- 1000
X <- rbern(n, 0.5)
Y <- rbern(n, 0.5)
Z <- rbern(n, ifelse(X+Y>0, 0.9, 0.2))
- strong associations induced by the collider could be read as causation, but causes are not in the data
cols <- c(4,2)
N <- 300
X <- rnorm(N)
Y <- rnorm(N)
Z <- rbern(N, inv_logit(2*X+2*Y-2))

# the graphical parameter is col, not cols; base graphics calls are not chained with +
plot(X, Y, col = cols[Z+1])
abline(lm(Y[Z==1] ~ X[Z==1]), col = 2)
abline(lm(Y[Z==0] ~ X[Z==0]), col = 4)
abline(lm(Y ~ X))
sometimes samples come already stratified by collider
associations among the things you have measured after selection are dangerous to interpret, because the selection itself often acts as a collider
endogenous colliders
if you include a collider in your estimator you can induce a spurious correlation
happens within your analysis
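A sketch of an endogenous collider in base R: X and Y are simulated as independent, and adding their common effect Z to a regression creates an association between them. `plogis()` is the base R inverse logit; the coefficients in the Z model are assumptions for illustration.

```r
set.seed(11)
n <- 2000
X <- rnorm(n)
Y <- rnorm(n)                                  # independent of X by construction
Z <- rbinom(n, 1, plogis(2 * X + 2 * Y - 2))   # X -> Z <- Y

# Without the collider: slope on X should be near zero
b_plain <- unname(coef(lm(Y ~ X))["X"])

# With the collider in the estimator: a spurious negative slope appears
b_collider <- unname(coef(lm(Y ~ X + Z))["X"])

round(c(without_Z = b_plain, with_Z = b_collider), 2)
```

Nothing about the data changed between the two fits; the spurious slope comes entirely from the choice to include Z.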
example: age and happiness
estimand: influence of age on happiness
possible confound: marital status
suppose age has zero influence on happiness, but that both age and happiness influence marital status
dag <- dagify(
  M ~ A + H
)
ggdag(dag) +
  theme_dag()
- stratified by marital status, there is a negative association between age and happiness even though age has no causal influence on happiness
The Descendant
how it behaves depends on what it is attached to
A is the descendant, contains information of its “parent”
X and Y are causally associated through Z \(Y \not\!\perp\!\!\!\perp X\)
A holds information about Z
Once stratified by A, X and Y less associated (if strong enough) \(Y \perp\!\!\!\perp X | A\)
dag <- dagify(
  Z ~ X,
  Y ~ Z,
  A ~ Z
)
ggdag(dag) +
  theme_dag()
n <- 1000
x <- rbern(n, 0.5)
z <- rbern(n, (1-x)*0.1 + x*0.9)
y <- rbern(n, (1-z)*0.1 + z*0.9)
a <- rbern(n, (1-z)*0.1 + z*0.9)
- descendants are everywhere - proxies
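A sketch of how a descendant behaves as a weak proxy, in base R with the same 0.1 / 0.9 probabilities as the simulation above. `rb` is a hypothetical stand-in for `rbern()`; the larger sample size is an assumption to make the stratified correlations stable.

```r
rb <- function(n, p) rbinom(n, size = 1, prob = p)

set.seed(9)
n <- 5000
x <- rb(n, 0.5)
z <- rb(n, (1 - x) * 0.1 + x * 0.9)  # x -> z
y <- rb(n, (1 - z) * 0.1 + z * 0.9)  # z -> y
a <- rb(n, (1 - z) * 0.1 + z * 0.9)  # z -> a (descendant of z)

# Association between x and y overall
cor_all <- cor(x, y)

# Stratifying by the descendant a: weaker, but not gone
cor_a0 <- cor(x[a == 0], y[a == 0])
cor_a1 <- cor(x[a == 1], y[a == 1])

round(c(all = cor_all, a0 = cor_a0, a1 = cor_a1), 2)
```

Because a is only a noisy copy of z, stratifying by it should shrink the x-y association without removing it; the closer a tracks z, the closer this gets to stratifying by z itself.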