In statistics, econometrics, epidemiology, genetics and related disciplines, '''causal graphs''' (also known as [[Path analysis (statistics)|path diagrams]], causal [[Bayesian networks]] or [[Directed Acyclic Graph|DAGs]]) are [[graphical models|probabilistic graphical models]] used to encode assumptions about the data-generating process. |
Causal graphs can be used for communication and for inference. They are complementary to other forms of causal reasoning, for instance using [[causal equality notation]]. As communication devices, the graphs provide formal and transparent representation of the causal assumptions that researchers may wish to convey and defend. As inference tools, the graphs enable researchers to estimate effect sizes from non-experimental data,<ref name=causality>{{cite book|last1=Pearl|first1=Judea|author-link=Judea Pearl|title=Causality|url=https://archive.org/details/causalitymodelsr0000pear|url-access=registration|date=2000|publisher=MIT Press|location=Cambridge, MA|isbn=9780521773621 }}</ref><ref>{{cite book|last1=Tian|first1=Jin|last2=Pearl|first2=Judea|chapter=A general identification condition for causal effects|title=Proceedings of the Eighteenth National Conference on Artificial Intelligence|date=2002|chapter-url=https://escholarship.org/uc/item/17r754xz|isbn=978-0-262-51129-2 }}</ref><ref>{{cite journal|last1=Shpitser|first1=Ilya|last2=Pearl|first2=Judea|title=Complete Identification Methods for the Causal Hierarchy|url=http://www.jmlr.org/papers/volume9/shpitser08a/shpitser08a.pdf|journal=Journal of Machine Learning Research|date=2008|volume=9|pages=1941–1979}}</ref><ref>{{cite conference|last1=Huang|first1=Y.|last2=Valtorta|first2=M.|title=Identifiability in Causal Bayesian Networks: A Sound and Complete Algorithm|conference=Proceedings of the Twenty-First National Conference on Artificial Intelligence (AAAI-06)|date=2006|url=https://www.aaai.org/Papers/AAAI/2006/AAAI06-180.pdf}}</ref><ref>{{cite book|last1=Bareinboim|first1=Elias|last2=Pearl|first2=Judea|chapter=Causal Inference by Surrogate Experiments: z-Identifiability|title=Proceedings of the Twenty-Eighth Conference on Uncertainty in Artificial Intelligence|date=2012|isbn=978-0-9749039-8-9 |arxiv=1210.4842|bibcode=2012arXiv1210.4842B}}</ref> derive [[testable]] implications of the assumptions encoded,<ref name=causality /><ref>{{cite book|last1=Tian|first1=Jin|last2=Pearl|first2=Judea|chapter=On the Testable Implications of Causal Models with Hidden Variables|title=Proceedings of the Eighteenth Conference on Uncertainty in Artificial Intelligence|date=2002|pages=519–27 |isbn=978-1-55860-897-9 |arxiv=1301.0608 |bibcode=2013arXiv1301.0608T}}</ref><ref>{{cite conference|last1=Shpitser|first1=Ilya|last2=Pearl|first2=Judea|title=Dormant Independence|conference=Proceedings of the Twenty-Third AAAI Conference on Artificial Intelligence (AAAI-08)|date=2008}}</ref><ref>{{cite journal|last1=Chen|first1=Bryant|last2=Pearl|first2=Judea |title=Testable Implications of Linear Structural Equation Models |journal=Proceedings of the AAAI Conference on Artificial Intelligence |date=2014|volume=28 |doi=10.1609/aaai.v28i1.9065 |s2cid=1612893 |url=https://ojs.aaai.org/index.php/AAAI/article/download/9065/8924|doi-access=free}}</ref> test for external validity,<ref>{{cite journal|last1=Bareinboim|first1=Elias|last2=Pearl|first2=Judea|title=External Validity: From do-calculus to Transportability across Populations|journal=Statistical Science|date=2014|doi=10.1214/14-sts486|volume=29|issue=4|pages=579–595|arxiv=1503.01603|s2cid=5586184 }}</ref> and manage missing data<ref>{{cite journal|last1=Mohan|first1=Karthika|last2=Pearl|first2=Judea|last3=Tian|first3=Jin|title=Graphical Models for Inference with Missing Data|journal=Advances in Neural Information Processing Systems|date=2013|url=https://proceedings.neurips.cc/paper/2013/file/0ff8033cf9437c213ee13937b1c4c455-Paper.pdf}}</ref> and selection bias.<ref>{{cite journal|last1=Bareinboim|first1=Elias|last2=Tian|first2=Jin|last3=Pearl|first3=Judea |title=Recovering from Selection Bias in Causal and Statistical Inference |journal=Proceedings of the AAAI Conference on Artificial Intelligence |date=2014|volume=28 |doi=10.1609/aaai.v28i1.9074 |url=https://ojs.aaai.org/index.php/AAAI/article/view/9074/8933|doi-access=free}}</ref>

Causal graphs were first used by the geneticist [[Sewall Wright]]<ref>{{cite journal | last1 = Wright | first1 = S. | year = 1921 | title = Correlation and causation | journal = Journal of Agricultural Research | volume = 20 | pages = 557–585 }}</ref> under the rubric "path diagrams". They were later adopted by social scientists<ref>{{cite journal | last1 = Blalock| first1 = H. M. | year = 1960 | title = Correlational analysis and causal inferences | doi = 10.1525/aa.1960.62.4.02a00060 | journal = American Anthropologist | volume = 62| issue = 4| pages = 624–631 | doi-access = free }}</ref><ref>{{cite journal | last1 = Duncan| first1 = O. D. | year = 1966 | title = Path analysis: Sociological examples | journal = American Journal of Sociology | volume = 72| pages = 1–16 | doi=10.1086/224256| s2cid = 59428866 }}</ref><ref>{{cite journal | last1 = Duncan| first1 = O. D. | year = 1976 | title = Introduction to structural equation models | journal = American Journal of Sociology | volume = 82 | issue = 3 | pages = 731–733 | doi=10.1086/226377}}</ref><ref>{{cite journal | last1 = Jöreskog| first1 = K. G. | year = 1969 | title = A general approach to confirmatory maximum likelihood factor analysis | journal = Psychometrika | volume = 34| issue = 2| pages = 183–202 | doi=10.1007/bf02289343| s2cid = 186236320 }}</ref><ref>{{cite book|last1=Goldberger|first1=A. S.|last2=Duncan|first2=O. D.|title=Structural equation models in the social sciences|date=1973|publisher=Seminar Press|location=New York}}</ref><ref>{{cite journal|last1=Goldberger|first1=A. S.|title=Structural equation methods in the social sciences|journal=Econometrica|volume=40|issue=6|date=1972|pages=979–1001|doi=10.2307/1913851|jstor=1913851}}</ref> and, to a lesser extent, by economists.<ref>{{cite journal|last1=White|first1=Halbert|last2=Chalak|first2=Karim|last3=Lu|first3=Xun|title=Linking Granger causality and the Pearl causal model with settable systems|journal=Causality in Time Series Challenges in Machine Learning|date=2011|volume=5|url=http://proceedings.mlr.press/v12/white11/white11.pdf}}</ref> These models were initially confined to linear equations with fixed parameters. Modern developments have extended graphical models to non-parametric analysis, and thus achieved a generality and flexibility that has transformed causal analysis in computer science, epidemiology,<ref>{{cite book|last1=Rothman|first1=Kenneth J.|last2=Greenland|first2=Sander|last3=Lash|first3=Timothy|title=Modern epidemiology|date=2008|publisher=Lippincott Williams & Wilkins}}</ref> and social science.<ref>{{cite book|last1=Morgan|first1=S. L.|last2=Winship|first2=C.|title=Counterfactuals and causal inference: Methods and principles for social research|date=2007|publisher=Cambridge University Press|location=New York}}</ref>

==Construction and terminology== |
The causal graph can be drawn in the following way. Each variable in the model has a corresponding vertex or node and an arrow is drawn from a variable ''X'' to a variable ''Y'' whenever ''Y'' is judged to respond to changes in ''X'' when all other variables are being held constant. Variables connected to ''Y'' through direct arrows are called ''parents'' of ''Y'', or "direct causes of ''Y''," and are denoted by ''Pa(Y)''.
Causal models often include "error terms" or "omitted factors" which represent all unmeasured factors that influence a variable ''Y'' when ''Pa(Y)'' are held constant. In most cases, error terms are excluded from the graph. However, if the graph author suspects that the error terms of any two variables are dependent (e.g. the two variables have an unobserved or latent common cause) then a bidirected arc is drawn between them. Thus, the presence of latent variables is taken into account through the correlations they induce between the error terms, as represented by bidirected arcs. |
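The parent relation can be read off mechanically from the arrow list. The following is an illustrative sketch (the `parents` helper and the edge encoding are our own, not a standard library; the variable names anticipate the college example below):

```python
# Arrows of a causal graph, one (cause, effect) pair per arrow.
edges = [("Q1", "C"), ("Q1", "Q2"), ("C", "Q2"), ("C", "S"), ("Q2", "S")]

def parents(node, edges):
    """Pa(node): the direct causes of node, i.e. every X with an arrow X -> node."""
    return {x for x, y in edges if y == node}

print(sorted(parents("S", edges)))   # ['C', 'Q2']
print(sorted(parents("Q1", edges)))  # []
```

A variable with no incoming arrows, such as `Q1` here, has an empty parent set; its behavior is determined entirely by its error term.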
==Fundamental tools== |
A fundamental tool in graphical analysis is [[Bayesian network#d-separation|d-separation]], which allows researchers to determine, by inspection, whether the causal structure implies that two sets of variables are independent given a third set. In recursive models without correlated error terms (sometimes called ''Markovian''), these conditional independences represent all of the model's testable implications.<ref>{{cite journal|last1=Geiger|first1=Dan|last2=Pearl|first2=Judea|title=Logical and Algorithmic Properties of Conditional Independence|journal=Annals of Statistics|date=1993|volume=21|issue=4|pages=2001–2021|doi=10.1214/aos/1176349407|citeseerx=10.1.1.295.2043}}</ref> |
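d-separation can be implemented with the standard moralized-ancestral-graph reduction: restrict the DAG to the queried nodes and their ancestors, "marry" co-parents, drop arrow directions, delete the conditioning set, and test whether any path remains. A minimal sketch, with our own graph encoding (the function is not from a library, and the three node sets are assumed pairwise disjoint):

```python
from itertools import combinations

def d_separated(edges, xs, ys, zs):
    """Test whether node sets xs and ys are d-separated given zs in the DAG
    whose arrows are the (parent, child) pairs in `edges`.
    Assumes xs, ys, zs are pairwise disjoint."""
    xs, ys, zs = set(xs), set(ys), set(zs)
    nodes = {n for edge in edges for n in edge}
    pa = {n: {a for a, b in edges if b == n} for n in nodes}
    # 1. Keep only the queried nodes and their ancestors.
    relevant, frontier = set(), xs | ys | zs
    while frontier:
        n = frontier.pop()
        relevant.add(n)
        frontier |= pa.get(n, set()) - relevant
    # 2. Moralize: undirect every remaining arrow and link co-parents.
    und = {(a, b) for a, b in edges if a in relevant and b in relevant}
    for n in relevant:
        und |= set(combinations(sorted(pa.get(n, set()) & relevant), 2))
    # 3. Delete the conditioning set, then search for any surviving path.
    adj = {n: set() for n in relevant - zs}
    for a, b in und:
        if a in adj and b in adj:
            adj[a].add(b)
            adj[b].add(a)
    seen, stack = set(), list(xs)
    while stack:
        n = stack.pop()
        if n in ys:
            return False  # an unblocked path connects xs to ys
        if n not in seen:
            seen.add(n)
            stack.extend(adj[n] - seen)
    return True

# The graph of the college example below: Q1 -> C, Q1 -> Q2, C -> Q2, C -> S, Q2 -> S.
edges = [("Q1", "C"), ("Q1", "Q2"), ("C", "Q2"), ("C", "S"), ("Q2", "S")]
print(d_separated(edges, {"Q1"}, {"S"}, {"C", "Q2"}))  # True
print(d_separated(edges, {"Q1"}, {"S"}, set()))        # False
```

The ancestor restriction in step 1 is what makes colliders behave correctly: a common effect that is neither conditioned on nor an ancestor of a conditioned variable never enters the moral graph, so it blocks the path.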
==Example== |

Suppose we wish to estimate the effect of attending an elite college on future earnings. Simply regressing earnings on college rating will not give an unbiased estimate of the target effect because elite colleges are highly selective, and students attending them are likely to have qualifications for high-earning jobs prior to attending the school. Assuming that the causal relationships are linear, this background knowledge can be expressed in the following structural equation model (SEM) specification.

'''Model 1'''

: <math>
\begin{align}
Q_1 &= U_1\\
C &= a \cdot Q_1 + U_2\\
Q_2 &= c \cdot C + d \cdot Q_1 + U_3\\
S &= b \cdot C + e \cdot Q_2 + U_4,
\end{align}</math>

where <math>Q_1</math> represents the individual's qualifications prior to college, <math>Q_2</math> represents qualifications after college, <math>C</math> contains attributes representing the quality of the college attended, and <math>S</math> the individual's salary. |
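The confounding problem can be seen numerically by simulating Model 1. In this illustrative sketch all structural coefficients are arbitrarily set to 1 and the error terms are taken as independent standard normals (assumptions of the sketch, not of the model): regressing <math>S</math> on <math>C</math> alone overstates the effect of college quality, while also adjusting for <math>Q_1</math> recovers the total effect <math>b + e \cdot c</math>.

```python
import random
import statistics as st

random.seed(0)
n = 100_000
a = b = c = d = e = 1.0  # illustrative coefficient values

# Simulate Model 1 with independent standard-normal error terms.
Q1 = [random.gauss(0, 1) for _ in range(n)]
C = [a * q1 + random.gauss(0, 1) for q1 in Q1]
Q2 = [c * ci + d * q1 + random.gauss(0, 1) for ci, q1 in zip(C, Q1)]
S = [b * ci + e * q2 + random.gauss(0, 1) for ci, q2 in zip(C, Q2)]

def cov(u, v):
    mu, mv = st.fmean(u), st.fmean(v)
    return st.fmean((x - mu) * (y - mv) for x, y in zip(u, v))

# OLS slope of S on C alone: biased upward by the open path C <- Q1 -> Q2 -> S.
naive = cov(S, C) / cov(C, C)
# OLS slope on C when Q1 is also included: recovers b + e*c = 2.
det = cov(C, C) * cov(Q1, Q1) - cov(C, Q1) ** 2
adjusted = (cov(S, C) * cov(Q1, Q1) - cov(S, Q1) * cov(C, Q1)) / det
print(f"naive={naive:.2f} adjusted={adjusted:.2f}")  # naive ~2.5, adjusted ~2.0
```

Conditioning on <math>Q_1</math> blocks the back-door path, which is exactly what d-separation predicts for this graph.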
[[File:College notID proj.png|thumb|Figure 2: Unidentified model with latent variables summarized]] |
Figure 1 is a causal graph that represents this model specification. Each variable in the model has a corresponding node or vertex in the graph. Additionally, for each equation, arrows are drawn from the independent variables to the dependent variables. These arrows reflect the direction of causation. In some cases, we may label the arrow with its corresponding structural coefficient as in Figure 1. |
If <math>Q_1</math> and <math>Q_2</math> are unobserved or latent variables their influence on <math>C</math> and <math>S</math> can be attributed to their error terms. By removing them, we obtain the following model specification: |
'''Model 2'''

: <math>
\begin{align}
C &= U_C \\
S &= \beta C + U_S
\end{align}</math>

The background information specified by Model 1 implies that the error term of <math>S</math>, <math>U_S</math>, is correlated with ''C''{{'s}} error term, <math>U_C</math>. As a result, we add a bidirected arc between ''S'' and ''C'', as in Figure 2.
[[File:College.png|thumb|Figure 3: Identified model with latent variables (<math>Q_1</math> and <math> Q_2 </math>) shown explicitly]] |

Since <math>U_S</math> is correlated with <math>U_C</math> and, therefore, <math>C</math>, <math>C</math> is endogenous and <math>\beta</math> is not identified in Model 2. However, if we include the strength of an individual's college application, <math>A</math>, as shown in Figure 3, we obtain the following model:

'''Model 3'''

: <math>
\begin{align}
Q_1 &= U_1\\
A &= a \cdot Q_1 + U_2\\
C &= b \cdot A + U_3\\
Q_2 &= e \cdot Q_1 + d \cdot C + U_4\\
S &= c \cdot C + f \cdot Q_2 + U_5,
\end{align}</math>

By removing the latent variables from the model specification we obtain: |
'''Model 4'''

: <math>
\begin{align}
A &= a \cdot Q_1 + U_A \\
C &= b \cdot A + U_C \\
S &= \beta \cdot C + U_S,
\end{align}</math>

with <math>U_A</math> correlated with <math>U_S</math>. |
Now, <math>\beta</math> is identified and can be estimated using the regression of <math>S</math> on <math>C</math> and <math>A</math>. This can be verified using the ''single-door criterion'',<ref name=causality /><ref>{{cite journal|last1=Chen|first1=B.|last2=Pearl|first2=J.|title=Graphical Tools for Linear Structural Equation Modeling|journal=Technical Report|date=2014|url=https://apps.dtic.mil/sti/pdfs/ADA609131.pdf}}</ref> a necessary and sufficient graphical condition for the identification of a structural coefficient, like <math>\beta</math>, using regression.
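The single-door logic can be checked by simulation. In this hypothetical sketch an explicit latent factor <math>h</math> stands in for whatever induces the correlation between <math>U_A</math> and <math>U_S</math>, and all coefficient values are arbitrary: regressing <math>S</math> on <math>C</math> alone is biased by the unblocked path <math>C \leftarrow A \leftrightarrow S</math>, while also including <math>A</math> recovers <math>\beta</math>.

```python
import random
import statistics as st

random.seed(1)
n = 100_000
beta, b = 1.5, 0.8  # arbitrary true values for C -> S and A -> C

# Latent factor h loads on both U_A and U_S, creating the bidirected arc A <-> S.
h = [random.gauss(0, 1) for _ in range(n)]
A = [hi + random.gauss(0, 1) for hi in h]                          # U_A contains h
C = [b * ai + random.gauss(0, 1) for ai in A]
S = [beta * ci + hi + random.gauss(0, 1) for ci, hi in zip(C, h)]  # U_S contains h

def cov(u, v):
    mu, mv = st.fmean(u), st.fmean(v)
    return st.fmean((x - mu) * (y - mv) for x, y in zip(u, v))

# S on C alone: biased, because the path C <- A <-> S is open.
naive = cov(S, C) / cov(C, C)
# S on C and A: A blocks that path, so the slope on C estimates beta.
det = cov(C, C) * cov(A, A) - cov(C, A) ** 2
single_door = (cov(S, C) * cov(A, A) - cov(S, A) * cov(C, A)) / det
print(f"naive={naive:.2f} single_door={single_door:.2f}")  # single_door ~1.5
```

Given <math>A</math>, the remaining variation in <math>C</math> is independent of the latent factor, which is why the adjusted slope is consistent for <math>\beta</math>.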
{{clear}} |
==References== |
{{reflist}} |
[[Category:Graphical models]] |
Latest revision as of 23:17, 3 April 2024