class: left, middle, inverse, title-slide

# A Quick Introduction to `ggdag`
### Sakeef M. Karim
McGill University
Consortium on Analytics for Data-Driven Decision-Making
### December 1, 2021
---

<style type="text/css">
.highlight-last-item > ul > li,
.highlight-last-item > ol > li {
  opacity: 0.5;
}
.highlight-last-item > ul > li:last-of-type,
.highlight-last-item > ol > li:last-of-type {
  opacity: 1;
}
.huge .remark-code {
  font-size: 200% !important;
}
.smaller .remark-code {
  font-size: 75% !important;
}
.evensmaller .remark-code {
  font-size: 65% !important;
}
.tiny .remark-code {
  font-size: 50% !important;
}
</style>

# What We’ll Do Today

--

.can-edit[
+ Load the packages you'll need to generate causal diagrams in R

```r
library(dagitty)
library(tidyverse)
*library(ggdag)
library(ggthemes)
```
]

--

+ Revisit an example from last week's [asynchronous module](https://canvas.ubc.ca/courses/75578/assignments/1079082) on directed acyclic graphs

--

+ Learn how to use `dagitty` and `ggdag` in tandem

--

+ Use `ggdag`'s built-in functions as _tools_ to identify sources of bias

--

+ Customize our DAGs using `ggplot2` in conjunction with `ggdag`

---
class: inverse center middle

# A DAG Example

---

# Pre-Natal Vitamins and Birth Defects

.panelset[
.panel[.panel-name[Using `DAGitty.net`]

> [DAGitty.net](http://www.dagitty.net/dags.html)

<img src="./media/DAGitty online.png" width="45%" height="40%" style="display: block; margin: auto;" />

]

.panel[.panel-name[Using the `dagitty` package in `R`]

> [dagitty in R](http://dagitty.net/primer/)

<img src="index_files/figure-html/dagitty_r-1.png" width="65%" height="75%" style="display: block; margin: auto;" />

]

.panel[.panel-name[Using `ggdag` (in conjunction with `ggplot2`)]

> [ggdag](https://ggdag.malco.io/index.html) (alongside `ggplot2`)

<img src="index_files/figure-html/ggdag_ggplot-1.png" width="80%" height="80%" style="display: block; margin: auto;" />

]
]

---
class: inverse center middle

# Using <code><font color="white">dagitty</font></code> and <code><font color="white">ggdag</font></code>

---

# Combining `dagitty` and `ggdag`

.pull-left[

.footnote[
**Step 1**: Create a DAG using [DAGitty.net](http://www.dagitty.net/dags.html). Copy the model code.
]

<img src="media/step_1.png" width="100%" />

]

--

.pull-right[

.footnote[
**Step 2**: Use `dagitty::dagitty()` and the pasted model code to generate a DAG object within R.
]

.can-edit[
.smaller[

```r
base_dagitty <- dagitty('dag {
A [pos="-2.200,-1.520"]
B [pos="1.400,-1.460"]
D [outcome,pos="1.400,1.621"]
E [exposure,pos="-2.200,1.597"]
Z [pos="-0.300,-0.082"]
A -> E
A -> Z [pos="-0.791,-1.045"]
B -> D
B -> Z [pos="0.680,-0.496"]
E -> D
}')
```

]
]
]

---

# Combining `dagitty` and `ggdag`

.pull-left[
.smaller[

```r
dag_cand3 <- dagitty('dag {
Birth_Defects [outcome,pos="0.109,0.631"]
Difficulty_Conceiving [pos="0.117,-1.517"]
Genetics [pos="0.850,-0.411"]
PNC [pos="-0.837,-0.433"]
SES [pos="-1.839,-1.468"]
Vitamins [exposure,pos="-1.844,0.645"]
Difficulty_Conceiving -> PNC
Genetics -> Birth_Defects
Genetics -> Difficulty_Conceiving
PNC -> Birth_Defects
PNC -> Vitamins
SES -> PNC
SES -> Vitamins
Vitamins -> Birth_Defects
}')

*dag_cand3 %>% ggdag()
```

]
]

--

.pull-right[

<br>

<img src="index_files/figure-html/unnamed-chunk-2-1.png" width="100%" style="display: block; margin: auto;" />

]

---
class: inverse center middle

# Using <code><font color="white">ggdag</font></code> as a Tool

---

# Identifying Covariate Adjustment Sets

```r
*dag_cand3 %>% ggdag_adjustment_set()
```

--

<img src="index_files/figure-html/unnamed-chunk-3-1.png" width="100%" style="display: block; margin: auto;" />

---

# Identifying Colliders

```r
*dag_cand3 %>% ggdag_collider()
```

--

<img src="index_files/figure-html/unnamed-chunk-4-1.png" width="100%" style="display: block; margin: auto;" />

---

# Identifying a Variable's Ancestors

```r
# In this case, the ancestors of "pre-natal care!"
*dag_cand3 %>% ggdag_ancestors("PNC")
```

--

<img src="index_files/figure-html/unnamed-chunk-5-1.png" width="85%" height="85%" style="display: block; margin: auto;" />

---

# Identifying Open Paths

```r
# Non-faceted version:
*dag_cand3 %>% ggdag_paths_fan(from = "Vitamins", to = "Birth_Defects")
```

--

<img src="index_files/figure-html/unnamed-chunk-6-1.png" width="85%" height="85%" style="display: block; margin: auto;" />

---

# Identifying Open Paths

```r
# Faceted version:
*dag_cand3 %>% ggdag_paths(from = "Vitamins", to = "Birth_Defects")
```

--

<img src="index_files/figure-html/unnamed-chunk-7-1.png" width="85%" height="85%" style="display: block; margin: auto;" />

---

# Identifying Open Paths

What if we conditioned on a variable, say `PNC`?

```r
dag_cand3 %>% ggdag_paths(from = "Vitamins", to = "Birth_Defects",
*                         adjust_for = "PNC", shadow = TRUE)
```

<img src="index_files/figure-html/open_paths_pnc-1.png" width="65%" height="65%" style="display: block; margin: auto;" />

.footnote[
**Note**: [Script file](https://github.com/sakeefkarim/intro_ggdag_script/tree/main/code) includes an example of how to identify paths activated by conditioning on a collider via `ggdag::ggdag_adjust()`.
]

---

# Identifying Open Paths

What if we also adjusted for _genetics_?

```r
dag_cand3 %>% ggdag_paths_fan(from = "Vitamins", to = "Birth_Defects",
*                             adjust_for = c("PNC", "Genetics"))
```

<img src="index_files/figure-html/open_paths_pnc_genetics-1.png" width="82%" height="82%" style="display: block; margin: auto;" />

---
class: inverse center middle

# Customization via <code><font color="white">ggdag</font></code> and <code><font color="white">ggplot2</font></code>

---

# Using R Syntax to Generate a DAG

```r
dag_cand3 <- dagify(# Here, we see conventional R syntax (e.g., outcome ~ predictors)
                    birth_defects ~ vitamins + pnc + genetics,
                    vitamins ~ ses + pnc,
                    pnc ~ ses + diff_conceiving,
                    diff_conceiving ~ genetics,
                    # These labels will be useful for plotting purposes down the line!
                    labels = c(# \n signals a line break
                               birth_defects = "Birth Defects\n (Outcome)",
                               vitamins = "Vitamins\n (Exposure)",
                               pnc = "Pre-Natal Care",
                               diff_conceiving = "Difficulty\n Conceiving",
                               ses = "SES",
                               genetics = "Genetics"),
                    exposure = "vitamins",
                    outcome = "birth_defects")
```

---

# Identifying Covariate Adjustment Sets Pt II

```r
set.seed(905)

*dag_cand3 %>% ggdag_adjustment_set()
```

<img src="index_files/figure-html/adjustment_sets_ii-1.png" width="80%" height="80%" />

---

# Using Labels

```r
set.seed(905)

*dag_cand3 %>% ggdag(text = FALSE,
*                    use_labels = "label")
```

<img src="index_files/figure-html/labels-1.png" width="80%" height="80%" />

---

# Using Labels (and Identifying Open Paths)

```r
set.seed(905)

*ggdag_paths(dag_cand3, text = FALSE, use_labels = "label")
```

<img src="index_files/figure-html/labels_paths-1.png" width="80%" height="80%" />

---

# Identifying Covariate Adjustment Sets Pt III

.smaller[

```r
set.seed(905)

dag_cand3 %>%
  ggdag_adjustment_set(text = FALSE, use_labels = "label") +
* theme_dag() +
  theme(legend.position = "none")
```

<img src="index_files/figure-html/labels_adjustment-1.png" width="80%" height="80%" />

]

---

# Leveraging `ggplot2`

First, tidy your DAG and specify your covariate adjustment set.

.smaller[

```r
set.seed(905)

dag_cand3_gg <- dag_cand3 %>%
  tidy_dagitty(layout = "nicely") %>%
* node_dconnected(controlling_for = c("pnc", "ses"))

dag_cand3_gg %>% slice(1:3)
```

```
## # A DAG with 2 nodes and 3 edges
## #
## # Exposure: vitamins
## # Outcome: birth_defects
## # Paths opened by conditioning on a collider:
## #
## # A tibble: 3 × 12
##   name       x     y direction to      xend  yend circular label   collider_line
##   <chr>  <dbl> <dbl> <fct>     <chr>  <dbl> <dbl> <lgl>    <chr>   <lgl>
## 1 diff_…  10.5  4.23 ->        pnc     9.55  5.25 FALSE    "Diffi… FALSE
## 2 genet…  11.5  4.90 ->        birth… 10.6   5.83 FALSE    "Genet… FALSE
## 3 genet…  11.5  4.90 ->        diff_… 10.5   4.23 FALSE    "Genet… FALSE
## # … with 2 more variables: adjusted <fct>, d_relationship <fct>
```

]

---

# Leveraging `ggplot2`

Then, use `ggdag` and `ggplot2` functions to produce a nicer DAG!

.evensmaller[

```r
dag_cand3_gg %>%
  mutate(adjusted = # Simple way to capitalize a string:
                    str_to_title(adjusted),
         arrow = # Allows us to modify transparency of arrows as a function of whether or not a variable is adjusted:
                 ifelse(adjusted == "Adjusted", 0.15, 0.85)) %>%
  ggplot(aes(# Coordinates (i.e., where the nodes will be located):
             x = x, y = y, xend = xend, yend = yend,
             # Mapping aesthetics, which vary as a function of whether a variable is adjusted or unadjusted:
             colour = adjusted, fill = adjusted, shape = adjusted)) +
  # Adds nodes to plotting area:
  geom_dag_point() +
  # Adds arrows connecting the nodes (as specified in your DAG syntax):
  geom_dag_edges(aes(# Adjusts transparency of arrows:
                     edge_alpha = arrow),
                 edge_width = 0.5) +
  # Changes the shapes corresponding to adjusted/unadjusted:
  scale_shape_manual(values = c(22, 21)) +
  # The two lines that follow adjust the colour/fill of the nodes based on ggthemes' Economist theme:
  scale_fill_economist() +
  scale_colour_economist() +
  # The following line uses the logic of geom_label_repel to generate/modify your labels:
  geom_dag_label_repel(aes(label = label), colour = "white", show.legend = FALSE) +
  theme_dag() +
  # Removes legend title:
  theme(legend.title = element_blank())
```

]

---

# Leveraging `ggplot2`

<img src="index_files/figure-html/unnamed-chunk-8-1.png" width="87%" height="87%" />

.footnote[
**Note**: [Script file](https://github.com/sakeefkarim/intro_ggdag_script/tree/main/code) includes examples of how to use custom coordinates (to generate a DAG via `ggplot2`) and how to shorten DAG arrows.
]

---
class: inverse center middle

# The End

---
class: center, middle

# Thank you!

Sakeef M. Karim <br>
[sakeef.karim@mail.mcgill.ca](mailto:sakeef.karim@mail.mcgill.ca) <br>
[Script File](https://github.com/sakeefkarim/intro_ggdag_script/tree/main/code)
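
---

# Appendix: Checking Adjustment Sets in Code

The adjustment-set slides read the answer off a plot. As a rough cross-check, the same question can be put to `dagitty` directly. This is a minimal sketch, not taken from the script file. It assumes the vitamins DAG from the `dagify()` slide, rewritten in `dagitty` syntax under the hypothetical name `dag_sketch`, and uses `dagitty::adjustmentSets()` and `dagitty::isAdjustmentSet()`.

.smaller[

```r
library(dagitty)

# Hypothetical object name; the edges mirror the dagify() call used earlier in the deck.
dag_sketch <- dagitty('dag {
  vitamins [exposure]
  birth_defects [outcome]
  vitamins -> birth_defects
  pnc -> birth_defects
  genetics -> birth_defects
  ses -> vitamins
  pnc -> vitamins
  ses -> pnc
  diff_conceiving -> pnc
  genetics -> diff_conceiving
}')

# Minimal sufficient adjustment sets for the effect of vitamins on birth_defects:
sets <- adjustmentSets(dag_sketch)
print(sets)

# Does a candidate set satisfy the adjustment criterion?
pnc_ses_ok <- isAdjustmentSet(dag_sketch, c("pnc", "ses"))

# Adjusting for pnc alone conditions on a collider (ses -> pnc <- diff_conceiving):
pnc_only_ok <- isAdjustmentSet(dag_sketch, "pnc")
```

]

Under these assumptions, `pnc_ses_ok` should agree with the `controlling_for = c("pnc", "ses")` choice on the "Leveraging `ggplot2`" slide, while `pnc_only_ok` illustrates why `PNC` on its own is not enough.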