Missing limit argument in dag_paths() #65

Closed
henningte opened this issue Apr 6, 2022 · 2 comments
Comments

@henningte
Contributor

dag_paths() internally uses dagitty::paths() to identify open paths. dagitty::paths() has a limit argument that caps how many of the distinct paths between the two nodes are returned. The default is limit = 100, so only the first 100 paths are considered and any further paths are silently dropped.
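To make the effect of limit concrete, here is a minimal sketch on a deliberately tiny DAG. The graph and the object names (g, all_paths, few_paths) are made up for illustration; only dagitty::dagitty() and dagitty::paths() with its from, to and limit arguments are taken from dagitty itself.

```r
library(dagitty)

# Hypothetical toy DAG with three distinct paths from x to y
g <- dagitty("dag { x -> y ; x -> a -> y ; x -> b -> y }")

# Default limit (100) is large enough to list every path here
all_paths <- paths(g, from = "x", to = "y")

# A limit smaller than the number of paths truncates the result
few_paths <- paths(g, from = "x", to = "y", limit = 2)

length(all_paths$paths)
length(few_paths$paths)
```

Anything built on the truncated result, such as the open-path filtering in dag_paths(), only ever sees the paths that made it under the limit.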

Could the limit argument also be added to dag_paths()? I think this should be quite straightforward to implement:

library(magrittr)
library(ggdag)
#> 
#> Attaching package: 'ggdag'
#> The following object is masked from 'package:stats':
#> 
#>     filter

# updated version of dag_paths
dag_paths <- function(.dag, from = NULL, to = NULL, adjust_for = NULL,
                      limit = 100, directed = FALSE, paths_only = FALSE, ...)
{
  .tdy_dag <- ggdag:::if_not_tidy_daggity(.dag, ...)
  if (is.null(from)) 
    from <- dagitty::exposures(.tdy_dag$dag)
  if (is.null(to)) 
    to <- dagitty::outcomes(.tdy_dag$dag)
  if (is.null(from) || is.null(to)) 
    stop("`exposure` and `outcome` must be set!")
  # pass `limit` through to dagitty::paths() instead of relying on its default
  pathways <- dagitty::paths(.tdy_dag$dag, from, to, Z = adjust_for,
                             limit = limit, directed = directed) %>%
    dplyr::as_tibble() %>%
    dplyr::filter(open) %>%
    dplyr::pull(paths)
  vars <- c(from = from, to = to)
  .tdy_dag$data <- pathways %>% purrr::map_df(function(.x) {
    path_df <- .x %>%
      ggdag:::dag2() %>%
      dagitty::edges() %>%
      dplyr::select(.from = v, .to = w) %>%
      dplyr::mutate(
        .from = as.character(.from),
        .to = as.character(.to),
        path = "open path"
      ) %>%
      dplyr::left_join(.tdy_dag$data, ., by = c(name = ".from", to = ".to"))
    any_x_unopend <- any(path_df$name == vars[[1]] & is.na(path_df$path))
    if (any_x_unopend) {
      x_has_no_children <- any(path_df$name == vars[[1]] & 
                                 is.na(path_df$to))
      if (x_has_no_children) {
        path_df[path_df$name == vars[[1]], "path"] <- "open path"
      }
      else {
        path_df <- path_df %>% dplyr::filter(name == vars[[1]]) %>% 
          dplyr::slice(1) %>% dplyr::mutate(path = "open path") %>% 
          dplyr::bind_rows(path_df, .)
      }
    }
    y_has_no_children <- any(path_df$name == vars[[2]] & 
                               is.na(path_df$to))
    if (y_has_no_children) {
      path_df[path_df$name == vars[[2]], "path"] <- "open path"
    }
    else {
      path_df <- path_df %>% dplyr::filter(name == vars[[2]]) %>% 
        dplyr::slice(1) %>% dplyr::mutate(path = "open path") %>% 
        dplyr::bind_rows(path_df, .)
    }
    path_df
  }, .id = "set")
  if (paths_only) 
    .tdy_dag$data <- dplyr::filter(.tdy_dag$data, path == "open path")
  .tdy_dag
}

# example
x_td <- 
  ggdag::dag("x1 -> x2
           x1 -> x3
           x2 -> x3
           x2 -> x4
           x3 -> x4
           x3 -> x5
           x4 -> x5
           x4 -> x6
           x4 -> x7
           x5 -> x7
           x5 -> x8
           x6 -> x8
           x1 -> x6
           x4 -> x9
           x2 -> x9
           x5 -> x9
           x9 -> x10
           x9 -> x3
           x10 -> x6
           x3 -> x10
           x5 -> x11") %>%
  ggdag::tidy_dagitty(seed = 345)
  
# more than 100 paths with dagitty::paths when limit is set appropriately 
x_td_paths1 <- dagitty::paths(x_td$dag, limit = 200, from = "x5", to = "x8")
length(x_td_paths1$paths)
#> [1] 115

# ggdag::dag_paths() has no limit argument, so dagitty's default (limit = 100) applies
x_td_paths2 <- 
  ggdag::dag_paths(x_td, from = "x5", to = "x8")

# updated version
x_td_paths3 <- 
  dag_paths(x_td, from = "x5", to = "x8", limit = 200)

identical(x_td_paths2$data$path, x_td_paths3$data$path)
#> [1] FALSE

Created on 2022-04-06 by the reprex package (v2.0.1)
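
As a complement to the reprex above, here is a small sketch of how one might check whether a given limit was actually reached, which would suggest that paths were cut off. paths_hit_limit() is a hypothetical helper written for this illustration and is not part of ggdag or dagitty. It assumes that dagitty::paths() stops collecting once limit paths have been found, so it cannot tell a truncated result apart from a graph that happens to have exactly limit paths.

```r
library(dagitty)

# Hypothetical helper (not in ggdag or dagitty): TRUE when the number of
# returned paths reaches `limit`, i.e. the result may be truncated.
paths_hit_limit <- function(dag, from, to, limit = 100) {
  found <- dagitty::paths(dag, from = from, to = to, limit = limit)
  length(found$paths) >= limit
}

# Hypothetical toy DAG with three paths from x to y
g <- dagitty("dag { x -> y ; x -> a -> y ; x -> b -> y }")

hit_small <- paths_hit_limit(g, from = "x", to = "y", limit = 2)
hit_default <- paths_hit_limit(g, from = "x", to = "y")
```

A check like this could be run before plotting, and the limit raised until it no longer reports a hit.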

Session info
sessioninfo::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#>  setting  value                       
#>  version  R version 4.1.0 (2021-05-18)
#>  os       Ubuntu 20.04.3 LTS          
#>  system   x86_64, linux-gnu           
#>  ui       X11                         
#>  language (EN)                        
#>  collate  en_US.UTF-8                 
#>  ctype    en_US.UTF-8                 
#>  tz       Etc/UTC                     
#>  date     2022-04-06                  
#> 
#> ─ Packages ───────────────────────────────────────────────────────────────────
#>  ! package      * version    date       lib source
#>    boot           1.3-28     2021-05-03 [2] CRAN (R 4.1.0)
#>  P cli            3.0.1      2021-07-17 [?] RSPM (R 4.1.0)
#>  P colorspace     2.0-2      2021-06-24 [?] RSPM (R 4.1.0)
#>  P crayon         1.4.1      2021-02-08 [?] RSPM (R 4.1.0)
#>  P curl           4.3.2      2021-06-23 [?] RSPM (R 4.1.0)
#>  P dagitty        0.3-1      2021-01-21 [?] RSPM (R 4.1.0)
#>  P digest         0.6.27     2020-10-24 [?] RSPM (R 4.1.0)
#>  P dplyr          1.0.7      2021-06-18 [?] RSPM (R 4.1.0)
#>  P ellipsis       0.3.2      2021-04-29 [?] RSPM (R 4.1.0)
#>  P evaluate       0.14       2019-05-28 [?] RSPM (R 4.1.0)
#>  P fansi          0.5.0      2021-05-25 [?] RSPM (R 4.1.0)
#>  P farver         2.1.0      2021-02-28 [?] RSPM (R 4.1.0)
#>  P fs             1.5.0      2020-07-31 [?] RSPM (R 4.1.0)
#>  P generics       0.1.0      2020-10-31 [?] RSPM (R 4.1.0)
#>    ggdag        * 0.2.4      2021-10-10 [1] CRAN (R 4.1.0)
#>  P ggforce        0.3.3      2021-03-05 [?] RSPM (R 4.1.0)
#>  P ggplot2        3.3.5      2021-06-25 [?] RSPM (R 4.1.0)
#>  P ggraph         2.0.5      2021-02-23 [?] RSPM (R 4.1.0)
#>  P ggrepel        0.9.1      2021-01-15 [?] RSPM (R 4.1.0)
#>  P glue           1.4.2      2020-08-27 [?] RSPM (R 4.1.0)
#>  P graphlayouts   0.7.1      2020-10-26 [?] RSPM (R 4.1.0)
#>  P gridExtra      2.3        2017-09-09 [?] RSPM (R 4.1.0)
#>  P gtable         0.3.0      2019-03-25 [?] RSPM (R 4.1.0)
#>  P highr          0.9        2021-04-16 [?] RSPM (R 4.1.0)
#>  P htmltools      0.5.1.1    2021-01-22 [?] RSPM (R 4.1.0)
#>  P igraph         1.2.6      2020-10-06 [?] RSPM (R 4.1.0)
#>  P jsonlite       1.7.2      2020-12-09 [?] RSPM (R 4.1.0)
#>  P knitr          1.33       2021-04-24 [?] RSPM (R 4.1.0)
#>  P lifecycle      1.0.0      2021-02-15 [?] RSPM (R 4.1.0)
#>  P magrittr     * 2.0.1      2020-11-17 [?] RSPM (R 4.1.0)
#>    MASS           7.3-54     2021-05-03 [2] CRAN (R 4.1.0)
#>  P munsell        0.5.0      2018-06-12 [?] RSPM (R 4.1.0)
#>  P pillar         1.6.2      2021-07-29 [?] RSPM (R 4.1.0)
#>  P pkgconfig      2.0.3      2019-09-22 [?] RSPM (R 4.1.0)
#>  P polyclip       1.10-0     2019-03-14 [?] RSPM (R 4.1.0)
#>  P purrr          0.3.4      2020-04-17 [?] RSPM (R 4.1.0)
#>  P R6             2.5.0      2020-10-28 [?] RSPM (R 4.1.0)
#>  P Rcpp           1.0.7      2021-07-07 [?] RSPM (R 4.1.0)
#>  P reprex         2.0.1      2021-08-05 [?] RSPM (R 4.1.0)
#>  P rlang          0.4.11     2021-04-30 [?] RSPM (R 4.1.0)
#>  P rmarkdown      2.10       2021-08-06 [?] RSPM (R 4.1.0)
#>  P rstudioapi     0.13       2020-11-12 [?] RSPM (R 4.1.0)
#>  P scales         1.1.1      2020-05-11 [?] RSPM (R 4.1.0)
#>  P sessioninfo    1.1.1      2018-11-05 [?] RSPM (R 4.1.0)
#>  P stringi        1.7.3      2021-07-16 [?] RSPM (R 4.1.0)
#>  P stringr        1.4.0      2019-02-10 [?] RSPM (R 4.1.0)
#>  P tibble         3.1.3      2021-07-23 [?] RSPM (R 4.1.0)
#>    tidygraph      1.2.0.9000 2022-04-05 [1] Github (thomasp85/tidygraph@54a5e59)
#>  P tidyr          1.1.3      2021-03-03 [?] RSPM (R 4.1.0)
#>  P tidyselect     1.1.1      2021-04-30 [?] RSPM (R 4.1.0)
#>  P tweenr         1.0.2      2021-03-23 [?] RSPM (R 4.1.0)
#>  P utf8           1.2.2      2021-07-24 [?] RSPM (R 4.1.0)
#>  P V8             3.4.2      2021-05-01 [?] RSPM (R 4.1.0)
#>  P vctrs          0.3.8      2021-04-29 [?] RSPM (R 4.1.0)
#>  P viridis        0.6.1      2021-05-11 [?] RSPM (R 4.1.0)
#>  P viridisLite    0.4.0      2021-04-13 [?] RSPM (R 4.1.0)
#>  P withr          2.4.2      2021-04-18 [?] RSPM (R 4.1.0)
#>  P xfun           0.25       2021-08-06 [?] RSPM (R 4.1.0)
#>  P yaml           2.2.1      2020-02-01 [?] RSPM (R 4.1.0)
#> 
#> [1] /home/rstudio/.cache/R/renv/library/peat-mid-infrared-interpretation-3ca2b36b/R-4.1/x86_64-pc-linux-gnu
#> [2] /usr/local/lib/R/library
#> 
#>  P ── Loaded and on-disk path mismatch.

In addition, I think there's a typo in the documentation for dag_paths(). The first line starts with:

node_paths finds the ...

But it should probably read:

dag_paths finds the ...

@malcolmbarrett
Collaborator

Yes, I agree. Happy to accept a PR for this and for ggdag_paths() if you want to expedite the process.

@henningte
Contributor Author

@malcolmbarrett Thanks for your quick reply. I actually plan to use this feature in the near future!

I have made a pull request. If there are any issues, please let me know.

malcolmbarrett added a commit that referenced this issue Apr 14, 2022
Missing `limit` argument in `dag_paths()` (#65)