Add neuron name/type to neuprint_connection_table #132

Merged
merged 3 commits into from
Aug 22, 2020

jefferis
Contributor

No description provided.

* no changes to logic, just (a bit) easier to read
* neurons without types were being dropped
* proper tests for new functionality
@jefferis jefferis requested a review from romainFr August 16, 2020 22:14
Collaborator

@romainFr romainFr left a comment

That looks good. Do we also want to consider returning more metadata fields, such as "notes", "status", "pre", "downstream" and the like? They're usually helpful in subsequent analysis together with the type information.

@jefferis
Contributor Author

jefferis commented Aug 17, 2020

I guess the alternative is to merge in all metadata. I thought this was a good compromise for many purposes. I will likely add another function that adds or updates metadata for an existing data.frame. Something like this:

neuprint_add_meta <- function(x, idname='bodyid', ignore.case = TRUE, ...) {
  if(!is.data.frame(x)) stop("I expect a data frame")
  cx=colnames(x)
  if(isTRUE(ignore.case)) {
    cx=tolower(cx)
    idname=tolower(idname)
  }
  matchcol=stats::na.omit(colnames(x)[match(idname, cx)])
  if(length(matchcol)!=1)
    stop("id column:", idname, " not present exactly once in input data frame!")
  
  # nb only check unique ids
  meta=neuprint_get_meta(unique(x[[matchcol]]), ...)
  # just merge the body id column
  merged=merge(x[matchcol], meta, by.x=matchcol, by.y='bodyid', all.x = TRUE, sort=FALSE)
  # make sure we have same number of rows in both tables
  stopifnot(isTRUE(all.equal(nrow(x),nrow(merged))))
  # make sure that the id orders match exactly
  merged=merged[match(x[[matchcol]], merged[[1]]), ]
  # and then check that ids are identical
  stopifnot(isTRUE(all.equal(x[[matchcol]], merged[[1]])))
  # now set columns that are present in meta (overwriting dups)
  x[colnames(merged)]=merged
  x
}

You would then use it like this:

mbon01ds=neuprint_connection_table("MBON01", threshold=5)
mbon01ds=neuprint_add_meta(mbon01ds, idname="partner")
# do your analysis
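
The merge-then-reorder step in the function above can be tried without a neuPrint connection. The following is a minimal sketch in base R, assuming toy data: the `conn` and `meta` data frames and all their values are hypothetical, and `meta` only stands in for what `neuprint_get_meta` might return. It shows why the `match` step matters: `merge` with `sort=FALSE` does not promise to keep the input row order, and ids without metadata come back as `NA` rather than being dropped.

```r
# Toy connection table (hypothetical values): ids repeat, are unsorted,
# and id 40 has no metadata at all
conn <- data.frame(partner = c(30, 10, 40, 10), weight = c(7, 5, 9, 6))

# Toy metadata, one row per known id; a stand-in for neuprint_get_meta()
meta <- data.frame(bodyid = c(10, 30), type = c("A", "C"),
                   stringsAsFactors = FALSE)

# Left join on the id column only, keeping ids that lack metadata
merged <- merge(conn["partner"], meta, by.x = "partner", by.y = "bodyid",
                all.x = TRUE, sort = FALSE)

# merge() may return rows in a different order, so realign to the input
merged <- merged[match(conn$partner, merged[[1]]), ]
stopifnot(identical(conn$partner, merged[[1]]))

# Write the metadata columns back, overwriting any duplicates
conn[colnames(merged)] <- merged
print(conn)
```

Because `match` returns the first hit for a repeated id, this only works when the metadata is identical for every occurrence of that id, which holds here since `meta` has one row per id.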

@romainFr
Collaborator

Yes, that's basically what our workflows look like right now. So pulling the metadata at the same time as the connections would save the overhead of finding the neurons in the database twice. But I suppose it is a matter of what the most common workflows are?

On a related topic, we usually reformat our connection tables into a to/from format (name.from/name.to, type.from/type.to...) so that we do not depend on the "prepost" column. Would such a reformatting function be of interest for neuprintr?

@jefferis
Contributor Author

Do you want to sketch out your format?

@romainFr
Collaborator

Yes, starting from a connection table with added metadata for both the partners and the "source" neurons, I do something like:

    connectionTable <- connectionTable %>%
      mutate(from = ifelse(prepost==1, bodyid, partner),
             to = ifelse(prepost==1, partner, bodyid),
             name.from = as.character(ifelse(prepost==1, name, partnerName)),
             name.to = as.character(ifelse(prepost==1, partnerName, name)),
             type.from = as.character(ifelse(prepost==1, type, partnerType)),
             type.to = as.character(ifelse(prepost==1, partnerType, type))
      ) %>%
      select(-bodyid, -partner, -name, -partnerName, -partnerType, -type, -prepost)
    return(connectionTable)

I'm thinking that, to put the connections in context, it would then make sense to add downstream.from (or post.from) and upstream.to (or pre.to), plus their ROI-specific equivalents if the request is ROI-specific.

The other potential fields (status.from and status.to, notes.from and notes.to) may also come in handy in some analyses or brain regions.

I'd be happy to make a PR for that if that's useful.
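
The to/from reformatting described above can be sketched on toy data in base R, without dplyr. This is only an illustration under stated assumptions: the `conn` data frame and its values are hypothetical, the column names are taken from the snippet above, and `prepost == 1` is read, as in that snippet, to mean that the query neuron (`bodyid`) is the source of the connection.

```r
# Toy connection table with metadata for both sides (hypothetical values)
conn <- data.frame(
  bodyid = c(1, 1), partner = c(2, 3), prepost = c(1, 0),
  name = c("MBON01_R", "MBON01_R"), partnerName = c("n2", "n3"),
  type = c("MBON01", "MBON01"), partnerType = c("T2", "T3"),
  weight = c(12, 8), stringsAsFactors = FALSE
)

# TRUE where the query neuron is the source (assumed meaning of prepost == 1)
is_out <- conn$prepost == 1

# Build a direction-explicit edge table; the prepost column is not carried over
edges <- data.frame(
  from = ifelse(is_out, conn$bodyid, conn$partner),
  to = ifelse(is_out, conn$partner, conn$bodyid),
  name.from = ifelse(is_out, conn$name, conn$partnerName),
  name.to = ifelse(is_out, conn$partnerName, conn$name),
  type.from = ifelse(is_out, conn$type, conn$partnerType),
  type.to = ifelse(is_out, conn$partnerType, conn$type),
  weight = conn$weight,
  stringsAsFactors = FALSE
)
print(edges)
```

Each row of `edges` now reads the same way regardless of which side was queried, so tables pulled as inputs and as outputs can be stacked without further bookkeeping.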

@jefferis
Contributor Author

@romainFr I'm merging this, but I'd be very happy to see a PR along the lines that you suggest so long as it stays as lean as possible.

@jefferis jefferis merged commit 217d0a9 into master Aug 22, 2020
@jefferis jefferis deleted the feature/richer-conn-table branch May 21, 2022 06:35