getPrevalence, NA values #486

TuomasBorman · 2024-01-31T12:50:05Z

If the assay contains NA values, they are interpreted as FALSE, i.e., the value does not exceed the detection threshold: "this taxon cannot be found in this sample".

antagomir

This is good if we assume that users will always want to consider NA as "not detected". I think this is OK.

Another option would be to let users pass na.rm=TRUE to rowSums on line 216 (or perhaps this could then be the default, although na.rm=FALSE is currently the default in other parts of this R file).

It might be advisable to throw a warning when NAs are interpreted as "not detected". Users should be made aware of this.

I would also add some unit tests to catch this problem in the future if anything is changed.
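The two options discussed above can be illustrated with base R alone. This is a minimal sketch on a made-up matrix, not the code in R/getPrevalence.R; the names `x`, `detection`, `prev_keep_na` and `prev_drop_na` are hypothetical.

```r
# Toy abundance matrix: 3 taxa (rows) x 4 samples (columns), with some NAs
x <- matrix(c( 5,  0, NA, 2,
               0,  0,  0, 1,
              NA, NA,  3, 4),
            nrow = 3, byrow = TRUE,
            dimnames = list(paste0("taxon", 1:3), paste0("s", 1:4)))
detection <- 0

# Comparing NA against the threshold yields NA, not FALSE
detected <- x > detection

# na.rm = FALSE: a single NA in a row makes that taxon's prevalence NA
prev_keep_na <- rowSums(detected, na.rm = FALSE) / ncol(x)

# na.rm = TRUE: NA contributes nothing to the sum, i.e. "not detected"
prev_drop_na <- rowSums(detected, na.rm = TRUE) / ncol(x)

print(prev_keep_na)
print(prev_drop_na)
```

With na.rm = TRUE the denominator is still the total number of samples, so an NA is counted exactly like a value below the threshold.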

TuomasBorman · 2024-02-14T19:01:56Z

One option is to catch na.rm in the .agg_for_prevalence function and create a new parameter called drop.empty.rank --> na.rm is then not passed on to agglomerateByRank; drop.empty.rank is used there instead

--> Use na.rm in rowSums on line 216

antagomir · 2024-02-14T20:32:24Z

Yes, drop.empty.rank might be clearer.

antagomir · 2024-02-14T20:34:55Z

TBH I am not sure how necessary it is to let users decide whether they want na.rm TRUE or FALSE in the merging. But perhaps it is good to let the user decide ultimately.

I would pay some attention to documentation, examples & unit tests to keep it clear for users.

TuomasBorman · 2024-02-15T16:32:40Z

Is it now good to go?

antagomir

Some suggestions in the comments.

R/getPrevalence.R

antagomir · 2024-02-15T18:04:06Z

tests/testthat/test-5prevalence.R

+    assay(tse, "counts")[remove, ] <- NA
+    # There should be 3 NA values if na.rm = FALSE. Otherwise there should be 0
+    expect_true( sum(is.na(
+        getPrevalence(tse, assay.type = "counts", na.rm = FALSE) )) == 3)


Does this also work when the rank argument is specified? Earlier there were differences in how this works when rank is additionally provided, compared to the present example.

These modifications do not affect that, I guess. (I am not sure what differences there were.)

When rank is specified

The TreeSE is agglomerated to the level specified by rank. agg.na.rm (renamed from drop.empty.ranks) specifies whether to drop the rows that have no taxonomy at the level specified by rank. --> Instead of "phylum_something" & "class_something2", only taxa that have taxonomy info at class level remain, for instance.

If there are NA values, the agglomerated tables also have NA values at the higher-level rank --> This might be misleading? We could add a parameter to agglomerateByRank that removes NAs? (agglomerateByRanks --> mergeRows --> .merge_rows --> scuttle::sumCountsAcrossFeatures)

getPrevalence calculates the prevalence of features. With na.rm, the user can specify whether to remove NA values.

So currently, when there are NA values and rank is specified, some taxa have prevalence NA.
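To make the distinction between the two arguments concrete, here is a rough base R sketch. It is an assumption-laden stand-in, not mia's implementation: `prevalence_by_rank` and the "unassigned" label are hypothetical, and `rowsum()` merely imitates the summing step of the agglomeration.

```r
# Hypothetical helper: agg.na.rm acts on missing taxonomy labels,
# na.rm acts on missing abundance values.
prevalence_by_rank <- function(x, rank_labels, detection = 0,
                               na.rm = TRUE, agg.na.rm = FALSE) {
    if (agg.na.rm) {
        # Drop rows that have no label at the chosen rank
        keep <- !is.na(rank_labels)
        x <- x[keep, , drop = FALSE]
        rank_labels <- rank_labels[keep]
    } else {
        # Keep them, collected under a placeholder label
        rank_labels[is.na(rank_labels)] <- "unassigned"
    }
    # Sum abundances within each label; NA abundances propagate to the sum
    agg <- rowsum(x, group = rank_labels, na.rm = FALSE)
    # Prevalence: share of samples where the summed abundance exceeds detection
    rowSums(agg > detection, na.rm = na.rm) / ncol(agg)
}

x <- matrix(c(1, NA, 0,
              2,  3, 0,
              0,  0, 4), nrow = 3, byrow = TRUE)
labels <- c("A", "A", NA)

print(prevalence_by_rank(x, labels, na.rm = FALSE))  # "A" is NA
print(prevalence_by_rank(x, labels, na.rm = TRUE))   # NA counted as not detected
print(prevalence_by_rank(x, labels, agg.na.rm = TRUE))  # "unassigned" row dropped
```

In this sketch a single NA abundance in any member row makes the whole group's sum NA for that sample, which is why some agglomerated taxa end up with prevalence NA when na.rm = FALSE.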

antagomir

Can we also have a unit test of the rank parameter, something like

getPrevalence(tse, assay.type = "counts", na.rm = FALSE, rank="Genus")

antagomir · 2024-02-16T11:17:35Z

R/getPrevalence.R

+        if( any( is.na(x) ) ){
+            msg <- paste0(
+                "The abundance table contains NA values and they are ",
+                ifelse(na.rm, "not", ""), "excluded (see 'na.rm').")


Is this in sync with agg.na.rm?

This is na.rm for the getPrevalence function (agg.na.rm is for agglomerateByRanks) --> so yes

OK. Just ensure the documentation makes sufficiently clear which na.rm serves which purpose, so that there is no real risk of confusion between them.
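As quoted, the hunk pastes "not" directly before "excluded" with no space between them, and it inserts "not" when na.rm is TRUE. One possible wording is sketched below, assuming na.rm = TRUE means the NA values are excluded (the usual base R convention); `na_warning_message` is a hypothetical helper, not a function in the package.

```r
# Hypothetical helper: build the warning text for either na.rm setting
na_warning_message <- function(na.rm) {
    paste0(
        "The abundance table contains NA values and they are ",
        if (na.rm) "" else "not ",
        "excluded (see 'na.rm').")
}

x <- c(1, NA, 3)
if (anyNA(x)) {
    # call. = FALSE keeps the internal call out of the warning shown to users
    warning(na_warning_message(na.rm = TRUE), call. = FALSE)
}
```

A scalar `if`/`else` is used instead of `ifelse()` because na.rm is a single logical value here.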

Merge branch 'getprevalence' of github.com:microbiome/mia into getprevalence # Conflicts: # DESCRIPTION

TuomasBorman added 2 commits January 31, 2024 14:00

up

5a42ea9

up

327f4c2

TuomasBorman requested a review from antagomir January 31, 2024 12:50

antagomir approved these changes Feb 14, 2024

Merge branch 'master' into getprevalence

747aceb

TuomasBorman linked an issue Feb 14, 2024 that may be closed by this pull request

getPrevalence, named numeric(0) if NA values in the assay #492

Closed

up

311e0d9

antagomir reviewed Feb 14, 2024

TuomasBorman added 3 commits February 15, 2024 15:27

up

c8b2ad1

up

5e8ce7a

up

7a97922

antagomir reviewed Feb 15, 2024

TuomasBorman added 2 commits February 16, 2024 11:11

up

ac7f02a

up

010ebe2

antagomir reviewed Feb 16, 2024

TuomasBorman and others added 5 commits February 16, 2024 15:12

up

db4fee1

Merge branch 'master' into getprevalence

47128bf

Merge branch 'master' into getprevalence

3a7e38a

up

d3e5074

up

a42b032

Merge branch 'getprevalence' of github.com:microbiome/mia into getprevalence # Conflicts: # DESCRIPTION

TuomasBorman merged commit aa1d488 into master Mar 5, 2024
1 check passed

TuomasBorman deleted the getprevalence branch March 5, 2024 13:26
