Rename proc.cdf to remove_noise #190

Merged 6 commits on Apr 19, 2023
Changes from 3 commits

2 changes: 1 addition & 1 deletion CHANGELOG.md
@@ -51,7 +51,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
- refactored `adjust.time.R` [#64](https://github.com/RECETOX/recetox-aplcms/pull/64)[#102](https://github.com/RECETOX/recetox-aplcms/pull/102)
- refactored `find.tol.time.R` [#91](https://github.com/RECETOX/recetox-aplcms/pull/91)
- refactored `find.turn.point.R` [#91](https://github.com/RECETOX/recetox-aplcms/pull/91)
- refactored `proc.cdf.R` and `adaptive.bin.R` [#137](https://github.com/RECETOX/recetox-aplcms/pull/137)
- refactored `remove_noise.R` and `adaptive.bin.R` [#137](https://github.com/RECETOX/recetox-aplcms/pull/137)
- refactored `cont.index.R` and renamed as `run_filter.R` [#156](https://github.com/RECETOX/recetox-aplcms/pull/156)
- use proper sample IDs inside feature tables [#153](https://github.com/RECETOX/recetox-aplcms/pull/153)
- exported functions in NAMESPACE [#154](https://github.com/RECETOX/recetox-aplcms/pull/154)
2 changes: 1 addition & 1 deletion NAMESPACE
@@ -71,9 +71,9 @@ export(predict_smoothed_rt)
export(prep.uv)
export(preprocess_bandwidth)
export(preprocess_profile)
export(proc.cdf)
export(prof.to.features)
export(recover.weaker)
export(remove_noise)
export(rev_cum_sum)
export(rm.ridge)
export(run_filter)
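
Dropping `export(proc.cdf)` means any downstream code that still calls the old name will stop working. The PR does not add a compatibility shim; the following is only a minimal sketch of how one could look, assuming a simplified stand-in for `remove_noise()` (the real function takes more arguments and returns a four-column matrix). A real shim would also need its own `export(proc.cdf)` entry in NAMESPACE.

```r
# Hypothetical stand-in for the renamed function, for illustration only.
remove_noise <- function(filename, ...) {
  paste("processed", filename)
}

# Deprecated alias: emits a deprecation warning, then forwards all
# arguments unchanged to the new name.
proc.cdf <- function(...) {
  .Deprecated(new = "remove_noise")
  remove_noise(...)
}

# The old name still works, at the cost of a warning.
result <- suppressWarnings(proc.cdf("sample.cdf"))
```

Whether such an alias is worth keeping depends on how many external callers rely on the old name.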
4 changes: 2 additions & 2 deletions R/hybrid.R
@@ -256,7 +256,7 @@ augment_known_table <- function(
#' @param filenames The CDF file names.
#' @param known_table Table of known chemicals.
#' @param min_occurrence A feature has to show up in at least this number of profiles to be included in the final result.
#' @param min_pres This is a parameter of the run filter, to be passed to the function proc.cdf().
#' @param min_pres This is a parameter of the run filter, to be passed to the function remove_noise().
#' @param min_run Run filter parameter. The minimum length of elution time for a series of signals grouped by m/z to be considered a peak.
#' @param mz_tol m/z tolerance level for the grouping of data points. This value is expressed as the fraction of the m/z value.
#' This value, multiplied by the m/z value, becomes the cutoff level. The recommended value is the machine's nominal accuracy level.
@@ -345,7 +345,7 @@ hybrid <- function(

message("**** feature extraction ****")
profiles <- snow::parLapply(cluster, filenames, function(filename) {
proc.cdf(
remove_noise(
filename = filename,
min_pres = min_pres,
min_run = min_run,
6 changes: 3 additions & 3 deletions R/prof.to.features.R
@@ -869,10 +869,10 @@ normix.bic <- function(x, y, do.plot = FALSE, bw = c(15, 30, 60), eliminate = .0

#' Generate feature table from noise-removed LC/MS profile.
#' @description
#' Each LC/MS profile is first processed by the function proc.cdf() to remove noise and reduce data size. A matrix containing m/z
#' value, retention time, intensity, and group number is output from proc.cdf(). This matrix is then fed to the function
#' Each LC/MS profile is first processed by the function remove_noise() to remove noise and reduce data size. A matrix containing m/z
#' value, retention time, intensity, and group number is output from remove_noise(). This matrix is then fed to the function
#' prof.to.features() to generate a feature list. Every detected feature is summarized into a single row in the output matrix from this function.
#' @param profile The matrix output from proc.cdf(). It contains columns of m/z value, retention time, intensity and group number.
#' @param profile The matrix output from remove_noise(). It contains columns of m/z value, retention time, intensity and group number.
#' @param bandwidth A value between zero and one. Multiplying this value to the length of the signal along the time axis helps
#' determine the bandwidth in the kernel smoother used for peak identification.
#' @param min_bandwidth The minimum bandwidth to use in the kernel smoother.
2 changes: 1 addition & 1 deletion R/recover.weaker.R
@@ -636,7 +636,7 @@ refine_selection <- function(target_rt, rectangle, aligned_mz, rt_tol, mz_tol) {
#' The default value is NA, in which case 0.5 times the retention time tolerance in the aligned object will be used.
#' @param use_observed_range If the value is TRUE, the actual range of the observed locations
#' of the feature in all the spectra will be used.
#' @param mz_tol The mz.tol parameter provided to the proc.cdf() function. This helps retrieve the intermediate file.
#' @param mz_tol The mz.tol parameter provided to the remove_noise() function. This helps retrieve the intermediate file.
#' @param min_bandwidth The minimum bandwidth to use in the kernel smoother.
#' @param max_bandwidth The maximum bandwidth to use in the kernel smoother.
#' @param bandwidth A value between zero and one. Multiplying this value to the length of the signal along the
2 changes: 1 addition & 1 deletion R/proc.cdf.R → R/remove_noise.R
@@ -64,7 +64,7 @@ load_data <- function(filename,
#' @param cache Whether to use cache
#' @return A matrix with four columns: m/z value, retention time, intensity, and group number.
#' @export
proc.cdf <- function(filename,
remove_noise <- function(filename,
min_pres,
min_run,
mz_tol,
8 changes: 4 additions & 4 deletions R/semi.sup.R
@@ -14,12 +14,12 @@ NULL
#' @param known.table A data frame containing the known metabolite ions and previously found features.
#' @param n.nodes The number of CPU cores to be used
#' @param min.exp If a feature is to be included in the final feature table, it must be present in at least this number of spectra.
#' @param min.pres This is a parameter of the run filter, to be passed to the function proc.cdf().
#' @param min.run This is a parameter of the run filter, to be passed to the function proc.cdf().
#' @param min.pres This is a parameter of the run filter, to be passed to the function remove_noise().
#' @param min.run This is a parameter of the run filter, to be passed to the function remove_noise().
#' @param mz.tol The user can provide the m/z tolerance level for peak identification. This value is expressed
#' as the percentage of the m/z value. This value, multiplied by the m/z value, becomes the cutoff level.
#' @param baseline.correct.noise.percentile The percentile of signal strength of those EIC that don't pass the run filter,
#' to be used as the baseline threshold of signal strength. This parameter is passed to proc.cdf()
#' to be used as the baseline threshold of signal strength. This parameter is passed to remove_noise()
#' @param shape.model The mathematical model for the shape of a peak. There are two choices - bi-Gaussian and Gaussian.
#' When the peaks are asymmetric, the bi-Gaussian is better. The default is bi-Gaussian.
#' @param BIC.factor the factor that is multiplied on the number of parameters to modify the BIC criterion. If
@@ -144,7 +144,7 @@ semi.sup <- function(
that.name<-paste(strsplit(tolower(files[j]),"\\.")[[1]][1],suf.prof,".profile",sep="_")

processable<-"goodgood"
processable<-try(this.prof<-proc.cdf(files[j], min_pres=min.pres, min_run=min.run, mz_tol=mz.tol, baseline_correct=baseline.correct, baseline_correct_noise_percentile=baseline.correct.noise.percentile, do.plot=FALSE, intensity_weighted=intensity.weighted, cache=FALSE))
processable<-try(this.prof<-remove_noise(files[j], min_pres=min.pres, min_run=min.run, mz_tol=mz.tol, baseline_correct=baseline.correct, baseline_correct_noise_percentile=baseline.correct.noise.percentile, do.plot=FALSE, intensity_weighted=intensity.weighted, cache=FALSE))
if(substr(processable,1,5)=="Error")
{
file.copy(from=files[j], to="error_files")
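
The hunk above detects a failed call by comparing the first five characters of the `try()` result with `"Error"`. A more conventional check is to test the class of the result. The sketch below is not part of this PR; `process_file()` and `safe_process()` are hypothetical names standing in for the `remove_noise()` call and its error handling.

```r
# Hypothetical processing step that fails for an empty file name.
process_file <- function(path) {
  if (!nzchar(path)) stop("empty file name")
  matrix(0, nrow = 2, ncol = 4)
}

# Wrap the call in try() and test the class of the result instead of
# inspecting the text of the error message.
safe_process <- function(path) {
  result <- try(process_file(path), silent = TRUE)
  if (inherits(result, "try-error")) {
    return(NULL)  # the caller decides how to handle the failed file
  }
  result
}

ok <- safe_process("sample.cdf")
failed <- safe_process("")
```

Testing the class avoids applying `substr()` to a successful matrix result, which yields one value per element rather than a single logical.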
6 changes: 3 additions & 3 deletions R/two.step.hybrid.R
@@ -390,12 +390,12 @@ semisup_to_hybrid_adapter <- function(batchwise, batches_idx) {
#' @param batch.align.rt.tol The RT tolerance for between-batch alignment.
#' @param known.table A data frame containing the known metabolite ions and previously found features.
#' @param cluster The number of CPU cores to be used
#' @param min.pres This is a parameter of the run filter, to be passed to the function proc.cdf().
#' @param min.run This is a parameter of the run filter, to be passed to the function proc.cdf().
#' @param min.pres This is a parameter of the run filter, to be passed to the function remove_noise().
#' @param min.run This is a parameter of the run filter, to be passed to the function remove_noise().
#' @param mz.tol The user can provide the m/z tolerance level for peak identification. This value is expressed as the
#' percentage of the m/z value. This value, multiplied by the m/z value, becomes the cutoff level.
#' @param baseline.correct.noise.percentile The percentile of signal strength of those EIC that don't pass the run filter,
#' to be used as the baseline threshold of signal strength. This parameter is passed to proc.cdf()
#' to be used as the baseline threshold of signal strength. This parameter is passed to remove_noise()
#' @param shape.model The mathematical model for the shape of a peak. There are two choices - "bi-Gaussian" and "Gaussian".
#' When the peaks are asymmetric, the bi-Gaussian is better. The default is "bi-Gaussian".
#' @param baseline.correct This is a parameter in peak detection. After grouping the observations, the highest observation
4 changes: 2 additions & 2 deletions R/unsupervised.R
@@ -46,7 +46,7 @@ get_sample_name <- function(filename) {
#'
#' @param filenames The CDF file names.
#' @param min_occurrence A feature has to show up in at least this number of profiles to be included in the final result.
#' @param min_pres This is a parameter of the run filter, to be passed to the function proc.cdf().
#' @param min_pres This is a parameter of the run filter, to be passed to the function remove_noise().
#' @param min_run Run filter parameter. The minimum length of elution time for a series of signals grouped by m/z
#' to be considered a peak.
#' @param mz_tol m/z tolerance level for the grouping of data points. This value is expressed as the fraction of
@@ -135,7 +135,7 @@ unsupervised <- function(

message("**** feature extraction ****")
profiles <- snow::parLapply(cluster, filenames, function(filename) {
proc.cdf(
remove_noise(
filename = filename,
min_pres = min_pres,
min_run = min_run,
4 changes: 2 additions & 2 deletions R/utils.R
@@ -4,7 +4,7 @@ NULL

register_functions_to_cluster <- function(cluster) {
snow::clusterExport(cluster, list(
'proc.cdf',
'remove_noise',
'prof.to.features',
'load.lcms',
'adaptive.bin',
@@ -123,4 +123,4 @@ get_num_workers <- function() {
num_workers <- parallel::detectCores()
}
return(num_workers)
}
}
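
`register_functions_to_cluster()` refers to functions by string name, so a rename that is not mirrored in that list would not be caught by static inspection of the calls. As a minimal sketch (not part of this PR), a check like the one below could verify the names before they are exported; `find_missing_functions()` is a hypothetical helper and the demo environment stands in for the package namespace.

```r
# Hypothetical helper: return the names that do not resolve to a function
# in the given environment (or its parents).
find_missing_functions <- function(function_names, envir = globalenv()) {
  is_defined <- vapply(
    function_names,
    function(name) exists(name, envir = envir, mode = "function"),
    logical(1)
  )
  function_names[!is_defined]
}

# Demo environment with no parents, holding only the new name.
demo_env <- new.env(parent = emptyenv())
assign("remove_noise", function(filename) filename, envir = demo_env)

missing_names <- find_missing_functions(
  c("remove_noise", "proc.cdf"),
  envir = demo_env
)
```

Here `missing_names` contains only the stale entry, which is the kind of leftover a rename can leave behind.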
2 changes: 1 addition & 1 deletion man/hybrid.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

6 changes: 3 additions & 3 deletions man/prof.to.features.Rd

2 changes: 1 addition & 1 deletion man/recover.weaker.Rd

8 changes: 4 additions & 4 deletions man/proc.cdf.Rd → man/remove_noise.Rd

6 changes: 3 additions & 3 deletions man/semi.sup.Rd

6 changes: 3 additions & 3 deletions man/two.step.hybrid.Rd

2 changes: 1 addition & 1 deletion man/unsupervised.Rd

2 changes: 1 addition & 1 deletion tests/testthat/test-benchmark-extract_features.R
@@ -25,7 +25,7 @@ patrick::with_parameters_test_that(
res <- microbenchmark::microbenchmark(
extract_feature = {
profiles <- snow::parLapply(cluster, filenames, function(filename) {
proc.cdf(
remove_noise(
filename = filename,
min_pres = min_pres,
min_run = min_run,
2 changes: 1 addition & 1 deletion tests/testthat/test-extract_features.R
@@ -22,7 +22,7 @@ patrick::with_parameters_test_that(
register_functions_to_cluster(cluster)

profiles <- snow::parLapply(cluster, filenames, function(filename) {
proc.cdf(
remove_noise(
filename = filename,
min_pres = min_pres,
min_run = min_run,
4 changes: 2 additions & 2 deletions tests/testthat/test-proc.cdf.R
@@ -1,12 +1,12 @@
patrick::with_parameters_test_that(
"test proc.cdf",
"test remove_noise",
{
if(ci_skip == TRUE) skip_on_ci()

testdata <- file.path("..", "testdata")
input_path <- file.path(testdata, "input", filename)

sut <- proc.cdf(
sut <- remove_noise(
input_path,
min_pres = min_pres,
min_run = min_run,
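
A rename like this touches documentation, tests, and string lists, so it can help to scan the tree for leftover uses of the old name. The following is a minimal sketch under stated assumptions: `find_stale_references()` is a hypothetical helper, not part of the package, and it inspects file contents only. File names that keep the old name, such as `tests/testthat/test-proc.cdf.R` in this diff, would need a separate check.

```r
# Hypothetical helper: list every line under `root` that still contains
# `old_name` as a literal string.
find_stale_references <- function(root, old_name) {
  files <- list.files(root, recursive = TRUE, full.names = TRUE)
  hits <- lapply(files, function(path) {
    lines <- readLines(path, warn = FALSE)
    matched <- grep(old_name, lines, fixed = TRUE)
    if (length(matched) == 0) {
      return(NULL)
    }
    data.frame(file = basename(path), line = matched, stringsAsFactors = FALSE)
  })
  do.call(rbind, hits)  # NULL when nothing matches
}

# Demo on a temporary directory with one updated and one stale file.
demo_dir <- tempfile("rename_demo")
dir.create(demo_dir)
writeLines("sut <- remove_noise(input_path)", file.path(demo_dir, "new.R"))
writeLines(c("# header", "sut <- proc.cdf(input_path)"), file.path(demo_dir, "old.R"))

stale <- find_stale_references(demo_dir, "proc.cdf")
```

The `fixed = TRUE` argument matters here because the dot in `proc.cdf` would otherwise match any character.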