Allow specifying column of pointlevel
yufongpeng committed Jan 9, 2024
1 parent c5e938d commit 1acf1aa
Showing 8 changed files with 80 additions and 76 deletions.
28 changes: 17 additions & 11 deletions README.md
@@ -11,7 +11,7 @@ This package provides two basic wrappers, `ColumnDataTable{A, T}` and `RowDataTa
|----------|---------------------|------------------|
|`analytename`|`Vector{Symbol}`, the column names that are analytes names|`Vector{Symbol}`, symbols transformed from column `analytecol`.|
|`samplename`|`Vector{Symbol}`, symbols transformed from column `samplecol`.|`Vector{Symbol}`, the column names that are sample names.|
|`analyte`|`Vector{A}` stored in field `config`, analytes in user-defined types.|`Vector{A}`, analytes in user-defined types.|
|`analyte`|`Vector{A}`, analytes in user-defined types.|same|
|`sample`|`Vector`, the column `samplecol`.|`Vector{Symbol}`, the column names that are sample names.|
|`table`|Tabular data of type `T`|same|

@@ -23,7 +23,7 @@ The package provides another two wrappers, `MethodTable{A, T}`, and `AnalysisTab
This type is used for storing method, containing all analytes, their internal standards and calibration curve setting, and data for fitting calibration curve.
|Property|Description|
|----------|---------|
|`analytetable`|`Table` with 3 columns, `analytes` identical to property `analytes`, `isd`, matching each analyte to index of its internal standard, and `calibration` matching each analyte to index of other analyte for fitting its calibration curve. `-1` indicates the analyte itself is internal standard, and `0` indicates no internal standard. For example, a row `(analytes = AnalyteX, isd = 2, calibration = 3)` means that internal standard of `AnalyteX` is the second analyte, and it will be quantified using calibration curve of the third analyte.|
|`analytetable`|`Table` with at least 3 columns, `analytes` identical to property `analytes`, `isd`, matching each analyte to index of its internal standard, and `calibration` matching each analyte to index of other analyte for fitting its calibration curve. `-1` indicates the analyte itself is internal standard, and `0` indicates no internal standard. For example, a row `(analytes = AnalyteX, isd = 2, calibration = 3)` means that internal standard of `AnalyteX` is the second analyte, and it will be quantified using calibration curve of the third analyte.|
|`signal`|`Symbol`, propertyname for extracting signal data from an `AnalysisTable`|
|`pointlevel`|`Vector{Int}` matching each point to level. It can be empty if there is only one level in `conctable`.|
|`conctable`|`AbstractDataTable{A, <: T}` containing concentration data for each level. Sample names must be symbol or string of integers for multiple levels. One level indicates using `SingleCalibration`.|
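
To illustrate the `isd` codes described in the `analytetable` row, here is a minimal sketch in plain Julia. It does not use the package: `describe_isd` is a hypothetical helper, the analyte names are made up, and the table is an ordinary `NamedTuple` standing in for the real `Table`.

```julia
# Illustrative data only; analyte names are invented for this sketch.
analytetable = (
    analytes = ["A", "ISD_A", "B", "ISD_B"],
    isd = [2, -1, 4, -1],
    calibration = [1, -1, 3, -1],
)

# Hypothetical helper (not part of the package): decode the `isd` code of row `i`.
function describe_isd(tbl, i::Integer)
    code = tbl.isd[i]
    if code == -1
        # -1: the analyte itself is an internal standard
        return "$(tbl.analytes[i]) is an internal standard"
    elseif code == 0
        # 0: no internal standard
        return "$(tbl.analytes[i]) has no internal standard"
    else
        # positive value: index of the internal standard in `analytes`
        return "$(tbl.analytes[i]) uses $(tbl.analytes[code]) as internal standard"
    end
end

println(describe_isd(analytetable, 1))
println(describe_isd(analytetable, 2))
```

Positive entries of `calibration` are indices into the same `analytes` column, so they can be resolved the same way.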
@@ -119,7 +119,7 @@ batch_name.batch
├──1_quantity2.dt
└──2_quantity3.dt
```
Config files has the following general forms
Config files have the following general forms
```
[property1]
value
@@ -139,7 +139,7 @@ The property `delim` determines the default delimiter for `table.txt` in this di
### *.dt
All `*.dt` files will be read as `ColumnDataTable` or `RowDataTable`. They contain `config.txt` and `table.txt`.

Config file for `ColumnDataTable` needs at least the following two properties.
Config file for `ColumnDataTable` needs the following properties.
```
[Type]
C
@@ -157,7 +157,7 @@ analyte_col_name_2
.
.
```
Config file for `RowDataTable` needs at least the following two properties.
Config file for `RowDataTable` needs the following properties.
```
[Type]
R
@@ -177,29 +177,34 @@ sample_col_name_2
```
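
The `[property]` / value layout used by these config files is simple enough to parse in a few lines. The sketch below is only an illustration of the format: `parse_config` is a hypothetical function, and the package's own `read_config` may differ (for example, in how it stores multi-line values).

```julia
# Hypothetical parser for the `[property]` / value layout shown in the README.
# Each property maps to the vector of non-empty lines that follow its header.
function parse_config(text::AbstractString)
    config = Dict{Symbol, Vector{String}}()
    key = nothing
    for raw in eachline(IOBuffer(String(text)))
        line = strip(raw)
        isempty(line) && continue              # skip blank separator lines
        m = match(r"^\[(.+)\]$", line)
        if !isnothing(m)
            key = Symbol(m.captures[1])        # start a new property
            config[key] = String[]
        elseif !isnothing(key)
            push!(config[key], String(line))   # value line for the current property
        end
    end
    return config
end

cfg = parse_config("[signal]\narea\n\n[levelname]\nLevel\n\n[pointlevel]\n1\n1\n2\n")
println(cfg[:signal])
println(cfg[:pointlevel])
```

Storing every value as a vector keeps single-valued properties such as `signal` and list-valued ones such as `pointlevel` uniform; a caller can take `only(cfg[:signal])` when exactly one value is expected.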

### *.mt
It must contain two `*dt` files. `true_concentrstion.dt` contains true concentration for each analyte and level. The sample names must be integers.
It must contain two `*dt` files. `true_concentration.dt` contains true concentration for each analyte and level. The sample names must be integers.
Another `*.dt` file is signal data for each analyte and calibration point. The file name is determined by `config.txt`.

Config file for `method.mt` needs two properties, `signal` and `pointlevel`.
Config file for `method.mt` needs the following properties.
```
[signal]
area
[delim]
\t
[levelname]
level
[pointlevel]
level_for_1st_point
level_for_2nd_point
.
.
.
```
`signal` specify which `.dt` file serving as signal data. For the above file, `method.mt/area.dt` will be `method.signaltable`.
`signal` specifies which `.dt` file serves as signal data. For the above file, `method.mt/area.dt` will become `method.signaltable`.

`pointlevel` maps each point to level which should be integers.

`pointlevel` maps each point to level which should be integers as well.
`levelname` specifies the column representing property `pointlevel` of `MethodTable`. It only works when `signaltable` is a `ColumnDataTable`; otherwise, it falls back to `pointlevel`.
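
The lookup order described here (the `levelname` column first, then the `pointlevel` list) can be sketched in plain Julia. This is an illustration under stated assumptions: `resolve_pointlevel` is a hypothetical helper, the signal table is an ordinary `NamedTuple`, and returning an empty vector when neither property is usable is a placeholder rather than the package's behavior.

```julia
# Hypothetical helper (not part of the package) mirroring the fallback order:
# 1. use the column named by `levelname` if it exists in the table,
# 2. otherwise parse the explicit `pointlevel` list,
# 3. otherwise return an empty vector (placeholder assumption).
function resolve_pointlevel(config::AbstractDict, table::NamedTuple)
    if haskey(config, :levelname)
        col = Symbol(config[:levelname])
        if col in propertynames(table)
            return collect(Int, getproperty(table, col))
        end
    end
    if haskey(config, :pointlevel)
        return parse.(Int, config[:pointlevel])
    end
    return Int[]
end

# Made-up miniature signal table with a `Level` column.
table = (Sample = ["1_1", "1_2", "2_1"], Level = [1, 1, 2], Analyte1 = [102.4, 105.4, 193.4])

by_column = resolve_pointlevel(Dict{Symbol, Any}(:levelname => "Level"), table)
by_list = resolve_pointlevel(Dict{Symbol, Any}(:levelname => "Missing", :pointlevel => ["1", "1", "2"]), table)
println(by_column)
println(by_list)
```

Keeping the levels as a column of the signal table avoids maintaining a separate list that must stay in the same order as the rows.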

`analytetable.txt` contains analyte names, index of their internal standards, and index of of other analytes whose calibration curve is used.
`analytetable.txt` needs to contain analyte names, index of their internal standards, and index of other analytes whose calibration curve is used.
```
analytes isd calibration
analyte1 isd1 calibration_analyte_id1
@@ -277,7 +282,7 @@ signaltable = ColumnDataTable(
:point;
analytetype = AnalyteTest
)
method = MethodTable(conctable, signaltable, :area, repeat(1:7, 3); analyte = AnalyteTest.(analyte_names), isd = [2, -1, 4, -1], calibration = [1, -1, 3, -1])
method = MethodTable(conctable, signaltable, :area, :point; analyte = AnalyteTest.(analyte_names), isd = [2, -1, 4, -1], calibration = [1, -1, 3, -1])

# Create sample data
rdata = AnalysisTable([:area], [
@@ -329,6 +334,7 @@ cdata.area[1, "G1(drug_a)"] == 6
collect(eachanalyte(cdata.area))
collect(eachsample(cdata.area))
getanalyte(cdata.area, AnalyteG1("G1(drug_b)"))
getanalyte(cdata.area, 1)
getsample(cdata.area, "S2")
dynamic_range(cbatch.calibration[1])
signal_range(rbatch.calibration[2])
5 changes: 5 additions & 0 deletions src/ChemistryQuantitativeAnalysis.jl
@@ -285,6 +285,7 @@ propertynames(tbl::MethodTable) = (:analytetable, :signal, :pointlevel, :conctab
"""
MethodTable(conctable::AbstractDataTable, signaltable::Union{AbstractDataTable, Nothing}, signal, pointlevel = []; kwargs...)
MethodTable{T}(conctable::AbstractDataTable, signaltable::Union{AbstractDataTable, Nothing}, signal, pointlevel = []; kwargs...)
MethodTable(conctable::AbstractDataTable, signaltable::ColumnDataTable, signal::Symbol, levelname::Symbol; kwargs...)
User-friendly constructors for `MethodTable`. `kwargs` will be columns in `analytetable`; when `analyte`, `isd` and `calibration` are not provided, it will use analyte in `conctable`.
"""
@@ -296,6 +297,10 @@ MethodTable(conctable::AbstractDataTable{A, T},
signaltable::Nothing,
signal::Symbol,
pointlevel::Vector{Int} = Int[]; kwargs...) where {A, T} = MethodTable{A, T}(conctable, signaltable, signal, pointlevel; kwargs...)
MethodTable(conctable::AbstractDataTable{A, T},
signaltable::ColumnDataTable{B, S},
signal::Symbol,
levelname::Symbol; kwargs...) where {A, B, T, S} = MethodTable{promote_type(T, S)}(conctable, signaltable, signal, getproperty(signaltable, levelname); kwargs...)
function MethodTable{T}(conctable::AbstractDataTable,
signaltable::Union{AbstractDataTable, Nothing},
signal::Symbol,
21 changes: 14 additions & 7 deletions src/io.jl
@@ -41,14 +41,14 @@ function read_calibration(file::String; analytetype = String, delim = '\t')
end

"""
read_datatable(file::String, T; analytetype::Type{A} = String, delim = '\\t') -> AbstractDataTable{A, S <: T}
read_datatable(file::String, T; analytetype::Type{A} = String, delim = '\\t', levelname = nothing) -> AbstractDataTable{A, S <: T}
Read a ".dt" file into Julia as `ColumnDataTable` or `RowDataTable`. `T` is the sink function for tabular data, `analytetype` is a concrete type for `analyte` which must have a method for string input,
and `delim` specifies delimiter for tabular data if `config[:delim]` does not exist.
and `delim` specifies delimiter for tabular data if `config[:delim]` does not exist. `levelname` is used specifically for `MethodTable`, indicating the column that represents calibration levels; this column should contain only integers.
See README.md for the structure of ".dt" file.
"""
function read_datatable(file::String, T; analytetype = String, delim = '\t')
function read_datatable(file::String, T; analytetype = String, delim = '\t', levelname = nothing)
endswith(file, ".dt") || throw(ArgumentError("The file is not a valid table directory"))
config = read_config(joinpath(file, "config.txt"))
delim = get(config, :delim, delim)
@@ -62,7 +62,7 @@ function read_datatable(file::String, T; analytetype = String, delim = '\t')
RowDataTable(getproperty(tbl, analyte_name), analyte_name, sample_name, tbl)
else
sample_name = Symbol(first(split(config[:Sample], "\t")))
tbl = CSV.read(joinpath(file, "table.txt"), T; delim, typemap = Dict(Int => Float64), types = Dict(sample_name => String))
tbl = CSV.read(joinpath(file, "table.txt"), T; delim, typemap = Dict(Int => Float64), types = Dict(sample_name => String, levelname => Int), validate = false)
analyte_name = String.(filter!(!isempty, vectorize(config[:Analyte])))
config[:Type] == "C" || throw(ArgumentError("StackDataTable is not implemented yet."))
for i in Symbol.(analyte_name)
@@ -111,8 +111,10 @@ function read_methodtable(file::String, T; tabletype = T, analytetype = String,
isd = replace(analytetable.isd, missing => 0)
conctable = read_datatable(joinpath(file, "true_concentration.dt"), T; analytetype, delim)
if length(conctable.sample) > 1
signaltable = read_datatable(joinpath(file, "$signal.dt"), T; analytetype, delim)
if haskey(config, :pointlevel)
signaltable = read_datatable(joinpath(file, "$signal.dt"), T; analytetype, delim, levelname = get(config, :levelname, nothing))
if haskey(config, :levelname) && in(Symbol(config[:levelname]), propertynames(signaltable))
pointlevel = getproperty(signaltable, Symbol(config[:levelname]))
elseif haskey(config, :pointlevel)
pointlevel = parse.(Int, config[:pointlevel])
else
nl = length(conctable.sample)
@@ -281,8 +283,13 @@ function write(file::String, tbl::MethodTable; delim = '\t')
mkpath(file)
write(joinpath(file, "true_concentration.dt"), tbl.conctable; delim)
isnothing(tbl.signaltable) || write(joinpath(file, "$(tbl.signal).dt"), tbl.signaltable; delim)
id = nothing
if isa(tbl.signaltable, ColumnDataTable)
id = findfirst(x -> getproperty(tbl.signaltable.table, x) == tbl.pointlevel, propertynames(tbl.signaltable.table))
end
open(joinpath(file, "config.txt"), "w+") do config
Base.write(config, "[signal]\n", tbl.signal, "\n\n[delim]\n", escape_string(string(delim)), "\n\n[pointlevel]\n", join(tbl.pointlevel, "\n"))
isnothing(id) ? Base.write(config, "[signal]\n", tbl.signal, "\n\n[delim]\n", escape_string(string(delim)), "\n\n[pointlevel]\n", join(tbl.pointlevel, "\n")) :
Base.write(config, "[signal]\n", tbl.signal, "\n\n[delim]\n", escape_string(string(delim)), "\n\n[levelname]\n", propertynames(tbl.signaltable.table)[id])
end
CSV.write(joinpath(file, "analytetable.txt"), tbl.analytetable; delim)
end
38 changes: 19 additions & 19 deletions test/data/initial_mc_c.batch/method.mt/area.dt/table.txt
@@ -1,19 +1,19 @@
Sample Analyte1 Analyte2 Analyte3 Analyte4 Analyte5
1_1 102.4212472 96.42752442 208.2656569 21.66200012 152.3809399
1_2 105.4410582 106.9120134 212.3767634 22.22889158 156.5128989
1_3 103.2483322 103.5855335 206.6471272 21.48193963 150.577609
2_1 193.3835925 105.4176522 391.3028828 26.26856958 279.3593859
2_2 189.8952021 101.023409 384.1323374 25.25630343 273.4204046
2_3 180.5397414 107.4028693 364.5058631 25.70375674 261.389156
5_1 538.3684182 96.11969193 1077.669754 23.9966764 758.3638326
5_2 467.3572557 105.6740928 936.1787229 25.68064986 662.5070483
5_3 479.9276947 103.9678058 969.3611237 24.22180868 684.2497913
10_1 954.7897112 91.23431149 1909.639386 25.64967851 1344.238317
10_2 999.231432 98.04434403 2002.599781 25.02116125 1406.082423
10_3 1057.796026 94.88047873 2118.20738 25.32224711 1486.806282
20_1 2090.971679 103.2068489 4188.142591 25.25308104 2935.471746
20_2 1942.319983 104.7321677 3884.899366 25.33092479 2727.256898
20_3 2048.176173 93.39275376 4096.421883 25.76772176 2872.776165
50_1 5077.33551 99.20135272 10155.93833 24.62228954 7115.878054
50_2 5123.199858 100.9526035 10253.68608 24.32681841 7184.179874
50_3 5142.329223 97.2009272 10288.52817 26.27945024 7206.298386
Sample Level Analyte1 Analyte2 Analyte3 Analyte4 Analyte5
1_1 1 102.4212472 96.42752442 208.2656569 21.66200012 152.3809399
1_2 1 105.4410582 106.9120134 212.3767634 22.22889158 156.5128989
1_3 1 103.2483322 103.5855335 206.6471272 21.48193963 150.577609
2_1 2 193.3835925 105.4176522 391.3028828 26.26856958 279.3593859
2_2 2 189.8952021 101.023409 384.1323374 25.25630343 273.4204046
2_3 2 180.5397414 107.4028693 364.5058631 25.70375674 261.389156
5_1 3 538.3684182 96.11969193 1077.669754 23.9966764 758.3638326
5_2 3 467.3572557 105.6740928 936.1787229 25.68064986 662.5070483
5_3 3 479.9276947 103.9678058 969.3611237 24.22180868 684.2497913
10_1 4 954.7897112 91.23431149 1909.639386 25.64967851 1344.238317
10_2 4 999.231432 98.04434403 2002.599781 25.02116125 1406.082423
10_3 4 1057.796026 94.88047873 2118.20738 25.32224711 1486.806282
20_1 5 2090.971679 103.2068489 4188.142591 25.25308104 2935.471746
20_2 5 1942.319983 104.7321677 3884.899366 25.33092479 2727.256898
20_3 5 2048.176173 93.39275376 4096.421883 25.76772176 2872.776165
50_1 6 5077.33551 99.20135272 10155.93833 24.62228954 7115.878054
50_2 6 5123.199858 100.9526035 10253.68608 24.32681841 7184.179874
50_3 6 5142.329223 97.2009272 10288.52817 26.27945024 7206.298386
3 changes: 3 additions & 0 deletions test/data/initial_mc_c.batch/method.mt/config.txt
@@ -4,6 +4,9 @@ area
[delim]
\t

[levelname]
Level

[pointlevel]
1
1
38 changes: 19 additions & 19 deletions test/data/save_mc_c.batch/method.mt/area.dt/table.txt
@@ -1,19 +1,19 @@
Sample Analyte1 Analyte2 Analyte3 Analyte4 Analyte5
1_1 102.4212472 96.42752442 208.2656569 21.66200012 152.3809399
1_2 105.4410582 106.9120134 212.3767634 22.22889158 156.5128989
1_3 103.2483322 103.5855335 206.6471272 21.48193963 150.577609
2_1 193.3835925 105.4176522 391.3028828 26.26856958 279.3593859
2_2 189.8952021 101.023409 384.1323374 25.25630343 273.4204046
2_3 180.5397414 107.4028693 364.5058631 25.70375674 261.389156
5_1 538.3684182 96.11969193 1077.669754 23.9966764 758.3638326
5_2 467.3572557 105.6740928 936.1787229 25.68064986 662.5070483
5_3 479.9276947 103.9678058 969.3611237 24.22180868 684.2497913
10_1 954.7897112 91.23431149 1909.639386 25.64967851 1344.238317
10_2 999.231432 98.04434403 2002.599781 25.02116125 1406.082423
10_3 1057.796026 94.88047873 2118.20738 25.32224711 1486.806282
20_1 2090.971679 103.2068489 4188.142591 25.25308104 2935.471746
20_2 1942.319983 104.7321677 3884.899366 25.33092479 2727.256898
20_3 2048.176173 93.39275376 4096.421883 25.76772176 2872.776165
50_1 5077.33551 99.20135272 10155.93833 24.62228954 7115.878054
50_2 5123.199858 100.9526035 10253.68608 24.32681841 7184.179874
50_3 5142.329223 97.2009272 10288.52817 26.27945024 7206.298386
Sample Level Analyte1 Analyte2 Analyte3 Analyte4 Analyte5
1_1 1 102.4212472 96.42752442 208.2656569 21.66200012 152.3809399
1_2 1 105.4410582 106.9120134 212.3767634 22.22889158 156.5128989
1_3 1 103.2483322 103.5855335 206.6471272 21.48193963 150.577609
2_1 2 193.3835925 105.4176522 391.3028828 26.26856958 279.3593859
2_2 2 189.8952021 101.023409 384.1323374 25.25630343 273.4204046
2_3 2 180.5397414 107.4028693 364.5058631 25.70375674 261.389156
5_1 3 538.3684182 96.11969193 1077.669754 23.9966764 758.3638326
5_2 3 467.3572557 105.6740928 936.1787229 25.68064986 662.5070483
5_3 3 479.9276947 103.9678058 969.3611237 24.22180868 684.2497913
10_1 4 954.7897112 91.23431149 1909.639386 25.64967851 1344.238317
10_2 4 999.231432 98.04434403 2002.599781 25.02116125 1406.082423
10_3 4 1057.796026 94.88047873 2118.20738 25.32224711 1486.806282
20_1 5 2090.971679 103.2068489 4188.142591 25.25308104 2935.471746
20_2 5 1942.319983 104.7321677 3884.899366 25.33092479 2727.256898
20_3 5 2048.176173 93.39275376 4096.421883 25.76772176 2872.776165
50_1 6 5077.33551 99.20135272 10155.93833 24.62228954 7115.878054
50_2 6 5123.199858 100.9526035 10253.68608 24.32681841 7184.179874
50_3 6 5142.329223 97.2009272 10288.52817 26.27945024 7206.298386
21 changes: 2 additions & 19 deletions test/data/save_mc_c.batch/method.mt/config.txt
@@ -4,22 +4,5 @@ area
[delim]
\t

[pointlevel]
1
1
1
2
2
2
3
3
3
4
4
4
5
5
5
6
6
6
[levelname]
Level
2 changes: 1 addition & 1 deletion test/runtests.jl
@@ -81,7 +81,7 @@ end
:point;
analytetype = AnalyteTest
)
global method = MethodTable(conctable, signaltable, :area, repeat(1:7, 3); analyte = AnalyteTest.(analyte_names), isd = [2, -1, 4, -1], calibration = [1, -1, 3, -1])
global method = MethodTable(conctable, signaltable, :area, :point; analyte = AnalyteTest.(analyte_names), isd = [2, -1, 4, -1], calibration = [1, -1, 3, -1])
global cdata = AnalysisTable([:area], [
ColumnDataTable(
DataFrame(