Allow specifying column of pointlevel

yufongpeng · Jan 9, 2024 · 1acf1aa
1 parent c5e938d
Showing 8 changed files with 80 additions and 76 deletions.
diff --git a/README.md b/README.md
@@ -11,7 +11,7 @@ This package provides two basic wrappers, `ColumnDataTable{A, T}` and `RowDataTa
 |----------|---------------------|------------------|
 |`analytename`|`Vector{Symbol}`, the column names that are analytes names|`Vector{Symbol}`, symbols transformed from column `analytecol`.|
 |`samplename`|`Vector{Symbol}`, symbols transformed from column `samplecol`.|`Vector{Symbol}`, the column names that are sample names.|
-|`analyte`|`Vector{A}` stored in field `config`, analytes in user-defined types.|`Vector{A}`, analytes in user-defined types.|
+|`analyte`|`Vector{A}`, analytes in user-defined types.|same|
 |`sample`|`Vector`, the column `samplecol`.|`Vector{Symbol}`, the column names that are sample names.|
 |`table`|Tabular data of type `T`|same|
 
@@ -23,7 +23,7 @@ The package provides another two wrappers, `MethodTable{A, T}`, and `AnalysisTab
 This type is used for storing method, containing all analytes, their internal standards and calibration curve setting, and data for fitting calibration curve.
 |Property|Description|
 |----------|---------|
-|`analytetable`|`Table` with 3 columns, `analytes` identical to property `analytes`, `isd`, matching each analyte to index of its internal standard, and `calibration` matching each analyte to index of other analyte for fitting its calibration curve. `-1` indicates the analyte itself is internal standard, and `0` indicates no internal standard. For example, a row `(analytes = AnalyteX, isd = 2, calibration = 3)` means that internal standard of `AnalyteX` is the second analyte, and it will be quantified using calibration curve of the third analyte.|
+|`analytetable`|`Table` with at least 3 columns, `analytes` identical to property `analytes`, `isd`, matching each analyte to index of its internal standard, and `calibration` matching each analyte to index of other analyte for fitting its calibration curve. `-1` indicates the analyte itself is internal standard, and `0` indicates no internal standard. For example, a row `(analytes = AnalyteX, isd = 2, calibration = 3)` means that internal standard of `AnalyteX` is the second analyte, and it will be quantified using calibration curve of the third analyte.|
 |`signal`|`Symbol`, propertyname for extracting signal data from an `AnalysisTable`|
 |`pointlevel`|`Vector{Int}` matching each point to level. It can be empty if there is only one level in `conctable`.|
 |`conctable`|`AbstractDataTable{A, <: T}` containing concentration data for each level. Sample names must be symbol or string of integers for multiple levels. One level indicates using `SingleCalibration`.|
@@ -119,7 +119,7 @@ batch_name.batch
    ├──1_quantity2.dt 
    └──2_quantity3.dt
 ```
-Config files has the following general forms
+Config files have the following general forms
 ```
 [property1]
 value
@@ -139,7 +139,7 @@ The property `delim` determines the default delimiter for `table.txt` in this di
 ### *.dt
 All `*.dt` files will be read as `ColumnDataTable` or `RowDataTable`. They contain `config.txt` and `table.txt`.
 
-Config file for `ColumnDataTable` needs at least the following two properties.
+Config file for `ColumnDataTable` needs the following properties.
 ```
 [Type]
 C
@@ -157,7 +157,7 @@ analyte_col_name_2
 .
 .
 ``` 
-Config file for `RowDataTable` needs at least the following two properties.
+Config file for `RowDataTable` needs the following properties.
 ```
 [Type]
 R
@@ -177,29 +177,34 @@ sample_col_name_2
 ``` 
 
 ### *.mt
-It must contain two `*dt` files. `true_concentrstion.dt` contains true concentration for each analyte and level. The sample names must be integers.
+It must contain two `*dt` files. `true_concentration.dt` contains true concentration for each analyte and level. The sample names must be integers.
 Another `*.dt` file is signal data for each analyte and calibration point. The file name is determined by `config.txt`.
 
-Config file for `method.mt` needs two properties, `signal` and `pointlevel`.
+Config file for `method.mt` needs the following properties.
 ```
 [signal]
 area
 
 [delim]
 \t
 
+[levelname]
+level
+
 [pointlevel]
 level_for_1st_point
 level_for_2nd_point
 .
 .
 .
 ```
-`signal` specify which `.dt` file serving as signal data. For the above file, `method.mt/area.dt` will be `method.signaltable`.
+`signal` specifies which `.dt` file serves as signal data. For the above file, `method.mt/area.dt` will become `method.signaltable`.
+
+`pointlevel` maps each point to a level; the levels should be integers.
 
-`pointlevel` maps each point to level which should be integers as well.
+`levelname` specifies the column representing property `pointlevel` of `MethodTable`. It only takes effect when `signaltable` is a `ColumnDataTable`; otherwise, `pointlevel` is used instead.
 
-`analytetable.txt` contains analyte names, index of their internal standards, and index of of other analytes whose calibration curve is used. 
+`analytetable.txt` needs to contain analyte names, index of their internal standards, and index of other analytes whose calibration curve is used.
 ```
 analytes isd   calibration
 analyte1 isd1  calibration_analyte_id1
@@ -277,7 +282,7 @@ signaltable = ColumnDataTable(
    :point; 
    analytetype = AnalyteTest
 )
-method = MethodTable(conctable, signaltable, :area, repeat(1:7, 3); analyte = AnalyteTest.(analyte_names), isd = [2, -1, 4, -1], calibration = [1, -1, 3, -1])
+method = MethodTable(conctable, signaltable, :area, :point; analyte = AnalyteTest.(analyte_names), isd = [2, -1, 4, -1], calibration = [1, -1, 3, -1])
 
 # Create sample data
 rdata = AnalysisTable([:area], [
@@ -329,6 +334,7 @@ cdata.area[1, "G1(drug_a)"] == 6
 collect(eachanalyte(cdata.area))
 collect(eachsample(cdata.area))
 getanalyte(cdata.area, AnalyteG1("G1(drug_b)"))
+getanalyte(cdata.area, 1)
 getsample(cdata.area, "S2")
 dynamic_range(cbatch.calibration[1])
 signal_range(rbatch.calibration[2])
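
The README and `read_methodtable` changes in this commit describe a lookup order for point levels: a `levelname` column in the signal table is used when it exists, otherwise the `[pointlevel]` list from `config.txt` applies. Below is a minimal, package-free sketch of that order. The function `resolve_pointlevel`, its `default` keyword, and the `NamedTuple` stand-in for the signal table are hypothetical names for illustration only; they are not part of ChemistryQuantitativeAnalysis.jl.

```julia
# Hypothetical helper mirroring the lookup order described above.
# `config` plays the role of the parsed config.txt, `table` the signal table columns.
function resolve_pointlevel(config::AbstractDict, table::NamedTuple; default = Int[])
    if haskey(config, :levelname) && Symbol(config[:levelname]) in propertynames(table)
        # 1. A column named by `levelname` takes priority.
        return collect(Int, getproperty(table, Symbol(config[:levelname])))
    elseif haskey(config, :pointlevel)
        # 2. Otherwise use the explicit list of levels, stored as strings.
        return parse.(Int, config[:pointlevel])
    else
        # 3. Neither is present: return the caller-supplied default (an assumption of this sketch).
        return default
    end
end

# Small stand-in for a signal table with a level column.
tbl = (Sample = ["1_1", "1_2", "2_1"], Level = [1, 1, 2], Analyte1 = [102.4, 105.4, 193.4])

from_column = resolve_pointlevel(Dict(:levelname => "Level", :pointlevel => ["3", "3", "3"]), tbl)
from_list = resolve_pointlevel(Dict(:levelname => "Missing", :pointlevel => ["3", "3", "3"]), tbl)
```

In this sketch, a `levelname` that does not match any column silently falls through to `pointlevel`, which matches the `haskey(...) && in(...)` condition added to `read_methodtable`.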

diff --git a/src/ChemistryQuantitativeAnalysis.jl b/src/ChemistryQuantitativeAnalysis.jl
@@ -285,6 +285,7 @@ propertynames(tbl::MethodTable) = (:analytetable, :signal, :pointlevel, :conctab
 """
     MethodTable(conctable::AbstractDataTable, signaltable::Union{AbstractDataTable, Nothing}, signal, pointlevel = []; kwargs...)
     MethodTable{T}(conctable::AbstractDataTable, signaltable::Union{AbstractDataTable, Nothing}, signal, pointlevel = []; kwargs...)
+    MethodTable(conctable::AbstractDataTable, signaltable::ColumnDataTable, signal::Symbol, levelname::Symbol; kwargs...)
 
 User-friendly constructors for `MethodTable`. `kwargs` will be columns in `analytetable`; when `analyte`, `isd` and `calibration` are not provided, it will use analyte in `conctable`.
 """
@@ -296,6 +297,10 @@ MethodTable(conctable::AbstractDataTable{A, T},
             signaltable::Nothing,
             signal::Symbol,
             pointlevel::Vector{Int} = Int[]; kwargs...) where {A, T} = MethodTable{A, T}(conctable, signaltable, signal, pointlevel; kwargs...)
+MethodTable(conctable::AbstractDataTable{A, T}, 
+            signaltable::ColumnDataTable{B, S},
+            signal::Symbol,
+            levelname::Symbol; kwargs...) where {A, B, T, S} = MethodTable{promote_type(T, S)}(conctable, signaltable, signal, getproperty(signaltable, levelname); kwargs...)
 function MethodTable{T}(conctable::AbstractDataTable, 
                     signaltable::Union{AbstractDataTable, Nothing},
                     signal::Symbol,

diff --git a/src/io.jl b/src/io.jl
@@ -41,14 +41,14 @@ function read_calibration(file::String; analytetype = String, delim = '\t')
 end
 
 """
-    read_datatable(file::String, T; analytetype::Type{A} = String, delim = '\\t') -> AbstractDataTable{A, S <: T}
+    read_datatable(file::String, T; analytetype::Type{A} = String, delim = '\\t', levelname = nothing) -> AbstractDataTable{A, S <: T}
 
 Read ".dt" file into julia as `ColumnDataTable` or `RowDataTable`. `T` is the sink function for tabular data, `analytetype` is a concrete type for `analyte` which must have a method for string input, 
-and `delim` specifies delimiter for tabular data if `config[:delim]` does not exist.
+and `delim` specifies delimiter for tabular data if `config[:delim]` does not exist. `levelname` is specifically used for methodtable, indicating the column representing calibration level; this column should be all integers.
 
 See README.md for the structure of ".dt" file.
 """
-function read_datatable(file::String, T; analytetype = String, delim = '\t')
+function read_datatable(file::String, T; analytetype = String, delim = '\t', levelname = nothing)
     endswith(file, ".dt") || throw(ArgumentError("The file is not a valid table directory"))
     config = read_config(joinpath(file, "config.txt"))
     delim = get(config, :delim, delim)
@@ -62,7 +62,7 @@ function read_datatable(file::String, T; analytetype = String, delim = '\t')
         RowDataTable(getproperty(tbl, analyte_name), analyte_name, sample_name, tbl)
     else
         sample_name = Symbol(first(split(config[:Sample], "\t")))
-        tbl = CSV.read(joinpath(file, "table.txt"), T; delim, typemap = Dict(Int => Float64), types = Dict(sample_name => String))
+        tbl = CSV.read(joinpath(file, "table.txt"), T; delim, typemap = Dict(Int => Float64), types = Dict(sample_name => String, levelname => Int), validate = false)
         analyte_name = String.(filter!(!isempty, vectorize(config[:Analyte])))
         config[:Type] == "C" || throw(ArgumentError("StackDataTable is not implemented yet."))
         for i in Symbol.(analyte_name)
@@ -111,8 +111,10 @@ function read_methodtable(file::String, T; tabletype = T, analytetype = String,
     isd = replace(analytetable.isd, missing => 0)
     conctable = read_datatable(joinpath(file, "true_concentration.dt"), T; analytetype, delim)
     if length(conctable.sample) > 1
-        signaltable = read_datatable(joinpath(file, "$signal.dt"), T; analytetype, delim)
-        if haskey(config, :pointlevel)
+        signaltable = read_datatable(joinpath(file, "$signal.dt"), T; analytetype, delim, levelname = get(config, :levelname, nothing))
+        if haskey(config, :levelname) && in(Symbol(config[:levelname]), propertynames(signaltable))
+            pointlevel = getproperty(signaltable, Symbol(config[:levelname]))
+        elseif haskey(config, :pointlevel)
             pointlevel = parse.(Int, config[:pointlevel])
         else
             nl = length(conctable.sample)
@@ -281,8 +283,13 @@ function write(file::String, tbl::MethodTable; delim = '\t')
     mkpath(file)
     write(joinpath(file, "true_concentration.dt"), tbl.conctable; delim)
     isnothing(tbl.signaltable) || write(joinpath(file, "$(tbl.signal).dt"), tbl.signaltable; delim)
+    id = nothing
+    if isa(tbl.signaltable, ColumnDataTable)
+        id = findfirst(x -> getproperty(tbl.signaltable.table, x) == tbl.pointlevel, propertynames(tbl.signaltable.table))
+    end
     open(joinpath(file, "config.txt"), "w+") do config
-        Base.write(config, "[signal]\n", tbl.signal, "\n\n[delim]\n", escape_string(string(delim)), "\n\n[pointlevel]\n", join(tbl.pointlevel, "\n"))
+        isnothing(id) ? Base.write(config, "[signal]\n", tbl.signal, "\n\n[delim]\n", escape_string(string(delim)), "\n\n[pointlevel]\n", join(tbl.pointlevel, "\n")) : 
+            Base.write(config, "[signal]\n", tbl.signal, "\n\n[delim]\n", escape_string(string(delim)), "\n\n[levelname]\n", propertynames(tbl.signaltable.table)[id])
     end
     CSV.write(joinpath(file, "analytetable.txt"), tbl.analytetable; delim)
 end
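
The updated `write` method above decides which section to emit in `config.txt`: if some column of the signal table equals `pointlevel`, it writes `[levelname]` with that column's name, and otherwise it writes the full `[pointlevel]` list. The sketch below reproduces that decision on plain Julia data so the two outputs can be compared. `method_config` and the `NamedTuple` table are hypothetical stand-ins, not the package's API.

```julia
# Hypothetical helper: build the text of config.txt for a method directory.
# `table` stands in for `tbl.signaltable.table`; columns are compared against `pointlevel`.
function method_config(signal::Symbol, delim::Char, pointlevel::Vector{Int}, table::NamedTuple)
    names = collect(propertynames(table))
    # Index of the first column whose values equal `pointlevel`, or `nothing` if none does.
    id = findfirst(name -> getproperty(table, name) == pointlevel, names)
    header = string("[signal]\n", signal, "\n\n[delim]\n", escape_string(string(delim)))
    if isnothing(id)
        # No matching column: store every level explicitly.
        return string(header, "\n\n[pointlevel]\n", join(pointlevel, "\n"))
    else
        # A matching column exists: store only its name.
        return string(header, "\n\n[levelname]\n", names[id])
    end
end

tbl = (Sample = ["1_1", "1_2", "2_1"], Level = [1, 1, 2], Analyte1 = [102.4, 105.4, 193.4])

with_column = method_config(:area, '\t', [1, 1, 2], tbl)
without_column = method_config(:area, '\t', [1, 2, 3], tbl)
```

One consequence of matching by value is that the first column equal to `pointlevel` is chosen, whether or not it was intended as the level column.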

diff --git a/test/data/initial_mc_c.batch/method.mt/area.dt/table.txt b/test/data/initial_mc_c.batch/method.mt/area.dt/table.txt
@@ -1,19 +1,19 @@
-Sample	Analyte1	Analyte2	Analyte3	Analyte4	Analyte5
-1_1	102.4212472	96.42752442	208.2656569	21.66200012	152.3809399
-1_2	105.4410582	106.9120134	212.3767634	22.22889158	156.5128989
-1_3	103.2483322	103.5855335	206.6471272	21.48193963	150.577609
-2_1	193.3835925	105.4176522	391.3028828	26.26856958	279.3593859
-2_2	189.8952021	101.023409	384.1323374	25.25630343	273.4204046
-2_3	180.5397414	107.4028693	364.5058631	25.70375674	261.389156
-5_1	538.3684182	96.11969193	1077.669754	23.9966764	758.3638326
-5_2	467.3572557	105.6740928	936.1787229	25.68064986	662.5070483
-5_3	479.9276947	103.9678058	969.3611237	24.22180868	684.2497913
-10_1	954.7897112	91.23431149	1909.639386	25.64967851	1344.238317
-10_2	999.231432	98.04434403	2002.599781	25.02116125	1406.082423
-10_3	1057.796026	94.88047873	2118.20738	25.32224711	1486.806282
-20_1	2090.971679	103.2068489	4188.142591	25.25308104	2935.471746
-20_2	1942.319983	104.7321677	3884.899366	25.33092479	2727.256898
-20_3	2048.176173	93.39275376	4096.421883	25.76772176	2872.776165
-50_1	5077.33551	99.20135272	10155.93833	24.62228954	7115.878054
-50_2	5123.199858	100.9526035	10253.68608	24.32681841	7184.179874
-50_3	5142.329223	97.2009272	10288.52817	26.27945024	7206.298386
+Sample	Level	Analyte1	Analyte2	Analyte3	Analyte4	Analyte5
+1_1	1	102.4212472	96.42752442	208.2656569	21.66200012	152.3809399
+1_2	1	105.4410582	106.9120134	212.3767634	22.22889158	156.5128989
+1_3	1	103.2483322	103.5855335	206.6471272	21.48193963	150.577609
+2_1	2	193.3835925	105.4176522	391.3028828	26.26856958	279.3593859
+2_2	2	189.8952021	101.023409	384.1323374	25.25630343	273.4204046
+2_3	2	180.5397414	107.4028693	364.5058631	25.70375674	261.389156
+5_1	3	538.3684182	96.11969193	1077.669754	23.9966764	758.3638326
+5_2	3	467.3572557	105.6740928	936.1787229	25.68064986	662.5070483
+5_3	3	479.9276947	103.9678058	969.3611237	24.22180868	684.2497913
+10_1	4	954.7897112	91.23431149	1909.639386	25.64967851	1344.238317
+10_2	4	999.231432	98.04434403	2002.599781	25.02116125	1406.082423
+10_3	4	1057.796026	94.88047873	2118.20738	25.32224711	1486.806282
+20_1	5	2090.971679	103.2068489	4188.142591	25.25308104	2935.471746
+20_2	5	1942.319983	104.7321677	3884.899366	25.33092479	2727.256898
+20_3	5	2048.176173	93.39275376	4096.421883	25.76772176	2872.776165
+50_1	6	5077.33551	99.20135272	10155.93833	24.62228954	7115.878054
+50_2	6	5123.199858	100.9526035	10253.68608	24.32681841	7184.179874
+50_3	6	5142.329223	97.2009272	10288.52817	26.27945024	7206.298386
diff --git a/test/data/initial_mc_c.batch/method.mt/config.txt b/test/data/initial_mc_c.batch/method.mt/config.txt
@@ -4,6 +4,9 @@ area
 [delim]
 \t
 
+[levelname]
+Level
+
 [pointlevel]
 1
 1

diff --git a/test/data/save_mc_c.batch/method.mt/area.dt/table.txt b/test/data/save_mc_c.batch/method.mt/area.dt/table.txt
@@ -1,19 +1,19 @@
-Sample	Analyte1	Analyte2	Analyte3	Analyte4	Analyte5
-1_1	102.4212472	96.42752442	208.2656569	21.66200012	152.3809399
-1_2	105.4410582	106.9120134	212.3767634	22.22889158	156.5128989
-1_3	103.2483322	103.5855335	206.6471272	21.48193963	150.577609
-2_1	193.3835925	105.4176522	391.3028828	26.26856958	279.3593859
-2_2	189.8952021	101.023409	384.1323374	25.25630343	273.4204046
-2_3	180.5397414	107.4028693	364.5058631	25.70375674	261.389156
-5_1	538.3684182	96.11969193	1077.669754	23.9966764	758.3638326
-5_2	467.3572557	105.6740928	936.1787229	25.68064986	662.5070483
-5_3	479.9276947	103.9678058	969.3611237	24.22180868	684.2497913
-10_1	954.7897112	91.23431149	1909.639386	25.64967851	1344.238317
-10_2	999.231432	98.04434403	2002.599781	25.02116125	1406.082423
-10_3	1057.796026	94.88047873	2118.20738	25.32224711	1486.806282
-20_1	2090.971679	103.2068489	4188.142591	25.25308104	2935.471746
-20_2	1942.319983	104.7321677	3884.899366	25.33092479	2727.256898
-20_3	2048.176173	93.39275376	4096.421883	25.76772176	2872.776165
-50_1	5077.33551	99.20135272	10155.93833	24.62228954	7115.878054
-50_2	5123.199858	100.9526035	10253.68608	24.32681841	7184.179874
-50_3	5142.329223	97.2009272	10288.52817	26.27945024	7206.298386
+Sample	Level	Analyte1	Analyte2	Analyte3	Analyte4	Analyte5
+1_1	1	102.4212472	96.42752442	208.2656569	21.66200012	152.3809399
+1_2	1	105.4410582	106.9120134	212.3767634	22.22889158	156.5128989
+1_3	1	103.2483322	103.5855335	206.6471272	21.48193963	150.577609
+2_1	2	193.3835925	105.4176522	391.3028828	26.26856958	279.3593859
+2_2	2	189.8952021	101.023409	384.1323374	25.25630343	273.4204046
+2_3	2	180.5397414	107.4028693	364.5058631	25.70375674	261.389156
+5_1	3	538.3684182	96.11969193	1077.669754	23.9966764	758.3638326
+5_2	3	467.3572557	105.6740928	936.1787229	25.68064986	662.5070483
+5_3	3	479.9276947	103.9678058	969.3611237	24.22180868	684.2497913
+10_1	4	954.7897112	91.23431149	1909.639386	25.64967851	1344.238317
+10_2	4	999.231432	98.04434403	2002.599781	25.02116125	1406.082423
+10_3	4	1057.796026	94.88047873	2118.20738	25.32224711	1486.806282
+20_1	5	2090.971679	103.2068489	4188.142591	25.25308104	2935.471746
+20_2	5	1942.319983	104.7321677	3884.899366	25.33092479	2727.256898
+20_3	5	2048.176173	93.39275376	4096.421883	25.76772176	2872.776165
+50_1	6	5077.33551	99.20135272	10155.93833	24.62228954	7115.878054
+50_2	6	5123.199858	100.9526035	10253.68608	24.32681841	7184.179874
+50_3	6	5142.329223	97.2009272	10288.52817	26.27945024	7206.298386
diff --git a/test/data/save_mc_c.batch/method.mt/config.txt b/test/data/save_mc_c.batch/method.mt/config.txt
@@ -4,22 +4,5 @@ area
 [delim]
 \t
 
-[pointlevel]
-1
-1
-1
-2
-2
-2
-3
-3
-3
-4
-4
-4
-5
-5
-5
-6
-6
-6
+[levelname]
+Level
diff --git a/test/runtests.jl b/test/runtests.jl
@@ -81,7 +81,7 @@ end
             :point; 
             analytetype = AnalyteTest
         )
-        global method = MethodTable(conctable, signaltable, :area, repeat(1:7, 3); analyte = AnalyteTest.(analyte_names), isd = [2, -1, 4, -1], calibration = [1, -1, 3, -1])
+        global method = MethodTable(conctable, signaltable, :area, :point; analyte = AnalyteTest.(analyte_names), isd = [2, -1, 4, -1], calibration = [1, -1, 3, -1])
         global cdata = AnalysisTable([:area], [
             ColumnDataTable(
                 DataFrame(