Improve parameter optimisation algorithm #11

jeffersonfparil · 2024-02-01T01:43:12Z

It's starting to seem like the parameter spaces are not smooth and unimodal. See the sensitivity analysis currently running in perf.md.

jeffersonfparil · 2024-02-01T04:31:14Z

We can try the search from both ends, i.e. from one end then the other with the stopping criteria remaining the same, i.e. if stepping further increases MAE then we stop and use the lowest parameter value yielding the lowest MAE. We do this from smallest to larges parameter value, then again from the largest to the smallest value. Finally, among the two "optimal" parameter values we select the one which yields the smaller MAE.

jeffersonfparil · 2024-02-01T04:56:40Z

The previous proposal does not fully appreciate the complexity in the parameter spaces! The trends in MAE across one parameter space is unlikely to be the same across all the levels of all the other parameters! This means that the current simplistic coordinate-descent-like optimisation may prove to be suboptimal.

However, we can perform a similar coordinate-descent-like optimisation where we start from each corner of the 4-dimensional parameter space! Do we zigzag from each corner? Currently, what we do is start from one corner then move along a single dimension, then once the minimum MAE is found, we fix this dimension at the "optimum" value, and jump to the next dimension, and so on. Can we efficiently lick all 4 dimensions before each step from each corner?

Also, just a hopeful sidenote: Despite these, maybe the current output is too granular, and maybe as we get more info it all smooths out and the parameter spaces become more or less smooth and unimodal?

tested optimisation modifcations and resolcing issue #11

jeffersonfparil · 2024-02-04T23:13:34Z

My optimisation revision is resulting to worse concordances and MAE! See commit 91b023b for comparison.

jeffersonfparil · 2024-02-05T00:50:52Z

Testing optimisation changes here. Main changes:

optimisations of first corr, then dist, then l, and finally k
performing forwards then backwards pass per parameter space

jeffersonfparil · 2024-02-05T02:32:37Z

adding parameter space exploration from the middle forwards and backwards up to the ends see commit aec01bc

jeffersonfparil · 2024-02-06T05:10:58Z

going across all 4 parameter spaces twice in order of corr, dist, l and k
also increasing the break threshold from f64::EPSILON to 0.0001
see commit 8c10911

jeffersonfparil · 2024-02-08T02:56:26Z

drafting optimisation per locus, i.e. optimising for the minimum loci correlation and maximum pool distances per missing data point

jeffersonfparil · 2024-02-09T05:23:45Z

The most recent merge from dev:

optimisation per locus across min_loci_corr and max_pool_dist only
outputting intermediate imputation files which will be helpful for large datasets
double parallelisation across chunks of the genome via std::thread::scope and per chunk across all alleles/loci via Zip

jeffersonfparil · 2024-02-12T01:53:52Z

To address the glacial pace at which the new optimisation algorithm proceeds, we added an additional inner-loop-break condition, i.e. when the last 3 consecutive MAEs are the same.

jeffersonfparil · 2024-02-12T22:56:59Z

Still too slow!

reducing the number of steps per parameter space to explore
using &[T] instead of &Vec<T>
~~using format! instead of + to concatenate strings~~ (expensive and have to be owned to store in variables)

jeffersonfparil · 2024-02-13T23:28:07Z

improving computation time by skipping involved imputation (and optimisation) of the last allele per locus and just using the additive inverse of the sum of the imputed allele frequencies prior to fixing over-flows --> did not improve time noticeably and so removed

jeffersonfparil · 2024-02-15T00:27:29Z

looking like we have the first stable version I am happy with
Main things I have stuck with:
- using default parameters: min_loci_corr=0.9, max_pool_dist=0.1, min_l_loci=20, and min_k_neighbours=5
- using optimisation per locus where we have min_l_loci and min_k_neighbours fixed and we only explore the finitely upper-bounded min_loci_corr and max_pool_dist parameter spaces ranging from 0.0 to 1.0 and stepping 0.1 at a time.
- performing independent imputation per allele per locus and then correcting for under- and over-flows is like an extra imputation replicate which is improving our accuracy

jeffersonfparil added a commit that referenced this issue Feb 2, 2024

Merge pull request #12 from jeffersonfparil/dev

f73de13

tested optimisation modifcations and resolcing issue #11

jeffersonfparil closed this as completed Feb 2, 2024

jeffersonfparil reopened this Feb 4, 2024

jeffersonfparil added a commit that referenced this issue Feb 15, 2024

looking like this is our first stable version (see issue #11)

4e3164c

jeffersonfparil mentioned this issue Feb 16, 2024

Move out of the complexity of R packaging #19

Closed

jeffersonfparil closed this as completed Feb 19, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Improve parameter optimisation algorithm #11

Improve parameter optimisation algorithm #11

jeffersonfparil commented Feb 1, 2024

jeffersonfparil commented Feb 1, 2024

jeffersonfparil commented Feb 1, 2024 •

edited

Loading

jeffersonfparil commented Feb 4, 2024

jeffersonfparil commented Feb 5, 2024 •

edited

Loading

jeffersonfparil commented Feb 5, 2024

jeffersonfparil commented Feb 6, 2024

jeffersonfparil commented Feb 8, 2024

jeffersonfparil commented Feb 9, 2024

jeffersonfparil commented Feb 12, 2024

jeffersonfparil commented Feb 12, 2024 •

edited

Loading

jeffersonfparil commented Feb 13, 2024 •

edited

Loading

jeffersonfparil commented Feb 15, 2024

Improve parameter optimisation algorithm #11

Improve parameter optimisation algorithm #11

Comments

jeffersonfparil commented Feb 1, 2024

jeffersonfparil commented Feb 1, 2024

jeffersonfparil commented Feb 1, 2024 • edited Loading

jeffersonfparil commented Feb 4, 2024

jeffersonfparil commented Feb 5, 2024 • edited Loading

jeffersonfparil commented Feb 5, 2024

jeffersonfparil commented Feb 6, 2024

jeffersonfparil commented Feb 8, 2024

jeffersonfparil commented Feb 9, 2024

jeffersonfparil commented Feb 12, 2024

jeffersonfparil commented Feb 12, 2024 • edited Loading

jeffersonfparil commented Feb 13, 2024 • edited Loading

jeffersonfparil commented Feb 15, 2024

jeffersonfparil commented Feb 1, 2024 •

edited

Loading

jeffersonfparil commented Feb 5, 2024 •

edited

Loading

jeffersonfparil commented Feb 12, 2024 •

edited

Loading

jeffersonfparil commented Feb 13, 2024 •

edited

Loading