-
Notifications
You must be signed in to change notification settings - Fork 123
volesti vs cobra toolbox
In this thread we usevolesti
to sample feasible flux solutions in a metabolic network and we compare its efficiency with Matlab toolbox cobra. Given the stoichiometric matrix of a metabolic network the set of feasible flux distributions (steady states) is defined by a convex polytope. Sampling uniformly distributed steady states is a powerful tool to study metabolism under changing conditions. In particular, one could estimate the probability density mass of the marginal distribution of a specific reductase flux.
We present this topic to state that volesti
could be a good alternative to cobra for studying metabolic networks. More interestingly, using volesti
could lead in more efficient scaling with respect to the dimension of polytopes and thus possible to study currently intractable models. To this end, the three following steps should be implemented:
- Implement the rounding algorithms in here and here.
- Implement preprocess methods to compute a high dimensional polytope from the data set.
- Implement an interface to illustrate sampling outcomes.
We uniformly sample steady states from the metabolic network of Escherichia coli given here to illustrate the marginal distribution of ATP synthase reductase flux. In this example e-coli
results to a 24-dimensional polytope that we have to sample uniformly distributed points from it.
The following Matlab script samples uniformly distributed points from the convex polytope induced by the metabolic network of Escherichia coli and creates the histogram of the values of coordinates that correspond to ATP synthase reductase flux (12th coordinate). It uses the CHRR algorithm.
load('e_coli_core.mat')
model = e_coli_core;
ATP = 'ATPS4r';
ibm = find(ismember(model.rxns, ATP)); % column index of the ATP demand reaction
model.c = zeros(length(model.c),1);
model.csense=model.csense';
[~, roundedPolytope, ~, ~] = chrrSampler(model, 1, 1,1);
nbins = 20;
sampling_times = [];
counter = 1;
for numSteps = [1 5:5:40]
tic
samples = genSamples( roundedPolytope,numSteps, 2000);
sampling_times = [sampling_times toc];
[yUn, xUn] = hist(samples(ibm, :), nbins);
subplot(3, 3, counter);
plot(xUn, yUn);
xlabel('Flux (mmol/gDW/h)')
ylabel('# samples')
title(strcat('number of Steps = ', num2str(numSteps)));
counter = counter + 1;
end
We get the following plots
volesti
does not support low dimensional sampling neither the rounding method implemented in cobra
.
The following Matlab script saves in a cell the matrices we need to construct the polytope in R
and then map the sampled points to the low dimensional polytope in up dimension.
poly=cell(4,1);
poly{1} = roundedPolytope.A;
poly{2} = roundedPolytope.b;
poly{5} = roundedPolytope.N;
poly{6} = roundedPolytope.p_shift;
save('roundedPolytope.mat','poly')
end
Now we use R
package R.matlab to load the matrices in R
and to construct the polytope.
library(volesti)
library(R.matlab)
modelmat = readMat('roundedPolytope.mat')
roundedPoly = modelmat$poly
A = roundedPoly[1]
A = A[[1]]
A = A[[1]]
b = roundedPoly[2]
b = b[[1]]
b = b[[1]]
G = roundedPoly[5]
G = G[[1]]
G = G[[1]]
shift = roundedPoly[6]
shift = shift[[1]]
shift = shift[[1]]
P = Hpolytope$new(A=A, b=b)
Now we sample with both Coordinate Hit and Run (CDHR), which is the same with CHRR as we sample in the rounded polytope, and Billiard walk.
N=2000
sampling_time_cdhr = c()
for (walk_length in c(1, seq(from=5,to=40,by=5))) {
tim = system.time({ points = sample_points(P, n=N, random_walk = list("walk" = "CDHR",
"walk_length"=walk_length))
points = G%*%points + matrix(rep(shift,N),ncol=N) })
sampling_time_cdhr = c(sampling_time_cdhr, as.numeric(tim)[3])
}
sampling_time_biw = c()
for (walk_length in 1:9) {
tim = system.time({ points = sample_points(P, n=N, random_walk = list("walk" = "aBiW",
"walk_length"=walk_length))
points = G%*%points + matrix(rep(shift,N),ncol=N) })
sampling_time_biw = c(sampling_time_biw, as.numeric(tim)[3])
}
For CDHR we get the following histograms, which behave almost the same with those of cobra.
For Billiard walk we get the following histograms.
Notice that the histogram implies the correct distribution even the walk length equals to one. The latter implies that Billiard walk mixes faster than CDHR (and CHRR).
--- | volesti | volesti | COBRA |
---|---|---|---|
Walk length | Billiard walk time | CDHR time | CHRR time |
1/1/1 | 0.099 | 0.007 | 3.5299 |
2/5/5 | 0.179 | 0.011 | 3.5976 |
3/10/10 | 0.268 | 0.015 | 3.7210 |
4/15/15 | 0.347 | 0.019 | 3.8905 |
5/20/20 | 0.443 | 0.023 | 4.0119 |
6/25/25 | 0.522 | 0.028 | 4.0606 |
7/30/30 | 0.607 | 0.032 | 4.2878 |
8/35/35 | 0.685 | 0.036 | 4.5809 |
9/40/40 | 0.775 | 0.040 | 5.0799 |
The implementation of CDHR in volesti
is 100 times faster than CDRR. However, we expect that billiard walk will be much more efficient than both CDHR and CHRR for high dimensional metabolic networks due to its superior mixing rate.