
Thermalization of CPU LB broken #3804

Closed
pkreissl opened this issue Jul 17, 2020 · 7 comments · Fixed by #3847

Comments

@pkreissl
Contributor

IMHO, LB thermalization is broken for CPU. Minimal example:

import argparse

parser = argparse.ArgumentParser()
parser.add_argument("--use_gpu", default=False, action="store_true")
parser.add_argument("--kt", type=float, default=0)
args = parser.parse_args()

import espressomd
import espressomd.lb
from espressomd.observables import LBFluidPressureTensor

system = espressomd.System(box_l=[20, 20, 20])
system.time_step = 0.01
system.cell_system.skin = 0.4
if args.use_gpu:
    lbf = espressomd.lb.LBFluidGPU(agrid=1.0, dens=1.9, visc=1.0, tau=0.01, kT=args.kt, seed=7)
else:
    lbf = espressomd.lb.LBFluid(agrid=1.0, dens=1.9, visc=1.0, tau=0.01, kT=args.kt, seed=7)
system.actors.add(lbf)
lb_pressure = LBFluidPressureTensor()
for i in range(2):  # change as you like
    system.integrator.run(1000)  # same here, problem persists
    print(lb_pressure.calculate())

Testing via run_minimal.sh:

#!/bin/bash

echo "CPU, kt=0:"
./pypresso minimal.py
echo "GPU, kt=0:"
./pypresso minimal.py --use_gpu
echo "CPU, kt=1:"
./pypresso minimal.py --kt 1
echo "GPU, kt=1:"
./pypresso minimal.py --use_gpu --kt 1

yields:

$ ./run_minimal.sh 
CPU, kt=0:
[[6333.33325386    0.            0.        ]
 [   0.         6333.33325386    0.        ]
 [   0.            0.         6333.33325386]]
[[6333.33325386    0.            0.        ]
 [   0.         6333.33325386    0.        ]
 [   0.            0.         6333.33325386]]
GPU, kt=0:
WARNING: More than one GPU detected, please note ESPResSo uses device 0 by default regardless of usage or capability. The GPU to be used can be modified by setting System.cuda_init_handle.device.
[[6333.33353698    0.            0.        ]
 [   0.         6333.33353698    0.        ]
 [   0.            0.         6333.33353698]]
[[6333.33353698    0.            0.        ]
 [   0.         6333.33353698    0.        ]
 [   0.            0.         6333.33353698]]
CPU, kt=1:
[[6.39378045e+03 1.90899139e+00 1.95662446e+00]
 [1.90899139e+00 6.38988958e+03 1.94024253e+00]
 [1.95662446e+00 1.94024253e+00 6.38844944e+03]]
[[6.39384017e+03 1.91773972e+00 2.03000427e+00]
 [1.91773972e+00 6.39001967e+03 1.98502662e+00]
 [2.03000427e+00 1.98502662e+00 6.38863714e+03]]
GPU, kt=1:
WARNING: More than one GPU detected, please note ESPResSo uses device 0 by default regardless of usage or capability. The GPU to be used can be modified by setting System.cuda_init_handle.device.
[[ 6.33442971e+03 -2.38058610e-02 -6.62311537e-02]
 [-2.38058610e-02  6.33435239e+03  4.31334905e-02]
 [-6.62311537e-02  4.31334905e-02  6.33439990e+03]]
[[ 6.33472237e+03 -5.51996531e-02 -2.82031257e-02]
 [-5.51996531e-02  6.33468426e+03  1.38701318e-02]
 [-2.82031257e-02  1.38701318e-02  6.33475568e+03]]

The kT=0 case is fine, but with thermalization the CPU produces off-diagonal pressure values of order $10^0$, whereas the GPU gives order $10^{-2}$, which seems more reasonable, I think.
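For reference, a small NumPy helper (illustrative only, not part of the bug report; the helper name is hypothetical) that pulls out the off-diagonal magnitudes being compared above:

```python
import numpy as np

def max_off_diagonal(tensor):
    """Largest absolute off-diagonal element of a 3x3 pressure tensor."""
    t = np.asarray(tensor)
    return np.max(np.abs(t[~np.eye(3, dtype=bool)]))

# Values copied from the CPU and GPU kT=1 outputs above.
cpu_kt1 = [[6.39378045e+03, 1.90899139e+00, 1.95662446e+00],
           [1.90899139e+00, 6.38988958e+03, 1.94024253e+00],
           [1.95662446e+00, 1.94024253e+00, 6.38844944e+03]]
gpu_kt1 = [[6.33442971e+03, -2.38058610e-02, -6.62311537e-02],
           [-2.38058610e-02, 6.33435239e+03, 4.31334905e-02],
           [-6.62311537e-02, 4.31334905e-02, 6.33439990e+03]]
print(max_off_diagonal(cpu_kt1))  # ~2.0, i.e. order 10^0
print(max_off_diagonal(gpu_kt1))  # ~0.07, i.e. order 10^-2
```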

@RudolfWeeber
Contributor

RudolfWeeber commented Jul 17, 2020 via email

@pkreissl
Contributor Author

pkreissl commented Jul 17, 2020

With GPU, integrating the ACF produces roughly reasonable values for the viscosity via Green-Kubo; for CPU, the ACF does not even decay yet (~1e6 sampling steps, and the CPU version runs A LOT slower than the GPU one...).
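For context, a minimal sketch of the Green-Kubo estimate referred to here, assuming a time series of an off-diagonal pressure-tensor component has already been recorded (function and variable names are hypothetical, not ESPResSo API):

```python
import numpy as np

def green_kubo_viscosity(sigma_xy, kT, volume, dt):
    """Shear viscosity via the Green-Kubo relation
    eta = V / kT * integral_0^inf <sigma_xy(0) sigma_xy(t)> dt.
    Uses a direct O(n^2) autocorrelation for clarity, truncated at n/2 lags."""
    s = np.asarray(sigma_xy) - np.mean(sigma_xy)
    n = len(s)
    acf = np.array([np.mean(s[:n - lag] * s[lag:]) for lag in range(n // 2)])
    return volume / kT * np.trapz(acf, dx=dt)
```

In practice the integral must be truncated once the ACF has decayed into noise; if the ACF has not decayed at all (as reported for the CPU here), the estimate is meaningless.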

@pkreissl
Contributor Author

Concerning Ulf's PhD thesis: I have not had much to do with it yet apart from some browsing, so I am not immediately familiar with the different notations and definitions used there. I will give it a try; however, it will probably be a lot faster if someone more involved with this, e.g. @mkuron or @KaiSzuttor, could comment...

@pkreissl
Contributor Author

pkreissl commented Jul 17, 2020

ACF of the off-diagonal elements using 400000 samples (and a slightly different LB parameter set, nothing fancy), evaluated every second time step, for GPU:
Screenshot from 2020-07-17 13-49-50
and for CPU:
Screenshot from 2020-07-17 13-50-35

@pkreissl
Contributor Author

In an offline discussion, @RudolfWeeber suggested that this issue might be a regression. I checked ESPResSo version 4.1.0: same problem. For 4.0.0 there doesn't seem to be a fluid stress observable at all (neither LBFluidPressureTensor nor LBFluidStress)?! So this problem does indeed seem to have been around for some time.

@KaiSzuttor
Member

How do you know that the issue has been around for some time if you cannot check for versions older than 4.1.0 (which is less than a year old)?

@pkreissl
Contributor Author

pkreissl commented Jul 22, 2020

How do you know that the issue has been around for some time if you cannot check for versions older than 4.1.0 (which is less than a year old)?

Let me rephrase that: I couldn't find the feature before 4.1.0 (if it just had a different name, please tell me and I'll check for the issue). The LBFluidStress observable was introduced with 4.1.0 (PR #2054), so it seems to have been broken from the beginning...

@pkreissl pkreissl changed the title Combination of Thermalization & LBFluidStressTensor broken Thermalization of CPU LB broken Aug 3, 2020
@kodiakhq kodiakhq bot closed this as completed in #3847 Sep 30, 2020
kodiakhq bot added a commit that referenced this issue Sep 30, 2020
Fixes #3804 , fixes #3772

The issue was actually a regression introduced with the switch to Philox in commit
[f3cc4ba](f3cc4ba): random numbers formerly drawn from the interval (-0.5, 0.5] were replaced by random numbers in (0, 1].
With this fix, `lb_pressure_tensor_acf.py` runs successfully for both CPU and GPU. However, as @KaiSzuttor correctly mentioned in PR #3831, the CPU part of the test takes a while to execute (on my machine, single core, the whole test takes 136 s). I could try to make it faster, which would, however, require tweaking the tolerance limits `tol_node` and `tol_global`. @RudolfWeeber, what was your reasoning behind the chosen limits? Or are they semi-arbitrary choices?

Further, this PR corrects the comparison of the off-diagonal elements `avg_ij` vs. `avg_ji` in the test.
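The nature of the regression can be illustrated with plain NumPy (a hedged sketch, not the actual Philox code path in ESPResSo): noise drawn from a unit interval starting at 0 has mean 0.5, so using it where zero-mean noise from (-0.5, 0.5] is expected biases the thermal fluctuations; subtracting 0.5 restores the intended zero-mean interval.

```python
import numpy as np

rng = np.random.default_rng(2020)
u = rng.random(100_000)   # uniform in [0, 1): mean ~0.5, biased if used as noise
shifted = u - 0.5         # uniform in [-0.5, 0.5): zero mean, as the LB expects

print(f"mean(u) = {u.mean():.3f}, mean(u - 0.5) = {shifted.mean():.3f}")
```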
jngrad pushed a commit to jngrad/espresso that referenced this issue Oct 13, 2020