SegFault when #optimizing trivial bracket #7

Closed
orottier opened this issue Mar 4, 2014 · 7 comments · Fixed by #477
Labels
bug Something isn't working

Comments

@orottier

orottier commented Mar 4, 2014

Hello,

The following FORM program causes a segmentation fault:

Symbol x;
Local expr = x;
Bracket x;
Print;
.sort

Format O3;
#optimize expr
Print;
.end

If the bracket is nontrivial,

Symbol x,y;
Local expr = x*(12+y);

the program runs fine.

This issue also occurred with Format O0, but I think that has been fixed by commit 8ee418f.

@vermaseren
Owner

Hi Otto,

I will take a look at it. Probably totally trivial.

Jos

@jPhy

jPhy commented Mar 27, 2019

I got a bug report concerning pySecDec today which I could track down to an issue of exactly this kind.

In my case, the minimal program I could come up with is

Symbols x,y,z;
L expr = 1;
AB x,y,z;
Format O3;
.sort
#optimize expr
.end

It works with optimization levels lower than 3 but fails with O3 and O4.

@tueda
Collaborator

tueda commented Jul 18, 2020

Now we have a memory leak:

$ valgrind --leak-check=full ./vorm test.frm
==195783== Memcheck, a memory error detector
==195783== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==195783== Using Valgrind-3.15.0 and LibVEX; rerun with -h for copyright info
==195783== Command: ./vorm test.frm
==195783==
==195783== Warning: set address range perms: large range [0xfbe1040, 0x2d8b7540) (undefined)
==195783== Warning: set address range perms: large range [0x2d8b8040, 0x4b58e540) (undefined)
==195783== Warning: set address range perms: large range [0x59eaf040, 0x9e678bc0) (undefined)
FORM 4.2.1 (Jul 17 2020, v4.2.1-20-g9c0c031) 64-bits  Run: Sat Jul 18 17:18:08 2020
    Symbol x;
    Local expr = x;
    Bracket x;
    Print;
    .sort

Time =       0.07 sec    Generated terms =          1
            expr         Terms in output =          1
                         Bytes used      =         48

   expr =
       + x * ( 1 );


    Format O3;
    #optimize expr

Time =       0.25 sec    Generated terms =          1
            expr         Terms in output =          1
                         Bytes used      =         36
    Print;
    .end

Time =       0.25 sec    Generated terms =          1
            expr         Terms in output =          1
                         Bytes used      =         36
      expr=x;


  0.26 sec out of 0.26 sec
==195783==
==195783== HEAP SUMMARY:
==195783==     in use at exit: 2,315,431,072 bytes in 134 blocks
==195783==   total heap usage: 423 allocs, 289 frees, 2,338,358,484 bytes allocated
==195783==
==195783== 48 bytes in 1 blocks are definitely lost in loss record 35 of 131
==195783==    at 0x4C2AF73: malloc (vg_replace_malloc.c:309)
==195783==    by 0x505EA7: Malloc1 (tools.c:2250)
==195783==    by 0x46D842: generate_output(std::vector<int, std::allocator<int> > const&, int, int, std::vector<std::vector<int, std::allocator<int> >, std::allocator<std::vector<int, std::allocator<int> > > > const&) (optimize.cc:4364)
==195783==    by 0x479E79: Optimize (optimize.cc:4708)
==195783==    by 0x4B0146: DoOptimize (pre.c:6653)
==195783==    by 0x4B505D: PreProInstruction (pre.c:1160)
==195783==    by 0x4B5BC5: PreProcessor (pre.c:936)
==195783==    by 0x4EF371: main (startup.c:1619)
==195783==
==195783== LEAK SUMMARY:
==195783==    definitely lost: 48 bytes in 1 blocks
==195783==    indirectly lost: 0 bytes in 0 blocks
==195783==      possibly lost: 0 bytes in 0 blocks
==195783==    still reachable: 2,315,431,024 bytes in 133 blocks
==195783==         suppressed: 0 bytes in 0 blocks
==195783== Reachable blocks (those to which a pointer was found) are not shown.
==195783== To see them, rerun with: --leak-check=full --show-leak-kinds=all
==195783==
==195783== For lists of detected and suppressed errors, rerun with: -s
==195783== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
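
The trace above shows a block from Malloc1 in generate_output() that is never released. One common way such a leak appears is an early return for a trivial input that skips the matching free. The sketch below illustrates that pattern and one way to guard against it; it is not FORM code, and emit_terms, MallocBuffer and FreeDeleter are hypothetical names chosen for the illustration. Whether this is the actual cause in optimize.cc is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <memory>
#include <vector>

// Deleter so that std::unique_ptr releases malloc'd memory with free().
struct FreeDeleter {
    void operator()(void *p) const noexcept { std::free(p); }
};
using MallocBuffer = std::unique_ptr<int[], FreeDeleter>;

// Hypothetical stand-in for a routine that allocates scratch space first and
// only afterwards notices that the input is trivial. Returns the number of
// ints appended to `sink`.
inline std::size_t emit_terms(const std::vector<int> &terms, std::vector<int> &sink) {
    const std::size_t capacity = terms.size() + 1;
    MallocBuffer scratch(static_cast<int *>(std::malloc(capacity * sizeof(int))));
    if (!scratch) return 0;       // allocation failed
    if (terms.empty()) return 0;  // trivial input: early return, scratch is still freed
    std::memcpy(scratch.get(), terms.data(), terms.size() * sizeof(int));
    sink.insert(sink.end(), scratch.get(), scratch.get() + terms.size());
    return terms.size();
}
```

Because the buffer is owned by a smart pointer, every return path releases it, including the one taken for an empty input.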

benruijl added a commit that referenced this issue Jul 26, 2020
- Fixed memory leak for trivial bracket, mentioned in issue #7
tueda added a commit that referenced this issue Aug 10, 2020
Note: the test "Issue7_2" is affected by #364.
@tueda
Collaborator

tueda commented Aug 10, 2020

A function in optimize.cc, generate_output(), still has a subtle problem: we are never allowed to pass a null pointer as the 2nd argument of memcpy, and the compiler sanitizer detects that we do (see the log).

In a simple example this could lead to a catastrophic result, but fortunately I think our routine is too complicated, and present-day compilers are not smart enough, for this undefined behaviour to be exploited for optimizations.

I will fix it in the next commit.
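
A minimal sketch of the kind of guard that avoids this, assuming the null pointer comes from an empty buffer (for example, data() of an empty std::vector may be null). The helpers copy_ints and append_all are hypothetical and are not the names used in optimize.cc; the actual fix may look different.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical helper: copy `count` ints from `src` to `dst`, skipping the
// memcpy call entirely when there is nothing to copy, so that a null `src`
// is never handed to memcpy.
inline void copy_ints(int *dst, const int *src, std::size_t count) {
    if (count == 0) return;  // nothing to copy; src and dst may be null here
    assert(dst != nullptr && src != nullptr);
    std::memcpy(dst, src, count * sizeof(int));
}

// Illustrative use: append the contents of a possibly empty vector to `out`.
inline void append_all(std::vector<int> &out, const std::vector<int> &in) {
    const std::size_t old_size = out.size();
    out.resize(old_size + in.size());
    // in.data() may be null when `in` is empty; copy_ints handles that case.
    copy_ints(out.data() + old_size, in.data(), in.size());
}
```

Checking the count before the call keeps the trivial case (nothing to copy) away from memcpy altogether, so the sanitizer has nothing to report for it.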

@tueda
Collaborator

tueda commented Aug 10, 2020

Next, I need to think about a memory leak in optimization with ParFORM, which should be easy to fix once I set up an environment where MPI and Valgrind work correctly.

@tueda
Collaborator

tueda commented Apr 20, 2023

Let's reconsider Otto's example:

Symbol x;
Local expr = x;
Bracket x;
Print;
.sort

Format O3;
#optimize expr
Print;
.end

Before Ben's commit 0a07b8c, FORM crashed like this:

FORM 4.2.1 (Jul 17 2020, v4.2.1-16-ge66efc5) 64-bits  Run: Thu Apr 20 20:35:04 2023
    Symbol x;
    Local expr = x;
    Bracket x;
    Print;
    .sort

Time =       0.04 sec    Generated terms =          1
            expr         Terms in output =          1
                         Bytes used      =         48

   expr =
       + x * ( 1 );

    
    Format O3;
    #optimize expr
==868183== Invalid read of size 4
==868183==    at 0x190058: next_MCTS_scheme(std::vector<int, std::allocator<int> >*, std::vector<int, std::allocator<int> >*, std::vector<tree_node*, std::allocator<tree_node*> >*) (optimize.cc:1877)
==868183==    by 0x190848: find_Horner_MCTS_expand_tree() (optimize.cc:2012)
==868183==    by 0x1914D7: find_Horner_MCTS() (optimize.cc:2262)
==868183==    by 0x191A77: Optimize (optimize.cc:4640)
==868183==    by 0x1CCB74: DoOptimize (pre.c:6653)
==868183==    by 0x1D1F97: PreProInstruction (pre.c:1160)
==868183==    by 0x1D2B81: PreProcessor (pre.c:936)
==868183==    by 0x20F96F: main (startup.c:1619)
==868183==  Address 0xfffffffffffffffc is not stack'd, malloc'd or (recently) free'd
==868183== 
Program terminating at test.frm Line 7 --> 
==868183== Invalid read of size 4
==868183==    at 0x22A65E: Crash (tools.c:3771)
==868183==    by 0x20F1EF: Terminate (startup.c:1721)
==868183==    by 0x20F805: onErrSig (startup.c:1489)
==868183==    by 0x4C7751F: ??? (in /usr/lib/x86_64-linux-gnu/libc.so.6)
==868183==    by 0x190057: next_MCTS_scheme(std::vector<int, std::allocator<int> >*, std::vector<int, std::allocator<int> >*, std::vector<tree_node*, std::allocator<tree_node*> >*) (optimize.cc:1877)
==868183==    by 0x190848: find_Horner_MCTS_expand_tree() (optimize.cc:2012)
==868183==    by 0x1914D7: find_Horner_MCTS() (optimize.cc:2262)
==868183==    by 0x191A77: Optimize (optimize.cc:4640)
==868183==    by 0x1CCB74: DoOptimize (pre.c:6653)
==868183==    by 0x1D1F97: PreProInstruction (pre.c:1160)
==868183==    by 0x1D2B81: PreProcessor (pre.c:936)
==868183==    by 0x20F96F: main (startup.c:1619)
==868183==  Address 0x0 is not stack'd, malloc'd or (recently) free'd
==868183== 
==868183== 
==868183== Process terminating with default action of signal 11 (SIGSEGV)
==868183==  Access not within mapped region at address 0x0
==868183==    at 0x22A65E: Crash (tools.c:3771)
==868183==    by 0x20F1EF: Terminate (startup.c:1721)
==868183==    by 0x20F805: onErrSig (startup.c:1489)
==868183==    by 0x4C7751F: ??? (in /usr/lib/x86_64-linux-gnu/libc.so.6)
==868183==    by 0x190057: next_MCTS_scheme(std::vector<int, std::allocator<int> >*, std::vector<int, std::allocator<int> >*, std::vector<tree_node*, std::allocator<tree_node*> >*) (optimize.cc:1877)
==868183==    by 0x190848: find_Horner_MCTS_expand_tree() (optimize.cc:2012)
==868183==    by 0x1914D7: find_Horner_MCTS() (optimize.cc:2262)
==868183==    by 0x191A77: Optimize (optimize.cc:4640)
==868183==    by 0x1CCB74: DoOptimize (pre.c:6653)
==868183==    by 0x1D1F97: PreProInstruction (pre.c:1160)
==868183==    by 0x1D2B81: PreProcessor (pre.c:936)
==868183==    by 0x20F96F: main (startup.c:1619)
==868183==  If you believe this happened as a result of a stack
==868183==  overflow in your program's main thread (unlikely but
==868183==  possible), you can try to increase the size of the
==868183==  main thread stack using the --main-stacksize= flag.
==868183==  The main thread stack size used in this run was 8388608.
==868183== 
==868183== HEAP SUMMARY:
==868183==     in use at exit: 2,337,009,638 bytes in 148 blocks
==868183==   total heap usage: 187 allocs, 39 frees, 2,337,057,521 bytes allocated
==868183== 
==868183== LEAK SUMMARY:
==868183==    definitely lost: 0 bytes in 0 blocks
==868183==    indirectly lost: 0 bytes in 0 blocks
==868183==      possibly lost: 0 bytes in 0 blocks
==868183==    still reachable: 2,337,009,638 bytes in 148 blocks
==868183==         suppressed: 0 bytes in 0 blocks
==868183== Rerun with --leak-check=full to see details of leaked memory
==868183== 
==868183== For lists of detected and suppressed errors, rerun with: -s
==868183== ERROR SUMMARY: 2 errors from 2 contexts (suppressed: 0 from 0)

But now:

FORM 5.0.0-beta.1 (Apr 19 2023, v5.0.0-beta.1-12-g6217eca)  Run: Thu Apr 20 20:37:37 2023
    Symbol x;
    Local expr = x;
    Bracket x;
    Print;
    .sort

Time =       0.04 sec    Generated terms =          1
            expr         Terms in output =          1
                         Bytes used      =         48

   expr =
       + x * ( 1 );

    
    Format O3;
    #optimize expr
==874566== Conditional jump or move depends on uninitialised value(s)
==874566==    at 0x1FE597: TestSub (proces.c:1021)
==874566==    by 0x1FA24A: Generator (proces.c:3260)
==874566==    by 0x19E989: generate_expression(int) (optimize.cc:4457)
==874566==    by 0x1B06C5: Optimize (optimize.cc:4785)
==874566==    by 0x1EBB6E: DoOptimize (pre.c:6807)
==874566==    by 0x1F1478: PreProInstruction (pre.c:1232)
==874566==    by 0x1F208E: PreProcessor (pre.c:1009)
==874566==    by 0x23087D: main (startup.c:1688)
==874566== 
==874566== Conditional jump or move depends on uninitialised value(s)
==874566==    at 0x1925B9: Normalize (normal.c:2051)
==874566==    by 0x1FA26E: Generator (proces.c:3272)
==874566==    by 0x19E989: generate_expression(int) (optimize.cc:4457)
==874566==    by 0x1B06C5: Optimize (optimize.cc:4785)
==874566==    by 0x1EBB6E: DoOptimize (pre.c:6807)
==874566==    by 0x1F1478: PreProInstruction (pre.c:1232)
==874566==    by 0x1F208E: PreProcessor (pre.c:1009)
==874566==    by 0x23087D: main (startup.c:1688)
==874566== 
==874566== Conditional jump or move depends on uninitialised value(s)
==874566==    at 0x1925C2: Normalize (normal.c:2051)
==874566==    by 0x1FA26E: Generator (proces.c:3272)
==874566==    by 0x19E989: generate_expression(int) (optimize.cc:4457)
==874566==    by 0x1B06C5: Optimize (optimize.cc:4785)
==874566==    by 0x1EBB6E: DoOptimize (pre.c:6807)
==874566==    by 0x1F1478: PreProInstruction (pre.c:1232)
==874566==    by 0x1F208E: PreProcessor (pre.c:1009)
==874566==    by 0x23087D: main (startup.c:1688)
==874566== 
==874566== Conditional jump or move depends on uninitialised value(s)
==874566==    at 0x1925CB: Normalize (normal.c:2055)
==874566==    by 0x1FA26E: Generator (proces.c:3272)
==874566==    by 0x19E989: generate_expression(int) (optimize.cc:4457)
==874566==    by 0x1B06C5: Optimize (optimize.cc:4785)
==874566==    by 0x1EBB6E: DoOptimize (pre.c:6807)
==874566==    by 0x1F1478: PreProInstruction (pre.c:1232)
==874566==    by 0x1F208E: PreProcessor (pre.c:1009)
==874566==    by 0x23087D: main (startup.c:1688)
==874566== 
==874566== Conditional jump or move depends on uninitialised value(s)
==874566==    at 0x18DE09: Normalize (normal.c:279)
==874566==    by 0x1FA26E: Generator (proces.c:3272)
==874566==    by 0x19E989: generate_expression(int) (optimize.cc:4457)
==874566==    by 0x1B06C5: Optimize (optimize.cc:4785)
==874566==    by 0x1EBB6E: DoOptimize (pre.c:6807)
==874566==    by 0x1F1478: PreProInstruction (pre.c:1232)
==874566==    by 0x1F208E: PreProcessor (pre.c:1009)
==874566==    by 0x23087D: main (startup.c:1688)
==874566== 

...FORM freezes...

It seems that something different is happening...

@tueda tueda reopened this Apr 20, 2023
@tueda
Collaborator

tueda commented Aug 14, 2023

If I revert the change for the simultaneous optimization of expressions (namely, brackets) in generate_output():

1d4b775?w=1#diff-b041575ff5a324d5d3b06ee7e9d3ac313fab602aad39112ca26348a28ce039baL4330-R4359

then the code seems to work correctly, at least for the trivial case.

So, what was the intention and the background of this change? Did it try to fix something and actually break the code in the trivial case?

tueda added a commit to tueda/form that referenced this issue Feb 24, 2024
tueda added a commit to tueda/form that referenced this issue Feb 24, 2024
@tueda tueda linked a pull request Feb 24, 2024 that will close this issue
tueda added a commit to tueda/form that referenced this issue Feb 24, 2024
tueda added a commit that referenced this issue Mar 15, 2024