Lexical variables in regex code blocks not propagated (SEGFAULT) #2683

Closed
p5pRT opened this issue Oct 5, 2000 · 10 comments

p5pRT commented Oct 5, 2000

Migrated from rt.perl.org#4383 (status was 'resolved')

Searchable as RT4383$

p5pRT commented Oct 5, 2000

From @Tux

--8<---
# cat re.pl
#!/usr/bin/perl -w

use strict;

my %hash = (
    old => {
        Data    => [ "Date=05 Oct 2000, 15:17", "Version=1.00" ],
        Version => "0.00",
        Date    => "01 Jan 1970, 00:00",
        },
    new => {
        Data    => [ "Date=05 Oct 2000, 13:11", "Version=0.07" ],
        Version => "0.00",
        Date    => "01 Jan 1970, 00:00",
        },
    );

print "\n=== Extract lexical\n";
foreach my $h (qw( old new )) {
    grep m/^(?{print "$_\n"})
            (Date|Version)=(.*)
            (?{ $hash{$h}{$1} = $2 })/x => @{$hash{$h}{Data}};
    }

print "=== Report\n";
foreach my $h (qw( old new )) {
    foreach my $k (qw( Date Version )) {
        print join ("." => $h, $k, $hash{$h}{$k}), "\n";
        }
    }

print "\n=== Extract global\n";
use vars qw($h);
foreach $h (qw( old new )) {
    grep m/^(?{print "$_\n"})
            (Date|Version)=(.*)
            (?{ $hash{$h}{$1} = $2 })/x => @{$hash{$h}{Data}};
    }

print "=== Report\n";
foreach $h (qw( old new )) {
    foreach my $k (qw( Date Version )) {
        print join ("." => $h, $k, $hash{$h}{$k}), "\n";
        }
    }
# perl5.7.0 re.pl

=== Extract lexical
Date=05 Oct 2000, 15:17
Use of uninitialized value in hash element at (re_eval 2) line 1.
Use of uninitialized value in hash element at (re_eval 2) line 1.
Version=1.00
Use of uninitialized value in hash element at (re_eval 2) line 1.
Date=05 Oct 2000, 13:11
Use of uninitialized value in hash element at (re_eval 2) line 1.
Version=0.07
Use of uninitialized value in hash element at (re_eval 2) line 1.
=== Report
old.Date.01 Jan 1970, 00:00
old.Version.0.00
new.Date.01 Jan 1970, 00:00
new.Version.0.00

=== Extract global
Date=05 Oct 2000, 15:17
Version=1.00
Date=05 Oct 2000, 13:11
Version=0.07
=== Report
old.Date.05 Oct 2000, 15:17
old.Version.1.00
new.Date.05 Oct 2000, 13:11
new.Version.0.07
new.Version.0.07
#
-->8---

where it proves that a lexical $h isn't known inside (?{ $hash{$h}{$1} = $2 })
where the global (localized by foreach) $h is.
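
For anyone hitting this on a perl without a fix, one way to sidestep the problem is to drop the code block and do the assignment in ordinary Perl after a plain capturing match. The following is only a minimal sketch, assuming the same %hash layout as re.pl; the loop variable $line is made up for illustration, and nothing here relies on (?{ ... }).

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Same layout as %hash in re.pl.
my %hash = (
    old => {
        Data    => [ "Date=05 Oct 2000, 15:17", "Version=1.00" ],
        Version => "0.00",
        Date    => "01 Jan 1970, 00:00",
    },
    new => {
        Data    => [ "Date=05 Oct 2000, 13:11", "Version=0.07" ],
        Version => "0.00",
        Date    => "01 Jan 1970, 00:00",
    },
);

foreach my $h (qw( old new )) {
    foreach my $line (@{ $hash{$h}{Data} }) {
        # Plain captures only; the assignment runs in normal code,
        # where the lexical $h is visible.
        if ($line =~ /^(Date|Version)=(.*)/) {
            $hash{$h}{$1} = $2;
        }
    }
}

foreach my $h (qw( old new )) {
    foreach my $k (qw( Date Version )) {
        print join("." => $h, $k, $hash{$h}{$k}), "\n";
    }
}
```

The report loop should then print the same four lines as the "Extract global" variant, because the lexical $h is never looked up from inside the regex engine.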

# perl5.7.0 -V
Summary of my perl5 (revision 5.0 version 7 subversion 0) configuration:
  Platform:
    osname=hpux, osvers=11.00, archname=PA-RISC2.0
    uname='hp-ux l1 b.11.00 u 9000800 527706567 unlimited-user license '
    config_args='-ds'
    hint=recommended, useposix=true, d_sigaction=define
    usethreads=undef use5005threads=undef useithreads=undef usemultiplicity=undef
    useperlio=undef d_sfio=undef uselargefiles=define usesocks=undef
    use64bitint=undef use64bitall=undef uselongdouble=undef
  Compiler:
    cc='cc', ccflags =' -DDEBUGGING -Ae -D_HPUX_SOURCE -I/pro/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 ',
    optimize='+O2 +Onolimit',
    cppflags='-DDEBUGGING -Ae -D_HPUX_SOURCE -I/pro/local/include'
    ccversion='A.11.01.21505.GP', gccversion='', gccosandvers=''
    intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=4321
    d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=16
    ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
    alignbytes=8, usemymalloc=y, prototype=define
  Linker and Libraries:
    ld='ld', ldflags ='-L/pro/local/lib -Wl,+vnocompatwarnings'
    libpth=/pro/local/lib /lib /usr/lib /usr/ccs/lib /usr/local/lib
    libs=-lnsl -lnm -lndbm -lgdbm -ldb -ldld -lm -lc -lndir -lcrypt -lsec
    libc=/lib/libc.sl, so=sl, useshrplib=false, libperl=libperl.a
  Dynamic Linking:
    dlsrc=dl_hpux.xs, dlext=sl, d_dlsymun=undef, ccdlflags='-Wl,-E -Wl,-B,deferred '
    cccdlflags='+z', lddlflags='-b +vnocompatwarnings -L/pro/local/lib'

Characteristics of this binary (from libperl):
  Compile-time options: DEBUGGING USE_LARGE_FILES
  Locally applied patches:
    DEVEL7149
  Built under hpux
  Compiled at Oct 5 2000 11:47:16
  @INC:
    /pro/lib/perl5/5.7.0/PA-RISC2.0
    /pro/lib/perl5/5.7.0
    /pro/lib/perl5/site_perl/5.7.0/PA-RISC2.0
    /pro/lib/perl5/site_perl/5.7.0
    /pro/lib/perl5/site_perl/5.005/PA-RISC2.0
    /pro/lib/perl5/site_perl/5.005
    /pro/lib/perl5/site_perl
    .

p5pRT commented Oct 5, 2000

From @vanstyn

In <20001005152535.33DD.H.M.BRAND@​hccnet.nl>, "H.Merijn Brand" writes​:
[...]
:where it proves that a lexical $h isn't known inside (?{ $hash{$h}{$1} = $2 })
:where the global (localized by foreach) $h is.

Thanks for the report. This is a known problem, and on my list. :(
The primary problem, I think, is that it is not well defined how
lexicals in re_eval blocks should be looked up - I have had some
discussion with Sarathy about this. He noted in particular that
sv_compile_2op has no clearly defined scratchpad associated with
the compiled tree, which violates assumptions made by the rest of
perl. He suggested it must have a well-defined scratchpad that​:
- is associated with it for its entire lifetime
- does not violate any assumptions made in the pad_*() functions
  about a scratchpad having an associated array of names that
  doesn't grow at runtime
- has a PL_curpad that is stable (i.e. it can't "go away" before
  the body of code compiles and subsequently executes)
- is capable of supporting recursion of the optree associated with
  it (i.e. it is a stacked array of arrays)
- is capable of being cloned when appropriate (just like any pad
  associated with anon-CVs and any thing "callable" is)
- and possibly more

Sarathy​: <<
  One approach to fixing it would be to either make it more like a
pure subroutine, or like a pure eval"" (depending on whichever semantic
is more appropriate for (?{...}). I'm inclined towards the former.

A particularly difficult aspect of the problem, however, is the
question of when the regexp (or fragment) is compiled. In code such
as this​:

  our $a = 1;
  my $re;
  {
      my $a = 2;
      $re = qr{(?{ print $a })(??{ $a })};
  }
  {
      my $a = 3;
      "123" =~ $re and print $&;
      "123" =~ /($re)/ and print $1;
  }

.. it is not at all clear to me what results we should expect, only
that we ain't there yet.
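
To see which variable a given perl actually picks, the question can be turned into a small probe that records values instead of printing them from inside the pattern. This is only a sketch: the names $n and @seen are made up for illustration, and no expected output is given because the result depends on the perl being tested (an affected perl may warn or misbehave here).

```perl
use strict;
use warnings;

our $n = 1;              # package variable
my (@seen, $re);
{
    my $n = 2;           # lexical in scope where the qr// is written
    $re = qr{(?{ push @seen, $n })};
}
{
    my $n = 3;           # lexical in scope where the matches happen
    "123" =~ $re;        # qr// object used directly
    "123" =~ /($re)/;    # qr// object interpolated into another pattern
}

# One entry per match: whichever $n the code block resolved to.
print "code block saw: @seen\n";
```

My understanding is that perls carrying the eventual fix treat a code block in a literal qr// as a closure, so both matches would see the $n from the block where the qr// was written. Treat that as something to confirm with the probe rather than as a guarantee.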

Hugo

p5pRT commented Feb 25, 2002

From @Tux

On Thu 05 Oct 2000 16:26, Hugo <hv@crypt.compulink.co.uk> wrote:

> In <20001005152535.33DD.H.M.BRAND@hccnet.nl>, "H.Merijn Brand" writes:
> [...]
> :where it proves that a lexical $h isn't known inside (?{ $hash{$h}{$1} = $2 })
> :where the global (localized by foreach) $h is.
>
> Thanks for the report. This is a known problem, and on my list. :(

Solved before 5.8?
Still on your list?
My latest blead still complains, and there are several bug reports 'bout this
in bugtron

--
H.Merijn Brand Amsterdam Perl Mongers (http​://amsterdam.pm.org/)
using perl-5.6.1, 5.7.2 & 631 on HP-UX 10.20 & 11.00, AIX 4.2, AIX 4.3,
  WinNT 4, Win2K pro & WinCE 2.11. Smoking perl CORE​: smokers@​perl.org
http​://archives.develooper.com/daily-build@​perl.org/ perl-qa@​perl.org
send smoke reports to​: smokers-reports@​perl.org, QA​: http​://qa.perl.org

p5pRT commented Feb 27, 2002

From @vanstyn

"H.Merijn Brand" <h.m.brand@​hccnet.nl> wrote​:
:On Thu 05 Oct 2000 16​:26, Hugo <hv@​crypt.compulink.co.uk> wrote​:
:> In <20001005152535.33DD.H.M.BRAND@​hccnet.nl>, "H.Merijn Brand" writes​:
:> [...]
:> :where it proves that a lexical $h isn't known inside (?{ $hash{$h}{$1} = $2 })
:> :where the global (localized by foreach) $h is.
:>
:> Thanks for the report. This is a known problem, and on my list. :(
:
:Solved before 5.8?

Unlikely, now.

:Still on your list?

Yes. Volunteers welcome, though. :)

Hugo

p5pRT commented Feb 28, 2002

From @Tux

On Wed 27 Feb 2002 21:38, Hugo van der Sanden <hv@crypt.compulink.co.uk> wrote:

> "H.Merijn Brand" <h.m.brand@hccnet.nl> wrote:
> :On Thu 05 Oct 2000 16:26, Hugo <hv@crypt.compulink.co.uk> wrote:
> :> In <20001005152535.33DD.H.M.BRAND@hccnet.nl>, "H.Merijn Brand" writes:
> :> [...]
> :> :where it proves that a lexical $h isn't known inside (?{ $hash{$h}{$1} = $2 })
> :> :where the global (localized by foreach) $h is.
> :>
> :> Thanks for the report. This is a known problem, and on my list. :(
> :
> :Solved before 5.8?
>
> Unlikely, now.
>
> :Still on your list?
>
> Yes. Volunteers welcome, though. :)

If I wave my hands any faster I will lift off

> Hugo

p5pRT commented Nov 8, 2005

From @smpeters

[hmbrand - Wed Oct 04 23​:48​:50 2000]​:

> where it proves that a lexical $h isn't known inside (?{ $hash{$h}{$1} = $2 })
> where the global (localized by foreach) $h is.

I just tried this with bleadperl, and got a coredump for my efforts.
It appears that the uninitialized variables in the regex are wreaking
havoc on S_varname(). Sorry, I don't have a Perl built with -g at the
moment, but when I do, I'll post the backtrace.

p5pRT commented Nov 8, 2005

From @smpeters

[stmpeters - Mon Nov 07 16:10:22 2005]:
> I just tried this with bleadperl, and got a coredump for my efforts.
> It appears that the uninitialized variables in the regex are wreaking
> havoc on S_varname(). Sorry, I don't have a Perl built with -g at the
> moment, but when I do, I'll post the backtrace.

#0  0x1c097ac6 in S_varname (gv=0x0, gvtype=36 '$', targ=2, keyname=0x0,
    aindex=0, subscript_type=1) at sv.c:704
704         sv = *av_fetch(av, targ, FALSE);
(gdb) bt
#0  0x1c097ac6 in S_varname (gv=0x0, gvtype=36 '$', targ=2, keyname=0x0,
    aindex=0, subscript_type=1) at sv.c:704
#1  0x1c097f58 in S_find_uninit_var (obase=0x7ff4aa80, uninit_sv=0x82e60130,
    match=0 '\0') at sv.c:812
#2  0x1c098172 in S_find_uninit_var (obase=0x7ff4aaa0, uninit_sv=0x82e60130,
    match=0 '\0') at sv.c:864
#3  0x1c098825 in Perl_report_uninit (uninit_sv=0x82e60130) at sv.c:1056
#4  0x1c09cd70 in Perl_sv_2pv_flags (sv=0x82e60130, lp=0xcfbf864c, flags=34)
    at sv.c:3214
#5  0x1c08465d in S_hv_fetch_common (hv=0x7da00ea0, keysv=0x82e60130, key=0x0,
    klen=0, flags=0, action=4, val=0x0, hash=0) at hv.c:433
#6  0x1c0845e9 in Perl_hv_fetch_ent (hv=0x7da00ea0, keysv=0x82e60130, lval=1,
    hash=0) at hv.c:411
#7  0x1c091502 in Perl_pp_helem () at pp_hot.c:1711
#8  0x1c0757bf in Perl_runops_debug () at dump.c:1597
#9  0x1c0f5623 in S_regmatch (prog=0x852a8394) at regexec.c:3213
#10 0x1c0f90e0 in S_regmatch (prog=0x852a8374) at regexec.c:4131
#11 0x1c0f374c in S_regmatch (prog=0x852a8344) at regexec.c:2734
#12 0x1c0f1f58 in S_regtry (prog=0x852a8300,
    startpos=0x7ff4aa00 "Date=05 Oct 2000, 15:17") at regexec.c:2223
#13 0x1c0f06a2 in Perl_regexec_flags (prog=0x852a8300,
    stringarg=0x7ff4aa00 "Date=05 Oct 2000, 15:17", strend=0x7ff4aa17 "",
    strbeg=0x7ff4aa00 "Date=05 Oct 2000, 15:17", minend=0, sv=0x7da002f0,
    data=0x0, flags=3) at regexec.c:1764
#14 0x1c08fbb0 in Perl_pp_match () at pp_hot.c:1298
#15 0x1c0757bf in Perl_runops_debug () at dump.c:1597
#16 0x1c01a748 in S_run_body (oldscope=1) at perl.c:2298
#17 0x1c01a2dd in perl_run (my_perl=0x8b444030) at perl.c:2225
#18 0x1c015bf3 in main (argc=2, argv=0xcfbf923c, env=0xcfbf9248)
    at perlmain.c:103

p5pRT commented Nov 8, 2005

From @iabyn

On Mon, Nov 07, 2005 at 04:14:46PM -0800, Steve Peters via RT wrote:

> [stmpeters - Mon Nov 07 16:10:22 2005]:
> > I just tried this with bleadperl, and got a coredump for my efforts.
> > It appears that the uninitialized variables in the regex are wreaking
> > havoc on S_varname(). Sorry, I don't have a Perl built with -g at the
> > moment, but when I do, I'll post the backtrace.

I think this is another instance of "lexicals in re_evals are borked". I'll
add it to my list of things to check once I've fixed that.

--
The Enterprise's efficient long-range scanners detect a temporal vortex
distortion in good time, allowing it to be safely avoided via a minor
course correction.
  -- Things That Never Happen in "Star Trek" #21

p5pRT commented Jun 14, 2012

From @cpansprout

This has been fixed by the commits leading up to eb58a7e.

p5pRT commented Jun 14, 2012

@cpansprout - Status changed from 'open' to 'resolved'
