explain using a precompiled pattern is for speed #2

Open
urbanjost opened this issue Dec 23, 2022 · 0 comments

In your documentation you indicate you are not sure why the pattern should be precompiled. When the same pattern is used
repeatedly, precompiling it is faster. Using gfortran and searching for "war" and "peace" in a text version of "War and Peace", the
same code runs significantly faster with a precompiled pattern than with a string pattern. With debug flags the time drops from about 19 seconds to 13.5 seconds; with optimization flags on, from around 5.15 to 2.5 seconds. In the timings below, "seek" is the precompiled version (source at the end) and "fortran-regex" is the string-pattern version. Of course grep does it in 0.03 seconds, so there is room for
optimization! (My version is much slower than grep as well :>).

# default fpm install
+  seek           war    +  wc  1219  15378  99104  real  0m13.551s  user  0m13.549s  sys  0m0.081s
+  fortran-regex  war    +  wc  1219  15378  99104  real  0m18.924s  user  0m18.909s  sys  0m0.097s
+  seek           peace  +  wc  128   1596   10145  real  0m13.721s  user  0m13.709s  sys  0m0.027s
+  fortran-regex  peace  +  wc  128   1596   10145  real  0m19.231s  user  0m19.219s  sys  0m0.028s
+  grep           peace  +  wc  128   1468   8516   real  0m0.027s   user  0m0.014s   sys  0m0.023s
# release profile
+  seek           war    +  wc  1219  15378  99104  real  0m2.467s  user  0m2.498s  sys  0m0.046s
+  fortran-regex  war    +  wc  1219  15378  99104  real  0m5.145s  user  0m5.153s  sys  0m0.076s
+  seek           peace  +  wc  128   1596   10145  real  0m2.463s  user  0m2.447s  sys  0m0.031s
+  fortran-regex  peace  +  wc  128   1596   10145  real  0m5.140s  user  0m5.120s  sys  0m0.038s
+  grep           peace  +  wc  128   1468   8516   real  0m0.026s  user  0m0.021s  sys  0m0.015s

$ cat seek.f90

program demo_regex
use regex_module
implicit none
character(len=1024) :: line=''
character(len=:),allocatable :: argument
integer             :: ios, ln, indx, i
type(regex_op) :: re
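   ! Read the regular expression from the first command-line argument
   call get_command_argument(1,length=ln)
   allocate(character(len=ln) :: argument)
   call get_command_argument(1,argument)
   if(argument.eq.'')stop 'missing regular expression'

   ! Parse the pattern into a regex structure once, before the loop
   re = parse_pattern(argument)

   ! Read standard input line by line; i doubles as the line number
   INFINITE: do i=1,huge(0)-1
      read(*,'(a)',iostat=ios)line
      if(ios.ne.0)exit INFINITE
      ! Reuse the precompiled pattern; ln>0 is treated as a match
      indx=regex(string=line,pattern=re,length=ln)
      if (ln>0) then
         ! Print line number, matched column range, and the line itself
         write(*,'((i6.6,":",i0,"-",i0,": ",g0))')i,indx,indx+ln-1,trim(line)
      endif
   enddo INFINITE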
end program demo_regex

You might just want to mention that when the same pattern is used over and over, precompiling it can provide a significant improvement in speed.
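For anyone who wants to see the effect without a large input file, here is a rough in-process timing sketch. It assumes the `regex` generic also accepts a character pattern (the "string pattern" case above) with the same `string=`, `pattern=`, `length=` keywords; the program name, the sample text, and the repetition count `nrep` are made up for illustration. Treat it as a sketch, not a careful benchmark.

```fortran
program time_precompiled
use, intrinsic :: iso_fortran_env, only : int64
use regex_module
implicit none
integer, parameter          :: nrep = 100000   ! arbitrary repetition count
character(len=*), parameter :: text = 'a long tale of war and peace'
character(len=*), parameter :: pat  = 'peace'
type(regex_op)  :: re
integer         :: i, indx, ln, hits
integer(int64)  :: t0, t1, rate

   ! Variant 1: string pattern, so the pattern is parsed on every call
   ! (assumes regex() has a character-pattern overload)
   hits = 0
   call system_clock(t0, rate)
   do i = 1, nrep
      indx = regex(string=text, pattern=pat, length=ln)
      if (ln > 0) hits = hits + 1
   enddo
   call system_clock(t1)
   write(*,'(a,f0.3,a,i0)') 'string pattern:      ', real(t1-t0)/real(rate), ' s, hits=', hits

   ! Variant 2: parse once outside the loop, then reuse the compiled pattern
   re = parse_pattern(pat)
   hits = 0
   call system_clock(t0, rate)
   do i = 1, nrep
      indx = regex(string=text, pattern=re, length=ln)
      if (ln > 0) hits = hits + 1
   enddo
   call system_clock(t1)
   write(*,'(a,f0.3,a,i0)') 'precompiled pattern: ', real(t1-t0)/real(rate), ' s, hits=', hits
end program time_precompiled
```

The only difference between the two loops is where the pattern gets parsed, so any gap in the reported times should come from repeated parsing. Actual numbers will depend on the compiler, the flags, and the pattern.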
