Perfomance improvement of Tickets->_parser call

Hello.
What is patch for:

  1. Get rid from copy of incoming args in _match.

  2. Decrease _match() calss.

    1. $curent = XXX if(_match) sequence changed to if(_match) {$current =
      XXX} elsif(_match) sequence in reverse order. So one _match on loop step.

    2. implement $want check before _match, bin logic op surely faster then
      (sub call+regexp). This also descrease _match calls.

  3. base reqexps refactoring. [><=] better then (?:>|<|=).

  4. another small things.

Profiling with attached script:

  1. without patch
    -d:DProf bug_report3.pl && dprofpp -G “(::BEGIN)” 2>/dev/null
    Option G Grouping: [(::BEGIN)]
    Grouping [(::BEGIN)] Calls: [796]
    Grouping [(::BEGIN)] Times: [66.5214000000006]
    Grouping [(::BEGIN)] IncTimes: [421.515600000001]
    Total Elapsed Time = 10.18922 Seconds
    User+System Time = 9.539225 Seconds
    Exclusive Times
    %Time ExclSec CumulS #Calls sec/call Csec/c Name
    34.7 3.315 3.315 62250 0.0001 0.0001 RT::tickets::_match
    9.41 0.898 6.717 656 0.0014 0.0102 RT::tickets::_parser
    6.97 0.665 4.215 796 0.0008 0.0053 (::BEGIN)
    4.49 0.428 0.477 5244 0.0001 0.0001
    DBIx::SearchBuilder::_GenericRestriction
    4.08 0.389 0.608 5244 0.0001 0.0001 Regexp::Common::_decache
    4.06 0.387 1.011 5244 0.0001 0.0002 DBIx::SearchBuilder::Limit
    2.33 0.222 0.315 3 0.0742 0.1051
    Locale::Maketext::Lexicon::Gettext::parse
    2.20 0.210 0.210 5244 0.0000 0.0000
    Regexp::Common::delimited::gen_delimited
    1.99 0.190 0.190 5244 0.0000 0.0000
    Regexp::Common::Entry::_clone_with
    1.80 0.172 1.039 4589 0.0000 0.0002 RT::SearchBuilder::Limit
    1.68 0.160 0.468 655 0.0002 0.0007 DBIx::SearchBuilder::_DoCount
    1.36 0.130 0.154 2621 0.0000 0.0001 RT::tickets::Limit
    1.35 0.129 0.368 5244 0.0000 0.0001
    Regexp::Common::Entry::ANON
    1.27 0.121 0.278 5241 0.0000 0.0001 RT::CurrentUser::loc
    1.24 0.118 0.187 10488 0.0000 0.0000 Regexp::Common::new

  2. With patch
    sudo perl -d:DProf bug_report3.pl && dprofpp -G “(::BEGIN)” 2>/dev/null
    Option G Grouping: [(::BEGIN)]
    Grouping [(::BEGIN)] Calls: [797]
    Grouping [(::BEGIN)] Times: [69.4010000000002]
    Grouping [(::BEGIN)] IncTimes: [437.141750000001]
    Total Elapsed Time = 6.195472 Seconds
    User+System Time = 5.705472 Seconds
    Exclusive Times
    %Time ExclSec CumulS #Calls sec/call Csec/c Name
    12.8 0.734 0.734 18347 0.0000 0.0000 RT::tickets::_match
    12.1 0.694 4.371 797 0.0009 0.0055 (::BEGIN)
    7.97 0.455 2.402 656 0.0007 0.0037 RT::tickets::_parser
    6.45 0.368 0.961 5244 0.0001 0.0002 DBIx::SearchBuilder::Limit
    5.92 0.338 0.428 5244 0.0001 0.0001
    DBIx::SearchBuilder::_GenericRestriction
    3.70 0.211 0.304 3 0.0702 0.1014
    Locale::Maketext::Lexicon::Gettext::parse
    2.93 0.167 0.999 4589 0.0000 0.0002 RT::SearchBuilder::Limit
    2.42 0.138 0.138 657 0.0002 0.0002 RT::User::_ClassAccessible
    2.23 0.127 0.160 2621 0.0000 0.0001 RT::tickets::Limit
    2.02 0.115 0.503 655 0.0002 0.0008 DBIx::SearchBuilder::_DoCount
    1.96 0.112 0.210 655 0.0002 0.0003 RT::Principal::HasRight
    1.89 0.108 0.108 656 0.0002 0.0002
    RT::tickets::_RestrictionsToClauses
    1.58 0.090 0.090 7889 0.0000 0.0000
    DBIx::SearchBuilder::Record::id
    1.56 0.089 0.166 3291 0.0000 0.0001
    DBIx::SearchBuilder::Record::Cachable::_gen_primary_cache_key
    1.56 0.089 0.571 655 0.0001 0.0009 RT::tickets::_LinkLimit

bug_report3.pl (324 Bytes)

_parser_perfomance.patch (3.74 KB)

  1. Get rid from copy of incoming args in _match.

  2. Decrease _match() calss.

  3. $curent = XXX if(_match) sequence changed to if(_match)
    {$current = XXX} elsif(_match) sequence in reverse order. So one
    _match on loop step.

  4. implement $want check before _match, bin logic op surely faster
    then (sub call+regexp). This also descrease _match calls.

  5. base reqexps refactoring. [><=] better then (?:>|<|=).

  6. another small things.

These are all microoptimizations and I highly doubt they’re causing
the time difference you’re seeing. Some of them make the code harder
to read or less consistent, which is a negative.

-R

Robert Spier wrote:

  1. Get rid from copy of incoming args in _match.

  2. Decrease _match() calss.

  3. $curent = XXX if(_match) sequence changed to if(_match)
    {$current = XXX} elsif(_match) sequence in reverse order. So one
    _match on loop step.

  4. implement $want check before _match, bin logic op surely faster
    then (sub call+regexp). This also descrease _match calls.

  5. base reqexps refactoring. [><=] better then (?:>|<|=).

  6. another small things.

These are all microoptimizations and I highly doubt they’re causing
the time difference you’re seeing. Some of them make the code harder
to read or less consistent, which is a negative.

Say it Linux Kernels developers especially about __* subs :slight_smile:
_parser and _match are private calls. Exactly this calls should be as
fast as posible, because they do work that in genaral case RT shouldn’t
do: deparse SQL query to finaly produce same SQL query. I don’t complain
on RT waste CPU time at all.

If you look on profile then you see that _match called 60000 times
without patch. it’s simple redexp check, it should be inline, but perl
can’t inline such code so IMHO it should be optimized or better inlined
by hands. Wasting ticks. PARSER which do useless calls is bad parser.
Patch decrease number of calls ~3times and makes them only when it’s
really needed.
I never see CodingStyle guidelines which say “readability has advantage
over code sanity”.

_parser implement $want flag, but doesn’t use this, patch fix this issue.

I wrote comments about four different issues, but you just say that it
all is crap. sad.
Good luck. Ruslan.

Hello, Robert.
I revert regexps changes and some other, but add other changes.

Not it has next changes:

  1. get rid from _match.

  2. split PAREN into OPEN_PAREN & CLOSE_PAREN so parser know more about
    errors, such as ‘)(’, ‘OR )’…

  3. fix one ‘fixme’

  4. fix always negative error check.

     	Best regards. Ruslan.
    

[snip]

_parser_perfomance.patch (6.18 KB)

So, a year later, I find this in my inbox.

Sheepish apologies for sitting on this.

It looks good to me. Can you write up a Benchmark for it so we can
make sure it actually makes things faster?

Also,

  • Highest priority is last

  • $current = OP if _match($re_op,$val);
  • $current = VALUE if _match($re_value,$val);
  • $current = KEYWORD if _match($re_keyword,$val) && ($want & KEYWORD);
  • $current = AGGREG if _match($re_aggreg,$val);
  • $current = PAREN if _match($re_paren,$val);
  • foreach( OPEN_PAREN, CLOSE_PAREN, AGGREG, OP, KEYWORD, VALUE ) {

The priority order has changed. In theory, that could change how
certain constructs are parsed, because they match another regexp. In
practice, it may not matter.

-R

At Mon, 17 May 2004 03:03:46 +0400,
Ruslan U. Zakirov wrote: