Behaviour correction for SB-0.88 (improves Oracle support)

Brook_Schofield1 · July 16, 2003, 4:15pm

Jesse (and interested Oracle parties),

The _isJoined function is supposed to “Returns true if this
Searchbuilder requires joins between tables” which is more like - will
it or has it performed a join between tables. It seems all too eager to
return true. This causes problems for the Oracle implementation with
SQL being generated:

SELECT main.* FROM ( SELECT DISTINCT main.id FROM Queues main )
distinctquery, Queues main WHERE (main.id = distinctquery.id) AND
((main.Disabled = ‘0’))

rather than the more simpler:

SELECT main.* FROM Queues main WHERE (main.Disabled = ‘0’)

so I suggest the following patch to SB.pm - which could reduce the
overhead for other databases as well.

sub _isJoined {
my $self = shift;
if (keys(%{$self->{‘left_joins’}})) {
return(1);
} else {
return(@{$self->{‘aliases’}});
}
}

Which return ‘true’ if there is more than 2 aliases or left_join keys.
This should be the correct behaviour for all databases - but testing is
obviously required.

I’ve also changed the order of the build of the SQL statements within
_DoSearch (no changes required for _DoCount). Potentially the ORDER BY
clause could be moved ahead of the DISTINCT component of the query -
and this move would be required if a query was ever ordered by a column
not in the primary table. I don’t know whether that is possible within
SearchBuilder - but useful for reference. I don’t know whether the next
join preserves the order of the DISTINCT sub select - I’ll ask some
DBAs. If it does - then I’ll suggest the movement of the _OrderClass
call.

FYI the SQL being generated was nesting the DISTINCT without the WHERE
clauses which meant that the identifiers for the query weren’t
available in the outer select - causing an error.

I’m still having trouble with the Apache::Session stuff - but looking
good otherwise. RT3 (3.0.3pre3 because I haven’t updated yet) operates
as expected - other than having to continually login - but I’m getting
there.

Also, checkout http://rt3.fsck.com/Ticket/Display.html?id=2432 if you
need any support files to start playing with Oracle/RT - not much
should have changed since then. I’ll roll new patches against 3.0.4
if/when I get Apache::Session support working.

-Brook

sb_88_oracle.patch (3.13 KB)

Paul_Wagland · July 17, 2003, 9:35am

Figured I would chime in… I can’t say whether the change is correct or
not, will look into testing this next Monday… off to Frankfurt for the REM
concert in the meantime

But with postgres, this statement appears to be true:

return true. This causes problems for the Oracle implementation with
SQL being generated:

SELECT main.* FROM ( SELECT DISTINCT main.id FROM Queues main )
distinctquery, Queues main WHERE (main.id = distinctquery.id) AND
((main.Disabled = ‘0’))

rather than the more simpler:

SELECT main.* FROM Queues main WHERE (main.Disabled = ‘0’)

so I suggest the following patch to SB.pm - which could reduce the
overhead for other databases as well.

Here is the explain texts for these. (We have 18 queues, 1 of which is
disabled). Mind you, this probably won’t make a huge difference in the
interface… but every little bit helps I guess…

rt3=# explain SELECT main.* FROM ( SELECT DISTINCT main.id FROM Queues
main ) distinctquery, Queues main WHERE (main.id = distinctquery.id) AND
((main.Disabled = ‘0’));
QUERY PLAN
Hash Join (cost=1.65…2.96 rows=1 width=116)
Hash Cond: (“outer”.id = “inner”.id)
→ Seq Scan on queues main (cost=0.00…1.23 rows=17 width=112)
Filter: (disabled = 0::smallint)
→ Hash (cost=1.65…1.65 rows=2 width=4)
→ Subquery Scan distinctquery (cost=1.56…1.65 rows=2 width=4)
→ Unique (cost=1.56…1.65 rows=2 width=4)
→ Sort (cost=1.56…1.60 rows=18 width=4)
Sort Key: id
→ Seq Scan on queues main (cost=0.00…1.18
rows=18 width=4)
(10 rows)

rt3=# explain SELECT main.* FROM Queues main WHERE (main.Disabled = ‘0’);
QUERY PLAN
Seq Scan on queues main (cost=0.00…1.23 rows=17 width=112)
Filter: (disabled = 0::smallint)
(2 rows)

Cheers,
Paul

Brook_Schofield1 · July 17, 2003, 12:20pm

Figured I would chime in… I can’t say whether the change is correct
or
not, will look into testing this next Monday… off to Frankfurt for
the REM
concert in the meantime

The issue isn’t whether the change is correct or not - its whether the
SQL that is generated gives the same result set as previous
implementations (or a result set the same as the MySQL or PostgreSQL
implementations - which wasn’t the case).

This change to the distinct code does not change the result set
returned - but does improve the SQL generated.

This also made it easier for me to make the next change to SB-0.88 - in
that changing the order of the building of the SQL removed the syntax
problems with the SQL generated for Oracle.

The only issue I could see creeping up is the ($key IS NULL or $key =
‘’) change - as that may not be portable between other databases. This
is due to the differences between CLOB and VARCHAR2 in Oracle (this has
been detailed in a previous post of mine).

But with postgres, this statement appears to be true:

This SQL isn’t generated for PostgreSQL. Only Oracle uses the nested
SELECT DISTINCT to avoid the problem of being unable to perform a
DISTINCT on a table with a CLOB column type.

I believe that the previous code would have generated an excessive
amount of SELECT DISTINCT queries when they are unnecessary.

Here is the explain texts for these. (We have 18 queues, 1 of which is
disabled). Mind you, this probably won’t make a huge difference in the
interface… but every little bit helps I guess…

Since this improvement is present in all queries - there would be an
improvement by not needing a DISTINCT. Remember that this change would
only modify the SQL for PostgreSQL/MySQL from:

SELECT DISTINCT main.* FROM Queues main WHERE (main.Disabled = ‘0’)

to:

SELECT main.* FROM Queues main WHERE (main.Disabled = ‘0’)

which should result in very little cost saving - but a much bigger
improvement for Oracle users. Not that there are any Oracle users (or
at least not many - because support is not complete).

BTW: There are many other SQL queries that were generated using an
unnecessary DISTINCT - so there will be savings by removing this when
it isn’t necessary - especially on large datasets most useful for
installations which have large numbers of users.

Additional Oracle support is still on the way!

-Brook

Jesse_Vincent · July 18, 2003, 10:07pm

This change to the distinct code does not change the result set
returned - but does improve the SQL generated.

nod I actually ran into this about a month ago and was pretty
positive that I’d made an equivalent change. This is clearly not the
case. I’ve put your patchset through the testsuite with both mysql and
Pg and all seems in order. Tonight, I’ll upload 0.89_01 to CPAN.

The only issue I could see creeping up is the ($key IS NULL or $key =
‘’) change - as that may not be portable between other databases. This
is due to the differences between CLOB and VARCHAR2 in Oracle (this has
been detailed in a previous post of mine).

That does vaguely worry me, but I don’t actually see any cases where it
breaks anything.

Request Tracker... So much more than a help desk — Best Practical Solutions – Trouble Ticketing. Free.