DBIx::SearchBuilder DistinctQuery for Oracle

Jesse, Pavel,
attached is a patch against DBIx::SearchBuilder to allow for
DISTINCT queries in Oracle.

a) DISTINCT queries aren’t performed on a query against a single table. I
can’t see a reason why a select on a single table wouldn’t return a
distinct set of data - since that table on its own is distinct. If someone
can give an example that disproves this please tell me.

b) DISTINCT queries are off loaded to the individual database specific
Handles - just incase PostgreSQL or SQLServer prefer a different format.

Since Oracle doesn’t like DISINCT in a select clause containing CLOB/BLOB
types I’ve moved it into a subselect to ease the pain.

Original Query (MySQL compatible):
SELECT DISTINCT main.* FROM Tickets main, Users users WHERE
(main.Owner = users.id) ORDER BY main.id ASC

Oracle Query:
SELECT main.* FROM (
SELECT DISTINCT main.id FROM Tickets main, Users users WHERE
(main.Owner = users.id)
^^
) distinctquery, Tickets main WHERE (main.id = distinctquery.id)

This format query also allows for the wrappering by the ApplyLimits patch.

Although I believe that a more efficient version of the DISTINCT and LIMIT
subqueries would be to perform the LIMIT before the DISTINCT join, as in:

SELECT main.* FROM (
SELECT * FROM (
SELECT limitquery.*,rownum rn FROM (
SELECT DISTINCT main.id FROM Tickets main, Users
users WHERE
(main.Owner = users.id) ORDER BY main.id ASC
) limitquery WHERE rownum <= 50
) WHERE rn >= 1
) distinctquery, Tickets main WHERE (main.id = distinctquery.id)

as this would limit the result set before the join with the original table.
Until I find
(or someone else finds) a better way of structuring the LIMIT and DISTINCT
manipulations this format will have to do:

SELECT * FROM (
SELECT limitquery.,rownum rn FROM (
SELECT main.
FROM (
SELECT DISTINCT main.id FROM Tickets main, Users
users WHERE
(main.Owner = users.id) ORDER BY main.id ASC
) distinctquery, Tickets main WHERE (main.id =
distinctquery.id)
) limitquery WHERE rownum <= 50
) WHERE rn >= 1

Those running Oracle will just have to use beefy database servers :wink:

NB:- The patch also changes a section in CleanSlate() from ‘tables’ to ‘table’
which I believe is a typo!

-Brook

= /// /// /// /// _/ _/ Brook Schofield =
= _/ _/ _/ _/ _/ _/ _/ _/ _/ / B.Schofield@griffith.edu.au =
= // //
/ _/ _/ _/ _/ // Ph: +61 7 387 53779 - WCN 0.28 =
= _/ _/ _/ _/ _/ _/ _/ _/ _/ / Directory Services Integration =
= //
/ / / /// /// _/ _/ Griffith University QLD 4111 =

oracle_distinct.patch (3.78 KB)

a) DISTINCT queries aren’t performed on a query against a single table. I

can’t see a reason why a select on a single table wouldn’t return a

distinct set of data - since that table on its own is distinct.

If someone

can give an example that disproves this please tell me.

If you do not select all attributes in a table OR if the table does not have
a primary key (which is unlikely) than a DISTINCT query on a single table
does make sense.

For example (not an RT example):
select distinct city from addresses;
makes sense.

Martin

At 09:01 AM 17/03/2003 +0100, Martin Schapendonk wrote:

If you do not select all attributes in a table OR if the table does not have
a primary key (which is unlikely) than a DISTINCT query on a single table
does make sense.

For example (not an RT example):
select distinct city from addresses;
makes sense.

Martin,

I was actually asking the question in the context of DBIx::SearchBuilder
which always prefaces its query as:

SELECT DISTINCT main.* FROM addresses main;

so while the city column would be returned - so would the primary key ‘id’
which I believe is a requirement for SearchBuilder. So the result set
returned would still be distinct. So in this case I would see that:

SELECT main.* FROM addresses main;

would have the same effect - and result in not needing a DISTINCT (which is
what Oracle has a problem with if you use CLOB datatypes). The problem only
effects the RT database tables: Attachments, Scrips, Users, Templates,
sessions; but you don’t want to build RT specific functionality into
SearchBuilder - hence the DISTINCT only when it is necessary and having it
in a subquery otherwise.

My DBAs have put me on another track with the distinct issue so I am going
to look at that to see if it improves the efficiency/readability/quality of
the code.

-Brook

= /// /// /// /// _/ _/ Brook Schofield =
= _/ _/ _/ _/ _/ _/ _/ _/ _/ / B.Schofield@griffith.edu.au =
= // //
/ _/ _/ _/ _/ // Ph: +61 7 387 53779 - WCN 0.28 =
= _/ _/ _/ _/ _/ _/ _/ _/ _/ / Directory Services Integration =
= //
/ / / /// /// _/ _/ Griffith University QLD 4111 =

Hi Brook,

I think you are on the right way. I have also checked all the database
schemas for rt2 and rt3 and every table has an ID column, which is PRIMARY
KEY. And we all know, the PRIMARY KEY is always unique, so searching it
DISTINCT does not make sense. So we never need to put DISTINCT in a simple
query with one alias, where the primary key (id) column is also searched
(like main.* or main.id).
I have not found any query with one alias, where we need the DISTINCT. I am
only afraid of the LEFT JOINS, are they extending the query with new
aliases, or not? I do not understand all the SearchBuilder internals well,
so I can not say if there may be any risk.

I like your patch, I will test it and give you a note. (I think, that using
this way of modifications we can also support left joins on 8.1.x Oracle DB.
This will be also great.)

Pavel-----Original Message-----
From: Brook Schofield [mailto:B.Schofield@mailbox.gu.edu.au]
Sent: Sunday, March 16, 2003 11:15 PM
To: rt-devel@fsck.com
Cc: pavel.behal@hp.com; jesse@bestpractical.com
Subject: DBIx::SearchBuilder DistinctQuery for Oracle

Jesse, Pavel,
attached is a patch against DBIx::SearchBuilder to allow for
DISTINCT queries in Oracle.

a) DISTINCT queries aren’t performed on a query against a single table. I
can’t see a reason why a select on a single table wouldn’t return a
distinct set of data - since that table on its own is distinct. If someone
can give an example that disproves this please tell me.

b) DISTINCT queries are off loaded to the individual database specific
Handles - just incase PostgreSQL or SQLServer prefer a different format.

Since Oracle doesn’t like DISINCT in a select clause containing CLOB/BLOB
types I’ve moved it into a subselect to ease the pain.

Original Query (MySQL compatible):
SELECT DISTINCT main.* FROM Tickets main, Users users WHERE
(main.Owner = users.id) ORDER BY main.id ASC

Oracle Query:
SELECT main.* FROM (
SELECT DISTINCT main.id FROM Tickets main, Users users WHERE
(main.Owner = users.id)
^^
) distinctquery, Tickets main WHERE (main.id = distinctquery.id)

This format query also allows for the wrappering by the ApplyLimits patch.

Although I believe that a more efficient version of the DISTINCT and LIMIT
subqueries would be to perform the LIMIT before the DISTINCT join, as in:

SELECT main.* FROM (
SELECT * FROM (
SELECT limitquery.*,rownum rn FROM (
SELECT DISTINCT main.id FROM Tickets main, Users
users WHERE
(main.Owner = users.id) ORDER BY main.id ASC
) limitquery WHERE rownum <= 50
) WHERE rn >= 1
) distinctquery, Tickets main WHERE (main.id = distinctquery.id)

as this would limit the result set before the join with the original table.
Until I find
(or someone else finds) a better way of structuring the LIMIT and DISTINCT
manipulations this format will have to do:

SELECT * FROM (
SELECT limitquery.,rownum rn FROM (
SELECT main.
FROM (
SELECT DISTINCT main.id FROM Tickets main, Users
users WHERE
(main.Owner = users.id) ORDER BY main.id ASC
) distinctquery, Tickets main WHERE (main.id =
distinctquery.id)
) limitquery WHERE rownum <= 50
) WHERE rn >= 1

Those running Oracle will just have to use beefy database servers :wink:

NB:- The patch also changes a section in CleanSlate() from ‘tables’ to
‘table’
which I believe is a typo!

-Brook

= /// /// /// /// _/ _/ Brook Schofield =
= _/ _/ _/ _/ _/ _/ _/ _/ _/ / B.Schofield@griffith.edu.au =
= // //
/ _/ _/ _/ _/ // Ph: +61 7 387 53779 - WCN 0.28 =
= _/ _/ _/ _/ _/ _/ _/ _/ _/ / Directory Services Integration =
= //
/ / / /// /// _/ _/ Griffith University QLD 4111 =

Martin,

I was actually asking the question in the context of DBIx::SearchBuilder

which always prefaces its query as:

SELECT DISTINCT main.* FROM addresses main;

In that case I apologize for my useless remark, since I am not quite
familiar with the SearchBuilder API.

Regards,

Martin

Brook, regarding your last notice, where you change section in CleanSlate()
from ‘tables’ to ‘table’, it is not a typo. After you modify it, it stops
working. Please, test it and remove it from your patch.

Pavel-----Original Message-----
From: BEHAL,PAVEL (HP-Czechia,ex1) [mailto:pavel.behal@hp.com]
Sent: Tuesday, March 18, 2003 1:12 PM
To: ‘Brook Schofield’; rt-devel@fsck.com
Cc: jesse@bestpractical.com
Subject: [rt-devel] RE: DBIx::SearchBuilder DistinctQuery for Oracle

Hi Brook,

I think you are on the right way. I have also checked all the database
schemas for rt2 and rt3 and every table has an ID column, which is PRIMARY
KEY. And we all know, the PRIMARY KEY is always unique, so searching it
DISTINCT does not make sense. So we never need to put DISTINCT in a simple
query with one alias, where the primary key (id) column is also searched
(like main.* or main.id).
I have not found any query with one alias, where we need the DISTINCT. I am
only afraid of the LEFT JOINS, are they extending the query with new
aliases, or not? I do not understand all the SearchBuilder internals well,
so I can not say if there may be any risk.

I like your patch, I will test it and give you a note. (I think, that using
this way of modifications we can also support left joins on 8.1.x Oracle DB.
This will be also great.)

Pavel

-----Original Message-----
From: Brook Schofield [mailto:B.Schofield@mailbox.gu.edu.au]
Sent: Sunday, March 16, 2003 11:15 PM
To: rt-devel@fsck.com
Cc: pavel.behal@hp.com; jesse@bestpractical.com
Subject: DBIx::SearchBuilder DistinctQuery for Oracle

Jesse, Pavel,
attached is a patch against DBIx::SearchBuilder to allow for
DISTINCT queries in Oracle.

a) DISTINCT queries aren’t performed on a query against a single table. I
can’t see a reason why a select on a single table wouldn’t return a
distinct set of data - since that table on its own is distinct. If someone
can give an example that disproves this please tell me.

b) DISTINCT queries are off loaded to the individual database specific
Handles - just incase PostgreSQL or SQLServer prefer a different format.

Since Oracle doesn’t like DISINCT in a select clause containing CLOB/BLOB
types I’ve moved it into a subselect to ease the pain.

Original Query (MySQL compatible):
SELECT DISTINCT main.* FROM Tickets main, Users users WHERE
(main.Owner = users.id) ORDER BY main.id ASC

Oracle Query:
SELECT main.* FROM (
SELECT DISTINCT main.id FROM Tickets main, Users users WHERE
(main.Owner = users.id)
^^
) distinctquery, Tickets main WHERE (main.id = distinctquery.id)

This format query also allows for the wrappering by the ApplyLimits patch.

Although I believe that a more efficient version of the DISTINCT and LIMIT
subqueries would be to perform the LIMIT before the DISTINCT join, as in:

SELECT main.* FROM (
SELECT * FROM (
SELECT limitquery.*,rownum rn FROM (
SELECT DISTINCT main.id FROM Tickets main, Users
users WHERE
(main.Owner = users.id) ORDER BY main.id ASC
) limitquery WHERE rownum <= 50
) WHERE rn >= 1
) distinctquery, Tickets main WHERE (main.id = distinctquery.id)

as this would limit the result set before the join with the original table.
Until I find
(or someone else finds) a better way of structuring the LIMIT and DISTINCT
manipulations this format will have to do:

SELECT * FROM (
SELECT limitquery.,rownum rn FROM (
SELECT main.
FROM (
SELECT DISTINCT main.id FROM Tickets main, Users
users WHERE
(main.Owner = users.id) ORDER BY main.id ASC
) distinctquery, Tickets main WHERE (main.id =
distinctquery.id)
) limitquery WHERE rownum <= 50
) WHERE rn >= 1

Those running Oracle will just have to use beefy database servers :wink:

NB:- The patch also changes a section in CleanSlate() from ‘tables’ to
‘table’
which I believe is a typo!

-Brook

= /// /// /// /// _/ _/ Brook Schofield =
= _/ _/ _/ _/ _/ _/ _/ _/ _/ / B.Schofield@griffith.edu.au =
= // //
/ _/ _/ _/ _/ // Ph: +61 7 387 53779 - WCN 0.28 =
= _/ _/ _/ _/ _/ _/ _/ _/ _/ / Directory Services Integration =
= //
/ / / /// /// _/ _/ Griffith University QLD 4111 =
rt-devel mailing list
rt-devel@lists.fsck.com
http://lists.fsck.com/mailman/listinfo/rt-devel

Hi Brook,

bad news, I have found one problem after your patch. Look at this query:

SELECT main.* FROM ( SELECT DISTINCT main.id FROM Transactions main, Tickets
Tickets_1 WHERE ((Tickets_1.EffectiveId = ‘1’)) AND main.Ticket =
Tickets_1.id ORDER BY main.Created ASC ) distinctquery, Transactions main
WHERE (main.id = distinctquery.id)

It looks fine, but Oracle screams “Error:ORA-01791: not a SELECTed
expression”, because there is an ORDER BY clause on “main.Created” column,
but this column is not mentioned in select statement.
This is working version of this query:

SELECT main.* FROM ( SELECT DISTINCT main.id,main.Created FROM Transactions
main, Tickets Tickets_1 WHERE ((Tickets_1.EffectiveId = ‘1’)) AND
main.Ticket = Tickets_1.id ORDER BY main.Created ASC ) distinctquery,
Transactions main WHERE (main.id = distinctquery.id)

You see? This query is generated when you would like to see the whole
details of a ticket, it selects the whole history of comments.

Regards,

Pavel-----Original Message-----
From: BEHAL,PAVEL (HP-Czechia,ex1) [mailto:pavel.behal@hp.com]
Sent: Tuesday, March 18, 2003 1:12 PM
To: ‘Brook Schofield’; rt-devel@fsck.com
Cc: jesse@bestpractical.com
Subject: [rt-devel] RE: DBIx::SearchBuilder DistinctQuery for Oracle

Hi Brook,

I think you are on the right way. I have also checked all the database
schemas for rt2 and rt3 and every table has an ID column, which is PRIMARY
KEY. And we all know, the PRIMARY KEY is always unique, so searching it
DISTINCT does not make sense. So we never need to put DISTINCT in a simple
query with one alias, where the primary key (id) column is also searched
(like main.* or main.id).
I have not found any query with one alias, where we need the DISTINCT. I am
only afraid of the LEFT JOINS, are they extending the query with new
aliases, or not? I do not understand all the SearchBuilder internals well,
so I can not say if there may be any risk.

I like your patch, I will test it and give you a note. (I think, that using
this way of modifications we can also support left joins on 8.1.x Oracle DB.
This will be also great.)

Pavel

-----Original Message-----
From: Brook Schofield [mailto:B.Schofield@mailbox.gu.edu.au]
Sent: Sunday, March 16, 2003 11:15 PM
To: rt-devel@fsck.com
Cc: pavel.behal@hp.com; jesse@bestpractical.com
Subject: DBIx::SearchBuilder DistinctQuery for Oracle

Jesse, Pavel,
attached is a patch against DBIx::SearchBuilder to allow for
DISTINCT queries in Oracle.

a) DISTINCT queries aren’t performed on a query against a single table. I
can’t see a reason why a select on a single table wouldn’t return a
distinct set of data - since that table on its own is distinct. If someone
can give an example that disproves this please tell me.

b) DISTINCT queries are off loaded to the individual database specific
Handles - just incase PostgreSQL or SQLServer prefer a different format.

Since Oracle doesn’t like DISINCT in a select clause containing CLOB/BLOB
types I’ve moved it into a subselect to ease the pain.

Original Query (MySQL compatible):
SELECT DISTINCT main.* FROM Tickets main, Users users WHERE
(main.Owner = users.id) ORDER BY main.id ASC

Oracle Query:
SELECT main.* FROM (
SELECT DISTINCT main.id FROM Tickets main, Users users WHERE
(main.Owner = users.id)
^^
) distinctquery, Tickets main WHERE (main.id = distinctquery.id)

This format query also allows for the wrappering by the ApplyLimits patch.

Although I believe that a more efficient version of the DISTINCT and LIMIT
subqueries would be to perform the LIMIT before the DISTINCT join, as in:

SELECT main.* FROM (
SELECT * FROM (
SELECT limitquery.*,rownum rn FROM (
SELECT DISTINCT main.id FROM Tickets main, Users
users WHERE
(main.Owner = users.id) ORDER BY main.id ASC
) limitquery WHERE rownum <= 50
) WHERE rn >= 1
) distinctquery, Tickets main WHERE (main.id = distinctquery.id)

as this would limit the result set before the join with the original table.
Until I find
(or someone else finds) a better way of structuring the LIMIT and DISTINCT
manipulations this format will have to do:

SELECT * FROM (
SELECT limitquery.,rownum rn FROM (
SELECT main.
FROM (
SELECT DISTINCT main.id FROM Tickets main, Users
users WHERE
(main.Owner = users.id) ORDER BY main.id ASC
) distinctquery, Tickets main WHERE (main.id =
distinctquery.id)
) limitquery WHERE rownum <= 50
) WHERE rn >= 1

Those running Oracle will just have to use beefy database servers :wink:

NB:- The patch also changes a section in CleanSlate() from ‘tables’ to
‘table’
which I believe is a typo!

-Brook

= /// /// /// /// _/ _/ Brook Schofield =
= _/ _/ _/ _/ _/ _/ _/ _/ _/ / B.Schofield@griffith.edu.au =
= // //
/ _/ _/ _/ _/ // Ph: +61 7 387 53779 - WCN 0.28 =
= _/ _/ _/ _/ _/ _/ _/ _/ _/ / Directory Services Integration =
= //
/ / / /// /// _/ _/ Griffith University QLD 4111 =
rt-devel mailing list
rt-devel@lists.fsck.com
http://lists.fsck.com/mailman/listinfo/rt-devel

At 02:54 PM 18/03/2003 +0100, you wrote:

Brook, regarding your last notice, where you change section in CleanSlate()
from ‘tables’ to ‘table’, it is not a typo. After you modify it, it stops
working. Please, test it and remove it from your patch.

I’m really wondering what ‘tables’ is for now. I can’t seem to find it
anywhere in the code?
Consider it stricken from the patch!

Jesse: does that line $self->{‘tables’} = “”; need to be there at all?
Answer at your leisure.

-Brook

= /// /// /// /// _/ _/ Brook Schofield =
= _/ _/ _/ _/ _/ _/ _/ _/ _/ / B.Schofield@griffith.edu.au =
= // //
/ _/ _/ _/ _/ // Ph: +61 7 387 53779 - WCN 0.28 =
= _/ _/ _/ _/ _/ _/ _/ _/ _/ / Directory Services Integration =
= //
/ / / /// /// _/ _/ Griffith University QLD 4111 =

At 03:21 PM 18/03/2003 +0100, BEHAL,PAVEL (HP-Czechia,ex1) wrote:

bad news, I have found one problem after your patch. Look at this query:

New patch attached. This is still against DBIx::SearchBuilder-0.80 and
moves the _OrderClause to after the DISTINCT modification so that the
following SQL can be generated:

SELECT main.* FROM (
SELECT DISTINCT main.id FROM Transactions main, Tickets Tickets_1
WHERE ((Tickets_1.EffectiveId = ‘1’)) AND main.Ticket =
Tickets_1.id
) distinctquery, Transactions main
WHERE (main.id = distinctquery.id) ORDER BY main.Created ASC

this will stop the errors that Pavel reported earlier.

-Brook

SELECT main.* FROM ( SELECT DISTINCT main.id FROM Transactions main, Tickets
Tickets_1 WHERE ((Tickets_1.EffectiveId = ‘1’)) AND main.Ticket =
Tickets_1.id ORDER BY main.Created ASC ) distinctquery, Transactions main
WHERE (main.id = distinctquery.id)

It looks fine, but Oracle screams “Error:ORA-01791: not a SELECTed
expression”, because there is an ORDER BY clause on “main.Created” column,
but this column is not mentioned in select statement.
This is working version of this query:

SELECT main.* FROM ( SELECT DISTINCT main.id,main.Created FROM Transactions
main, Tickets Tickets_1 WHERE ((Tickets_1.EffectiveId = ‘1’)) AND
main.Ticket = Tickets_1.id ORDER BY main.Created ASC ) distinctquery,
Transactions main WHERE (main.id = distinctquery.id)

You see? This query is generated when you would like to see the whole
details of a ticket, it selects the whole history of comments.

Regards,

Pavel

= /// /// /// /// _/ _/ Brook Schofield =
= _/ _/ _/ _/ _/ _/ _/ _/ _/ / B.Schofield@griffith.edu.au =
= // //
/ _/ _/ _/ _/ // Ph: +61 7 387 53779 - WCN 0.28 =
= _/ _/ _/ _/ _/ _/ _/ _/ _/ / Directory Services Integration =
= //
/ / / /// /// _/ _/ Griffith University QLD 4111 =

oracle_distinct_v2.patch (3.62 KB)