Help! I enabled Full Text search and now Simple search won't look at Subjects!

I am running 4.2.11

I have install mariadb 10.0

I have enabled native mySQL full text search

It works fine - EXCEPT: it seems that I can have Full Text -or- Subject from the Simple Search box, but not both?

If I disable the full text from the RT_Site_config then the simple search box searches subjects but not attachments

If I enable the full text from the RT_Site_config then the simple search box searches attachments but not subjects

I don’t know how to proceed, this is not going to go over well…

I also have a custom field my users insisted on that will carry text — can I index and search more than one table (attachments) ?

thanks
al;

I am running 4.2.11

I have install mariadb 10.0

I have enabled native mySQL full text search

It works fine - EXCEPT: it seems that I can have Full Text -or- Subject from the Simple Search box, but not both?

If I disable the full text from the RT_Site_config then the simple search box searches subjects but not attachments

If I enable the full text from the RT_Site_config then the simple search box searches attachments but not subjects

I don’t know how to proceed, this is not going to go over well…

I also have a custom field my users insisted on that will carry text — can I index and search more than one table (attachments) ?

thanks
al;

Hi Al,

It definitely looks like a bug to me. Have you reported it? It also looks like
it will only handle a single table search so you will need to write code to add
support for your other custom field. I had something similar in my PostgreSQL
patch for older systems, but I have not looked recently.

Regards,
Ken

I also have a custom field my users insisted on that will carry text — can I index and search more than one table (attachments) ?

Current RT does not support full text searching (FTS) on CFs. I
remember conversing with one of the devs and the ACLs of the CFs was
one point of programming concern. Obviously ticket contents also have
ACLs associated with them so I don’t quite understand why he mentioned
the ACLs.

-m

I am running 4.2.11

I have install mariadb 10.0

I have enabled native mySQL full text search

It works fine - EXCEPT: it seems that I can have Full Text -or- Subject from the Simple Search box, but not both?

If I disable the full text from the RT_Site_config then the simple search box searches subjects but not attachments

If I enable the full text from the RT_Site_config then the simple search box searches attachments but not subjects

I don’t know how to proceed, this is not going to go over well…

I also have a custom field my users insisted on that will carry text — can I index and search more than one table (attachments) ?

thanks
al;

Hi Al,

It looks like you should be able to change the following function in
lib/RT/Search/Simple.pm:

$_[0] is $self

$_[1] is escaped value without surrounding single quotes

$_[2] is a boolean of “was quoted by the user?”

ensure this is false before you do smart matching like $_[1] eq “me”

$_[3] is escaped subkey, if any (see HandleCf)

sub HandleDefault {
my $fts = RT->Config->Get(‘FullTextSearch’);
if ($fts->{Enable} and $fts->{Indexed}) {
return default => “Content LIKE ‘$[1]’";
} else {
return default => "Subject LIKE '$
[1]’”;
}
}

to something like the following (untested):

sub HandleDefault {
my $fts = RT->Config->Get(‘FullTextSearch’);
if ($fts->{Enable} and $fts->{Indexed}) {
return default => “Subject LIKE ‘$[1]’ OR Content LIKE '$[1]’”;
} else {
return default => “Subject LIKE ‘$_[1]’”;
}
}

Regards,
Ken

It looks like you should be able to change the following function in
lib/RT/Search/Simple.pm: [snip]

RT 4.2.10 made the change to only search the Content field, and not
also the Subject field, intentionally; see the commit the did so [1]
for the full rationale.

In short, it is impossible to have a performant full-text search if
you search both Subject and Content. In most RT installs, the
auto-reply from the ticket creation includes the subject of the
ticket, which means it is indexed along with the ticket, so removing
the Subject clause still results in finding the appropriate ticket.

You’re welcome to revert the change locally – though I would suggest
doing so via an overlay and not by editing the file directly, or your
change will be lost when you next upgrade. If you do make the change,
I expect you’ll find the simple search to be noticeably (and perhaps
unusably) slower.

  • Alex

[1] https://github.com/bestpractical/rt/commit/8450f0a9f233d6a761ac22dbdf14926abc54d7fa

We are using sphinx (since last night) for FTS searches.

Setup is as described in this excellent guide:

We also ran in to the problem described here, and I was thinking if one
could resolve it by including the subject in the sphinx database…

Modifying the query in the guide to simply include the t.Subject like
this…

/etc/sphinxsearch/sphinx.conf

sql_query =
SELECT a.id id, t.Subject, a.content FROM Attachments a
JOIN Transactions txn ON a.TransactionId = txn.id AND txn.ObjectType =
‘RT::Ticket’
JOIN Tickets t ON txn.ObjectId = t.id
WHERE a.ContentType LIKE ‘%text%’ AND t.Status != ‘deleted’

The problem with this is, if there is a ticket that does not have an
attachment, it will not get indexed and is not searchable through FTS…

Maybe this solution could help someone, but also I’m interested if it
would be possible to use an “TicketsIndex” table or somehow extend the
"AttachmentsIndex" to include ticketId to index “tickets” rather than
"attachments"…?

Claes Merin
G�stabudstaden ABOn 2015-10-09 08:09, Alex Vandiver wrote:

On Wed, Oct 07, 2015 at 04:23:02PM -0500, ktm@rice.edu wrote:

It looks like you should be able to change the following function in
lib/RT/Search/Simple.pm: [snip]

RT 4.2.10 made the change to only search the Content field, and not
also the Subject field, intentionally; see the commit the did so [1]
for the full rationale.

In short, it is impossible to have a performant full-text search if
you search both Subject and Content. In most RT installs, the
auto-reply from the ticket creation includes the subject of the
ticket, which means it is indexed along with the ticket, so removing
the Subject clause still results in finding the appropriate ticket.

You’re welcome to revert the change locally – though I would suggest
doing so via an overlay and not by editing the file directly, or your
change will be lost when you next upgrade. If you do make the change,
I expect you’ll find the simple search to be noticeably (and perhaps
unusably) slower.

  • Alex

[1] https://github.com/bestpractical/rt/commit/8450f0a9f233d6a761ac22dbdf14926abc54d7fa

We are using sphinx (since last night) for FTS searches.

Setup is as described in this excellent guide:

http://blog.christosoft.de/2016/07/rt4-request-tracker4-fulltext-search-sphinx-debian-jessie/

Is there a reason you’re not using MySQL’s built-in search, available
since 4.2.10? It’s more performant, and has none of the caveats:
https://docs.bestpractical.com/rt/4.2/full_text_indexing.html#Caveats1

Especially since it allows for incremental indexing of new attachment
rows, I know of no reason to prefer Sphinx.

We also ran in to the problem described here, and I was thinking if one
could resolve it by including the subject in the sphinx database…

I expect that should work fine.

Modifying the query in the guide to simply include the t.Subject like
this…

/etc/sphinxsearch/sphinx.conf

sql_query =
SELECT a.id id, t.Subject, a.content FROM Attachments a
JOIN Transactions txn ON a.TransactionId = txn.id AND txn.ObjectType =
‘RT::Ticket’
JOIN Tickets t ON txn.ObjectId = t.id
WHERE a.ContentType LIKE ‘%text%’ AND t.Status != ‘deleted’

The problem with this is, if there is a ticket that does not have an
attachment, it will not get indexed and is not searchable through FTS…

The “attachments” table is used to index all content in RT. This
includes textual content, such as the bodies of messages. As such, it
is impossible for a ticket to not have at least one Attachment record.

  • Alex

I beg to differ about the attachments table, in our setup we rely on
customfields and there parameters. We have quite a few tickets that does
not have any attachments.
If you do not write anything in the message body when you create a
ticket (through web-ui or REST), there will be no attachment connected
to that ticket.

Also i did a comparison between mysql/mariadb FULLTEXT index and sphinx,
and it seems that sphinxsearch is able to extract more info than the
mysql FTS. I guess it’s about the data processing that sphinx does with
encoding and other things.

Our goal is to include CF’s in the fulltext index as well. As far as I
can see right now, there are two ways to do this.

a) Modify the RT simplesearch use an “TicketsIndex” instead of the
"AttachmentsIndex".

Use sphinx to create the TicketsIndex table and modify the collector
query to include both attachment, ticket.subject and customfields in the
the database. (Simple join statements)

b) Create “null” attachments and attach these to tickets that does not
have any current attachments, just to create and relation between ticket
and attachment. The advantage with this is that we do not have to modify
any of the RT code, but of course it adds some rows to the database.

Claes MerinOn 2016-08-20 06:34, Alex Vandiver wrote:

On Fri, 19 Aug 2016 16:35:07 +0200 Claes Merin claes@gastabud.com wrote:

We are using sphinx (since last night) for FTS searches.

Setup is as described in this excellent guide:

http://blog.christosoft.de/2016/07/rt4-request-tracker4-fulltext-search-sphinx-debian-jessie/

Is there a reason you’re not using MySQL’s built-in search, available
since 4.2.10? It’s more performant, and has none of the caveats:
https://docs.bestpractical.com/rt/4.2/full_text_indexing.html#Caveats1

Especially since it allows for incremental indexing of new attachment
rows, I know of no reason to prefer Sphinx.

We also ran in to the problem described here, and I was thinking if one
could resolve it by including the subject in the sphinx database…

I expect that should work fine.

Modifying the query in the guide to simply include the t.Subject like
this…

/etc/sphinxsearch/sphinx.conf

sql_query =
SELECT a.id id, t.Subject, a.content FROM Attachments a
JOIN Transactions txn ON a.TransactionId = txn.id AND txn.ObjectType =
‘RT::Ticket’
JOIN Tickets t ON txn.ObjectId = t.id
WHERE a.ContentType LIKE ‘%text%’ AND t.Status != ‘deleted’

The problem with this is, if there is a ticket that does not have an
attachment, it will not get indexed and is not searchable through FTS…

The “attachments” table is used to index all content in RT. This
includes textual content, such as the bodies of messages. As such, it
is impossible for a ticket to not have at least one Attachment record.

  • Alex

I beg to differ about the attachments table, in our setup we rely on
customfields and there parameters. We have quite a few tickets that does
not have any attachments.
If you do not write anything in the message body when you create a
ticket (through web-ui or REST), there will be no attachment connected
to that ticket.

Creating tickets via the web UI creates empty Attachment rows if the
user enters no content, from my testing on 4.4. I thought we’d
resolved that REST didn’t do the same, but I concur that still isn’t the
case.

Also i did a comparison between mysql/mariadb FULLTEXT index and sphinx,
and it seems that sphinxsearch is able to extract more info than the
mysql FTS. I guess it’s about the data processing that sphinx does with
encoding and other things.

I’d be curious to hear more about this. What do you mean by “extract
more info”? Do you mean better stemming, or what?

Our goal is to include CF’s in the fulltext index as well. As far as I
can see right now, there are two ways to do this.

Be aware that this can have security implications, as RT has no way to
apply ACLs if all of the different content sources are in the same
index; it becomes possible to find tickets based on custom field values
you cannot actually see.

a) Modify the RT simplesearch use an “TicketsIndex” instead of the
"AttachmentsIndex".

Use sphinx to create the TicketsIndex table and modify the collector
query to include both attachment, ticket.subject and customfields in the
the database. (Simple join statements)

b) Create “null” attachments and attach these to tickets that does not
have any current attachments, just to create and relation between ticket
and attachment. The advantage with this is that we do not have to modify
any of the RT code, but of course it adds some rows to the database.

(b) makes more sense to me. You currently can limit on transaction
date and creator, and those limits apply in conjunction with content
limits. If you effectively lump all attachment content to be
cross-referenced by ticket-id instead of attachment-id, you lose that
functionality.

  • Alex