Ticket content not displaying via RT's web 'Search' but shows up with sphinx's 'search' binary

After successfully setting up RT with sphinx I’m trying to search for a
string
in a ticket’s ‘content’. When I enter the string into ‘Search’ box of RT it
returns no result even when I’m using fulltext: format of
search.
When the same is searched with Sphinx’s ‘search’ binary, the results are
coming
up fine. Is RT not communicating with Sphinx over here?

A section of my RT_SiteConfig.pm:

Set( %FullTextSearch,
Enable => 1,
Indexed => 1,
Table => ‘AttachmentsIndex’,
MaxMatches => ‘10000’,
);

mysql> show create table AttachmentsIndex \G

Table: AttachmentsIndex
Create Table: CREATE TABLE AttachmentsIndex (
id int(10) unsigned NOT NULL,
weight int(11) NOT NULL,
query varchar(3072) NOT NULL,
KEY query (query(1024))
) ENGINE=SPHINX DEFAULT CHARSET=utf8
CONNECTION=‘sphinx://localhost:3312/rt,rt_delta’

NB:

  1. I have stopped,started Apache after all the sphinx integration was done.
    Also did it again when search resultswere not returned from
    web-interface search.

  2. ‘searchd’ is listening at 127.0.0.1:3312

This issue now seems to be with only ‘updated’ and ‘newly’ tickets. The
custom
string which I create by replying to tickets are not being returned via RT’s
web search. But are being returned when searched from command line using
sphinx’s ‘search’ binary.

Thanks,

This issue now seems to be with only ‘updated’ and ‘newly’ tickets. The
custom
string which I create by replying to tickets are not being returned via RT’s
web search. But are being returned when searched from command line using
sphinx’s ‘search’ binary.

Turn on RT’s SQL statement logging via the StatementLog option:
RT Config - RT 4.0.25 Documentation - Best Practical and
then either look at the log output or use Configuration → Tools → SQL
queries (easier) to find the fulltext query. You’ll see it queries the
AttachmentsIndex table. You can run by hand the query RT is running to
start seeing where the problem lies.

I’ve SQL query logging onto a file rt.log. While running searches I’m
suffixing “grep AttachmentsIndex”
since there is no much of other log data.

I’ve noticed, for every “successful” string search(a search that returns
expected matching results)
from RT webUI there are 2 sets of SQL queries performed on
‘AttachmentsIndex’. One begins with
‘SELECT COUNT(DISTINCT main.id)…’ and other ‘SELECT DISTINCT
main.*…’. Please see the output below
from the logs for the search query ‘fulltext:uniqops2reply1’ where
‘uniqops2reply1’//was the only
content of a particular ticket that was indexed.
/
//SELECT COUNT(DISTINCT main.id) FROM Tickets main JOIN Transactions
Transactions_1 ON
( Transactions_1.ObjectType = ‘RT::Ticket’ ) AND (
Transactions_1.ObjectId = main.id )
JOIN Attachments Attachments_2 ON ( Attachments_2.TransactionId =
Transactions_1.id )
JOIN AttachmentsIndex AttachmentsIndex_3 ON ( AttachmentsIndex_3.id =
Attachments_2.id )
WHERE (main.Status != ‘deleted’) AND ( ( ( AttachmentsIndex_3.query =
‘uniqops2reply1;limit=10000;maxmatches=10000’ ) )
AND ( main.Status = ‘new’ OR main.Status = ‘open’ OR main.Status =
‘stalled’ ) ) AND (main.Type = ‘ticket’)
AND (main.EffectiveId = main.id) ;
(/opt/rt4/sbin/…/lib/RT/Interface/Web.pm:1183)/
/SELECT DISTINCT main. FROM Tickets main JOIN Transactions
Transactions_1 ON
( Transactions_1.ObjectType = ‘RT::Ticket’ ) AND (
Transactions_1.ObjectId = main.id )
JOIN Attachments Attachments_2 ON ( Attachments_2.TransactionId =
Transactions_1.id )
JOIN AttachmentsIndex AttachmentsIndex_3 ON ( AttachmentsIndex_3.id =
Attachments_2.id )
WHERE (main.Status != ‘deleted’) AND ( ( ( AttachmentsIndex_3.query =
‘uniqops2reply1;limit=10000;maxmatches=10000’ ) )
AND ( main.Status = ‘new’ OR main.Status = ‘open’ OR main.Status =
‘stalled’ ) )
AND (main.Type = ‘ticket’) AND (main.EffectiveId = main.id) ORDER BY
main.id ASC
LIMIT 50; (/opt/rt4/sbin/…/lib/RT/Interface/Web.pm:1183)//
/
Next, I made a 2nd reply to the same ticket with the content as
‘uniqops2reply2’. I ran ‘indexer’ with
–rotate and was successful. For clarity, I issued CLI search of sphinx
and it exactly matched
the string ‘uniqops2reply2’ in the same ticket. But on querying
‘fulltext:uniqops2reply2’ on
web-UI it failed with 0 results! I checked the logs to see that only 1
out of the 2 SQL queries
were executed i.e. only the below was seen for the failed web-UI search:

/SELECT COUNT(DISTINCT main.id) FROM Tickets main JOIN Transactions
Transactions_1 ON
( Transactions_1.ObjectType = ‘RT::Ticket’ ) AND (
Transactions_1.ObjectId = main.id )
JOIN Attachments Attachments_2 ON ( Attachments_2.TransactionId =
Transactions_1.id )
JOIN AttachmentsIndex AttachmentsIndex_3 ON ( AttachmentsIndex_3.id =
Attachments_2.id )
WHERE (main.Status != ‘deleted’) AND ( ( ( AttachmentsIndex_3.query =
‘uniqops2reply2;limit=10000;maxmatches=10000’ ) )
AND ( main.Status = ‘new’ OR main.Status = ‘open’ OR main.Status =
‘stalled’ ) ) AND (main.Type = ‘ticket’)
AND (main.EffectiveId = main.id) ;
(/opt/rt4/sbin/…/lib/RT/Interface/Web.pm:1183)

/The /‘SELECT DISTINCT…’ /query//was missing. I re-ran ‘indexer’ again
but the same story repeated.
I executed this missing query by hand on the MySQL server and it
returned the matching result!

Testing further, another 3rd reply was made to the same ticket with
content ‘uniqops2reply3’.
I *did not *run ‘indexer’. Then I ran the previous failed query
‘fulltext:uniqops2reply2’.
Surprisingly it returned the results! Also on checking rt.log both the
SQL queries
/‘SELECT COUNT…’ /and ‘/SELECT DISTINCT…’ /were executed!

I ran ‘indexer’ to index the 3rd reply. Same story;
‘fulltext:uniqops2reply3’ failed with 0
results and with only /‘SELECT COUNT…’/ being executed. I created a
new ticket instead
of replying to the same ticket. I issued the previously failed
‘fulltext:uniqops2reply3’ again,
but this time it returned the expected result with both SQL queries
executed.

So it seems that after an update is made and is indexed, it requires
another update for the previously
updated content to be available from RT’s webUI search OR putting it in
more specific terms: it requires
another update so that ‘/SELECT DISTINCT…’ /is also executed on
AttachmentsIndex while searching
for previously indexed content. What could possibly be the problem here?

Thanks,

So it seems that after an update is made and is indexed, it requires
another update for the previously updated content to be available
from RT’s webUI search OR putting it in more specific terms: it
requires another update so that ‘/SELECT DISTINCT…’ /is also
executed on AttachmentsIndex while searching for previously indexed
content. What could possibly be the problem here?

The first query is selecting the number of rows that would match; when
RT sees that MySQL reports 0 rows, it doesn’t bother to run the query
which fetches the actual ticket data. It’s not a bug that the second
query isn’t run when the first returns 0. The question is why does the
first query return 0, and I expect that’s more MySQL and Sphinx related
than RT related.

It’s not clear what you’re reindexing when you run ‘indexer’. Are you
reindexing the main index or the delta index or both?

Do you have this problem if you don’t use a delta index (i.e. is this
caused by the main+delta setup you’re using for Sphinx)?

… the second query isn’t run when the first returns 0.
This hint helped narrow down things. I figured out the pattern/situation
under which search failure occurs, which is:

“If I create/update a ticket and immediately perform fulltext: search
of the same content before reindexing is run(on delta), it will fail to
show up on fulltext searches even after the content is later
reindexed. In this situation if I run a ‘search’ from CLI(on delta index
specifically) it returns successful hits. Running SELECT COUNT by hand
also returns >0 integer at this point. If I do not search for newly created
ticket-content before its reindexed, this won’t occur. Things are normal
in that case.”

Yes, it is bit awkward but that is the pattern I observe after 8-9 hit and
trials ! I’ll try clarify things in steps for you:

  1. Created a new ticket. Subject - uniqsupport1 , Content - uniqsupport1tic.
    So I’ll be searching for this 1 word content from web end.

  2. Searched “fulltext:uniqsupport1tic” from RT’s web end. Yes obviously it
    didn’t return a result.

  3. Executed SELECT COUNT query that I got from the logs for this search
    and it returned 0. All as expected.

  4. Indexed delta index (rt_delta) with --rotate. From the command line,
    I did “search -i rt_delta uniqsupport1tic”. Got a matching result ticket
    as expected. I executed the same SELECT COUNT query from Step 2
    by hand and it returned 1.

  5. I issued “fulltext:uniqsupport1tic” for 2nd time from RT’s web end.
    Surprisingly, 0 matching result. Observing rt.log I find that only the
    SELECT COUNT was being executed in response to this 2nd time fulltext
    search. This was happening when the same SELECT COUNT returned
    1 for the search of the same ticket content ‘uniqsupport1tic’.

After Step 5, I reindexed delta again but proved no use. In this
situation I created a “new” ticket with some random content. I ran the
previous failed search again (fulltext:uniqsupport1tic) for 3rd time
and this time, it returned the exact matching result. On checking the
logs, the 2 SQL queries were also executed consecutively. This also
worked if I replied to ANY existing ticket instead of creating a new one!

The interesting fact here is that had I NOT done ‘Step 2’ after updating
or creating a ticket, this whole issue don’t exist and I get exact matching
results from fulltext search “without creating or updating a ticket”.
Yes. I redid Steps 1-5 with a new ticket “skipping only Step 2” and in
the 5th step I got the exact matching result.

Thanks,

… the second query isn’t run when the first returns 0.
This hint helped narrow down things. I figured out the pattern/situation
under which search failure occurs, which is:

“If I create/update a ticket and immediately perform fulltext: search
of the same content before reindexing is run(on delta), it will fail to
show up on fulltext searches even after the content is later
reindexed. In this situation if I run a ‘search’ from CLI(on delta index
specifically) it returns successful hits. Running SELECT COUNT by hand
also returns >0 integer at this point. If I do not search for newly created
ticket-content before its reindexed, this won’t occur. Things are normal
in that case.”

Thanks for your detailed investigation and description (which I’ve
trimmed from below). It sounds very much like you’re hitting a cache on
the SELECT COUNT query that isn’t expired until you create a new ticket
(or presumably after some longer amount of time). RT doesn’t cache
database queries this way, so I suspect it’s MySQL’s standard query
cache or, perhaps more likely, some cache in the Sphinx engine for MySQL
(not the sphinxd server itself). Modifying one of the tables involved
in the query, as you do by creating a new ticket or replying to an
existing ticket, then triggers the cache expiry.

If you’re interested in further tracking this down, I’d investigate
caching at the MySQL and Sphinx level. However, it may be moot in a
production environment where the caching is beneficial and there’s
enough database activity to keep it from being noticeably stale.

… the second query isn’t run when the first returns 0.
This hint helped narrow down things. I figured out the pattern/situation
under which search failure occurs, which is:

“If I create/update a ticket and immediately perform fulltext: search
of the same content before reindexing is run(on delta), it will fail to
show up on fulltext searches even after the content is later
reindexed. In this situation if I run a ‘search’ from CLI(on delta index
specifically) it returns successful hits. Running SELECT COUNT by hand
also returns >0 integer at this point. If I do not search for newly created
ticket-content before its reindexed, this won’t occur. Things are normal
in that case.”
Thanks for your detailed investigation and description (which I’ve
trimmed from below). It sounds very much like you’re hitting a cache on
the SELECT COUNT query that isn’t expired until you create a new ticket
(or presumably after some longer amount of time). RT doesn’t cache
database queries this way, so I suspect it’s MySQL’s standard query
cache or, perhaps more likely, some cache in the Sphinx engine for MySQL
(not the sphinxd server itself). Modifying one of the tables involved
in the query, as you do by creating a new ticket or replying to an
existing ticket, then triggers the cache expiry.

If you’re interested in further tracking this down, I’d investigate
caching at the MySQL and Sphinx level. However, it may be moot in a
production environment where the caching is beneficial and there’s
enough database activity to keep it from being noticeably stale.

Spot on! It was the mysql query cache that was responsible.
I disabled it and search just works fine without need to update
tables. All along I thought RT did some caching in this situation.

Well I guess this query caching can be left ON anyway since on
a busy production environment the likeliness of the cache to remain
state is very less.

Thank you for your effort and continued support.

Thanks for posting so much detail on your problem.

I thought that this might be the answer for me as I am experiencing exactly
the same issue!

I’m running rt 4.2.3 with MariaDB and Sphinx.

Searching ticket content from bash using the “search” command returns plenty
of results but the web search function returns 0!

I’ve enabled the statement logging as suggested and disable query caching to
see if this might make a difference but to no avail.

I’m struggling to work out what my next step would be and if it wasn’t so
hard I’d be tempted to migrate over to PostgreSQL (that seems even more hard
work).

Do you guys have any suggestions?

View this message in context: http://requesttracker.8502.n7.nabble.com/ticket-content-not-displaying-via-RT-s-web-Search-but-shows-up-with-sphinx-s-search-binary-tp52602p57352.html

Thanks for posting so much detail on your problem.

I thought that this might be the answer for me as I am experiencing exactly
the same issue!

I’m running rt 4.2.3 with MariaDB and Sphinx.

Searching ticket content from bash using the “search” command returns plenty
of results but the web search function returns 0!

I’ve enabled the statement logging as suggested and disable query caching to
see if this might make a difference but to no avail.

I expect you’re running into one of the situations where MySQL chooses
to not use the magic index that causes it to talk to the Sphinx indexer,
and instead looks at the contents of the AttachmentsIndex table
row-by-row – which is empty. You can determine if this is the case by
first determining the SQL your query produces; first, go to the
“Advanced” tab in RT, and copy the TicketSQL you’re running.

Then, from the command line, feed it to the following snippet to show
what RT is generating for the database; adjust the last line to match
your TicketSQL query:

perl -I/opt/rt4/lib -MRT=-init -le ‘$t=RT::Tickets->new(RT->SystemUser);
$t->FromSQL(“@ARGV”); print $t->BuildSelectQuery;’
“Content LIKE ‘moose’”

Take the SQL that generates, and feed it to MySQL, prefixing with EXPLAIN:

mysql> EXPLAIN SELECT DISTINCT main.*
FROM Tickets main
JOIN Transactions Transactions_1
ON ( Transactions_1.ObjectType = ‘RT::Ticket’ )
AND ( Transactions_1.ObjectId = main.id )
JOIN Attachments Attachments_2
ON ( Attachments_2.TransactionId = Transactions_1.id )
JOIN AttachmentsIndex AttachmentsIndex_3
ON ( AttachmentsIndex_3.id = Attachments_2.id )
WHERE (main.IsMerged IS NULL)
AND (main.Status != ‘deleted’)
AND (main.Type = ‘ticket’)
AND ( ( AttachmentsIndex_3.query
= ‘moose;limit=10000;maxmatches=10000’ ) );

This will return something like (omitting most columns for brevity):

| table              | key     |
| AttachmentsIndex_3 | query   |
| Attachments_2      | PRIMARY |
| Transactions_1     | PRIMARY |
| main               | NULL    |

If the AttachmentsIndex table does not list “query” as the chosen key,
then MySQL’s query planner has decided to not use the magic Sphinx index.

I’m struggling to work out what my next step would be and if it wasn’t so
hard I’d be tempted to migrate over to PostgreSQL (that seems even more hard
work).

Do you guys have any suggestions?

If you were on plain MySQL, and not MariaDB, I’d suggest the new
4.2/mysql-native-fts branch that is currently in testing, as described
at Carbon60: Managed Cloud Services I do not know how
it will work in the context of MariaDB, however.

  • Alex

Thanks for the quick response!

Glad to have some assistance on this as I certainly don’t have the advanced
skills required to debug this stuff.

Followed your instructions and got the following result:

http://requesttracker.8502.n7.nabble.com/file/n57365/rt_query.png

I guess this means that it is using the magic sphinx index? Any clues as to
where to look next?

View this message in context: http://requesttracker.8502.n7.nabble.com/ticket-content-not-displaying-via-RT-s-web-Search-but-shows-up-with-sphinx-s-search-binary-tp52602p57365.html

Thanks for the quick response!

Glad to have some assistance on this as I certainly don’t have the advanced
skills required to debug this stuff.

Followed your instructions and got the following result:

http://requesttracker.8502.n7.nabble.com/file/n57365/rt_query.png

I guess this means that it is using the magic sphinx index? Any clues as to
where to look next?

Please show both the TicketSQL and the SQL that produced that.

  • Alex

Morning. Should have provided more detail for you.

Here is the ticket SQL for my search:

“( Content LIKE ‘sace’ ) AND ( Status = ‘new’ OR Status = ‘open’ OR Status =
‘stalled’ )”

I then piped this into the perl command as per your instructions:

perl -I/opt/rt4/lib -MRT=-init -le ‘$t=RT::Tickets->new(RT->SystemUser);
$t->FromSQL(“@ARGV”); print $t->BuildSelectQuery;’ \ “Content LIKE ‘sace’
AND ( Status = ‘new’ OR Status = ‘open’ OR Status = ‘stalled’ )”

Which gave me the following SQL query:

SELECT DISTINCT main.* FROM Tickets main JOIN Transactions Transactions_1
ON ( Transactions_1.ObjectType = ‘RT::Ticket’ ) AND (
Transactions_1.ObjectId = main.id ) JOIN Attachments Attachments_2 ON (
Attachments_2.TransactionId = Transactions_1.id ) JOIN AttachmentsIndex
AttachmentsIndex_3 ON ( AttachmentsIndex_3.id = Attachments_2.id ) WHERE
(main.IsMerged IS NULL) AND (main.Status != ‘deleted’) AND (main.Type =
‘ticket’) AND ( ( AttachmentsIndex_3.query =
‘sace;limit=10000;maxmatches=10000’ ) AND ( main.Status = ‘new’ OR
main.Status = ‘open’ OR main.Status = ‘stalled’ ) )

I put this query into MySQL with “explain” on the front as such:

Mariadb [(rt4)]> explain SELECT DISTINCT main.* FROM Tickets main JOIN
Transactions Transactions_1 ON ( Transactions_1.ObjectType = ‘RT::Ticket’ )
AND ( Transactions_1.ObjectId = main.id ) JOIN Attachments Attachments_2 ON
( Attachments_2.TransactionId = Transactions_1.id ) JOIN AttachmentsIndex
AttachmentsIndex_3 ON ( AttachmentsIndex_3.id = Attachments_2.id ) WHERE
(main.IsMerged IS NULL) AND (main.Status != ‘deleted’) AND (main.Type =
‘ticket’) AND ( ( AttachmentsIndex_3.query =
‘sace;limit=10000;maxmatches=10000’ ) AND ( main.Status = ‘new’ OR
main.Status = ‘open’ OR main.Status = ‘stalled’ ) );

Any clues in there?

I turned transaction logging on and syslog doesn’t show any errors:

May 14 23:20:54 localhost RT: [4994] SQL(0.001437s): SELECT COUNT(DISTINCT
main.id) FROM Tickets main JOIN Transactions Transactions_1 ON (
Transactions_1.ObjectType = ‘RT::Ticket’ ) AND ( Transactions_1.ObjectId =
main.id ) JOIN Attachments Attachments_2 ON ( Attachments_2.TransactionId =
Transactions_1.id ) JOIN AttachmentsIndex AttachmentsIndex_3 ON (
AttachmentsIndex_3.id = Attachments_2.id ) WHERE (main.IsMerged IS NULL)
AND (main.Status != ‘deleted’) AND (main.Type = ‘ticket’) AND ( ( (
AttachmentsIndex_3.query = ‘sace;limit=10000;maxmatches=10000’ ) ) AND (
main.Status = ‘new’ OR main.Status = ‘open’ OR main.Status = ‘stalled’ ) ) ;
(/opt/rt/rt4-mysql/sbin/…/lib/RT/Interface/Web.pm:1292)

View this message in context: http://requesttracker.8502.n7.nabble.com/ticket-content-not-displaying-via-RT-s-web-Search-but-shows-up-with-sphinx-s-search-binary-tp52602p57378.html