Full text indexing in 4.4.1 - searching

Having set up full text indexing in RT 4.4.1, it appears to be working well. However…

A user of our system came to me and asked how to search for a ‘word’ which was actually a MAC address formulated something like this:

11-22-33-44-aa-bb

Attempting to search for this as a word failed, despite it being present in one ticket’s content. Are there any guidelines on how to search, as opposed to how to set up the indexing? I assume only whole words are searchable, because substrings don’t seem to be, and I assume that in the example above something about the hyphens is preventing the MAC address ‘word’ from being recognised. I’ve looked for docs on this but have been unable to find anything that explicitly explains what can be expected to work, or whether there is any means of escaping punctuation characters to make such a search succeed.

Have you tried putting the string in quotes? I find that sometimes gives better results than leaving quotes off, even for single words that shouldn’t be caught as keywords.

Hi all,

I’m having a similar issue but with IP addresses - single & double quotes and escaping the periods (.) are not helping.

The output always seems to be matched on at least the first three characters (172), so the search returns multiple tickets that share that first octet but not the rest of the address. For example, if I search for 172.123.123.123 I receive the following matches within tickets:
172.123.123.123
172.9.23.78 - essentially anything that starts with 172
172.0.154.14

@timshaw61 - Were you able to resolve the issue?

Attempted searches:
fulltext:172.123.123.123
fulltext:172.123
fulltext:'172.123.123.123'
fulltext:'172.123'
fulltext:'172.123.123.123'
fulltext:"172.123.123.123"
fulltext:"172.123.123.123"
fulltext:"172.123"

Hi there,

Have you tried searching with escaped special characters?
For example: Content LIKE ‘“172\.123\.123\.123”’

I noticed that it is important to enter a double quote at the beginning and the end. The single quotes and the second backslashes are added during the search.
So what you have to enter into the field is: Content LIKE “172.123.123.123”.

Hope that helps.

I have not yet found a way to search for MAC addresses formatted with colon separators, and it appears from what is being asked here that IP addresses cannot be extracted from a full text search either. No escaping mechanism that I have tried has any effect, and surrounding the text in quotation marks has no effect either.

Hi,

What DB and fulltext index system are you using? Certainly the IP address search seems to work for me using a PostgreSQL DB and fulltext index. I do not know about the MAC address search, since I do not have a simple example to try. What you can find with a fulltext index depends on the tokens that the text is decomposed into by the parsing process. It looks like the default parser will work with IP addresses in PostgreSQL. You will need to check, and possibly modify, the parsing process to generate the appropriate search tokens for your searches.
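To illustrate the point about tokens, here is a rough Python sketch of a simplified tokenizer. It is only a model, not the real parser of any database: it assumes the text is split on anything that is not a letter, digit or underscore, and that tokens shorter than a minimum length are dropped. The function name `tokenize` and the `min_token_size` parameter are made up for this example.

```python
import re

def tokenize(text, min_token_size=3):
    """Split text into index tokens under a simplified, assumed model.

    Assumption: separators are any run of characters other than letters,
    digits or underscore, and tokens shorter than min_token_size are
    not indexed at all.
    """
    tokens = re.split(r"[^0-9A-Za-z_]+", text.lower())
    # Drop empty strings and tokens below the minimum length.
    return [t for t in tokens if len(t) >= min_token_size]

# An IP address breaks into its octets, so "172" alone becomes a token.
print(tokenize("172.123.123.123"))

# Every part of this MAC address is two characters long, so with a
# minimum length of 3 nothing is left to index.
print(tokenize("11-22-33-44-aa-bb"))

# With a minimum length of 2 the six parts survive as separate tokens.
print(tokenize("11-22-33-44-aa-bb", min_token_size=2))
```

Under these assumptions, a search for the whole address can only ever match on the individual pieces, which would be consistent with the results reported earlier in the thread.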

Regards,
Ken

Hello,

At the CS department of the University of BC, in Canada, we just upgraded from RT 4.0.24 to RT 4.4.2 and are experiencing the same problem when doing a full text search using a phrase that contains non-alphabetic characters, such as a period ("."). For example, “fulltext:bennu.magic” was the example given to me by one of our users. The search returned 400+ results, many of which contained just the word “magic”. However, searching for just “fulltext:bennu” returned the desired two records.

I’ve tried double quotes, single quotes and the escape ("\") character to no avail. I’ve also tried using the escape character on the period within a quoted string. I’ve also tried using “like” rather than “fulltext”.

I did some reading here: https://docs.bestpractical.com/rt/4.4.2/full_text_indexing.html

In the end, what did work was: like:"+bennu +magic" which returned the desired two records. However this searches for both words but not necessarily adjacent to each other.
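Since "+bennu +magic" only requires both words to be present, one possible workaround is to post-filter the matching tickets for the literal string. The Python sketch below assumes you have already fetched the text of the candidate tickets by some other means; the function names and the ticket ids are hypothetical and nothing here is part of RT itself.

```python
def contains_literal(content, needle):
    """Return True if needle appears verbatim in content, ignoring case."""
    return needle.lower() in content.lower()

def filter_exact(candidates, needle):
    """Keep only the ids whose text contains the exact string.

    candidates is a dict mapping a (hypothetical) ticket id to the
    ticket text returned by the broader "+word +word" search.
    """
    return [tid for tid, text in candidates.items()
            if contains_literal(text, needle)]

# Made-up sample data: both tickets contain both words, but only the
# first one contains them joined by a period.
candidates = {
    101: "The host bennu.magic is not responding",
    102: "bennu uses magic numbers in its config",
}
print(filter_exact(candidates, "bennu.magic"))
```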

Another example which worked was: like:'"bennu.magic" (that’s a single quote, a double quote, the search phrase and a double quote only). I found this approach on the same web page, above.

We are using MySQL 5.7 native support.

Trevor

Hi,

Try changing the minimum word length full-text parameter, e.g. innodb_ft_min_token_size for InnoDB search indexes on MySQL, which is 3 by default.

Modify the parameter valid for your DB and engine, restart the service and re-index the table with an appropriate command, such as ‘OPTIMIZE TABLE rt4.AttachmentsIndex;’ in my case. It may take a long time (hours), but once it has finished you can try searching with “+11 +22 +33 +44 +aa +bb”. Non-alphanumeric characters are ignored.

I’ve tried with innodb_ft_min_token_size=2 and it works nicely. Take a look at https://dev.mysql.com/doc/refman/5.7/en/fulltext-boolean.html for more info.
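If it is useful, here is a small Python sketch for turning an address into that kind of "+part +part" search string. It assumes, as described above, that non-alphanumeric characters act as separators, and it also reports any parts that are shorter than the configured minimum token size, since those would not be in the index. The function name `boolean_terms` is just something I made up for the example.

```python
import re

def boolean_terms(value, min_token_size=3):
    """Build a '+a +b ...' search string from an address-like value.

    Returns the search string and a list of the parts that are shorter
    than min_token_size (assumed not to be indexed).
    """
    # Split on anything that is not a letter, digit or underscore.
    parts = [p for p in re.split(r"[^0-9A-Za-z_]+", value) if p]
    too_short = [p for p in parts if len(p) < min_token_size]
    query = " ".join("+" + p for p in parts)
    return query, too_short

# With the minimum lowered to 2, every part of the MAC address is usable.
print(boolean_terms("11-22-33-44-aa-bb", min_token_size=2))

# With the default of 3, all six parts are flagged as too short.
print(boolean_terms("11-22-33-44-aa-bb"))
```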

Hope that helps.