Lucene Index problems with "-" character, StandardAnalyzer

This is the forum for JIDE Common Layer, which is open sourced at https://github.com/jidesoft/jide-oss. Please note that JIDE technical support doesn't monitor this forum as often as other forums. Please consider subscribing to technical support for JIDE Common Layer so that you can use the customer-only forum to get a timely response.

Moderator: JIDE Support


Postby jeffciara » Thu Mar 18, 2021 2:47 am

Hi, I have the same problem as described in "Lucene Index problems with “-” character" (hyphen).

I need to search for a full word that includes a hyphen, without the hyphen "-" being treated as an operator in the Query Parser Syntax.

To solve this, I need to replace the current Lucene "StandardAnalyzer" with another analyzer such as "WhitespaceAnalyzer" or "SimpleAnalyzer", but I haven't found where to do that in my code.

I'm using:
  • com.jidesoft.lucene.LuceneQuickTableFilterField
  • com.jidesoft.lucene.LuceneFilterableTableModel
  • com.jidesoft.grid.SortableTable
Code:
luceneQuickTableFilterField.setTableModel(defaulTableModel);
luceneFilterableTableModel = new LuceneFilterableTableModel(luceneQuickTableFilterField.getDisplayTableModel());
sortableTable.setModel(luceneFilterableTableModel);

Currently, with a table containing "MSG-123", searching for:
  • "MSG" finds the entry "MSG-123"
  • "MSG-" finds nothing (cause: the hyphen "-" is interpreted as the exclusion operator)
  • "MSG-123" finds nothing (cause: the hyphen "-" is interpreted as the exclusion operator)
Thanks
jeffciara
 
Posts: 10
Joined: Thu Aug 26, 2010 11:55 pm

Re: Lucene Index problems with "-" character, StandardAnalyz

Postby JIDE Support » Thu Mar 18, 2021 9:55 am

I haven't looked it up in the Lucene documentation, but can't you escape the "-"?
JIDE Software Technical Support Team
JIDE Support
Site Admin
 
Posts: 37219
Joined: Sun Sep 14, 2003 10:49 am

Re: Lucene Index problems with "-" character, StandardAnalyz

Postby jeffciara » Fri Mar 19, 2021 12:18 am

There is a section called "Escaping Special Characters" in the Apache Lucene Query Parser Syntax documentation.

I tried it in my search field with an escaped value, and it doesn't work. Searching for:
  • "MSG\-" not found
  • "MSG\-123" not found
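
For reference, the escaping described in that documentation can be sketched in plain Java as below. This is only an illustration: the class name `QueryEscapeSketch` is hypothetical, and the character list is taken from the special characters named in the Query Parser Syntax documentation. Lucene's classic query parser also provides its own static `escape(String)` helper, which would normally be preferable to a hand-written one. Note that escaping only stops the parser from reading "-" as an operator; it does not change how the text was tokenized when it was indexed.

```java
/** Sketch: backslash-escape characters that the classic Lucene query syntax treats as special. */
final class QueryEscapeSketch {

    // Special characters named in the Lucene Query Parser Syntax documentation
    // (both characters of && and || are escaped individually).
    private static final String SPECIAL = "\\+-!():^[]\"{}~*?|&/";

    /** Returns the input with a backslash placed in front of every special character. */
    static String escape(String input) {
        StringBuilder out = new StringBuilder(input.length() + 8);
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (SPECIAL.indexOf(c) >= 0) {
                out.append('\\');
            }
            out.append(c);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(escape("MSG-"));     // prints MSG\-
        System.out.println(escape("MSG-123"));  // prints MSG\-123
    }
}
```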

Re: Lucene Index problems with "-" character, StandardAnalyz

Postby JIDE Support » Fri Mar 19, 2021 10:08 am

Did you turn on Lucene Input Mode? Click the search icon and you will see a dropdown menu whose first item is Lucene Input Mode.

Re: Lucene Index problems with "-" character, StandardAnalyz

Postby jeffciara » Mon Mar 22, 2021 12:44 am

I tried activating Lucene Input Mode without success: either my input field doesn't trigger a search, or the result in the table is not refreshed.
I'll retry...

Re: Lucene Index problems with "-" character, StandardAnalyz

Postby JIDE Support » Mon Mar 22, 2021 7:44 am

Activate the mode first, then type.

Re: Lucene Index problems with "-" character, StandardAnalyz

Postby jeffciara » Thu Mar 25, 2021 5:15 am

Hi, it doesn't work.
With Lucene Input Mode and a hyphen in the search text, it lists other entries as well.
It only works with the full-text search option "Match exactly", but I would like to use "Match anywhere" with a hyphen.

All rows/values: "number", "number-", "number-zero", "number-one"
[Screenshot not preserved]

With default (no Lucene Input Mode), one screenshot per search (screenshots not preserved):
  • Search "num"
  • Search "number"
  • Search "number-"
  • Search "number-zero"
  • Search ""number-zero"" (enclosed in quotes)

With Lucene Input Mode enabled, one screenshot per search (screenshots not preserved):
  • Search "num"
  • Search "number"
  • Search "number-"
  • Search "number-zero"
  • Search "number\-zero" (escaped: doesn't work)
  • Search "number -zero"
  • Search ""number-zero"" (enclosed in quotes)

Re: Lucene Index problems with "-" character, StandardAnalyz

Postby JIDE Support » Thu Mar 25, 2021 9:03 am

In Lucene mode, you have to follow the Lucene syntax exactly. We just pass the string to Lucene and Lucene runs the query. We don't play any role during the search. In non-Lucene mode, we read your input and convert it to the Lucene syntax. In this case, I noticed we didn't escape the "-", which is why it was not working. I will fix that.

Re: Lucene Index problems with "-" character, StandardAnalyz

Postby JIDE Support » Thu Mar 25, 2021 12:10 pm

I can't figure out what's going on. Even when I escape the "-", there is still no result.

This is the query I am using if you type "number-" in non-Lucene mode. I basically translated it from plain text to a Lucene-compatible query. However, it is still not working. I don't know what's wrong with it. You can switch to Lucene mode and type the same string as below to try it and see if you can get it working.

Code:
categoryname:*number\-* productname:*number\-* productsales:*number\-* shippeddate:*number\-*
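
The translation from plain text to this kind of query could look roughly like the sketch below. This is not JIDE's actual code, only a guess at the shape of the conversion based on the query shown above; the class and method names (`ContainsQuerySketch`, `buildContainsQuery`) are hypothetical. The user's text is escaped first, then wrapped in `*` wildcards once per column.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Sketch: turn plain "match anywhere" input into one wildcard clause per field. */
final class ContainsQuerySketch {

    // Characters the Lucene Query Parser Syntax documentation lists as special.
    private static final String SPECIAL = "\\+-!():^[]\"{}~*?|&/";

    /** Backslash-escapes special characters in the user's text. */
    static String escape(String input) {
        StringBuilder out = new StringBuilder();
        for (char c : input.toCharArray()) {
            if (SPECIAL.indexOf(c) >= 0) {
                out.append('\\');
            }
            out.append(c);
        }
        return out.toString();
    }

    /** Builds "field:*text*" for every field, separated by spaces. */
    static String buildContainsQuery(List<String> fields, String userText) {
        String escaped = escape(userText);
        List<String> clauses = new ArrayList<>();
        for (String field : fields) {
            // The wildcards are added after escaping so they stay unescaped.
            clauses.add(field + ":*" + escaped + "*");
        }
        return String.join(" ", clauses);
    }

    public static void main(String[] args) {
        List<String> fields = Arrays.asList(
                "categoryname", "productname", "productsales", "shippeddate");
        System.out.println(buildContainsQuery(fields, "number-"));
    }
}
```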

Re: Lucene Index problems with "-" character, StandardAnalyz

Postby jeffciara » Fri Mar 26, 2021 12:33 am

Hi, thanks for your suggestion, but it doesn't work either.

numbe : no result ("numbe" is not a full word)
number : all results (all words contain "number")
number- : all results (all words contain "number")
number\- : all results (all words contain "number")
*number* : all results
*number-* : no result
name:number : all results
name:*number* : all results
name:*number\-* : no result
name:number-zero : all results
name:"number-zero" : 1 result "number-zero"
name:number\-zero : all results
name:"number\-zero" : 1 result "number-zero"
name:*number\-zero*: no result
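
One possible explanation for this pattern, offered as an assumption rather than a confirmed diagnosis: StandardAnalyzer splits text at the hyphen when indexing, so "number-zero" is stored as the two terms "number" and "zero", and no indexed term contains a "-". A wildcard query such as `*number\-*` is matched against single terms, so it finds nothing, while a quoted phrase is analyzed the same way as the indexed text and still matches. The sketch below imitates this with plain Java. The two tokenizers are crude stand-ins (the real StandardAnalyzer follows Unicode word-break rules and differs in detail), and all names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Sketch: why a hyphenated wildcard finds nothing after StandardAnalyzer-style indexing. */
final class HyphenTokenSketch {

    /** Rough stand-in for StandardAnalyzer: lowercase, split on anything that is not a letter or digit. */
    static List<String> standardLike(String text) {
        List<String> terms = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!t.isEmpty()) {
                terms.add(t);
            }
        }
        return terms;
    }

    /** Rough stand-in for WhitespaceAnalyzer: split on whitespace only, keep everything else. */
    static List<String> whitespaceLike(String text) {
        List<String> terms = new ArrayList<>();
        for (String t : text.trim().split("\\s+")) {
            if (!t.isEmpty()) {
                terms.add(t);
            }
        }
        return terms;
    }

    /** Imitates a *fragment* wildcard: true if any single indexed term contains the fragment. */
    static boolean anyTermContains(List<String> terms, String fragment) {
        for (String t : terms) {
            if (t.contains(fragment)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> std = standardLike("number-zero");
        List<String> ws = whitespaceLike("number-zero");
        System.out.println(std);                              // [number, zero]
        System.out.println(ws);                               // [number-zero]
        System.out.println(anyTermContains(std, "number-"));  // false
        System.out.println(anyTermContains(ws, "number-"));   // true
    }
}
```

If this reading is right, the fix would be on the indexing side (an analyzer that keeps the hyphen inside the term) rather than in the escaping of the query.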

Re: Lucene Index problems with "-" character, StandardAnalyz

Postby JIDE Support » Fri Mar 26, 2021 10:09 am

I didn't say it works. I have no idea why it is not working. Need a Lucene expert to help us out.

