So How Does The Elasticsearch Match Query Work?
Executive Summary
The Elasticsearch match query is your go-to search query when you are starting out on some analysis in Elasticsearch. This post attempts to explain how the match query works.
Default Match Query
Let's say we had these two documents to be searched.
POST /test/_doc/1
{"id":1, "name":"MR JB BROW\nN"}

POST /test/_doc/2
{"id":2, "name":"MR JAMIE BBROWN"}

GET /test/_search
{
  "query": {
    "match": {
      "name": "MR JAMES BEN BROWN"
    }
  }
}
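The above query hits both documents, but only because the term MR appears in both of the documents. The match query analyzes the searched text into terms and, by default, a document only needs to contain one of them. If you removed MR from the original two documents, the above query would not have hit either document.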
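To see why MR is the only thing connecting the query to the documents, it can help to mimic the matching step outside Elasticsearch. The following is a minimal Python sketch, assuming the name field uses the default standard analyzer, which is approximated here by lowercasing and splitting on non-alphanumeric characters. The helper names analyze and shared_terms are made up for this illustration and are not part of any Elasticsearch API.

```python
import re

def analyze(text):
    """Rough stand-in for the standard analyzer: lowercase, split on non-alphanumerics."""
    return [token for token in re.split(r"[^0-9a-z]+", text.lower()) if token]

def shared_terms(query, document):
    """Return the query terms that also appear in the document, in query order."""
    doc_terms = set(analyze(document))
    return [term for term in analyze(query) if term in doc_terms]

docs = {1: "MR JB BROW\nN", 2: "MR JAMIE BBROWN"}
query = "MR JAMES BEN BROWN"

for doc_id, name in docs.items():
    # With the default OR operator, one shared term is enough for a hit.
    print(doc_id, shared_terms(query, name))
# 1 ['mr']
# 2 ['mr']
```

Note that BROW\nN is split into the two terms brow and n, so neither document shares the term brown with the query.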
Fuzziness
The simplest option to address minor misspellings is to set the fuzziness parameter. So let's take out MR from our example and try fuzziness.
POST /test/_doc/1
{"id":1, "name":"JB BROW\nN"}

POST /test/_doc/2
{"id":2, "name":"JAMIE BBROWN"}

GET /test/_search
{
  "query": {
    "match": {
      "name": {
        "query": "JAMES BEN BROWN",
        "fuzziness": "AUTO"
      }
    }
  }
}
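The above query now hits both documents without MR, because BROWN is within one edit of both BROW and BBROWN. However, a search for BROWN TOLIET CLEANING would also be a match.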
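The effect of "fuzziness": "AUTO" can be sketched in a few lines of Python. This is only an approximation under stated assumptions: the edit budget follows the documented AUTO defaults (terms of 0 to 2 characters must match exactly, 3 to 5 characters allow one edit, longer terms allow two), and the distance is computed with a simple optimal string alignment routine rather than whatever Elasticsearch does internally. The function names are hypothetical.

```python
def max_edits_auto(term):
    """Edit budget for fuzziness AUTO, based on the length of the searched term."""
    if len(term) < 3:
        return 0
    if len(term) < 6:
        return 1
    return 2

def edit_distance(a, b):
    """Optimal string alignment distance: insert, delete, substitute, swap neighbours."""
    rows, cols = len(a) + 1, len(b) + 1
    dist = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        dist[i][0] = i
    for j in range(cols):
        dist[0][j] = j
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            dist[i][j] = min(
                dist[i - 1][j] + 1,         # deletion
                dist[i][j - 1] + 1,         # insertion
                dist[i - 1][j - 1] + cost,  # substitution
            )
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                dist[i][j] = min(dist[i][j], dist[i - 2][j - 2] + 1)  # transposition
    return dist[rows - 1][cols - 1]

def fuzzy_match(query_term, doc_term):
    """True when doc_term is within the AUTO edit budget of query_term."""
    return edit_distance(query_term, doc_term) <= max_edits_auto(query_term)

print(fuzzy_match("brown", "brow"))    # True: one deletion
print(fuzzy_match("brown", "bbrown"))  # True: one insertion
print(fuzzy_match("james", "jamie"))   # False: two edits needed, budget is one
```

So in this sketch it is BROWN alone that produces both hits; JAMES and BEN are too far from any indexed term.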
AND / OR OPERATOR
The operator flag allows you to specify whether all of the searched terms within a match query must be contained within the searched documents. By default the operator is set to OR. To set the operator to AND we use the below syntax.
GET /test/_search
{
  "query": {
    "match": {
      "name": {
        "query": "BROWN TOLIET CLEANING",
        "fuzziness": "AUTO",
        "operator": "AND"
      }
    }
  }
}
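The difference between the two operators can be illustrated with a small Python sketch. It rests on several assumptions: the analyzer is approximated by lowercasing and splitting on non-alphanumeric characters, fuzziness AUTO is approximated with a plain Levenshtein distance (no transpositions) and the documented length thresholds, and all function names are invented for this example.

```python
import re

def analyze(text):
    """Rough stand-in for the standard analyzer."""
    return [t for t in re.split(r"[^0-9a-z]+", text.lower()) if t]

def levenshtein(a, b):
    """Plain Levenshtein distance (no transpositions), computed row by row."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,                       # deletion
                current[j - 1] + 1,                    # insertion
                previous[j - 1] + (char_a != char_b),  # substitution
            ))
        previous = current
    return previous[-1]

def term_hits(term, doc_terms):
    """True when any document term is within the AUTO edit budget of the term."""
    budget = 0 if len(term) < 3 else 1 if len(term) < 6 else 2
    return any(levenshtein(term, doc_term) <= budget for doc_term in doc_terms)

def matches(query, document, operator="OR"):
    """OR needs at least one searched term to hit, AND needs every one of them."""
    doc_terms = analyze(document)
    hits = [term_hits(term, doc_terms) for term in analyze(query)]
    return all(hits) if operator == "AND" else any(hits)

docs = ["JB BROW\nN", "JAMIE BBROWN"]
for name in docs:
    print(matches("BROWN TOLIET CLEANING", name, "OR"),
          matches("BROWN TOLIET CLEANING", name, "AND"))
# True False
# True False
```

Under OR the fuzzy hit on BROWN is enough, while under AND the missing TOLIET and CLEANING rule both documents out.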
MINIMUM SHOULD MATCH
The minimum_should_match parameter allows you to specify how many of the searched terms are required to match. The below is the simplest usage of minimum_should_match, where we have specified that two of the searched terms are required to match.
GET /test/_search
{
  "query": {
    "match": {
      "name": {
        "query": "BROWN TOLIET CLEANING",
        "fuzziness": "AUTO",
        "minimum_should_match": 2
      }
    }
  }
}
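The requirement can also be given as a percentage of the searched terms. The result is rounded down, so 67% of three terms again means two terms are required to match.

GET /test/_search
{
  "query": {
    "match": {
      "name": {
        "query": "BROWN TOLIET CLEANING",
        "fuzziness": "AUTO",
        "minimum_should_match": "67%"
      }
    }
  }
}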
Elasticsearch also allows us to combine minimum_should_match criteria. Each condition has the form N<value and only applies when more than N terms are searched:

GET /test/_search
{
  "query": {
    "match": {
      "name": {
        "query": "BROWN TOLIET CLEANING",
        "fuzziness": "AUTO",
        "minimum_should_match": "1<2 5<60%"
      }
    }
  }
}
With 1<2 5<60%, a single searched term must match, 2 to 5 searched terms require 2 matches, and more than 5 searched terms require 60% of the terms, rounded down:

Searched Terms | Document Must Contain | Example Search | Example Hit |
---|---|---|---|
1 Word | The Searched Term | JAMES | JAMES BROWN |
2 Words | 2 Searched Terms | JAMES BROWN | JAMES BROWN |
3 Words | 2 Searched Terms | JAMES B BROWN | JAMES E BROWN |
4 Words | 2 Searched Terms | JAMES B D BROWN | JAMES E BROWN |
5 Words | 2 Searched Terms | MR JAMES B D BROWN | JAMES E BROWN |
6 Words | 3 Searched Terms (60% of 6, rounded down) | MR JAMES B D VAN BROWN | MR JAMES VAN BROWN |
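To check how a minimum_should_match value plays out for different numbers of searched terms, the rules can be approximated in Python. This is a sketch based on the documented behaviour (conditions of the form N<value apply only above N terms, percentages are rounded down), not Elasticsearch's own code. The function name required_matches is hypothetical, and negative values are deliberately not handled.

```python
def required_matches(term_count, spec):
    """Approximate number of searched terms that must match for a given spec.

    Handles non-negative integers ("2"), percentages ("67%") and
    conditional combinations ("1<2 5<60%"). Negative values are out of scope.
    """
    spec = str(spec).strip()
    if "<" in spec:
        result = term_count  # at or below the first bound, every term is required
        for condition in spec.split():
            bound, value = condition.split("<")
            if term_count <= int(bound):
                return result
            result = required_matches(term_count, value)
        return result
    if spec.endswith("%"):
        return term_count * int(spec[:-1]) // 100  # percentage, rounded down
    return int(spec)

for count in range(1, 7):
    print(count, required_matches(count, "1<2 5<60%"))
# 1 1
# 2 2
# 3 2
# 4 2
# 5 2
# 6 3
```

The same helper gives 2 for both required_matches(3, 2) and required_matches(3, "67%"), matching the two simpler queries earlier in this section.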