Full-text vs Term-level Queries — When to use which


Once we understand the difference between match and term, we can generalize it to two whole families of queries — full-text and term-level. Knowing which family to reach for is half the battle.

The mental model

In simple language — full-text queries are for human language, term-level queries are for structured data.

FULL-TEXT QUERIES
Analyzed → tokens → search
For: text, prose, search bars
• match
• match_phrase
• multi_match
• query_string
• simple_query_string
• match_phrase_prefix
• intervals

TERM-LEVEL QUERIES
No analysis → exact terms
For: keywords, IDs, numbers, dates
• term / terms
• range
• exists
• prefix
• wildcard
• regexp
• fuzzy
• ids

The field type connection

This isn’t an arbitrary choice — it’s tied to the field mapping.

  • text fields are analyzed. Full-text queries work here.
  • keyword, numeric, date, boolean, IP fields are not analyzed. Term-level queries work here.

When we index a string into a field that has no explicit mapping, dynamic mapping creates both a text field and a keyword subfield:

"product_name": {
  "type": "text",
  "fields": {
    "keyword": { "type": "keyword", "ignore_above": 256 }
  }
}

So we can do match on product_name (analyzed) AND term on product_name.keyword (exact). This dual mapping is why .keyword shows up everywhere.
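
To see why the two subfields behave differently, here is a rough Python sketch of what happens at index time. The analyze function is only a simplified stand-in for the standard analyzer (lowercase, then split on anything that is not an ASCII letter or digit); the real analyzer follows Unicode segmentation rules. The toy indexes, helper names, and sample documents are hypothetical.

```python
import re
from collections import defaultdict

def analyze(text):
    """Simplified stand-in for the standard analyzer: lowercase, split on non-alphanumerics."""
    return [tok for tok in re.split(r"[^0-9a-z]+", text.lower()) if tok]

# Two toy inverted indexes: one per subfield of the dual mapping.
text_index = defaultdict(set)      # product_name (analyzed): token -> doc ids
keyword_index = defaultdict(set)   # product_name.keyword: whole string -> doc ids

def index_doc(doc_id, product_name):
    for token in analyze(product_name):
        text_index[token].add(doc_id)
    keyword_index[product_name].add(doc_id)  # stored verbatim, no analysis

def term_query(index, value):
    """Term-level lookup: the value is looked up exactly as given."""
    return set(index.get(value, set()))

def match_query(value):
    """Full-text lookup: analyze the input, then OR the per-token lookups."""
    hits = set()
    for token in analyze(value):
        hits |= text_index.get(token, set())
    return hits

index_doc(1, "MacBook Pro")
index_doc(2, "MacBook Air")

print(match_query("macbook pro"))                # {1, 2}
print(term_query(text_index, "MacBook Pro"))     # set()
print(term_query(keyword_index, "MacBook Pro"))  # {1}
```

The term lookup against the analyzed index finds nothing because that index only holds the tokens "macbook", "pro" and "air", while the keyword index holds the original strings untouched.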

A real-world example combining both

Search bar query: “user typed ‘macbook’ and selected category=laptops, price under 2000, in stock.”

GET /products/_search
{
  "query": {
    "bool": {
      "must": [
        { "match": { "title": "macbook" } }
      ],
      "filter": [
        { "term":  { "category.keyword": "laptops" } },
        { "term":  { "in_stock": true } },
        { "range": { "price": { "lt": 2000 } } }
      ]
    }
  }
}

Notice the split: the natural-language part uses match in query context (full-text, scored), while the structured filters use term/range in filter context (term-level, not scored, and eligible for caching). That’s the standard production pattern.
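
In application code, this request body is usually assembled from whatever the user typed and selected. A minimal sketch in Python, assuming the same field names as the request above; the function name build_product_search and its parameters are hypothetical, and the result is a plain dict that can be sent as the search body by whichever client is in use.

```python
def build_product_search(text, category=None, max_price=None, in_stock_only=False):
    """Full-text input goes in must (scored); structured choices go in filter (unscored)."""
    filters = []
    if category is not None:
        filters.append({"term": {"category.keyword": category}})
    if in_stock_only:
        filters.append({"term": {"in_stock": True}})
    if max_price is not None:
        filters.append({"range": {"price": {"lt": max_price}}})

    bool_query = {"must": [{"match": {"title": text}}]}
    if filters:
        # Only add the filter clause when the user actually selected something.
        bool_query["filter"] = filters
    return {"query": {"bool": bool_query}}

body = build_product_search("macbook", category="laptops", max_price=2000, in_stock_only=True)
```

Calling it with only the search text yields a bool query with a single must clause, so unselected filters never restrict the results.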

Common mistakes

1. Using match on a status enum

{ "match": { "status": "shipped" } }

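Works when status is a text field, but the input is analyzed: "shipped" gets lowercased and tokenized, and so was every status value at index time. If we ever index a status like "Partially Shipped", it is stored as the tokens ["partially", "shipped"], so match: "shipped" will match it (wrong). Map status as keyword and use term.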
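
A minimal sketch of the fix, written as Python dicts: an explicit mapping that makes status a keyword field, and a helper that filters on it with term. The variable name orders_mapping and the helper status_filter are hypothetical.

```python
# Hypothetical explicit mapping: status is an enum, so map it as keyword, not text.
orders_mapping = {
    "mappings": {
        "properties": {
            "status": {"type": "keyword"}
        }
    }
}

def status_filter(status):
    """Exact match on the keyword field; the value must equal the stored string, case included."""
    return {"query": {"bool": {"filter": [{"term": {"status": status}}]}}}

shipped_only = status_filter("shipped")
```

With this mapping, "Partially Shipped" is indexed as one term, so a term query for "shipped" no longer matches it.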

2. Using term on a text field

{ "term": { "title": "MacBook Pro" } }

Almost guaranteed to return nothing. The index has tokens like ["macbook", "pro"], not the literal string "MacBook Pro". Use term on title.keyword, or switch to match.

3. Using match_phrase when match would do

{ "match_phrase": { "title": "blue shoes" } }

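match_phrase requires the terms to appear next to each other and in the same order, so "shoes that are blue" and "blue suede shoes" won’t match. Sometimes that’s what we want, but most of the time match is more forgiving and gives better recall, because by default it accepts documents containing any of the terms.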
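
The difference in strictness can be illustrated with a small Python simulation. This is a sketch of the matching rules only, not of how Elasticsearch implements them: analyze is a simplified stand-in for the standard analyzer, match_any mimics match with its default OR operator, and match_phrase mimics match_phrase with no slop. The sample titles are made up.

```python
import re

def analyze(text):
    """Simplified stand-in for the standard analyzer: lowercase, split on non-alphanumerics."""
    return [t for t in re.split(r"[^0-9a-z]+", text.lower()) if t]

def match_any(query, doc):
    """Like match with the default OR operator: any shared token is a hit."""
    return bool(set(analyze(query)) & set(analyze(doc)))

def match_phrase(query, doc):
    """Like match_phrase without slop: query tokens must be adjacent and in order."""
    q, d = analyze(query), analyze(doc)
    if not q:
        return False
    # Slide a window of len(q) over the document tokens and compare.
    return any(d[i:i + len(q)] == q for i in range(len(d) - len(q) + 1))

titles = ["Blue shoes for kids", "Shoes that are blue", "Blue suede shoes"]
for title in titles:
    print(title, match_any("blue shoes", title), match_phrase("blue shoes", title))
# Blue shoes for kids True True
# Shoes that are blue True False
# Blue suede shoes True False
```

All three titles are hits for the forgiving lookup, but only the first one contains the two words side by side in the queried order.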

When to use the rarer ones

  • match_phrase — when word order matters. “Star Wars” should NOT match docs that contain just “star” and “wars” separately.
  • match_phrase_prefix — autocomplete. “star wa” matches “star wars”.
  • query_string — power-user search syntax (+required -excluded "phrase"). Powerful but exposes Lucene syntax to users — risky for public-facing search.
  • simple_query_string — safer subset of query_string. Invalid syntax doesn’t throw an error.
  • terms (plural) — exact match against a list. Like SQL IN: { "terms": { "category.keyword": ["laptops", "tablets"] } }.
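
For reference, here is roughly what three of these look like as request bodies, written as Python dicts. The field names title and category.keyword follow the earlier examples; the search strings and variable names are made up for illustration.

```python
# Autocomplete: the last term ("wa") is treated as a prefix, the earlier terms as a phrase.
autocomplete = {"query": {"match_phrase_prefix": {"title": "star wa"}}}

# User-facing search box: quoted phrases, + and - operators.
# Parts of the string that cannot be parsed are ignored instead of raising an error.
search_box = {
    "query": {
        "simple_query_string": {
            "query": '"star wars" +trilogy -prequel',
            "fields": ["title"],
            "default_operator": "and",
        }
    }
}

# SQL IN equivalent: the document matches if the field equals any value in the list.
category_in = {"query": {"terms": {"category.keyword": ["laptops", "tablets"]}}}
```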

Scoring vs filtering — second axis

There’s a second decision orthogonal to full-text vs term-level — query context vs filter context.

            | Query context            | Filter context
Full-text   | match inside must/should | Possible but rare (use match inside filter if no scoring needed)
Term-level  | term inside must (rare)  | term inside filter ← the common case

The takeaway — full-text usually goes in query context (must/should), term-level usually goes in filter context (filter/must_not).
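
Since the context is decided by where a clause sits in the bool query and not by the clause itself, the same clause can be placed either way. A small Python sketch, with a hypothetical helper named in_context:

```python
def in_context(clause, scored):
    """Wrap a clause in query context (must) or filter context (filter)."""
    occurrence = "must" if scored else "filter"
    return {"query": {"bool": {occurrence: [clause]}}}

# Full-text in query context: contributes to the relevance score.
scored_search = in_context({"match": {"title": "macbook"}}, scored=True)

# Term-level in filter context: yes/no only, the common case.
plain_filter = in_context({"term": {"category.keyword": "laptops"}}, scored=False)

# Full-text in filter context: allowed, but rare.
unscored_text = in_context({"match": {"title": "macbook"}}, scored=False)
```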

Quick rules

  • Searching prose → full-text family (match, multi_match).
  • Filtering structured data → term-level family (term, range, exists).
  • Exact string match → term on .keyword subfield.
  • Phrase search → match_phrase.
  • User-typed search bar with multiple filters → bool with must: [match] + filter: [term, range]. The standard pattern.