Text Search

This guide covers exact (non-fuzzy) text search with SearchSpec: the CONTAINS, PREFIX, and EXACT modes, multi-field search, weighted fields, and pipeline integration.

Basic Usage

SearchEngine

SearchEngine searches in-memory sequences with relevance ranking:

from pypaginate import SearchSpec
from pypaginate.search.engine import SearchEngine

engine = SearchEngine()

products = [
    {"title": "Python Book", "description": "Learn Python programming"},
    {"title": "JavaScript Guide", "description": "Master JS development"},
    {"title": "Go Handbook", "description": "Golang essentials"},
]

spec = SearchSpec(query="python", fields=("title", "description"))
results = engine.apply(products, spec)
# [Python Book] -- matches both title and description

MemorySearchBackend

MemorySearchBackend satisfies the SearchBackend protocol for pipeline use:

from pypaginate import SearchSpec
from pypaginate.adapters.memory import MemorySearchBackend

backend = MemorySearchBackend()

# `products` is the list defined in the SearchEngine example above
filtered = backend.apply_search(products, SearchSpec(
    query="python",
    fields=("title", "description"),
))

Search Modes

CONTAINS (Default)

Matches when the token appears anywhere in the field value:

from pypaginate import SearchSpec, SearchFieldMode

spec = SearchSpec(
    query="python",
    fields=("title",),
    mode=SearchFieldMode.CONTAINS,  # default
)
# "Python Book" matches (contains "python")
# "Learn Python Programming" matches

PREFIX

Matches when the field value starts with the token:

from pypaginate import SearchSpec, SearchFieldMode

spec = SearchSpec(
    query="py",
    fields=("title",),
    mode=SearchFieldMode.PREFIX,
)
# "Python Book" matches (starts with "py")
# "Learn Python" does NOT match

EXACT

Matches when the normalized field value equals the normalized token:

from pypaginate import SearchSpec, SearchFieldMode

spec = SearchSpec(
    query="python book",
    fields=("title",),
    mode=SearchFieldMode.EXACT,
)
# "Python Book" matches (normalizes to "python book")
# "Python Book 2nd Edition" does NOT match
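To make the difference between the three modes concrete, here is a minimal pure-Python sketch of the comparisons described above. It is not pypaginate's implementation: the `Mode` enum, `normalize`, and `matches` are hypothetical stand-ins, and normalization is reduced to trimming and lowercasing.

```python
from enum import Enum


class Mode(Enum):
    """Hypothetical stand-in for SearchFieldMode, for illustration only."""
    CONTAINS = "contains"
    PREFIX = "prefix"
    EXACT = "exact"


def normalize(text: str) -> str:
    # Simplified normalization (assumption): trim and lowercase only.
    return text.strip().lower()


def matches(value: str, token: str, mode: Mode = Mode.CONTAINS) -> bool:
    """Return True if `value` matches `token` under the given mode."""
    value, token = normalize(value), normalize(token)
    if mode is Mode.PREFIX:
        return value.startswith(token)
    if mode is Mode.EXACT:
        return value == token
    return token in value  # CONTAINS


print(matches("Learn Python", "python"))                  # True
print(matches("Learn Python", "py", Mode.PREFIX))         # False
print(matches("Python Book", "python book", Mode.EXACT))  # True
```

The sketch compares a single token against a single field value; how the library combines several tokens and fields is covered in the sections that follow.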

Multi-Field Search

Search across multiple fields simultaneously. A result matches if any field contains the token:

from pypaginate import SearchSpec
from pypaginate.search.engine import SearchEngine

engine = SearchEngine()

employees = [
    {"name": "Alice Smith", "email": "alice@corp.com", "department": "Engineering"},
    {"name": "Bob Johnson", "email": "bob@corp.com", "department": "Sales"},
]

spec = SearchSpec(query="alice", fields=("name", "email", "department"))
results = engine.apply(employees, spec)
# [Alice Smith] -- matches in both name and email

Weighted Fields

Assign different weights to fields to control relevance ranking. Higher weights make matches in that field rank higher:

from pypaginate import SearchSpec
from pypaginate.search.engine import SearchEngine

engine = SearchEngine()

products = [
    {"title": "Python", "description": "A snake species"},
    {"title": "Cobra", "description": "A python library for CLI"},
]

# Title matches are twice as important as description matches
spec = SearchSpec(
    query="python",
    fields=("title", "description"),
    weights={"title": 2.0, "description": 1.0},
)
results = engine.apply(products, spec)
# [{"title": "Python", ...}, {"title": "Cobra", ...}]
# "Python" ranks higher (title match with 2x weight)

Fields that are not listed in the weights dict default to a weight of 1.0.
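One way to picture weighted ranking is as a sum of the weights of the fields that match. The sketch below assumes exactly that; the scoring formula pypaginate actually uses is not described in this guide, and the `score` helper is hypothetical.

```python
def score(item: dict, token: str, fields: tuple, weights: dict) -> float:
    """Sum the weights of every field whose value contains the token."""
    total = 0.0
    for field in fields:
        value = str(item.get(field, "")).lower()
        if token.lower() in value:
            total += weights.get(field, 1.0)  # unlisted fields weigh 1.0
    return total


products = [
    {"title": "Cobra", "description": "A python library for CLI"},
    {"title": "Python", "description": "A snake species"},
]
fields = ("title", "description")
weights = {"title": 2.0, "description": 1.0}

# Highest score first: a title hit (2.0) outranks a description hit (1.0).
ranked = sorted(
    products,
    key=lambda item: score(item, "python", fields, weights),
    reverse=True,
)
print([p["title"] for p in ranked])  # ['Python', 'Cobra']
```

With equal weights both items would score 1.0, and their relative order would depend only on the input order, since Python's sort is stable.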

Multi-Word Queries

Queries with multiple words are tokenized. All tokens must match for an item to be included:

from pypaginate import SearchSpec

spec = SearchSpec(query="alice smith", fields=("name",))
# Tokenized to ["alice", "smith"]
# Both tokens must match somewhere in the searched fields
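The all-tokens rule can be sketched in a few lines of plain Python. This is an illustration under assumptions (whitespace tokenization, substring matching, a hypothetical `matches_all_tokens` helper), not the library's code.

```python
def matches_all_tokens(item: dict, query: str, fields: tuple) -> bool:
    """True if every whitespace-separated token appears in at least one field."""
    tokens = query.lower().split()
    values = [str(item.get(field, "")).lower() for field in fields]
    return all(any(token in value for value in values) for token in tokens)


employees = [
    {"name": "Alice Smith", "email": "alice@corp.com"},
    {"name": "Alice Jones", "email": "ajones@corp.com"},
]

hits = [
    e["name"]
    for e in employees
    if matches_all_tokens(e, "alice smith", ("name", "email"))
]
print(hits)  # ['Alice Smith']
```

"Alice Jones" is excluded because "smith" appears in none of her searched fields, even though "alice" does.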

Nested Field Access

Search in nested attributes or dictionary keys with dot notation:

from pypaginate import SearchSpec

spec = SearchSpec(
    query="developer",
    fields=("user.profile.bio",),
)
# Accesses item["user"]["profile"]["bio"] or item.user.profile.bio
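Dot-notation lookup over both dictionaries and objects can be sketched as a small resolver. The `resolve` function below is hypothetical and only illustrates the idea; in particular, returning None for a missing segment is an assumption, not documented behavior.

```python
from types import SimpleNamespace
from typing import Any


def resolve(item: Any, path: str) -> Any:
    """Follow a dotted path through dicts and attributes; None if missing."""
    current = item
    for part in path.split("."):
        if isinstance(current, dict):
            current = current.get(part)
        else:
            current = getattr(current, part, None)
        if current is None:
            return None
    return current


as_dict = {"user": {"profile": {"bio": "Backend developer"}}}
as_obj = SimpleNamespace(
    user=SimpleNamespace(profile=SimpleNamespace(bio="Frontend developer"))
)

print(resolve(as_dict, "user.profile.bio"))   # Backend developer
print(resolve(as_obj, "user.profile.bio"))    # Frontend developer
print(resolve(as_dict, "user.address.city"))  # None
```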

Max Results

Limit the number of search results:

from pypaginate import SearchSpec

spec = SearchSpec(
    query="python",
    fields=("title",),
    max_results=10,  # return at most 10 matches
)

Min Query Length

Skip search for very short queries:

from pypaginate import SearchSpec

spec = SearchSpec(
    query="a",
    fields=("name",),
    min_length=2,  # skip search if query < 2 chars
)
# Returns all items unfiltered (query too short)
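The short-query guard amounts to an early return before any matching happens. A minimal sketch, assuming the length is measured on the trimmed query and using a hypothetical `search_names` helper:

```python
def search_names(items: list, query: str, min_length: int = 0) -> list:
    """Filter by substring on "name", unless the query is too short."""
    query = query.strip().lower()
    if len(query) < min_length:
        return list(items)  # too short: return everything unfiltered
    return [item for item in items if query in item["name"].lower()]


people = [{"name": "Alice"}, {"name": "Bob"}]

print(len(search_names(people, "a", min_length=2)))   # 2 (search skipped)
print(len(search_names(people, "al", min_length=2)))  # 1 (only Alice)
```

This kind of guard is useful for search-as-you-type inputs, where a one-character query would otherwise match most of the dataset.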

SQLAlchemy Search

SQLAlchemySearchBackend generates ILIKE WHERE clauses:

from sqlalchemy import select
from pypaginate import SearchSpec, SearchFieldMode
from pypaginate.adapters.sqlalchemy import SQLAlchemySearchBackend

backend = SQLAlchemySearchBackend()

# User is assumed to be a mapped SQLAlchemy model defined elsewhere
stmt = select(User)
searched_stmt = backend.apply_search(stmt, SearchSpec(
    query="alice",
    fields=("name", "email"),
))
# SELECT * FROM user
# WHERE (name ILIKE '%alice%' OR email ILIKE '%alice%')

Mode affects the ILIKE pattern:

Mode        Pattern
CONTAINS    %token%
PREFIX      token%
EXACT       token (no wildcards)

Multi-word queries generate AND-combined conditions:

# query="alice smith", fields=("name", "email")
# WHERE (name ILIKE '%alice%' OR email ILIKE '%alice%')
#   AND (name ILIKE '%smith%' OR email ILIKE '%smith%')
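The shape of the generated condition (OR across fields, AND across tokens) can be reproduced with plain string building. The sketch below is not how SQLAlchemySearchBackend works internally; `like_pattern` and `build_where` are hypothetical helpers that return a parameterized SQL fragment plus its bound values.

```python
def like_pattern(token: str, mode: str = "contains") -> str:
    """Wrap a token in the wildcards that correspond to the search mode."""
    if mode == "prefix":
        return f"{token}%"
    if mode == "exact":
        return token
    return f"%{token}%"


def build_where(query: str, fields: tuple, mode: str = "contains"):
    """Return (sql, params): OR across fields, AND across tokens."""
    groups, params = [], []
    for token in query.split():
        ors = " OR ".join(f"{field} ILIKE ?" for field in fields)
        groups.append(f"({ors})")
        # One bound value per placeholder, in the same order.
        params.extend([like_pattern(token, mode)] * len(fields))
    return " AND ".join(groups), params


sql, params = build_where("alice smith", ("name", "email"))
print(sql)
# (name ILIKE ? OR email ILIKE ?) AND (name ILIKE ? OR email ILIKE ?)
print(params)
# ['%alice%', '%alice%', '%smith%', '%smith%']
```

The sketch keeps user input in bound parameters rather than in the SQL text. It does not escape literal % or _ characters inside tokens, and it assumes the field names come from trusted code, not from the request.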

Pipeline Integration

Pass a SearchSpec to SyncPipeline.execute() or AsyncPipeline.execute() via the search= parameter. See In-Memory Pagination for a full pipeline example combining filters, sorting, search, and pagination.

Text Normalization

Both field values and query tokens are normalized before comparison:

  • Unicode-normalized (NFC)
  • Lowercased
  • Trimmed of surrounding whitespace

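As a result, searches are case-insensitive, and composed and decomposed forms of the same accented character compare equal. NFC does not strip accents, so with only the steps listed above "cafe" and "café" remain distinct.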
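The three normalization steps map directly onto the standard library. The sketch below uses `unicodedata.normalize` to show their effect; the `normalize` helper is illustrative, and the order in which pypaginate applies the steps is an assumption.

```python
import unicodedata


def normalize(text: str) -> str:
    """NFC-normalize, lowercase, and trim surrounding whitespace."""
    return unicodedata.normalize("NFC", text).lower().strip()


composed = "Caf\u00e9"     # "é" as a single code point
decomposed = "Cafe\u0301"  # "e" followed by a combining acute accent

print(composed == decomposed)                        # False
print(normalize(composed) == normalize(decomposed))  # True
print(normalize("  PYTHON Book ") == "python book")  # True
print(normalize("cafe") == normalize("caf\u00e9"))   # False
```

The last line shows that NFC keeps accents: matching "cafe" against "café" would need an extra accent-folding step that is not part of the list above.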

Next Steps

  • Fuzzy Matching – Approximate matching for typo tolerance

  • Filtering – Combine with declarative filters


© Copyright 2024-2026 CybLow. Last updated on Jun 04, 2026.