Text Search
This guide covers exact text search with SearchSpec, including contains/prefix/exact modes, multi-field search, weighted fields, and pipeline integration.
Basic Usage
SearchEngine
SearchEngine searches in-memory sequences with relevance ranking:
from pypaginate import SearchSpec
from pypaginate.search.engine import SearchEngine
engine = SearchEngine()
products = [
{"title": "Python Book", "description": "Learn Python programming"},
{"title": "JavaScript Guide", "description": "Master JS development"},
{"title": "Go Handbook", "description": "Golang essentials"},
]
spec = SearchSpec(query="python", fields=("title", "description"))
results = engine.apply(products, spec)
# [Python Book] -- matches both title and description
MemorySearchBackend
MemorySearchBackend satisfies the SearchBackend protocol for pipeline use:
from pypaginate import SearchSpec
from pypaginate.adapters.memory import MemorySearchBackend
backend = MemorySearchBackend()
filtered = backend.apply_search(products, SearchSpec(
query="python",
fields=("title", "description"),
))
Search Modes
CONTAINS (Default)
Matches when the token appears anywhere in the field value:
from pypaginate import SearchSpec, SearchFieldMode
spec = SearchSpec(
query="python",
fields=("title",),
mode=SearchFieldMode.CONTAINS, # default
)
# "Python Book" matches (contains "python")
# "Learn Python Programming" matches
PREFIX
Matches when the field value starts with the token:
spec = SearchSpec(
query="py",
fields=("title",),
mode=SearchFieldMode.PREFIX,
)
# "Python Book" matches (starts with "py")
# "Learn Python" does NOT match
EXACT
Matches when the normalized field value equals the normalized token:
spec = SearchSpec(
query="python book",
fields=("title",),
mode=SearchFieldMode.EXACT,
)
# "Python Book" matches (normalizes to "python book")
# "Python Book 2nd Edition" does NOT match
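The three modes can be compared side by side in plain Python. This is a minimal sketch of the matching rules as described above, not the library's implementation: `Mode`, `normalize`, and `field_matches` are hypothetical names, and the normalization is simplified to trimming and lowercasing.

```python
from enum import Enum


class Mode(Enum):
    """Hypothetical stand-in for SearchFieldMode, for illustration only."""

    CONTAINS = "contains"
    PREFIX = "prefix"
    EXACT = "exact"


def normalize(text: str) -> str:
    # Simplified normalization (assumption): trim whitespace and lowercase.
    return text.strip().lower()


def field_matches(value: str, token: str, mode: Mode) -> bool:
    """Return True if `value` matches `token` under the given mode."""
    value, token = normalize(value), normalize(token)
    if mode is Mode.PREFIX:
        return value.startswith(token)
    if mode is Mode.EXACT:
        return value == token
    return token in value  # CONTAINS


titles = ["Python Book", "Learn Python", "Python Book 2nd Edition"]
contains_hits = [t for t in titles if field_matches(t, "python", Mode.CONTAINS)]
prefix_hits = [t for t in titles if field_matches(t, "py", Mode.PREFIX)]
exact_hits = [t for t in titles if field_matches(t, "python book", Mode.EXACT)]
```

Here CONTAINS keeps all three titles, PREFIX drops "Learn Python", and EXACT keeps only "Python Book".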
Multi-Field Search
Search across multiple fields simultaneously. A result matches if any field contains the token:
from pypaginate import SearchSpec
from pypaginate.search.engine import SearchEngine
engine = SearchEngine()
employees = [
{"name": "Alice Smith", "email": "alice@corp.com", "department": "Engineering"},
{"name": "Bob Johnson", "email": "bob@corp.com", "department": "Sales"},
]
spec = SearchSpec(query="alice", fields=("name", "email", "department"))
results = engine.apply(employees, spec)
# [Alice Smith] -- matches in both name and email
Weighted Fields
Assign different weights to fields to control relevance ranking. Higher weights make matches in that field rank higher:
from pypaginate import SearchSpec
from pypaginate.search.engine import SearchEngine
engine = SearchEngine()
products = [
{"title": "Python", "description": "A snake species"},
{"title": "Cobra", "description": "A python library for CLI"},
]
# Title matches are twice as important as description matches
spec = SearchSpec(
query="python",
fields=("title", "description"),
weights={"title": 2.0, "description": 1.0},
)
results = engine.apply(products, spec)
# [{"title": "Python", ...}, {"title": "Cobra", ...}]
# "Python" ranks higher (title match with 2x weight)
Default weight is 1.0 for fields not specified in the weights dict.
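To make the effect of weights concrete, here is a rough sketch of one possible scoring rule: sum the weight of every field that contains the token. The `score` function and the formula are assumptions for illustration; the library's actual ranking may be computed differently.

```python
def score(item: dict, token: str, fields: tuple, weights: dict) -> float:
    """Sum the weights of every field whose value contains the token."""
    token = token.lower()
    total = 0.0
    for field in fields:
        if token in str(item.get(field, "")).lower():
            # Fields missing from `weights` fall back to 1.0.
            total += weights.get(field, 1.0)
    return total


products = [
    {"title": "Cobra", "description": "A python library for CLI"},
    {"title": "Python", "description": "A snake species"},
]
fields = ("title", "description")
weights = {"title": 2.0}  # "description" is unspecified, so it counts as 1.0

# Highest score first; "Python" (title match, 2.0) outranks "Cobra" (1.0).
ranked = sorted(
    products,
    key=lambda p: score(p, "python", fields, weights),
    reverse=True,
)
```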
Multi-Word Queries
Queries with multiple words are tokenized. All tokens must match for an item to be included:
spec = SearchSpec(query="alice smith", fields=("name",))
# Tokenized to ["alice", "smith"]
# Both tokens must match somewhere in the searched fields
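The AND-across-tokens, OR-across-fields rule can be sketched in a few lines. This assumes whitespace tokenization and substring matching; `matches_all_tokens` is a hypothetical helper, not part of the library.

```python
def matches_all_tokens(item: dict, query: str, fields: tuple) -> bool:
    """True if every token appears in at least one of the searched fields."""
    tokens = query.lower().split()  # assumption: split on whitespace
    values = [str(item.get(field, "")).lower() for field in fields]
    return all(any(token in value for value in values) for token in tokens)


employees = [
    {"name": "Alice Smith", "email": "alice@corp.com", "department": "Engineering"},
    {"name": "Bob Johnson", "email": "bob@corp.com", "department": "Sales"},
]

# Both tokens found in the same field.
by_name = [e for e in employees if matches_all_tokens(e, "alice smith", ("name",))]

# Tokens found in different fields still count as a match.
cross_field = [
    e for e in employees if matches_all_tokens(e, "bob sales", ("name", "department"))
]

# One token matches nowhere, so the item is excluded.
no_hit = [
    e for e in employees if matches_all_tokens(e, "alice sales", ("name", "department"))
]
```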
Nested Field Access
Search in nested attributes or dictionary keys with dot notation:
spec = SearchSpec(
query="developer",
fields=("user.profile.bio",),
)
# Accesses item["user"]["profile"]["bio"] or item.user.profile.bio
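A dot-notation lookup of this kind can be sketched as a small resolver that tries dictionary keys first and attributes otherwise. `resolve` is a hypothetical helper, and returning a default for a missing path is an assumption; the library may handle missing segments differently.

```python
from types import SimpleNamespace
from typing import Any


def resolve(item: Any, path: str, default: Any = None) -> Any:
    """Walk a dotted path through dicts and objects, one segment at a time."""
    current = item
    for part in path.split("."):
        if isinstance(current, dict):
            if part not in current:
                return default
            current = current[part]
        elif hasattr(current, part):
            current = getattr(current, part)
        else:
            return default
    return current


as_dict = {"user": {"profile": {"bio": "Backend developer"}}}
as_obj = SimpleNamespace(
    user=SimpleNamespace(profile=SimpleNamespace(bio="Backend developer"))
)

bio_from_dict = resolve(as_dict, "user.profile.bio")
bio_from_obj = resolve(as_obj, "user.profile.bio")
missing = resolve(as_dict, "user.settings.theme")
```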
Max Results
Limit the number of search results:
spec = SearchSpec(
query="python",
fields=("title",),
max_results=10, # return at most 10 matches
)
Min Query Length
Skip search for very short queries:
spec = SearchSpec(
query="a",
fields=("name",),
min_length=2, # skip search if query < 2 chars
)
# Returns all items unfiltered (query too short)
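The short-query guard amounts to an early return before any matching happens. The sketch below assumes the length is measured on the trimmed query; `search_with_min_length` is a hypothetical function, not the library's API.

```python
def search_with_min_length(
    items: list, query: str, field: str, min_length: int = 2
) -> list:
    """Filter items by substring match, skipping the search for short queries."""
    token = query.strip().lower()
    if len(token) < min_length:
        # Query too short: return everything unfiltered.
        return list(items)
    return [item for item in items if token in str(item.get(field, "")).lower()]


people = [{"name": "Alice"}, {"name": "Bob"}]

too_short = search_with_min_length(people, "a", "name")    # search skipped
long_enough = search_with_min_length(people, "al", "name")  # search applied
```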
SQLAlchemy Search
SQLAlchemySearchBackend generates ILIKE WHERE clauses:
from sqlalchemy import select
from pypaginate import SearchSpec, SearchFieldMode
from pypaginate.adapters.sqlalchemy import SQLAlchemySearchBackend
backend = SQLAlchemySearchBackend()
stmt = select(User)  # User is an ORM-mapped model defined elsewhere
searched_stmt = backend.apply_search(stmt, SearchSpec(
query="alice",
fields=("name", "email"),
))
# SELECT * FROM user
# WHERE (name ILIKE '%alice%' OR email ILIKE '%alice%')
Mode affects the ILIKE pattern:
| Mode | Pattern |
|---|---|
| CONTAINS | %token% |
| PREFIX | token% |
| EXACT | token |
Multi-word queries generate AND-combined conditions:
# query="alice smith", fields=("name", "email")
# WHERE (name ILIKE '%alice%' OR email ILIKE '%alice%')
# AND (name ILIKE '%smith%' OR email ILIKE '%smith%')
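The shape of the generated clause can be illustrated without a database. The sketch below builds a parameterized WHERE fragment in plain Python; it is not how SQLAlchemySearchBackend works internally. `build_where` is a hypothetical helper, and the prefix and exact patterns are assumed from the mode descriptions earlier in this guide.

```python
# Assumed pattern per mode; only the contains form appears in the SQL above.
PATTERNS = {"contains": "%{}%", "prefix": "{}%", "exact": "{}"}


def build_where(query: str, fields: tuple, mode: str = "contains") -> tuple:
    """Return (clause, params): OR across fields, AND across tokens."""
    template = PATTERNS[mode]
    groups = []
    params = []
    for token in query.split():
        # One OR group per token, covering every searched field.
        ors = " OR ".join(f"{field} ILIKE ?" for field in fields)
        groups.append(f"({ors})")
        params.extend(template.format(token) for _ in fields)
    return " AND ".join(groups), params


clause, params = build_where("alice smith", ("name", "email"))
prefix_clause, prefix_params = build_where("al", ("name",), mode="prefix")
```

Field names are interpolated into the clause here, so in real code they should come from a fixed allowlist rather than user input. The sketch also does not escape `%` or `_` inside tokens, which LIKE treats as wildcards.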
Pipeline Integration
Pass a SearchSpec to SyncPipeline.execute() or AsyncPipeline.execute() via the search= parameter. See In-Memory Pagination for a full pipeline example combining filters, sorting, search, and pagination.
Text Normalization
Both field values and query tokens are normalized before comparison:
Unicode normalization (NFC)
Lowercased
Whitespace trimmed
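This means searches are case-insensitive by default, and precomposed and decomposed forms of the same accented character compare equal. NFC alone does not strip accents, so with only these steps "cafe" would not match "café".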
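The normalization steps listed above can be reproduced with the standard library. This is a sketch under the assumption that the steps run in the order NFC, trim, lowercase; `normalize` is a hypothetical name.

```python
import unicodedata


def normalize(text: str) -> str:
    """Apply NFC normalization, trim whitespace, then lowercase."""
    return unicodedata.normalize("NFC", text).strip().lower()


composed = "Caf\u00e9"        # "é" as a single code point
decomposed = " Cafe\u0301 "   # "e" followed by a combining acute accent

same_after_normalizing = normalize(composed) == normalize(decomposed)
accent_still_matters = normalize("Cafe") != normalize(composed)
```

Both comparisons evaluate to True: the two spellings of "Café" normalize to the same string, while unaccented "Cafe" stays distinct.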
Next Steps
Fuzzy Matching – Approximate matching for typo tolerance
Filtering – Combine with declarative filters