Index Dimensions
Index dimensions are the building blocks of multi-model indexing in LumiX. Each dimension represents one axis of a multi-dimensional index space.
Overview
An LXIndexDimension encapsulates:
The data model type (e.g., Driver, Product, Date)
The indexing function (how to extract keys)
Optional filters (which instances to include)
Data source (direct data or ORM query)
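To make the four parts above concrete, here is a minimal conceptual sketch in plain Python. `DimensionSketch` and the `Driver` dataclass are hypothetical stand-ins invented for illustration; they are not the LumiX classes and say nothing about how `LXIndexDimension` is implemented internally.

```python
from dataclasses import dataclass
from typing import Callable, Generic, Hashable, Iterable, List, Optional, Type, TypeVar

T = TypeVar("T")


@dataclass
class DimensionSketch(Generic[T]):
    """Hypothetical stand-in for illustration; NOT the LumiX class."""

    model_type: Type[T]                              # the data model type
    key_func: Callable[[T], Hashable]                # the indexing function
    data: Iterable[T] = ()                           # the data source (direct data here)
    predicate: Optional[Callable[[T], bool]] = None  # the optional filter

    def instances(self) -> List[T]:
        """Return the instances that pass the filter, in source order."""
        return [x for x in self.data if self.predicate is None or self.predicate(x)]

    def keys(self) -> List[Hashable]:
        """Return one key per surviving instance."""
        return [self.key_func(x) for x in self.instances()]


@dataclass
class Driver:  # hypothetical model
    id: str
    is_active: bool


drivers = [Driver("d1", True), Driver("d2", False), Driver("d3", True)]
dim = DimensionSketch(Driver, lambda d: d.id, drivers, lambda d: d.is_active)
print(dim.keys())  # ['d1', 'd3']
```

The sketch only shows how the model type, key function, filter, and data fit together to produce one axis of keys.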
Creating Dimensions
Basic Dimension
from lumix import LXIndexDimension
driver_dim = LXIndexDimension(Driver, lambda d: d.id).from_data(drivers)
With Filtering
active_driver_dim = (
    LXIndexDimension(Driver, lambda d: d.id)
    .from_data(drivers)
    .where(lambda d: d.is_active and d.years_experience >= 2)
)
With ORM
product_dim = (
    LXIndexDimension(Product, lambda p: p.sku)
    .from_model(db_session)
    .where(lambda p: p.in_stock)
)
Key Functions
The key function extracts a unique identifier from each model instance:
Simple Keys
# String ID
LXIndexDimension(Driver, lambda d: d.id)
# Integer ID
LXIndexDimension(Product, lambda p: p.product_num)
# Date
LXIndexDimension(Date, lambda dt: dt.date)
Compound Keys
# Tuple of attributes
LXIndexDimension(Route, lambda r: (r.origin, r.destination))
# String concatenation
LXIndexDimension(Location, lambda loc: f"{loc.city}_{loc.warehouse_id}")
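Because the key is meant to be unique per instance, it can be worth checking a candidate key function against real data before using it. The sketch below uses only the standard library; `Location` and `duplicate_keys` are hypothetical names made up for this example, and whether LumiX itself detects duplicate keys is not stated here. It also shows one way a concatenated string key can collide where a tuple key does not.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Location:  # hypothetical model for illustration
    city: str
    warehouse_id: str


locations = [
    Location("new_york", "1"),
    Location("new", "york_1"),
]


def duplicate_keys(items, key_func):
    """Return the keys that more than one item maps to."""
    counts = Counter(key_func(item) for item in items)
    return [key for key, n in counts.items() if n > 1]


# String concatenation: both locations collapse to "new_york_1".
string_dupes = duplicate_keys(locations, lambda loc: f"{loc.city}_{loc.warehouse_id}")

# Tuple key: the field boundary is preserved, so the keys stay distinct.
tuple_dupes = duplicate_keys(locations, lambda loc: (loc.city, loc.warehouse_id))

print(string_dupes)  # ['new_york_1']
print(tuple_dupes)   # []
```

If string keys are preferred, choosing a separator that cannot appear in any field avoids this kind of collision.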
Filtering
Per-Dimension Filters
Per-dimension filters are applied to each dimension before the Cartesian product is built:
# Single condition
dim = (
    LXIndexDimension(Driver, lambda d: d.id)
    .where(lambda d: d.is_active)
    .from_data(drivers)
)
# Multiple conditions
dim = (
    LXIndexDimension(Product, lambda p: p.sku)
    .where(lambda p: p.in_stock and p.price > 0 and not p.discontinued)
    .from_data(products)
)
Filter Behavior
# Multiple where() calls do not combine: the last call replaces earlier ones
dim = (
    LXIndexDimension(Driver, lambda d: d.id)
    .where(lambda d: d.is_active)  # This filter is lost
    .where(lambda d: d.years_experience >= 5)  # Only this applies
)
# Combine conditions in one where()
dim = (
    LXIndexDimension(Driver, lambda d: d.id)
    .where(lambda d: d.is_active and d.years_experience >= 5)
)
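Since a later where() replaces an earlier one, conditions kept as separate named functions have to be merged into a single callable first. The sketch below is one possible way to do that in plain Python; `all_of`, `is_active`, `is_senior`, and the `Driver` dataclass are hypothetical helpers for illustration and are not part of LumiX.

```python
from dataclasses import dataclass
from typing import Callable, TypeVar

T = TypeVar("T")


def all_of(*predicates: Callable[[T], bool]) -> Callable[[T], bool]:
    """Combine predicates into one that passes only if every predicate passes."""
    def combined(item: T) -> bool:
        return all(p(item) for p in predicates)
    return combined


@dataclass
class Driver:  # hypothetical model
    id: str
    is_active: bool
    years_experience: int


def is_active(d: Driver) -> bool:
    return d.is_active


def is_senior(d: Driver) -> bool:
    return d.years_experience >= 5


drivers = [
    Driver("d1", True, 7),
    Driver("d2", True, 1),
    Driver("d3", False, 9),
]

active_and_senior = all_of(is_active, is_senior)
selected = [d.id for d in drivers if active_and_senior(d)]
print(selected)  # ['d1']
```

Assuming where() accepts any callable that takes a model instance, as the lambdas above suggest, the combined predicate could be passed in one call: .where(all_of(is_active, is_senior)).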
Data Sources
Direct Data
Use when data is already in memory:
products = load_products() # List[Product]
dim = LXIndexDimension(Product, lambda p: p.id).from_data(products)
ORM Queries
Use for database-backed models:
# SQLAlchemy
dim = (
    LXIndexDimension(Product, lambda p: p.id)
    .from_model(db_session)
    .where(lambda p: p.category == "electronics")
)
# Django ORM
dim = (
    LXIndexDimension(Customer, lambda c: c.id)
    .from_model(Customer.objects)
)
Combining Dimensions
Cartesian Products
from lumix import LXCartesianProduct
# Two dimensions
product = LXCartesianProduct(
    LXIndexDimension(Driver, lambda d: d.id).from_data(drivers),
    LXIndexDimension(Date, lambda dt: dt.date).from_data(dates)
)
# Three dimensions
product = (
    LXCartesianProduct(
        LXIndexDimension(Warehouse, lambda w: w.id).from_data(warehouses),
        LXIndexDimension(Product, lambda p: p.sku).from_data(products)
    )
    .add_dimension(LXIndexDimension(Month, lambda m: m.id).from_data(months))
)
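As a rough mental model of the index space a Cartesian product of dimensions describes, the standard-library sketch below enumerates every combination of three small key lists. The key values are made up, and this uses `itertools.product` rather than LumiX; it is not a description of how `LXCartesianProduct` works internally.

```python
import itertools
import math

# Hypothetical keys, one list per dimension.
warehouse_keys = ["w1", "w2"]
product_keys = ["sku-a", "sku-b", "sku-c"]
month_keys = [1, 2]

# Every (warehouse, product, month) combination is one index tuple.
index_space = list(itertools.product(warehouse_keys, product_keys, month_keys))

# The number of index tuples is the product of the dimension sizes.
expected_size = math.prod(len(keys) for keys in (warehouse_keys, product_keys, month_keys))

print(len(index_space))  # 12
print(index_space[0])    # ('w1', 'sku-a', 1)
```

The size grows multiplicatively with each added dimension, which is why trimming a dimension before combining it matters.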
In Variables
from typing import Tuple
from lumix import LXIndexDimension, LXVariable
assignment = (
    LXVariable[Tuple[Driver, Date], int]("assignment")
    .binary()
    .indexed_by_product(
        LXIndexDimension(Driver, lambda d: d.id).from_data(drivers),
        LXIndexDimension(Date, lambda dt: dt.date).from_data(dates)
    )
)
Best Practices
Filter Early
Apply filters at the dimension level to reduce cartesian product size:
# Good: Filter before product
dim = (
    LXIndexDimension(Driver, lambda d: d.id)
    .where(lambda d: d.is_active)
    .from_data(drivers)
)
# Less efficient: Filter after product
product.where(lambda d, dt: d.is_active and ...)
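The effect of filtering early can be illustrated with plain Python. The sketch below counts how many pairs are built when the filter runs before versus after the product; the `Driver` dataclass, the 100 drivers, and the 30 date keys are made-up illustration data, and the counts say nothing about LumiX's actual internals.

```python
from dataclasses import dataclass
import itertools


@dataclass(frozen=True)
class Driver:  # hypothetical model
    id: str
    is_active: bool


# 100 drivers, of which every fourth one is active (25 in total).
drivers = [Driver(f"d{i}", is_active=(i % 4 == 0)) for i in range(100)]
dates = list(range(30))  # stand-in for 30 date keys

# Filter late: build every (driver, date) pair, then discard inactive drivers.
late_pairs_built = 0
late = []
for d, dt in itertools.product(drivers, dates):
    late_pairs_built += 1
    if d.is_active:
        late.append((d.id, dt))

# Filter early: shrink the driver axis first, then build the pairs.
active = [d for d in drivers if d.is_active]
early = [(d.id, dt) for d, dt in itertools.product(active, dates)]
early_pairs_built = len(active) * len(dates)

print(late_pairs_built)   # 3000
print(early_pairs_built)  # 750
print(late == early)      # True
```

Both approaches yield the same index tuples here; the early filter simply builds a quarter as many pairs to get there.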
Use Meaningful Keys
# Good: Business identifier
LXIndexDimension(Product, lambda p: p.sku)
# Avoid: Auto-increment if unstable
LXIndexDimension(Product, lambda p: p.auto_id)
Consistent Key Types
Ensure key types are hashable and consistent:
# Good: Consistent types
LXIndexDimension(Date, lambda dt: dt.date)  # Returns date object
# Problematic: Inconsistent types
LXIndexDimension(Date, lambda dt: str(dt.date) if dt.special else dt.date)
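Why mixed key types cause trouble can be seen with an ordinary dictionary. This is a standard-library sketch with made-up values, not LumiX code: a `date` and its string form are different keys, so lookups miss silently, and mixed keys cannot even be sorted.

```python
from datetime import date

# Keys of one type: lookups by date behave as expected.
consistent = {date(2024, 1, 1): "a", date(2024, 1, 2): "b"}

# One key was stored as a string, as a conditional str() in the key function would do.
mixed = {str(date(2024, 1, 1)): "a", date(2024, 1, 2): "b"}

lookup = date(2024, 1, 1)
found_in_consistent = lookup in consistent
found_in_mixed = lookup in mixed

# Ordering mixed keys fails because str and date are not comparable.
try:
    sorted(mixed)
    sortable = True
except TypeError:
    sortable = False

print(found_in_consistent)  # True
print(found_in_mixed)       # False
print(sortable)             # False
```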
Advanced Patterns
Conditional Dimensions
# Different filters based on scenario
if scenario == "peak_season":
    driver_dim = (
        LXIndexDimension(Driver, lambda d: d.id)
        .where(lambda d: d.is_active and d.can_work_overtime)
    )
else:
    driver_dim = (
        LXIndexDimension(Driver, lambda d: d.id)
        .where(lambda d: d.is_active)
    )
Dynamic Data Sources
# Choose data source at runtime
if use_database:
    dim = LXIndexDimension(Product, lambda p: p.id).from_model(session)
else:
    dim = LXIndexDimension(Product, lambda p: p.id).from_data(cached_products)
Dimension Reuse
# Define once, use in multiple variables
driver_dim = (
    LXIndexDimension(Driver, lambda d: d.id)
    .where(lambda d: d.is_active)
    .from_data(drivers)
)
date_dim = LXIndexDimension(Date, lambda dt: dt.date).from_data(dates)
# Use in multiple variables
assignment = (
    LXVariable[Tuple[Driver, Date], int]("assignment")
    .indexed_by_product(driver_dim, date_dim)
)
backup = (
    LXVariable[Tuple[Driver, Date], int]("backup")
    .indexed_by_product(driver_dim, date_dim)
)
Next Steps
Filtering Strategies - Advanced filtering strategies
Multi-Model Indexing - Using dimensions in multi-model indexing
lumix.indexing.dimensions - API reference