Index Dimensions

Index dimensions are the building blocks of multi-model indexing in LumiX. Each dimension represents one axis of a multi-dimensional index space.

Overview

An LXIndexDimension encapsulates:

  • The data model type (e.g., Driver, Product, Date)

  • The indexing function (how to extract keys)

  • Optional filters (which instances to include)

  • Data source (direct data or ORM query)
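As a rough mental model, the four pieces above can be pictured as fields on a small container. The sketch below is not the LumiX implementation: `DimensionSketch`, its field names, and the `keys()` helper are hypothetical names, used only to show how a model type, key function, filter, and data source could fit together.

```python
from dataclasses import dataclass
from typing import Callable, Generic, Hashable, Iterable, List, Optional, Type, TypeVar

T = TypeVar("T")


@dataclass
class DimensionSketch(Generic[T]):
    """Hypothetical stand-in for illustration; not the LumiX class."""

    model_type: Type[T]                              # the data model type
    key_func: Callable[[T], Hashable]                # how to extract keys
    predicate: Optional[Callable[[T], bool]] = None  # optional filter
    data: Optional[Iterable[T]] = None               # direct data source

    def instances(self) -> List[T]:
        # Materialize the data, then keep only instances that pass the filter.
        items = list(self.data or [])
        if self.predicate is not None:
            items = [item for item in items if self.predicate(item)]
        return items

    def keys(self) -> List[Hashable]:
        # Apply the key function to every instance that survived filtering.
        return [self.key_func(item) for item in self.instances()]


@dataclass
class Driver:
    id: str
    is_active: bool


drivers = [Driver("d1", True), Driver("d2", False), Driver("d3", True)]
dim = DimensionSketch(Driver, lambda d: d.id, lambda d: d.is_active, drivers)
sketch_keys = dim.keys()
```

Here `sketch_keys` holds the keys of the active drivers only, which is the axis such a dimension would contribute to an index space.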

Creating Dimensions

Basic Dimension

from lumix import LXIndexDimension

driver_dim = LXIndexDimension(Driver, lambda d: d.id).from_data(drivers)

With Filtering

active_driver_dim = (
    LXIndexDimension(Driver, lambda d: d.id)
    .from_data(drivers)
    .where(lambda d: d.is_active and d.years_experience >= 2)
)

With ORM

product_dim = (
    LXIndexDimension(Product, lambda p: p.sku)
    .from_model(db_session)
    .where(lambda p: p.in_stock)
)

Key Functions

The key function extracts a unique identifier from each model instance:

Simple Keys

# String ID
LXIndexDimension(Driver, lambda d: d.id)

# Integer ID
LXIndexDimension(Product, lambda p: p.product_num)

# Date
LXIndexDimension(Date, lambda dt: dt.date)
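Because keys are meant to be unique per instance, it can help to check a candidate key function against the data before building a dimension. The helper below is a plain-Python sketch; `duplicate_keys` and the `Product` fields are hypothetical and not part of LumiX.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Product:
    sku: str
    category: str


def duplicate_keys(items, key_func):
    """Return the sorted keys that more than one item maps to."""
    counts = Counter(key_func(item) for item in items)
    return sorted(key for key, count in counts.items() if count > 1)


products = [
    Product("A-1", "tools"),
    Product("A-2", "tools"),
    Product("B-1", "toys"),
]

# sku is unique per product, so it is a reasonable key.
dups_by_sku = duplicate_keys(products, lambda p: p.sku)

# category is shared between products, so it is not a usable key.
dups_by_category = duplicate_keys(products, lambda p: p.category)
```

An empty result suggests the key function is safe for this data set; a non-empty result lists the keys that would collide.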

Compound Keys

# Tuple of attributes
LXIndexDimension(Route, lambda r: (r.origin, r.destination))

# String concatenation
LXIndexDimension(Location, lambda loc: f"{loc.city}_{loc.warehouse_id}")
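One point worth weighing when choosing between the two compound-key styles: a concatenated string can collide if the separator also appears inside an attribute, while a tuple keeps the parts distinct. The sketch below uses a hypothetical `Location` dataclass with made-up values to show the difference.

```python
from dataclasses import dataclass


@dataclass
class Location:
    city: str
    warehouse_id: str


def string_key(loc):
    # Concatenated key, same shape as the f-string example above.
    return f"{loc.city}_{loc.warehouse_id}"


def tuple_key(loc):
    # Tuple key: each attribute stays a separate element.
    return (loc.city, loc.warehouse_id)


# Two different locations whose attributes contain the separator "_".
a = Location(city="san_jose", warehouse_id="1")
b = Location(city="san", warehouse_id="jose_1")

strings_collide = string_key(a) == string_key(b)  # both are "san_jose_1"
tuples_collide = tuple_key(a) == tuple_key(b)     # the tuples differ
```

If attribute values can contain the separator, the tuple form is the safer default; the string form is fine when the separator is guaranteed not to occur in the data.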

Filtering

Per-Dimension Filters

Per-dimension filters are applied to each dimension before any cartesian product is built:

# Single condition
dim = (
    LXIndexDimension(Driver, lambda d: d.id)
    .where(lambda d: d.is_active)
    .from_data(drivers)
)

# Multiple conditions
dim = (
    LXIndexDimension(Product, lambda p: p.sku)
    .where(lambda p: p.in_stock and p.price > 0 and not p.discontinued)
    .from_data(products)
)
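To see why filtering a dimension first matters, the following plain-Python sketch counts index tuples with and without a per-dimension filter. It uses `itertools.product` as a stand-in for the cartesian product; the data is invented and LumiX itself is not involved.

```python
from itertools import product

# (driver_id, is_active) pairs and a short list of dates; values are made up.
drivers = [("d1", True), ("d2", False), ("d3", True), ("d4", False)]
dates = ["2024-01-01", "2024-01-02", "2024-01-03"]

# No filter: every driver is paired with every date.
all_ids = [driver_id for driver_id, _ in drivers]
unfiltered = list(product(all_ids, dates))

# Filter the driver axis first, then build the product.
active_ids = [driver_id for driver_id, is_active in drivers if is_active]
filtered = list(product(active_ids, dates))

print(len(unfiltered), len(filtered))
```

Halving one axis halves the whole index space, and the saving multiplies with every additional dimension.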

Filter Behavior

# Multiple where() calls do not combine: the last call replaces earlier ones
dim = (
    LXIndexDimension(Driver, lambda d: d.id)
    .where(lambda d: d.is_active)  # This filter is lost
    .where(lambda d: d.years_experience >= 5)  # Only this applies
)

# Combine conditions in one where()
dim = (
    LXIndexDimension(Driver, lambda d: d.id)
    .where(lambda d: d.is_active and d.years_experience >= 5)
)
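When conditions are defined in separate places, one way to still end up with a single predicate is to compose them before calling where(). The `all_of` helper below is a hypothetical convenience written in plain Python, not a LumiX function.

```python
from dataclasses import dataclass


@dataclass
class Driver:
    id: str
    is_active: bool
    years_experience: int


def all_of(*predicates):
    """Build one predicate that is true only if every given predicate is true."""
    def combined(item):
        return all(predicate(item) for predicate in predicates)
    return combined


is_active = lambda d: d.is_active
is_senior = lambda d: d.years_experience >= 5

active_and_senior = all_of(is_active, is_senior)

drivers = [
    Driver("d1", True, 7),
    Driver("d2", True, 2),
    Driver("d3", False, 10),
]
selected_ids = [d.id for d in drivers if active_and_senior(d)]
```

The composed `active_and_senior` callable could then be passed to a single where() call, so no condition is lost to the override behavior described above.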

Data Sources

Direct Data

Use when data is already in memory:

products = load_products()  # List[Product]

dim = LXIndexDimension(Product, lambda p: p.id).from_data(products)

ORM Queries

Use for database-backed models:

# SQLAlchemy
dim = (
    LXIndexDimension(Product, lambda p: p.id)
    .from_model(db_session)
    .where(lambda p: p.category == "electronics")
)

# Django ORM
dim = (
    LXIndexDimension(Customer, lambda c: c.id)
    .from_model(Customer.objects)
)

Combining Dimensions

Cartesian Products

from lumix import LXCartesianProduct

# Two dimensions
product = LXCartesianProduct(
    LXIndexDimension(Driver, lambda d: d.id).from_data(drivers),
    LXIndexDimension(Date, lambda dt: dt.date).from_data(dates)
)

# Three dimensions
product = (
    LXCartesianProduct(
        LXIndexDimension(Warehouse, lambda w: w.id).from_data(warehouses),
        LXIndexDimension(Product, lambda p: p.sku).from_data(products)
    )
    .add_dimension(LXIndexDimension(Month, lambda m: m.id).from_data(months))
)
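The size of a product grows multiplicatively with each dimension. As a rough illustration, assuming the index space behaves like an ordinary cartesian product of the per-dimension keys, `itertools.product` shows the shape of the resulting key tuples. The key lists are invented for this sketch.

```python
from itertools import product
from math import prod

warehouse_ids = ["w1", "w2"]
skus = ["A-1", "A-2", "B-1"]
month_ids = [1, 2, 3, 4]

# One tuple per combination of (warehouse, product, month) keys.
index_keys = list(product(warehouse_ids, skus, month_ids))

# The number of tuples equals the product of the axis lengths.
expected_size = prod(len(keys) for keys in (warehouse_ids, skus, month_ids))
```

Here 2 warehouses, 3 products, and 4 months give 24 index tuples, which is worth estimating up front before adding another dimension.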

In Variables

from typing import Tuple

from lumix import LXIndexDimension, LXVariable

assignment = (
    LXVariable[Tuple[Driver, Date], int]("assignment")
    .binary()
    .indexed_by_product(
        LXIndexDimension(Driver, lambda d: d.id).from_data(drivers),
        LXIndexDimension(Date, lambda dt: dt.date).from_data(dates)
    )
)

Best Practices

  1. Filter Early

    Apply filters at the dimension level to reduce cartesian product size:

    # Good: Filter before product
    dim = (
        LXIndexDimension(Driver, lambda d: d.id)
        .where(lambda d: d.is_active)
        .from_data(drivers)
    )
    
    # Less efficient: Filter after product
    product.where(lambda d, dt: d.is_active and ...)
    
  2. Use Meaningful Keys

    # Good: Business identifier
    LXIndexDimension(Product, lambda p: p.sku)
    
    # Avoid: auto-increment IDs that may change between data loads
    LXIndexDimension(Product, lambda p: p.auto_id)
    
  3. Consistent Key Types

    Ensure key types are hashable and consistent:

    # Good: Consistent types
    LXIndexDimension(Date, lambda dt: dt.date)  # Returns date object
    
    # Problematic: Inconsistent types
    LXIndexDimension(Date, lambda dt: str(dt.date) if dt.special else dt.date)
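
The problem with mixed key types in practice 3 can be reproduced with ordinary dictionaries, independent of LumiX. In this sketch the dictionaries stand in for any key-based lookup; the dates and values are made up.

```python
from datetime import date

# All keys are date objects.
consistent = {date(2024, 1, 1): "shift A", date(2024, 1, 2): "shift B"}

# One key was converted to a string, as in the "problematic" lambda above.
mixed = {str(date(2024, 1, 1)): "shift A", date(2024, 1, 2): "shift B"}

lookup = date(2024, 1, 1)
found_consistent = lookup in consistent  # the date key matches
found_mixed = lookup in mixed            # "2024-01-01" is not equal to the date

# Mixed key types also cannot be ordered against each other.
try:
    sorted(mixed)
    sortable = True
except TypeError:
    sortable = False
```

A lookup by date silently misses the string key, and sorting the mixed keys raises a `TypeError`, so both failure modes tend to surface far from the key function that caused them.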
    

Advanced Patterns

Conditional Dimensions

# Different filters based on scenario
if scenario == "peak_season":
    driver_dim = (
        LXIndexDimension(Driver, lambda d: d.id)
        .where(lambda d: d.is_active and d.can_work_overtime)
    )
else:
    driver_dim = (
        LXIndexDimension(Driver, lambda d: d.id)
        .where(lambda d: d.is_active)
    )

Dynamic Data Sources

# Choose data source at runtime
if use_database:
    dim = LXIndexDimension(Product, lambda p: p.id).from_model(session)
else:
    dim = LXIndexDimension(Product, lambda p: p.id).from_data(cached_products)

Dimension Reuse

# Define once, use in multiple variables
driver_dim = (
    LXIndexDimension(Driver, lambda d: d.id)
    .where(lambda d: d.is_active)
    .from_data(drivers)
)

date_dim = LXIndexDimension(Date, lambda dt: dt.date).from_data(dates)

# Both variables share the same (driver, date) index space
assignment = (
    LXVariable[Tuple[Driver, Date], int]("assignment")
    .indexed_by_product(driver_dim, date_dim)
)

backup = (
    LXVariable[Tuple[Driver, Date], int]("backup")
    .indexed_by_product(driver_dim, date_dim)
)

Next Steps