Indexing Guide

Overview

The indexing module lets you create variables and constraints indexed by one or more data models, so that your optimization model mirrors the structure of your data.

Key Concept: Instead of manually creating variables in loops with numeric indices, LumiX automatically expands variable and constraint families based on your data models with full type safety.

# Traditional approach - error-prone and verbose
x = {}
for i in range(len(products)):
    x[i] = model.addVar(name=f"x_{i}")

# LumiX approach - data-driven and type-safe
production = (
    LXVariable[Product, float]("production")
    .continuous()
    .indexed_by(lambda p: p.id)
    .from_data(products)
)
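To make "expansion" concrete, the following plain-Python sketch shows the general idea of building one entry per data item, keyed by an index function. It is not LumiX's implementation: `expand_family` is a hypothetical helper, and the `Product` fields are chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Product:
    id: str
    profit: float


def expand_family(name, items, key):
    """Return {key(item): variable_name}, one entry per data item.

    Hypothetical helper: it only illustrates keyed expansion and
    rejects duplicate keys, since two items cannot share one index.
    """
    family = {}
    for item in items:
        k = key(item)
        if k in family:
            raise ValueError(f"duplicate index key: {k!r}")
        family[k] = f"{name}[{k}]"
    return family


products = [Product("A", 30.0), Product("B", 40.0)]
family = expand_family("production", products, key=lambda p: p.id)
print(family)
```

The keys come from the data (`p.id`) rather than from list positions, so reordering or extending `products` does not change which entry belongs to which product.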

Core Concepts

Single-Model Indexing

Variables and constraints indexed by a single data model.

Use cases:

  • Production planning (indexed by Product)

  • Facility location (indexed by Facility)

  • Resource allocation (indexed by Resource)

Single-Model Indexing - Learn about single-model indexing

Multi-Model Indexing

Variables indexed by tuples of multiple data models, built from the Cartesian product of their dimensions.

Use cases:

  • Driver scheduling (indexed by Driver × Date)

  • Transportation planning (indexed by Origin × Destination)

  • Shift assignment (indexed by Worker × Date × Shift)

Multi-Model Indexing - Learn about multi-model indexing
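As a rough illustration of what a Cartesian-product index looks like, the sketch below builds the full set of (driver, date) keys with the standard library. The driver ids and dates are made-up sample values, and this is plain Python rather than LumiX code.

```python
from itertools import product

# Sample keys for two dimensions (illustrative values only)
driver_ids = ["d1", "d2"]
dates = ["2024-01-01", "2024-01-02", "2024-01-03"]

# Every (driver, date) combination becomes one index tuple
index = list(product(driver_ids, dates))

print(len(index))   # number of combinations = 2 * 3
print(index[0])     # first tuple pairs the first driver with the first date
```

The number of index tuples is the product of the dimension sizes, which is why filtering matters once a third or fourth dimension is added.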

Index Dimensions

Building blocks for multi-dimensional indexing with filtering and data sources.

Key features:

  • Per-dimension filtering

  • Multiple data sources (direct data, ORM queries)

  • Type-safe key extraction

  • Lazy evaluation

Index Dimensions - Learn about index dimensions

Filtering Strategies

Control which instances and combinations are included in your model.

Filtering types:

  • Per-dimension filters (reduce data before expansion)

  • Cross-dimension filters (filter combinations)

  • Performance optimization techniques

Filtering Strategies - Learn about filtering strategies
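The difference between the two filter types can be sketched in plain Python. This is only an illustration of the ordering (filter a dimension first, then filter combinations), assuming hypothetical driver records with `active` and `days_off` fields; it does not show how LumiX applies filters internally.

```python
from itertools import product

# Hypothetical driver records; weekdays use 0 = Monday ... 6 = Sunday
drivers = [
    {"id": "d1", "active": True, "days_off": {5, 6}},
    {"id": "d2", "active": False, "days_off": set()},
    {"id": "d3", "active": True, "days_off": {0}},
]
weekdays = list(range(7))

# Per-dimension filter: drop inactive drivers before expansion
active_drivers = [d for d in drivers if d["active"]]

# Expansion: Cartesian product of the already-reduced dimensions
candidates = list(product(active_drivers, weekdays))

# Cross-dimension filter: keep only combinations valid for both sides
pairs = [(d["id"], wd) for d, wd in candidates if wd not in d["days_off"]]

print(len(drivers) * len(weekdays))  # size without any filtering
print(len(candidates))               # size after the per-dimension filter
print(len(pairs))                    # size after the cross-dimension filter
```

Filtering the driver dimension first means the cross-dimension predicate runs on 14 combinations instead of 21, and only 11 of them survive.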

Why Indexing Matters

Type Safety

LumiX preserves type information throughout your model:

production = LXVariable[Product, float]("production").from_data(products)

# IDE knows 'p' is a Product - full autocomplete
expr = LXLinearExpression().add_term(production, lambda p: p.profit)

Data-Driven Modeling

Models automatically adapt to your data:

# Add more products? No code changes needed!
products = load_products()  # Could be 10 or 10,000 products

production = LXVariable[Product, float]("production").from_data(products)
# Automatically creates the right number of variables

No Manual Loops

LumiX handles variable expansion internally:

# No loops needed - automatic expansion
production = LXVariable[Product, float]("production").from_data(products)

# Automatic expansion in expressions
total_profit = LXLinearExpression().add_term(production, lambda p: p.profit)

# One constraint per resource; the sum over products expands automatically
for resource in resources:
    model.add_constraint(
        LXConstraint[Product](f"resource_{resource.id}")
        .expression(
            LXLinearExpression()
            # r=resource binds the current resource to this lambda
            .add_term(production, lambda p, r=resource: p.usage.get(r.id, 0))
        )
        .le()
        .rhs(resource.capacity)
        .from_data(products)
    )
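Coefficient lambdas created inside a loop are a place where Python's late-binding closures can cause surprises if the lambda is called after the loop has finished. Whether LumiX defers evaluation that long is an assumption here; the sketch below only demonstrates the Python behaviour, using made-up resource names and usage values.

```python
resources = ["cpu", "ram", "disk"]
usage = {"cpu": 2, "ram": 4, "disk": 8}

late_bound = []
early_bound = []
for r in resources:
    # Looks up r when called, i.e. after the loop has moved on
    late_bound.append(lambda u: u.get(r, 0))
    # Default argument captures the value of r at definition time
    early_bound.append(lambda u, r=r: u.get(r, 0))

late = [f(usage) for f in late_bound]
early = [f(usage) for f in early_bound]

print(late)   # every lambda sees the last resource
print(early)  # each lambda sees its own resource
```

Binding the loop variable through a default argument is a common way to make each lambda keep the value it was created with.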

Multi-Dimensional Support

Natural handling of multi-dimensional problems:

from typing import Tuple

# Two-dimensional variable
assignment = (
    LXVariable[Tuple[Driver, Date], int]("assignment")
    .binary()
    .indexed_by_product(
        LXIndexDimension(Driver, lambda d: d.id).from_data(drivers),
        LXIndexDimension(Date, lambda dt: dt.date).from_data(dates)
    )
)

# Access solution with type-safe indices
for driver in drivers:
    for date in dates:
        value = solution.variables["assignment"][(driver.id, date.date)]
        if value > 0.5:
            print(f"{driver.name} works on {date.date}")

Quick Start Examples

Production Planning (Single-Model)

from dataclasses import dataclass
from lumix import LXVariable, LXModel, LXLinearExpression, LXConstraint

@dataclass
class Product:
    id: str
    name: str
    profit: float
    resource_usage: float

products = [
    Product("A", "Product A", profit=30, resource_usage=2),
    Product("B", "Product B", profit=40, resource_usage=3),
]

# Define variable indexed by Product
production = (
    LXVariable[Product, float]("production")
    .continuous()
    .bounds(lower=0)
    .indexed_by(lambda p: p.id)
    .from_data(products)
)

# Build model
model = (
    LXModel("production_plan")
    .add_variable(production)
    .maximize(
        LXLinearExpression()
        .add_term(production, lambda p: p.profit)
    )
)

# Add resource constraint
model.add_constraint(
    LXConstraint("resource_limit")
    .expression(
        LXLinearExpression()
        .add_term(production, lambda p: p.resource_usage)
    )
    .le()
    .rhs(100)
)
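As a sanity check on the numbers in this example, the small linear program can be evaluated by hand in plain Python. This sketch does not use LumiX or any solver; it relies on the fact that a bounded linear program attains its optimum at a vertex of the feasible region, and simply compares the three vertices of this two-variable problem.

```python
# Data copied from the example above
profit = {"A": 30.0, "B": 40.0}
usage = {"A": 2.0, "B": 3.0}
capacity = 100.0


def objective(plan):
    """Total profit of a production plan {product_id: quantity}."""
    return sum(profit[k] * q for k, q in plan.items())


def resource_used(plan):
    """Total resource consumption of a production plan."""
    return sum(usage[k] * q for k, q in plan.items())


# Vertices of {a, b >= 0, 2a + 3b <= 100}
candidates = [
    {"A": 0.0, "B": 0.0},
    {"A": capacity / usage["A"], "B": 0.0},
    {"A": 0.0, "B": capacity / usage["B"]},
]

# Small tolerance guards against floating-point rounding
feasible = [p for p in candidates if resource_used(p) <= capacity + 1e-9]
best = max(feasible, key=objective)

print(best, objective(best))
```

Product A earns 15 per unit of resource and product B about 13.3, so spending the whole capacity on A is the best vertex.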

Driver Scheduling (Multi-Model)

import datetime
from dataclasses import dataclass
from typing import Tuple
from lumix import LXVariable, LXIndexDimension

@dataclass
class Driver:
    id: str
    name: str
    daily_rate: float
    is_active: bool

@dataclass
class Date:
    date: datetime.date
    min_drivers_required: int

drivers = [...]
dates = [...]

# Define variable indexed by (Driver, Date)
duty = (
    LXVariable[Tuple[Driver, Date], int]("duty")
    .binary()
    .indexed_by_product(
        LXIndexDimension(Driver, lambda d: d.id)
            .where(lambda d: d.is_active)
            .from_data(drivers),
        LXIndexDimension(Date, lambda dt: dt.date)
            .from_data(dates)
    )
    .cost_multi(lambda driver, date: driver.daily_rate)
    # is_available(driver, date) is a user-supplied predicate (not shown)
    .where_multi(lambda driver, date: is_available(driver, date))
)

Common Patterns

Filtering During Definition

Apply filters early to reduce model size:

# Filter at variable level
production = (
    LXVariable[Product, float]("production")
    .continuous()
    .where(lambda p: p.is_active and p.stock > 0)
    .from_data(products)
)

# Filter dimensions
driver_dim = (
    LXIndexDimension(Driver, lambda d: d.id)
    .where(lambda d: d.is_active and d.years_experience >= 2)
    .from_data(drivers)
)

Sparse Indexing

Create variables only for valid combinations:

# Only create variables where driver can work on date
duty = (
    LXVariable[Tuple[Driver, Date], int]("duty")
    .binary()
    .indexed_by_product(
        LXIndexDimension(Driver, lambda d: d.id).from_data(drivers),
        LXIndexDimension(Date, lambda dt: dt.date).from_data(dates)
    )
    .where_multi(lambda driver, date:
        # Date wraps a datetime.date in its .date field (see the key function above)
        date.date.weekday() not in driver.days_off and
        driver.can_work_date(date)
    )
)

Compound Keys

Use tuples for complex indices:

# Route indexed by (origin, destination)
route_dim = LXIndexDimension(
    Route,
    lambda r: (r.origin, r.destination)
).from_data(routes)
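A tuple key works like any other dictionary key in Python, which the sketch below illustrates. The `Route` fields (including `distance_km`) and the sample routes are hypothetical; the point is only that `(origin, destination)` must be unique per route and that direction matters.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    origin: str
    destination: str
    distance_km: float  # illustrative field, not part of the guide


routes = [
    Route("AMS", "BER", 650.0),
    Route("BER", "AMS", 650.0),
    Route("AMS", "PAR", 500.0),
]

# Same key function as the dimension above: a compound (origin, destination) key
by_key = {(r.origin, r.destination): r for r in routes}

# If two routes shared a key, the dict would silently drop one of them
keys_are_unique = len(by_key) == len(routes)

print(keys_are_unique)
print(by_key[("AMS", "BER")].distance_km)
```

Checking key uniqueness up front is cheap and catches data problems before they turn into missing variables.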

Best Practices

  1. Use Type Annotations

    # Good: Full type information
    production = LXVariable[Product, float]("production")
    
    # Bad: No type information
    production = LXVariable("production")
    
  2. Filter Early

    # Good: Filter at dimension level
    dim = (
        LXIndexDimension(Driver, lambda d: d.id)
        .where(lambda d: d.is_active)
        .from_data(drivers)
    )
    
    # Less efficient: Filter after the Cartesian product
    .where_multi(lambda d, dt: d.is_active and ...)
    
  3. Choose Descriptive Index Functions

    # Good: Clear what the index represents
    .indexed_by(lambda product: product.sku)
    
    # Less clear: Generic
    .indexed_by(lambda p: p.id)
    
  4. Use Sparse Indexing for Large Models

    # Create variables only where needed
    assignment = (
        LXVariable[Tuple[Worker, Task], int]("assignment")
        .binary()
        .indexed_by_product(...)
        .where_multi(lambda w, t: w.can_do_task(t))
    )
    

Performance Considerations

Cartesian Product Size

Be aware of combinatorial explosion:

# 100 drivers × 7 dates = 700 variables (manageable)
# 100 drivers × 30 dates × 3 shifts = 9,000 variables (still fine)
# 1000 drivers × 365 dates × 10 shifts = 3.65M variables (problematic)

Use filtering to reduce size:

duty = (
    LXVariable[Tuple[Driver, Date, Shift], int]("duty")
    .indexed_by_product(...)
    .where_multi(lambda d, dt, s:
        # Only create variables for valid combinations
        d.can_work_shift(s) and
        dt.date.weekday() not in d.days_off
    )
)
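The dense sizes quoted above can be checked before any variables are created. The sketch below is one possible guard, assuming a hypothetical budget of one million variables; neither `check_budget` nor the limit comes from LumiX.

```python
from math import prod

MAX_VARIABLES = 1_000_000  # hypothetical budget, tune for your solver and hardware


def check_budget(*dimension_sizes, limit=MAX_VARIABLES):
    """Return (dense_size, fits) for a Cartesian product of the given sizes."""
    size = prod(dimension_sizes)
    return size, size <= limit


small = check_budget(100, 7)
medium = check_budget(100, 30, 3)
large = check_budget(1000, 365, 10)

print(small, medium, large)
```

When the dense size exceeds the budget, that is a signal to add per-dimension filters or a sparse `where_multi` predicate before expanding.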

Lazy Evaluation

Data is retrieved only when needed:

# This doesn't query the database yet
product_dim = LXIndexDimension(Product, lambda p: p.id).from_model(session)

# Database query happens here during model solving
optimizer.solve(model)
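The deferred-loading behaviour described here can be sketched in plain Python. `LazySource` is a hypothetical class written for this illustration, with a stand-in loader instead of a database session; it shows the general pattern of deferring and caching a load, not LumiX's internals.

```python
class LazySource:
    """Defers calling the loader until items are first requested."""

    def __init__(self, loader):
        self._loader = loader
        self._cache = None
        self.load_count = 0

    def items(self):
        # Load once on first access, then reuse the cached list
        if self._cache is None:
            self._cache = list(self._loader())
            self.load_count += 1
        return self._cache


def load_products():
    # Stand-in for a database query
    return ["A", "B", "C"]


source = LazySource(load_products)
count_before = source.load_count   # nothing has been loaded yet

first = source.items()             # triggers the load
second = source.items()            # served from the cache

print(count_before, source.load_count, first)
```

Defining a source costs nothing until something actually asks for its items, so dimensions can be declared up front without touching the data store.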

See Also

  • Variables Guide - Variables that use indexing

  • Constraints Guide - Constraints with indexing

  • Driver Scheduling Example (examples/02_driver_scheduling) - Complete multi-model example

  • Production Planning Example (examples/01_production_planning) - Single-model example