Indexing Guide¶

Overview¶

The indexing module is one of LumiX’s most powerful features, enabling you to create variables and constraints indexed by one or more data models. This allows you to build optimization models that naturally reflect the structure of your data.

Key Concept: Instead of manually creating variables in loops with numeric indices, LumiX automatically expands variable and constraint families based on your data models with full type safety.

# Traditional approach - error-prone and verbose
x = {}
for i in range(len(products)):
    x[i] = model.addVar(name=f"x_{i}")

# LumiX approach - data-driven and type-safe
production = (
    LXVariable[Product, float]("production")
    .continuous()
    .indexed_by(lambda p: p.id)
    .from_data(products)
)

Core Concepts¶

Single-Model Indexing¶

Variables and constraints indexed by a single data model.

Use cases:

Production planning (indexed by Product)
Facility location (indexed by Facility)
Resource allocation (indexed by Resource)

Single-Model Indexing - Learn about single-model indexing

Multi-Model Indexing¶

Variables indexed by tuples of multiple data models using cartesian products.

Use cases:

Driver scheduling (indexed by Driver × Date)
Transportation planning (indexed by Origin × Destination)
Shift assignment (indexed by Worker × Date × Shift)

Multi-Model Indexing - Learn about multi-model indexing

Index Dimensions¶

Building blocks for multi-dimensional indexing with filtering and data sources.

Key features:

Per-dimension filtering
Multiple data sources (direct data, ORM queries)
Type-safe key extraction
Lazy evaluation

Index Dimensions - Learn about index dimensions

Filtering Strategies¶

Control which instances and combinations are included in your model.

Filtering types:

Per-dimension filters (reduce data before expansion)
Cross-dimension filters (filter combinations)
Performance optimization techniques

Filtering Strategies - Learn about filtering strategies

Why Indexing Matters¶

Type Safety¶

LumiX preserves type information throughout your model:

production = LXVariable[Product, float]("production").from_data(products)

# IDE knows 'p' is a Product - full autocomplete
expr = LXLinearExpression().add_term(production, lambda p: p.profit)

Data-Driven Modeling¶

Models automatically adapt to your data:

# Add more products? No code changes needed!
products = load_products()  # Could be 10 or 10,000 products

production = LXVariable[Product, float]("production").from_data(products)
# Automatically creates the right number of variables

No Manual Loops¶

LumiX handles variable expansion internally:

# No loops needed - automatic expansion
production = LXVariable[Product, float]("production").from_data(products)

# Automatic expansion in expressions
total_profit = LXLinearExpression().add_term(production, lambda p: p.profit)

# Automatic expansion in constraints
for resource in resources:
    model.add_constraint(
        LXConstraint[Product](f"resource_{resource.id}")
        .expression(
            LXLinearExpression()
            .add_term(production, lambda p: p.usage.get(resource.id, 0))
        )
        .le()
        .rhs(resource.capacity)
        .from_data(products)
    )

Multi-Dimensional Support¶

Natural handling of multi-dimensional problems:

from typing import Tuple

# Two-dimensional variable
assignment = (
    LXVariable[Tuple[Driver, Date], int]("assignment")
    .binary()
    .indexed_by_product(
        LXIndexDimension(Driver, lambda d: d.id).from_data(drivers),
        LXIndexDimension(Date, lambda dt: dt.date).from_data(dates)
    )
)

# Access solution with type-safe indices
for driver in drivers:
    for date in dates:
        value = solution.variables["assignment"][(driver.id, date.date)]
        if value > 0.5:
            print(f"{driver.name} works on {date.date}")

Quick Start Examples¶

Production Planning (Single-Model)¶

from dataclasses import dataclass
from lumix import LXVariable, LXModel, LXLinearExpression, LXConstraint

@dataclass
class Product:
    id: str
    name: str
    profit: float
    resource_usage: float

products = [
    Product("A", "Product A", profit=30, resource_usage=2),
    Product("B", "Product B", profit=40, resource_usage=3),
]

# Define variable indexed by Product
production = (
    LXVariable[Product, float]("production")
    .continuous()
    .bounds(lower=0)
    .indexed_by(lambda p: p.id)
    .from_data(products)
)

# Build model
model = (
    LXModel("production_plan")
    .add_variable(production)
    .maximize(
        LXLinearExpression()
        .add_term(production, lambda p: p.profit)
    )
)

# Add resource constraint
model.add_constraint(
    LXConstraint("resource_limit")
    .expression(
        LXLinearExpression()
        .add_term(production, lambda p: p.resource_usage)
    )
    .le()
    .rhs(100)
)

Driver Scheduling (Multi-Model)¶

from typing import Tuple
from lumix import LXVariable, LXIndexDimension

@dataclass
class Driver:
    id: str
    name: str
    daily_rate: float
    is_active: bool

@dataclass
class Date:
    date: datetime.date
    min_drivers_required: int

drivers = [...]
dates = [...]

# Define variable indexed by (Driver, Date)
duty = (
    LXVariable[Tuple[Driver, Date], int]("duty")
    .binary()
    .indexed_by_product(
        LXIndexDimension(Driver, lambda d: d.id)
            .where(lambda d: d.is_active)
            .from_data(drivers),
        LXIndexDimension(Date, lambda dt: dt.date)
            .from_data(dates)
    )
    .cost_multi(lambda driver, date: driver.daily_rate)
    .where_multi(lambda driver, date: is_available(driver, date))
)

Common Patterns¶

Filtering During Definition¶

Apply filters early to reduce model size:

# Filter at variable level
production = (
    LXVariable[Product, float]("production")
    .continuous()
    .where(lambda p: p.is_active and p.stock > 0)
    .from_data(products)
)

# Filter dimensions
driver_dim = (
    LXIndexDimension(Driver, lambda d: d.id)
    .where(lambda d: d.is_active and d.years_experience >= 2)
    .from_data(drivers)
)

Sparse Indexing¶

Create variables only for valid combinations:

# Only create variables where driver can work on date
duty = (
    LXVariable[Tuple[Driver, Date], int]("duty")
    .binary()
    .indexed_by_product(
        LXIndexDimension(Driver, lambda d: d.id).from_data(drivers),
        LXIndexDimension(Date, lambda dt: dt.date).from_data(dates)
    )
    .where_multi(lambda driver, date:
        date.weekday() not in driver.days_off and
        driver.can_work_date(date)
    )
)

Compound Keys¶

Use tuples for complex indices:

# Route indexed by (origin, destination)
route_dim = LXIndexDimension(
    Route,
    lambda r: (r.origin, r.destination)
).from_data(routes)

Best Practices¶

Use Type Annotations

# Good: Full type information
production = LXVariable[Product, float]("production")

# Bad: No type information
production = LXVariable("production")

Filter Early

# Good: Filter at dimension level
dim = (
    LXIndexDimension(Driver, lambda d: d.id)
    .where(lambda d: d.is_active)
    .from_data(drivers)
)

# Less efficient: Filter after cartesian product
product.where(lambda d, dt: d.is_active and ...)

Choose Descriptive Index Functions

# Good: Clear what the index represents
.indexed_by(lambda product: product.sku)

# Less clear: Generic
.indexed_by(lambda p: p.id)

Use Sparse Indexing for Large Models

# Create variables only where needed
assignment = (
    LXVariable[Tuple[Worker, Task], int]("assignment")
    .binary()
    .indexed_by_product(...)
    .where_multi(lambda w, t: w.can_do_task(t))
)

Performance Considerations¶

Cartesian Product Size¶

Be aware of combinatorial explosion:

# 100 drivers × 7 dates = 700 variables (manageable)
# 100 drivers × 30 dates × 3 shifts = 9,000 variables (still fine)
# 1000 drivers × 365 dates × 10 shifts = 3.65M variables (problematic)

Use filtering to reduce size:

duty = (
    LXVariable[Tuple[Driver, Date, Shift], int]("duty")
    .indexed_by_product(...)
    .where_multi(lambda d, dt, s:
        # Only create variables for valid combinations
        d.can_work_shift(s) and
        dt.weekday() not in d.days_off
    )
)

Lazy Evaluation¶

Data is retrieved only when needed:

# This doesn't query the database yet
product_dim = LXIndexDimension(Product, lambda p: p.id).from_model(session)

# Database query happens here during model solving
optimizer.solve(model)

Next Steps¶

Single-Model Indexing - Master single-model indexing
Multi-Model Indexing - Learn multi-dimensional indexing
Index Dimensions - Understand index dimensions
Filtering Strategies - Optimize with filtering strategies
Examples - See complete working examples