# Usage Patterns

This guide shows common patterns for using dags, based on real-world usage in projects like [pylcm](https://github.com/OpenSourceEconomics/pylcm) and [ttsim](https://github.com/ttsim-dev/ttsim).

## Pattern 1: Building Computational Pipelines

The core use case for dags is combining multiple interdependent functions into a single callable. This is powerful because **the same set of functions can be combined in different ways** depending on what you want to compute.

### Example: Data Processing Pipeline

Here's a simple data processing pipeline where raw data flows through cleaning, statistics computation, and report generation:

```python
import dags


def cleaned_data(raw_data):
    return [x for x in raw_data if x > 0]


def statistics(cleaned_data):
    return {
        "mean": sum(cleaned_data) / len(cleaned_data),
        "count": len(cleaned_data),
    }


def report(statistics, cleaned_data):
    return f"Processed {statistics['count']} items, mean: {statistics['mean']}"


functions = {
    "cleaned_data": cleaned_data,
    "statistics": statistics,
    "report": report,
}

# Create the full pipeline
pipeline = dags.concatenate_functions(
    functions=functions,
    targets=["report"],
    return_type="dict",
)

# raw_data is an external input (not computed by any function)
result = pipeline(raw_data=[1, -2, 3, 4, -5, 6])
# result = {"report": "Processed 4 items, mean: 3.5"}
```

### Example: Economic Model with Utility Maximization

Consider a consumer choosing consumption to maximize utility subject to a budget constraint. We define the model components as separate functions:

```python
import numpy as np

import dags


def utility(consumption, risk_aversion):
    """CRRA utility function."""
    if risk_aversion == 1:
        return np.log(consumption)
    return (consumption ** (1 - risk_aversion)) / (1 - risk_aversion)


def budget_constraint(income, price):
    """Maximum affordable consumption."""
    return income / price


def feasible(consumption, budget_constraint):
    """Check if consumption is affordable."""
    return consumption <= budget_constraint


def optimal_utility(budget_constraint, risk_aversion):
    """Find maximum utility over a grid of consumption values."""
    consumption_grid = np.linspace(0.1, budget_constraint, 100)
    if risk_aversion == 1:
        utilities = np.log(consumption_grid)
    else:
        utilities = (consumption_grid ** (1 - risk_aversion)) / (1 - risk_aversion)
    return float(np.max(utilities))


functions = {
    "utility": utility,
    "budget_constraint": budget_constraint,
    "feasible": feasible,
    "optimal_utility": optimal_utility,
}
```

Now the power of dags becomes clear: **we can create different combined functions from the same building blocks** depending on what we need:

```python
# 1. Compute optimal utility given income and prices
solve_model = dags.concatenate_functions(
    functions=functions,
    targets=["optimal_utility"],
    return_type="dict",
)

result = solve_model(income=1000, price=10, risk_aversion=2)
# result = {"optimal_utility": -0.01}

# 2. Evaluate utility and check feasibility for a specific consumption choice
evaluate_choice = dags.concatenate_functions(
    functions=functions,
    targets=["utility", "feasible"],
    return_type="dict",
)

result = evaluate_choice(income=1000, price=10, consumption=50, risk_aversion=2)
# result = {"utility": -0.02, "feasible": True}

# 3. Just compute the budget constraint
get_budget = dags.concatenate_functions(
    functions=functions,
    targets=["budget_constraint"],
    return_type="dict",
)

result = get_budget(income=1000, price=10)
# result = {"budget_constraint": 100.0}
```
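Because each combined function exposes exactly the external inputs it needs as keyword arguments, you can discover a pipeline's requirements with standard tooling. A minimal sketch using `inspect` from the standard library, assuming (as in current dags releases) that `concatenate_functions` attaches a proper signature to the result:

```python
import inspect

# The generated callable advertises only the inputs its targets depend on.
print(list(inspect.signature(get_budget).parameters))
# Expected: ["income", "price"] -- consumption and risk_aversion were pruned.
```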
This pattern is particularly useful when:

- You have a complex model with many interrelated components
- Different use cases require computing different subsets of outputs
- You want to avoid code duplication by reusing the same function definitions
- The computation graph may change based on user configuration

## Pattern 2: Aggregating Multiple Functions

When you have multiple functions that should be combined into a single result, use an aggregator. This is common when checking multiple constraints or combining scores.

**When to use this pattern:**

- Checking if multiple constraints are all satisfied
- Combining multiple penalty terms or objective function components
- Voting or ensemble methods where multiple models contribute to a decision

```python
import numpy as np

import dags


def positive_consumption(consumption):
    """Consumption must be positive."""
    return consumption > 0


def within_budget(consumption, budget_constraint):
    """Consumption must not exceed budget."""
    return consumption <= budget_constraint


def minimum_savings(consumption, income):
    """Must save at least 10% of income."""
    return consumption <= 0.9 * income


# Combine all constraints with logical AND
all_feasible = dags.concatenate_functions(
    functions={
        "positive_consumption": positive_consumption,
        "within_budget": within_budget,
        "minimum_savings": minimum_savings,
    },
    targets=["positive_consumption", "within_budget", "minimum_savings"],
    aggregator=np.logical_and,
    aggregator_return_type=bool,
)

# Check if a consumption choice satisfies all constraints
is_ok = all_feasible(consumption=80, budget_constraint=100, income=100)
# Returns True (80 > 0, 80 <= 100, 80 <= 90)

is_ok = all_feasible(consumption=95, budget_constraint=100, income=100)
# Returns False (95 > 90 violates minimum_savings)
```
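Conceptually, the aggregator is applied pairwise to the target results, folding them into a single value. A plain-Python sketch of the same aggregation (the `results` list stands in for the three boolean targets dags computes):

```python
from functools import reduce

# Stand-ins for the three constraint results computed by dags.
results = [True, True, False]

# A binary aggregator like np.logical_and folds the outputs pairwise.
aggregated = reduce(lambda a, b: a and b, results)
# aggregated = False
```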
## Pattern 3: Generating Functions for Multiple Scenarios

In economic modeling, you often need to create similar functions for different scenarios, time periods, or agent types. Rather than writing each function by hand, you can generate them programmatically and use `rename_arguments` to ensure they connect properly in the DAG.

**When to use this pattern:**

- Creating period-specific functions in a dynamic model (e.g., different tax rules by year)
- Generating agent-type-specific behavior (e.g., different utility functions by household type)
- Building functions for multiple regions or sectors with the same structure

```python
import dags


def create_income_tax(rate, threshold):
    """Create a tax function with the given rate and threshold."""

    def income_tax(gross_income):
        taxable = max(0, gross_income - threshold)
        return taxable * rate

    return income_tax


# Tax rules changed over time
tax_rules = {
    2020: {"rate": 0.25, "threshold": 10000},
    2021: {"rate": 0.27, "threshold": 12000},
    2022: {"rate": 0.30, "threshold": 12000},
}

# Generate tax functions for each year
functions = {}
for year, params in tax_rules.items():
    tax_func = create_income_tax(params["rate"], params["threshold"])
    # Rename so each function takes year-specific income
    functions[f"tax_{year}"] = dags.rename_arguments(
        tax_func, mapper={"gross_income": f"income_{year}"}
    )


def total_tax_burden(tax_2020, tax_2021, tax_2022):
    """Sum of taxes across all years."""
    return tax_2020 + tax_2021 + tax_2022


functions["total_tax_burden"] = total_tax_burden

combined = dags.concatenate_functions(
    functions=functions,
    targets=["total_tax_burden"],
    return_type="dict",
)

# Compute total taxes given an income trajectory
result = combined(income_2020=50000, income_2021=55000, income_2022=60000)
# result = {"total_tax_burden": 36010.0}
```

## Pattern 4: Selective Computation

When your function graph contains expensive computations, you can create multiple combined functions that compute only what's needed. dags automatically prunes the computation graph to include only the functions required for the specified targets.

**When to use this pattern:**

- Some outputs are expensive to compute and not always needed
- You want fast feedback during development by computing only key outputs
- Different analyses or reports need different subsets of results

```python
import dags


def simulated_data(parameters, n_simulations):
    """Expensive Monte Carlo simulation."""
    # ... costly operation that takes minutes
    return ...


def summary_statistics(simulated_data):
    """Compute mean, std, etc. from simulations."""
    return {"mean": ..., "std": ...}


def full_distribution(simulated_data):
    """Compute the full empirical distribution."""
    # ... build the distribution from the simulated draws
    return ...


def quick_check(parameters):
    """Fast sanity check of parameters."""
    return all(p > 0 for p in parameters.values())


functions = {
    "simulated_data": simulated_data,
    "summary_statistics": summary_statistics,
    "full_distribution": full_distribution,
    "quick_check": quick_check,
}

# For quick validation: only runs quick_check, skips the simulation
validator = dags.concatenate_functions(
    functions=functions,
    targets=["quick_check"],
)

# For summary results: runs simulation + summary_statistics
summarizer = dags.concatenate_functions(
    functions=functions,
    targets=["summary_statistics"],
)

# For full analysis: runs everything
full_analysis = dags.concatenate_functions(
    functions=functions,
    targets=["summary_statistics", "full_distribution"],
)
```
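To see the pruning at work, here is a self-contained toy version. The `slow_step` and `cheap_check` functions and the timing are illustrative, not part of the dags API; the point is that calling the validator never touches the expensive node:

```python
import time

import dags


def slow_step(x):
    """Stands in for an expensive computation."""
    time.sleep(2)  # simulate a long-running operation
    return x * 2


def cheap_check(x):
    return x > 0


validator = dags.concatenate_functions(
    functions={"slow_step": slow_step, "cheap_check": cheap_check},
    targets=["cheap_check"],
)

start = time.perf_counter()
validator(x=3)  # slow_step is not an ancestor of cheap_check, so it is pruned
print(f"{time.perf_counter() - start:.3f}s")  # well under 2 seconds
```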
## Pattern 5: Dependency Analysis

Use `get_ancestors` to analyze which inputs affect specific outputs. This is useful for understanding model structure, debugging, and optimizing computations.

**When to use this pattern:**

- Understanding which parameters affect a specific output
- Identifying the minimal set of inputs needed for a computation
- Debugging unexpected results by tracing dependencies

```python
import dags


def wage(education, experience):
    return 20000 + 5000 * education + 1000 * experience


def capital_income(wealth, interest_rate):
    return wealth * interest_rate


def total_income(wage, capital_income):
    return wage + capital_income


def consumption(total_income, savings_rate):
    return total_income * (1 - savings_rate)


functions = {
    "wage": wage,
    "capital_income": capital_income,
    "total_income": total_income,
    "consumption": consumption,
}

# What affects consumption? (includes both functions and their inputs)
ancestors = dags.get_ancestors(
    functions=functions,
    targets=["consumption"],
    include_targets=True,
)
# Returns all nodes in the dependency graph:
# {"wage", "capital_income", "total_income", "consumption",
#  "education", "experience", "wealth", "interest_rate", "savings_rate"}

# What are the external inputs (nodes not computed by any function)?
all_args = set()
for func in functions.values():
    all_args.update(dags.get_free_arguments(func))
external_inputs = all_args - set(functions.keys())
# Returns: {"education", "experience", "wealth", "interest_rate", "savings_rate"}
```
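Since dags resolves dependencies purely from parameter names, you can rebuild the same graph explicitly when you want to inspect or plot it. A minimal sketch using networkx, reusing `functions` from the example above (the graph construction is our own, not a dags API):

```python
import networkx as nx

import dags

# Rebuild the dependency graph that dags infers from parameter names.
graph = nx.DiGraph()
for name, func in functions.items():
    for arg in dags.get_free_arguments(func):
        graph.add_edge(arg, name)  # edge: input -> function that consumes it

# Nodes with no predecessors are the external inputs.
sources = sorted(n for n in graph.nodes if graph.in_degree(n) == 0)
# sources == ["education", "experience", "interest_rate", "savings_rate", "wealth"]
```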
## Pattern 6: Working with Nested Structures

Use `dags.tree` for hierarchical function organization. This is useful when you have functions grouped by category, region, time period, or any other hierarchy.

**When to use this pattern:**

- Organizing functions by logical groups (e.g., taxes, transfers, labor market)
- Working with multi-region or multi-sector models
- Keeping namespaces separate to avoid naming conflicts

```python
import dags
import dags.tree as dt

# Nested function structure representing a tax-transfer system
functions = {
    "income": {
        "wage": lambda hours, hourly_wage: hours * hourly_wage,
        "capital": lambda wealth, interest_rate: wealth * interest_rate,
    },
    "taxes": {
        "income_tax": lambda income__wage, income__capital: (
            0.3 * (income__wage + income__capital)
        ),
    },
    "transfers": {
        "basic_income": lambda: 500,
    },
    "net_income": lambda income__wage, income__capital, taxes__income_tax, transfers__basic_income: (
        income__wage
        + income__capital
        - taxes__income_tax
        + transfers__basic_income
    ),
}

# Flatten to qualified names for use with dags
flat_functions = dt.flatten_to_qnames(functions)

combined = dags.concatenate_functions(
    functions=flat_functions,
    targets=["net_income"],
    return_type="dict",
)

result = combined(hours=40, hourly_wage=25, wealth=10000, interest_rate=0.05)
# result = {"net_income": 1550.0}
```

See the [Tree documentation](tree.md) for more details.

## Pattern 7: Signature Inspection and Modification

Sometimes you need to inspect or modify function signatures, especially when integrating functions from different sources or creating wrappers.

**When to use this pattern:**

- Integrating functions from external libraries with different naming conventions
- Creating generic wrappers that work with varying function signatures
- Building function registries or plugin systems

```python
import dags


# Inspect a function's arguments
def model(alpha, beta, gamma):
    return alpha + beta * gamma


args = dags.get_free_arguments(model)
# args = ["alpha", "beta", "gamma"]

# Rename arguments to match your naming convention
renamed = dags.rename_arguments(
    model,
    mapper={
        "alpha": "intercept",
        "beta": "slope",
        "gamma": "x",
    },
)

# Verify the new signature
new_args = dags.get_free_arguments(renamed)
# new_args = ["intercept", "slope", "x"]


# Get type annotations (returns type objects, not strings)
def typed_func(x: float, y: int) -> float:
    return x + y


annotations = dags.get_annotations(typed_func)
# annotations = {"x": float, "y": int, "return": float}
```

## Best Practices

1. **Use descriptive function names**: Since dags uses names for dependency resolution, clear names make the DAG easier to understand and debug.

2. **Keep functions focused**: Each function should do one thing well, making the DAG modular and testable. This also makes it easier to compute different subsets of outputs.

3. **Document dependencies**: Even though dags infers dependencies from parameter names, documenting expected inputs in docstrings helps maintainability.

4. **Use `enforce_signature=False` for dynamic cases**: When functions have dynamic signatures (e.g., generated at runtime), disable signature enforcement:

   ```python
   combined = dags.concatenate_functions(
       functions=functions,
       targets=targets,
       enforce_signature=False,
   )
   ```

5. **Set annotations for type checking**: Enable type annotations on the combined function for better IDE support and type checking:

   ```python
   combined = dags.concatenate_functions(
       functions=functions,
       targets=targets,
       set_annotations=True,
   )
   ```
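   With `set_annotations=True`, the combined function should carry annotations assembled from its parts, so the same inspection tools from Pattern 7 apply to it. A minimal sketch under that assumption (the `double` and `shifted` functions are hypothetical, and the exact shape of the result may differ between versions):

   ```python
   import dags


   def double(x: float) -> float:
       return 2 * x


   def shifted(double: float, shift: float) -> float:
       return double + shift


   combined = dags.concatenate_functions(
       functions={"double": double, "shifted": shifted},
       targets=["shifted"],
       return_type="dict",
       set_annotations=True,
   )

   # The combined function can now be inspected like any annotated function.
   print(dags.get_annotations(combined))
   # Expected to include the external inputs, e.g. {"x": float, "shift": float, ...}
   ```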