Tree Structures

The dags.tree module provides utilities for working with nested dictionary structures, converting between tree representations and flat qualified names.

Overview

When working with hierarchical data or organizing functions into namespaces, you often need to convert between:

Nested dictionaries (tree structure): {"a": {"b": 1, "c": 2}}
Qualified names (flat strings): {"a__b": 1, "a__c": 2}
Tree paths (flat tuples): {("a", "b"): 1, ("a", "c"): 2}

import dags.tree as dt

Qualified Names

Qualified names (qnames) use a delimiter (default: "__") to represent hierarchy:

# The delimiter used for qualified names
print(dt.QNAME_DELIMITER)  # "__"

# Convert tree path to qname
qname = dt.qname_from_tree_path(("household", "income"))
# "household__income"

# Convert qname to tree path
path = dt.tree_path_from_qname("household__income")
# ("household", "income")
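The round trip between a tree path and a qualified name can be pictured as a plain join and split on the delimiter. The sketch below is written in plain Python for illustration; `DELIMITER`, `join_path`, and `split_qname` are hypothetical stand-ins, not imports from dags, and the delimiter value is assumed to match `dt.QNAME_DELIMITER`.

```python
# Illustrative stand-ins written for this example; not imported from dags.
DELIMITER = "__"  # assumption: same value as dt.QNAME_DELIMITER


def join_path(path: tuple[str, ...]) -> str:
    """Join path elements into one qualified name."""
    return DELIMITER.join(path)


def split_qname(qname: str) -> tuple[str, ...]:
    """Split a qualified name back into its path elements."""
    return tuple(qname.split(DELIMITER))


path = ("household", "income")
qname = join_path(path)
round_trip = split_qname(qname)
print(qname)       # household__income
print(round_trip)  # ('household', 'income')

# A single-element path contains no delimiter at all.
top_level = join_path(("b",))
print(top_level)   # b
```

Under this reading, a path element that itself contains the delimiter would not survive the round trip, which is one reason to prefer tuple paths when keys are not under your control.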

Flattening and Unflattening

To/From Qualified Names

# Nested structure
tree = {
    "household": {
        "income": 50000,
        "expenses": 30000,
    },
    "taxes": {
        "federal": 10000,
        "state": 2500,
    },
}

# Flatten to qualified names
flat = dt.flatten_to_qnames(tree)
# {
#     "household__income": 50000,
#     "household__expenses": 30000,
#     "taxes__federal": 10000,
#     "taxes__state": 2500,
# }

# Unflatten back to tree
restored = dt.unflatten_from_qnames(flat)
# Returns the original nested structure
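To make the flatten and unflatten steps concrete, here is a minimal recursive sketch in plain Python. It is not the library's implementation; the helper names `flatten` and `unflatten` and the `DELIMITER` constant are assumptions chosen for this example, and every non-dict value is treated as a leaf.

```python
from typing import Any

DELIMITER = "__"  # assumption: same value as dt.QNAME_DELIMITER


def flatten(tree: dict[str, Any], prefix: tuple[str, ...] = ()) -> dict[str, Any]:
    """Walk the tree depth-first and key each leaf by its joined path."""
    flat: dict[str, Any] = {}
    for key, value in tree.items():
        path = (*prefix, key)
        if isinstance(value, dict):
            flat.update(flatten(value, path))
        else:
            flat[DELIMITER.join(path)] = value
    return flat


def unflatten(flat: dict[str, Any]) -> dict[str, Any]:
    """Rebuild the nesting by splitting each name and descending level by level."""
    tree: dict[str, Any] = {}
    for qname, value in flat.items():
        *parents, leaf = qname.split(DELIMITER)
        node = tree
        for parent in parents:
            node = node.setdefault(parent, {})
        node[leaf] = value
    return tree


tree = {
    "household": {"income": 50000, "expenses": 30000},
    "taxes": {"federal": 10000, "state": 2500},
}
flat = flatten(tree)
restored = unflatten(flat)
print(flat["taxes__state"])  # 2500
print(restored == tree)      # True
```

In this sketch an empty nested dictionary produces no entries when flattened, so it would not reappear after unflattening.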

To/From Tree Paths

# Flatten to tuple paths
flat_paths = dt.flatten_to_tree_paths(tree)
# {
#     ("household", "income"): 50000,
#     ("household", "expenses"): 30000,
#     ("taxes", "federal"): 10000,
#     ("taxes", "state"): 2500,
# }

# Unflatten from tuple paths
restored = dt.unflatten_from_tree_paths(flat_paths)
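Tuple paths and qualified names carry the same information, so one flat form can be derived from the other. The following plain-Python sketch converts a path-keyed dictionary to a qname-keyed one and back; `DELIMITER` is a hypothetical constant assumed to equal `dt.QNAME_DELIMITER`, and nothing here calls dags.

```python
DELIMITER = "__"  # assumption: same value as dt.QNAME_DELIMITER

flat_paths = {
    ("household", "income"): 50000,
    ("household", "expenses"): 30000,
    ("taxes", "federal"): 10000,
    ("taxes", "state"): 2500,
}

# Path-keyed -> qname-keyed: join each tuple with the delimiter.
flat_qnames = {DELIMITER.join(path): value for path, value in flat_paths.items()}

# Qname-keyed -> path-keyed: split each name on the delimiter.
back_to_paths = {
    tuple(name.split(DELIMITER)): value for name, value in flat_qnames.items()
}

print(list(flat_qnames))
# ['household__income', 'household__expenses', 'taxes__federal', 'taxes__state']
print(back_to_paths == flat_paths)  # True
```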

Extracting Names and Paths

tree = {
    "a": {
        "x": 1,
        "y": 2,
    },
    "b": 3,
}

# Get all qualified names
names = dt.qnames(tree)
# ["a__x", "a__y", "b"]

# Get all tree paths
paths = dt.tree_paths(tree)
# [("a", "x"), ("a", "y"), ("b",)]
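For a tree with leaves at different depths, the paths can be collected with a short depth-first generator. This is an illustrative plain-Python sketch; `iter_paths` is a hypothetical helper, and the `"__"` delimiter is assumed to match the default shown earlier.

```python
from collections.abc import Iterator
from typing import Any


def iter_paths(
    tree: dict[str, Any], prefix: tuple[str, ...] = ()
) -> Iterator[tuple[str, ...]]:
    """Yield the path to every leaf, in dictionary insertion order."""
    for key, value in tree.items():
        path = (*prefix, key)
        if isinstance(value, dict):
            yield from iter_paths(value, path)
        else:
            yield path


tree = {"a": {"x": 1, "y": 2}, "b": 3}
paths = list(iter_paths(tree))
names = ["__".join(path) for path in paths]
print(paths)  # [('a', 'x'), ('a', 'y'), ('b',)]
print(names)  # ['a__x', 'a__y', 'b']
```

Note that the top-level leaf "b" yields a one-element tuple, which is why its qualified name carries no delimiter.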

Tree DAG Functions

The tree module also provides DAG functions that work with nested structures:

create_dag_tree

Create a DAG from a nested function dictionary:

functions = {
    "inputs": {
        "income": lambda: 50000,
        "tax_rate": lambda: 0.3,
    },
    "calculations": {
        "tax": lambda inputs__income, inputs__tax_rate: inputs__income * inputs__tax_rate,
        "net": lambda inputs__income, calculations__tax: inputs__income - calculations__tax,
    },
}

dag = dt.create_dag_tree(
    functions=functions,
    targets={"calculations": {"net": None}},  # or [("calculations", "net")]
)

concatenate_functions_tree

Combine functions while preserving tree structure in outputs:

combined = dt.concatenate_functions_tree(
    functions=functions,
    targets={"calculations": ["tax", "net"]},
)

result = combined()
# Returns nested dict: {"calculations": {"tax": 15000, "net": 35000}}

functions_without_tree_logic

Convert tree-aware functions to flat functions:

# Functions that use qualified names internally
tree_functions = {
    "a": {
        "x": lambda: 1,
        "y": lambda a__x: a__x + 1,  # references a__x
    },
}

# Convert to flat dict with qnames
flat_functions = dt.functions_without_tree_logic(tree_functions)
# {"a__x": <func>, "a__y": <func>}

Validation Functions

The tree module includes validation utilities:

# Check for invalid paths
dt.fail_if_paths_are_invalid(paths)

# Check for trailing underscores (not allowed)
dt.fail_if_path_elements_have_trailing_undersores(paths)

# Check for repeated top-level elements
dt.fail_if_top_level_elements_repeated_in_paths(paths)
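A likely motivation for rejecting trailing underscores (an inference, not something the text above states) is that they make the "__" delimiter ambiguous when a path is joined and split again. The plain-Python sketch below demonstrates the problem; `has_trailing_underscore` is a hypothetical helper and does not reproduce the behavior of the dags validators.

```python
DELIMITER = "__"  # assumption: same value as dt.QNAME_DELIMITER


def has_trailing_underscore(path: tuple[str, ...]) -> bool:
    """Hypothetical check: does any element of the path end with '_'?"""
    return any(element.endswith("_") for element in path)


good = ("a", "b")
bad = ("a_", "b")

joined = DELIMITER.join(bad)
recovered = tuple(joined.split(DELIMITER))
print(joined)     # a___b
print(recovered)  # ('a', '_b')

print(has_trailing_underscore(good))  # False
print(has_trailing_underscore(bad))   # True
```

The joined name contains three underscores in a row, so splitting it no longer recovers the original elements.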

Real-World Example

Here’s how pylcm uses tree utilities for regime management:

import dags
import dags.tree as dt

# Define functions for multiple regimes. An argument that refers to another
# function uses that function's qualified name (e.g. working__income),
# because the flattened functions are matched purely by name.
regime_functions = {
    "working": {
        "income": lambda wage, hours: wage * hours,
        "utility": lambda working__income, leisure: working__income + leisure,
    },
    "retired": {
        "income": lambda pension: pension,
        "utility": lambda retired__income, leisure: retired__income + 2 * leisure,
    },
}

# Flatten for processing with dags
flat_functions = dt.flatten_to_qnames(regime_functions)

# Use with concatenate_functions; return_type="dict" keys the results by
# qualified name so they can be unflattened afterwards
combined = dags.concatenate_functions(
    functions=flat_functions,
    targets=["working__utility", "retired__utility"],
    return_type="dict",
)

# Restore structure for output
result = combined(wage=20, hours=40, pension=1000, leisure=10)
nested_result = dt.unflatten_from_qnames(result)
# {"working": {"utility": ...}, "retired": {"utility": ...}}

API Reference

See the API documentation for complete function signatures and details.