API Reference#
This page documents the public API of the dags library.
Core Functions#
The main functions for creating and executing DAGs.
concatenate_functions#
- dags.concatenate_functions(functions, targets=None, *, dag=None, return_type='tuple', aggregator=None, aggregator_return_type=None, enforce_signature=True, set_annotations=False, lexsort_key=None)[source]#
Combine functions to one function that generates targets.
Functions can depend on the output of other functions as inputs, as long as the dependencies can be described by a directed acyclic graph (DAG).
Functions that are not required to produce the targets will simply be ignored.
The arguments of the combined function are all arguments of relevant functions that are not themselves function names, in alphabetical order.
- Parameters:
functions (dict or list) – Dict or list of functions. If a list, the function name is inferred from the __name__ attribute of the entries. If a dict, the name of the function is set to the dictionary key.
targets (str or list or None) – Name of the function that produces the target or list of such function names. If the value is None, all variables are returned.
dag (networkx.DiGraph or None) – A DAG of functions. If None, a new DAG is created from the functions and targets.
return_type (str) – One of “tuple”, “list”, “dict”. This is ignored if the targets are a single string or if an aggregator is provided.
aggregator (callable or None) – Binary reduction function that is used to aggregate the targets into a single target.
aggregator_return_type (str or None) – Explicit return type annotation for the aggregated result. If None and set_annotations is True, the return type is inferred from the aggregator’s annotations or from the target types (if all targets have the same type). This parameter is only used when an aggregator is provided.
enforce_signature (bool) – If True, the signature of the concatenated function is enforced. Otherwise it is only provided for introspection purposes. Enforcing the signature has a small runtime overhead.
set_annotations (bool) – If True, sets the annotations of the concatenated function based on those of the functions used to generate the targets. The return annotation of the concatenated function reflects the requested return type and number of targets (e.g., for two targets returned as a list, the return annotation is a list of their respective type hints). Note that this is not a valid type annotation and should not be used for type checking. All annotations must be strings; otherwise, a NonStringAnnotationError is raised. To ensure string annotations, enclose them in quotes or use “from __future__ import annotations” at the top of your file. An AnnotationMismatchError is raised if annotations differ between functions.
lexsort_key (callable or None) – A function that takes a string and returns a value that can be used to sort the nodes. This is used to sort the nodes in the topological sort. If None, the nodes are sorted alphabetically.
- Returns:
function: A function that produces targets when called with suitable arguments.
- Raises:
- NonStringAnnotationError – If set_annotations is True and the type annotations are not strings.
- AnnotationMismatchError – If set_annotations is True and there are incompatible annotations in the DAG’s components.
- Return type:
Callable[…, Any]
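The following is a minimal sketch of the idea described for concatenate_functions, written with the standard library only rather than with dags itself. Dependencies are read from argument names, functions that the target does not need are dropped, and the remaining free arguments become the inputs in alphabetical order. The helper name combine and the example functions are hypothetical, and the library's internals may differ.

```python
import inspect
from graphlib import TopologicalSorter


def combine(functions, target):
    """Hypothetical helper, not part of dags: chain functions via argument names."""
    # Argument names of each function; a name that matches a key is a dependency.
    args = {name: list(inspect.signature(f).parameters) for name, f in functions.items()}

    # Walk back from the target to find the functions that are actually needed.
    needed, stack = set(), [target]
    while stack:
        name = stack.pop()
        if name in functions and name not in needed:
            needed.add(name)
            stack.extend(args[name])

    # Arguments that are not function names are the inputs, sorted alphabetically.
    inputs = sorted({a for n in needed for a in args[n]} - set(functions))

    # Topological order guarantees every dependency is computed before its user.
    graph = {n: {a for a in args[n] if a in functions} for n in needed}
    order = list(TopologicalSorter(graph).static_order())

    def combined(**kwargs):
        values = {name: kwargs[name] for name in inputs}
        for n in order:
            values[n] = functions[n](**{a: values[a] for a in args[n]})
        return values[target]

    return combined, inputs


def income(wage, hours):
    return wage * hours


def tax(income, tax_rate):
    return income * tax_rate


def net_income(income, tax):
    return income - tax


def unrelated(other_input):  # not needed for the target, so it is ignored
    return other_input


functions = {"income": income, "tax": tax, "net_income": net_income, "unrelated": unrelated}
combined, inputs = combine(functions, "net_income")
result = combined(wage=10, hours=40, tax_rate=0.25)
```

Note that other_input does not appear among the inputs, because unrelated is not required to produce net_income.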
create_dag#
- dags.create_dag(functions, targets)[source]#
Build a directed acyclic graph (DAG) from functions.
Functions can depend on the output of other functions as inputs, as long as the dependencies can be described by a directed acyclic graph (DAG).
Functions that are not required to produce the targets will simply be ignored.
- Parameters:
functions (dict or list) – Dict or list of functions. If a list, the function name is inferred from the __name__ attribute of the entries. If a dict, the name of the function is set to the dictionary key.
targets (str or list or None) – Name of the function that produces the target or list of such function names. If the value is None, all variables are returned.
- Returns:
dag: the DAG (as networkx.DiGraph object)
- Return type:
nx.DiGraph[str]
get_ancestors#
- dags.get_ancestors(functions, targets, include_targets=False)[source]#
Build a DAG and extract all ancestors of targets.
- Parameters:
functions (dict or list) – Dict or list of functions. If a list, the function name is inferred from the __name__ attribute of the entries. If a dict, with node names as keys or just the values as a tuple for multiple outputs.
targets (str) – Name of the function that produces the target.
include_targets (bool) – Whether to include the target as its own ancestor.
- Return type:
- Returns:
set: The ancestors
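As a rough illustration of what collecting ancestors means, the sketch below walks backwards from a target through argument names using only the standard library. The helper name ancestors_of is hypothetical. Counting plain inputs (arguments that are not function names) as ancestors is an assumption of this sketch, not a statement about the library's behavior.

```python
import inspect


def ancestors_of(functions, target, include_target=False):
    """Hypothetical helper, not part of dags: collect everything target depends on."""
    found, stack = set(), [target]
    while stack:
        name = stack.pop()
        if name not in functions:
            continue  # a plain input has no further dependencies
        for arg in inspect.signature(functions[name]).parameters:
            if arg not in found:
                found.add(arg)
                stack.append(arg)
    if include_target:
        found.add(target)
    return found


def a(x):
    return x + 1


def b(a, y):
    return a * y


def c(z):
    return z


functions = {"a": a, "b": b, "c": c}
without_target = ancestors_of(functions, "b")
with_target = ancestors_of(functions, "b", include_target=True)
```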
Annotation Functions#
Functions for working with type annotations and function signatures.
get_annotations#
- dags.get_annotations(func, eval_str=False, default=None)[source]#
Thin wrapper around inspect.get_annotations.
Compared to inspect.get_annotations, this function also handles partialled funcs, and it returns annotations for all arguments, not just the ones with annotations.
- Parameters:
func (Callable[..., Any]) – The function to get annotations from.
eval_str (bool, default: False) – If True, the string type annotations are evaluated.
default (str | type | None, default: None) – The default value to use if an annotation is missing. If None, the default value is inspect.Parameter.empty if eval_str is True, otherwise “no_annotation_found”.
- Return type:
- Returns:
A dictionary with the argument names as keys and the type annotations as values. The type annotations are strings if eval_str is False, otherwise they are types.
get_free_arguments#
rename_arguments#
- dags.rename_arguments(func=None, *, mapper=None)[source]#
Rename positional and keyword arguments of func.
- Parameters:
func (callable) – The function of which the arguments are renamed.
mapper (dict) – Dict of strings where keys are the old argument names and values are the new argument names.
- Return type:
Callable[…, R] | Callable[[Callable[P, R]], Callable[…, R]]
- Returns:
function: The function with renamed arguments.
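The sketch below shows one way argument renaming can be done with the standard library: build a new signature with the mapped names, then translate the names back before calling the original function. The helper name rename_kwargs is hypothetical, and the sketch assumes that func has only ordinary positional-or-keyword arguments. The func=None default in the signature above suggests that rename_arguments can also be used as a decorator, which this sketch does not cover.

```python
import functools
import inspect


def rename_kwargs(func, mapper):
    """Hypothetical helper, not part of dags: expose func under new argument names."""
    old_signature = inspect.signature(func)
    new_parameters = [
        parameter.replace(name=mapper.get(name, name))
        for name, parameter in old_signature.parameters.items()
    ]
    new_signature = old_signature.replace(parameters=new_parameters)
    new_to_old = {new: old for old, new in mapper.items()}

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # Bind against the renamed signature, then map names back for the call.
        bound = new_signature.bind(*args, **kwargs)
        bound.apply_defaults()
        return func(**{new_to_old.get(k, k): v for k, v in bound.arguments.items()})

    # Make introspection report the renamed arguments.
    wrapper.__signature__ = new_signature
    return wrapper


def difference(a, b=2):
    return a - b


renamed = rename_kwargs(difference, {"a": "x"})
by_keyword = renamed(x=5)
by_position = renamed(5, b=1)
renamed_names = list(inspect.signature(renamed).parameters)
```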
Exceptions#
- exception dags.AnnotationMismatchError[source]#
Bases: DagsError
Raised when there is a mismatch between annotations.
- exception dags.CyclicDependencyError[source]#
Bases: DagsError
Raised when the DAG contains a cycle.
- exception dags.InvalidFunctionArgumentsError[source]#
Bases: DagsError
Raised when there’s an issue with function signatures or arguments.
dags.tree#
The tree module provides utilities for working with nested dictionaries and qualified names.
Tree Path Utilities#
- dags.tree.QNAME_DELIMITER = '__'#
- dags.tree.tree_path_from_qname(qname)[source]#
Convert a qualified name to a tree path (tuple of strings).
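The relationship between a qualified name and a tree path can be sketched with plain string operations, assuming the delimiter is the '__' value shown for QNAME_DELIMITER. The helper names path_from_qname and qname_from_path are hypothetical and only illustrate the mapping. They are not the library's implementation.

```python
DELIMITER = "__"  # same value as documented for dags.tree.QNAME_DELIMITER


def path_from_qname(qname):
    """Hypothetical helper: split a qualified name into a tuple of path elements."""
    return tuple(qname.split(DELIMITER))


def qname_from_path(path):
    """Hypothetical helper: join path elements into a qualified name."""
    return DELIMITER.join(path)


path = path_from_qname("household__income__wage")
qname = qname_from_path(("household", "income", "wage"))
```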
Flatten/Unflatten Functions#
- dags.tree.flatten_to_qnames(nested)[source]#
Flatten a nested dictionary to a flat dictionary with qualified names as keys.
- Parameters:
nested (NestedStructureDict) – A nested dictionary.
- Return type:
FlatQNameDict
- Returns:
A flat dictionary with qualified names as keys.
- dags.tree.unflatten_from_qnames(flat_qnames)[source]#
Return a nested dictionary from a flat dictionary with qualified names as keys.
- Parameters:
flat_qnames (FlatQNameDict) – A dictionary with qualified names as keys.
- Return type:
NestedStructureDict
- Returns:
A nested dictionary.
- dags.tree.flatten_to_tree_paths(nested)[source]#
Flatten a nested dictionary to a flat dictionary with tree paths as keys.
- Parameters:
nested (NestedStructureDict) – A nested dictionary.
- Return type:
FlatTreePathDict
- Returns:
A flat dictionary with tree paths as keys.
- dags.tree.unflatten_from_tree_paths(flat_tree_paths)[source]#
Return a nested dictionary from a flat dictionary with tree paths as keys.
- Parameters:
flat_tree_paths (FlatTreePathDict) – A flat dictionary with tree paths (tuples) as keys.
- Return type:
NestedStructureDict
- Returns:
A nested dictionary.
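The four flatten and unflatten functions above convert between three representations of the same data. The sketch below reproduces those conversions with the standard library, assuming leaves are any non-dict values and that qualified names join path elements with '__'. The helper names are hypothetical, and edge cases such as empty sub-dictionaries may be handled differently by the library.

```python
def flatten_to_paths(nested, prefix=()):
    """Hypothetical helper: nested dict -> {tuple path: leaf}."""
    flat = {}
    for key, value in nested.items():
        path = (*prefix, key)
        if isinstance(value, dict):
            flat.update(flatten_to_paths(value, path))
        else:
            flat[path] = value
    return flat


def unflatten_from_paths(flat):
    """Hypothetical helper: {tuple path: leaf} -> nested dict."""
    nested = {}
    for path, value in flat.items():
        node = nested
        for key in path[:-1]:
            node = node.setdefault(key, {})
        node[path[-1]] = value
    return nested


nested = {"a": {"b": 1, "c": {"d": 2}}, "e": 3}
by_path = flatten_to_paths(nested)
by_qname = {"__".join(path): value for path, value in by_path.items()}
round_trip = unflatten_from_paths(by_path)
```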
Tree DAG Functions#
- dags.tree.create_dag_tree(functions, inputs, targets)[source]#
Build a DAG from the given functions, targets, and input structure.
- Parameters:
functions – A nested dictionary of functions.
inputs – A nested dictionary with the inputs or their structure.
targets – A nested dictionary of targets (or None).
- Returns:
A networkx.DiGraph representing the DAG.
- Return type:
nx.DiGraph[str]
- dags.tree.concatenate_functions_tree(functions, inputs, targets, enforce_signature=True)[source]#
Combine a nested dictionary of functions into a single callable.
- Parameters:
functions (Mapping[str, Callable[..., Any] | Mapping[str, Callable[..., Any] | NestedFunctionDict]]) – The nested dictionary of functions to concatenate.
inputs (Mapping[str, Any | Mapping[str, Any | NestedInputDict]]) – A nested dictionary that defines the (structure of) inputs.
targets (Mapping[str, str | None | Mapping[str, str | None | NestedTargetDict]] | None) – The nested dictionary of targets (or None).
enforce_signature (bool, default: True) – Whether to enforce the function signature strictly.
- Return type:
Callable[[Mapping[str, Any | Mapping[str, Any | NestedInputDict]]], Mapping[str, Any | Mapping[str, Any | NestedOutputDict]]]
- Returns:
A callable that takes a NestedInputDict and returns a NestedOutputDict.
- dags.tree.create_tree_with_input_types(functions, targets=None, top_level_inputs=())[source]#
Create a nested input structure template based on the functions and targets.
- Parameters:
functions (Mapping[str, Callable[..., Any] | Mapping[str, Callable[..., Any] | NestedFunctionDict]]) – A nested dictionary of functions.
targets (Mapping[str, str | None | Mapping[str, str | None | NestedTargetDict]] | None, default: None) – A nested dictionary of targets (or None).
top_level_inputs (set[str] | list[str] | tuple[str, ...], default: ()) – Names of inputs in the top-level namespace.
- Return type:
Mapping[str, str | None | Mapping[str, str | None | NestedInputStructureDict]]
- Returns:
A nested dictionary representing the expected input structure.
- dags.tree.functions_without_tree_logic(functions, top_level_namespace)[source]#
Return a functions dictionary that dags.concatenate_functions can work with.
In particular, remove all tree logic by:
Flattening the set of functions and inputs to qualified absolute names.
Converting all functions so they take only qualified absolute names as arguments.
The result can be passed to dags.concatenate_functions.
- Parameters:
- Return type:
- Returns:
A flat dictionary mapping qualified absolute names to functions taking qualified absolute names as arguments.
Validation Functions#
- dags.tree.fail_if_paths_are_invalid(functions=None, abs_qnames_functions=None, data_tree=None, input_structure=None, targets=None, top_level_namespace=())[source]#
Fail if the paths in the (different parts of the) functions tree are invalid.
The interface is designed so you can pass any argument you like, but none of them is required (however, not passing anything does not make sense).
There are two reasons for failure:
Path elements have trailing underscores.
The paths contain elements that are part of the top-level namespace.
Note: Sometimes you want to pass both functions (the nested function dict you will start out with) and abs_qnames_functions (the result of running functions_without_tree_logic on functions, which contains the converted parameters of functions, too). Even though the former may be seen as a subset of the latter, the conversion to qualified absolute names is not innocuous when it comes to the check for trailing underscores. The reason is that the conversion from qualified names to tree paths assigns any third consecutive underscore to the name that comes after the double underscore separating two levels of nesting.
- Parameters:
functions (NestedFunctionDict | None, default: None) – The nested function dict.
abs_qnames_functions (QNameFunctionDict | None, default: None) – The result of running functions_without_tree_logic on functions.
data_tree (NestedStructureDict | None, default: None) – The tree of input data (typically not used together with input_structure).
input_structure (NestedInputDict | None, default: None) – The structure of inputs (typically not used together with data_tree).
targets (NestedTargetDict | None, default: None) – The tree of targets to be computed.
top_level_namespace (set[str] | list[str] | tuple[str, ...], default: ()) – The set of top-level namespace elements (required for the check regarding repetition of elements).
- Raises:
TrailingUnderscoreError – If the paths in the functions tree are invalid.
RepeatedTopLevelElementError – If the paths in the functions tree are invalid.
- Return type:
None
- dags.tree.fail_if_path_elements_have_trailing_undersores(all_tree_paths)[source]#
Check if any element of the tree path except for the leaf ends with an underscore.
- Parameters:
all_tree_paths – The tree paths.
- Raises:
TrailingUnderscoreError – If any branch of the functions tree ends with an underscore.
- Return type:
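The reason trailing underscores are rejected can be shown with plain string operations: joining a path whose non-leaf element ends in an underscore and splitting it again on '__' moves the extra underscore to the next element. The sketch below demonstrates that and adds a simple check in the spirit of the function above. The names check_no_trailing_underscores and TrailingUnderscoreSketchError are hypothetical stand-ins, and the splitting rule is assumed to behave like str.split.

```python
DELIMITER = "__"


class TrailingUnderscoreSketchError(ValueError):
    """Hypothetical stand-in for dags.tree.TrailingUnderscoreError."""


def check_no_trailing_underscores(all_tree_paths):
    """Hypothetical helper: reject paths whose non-leaf elements end with '_'."""
    invalid = [
        path
        for path in all_tree_paths
        if any(element.endswith("_") for element in path[:-1])
    ]
    if invalid:
        raise TrailingUnderscoreSketchError(f"Invalid tree paths: {invalid}")


# Why the check matters: the round trip through a qualified name is lossy.
original = ("a_", "b")
qname = DELIMITER.join(original)
recovered = tuple(qname.split(DELIMITER))

# A trailing underscore on the leaf is accepted by this sketch.
check_no_trailing_underscores([("a", "b_")])
```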
- dags.tree.fail_if_top_level_elements_repeated_in_paths(all_tree_paths, top_level_namespace)[source]#
Fail if any element of the top-level namespace is repeated elsewhere.
- Parameters:
- Raises:
RepeatedTopLevelElementError – If any element of the top-level namespace is repeated further down in the hierarchy.
- Return type:
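A check of this kind could look like the sketch below, which treats "repeated further down in the hierarchy" as meaning that a top-level name appears at any position after the first element of a path. That reading is an assumption, and the names check_top_level_not_repeated and RepeatedTopLevelSketchError are hypothetical stand-ins.

```python
class RepeatedTopLevelSketchError(ValueError):
    """Hypothetical stand-in for dags.tree.RepeatedTopLevelElementError."""


def check_top_level_not_repeated(all_tree_paths, top_level_namespace):
    """Hypothetical helper: reject top-level names below the first path level."""
    top_level = set(top_level_namespace)
    invalid = [
        path
        for path in all_tree_paths
        if any(element in top_level for element in path[1:])
    ]
    if invalid:
        raise RepeatedTopLevelSketchError(f"Invalid tree paths: {invalid}")


top_level_namespace = {"household", "person"}

# Top-level names used only at the first level are fine.
check_top_level_not_repeated(
    [("household", "income"), ("person", "age")], top_level_namespace
)
```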
Tree Exceptions#
- exception dags.tree.RepeatedTopLevelElementError[source]#
Bases: ValidationError
Raised when top-level elements are repeated in paths.
- exception dags.tree.TrailingUnderscoreError[source]#
Bases: ValidationError
Raised when path elements have trailing underscores.