PathTree class

class pathtreelib.PathTree(root, skip_properties=False)

The PathTree class describes a tree made up of PathNode nodes.

The structure mimics the directory tree and adds analytic functionality.

Parameters:
  • root (str | Path | PathNode) – the root path

  • skip_properties (bool) – if True, skip the computation of the tree’s main properties

root

the root PathNode of the tree

property

the properties of the tree (equivalent to the root properties)

__init__(root, skip_properties=False)

Create a new PathTree, based on the root node.

Parameters:
  • root (str | Path | PathNode) – the root path

  • skip_properties (bool) – if True, skip the computation of the tree’s main properties.

Return type:

None

__iter__()

Return a default iterator on the nodes of the tree.

The order is breadth-first: the iterator is the one returned by breadth_first_iter(). For depth-first order, use the iterator returned by depth_first_iter() instead.

Returns:

An iterator for the nodes of the tree.

Return type:

Iterator
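
The difference between the two orders can be illustrated with a minimal sketch. It does not use pathtreelib: Node, breadth_first and depth_first are hypothetical stand-ins, and the pre-order variant of depth-first traversal is an assumption, since the visiting order inside depth_first_iter() is not specified here.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Iterator


@dataclass
class Node:
    """Hypothetical stand-in for PathNode: a name plus child nodes."""
    name: str
    children: list["Node"] = field(default_factory=list)


def breadth_first(root: Node) -> Iterator[Node]:
    # Visit nodes level by level, using a FIFO queue.
    queue = deque([root])
    while queue:
        node = queue.popleft()
        yield node
        queue.extend(node.children)


def depth_first(root: Node) -> Iterator[Node]:
    # Visit a node, then the whole subtree of each child (pre-order).
    yield root
    for child in root.children:
        yield from depth_first(child)


tree = Node("root", [Node("a", [Node("a1")]), Node("b")])
bfs_order = [n.name for n in breadth_first(tree)]  # root, a, b, a1
dfs_order = [n.name for n in depth_first(tree)]    # root, a, a1, b
```

With a real PathTree, the equivalent choice would be between iterating the tree directly (breadth-first) and iterating depth_first_iter().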

__str__()

Return a string describing the tree properties.

The description is based on the properties of the root.

Returns:

A string describing the tree.

Return type:

str

breadth_first_iter()

Return an iterator on the nodes of the tree using breadth-first order.

Returns:

An iterator for the nodes of the tree, in breadth-first order.

Return type:

Iterator

compute_basic_property(property_name)

Compute a basic property, i.e. one contained in the PathTreeProperty enum.

If the specified property is PathTreeProperty.PRUNED, this function is a no-op.

Parameters:

property_name (PathTreeProperty) – the name of the basic property to compute

Return type:

None

compute_bottom_up_property(property_name, leaf_func, inode_func)

Compute a property of bottom-up type in the tree.

This function calls the namesake function of the PathNode on the root. Check PathNode.compute_bottom_up_property() for more details.

Parameters:
  • property_name (str | PathTreeProperty) – the name of the property (used as key in property dict)

  • leaf_func (Callable[[PathNode], Any]) – the function to compute the property on the leaves

  • inode_func (Callable[[PathNode, list[PathNode]], Any]) – the function to compute the property on the inner nodes (assuming it is already computed on the leaves)

Return type:

None

Parameters of leaf_func:

  • leaf (PathNode): the current node

Parameters of inode_func:

  • inode (PathNode): the current node

  • children (list[PathNode]): the list of children of the current node
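
As an illustration of the bottom-up scheme, the sketch below counts the leaves under each node. It is not the library's implementation: Node, its props dict and compute_bottom_up are hypothetical stand-ins that only mirror the documented shape of leaf_func and inode_func.

```python
from dataclasses import dataclass, field
from typing import Any, Callable


@dataclass
class Node:
    """Hypothetical stand-in for PathNode with a property dict."""
    name: str
    children: list["Node"] = field(default_factory=list)
    props: dict[str, Any] = field(default_factory=dict)


def compute_bottom_up(
    node: Node,
    name: str,
    leaf_func: Callable[[Node], Any],
    inode_func: Callable[[Node, list[Node]], Any],
) -> None:
    # Process the children first, so that inode_func can read their values.
    for child in node.children:
        compute_bottom_up(child, name, leaf_func, inode_func)
    if node.children:
        node.props[name] = inode_func(node, node.children)
    else:
        node.props[name] = leaf_func(node)


tree = Node("root", [Node("a", [Node("a1"), Node("a2")]), Node("b")])
compute_bottom_up(
    tree,
    "leaf_count",
    leaf_func=lambda leaf: 1,
    inode_func=lambda inode, children: sum(c.props["leaf_count"] for c in children),
)
# tree.props["leaf_count"] is now 3 (a1, a2 and b)
```

The same pair of callbacks, with a property name as first argument, is what compute_bottom_up_property() expects.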

compute_individual_property(property_name, property_func)

Compute an individual property in the tree.

This function calls the namesake function of the PathNode on the root. Check PathNode.compute_individual_property() for more details.

Parameters:
  • property_name (str | PathTreeProperty) – the name of the property (used as key in property dict)

  • property_func (Callable[[PathNode], Any]) – the function to compute the property on the node

Parameters of property_func:

  • node: the current node (as PathNode)

compute_top_down_property(property_name, root_func, notroot_func)

Compute a property of top-down type in the tree.

This function calls the namesake function of the PathNode on the root. Check PathNode.compute_top_down_property() for more details.

Parameters:
  • property_name (str | PathTreeProperty) – the name of the property (used as key in property dict)

  • root_func (Callable[[PathNode], Any]) – the function to compute the property on the root

  • notroot_func (Callable[[PathNode, PathNode], Any]) – the function to compute the property on the not-root nodes (assuming it is already computed on the parent)

Return type:

None

Parameters of root_func:

  • root (PathNode): the current node

Parameters of notroot_func:

  • node (PathNode): the current node

  • parent (PathNode): the parent of the current node
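
A typical top-down property is the depth of each node, which depends only on the value already stored on its parent. The following sketch shows the idea under stated assumptions: Node, props and compute_top_down are hypothetical stand-ins, not pathtreelib code.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Optional


@dataclass
class Node:
    """Hypothetical stand-in for PathNode with a property dict."""
    name: str
    children: list["Node"] = field(default_factory=list)
    props: dict[str, Any] = field(default_factory=dict)


def compute_top_down(
    node: Node,
    name: str,
    root_func: Callable[[Node], Any],
    notroot_func: Callable[[Node, Node], Any],
    parent: Optional[Node] = None,
) -> None:
    # Compute the value on the current node before descending, so that
    # every child can rely on the value stored on its parent.
    if parent is None:
        node.props[name] = root_func(node)
    else:
        node.props[name] = notroot_func(node, parent)
    for child in node.children:
        compute_top_down(child, name, root_func, notroot_func, parent=node)


tree = Node("root", [Node("a", [Node("a1")]), Node("b")])
compute_top_down(
    tree,
    "depth",
    root_func=lambda root: 0,
    notroot_func=lambda node, parent: parent.props["depth"] + 1,
)
depths = {n.name: n.props["depth"] for n in (tree, *tree.children, tree.children[0].children[0])}
```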

copy()

Return a deepcopy of the tree and all its nodes.

Returns:

A deepcopy of the tree.

Return type:

PathTree

depth_first_iter()

Return an iterator on the nodes of the tree using depth-first order.

Returns:

An iterator for the nodes of the tree, in depth-first order.

Return type:

Iterator

get_node(path)

Return the PathNode corresponding to the passed Path.

Parameters:

path (str | Path) – the path to search in the tree

Returns:

The PathNode corresponding to the passed Path if it exists, None otherwise.

Return type:

PathNode | None

logical_pruning(keep_condition)

Logically remove all the subtrees whose root does not satisfy the keep condition.

The logical removal is applied using the property PathTreeProperty.PRUNED: if it is True, the node is considered removed.

The tree is scanned in breadth-first order. For each node, the keep condition is checked; if it is not satisfied, the entire corresponding subtree is logically removed from the tree.

Parameters:

keep_condition (Callable[[PathNode], bool]) – the boolean function that assesses whether a node, and its subtree, should be kept or pruned.

Return type:

None

Parameters of keep_condition:

  • node: the node to check.

Return of keep_condition:

  • True if the node (and its subtree) must be kept, False if it must be pruned.
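
The scan described above can be sketched as follows. This is an illustration only: Node, props and the "pruned" key are hypothetical stand-ins for PathNode, its property dict and PathTreeProperty.PRUNED, and setting the flag to False on kept nodes is an assumption.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Any, Callable

PRUNED = "pruned"  # hypothetical key standing in for PathTreeProperty.PRUNED


@dataclass
class Node:
    """Hypothetical stand-in for PathNode with a property dict."""
    name: str
    children: list["Node"] = field(default_factory=list)
    props: dict[str, Any] = field(default_factory=dict)


def mark_subtree(node: Node, value: bool) -> None:
    # Set the flag on the node and on every descendant.
    node.props[PRUNED] = value
    for child in node.children:
        mark_subtree(child, value)


def logical_pruning(root: Node, keep_condition: Callable[[Node], bool]) -> None:
    queue = deque([root])
    while queue:
        node = queue.popleft()
        if keep_condition(node):
            node.props[PRUNED] = False
            queue.extend(node.children)
        else:
            # The whole subtree is flagged; its nodes are not tested again.
            mark_subtree(node, True)


git = Node(".git", [Node("config")])
src = Node("src", [Node("main.py")])
tree = Node("root", [src, git])
logical_pruning(tree, lambda node: not node.name.startswith("."))
```

After the call no node has been detached: the nodes under .git are still reachable, they only carry the flag.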

physical_pruning(keep_condition)

Physically remove all the subtrees whose root does not satisfy the keep condition.

The tree is scanned in breadth-first order. For each node, the keep condition is checked; if it is not satisfied, the entire corresponding subtree is physically removed from the tree.

Note that the properties of the nodes are not recomputed.

Parameters:

keep_condition (Callable[[PathNode], bool]) – the boolean function that assesses whether a node, and its subtree, should be kept or pruned.

Return type:

None

Parameters of keep_condition:

  • node: the node to check.

Return of keep_condition:

  • True if the node (and its subtree) must be kept, False if it must be pruned.
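
The note about properties not being recomputed is easy to overlook, so here is a minimal sketch of its consequence. Node, props, physical_pruning and count_nodes are hypothetical stand-ins rather than library code, and the sketch assumes the root itself is always kept.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Any, Callable


@dataclass
class Node:
    """Hypothetical stand-in for PathNode with a property dict."""
    name: str
    children: list["Node"] = field(default_factory=list)
    props: dict[str, Any] = field(default_factory=dict)


def physical_pruning(root: Node, keep_condition: Callable[[Node], bool]) -> None:
    # Breadth-first scan: detach every child that fails the condition,
    # then continue only into the children that were kept.
    queue = deque([root])
    while queue:
        node = queue.popleft()
        node.children = [c for c in node.children if keep_condition(c)]
        queue.extend(node.children)


def count_nodes(node: Node) -> int:
    return 1 + sum(count_nodes(c) for c in node.children)


tree = Node("root", [Node("src", [Node("main.py")]), Node(".git", [Node("config")])])
tree.props["n_nodes"] = count_nodes(tree)  # 5, computed before pruning

physical_pruning(tree, lambda node: not node.name.startswith("."))
remaining = count_nodes(tree)   # 3 nodes are left in the tree
stale = tree.props["n_nodes"]   # still 5: the stored value was not updated
```

Any property that depends on the pruned nodes therefore has to be computed again explicitly after the pruning.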

remove_property(property_name)

Remove a property from all the nodes in the tree.

If the property is missing from one node, no exception is raised.

Parameters:

property_name (str | PathTreeProperty) – the name of the property to remove.

Returns:

  • True if the property appeared in all nodes, False otherwise

  • True if the property appeared in at least one node, False otherwise

Return type:

A tuple containing two booleans
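
The meaning of the two booleans can be shown with a small sketch that works on plain dicts instead of PathNode objects. The function below is a hypothetical stand-in, not the library's code.

```python
from typing import Any, Iterable


def remove_property(prop_dicts: Iterable[dict[str, Any]], name: str) -> tuple[bool, bool]:
    """Remove `name` from every dict; return (in_all, in_any)."""
    in_all, in_any = True, False
    for props in prop_dicts:
        if name in props:
            del props[name]
            in_any = True
        else:
            # A missing property is not an error, it only clears in_all.
            in_all = False
    return in_all, in_any


nodes = [{"size": 10, "depth": 0}, {"size": 4, "depth": 1}, {"depth": 1}]
size_result = remove_property(nodes, "size")    # present in some nodes only
depth_result = remove_property(nodes, "depth")  # present in every node
```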

to_csv(csvfile, properties=None, node_condition=<function PathTree.<lambda>>, node_limit=1000000)

Export all nodes of the tree to a CSV file.

The export includes the name of the path and a list of properties. Because a directory tree can contain a very large number of nodes, the export is limited by default to the first 1 million nodes.

Parameters:
  • csvfile (Path | str) – the name of the CSV file for the export.

  • properties (list[str | PathTreeProperty] | None) – the list of properties to include in the export. If None, all properties are included.

  • node_condition (Callable[[PathNode], bool]) – the condition a node must meet to be exported (by default all nodes are exported).

  • node_limit (int) – the max number of nodes that can be exported. If <= 0, no limitation is applied.

Return type:

None
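
How node_condition and node_limit interact can be sketched with the standard csv module. Everything here is an assumption made for illustration: export_nodes is a hypothetical helper, nodes are represented as (path, properties) pairs, the header name "path" is invented, and the limit is applied after the condition.

```python
import csv
import io
from itertools import islice
from typing import Any, Callable, Iterable, TextIO

NodeRow = tuple[str, dict[str, Any]]  # hypothetical (path, properties) pair


def export_nodes(
    nodes: Iterable[NodeRow],
    out: TextIO,
    properties: list[str],
    node_condition: Callable[[NodeRow], bool] = lambda node: True,
    node_limit: int = 1_000_000,
) -> int:
    writer = csv.writer(out)
    writer.writerow(["path", *properties])
    selected = (node for node in nodes if node_condition(node))
    if node_limit > 0:
        # A limit <= 0 means that no limitation is applied.
        selected = islice(selected, node_limit)
    count = 0
    for path, props in selected:
        # Missing properties are written as empty cells.
        writer.writerow([path, *(props.get(p, "") for p in properties)])
        count += 1
    return count


nodes = [
    ("root", {"size": 10}),
    ("root/a.txt", {"size": 4}),
    ("root/b.txt", {"size": 6}),
    ("root/c.txt", {}),
]
buffer = io.StringIO()
exported = export_nodes(
    nodes, buffer, ["size"],
    node_condition=lambda node: node[0] != "root",
    node_limit=2,
)
lines = buffer.getvalue().splitlines()
```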

to_excel(excelfile, properties=None, node_condition=<function PathTree.<lambda>>, node_limit=1000000)

Export all nodes of the tree to an Excel file.

The export includes the name of the path and a list of properties. Because a directory tree can contain a very large number of nodes, the export is limited by default to the first 1 million nodes.

Parameters:
  • excelfile (Path | str) – the name of the Excel file for the export.

  • properties (list[str | PathTreeProperty] | None) – the list of properties to include in the export. If None, all properties are included.

  • node_condition (Callable[[PathNode], bool]) – the condition a node must meet to be exported (by default all nodes are exported).

  • node_limit (int) – the max number of nodes that can be exported. If <= 0, no limitation is applied.

Return type:

None

validated_iter(valid_func)

Return an iterator on filtered nodes of the tree.

Nodes that do not satisfy the condition are excluded, together with their subtrees.

Parameters:

valid_func (Callable[[PathNode], bool]) – the criterion used to keep nodes.

Returns:

An iterator that excludes invalid nodes and their subtrees.

Return type:

Iterator

Parameters of valid_func:

  • node (PathNode): the node to test

Return of valid_func:

  • True if the node is acceptable, False otherwise.
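
The key point is that a rejected node hides its whole subtree, even if some descendants would pass the test on their own. A minimal sketch, assuming a depth-first visiting order (the order used by validated_iter() is not stated here) and using a hypothetical Node stand-in:

```python
from dataclasses import dataclass, field
from typing import Callable, Iterator


@dataclass
class Node:
    """Hypothetical stand-in for PathNode: a name plus child nodes."""
    name: str
    children: list["Node"] = field(default_factory=list)


def validated_iter(root: Node, valid_func: Callable[[Node], bool]) -> Iterator[Node]:
    # An invalid node stops the descent: its children are never visited.
    if not valid_func(root):
        return
    yield root
    for child in root.children:
        yield from validated_iter(child, valid_func)


tree = Node("root", [Node("a", [Node("a1")]), Node("tmp", [Node("keep_me")])])
names = [n.name for n in validated_iter(tree, lambda node: node.name != "tmp")]
```

In this example keep_me satisfies the condition but is still skipped, because its parent tmp does not.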