Containers

Containers are the data structures returned by the extractors and consumed by the iterators. They carry data through udao’s processing pipeline.

Base Container

class udao.data.containers.base_container.BaseContainer

Bases: ABC

Base class for containers. Containers are used to store and retrieve data from a dataset, based on a key.
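
The key-based access pattern described above can be sketched roughly as follows. This sketch does not import udao: the `KeyedContainer` and `DictContainer` classes and the `get` method name are assumptions made for illustration, not a confirmed part of udao's API.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict


class KeyedContainer(ABC):
    """Hypothetical stand-in for a container base class (not udao's code)."""

    @abstractmethod
    def get(self, key: str) -> Any:
        """Return the data stored under `key` (method name is an assumption)."""


class DictContainer(KeyedContainer):
    """Toy subclass that keeps its data in a plain dict."""

    def __init__(self, data: Dict[str, Any]) -> None:
        self._data = data

    def get(self, key: str) -> Any:
        # Look up the entry for this key; raises KeyError if it is missing.
        return self._data[key]


container = DictContainer({"query_1": [0.1, 0.2], "query_2": [0.3, 0.4]})
first = container.get("query_1")
```

Because the base class is abstract, it cannot be instantiated directly; only subclasses that implement the lookup can.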

Query Structure Container

class udao.data.containers.query_structure_container.QueryDescription(template_id: int, template_graph: dgl.heterograph.DGLGraph, graph_features: Iterable, meta_features: Iterable | None, operation_types: Iterable)

Bases: object

class udao.data.containers.query_structure_container.QueryStructureContainer(graph_features: DataFrame, graph_meta_features: DataFrame | None, template_plans: Dict[int, QueryPlanStructure], key_to_template: Dict[str, int], operation_types: Series)

Bases: BaseContainer

Container for the query structure and features of a query plan.

graph_features: DataFrame

Stores the features of the operations in the query plan.

graph_meta_features: DataFrame | None

Stores the meta features of the operations in the query plan.

key_to_template: Dict[str, int]

Maps a key to a template id.

operation_types: Series

Stores the operation types of the operations in the query plan.

template_plans: Dict[int, QueryPlanStructure]

Maps a template id to a QueryPlanStructure.
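
The two mappings combine into a two-step lookup: a key resolves to a template id, and the template id resolves to a plan structure. The sketch below illustrates this with plain dictionaries. The `resolve_plan` helper, the sample keys, and the lists of operator names standing in for QueryPlanStructure objects are all hypothetical.

```python
from typing import Dict, List, Tuple

# Placeholders for QueryPlanStructure objects, whose contents this page does
# not describe; each template is just a list of operator names here.
template_plans: Dict[int, List[str]] = {
    0: ["Scan", "Filter"],
    1: ["Scan", "Scan", "Join"],
}

# Several query keys can share one template (sample keys are made up).
key_to_template: Dict[str, int] = {"q1": 0, "q2": 1, "q3": 0}


def resolve_plan(key: str) -> Tuple[int, List[str]]:
    """Resolve a query key to its template id and that template's plan."""
    template_id = key_to_template[key]
    return template_id, template_plans[template_id]


template_id, plan = resolve_plan("q3")
```

Storing plans per template rather than per key means queries that share a template reuse one plan structure, while their features stay keyed individually.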

Dataframe Container

class udao.data.containers.tabular_container.TabularContainer(data: DataFrame)

Bases: BaseContainer

Container for tabular data, stored in DataFrame format.
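
A minimal sketch of a DataFrame-backed container, assuming rows are indexed by key and that a lookup returns the row's values as an array. `TabularSketch`, its `get` method, and the sample column names are assumptions for illustration and may differ from udao's actual TabularContainer.

```python
from dataclasses import dataclass

import numpy as np
import pandas as pd


@dataclass
class TabularSketch:
    """Hypothetical stand-in for a DataFrame-backed container (not udao's code)."""

    data: pd.DataFrame

    def get(self, key: str) -> np.ndarray:
        # Assumption: the DataFrame index holds the keys, one row per key.
        return self.data.loc[key].to_numpy()


# Made-up sample data: two keys, two numeric feature columns.
frame = pd.DataFrame(
    {"rows": [100.0, 250.0], "size": [1.5, 3.0]},
    index=["q1", "q2"],
)
container = TabularSketch(frame)
row = container.get("q2")
```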