Embedders#
Base Embedder#
Base Graph Embedder#
- class udao.model.embedders.base_graph_embedder.BaseGraphEmbedder(net_params: Params)#
Bases: BaseEmbedder, ABC
Base class for embedder networks. Takes care of preparing the input features for the embedding layer and of normalizing the output embedding.
- Parameters:
net_params (EmbedderParams) – Parameters of the embedder network.
- class Params(output_size: int, input_size: int, n_op_types: int | None, op_groups: Sequence[str], type_embedding_dim: int | None, embedding_normalizer: Literal['BN', 'LN', 'IsoBN'] | None)#
Bases: Params
- embedding_normalizer: Literal['BN', 'LN', 'IsoBN'] | None#
Name of the normalizer to use for the output embedding.
- input_size: int#
The size of the input features, except for the type of operation. If type is provided, the input size is increased at init by the type embedding dimension.
- n_op_types: int | None#
The number of operation types.
- op_groups: Sequence[str]#
The groups of operation features to be included in the embedding.
- type_embedding_dim: int | None#
The dimension of the operation type embedding.
- concatenate_op_features(g: DGLGraph) → Tensor#
Concatenate the operation features into a single tensor.
- Parameters:
g (dgl.DGLGraph) – Input graph
- Returns:
output tensor of shape (num_nodes, input_size)
- Return type:
th.Tensor
- normalize_embedding(embedding: Tensor) → Tensor#
Normalizes the embedding.
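Since BaseGraphEmbedder is abstract, a concrete subclass typically calls the two helpers above from its forward pass. Below is a minimal sketch; the class name MeanPoolEmbedder, the dgl.mean_nodes pooling, and the assumption that the type embedding is disabled (so the Linear input size equals input_size) are illustrative, not part of the library:

```python
import dgl
import torch as th
import torch.nn as nn

from udao.model.embedders.base_graph_embedder import BaseGraphEmbedder


class MeanPoolEmbedder(BaseGraphEmbedder):
    """Toy embedder: project concatenated op features, then mean-pool them."""

    def __init__(self, net_params: BaseGraphEmbedder.Params) -> None:
        super().__init__(net_params)
        # If the "type" op group is used, the effective input size is
        # input_size + type_embedding_dim (see the input_size doc above);
        # this sketch assumes no type embedding to keep the sizes simple.
        self.proj = nn.Linear(net_params.input_size, net_params.output_size)

    def forward(self, g: dgl.DGLGraph) -> th.Tensor:
        h = self.concatenate_op_features(g)   # (num_nodes, input_size)
        g.ndata["h"] = self.proj(h)           # per-node embeddings
        out = dgl.mean_nodes(g, "h")          # (num_graphs, output_size)
        return self.normalize_embedding(out)  # applies embedding_normalizer, if set
```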
Graph Averager#
- class udao.model.embedders.graph_averager.GraphAverager(net_params: Params)#
Bases: BaseGraphEmbedder
Averager Embedder network. Computes an embedding for each operation using a linear layer, then averages the embeddings of all operations in the graph.
- class Params(output_size: int, input_size: int, n_op_types: int | None, op_groups: Sequence[str], type_embedding_dim: int | None, embedding_normalizer: Literal['BN', 'LN', 'IsoBN'] | None)#
Bases: Params
- embedding_normalizer: Literal['BN', 'LN', 'IsoBN'] | None#
Name of the normalizer to use for the output embedding.
- input_size: int#
The size of the input features, except for the type of operation. If type is provided, the input size is increased at init by the type embedding dimension.
- n_op_types: int | None#
The number of operation types.
- op_groups: Sequence[str]#
The groups of operation features to be included in the embedding.
- output_size: int#
The size of the output embedding.
- type_embedding_dim: int | None#
The dimension of the operation type embedding.
- forward(g: DGLGraph) → Tensor#
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
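A minimal usage sketch for GraphAverager. The op_groups values and the node feature names ("op_gid", "cbo") are assumptions about the feature extraction pipeline, not part of the documented API; the sizes are illustrative:

```python
import dgl
import torch as th

from udao.model.embedders.graph_averager import GraphAverager

params = GraphAverager.Params(
    output_size=32,
    input_size=8,               # per-node feature size, before the type embedding
    n_op_types=10,
    op_groups=["type", "cbo"],  # assumed group names; check your feature extractor
    type_embedding_dim=4,
    embedding_normalizer=None,  # or "BN" / "LN" / "IsoBN"
)
embedder = GraphAverager(params)

# Toy two-node plan graph; the node feature names below are assumptions.
g = dgl.graph(([0], [1]))
g.ndata["op_gid"] = th.tensor([0, 1])  # assumed: operation type ids
g.ndata["cbo"] = th.randn(2, 8)        # assumed: numeric operation features

embedding = embedder(g)                # expected shape: (1, 32)
```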
Graph Transformer#
- class udao.model.embedders.graph_transformer.GraphTransformer(net_params: Params)#
Bases: BaseGraphEmbedder
Graph Transformer network. Computes a graph embedding using an attention mechanism (QF, RAAL, or GTN).
- class Params(output_size: int, input_size: int, n_op_types: int | None, op_groups: Sequence[str], type_embedding_dim: int | None, embedding_normalizer: Literal['BN', 'LN', 'IsoBN'] | None, pos_encoding_dim: int, n_layers: int, n_heads: int, hidden_dim: int, readout: Literal['sum', 'max', 'mean'], max_dist: int | None = None, non_siblings_map: Dict[int, Dict[int, List[int]]] | None = None, attention_layer_name: Literal['QF', 'GTN', 'RAAL'] = 'GTN', dropout: float = 0.0, residual: bool = True, use_bias: bool = False, batch_norm: bool = True, layer_norm: bool = False)#
Bases: Params
- attention_layer_name: Literal['QF', 'GTN', 'RAAL'] = 'GTN'#
Defines which attention layer to use (QF, RAAL, or GTN).
- batch_norm: bool = True#
Whether to use batch normalization. Defaults to True.
- dropout: float = 0.0#
Dropout probability.
- hidden_dim: int#
Size of the hidden layers' outputs.
- layer_norm: bool = False#
Whether to use layer normalization. Defaults to False.
- max_dist: int | None = None#
Maximum distance for QF attention.
- n_heads: int#
Number of attention heads.
- n_layers: int#
Number of GCN layers.
- non_siblings_map: Dict[int, Dict[int, List[int]]] | None = None#
Non-siblings map for RAAL attention. For each type of graph, maps the edge id to all nodes that are not siblings of its source node.
- pos_encoding_dim: int#
Dimension of the position encoding.
- readout: Literal['sum', 'max', 'mean']#
Readout type: how the node embeddings are aggregated to form the graph embedding.
- residual: bool = True#
Whether to make the layer residual. Defaults to True.
- use_bias: bool = False#
Whether to use bias in the attention layer. Defaults to False.
- forward(g: DGLGraph) → Tensor#
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
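A configuration sketch for the transformer embedder. All sizes are illustrative, and the op_groups values are assumptions:

```python
from udao.model.embedders.graph_transformer import GraphTransformer

params = GraphTransformer.Params(
    output_size=64,
    input_size=8,
    n_op_types=10,
    op_groups=["type", "cbo"],   # assumed group names
    type_embedding_dim=4,
    embedding_normalizer=None,
    pos_encoding_dim=8,
    n_layers=2,
    n_heads=4,                   # the layer requires out_dim divisible by n_heads
    hidden_dim=64,               # (see GraphTransformerLayer below); 64 % 4 == 0
    readout="mean",
    attention_layer_name="GTN",  # "QF" also needs max_dist; "RAAL" needs non_siblings_map
)
embedder = GraphTransformer(params)
# embedding = embedder(batched_graph)  # -> (num_graphs, output_size)
```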
Embedder layers#
Graph MultiHead Attention#
- class udao.model.embedders.layers.multi_head_attention.MultiHeadAttentionLayer(in_dim: int, out_dim: int, n_heads: int, use_bias: bool)#
Bases: Module
Multi-Head Attention Layer for graphs, proposed in “A Generalization of Transformer Networks to Graphs”, DLG-AAAI’21. https://arxiv.org/pdf/2012.09699.pdf
- Parameters:
in_dim (int) – Input dimension
out_dim (int) – Output dimension
n_heads (int) – Number of attention heads
use_bias (bool) – Whether to use bias
- compute_attention(g: DGLGraph) → DGLGraph#
Simple attention mechanism.
- compute_query_key_value(g: DGLGraph, h: Tensor) → DGLGraph#
Compute the query, key, and value for each node.
- forward(g: DGLGraph, h: Tensor) → Tensor#
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
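The layer can be exercised standalone on a toy graph. A sketch; the per-head output shape in the comment is an assumption based on the referenced DLG-AAAI’21 architecture:

```python
import dgl
import torch as th

from udao.model.embedders.layers.multi_head_attention import MultiHeadAttentionLayer

layer = MultiHeadAttentionLayer(in_dim=16, out_dim=4, n_heads=4, use_bias=False)

g = dgl.graph(([0, 1, 2], [1, 2, 0]))  # small directed cycle
h = th.randn(g.num_nodes(), 16)        # node features of size in_dim
out = layer(g, h)                      # assumed shape: (num_nodes, n_heads, out_dim)
```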
- class udao.model.embedders.layers.multi_head_attention.QFMultiHeadAttentionLayer(in_dim: int, out_dim: int, n_heads: int, use_bias: bool, attention_bias: Tensor)#
Bases: MultiHeadAttentionLayer
MultiHead Attention using QueryFormer, proposed in “QueryFormer: A Tree Transformer Model for Query Plan Representation” https://www.vldb.org/pvldb/vol15/p1658-zhao.pdf
The QF MultiHead Attention Layer requires the graphs to have a “dist” edge feature. See examples/data/spark/4.qf_addition.
- Parameters:
attention_bias (torch.Tensor) – Bias used in the attention computation (see compute_attention).
- compute_attention(g: DGLGraph) → DGLGraph#
Attention mechanism with attention bias.
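A sketch of satisfying the “dist” requirement on a toy graph. The distance values and the attention_bias shape are assumptions; only the “dist” edge feature requirement comes from the documentation above:

```python
import dgl
import torch as th

from udao.model.embedders.layers.multi_head_attention import QFMultiHeadAttentionLayer

# Toy 3-node chain; QF attention reads a per-edge "dist" feature.
g = dgl.graph(([0, 1], [1, 2]))
g.edata["dist"] = th.tensor([1, 1])  # assumed: distance between edge endpoints

layer = QFMultiHeadAttentionLayer(
    in_dim=16,
    out_dim=4,
    n_heads=4,
    use_bias=False,
    attention_bias=th.zeros(8),      # assumed shape: one bias per distance bucket
)
h = th.randn(g.num_nodes(), 16)
out = layer(g, h)                    # attention scores are shifted by the bias
```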
- class udao.model.embedders.layers.multi_head_attention.RAALMultiHeadAttentionLayer(in_dim: int, out_dim: int, n_heads: int, use_bias: bool, non_siblings_map: Dict[int, Dict[int, List[int]]])#
Bases: MultiHeadAttentionLayer
MultiHead Attention using the Resource-Aware Attentional LSTM proposed in “A Resource-Aware Deep Cost Model for Big Data Query Processing” https://ieeexplore.ieee.org/document/9835426
The RAAL MultiHead Attention Layer requires the graphs to have an “sid” node feature, corresponding to the id of the template graph. This links each graph to its entry in the non_siblings_map.
- Parameters:
non_siblings_map (Dict[int, Dict[int, List[int]]]) – For each type of graph, maps the edge id to all nodes that are not siblings of its source node.
- compute_attention(g: DGLGraph) → DGLGraph#
Attention mechanism with non-siblings attention.
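A sketch of the expected inputs. The map values are toy data; only the “sid” node feature and the map’s nesting (template graph id → edge id → node ids) come from the documentation above:

```python
import dgl
import torch as th

from udao.model.embedders.layers.multi_head_attention import RAALMultiHeadAttentionLayer

# For each template graph id, map each edge id to the node ids that are
# not siblings of the edge's source node (toy values below).
non_siblings_map = {0: {0: [2], 1: [1]}}

layer = RAALMultiHeadAttentionLayer(
    in_dim=16, out_dim=4, n_heads=4, use_bias=False,
    non_siblings_map=non_siblings_map,
)

g = dgl.graph(([0, 0], [1, 2]))  # one source node with two children
g.ndata["sid"] = th.zeros(g.num_nodes(), dtype=th.long)  # all nodes from template 0
h = th.randn(g.num_nodes(), 16)
out = layer(g, h)
```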
- udao.model.embedders.layers.multi_head_attention.e_scaled_exp(field: str, scale_constant: float) → Callable[[Any], Dict[str, Tensor]]#
Compute the scaled exponential of a graph’s edge field.
- udao.model.embedders.layers.multi_head_attention.e_src_dot_dst(src_field: str, dst_field: str, out_field: str) → Callable[[Any], Dict[str, Tensor]]#
Multiply source and destination node features and sum them up (a per-edge dot product).
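These builders return DGL edge UDFs, so they are typically passed to DGLGraph.apply_edges. A sketch, assuming (as in the DLG-AAAI’21 reference implementation) that e_scaled_exp writes its result back to the same field; the “K_h”/“Q_h”/“score” field names are conventions, not requirements:

```python
import math

import dgl
import torch as th

from udao.model.embedders.layers.multi_head_attention import (
    e_scaled_exp,
    e_src_dot_dst,
)

g = dgl.graph(([0, 1], [1, 2]))
g.ndata["K_h"] = th.randn(g.num_nodes(), 4)  # per-node keys (assumed field name)
g.ndata["Q_h"] = th.randn(g.num_nodes(), 4)  # per-node queries (assumed field name)

# Per-edge dot product of source keys and destination queries...
g.apply_edges(e_src_dot_dst("K_h", "Q_h", "score"))
# ...then the scaled exponential (softmax numerator), scaled by sqrt(d).
g.apply_edges(e_scaled_exp("score", math.sqrt(4)))
attention = g.edata["score"]
```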
Graph Transformer layer#
- class udao.model.embedders.layers.graph_transformer_layer.GraphTransformerLayer(in_dim: int, out_dim: int, n_heads: int, dropout: float = 0.0, layer_norm: bool = False, batch_norm: bool = True, residual: bool = True, use_bias: bool = False, attention_layer_name: Literal['QF', 'GTN', 'RAAL'] = 'GTN', attention_bias: Tensor | None = None, non_siblings_map: Dict[int, Dict[int, List[int]]] | None = None)#
Bases: Module
Graph Transformer Layer that applies multi-head attention and a feed-forward network to an input graph, as proposed in “A Generalization of Transformer Networks to Graphs”, DLG-AAAI’21. https://arxiv.org/pdf/2012.09699.pdf
- Parameters:
in_dim (int) – Dimension of the input tensor
out_dim (int) – Dimension of the output tensor
n_heads (int) – Number of attention heads
dropout (float, optional) – Probability of dropout to apply to attention output, by default 0.0
layer_norm (bool, optional) – Whether to apply layer normalization, by default False
batch_norm (bool, optional) – Whether to apply batch normalization, by default True
residual (bool, optional) – Whether to make a residual connection, by default True
use_bias (bool, optional) – Whether to use bias, by default False
attention_layer_name (AttentionLayerName, optional) – Choice of the kind of attention layer, by default “GTN”
attention_bias (Optional[th.Tensor], optional) – Attention bias for the QF attention layer, by default None
non_siblings_map (Optional[Dict[int, Dict[int, List[int]]]], optional) – For each type of graph, maps the edge id to all nodes that are not siblings of its source node
- Raises:
ValueError – Unknown attention type
ValueError – out_dim must be divisible by n_heads
- forward(g: DGLGraph, h: Tensor) → Tensor#
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
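A minimal sketch with the defaults (GTN attention, batch norm, residual connection). out_dim is kept equal to in_dim so the residual connection lines up, and divisible by n_heads as required by the Raises section above; sizes are illustrative:

```python
import dgl
import torch as th

from udao.model.embedders.layers.graph_transformer_layer import GraphTransformerLayer

layer = GraphTransformerLayer(in_dim=32, out_dim=32, n_heads=4)  # 32 % 4 == 0

g = dgl.graph(([0, 1, 2], [1, 2, 0]))  # small directed cycle
h = th.randn(g.num_nodes(), 32)        # node features of size in_dim
h_out = layer(g, h)                    # expected shape: (num_nodes, out_dim)
```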