deepsnap package¶

Submodules¶

deepsnap.batch module¶

class deepsnap.batch.Batch(batch=None, **kwargs)[source]¶

Bases: deepsnap.graph.Graph

A plain old Python object modeling a batch of deepsnap.graph.Graph objects as one big (disconnected) graph. Because deepsnap.graph.Graph is the base class, all of its methods can also be used here. In addition, single graphs can be reconstructed via the assignment vector batch, which maps each node to its respective graph identifier.
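The role of the assignment vector can be illustrated without DeepSNAP itself. The following is a minimal pure-Python sketch, not the library's implementation: the helper names make_batch and unbatch are hypothetical, and each graph is represented only by a list of per-node values.

```python
from typing import List, Tuple


def make_batch(graphs: List[List[float]]) -> Tuple[List[float], List[int]]:
    """Concatenate per-graph node values and record each node's graph id."""
    features, batch = [], []
    for graph_id, node_values in enumerate(graphs):
        features.extend(node_values)
        batch.extend([graph_id] * len(node_values))
    return features, batch


def unbatch(features: List[float], batch: List[int]) -> List[List[float]]:
    """Rebuild the per-graph lists from the assignment vector."""
    num_graphs = max(batch) + 1 if batch else 0
    graphs = [[] for _ in range(num_graphs)]
    for value, graph_id in zip(features, batch):
        graphs[graph_id].append(value)
    return graphs


graphs = [[0.1, 0.2], [0.3], [0.4, 0.5, 0.6]]
features, batch = make_batch(graphs)
restored = unbatch(features, batch)
```

Here batch holds one graph identifier per node, which is all that is needed to recover the original grouping.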

apply_transform(transform, update_tensor: bool = True, update_graph: bool = False, deep_copy: bool = False, **kwargs)[source]¶

Applies a transformation to each graph object in parallel by first calling to_data_list, applying the transform, and then re-batching the results into a Batch. A transform should edit the graph object, which includes changing the graph structure and adding node, edge, or graph attributes. The rest is handled automatically by the deepsnap.graph.Graph object, including every attribute whose name ends with index.

Parameters

transform – Transformation function applied to each graph object.
update_tensor – Whether to use the nx graph to update tensor attributes.
update_graph – Whether to use tensor attributes to update the nx graphs.
deep_copy – True if a new deep copy of batch is returned. This option allows modifying the batch of graphs without changing the graphs in the original dataset.
kwargs – Parameters used in transform function in deepsnap.graph.Graph objects.

apply_transform_batched(transform)[source]¶

Applies a user-customized transform that operates directly on the batched graphs (expert-only).

Parameters: transform – Transformation function applied to each graph object.

static collate(follow_batch=[], transform=None, **kwargs)[source]¶

static from_data_list(data_list: List[deepsnap.graph.Graph], follow_batch: List = None, transform: Callable = None, **kwargs)[source]¶

Constructs a deepsnap.batch.Batch object from a Python list of deepsnap.graph.Graph objects. The assignment vector batch is created on the fly. Additionally, creates assignment batch vectors for each key in follow_batch.

Parameters

data_list (list) – A list of deepsnap.graph.Graph objects.
follow_batch (list, optional) – Creates assignment batch vectors for each key.
transform – Transform to apply when batching, if any.
**kwargs – Other parameters.

property num_graphs¶

Returns the number of graphs in the batch.

Returns: The number of graphs in the batch.
Return type: int

to_data_list()[source]¶: Reconstructs the list of deepsnap.graph.Graph objects from the batch object. The batch object must have been created via from_data_list() in order to be able to reconstruct the initial objects.

deepsnap.dataset module¶

class deepsnap.dataset.EnsembleGenerator(generators, gen_prob=None, dataset_len=0)[source]¶

Bases: deepsnap.dataset.Generator

generate(**kwargs)[source]¶

Generate a list of graphs.

Returns: Generated list of deepsnap.graph.Graph objects.
Return type: list

property num_edge_labels¶

Returns the number of edge labels in the generated graphs.

Returns: The number of edge labels.
Return type: int

property num_edges¶

Returns the number of edges in each generated graph.

Returns: List of the number of edges.
Return type: list

property num_graph_labels¶

Returns the number of graph labels in the generated graphs.

Returns: The number of graph labels.
Return type: int

property num_node_labels¶

Returns the number of node labels in the generated graphs.

Returns: The number of node labels.
Return type: int

property num_nodes¶

Returns the number of nodes in each generated graph.

Returns: List of the number of nodes.
Return type: list

class deepsnap.dataset.Generator(sizes, size_prob=None, dataset_len=0)[source]¶

Bases: object

Abstract base class for the on-the-fly generators used in a dataset. It generates graphs on the fly to be fed into the model.

generate()[source]¶: Override in subclass. Generates and returns a Graph object.
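The override pattern can be sketched as follows. This is only an illustration under stated assumptions: GeneratorSketch is a hypothetical stand-in for deepsnap.dataset.Generator (it is not the real class), the interpretation of sizes and size_prob as candidate graph sizes and their sampling weights is assumed, and a plain dictionary stands in for the returned Graph object.

```python
import random


class GeneratorSketch:
    """Hypothetical stand-in for deepsnap.dataset.Generator (not the real class)."""

    def __init__(self, sizes, size_prob=None, dataset_len=0):
        self.sizes = sizes            # candidate graph sizes (assumed meaning)
        self.size_prob = size_prob    # sampling weight per size; None means uniform
        self.dataset_len = dataset_len

    def set_len(self, dataset_len):
        self.dataset_len = dataset_len

    def generate(self):
        raise NotImplementedError("Override in subclass.")


class PathGenerator(GeneratorSketch):
    """Generates path graphs, represented here as plain dictionaries."""

    def __init__(self, sizes, size_prob=None, dataset_len=0, seed=0):
        super().__init__(sizes, size_prob, dataset_len)
        self.rng = random.Random(seed)

    def generate(self):
        # Pick a size, then connect consecutive nodes to form a path.
        num_nodes = self.rng.choices(self.sizes, weights=self.size_prob, k=1)[0]
        edges = [(i, i + 1) for i in range(num_nodes - 1)]
        return {"num_nodes": num_nodes, "edges": edges}


generator = PathGenerator(sizes=[3, 5], size_prob=[0.5, 0.5])
graph = generator.generate()
```

A real subclass would return a deepsnap.graph.Graph from generate() instead of a dictionary.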

property num_edge_labels¶

property num_edges¶

property num_graph_labels¶

property num_node_labels¶

property num_nodes¶

set_len(dataset_len)[source]¶

class deepsnap.dataset.GraphDataset(graphs, task: str = 'node', edge_negative_sampling_ratio: float = 1, edge_message_ratio: float = 0.8, edge_train_mode: str = 'all', edge_split_mode: str = 'exact', minimum_node_per_graph: int = 5, generator=None)[source]¶

Bases: object

A plain python object modeling a list of Graph with various (optional) attributes.

Parameters

graphs (list) – A list of Graph.
task (str) – Task this GraphDataset is used for (task = ‘node’ or ‘edge’ or ‘link_pred’ or ‘graph’).
edge_negative_sampling_ratio (float) – The ratio of the number of negative samples to the number of positive samples.
edge_message_ratio (float) – The number of training edge objectives compared to that of message-passing edges.
edge_train_mode (str) – Which edges serve as training objectives: ‘all’ (training edge objectives are the same as the message-passing edges), ‘disjoint’ (training edge objectives are different from the message-passing edges), or ‘train_only’ (training edge objectives are always the training set edges).
minimum_node_per_graph (int) – If the number of nodes of a graph is smaller than this, that graph will be filtered out.
generator (deepsnap.dataset.Generator) – The dataset can be generated on the fly. When using an on-the-fly generator, graphs is [] or None, and a generator (Generator) with an overridden generate() method is provided.

apply_transform(transform, update_tensor: bool = True, update_graph: bool = False, deep_copy: bool = False, **kwargs)[source]¶

Applies a transformation to each graph object in parallel by first calling to_data_list, applying the transform, and then re-batching the results into a GraphDataset.

Parameters

transform – user-defined transformation function.
update_tensor – whether to use the nx graph to update tensor attributes.
kwargs – parameters used in transform function in Graph object.

filter(filter_fn, deep_copy: bool = False, **kwargs)[source]¶

Filter the dataset, discarding graph data G where filter_fn(G) is False.

GraphDataset.apply_transform is an analog of python map in graph dataset, while GraphDataset.filter is an analog of python filter.

Parameters

filter_fn – user-defined filter function that returns True (keep) or False (discard) for each graph object in this dataset.
deep_copy – whether to deep copy all graph objects in the returned list.
kwargs – parameters used in the filter function.

Returns

A new dataset where graphs are filtered by the given filter function.
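The map/filter analogy can be made concrete with plain Python. The sketch below is not DeepSNAP code: graphs are stood in for by dictionaries, and filter_graphs, transform_graphs, has_min_nodes and add_density are hypothetical helpers chosen for illustration.

```python
def filter_graphs(graphs, filter_fn, **kwargs):
    """Keep only the graphs for which filter_fn returns True (the filter analog)."""
    return [g for g in graphs if filter_fn(g, **kwargs)]


def transform_graphs(graphs, transform, **kwargs):
    """Apply transform to a shallow copy of every graph (the map analog)."""
    return [transform(dict(g), **kwargs) for g in graphs]


def has_min_nodes(graph, min_nodes=5):
    return graph["num_nodes"] >= min_nodes


def add_density(graph):
    # Density of an undirected simple graph: edges / possible node pairs.
    n = graph["num_nodes"]
    graph["density"] = graph["num_edges"] / (n * (n - 1) / 2)
    return graph


dataset = [
    {"num_nodes": 3, "num_edges": 2},
    {"num_nodes": 5, "num_edges": 4},
    {"num_nodes": 8, "num_edges": 14},
]
kept = filter_graphs(dataset, has_min_nodes, min_nodes=5)
transformed = transform_graphs(kept, add_density)
```

Copying each graph before transforming it mirrors the intent of deep_copy: the original dataset is left untouched.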

static list_to_graphs(G_list) → List[deepsnap.graph.Graph][source]¶

Transforms a list of NetworkX graph objects into a list of Graph objects.

Parameters: G_list – a list of NetworkX graph objects.
Returns: A list of deepsnap.graph.Graph objects.
Return type: list

num_dims_dict()[source]¶

Dimensions for all fields.

Returns

A mapping from the name of each property to its dimension.

e.g. ‘node_feature’ -> feature dimension; ‘graph_label’ -> label dimension

Return type

dict

property num_edge_features¶

Returns edge feature dimension in the graph.

Returns: The number of features per edge in the dataset.
Return type: int

property num_edge_labels¶

Returns the number of edge labels in the graph.

Returns: The number of labels per edge in the dataset.
Return type: int

property num_edges¶

Returns the number of edges in each graph of the graph list.

Returns: A list of the number of edges for each graph in the graph list.
Return type: list

property num_graph_features¶

Returns graph feature dimension in the graph.

Returns: The number of features per graph in the dataset.
Return type: int

property num_graph_labels¶

Returns the number of graph labels in the graph.

Returns: The number of labels per graph in the dataset.
Return type: int

property num_labels¶

General wrapper that returns the number of labels depending on the task.

Returns: The number of labels, depending on the task
Return type: int

property num_node_features¶

Returns node feature dimension in the graph.

Returns: The number of features per node in the dataset.
Return type: int

property num_node_labels¶

Returns the number of node labels in the graph.

Returns: The number of labels per node in the dataset.
Return type: int

property num_nodes¶

Returns the number of nodes in each graph of the graph list.

Returns: A list of the number of nodes for each graph in the graph list.
Return type: list

static pyg_to_graphs(dataset, verbose: bool = False, fixed_split: bool = False) → List[deepsnap.graph.Graph][source]¶

Transforms a torch_geometric.data.Dataset object into a list of Graph objects.

Parameters

dataset – a torch_geometric.data.Dataset object.
verbose – whether to print verbose warnings.
fixed_split – whether to load the fixed data split from the PyG dataset.

Returns

A list of deepsnap.graph.Graph objects.

Return type

list

resample_disjoint()[source]¶

Resample disjoint edge split of message passing and objective links.

Note that if apply_transform (on the message passing graph) was used before this resampling, it needs to be re-applied, after resampling, to update some of the edges that were in objectives.

split(transductive: bool = True, split_ratio: List[float] = None, split_types: Union[str, List[str]] = None) → Union[List[deepsnap.graph.Graph], List[deepsnap.hetero_graph.HeteroGraph]][source]¶

Splits the dataset into train, validation (and test) sets.

Parameters

transductive – whether the training process is transductive or inductive. Inductive split is always used for graph-level tasks ( self.task == ‘graph’).
split_ratio – ratios of the data split into the train, validation (and test) sets.

Returns

a list of 3 (2) lists of deepsnap.graph.Graph objects corresponding to train, validation (and test) set.

Return type

list
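How a list of ratios maps to train, validation (and test) subsets can be sketched in plain Python. This is an illustration of an inductive-style split over whole items, not DeepSNAP's implementation: split_by_ratio is a hypothetical helper, and the rounding rule used here is an assumption.

```python
def split_by_ratio(items, split_ratio):
    """Cut items into consecutive chunks whose sizes follow split_ratio."""
    if abs(sum(split_ratio) - 1.0) > 1e-9:
        raise ValueError("split_ratio must sum to 1")
    n = len(items)
    boundaries = []
    cumulative = 0.0
    for ratio in split_ratio[:-1]:
        cumulative += ratio
        boundaries.append(int(round(cumulative * n)))
    starts = [0] + boundaries
    ends = boundaries + [n]
    return [items[start:end] for start, end in zip(starts, ends)]


graph_ids = list(range(10))
train, val, test = split_by_ratio(graph_ids, [0.8, 0.1, 0.1])
train_only, val_only = split_by_ratio(graph_ids, [0.8, 0.2])
```

Passing three ratios yields three lists and passing two yields two, matching the "3 (2) lists" described above. In practice the items would usually be shuffled first.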

to(device)[source]¶

Transfers each Graph object in graphs to the specified device.

Parameters: device – Specified device name

deepsnap.graph module¶

class deepsnap.graph.Graph(G=None, **kwargs)[source]¶

Bases: object

A plain Python object modeling a single graph with various (optional) attributes.

Parameters

G (networkx.classes.graph) – The NetworkX graph object which contains features and labels for the tasks.
**kwargs – Keyword arguments with keys such as "node_feature" and "node_label" and their corresponding attributes.

static add_edge_attr(G, attr_name: str, edge_attr)[source]¶

Add edge attribute into a NetworkX graph.

Parameters

G (NetworkX Graph) – a NetworkX graph.
attr_name (string) – Name of the edge attribute to set.
edge_attr (array_like) – edge attributes.

static add_graph_attr(G, attr_name: str, graph_attr)[source]¶

Add graph attribute into a NetworkX graph.

Parameters

G (NetworkX Graph) – a NetworkX graph.
attr_name (string) – Name of the graph attribute to set.
graph_attr (scalar or array_like) – graph attributes.

static add_node_attr(G, attr_name: str, node_attr)[source]¶

Add node attribute into a NetworkX graph. Assumes that the node_attr ordering is the same as the node ordering in G.

Parameters

G (NetworkX Graph) – a NetworkX graph.
attr_name (string) – Name of the node attribute to set.
node_attr (array_like) – node attributes.

apply_tensor(func, *keys)[source]¶

Applies the function func to all tensor attributes *keys. If *keys is not given, func is applied to all present attributes.

Parameters

func (function) – a function that can be applied to a PyTorch tensor.
*keys (string, optional) – names of the tensor attributes to which func will be applied.

Returns

The deepsnap.graph.Graph object itself (self).

Return type

deepsnap.graph.Graph
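The key-selection behavior described above (apply to the given keys, or to everything when no keys are given, and return self) can be sketched with a small stand-in class. AttrStore is hypothetical and holds plain lists instead of PyTorch tensors; it only illustrates the calling pattern.

```python
class AttrStore:
    """Hypothetical stand-in that keeps named attributes in a dictionary."""

    def __init__(self, **attrs):
        self.attrs = dict(attrs)

    def apply(self, func, *keys):
        # With no keys, fall back to every attribute that is present.
        selected = keys if keys else tuple(self.attrs)
        for key in selected:
            self.attrs[key] = func(self.attrs[key])
        return self  # returning self allows chained calls


def double(values):
    return [2 * v for v in values]


store = AttrStore(node_feature=[1, 2], edge_feature=[3])
result = store.apply(double, "node_feature")
after_selected = dict(store.attrs)
store.apply(double)
after_all = dict(store.attrs)
```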

apply_transform(transform, update_tensor: bool = True, update_graph: bool = False, deep_copy: bool = False, **kwargs)[source]¶

Apply transform function to current graph object.

Note that when the backend graph object (e.g. networkx object) is changed in the transform function, the argument update_tensor is recommended, to update the tensor representation to be in sync with the transformed graph. Similarly, update_graph is recommended when the transform function makes changes to the tensor objects.

However, the transform function should not make changes to both the backend graph object and the tensors simultaneously. Otherwise there might exist inconsistency between the transformed graph and tensors. Also note that update_tensor and update_graph cannot be true at the same time.

Parameters

transform (function) – in the format of transform(deepsnap.graph.Graph, **kwargs). The function needs to return either a deepsnap.graph.Graph (the transformed graph object) or the transformed internal .G object (networkx). If the .G object is returned, all corresponding tensors will be updated.
update_tensor (boolean) – if the nx graph has changed, use the nx graph to update tensor attributes.
update_graph (boolean) – if tensor attributes have changed, use the attributes to update the nx graph.
deep_copy (boolean) – True if a new copy of graph_object is needed. In this case, the transform function needs to return a graph object. Important: when returning a Graph object from the transform function, the user should decide whether the tensor values of the graph are to be copied (deep copy).
**kwargs (any) – additional args for the transform function.

Note

This function is different from the function apply_tensor.

clone()[source]¶

Deep-copies the graph object.

Returns: A cloned deepsnap.graph.Graph object in which all features are deep-copied.
Return type: deepsnap.graph.Graph

contiguous(*keys)[source]¶

Ensures a contiguous memory layout for the attributes specified by *keys. If *keys is not given, all present attributes are ensured to have a contiguous memory layout.

Parameters: *keys (string, optional) – tensor attributes which will be in contiguous memory layout.
Returns: deepsnap.graph.Graph object with specified tensor attributes in contiguous memory layout.
Return type: deepsnap.graph.Graph

get_num_dims(key, as_label=False) → int[source]¶

Returns the number of dimensions for one graph/node/edge property.

Parameters: as_label – if as_label, treat the tensor as labels.

is_directed() → bool[source]¶

Whether the graph is directed.

Returns: True if the graph is directed.
Return type: bool

is_undirected() → bool[source]¶

Whether the graph is undirected.

Returns: True if the graph is undirected.
Return type: bool

property keys¶

Returns all names of the graph attributes.

Returns: List of deepsnap.graph.Graph attributes.
Return type: list

static negative_sampling(edge_index, num_nodes=None, num_neg_samples=None)[source]¶

Samples random negative edges of a graph given by edge_index.

Parameters

edge_index (torch.LongTensor) – The edge indices.
num_nodes (int, optional) – The number of nodes, i.e. max_val + 1 of edge_index. (default: None)
num_neg_samples (int, optional) – The number of negative samples to return. If set to None, will try to return a negative edge for every positive edge. (default: None)
force_undirected (bool, optional) – If set to True, sampled negative edges will be undirected. (default: False)

Return type

torch.LongTensor
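The idea of negative sampling (drawing node pairs that are not existing edges) can be sketched without PyTorch. The helper below is hypothetical and is not how DeepSNAP implements it: it enumerates all candidate pairs, which is only practical for small graphs, treats edges as directed pairs, and excludes self-loops by assumption.

```python
import random


def sample_negative_edges(edges, num_nodes, num_neg_samples=None, seed=0):
    """Sample directed node pairs that do not appear in edges."""
    rng = random.Random(seed)
    existing = set(edges)
    if num_neg_samples is None:
        # Default: try to return one negative edge per positive edge.
        num_neg_samples = len(edges)
    candidates = [
        (u, v)
        for u in range(num_nodes)
        for v in range(num_nodes)
        if u != v and (u, v) not in existing
    ]
    count = min(num_neg_samples, len(candidates))
    return rng.sample(candidates, count)


positive_edges = [(0, 1), (1, 2), (2, 0)]
negative_edges = sample_negative_edges(positive_edges, num_nodes=4)
```

When fewer candidates exist than requested, the sketch returns as many as it can, which mirrors the "will try to return" wording above.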

property num_edge_features¶

Returns edge feature dimension in the graph.

Returns: Edge feature dimension and 0 if there is no edge_feature.
Return type: int

property num_edge_labels¶

Returns number of the edge labels in the graph.

Returns: Number of edge labels and 0 if there is no edge_label.
Return type: int

property num_edges¶

Returns number of edges in the graph.

Returns: Number of edges.
Return type: int

property num_graph_features¶

Returns graph feature dimension in the graph.

Returns: Graph feature dimension and 0 if there is no graph_feature.
Return type: int

property num_graph_labels¶

Returns number of the graph labels in the graph.

Returns: Number of graph labels and 0 if there is no graph_label.
Return type: int

property num_node_features¶

Returns node feature dimension in the graph.

Returns: Node feature dimension and 0 if there is no node_feature.
Return type: int

property num_node_labels¶

Returns number of the node labels in the graph.

Returns: Number of node labels and 0 if there is no node_label.
Return type: int

property num_nodes¶

Return number of nodes in the graph.

Returns: Number of nodes in the graph.
Return type: int

static pyg_to_graph(data, verbose: bool = False, fixed_split: bool = False)[source]¶

Converts Pytorch Geometric data to a Graph object.

Parameters

data (torch_geometric.data) – a PyTorch Geometric data object.
verbose – whether to print verbose warnings.
fixed_split – whether to load the fixed data split from the PyG dataset.

Returns

A new DeepSNAP deepsnap.graph.Graph object.

Return type

deepsnap.graph.Graph

static raw_to_graph(data)[source]¶: Placeholder for methods that let users import their own data format, making sure that all attributes of G are scalars or torch.tensor objects. Not implemented.

resample_disjoint(message_ratio)[source]¶

Resample disjoint edge split of message passing and objective links.

Note that if apply_transform (on the message passing graph) was used before this resampling, it needs to be re-applied, after resampling, to update some of the edges that were in objectives.
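A disjoint split of training edges into message-passing edges and objective edges can be sketched as follows. This is an assumption-laden illustration rather than the library's code: split_disjoint is a hypothetical helper, and message_ratio is taken here to be the fraction of edges kept for message passing.

```python
import random


def split_disjoint(edges, message_ratio, seed=None):
    """Randomly partition edges into (message edges, objective edges)."""
    rng = random.Random(seed)
    shuffled = list(edges)
    rng.shuffle(shuffled)
    num_message = int(message_ratio * len(shuffled))
    return shuffled[:num_message], shuffled[num_message:]


train_edges = [(i, i + 1) for i in range(10)]
message_edges, objective_edges = split_disjoint(train_edges, 0.8, seed=0)
# Resampling draws a fresh partition of the same edge set.
message_again, objective_again = split_disjoint(train_edges, 0.8, seed=1)
```

Because each resampling can move edges between the two groups, a transform computed on the previous message-passing graph may no longer match, which is why the note above asks for it to be re-applied.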

split(task: str = 'node', split_ratio: List[float] = None)[source]¶

Splits the current graph object into a list of graph objects.

Parameters

task (string) – one of node, edge or link_pred.
split_ratio (array_like) – array_like ratios [train_ratio, validation_ratio, test_ratio].

Returns

A Python list of deepsnap.graph.Graph objects with specified task.

Return type

list

split_link_pred(split_ratio: Union[float, List[float]])[source]¶: Splits the graph into len(split_ratio) graphs for link prediction. Internally this splits the edge indices, and the model only computes the loss for the embeddings of nodes in each split graph. This is only used for the transductive link prediction task, in which a different part of the graph is observed in train/val/test. Note: this function is called twice if, during training, the training graph is further split so that message edges and objective edges are different.

to(device, *keys)[source]¶

Performs tensor dtype and/or device conversion to all attributes *keys. If *keys is not given, the conversion is applied to all present attributes.

Parameters

device – Specified device name.
*keys (string, optional) – Tensor attributes to transfer to the specified device.

deepsnap.hetero_gnn module¶

class deepsnap.hetero_gnn.HeteroConv(convs, aggr='add', parallelize=False)[source]¶

Bases: torch.nn.modules.module.Module

A “wrapper” layer designed for heterogeneous graph layers. It takes heterogeneous graph layers, such as deepsnap.hetero_gnn.HeteroSAGEConv, at initialization.

aggregate(xs)[source]¶: The aggregation for each node type. Currently supports concat, add, mean, max and mul.
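The five aggregation modes can be illustrated on plain lists. This sketch is not the HeteroConv implementation: aggregate_sketch is a hypothetical function, and each entry of xs is assumed to be the feature vector that a single node received from one message type.

```python
import math


def aggregate_sketch(xs, aggr="add"):
    """Combine several equal-length feature vectors into one."""
    if aggr == "concat":
        # Concatenation keeps every value, so the output grows with len(xs).
        return [value for x in xs for value in x]
    columns = list(zip(*xs))  # group the i-th entries of all vectors
    if aggr == "add":
        return [sum(column) for column in columns]
    if aggr == "mean":
        return [sum(column) / len(column) for column in columns]
    if aggr == "max":
        return [max(column) for column in columns]
    if aggr == "mul":
        return [math.prod(column) for column in columns]
    raise ValueError(f"Unsupported aggregation: {aggr}")


xs = [[1.0, 2.0], [3.0, 4.0]]
results = {mode: aggregate_sketch(xs, mode)
           for mode in ("add", "mean", "max", "mul", "concat")}
```

All modes except concat preserve the feature dimension, which matters when choosing the input size of the next layer.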

forward(node_features, edge_indices, edge_features=None)[source]¶

The forward function for HeteroConv.

Parameters

node_features (dict) – A dictionary in which each key is a node type and the corresponding value is a node feature tensor.
edge_indices (dict) – A dictionary in which each key is a message type and the corresponding value is an edge index tensor.
edge_features (dict) – A dictionary in which each key is an edge type and the corresponding value is an edge feature tensor. Default is None.

reset_parameters()[source]¶

class deepsnap.hetero_gnn.HeteroSAGEConv(in_channels_neigh, out_channels, in_channels_self=None)[source]¶

Bases: torch_geometric.nn.conv.message_passing.MessagePassing

The heterogeneous-compatible GraphSAGE operator, derived from the “Inductive Representation Learning on Large Graphs”, “Modeling polypharmacy side effects with graph convolutional networks” and “Modeling Relational Data with Graph Convolutional Networks” papers.

Parameters

in_channels_neigh (int) – The input dimension of the end node type.
out_channels (int) – The dimension of the output.
in_channels_self (int) – The input dimension of the start node type. Default is None where the in_channels_self is equal to in_channels_neigh.

forward(node_feature_neigh, node_feature_self, edge_index, edge_weight=None, size=None, res_n_id=None)[source]¶

message(node_feature_neigh_j, node_feature_self_i, edge_weight)[source]¶

update(aggr_out, node_feature_self, res_n_id)[source]¶

deepsnap.hetero_gnn.forward_op(x, func, **kwargs)[source]¶

A helper function for heterogeneous operations. Given a dictionary input, it returns a dictionary with the same keys in which each value is the result of applying func, with the specified parameters, to the corresponding input value.

Parameters

x (dict) – A dictionary to whose values func is applied.
func (function) – The function applied to each value in the dictionary.
**kwargs – Parameters that will be passed into the func.
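The behavior described above amounts to mapping a function over dictionary values. A minimal sketch, assuming nothing beyond that description (forward_op_sketch is a hypothetical name, and plain lists stand in for tensors):

```python
def forward_op_sketch(x, func, **kwargs):
    """Apply func to every value of x, keeping the keys unchanged."""
    return {key: func(value, **kwargs) for key, value in x.items()}


scores = {"paper": [1, 3, 2], "author": [5, 4]}
ordered = forward_op_sketch(scores, sorted, reverse=True)
```

In a heterogeneous model, the keys would typically be node types and func something like an activation or dropout applied per type.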

deepsnap.hetero_gnn.loss_op(pred, y, label_index, loss_func, **kwargs)[source]¶

A helper function for the heterogeneous loss operations.

Parameters

pred (dict) – A dictionary of predictions.
y (dict) – A dictionary of labels.
label_index (dict) – A dictionary of indices on which the loss will be computed. Each value should be a PyTorch long tensor.
loss_func (function) – The loss function.
**kwargs – Parameters that will be passed into the loss_func.
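One way to picture a per-type loss is to select the indexed entries for each key, compute a loss per key, and sum the results. The sketch below is an assumption, not the library's code: loss_op_sketch and mse are hypothetical, lists stand in for tensors, and indexing both predictions and labels with label_index is a choice made only for this illustration.

```python
def mse(predictions, targets):
    """Mean squared error over two equal-length lists."""
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(predictions)


def loss_op_sketch(pred, y, label_index, loss_func, **kwargs):
    """Sum loss_func over all keys, restricted to the indexed entries."""
    total = 0.0
    for key in pred:
        index = label_index[key]
        selected_pred = [pred[key][i] for i in index]
        selected_y = [y[key][i] for i in index]
        total += loss_func(selected_pred, selected_y, **kwargs)
    return total


pred = {"paper": [1.0, 2.0, 3.0], "author": [0.0, 4.0]}
y = {"paper": [1.0, 0.0, 3.0], "author": [1.0, 4.0]}
label_index = {"paper": [0, 2], "author": [0]}
loss = loss_op_sketch(pred, y, label_index, mse)
```

Only the indexed entries contribute, so the mismatch at position 1 of "paper" does not affect the result.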

deepsnap.hetero_graph module¶

class deepsnap.hetero_graph.HeteroGraph(G=None, **kwargs)[source]¶

Bases: deepsnap.graph.Graph

A plain python object modeling a heterogeneous graph with various attributes (String node type is required for the HeteroGraph).

Parameters

G (networkx.classes.graph) – The NetworkX graph object which contains features and labels for each node type or edge type.
**kwargs – Keyword arguments with keys such as "node_feature" and "node_label" and their corresponding attributes.

property edge_types¶: Return list of edge types in the heterogeneous graph.

get_num_dims(key, obj_type, as_label: bool = False) → int[source]¶

Returns the number of dimensions for one graph/node/edge property for specified types.

Parameters

key (str) – The chosen property.
obj_type – Node or edge type.
as_label (bool) – If as_label, treat the tensor as labels.

get_num_edge_features(edge_type: str) → int[source]¶

Return the edge feature dimension of specified edge type.

Returns: The edge feature dimension for specified edge type.
Return type: int

get_num_edge_labels(edge_type: str) → int[source]¶

Return the number of edge labels.

Returns: Number of edge labels for specified edge type.
Return type: int

get_num_edges(message_type: Union[tuple, List[tuple]] = None) → int[source]¶

Returns the number of edges for a message type or a list of message types.

Parameters: message_type (tuple or list) – Specified message type(s).
Returns: The number of edges for a message type or a list of message types.
Return type: int or list

get_num_node_features(node_type: str) → int[source]¶

Return the node feature dimension of specified node type.

Returns: The node feature dimension for specified node type.
Return type: int

get_num_node_labels(node_type: str) → int[source]¶

Return the number of node labels.

Returns: Number of node labels for specified node type.
Return type: int

get_num_nodes(node_type: Union[str, List[str]] = None)[source]¶

Return number of nodes for a node type or list of node types.

Parameters: node_type (str or list) – Specified node type(s).
Returns: The number of nodes for a node type or list of node types.
Return type: int or list

property message_types¶: Return the list of message types (src_node_type, edge_type, end_node_type) in the heterogeneous graph.
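The (src_node_type, edge_type, end_node_type) triples can be derived from typed nodes and typed edges, as in this small sketch. The data layout (a node-to-type dictionary and edge triples) is a hypothetical one chosen for illustration and is not how HeteroGraph stores its data.

```python
# Hypothetical toy graph: node id -> node type, and (source, target, edge type).
node_type = {0: "author", 1: "paper", 2: "paper"}
typed_edges = [(0, 1, "writes"), (0, 2, "writes"), (1, 2, "cites")]

# A message type combines the source node type, edge type and target node type.
message_types = sorted(
    {(node_type[u], edge_type, node_type[v]) for u, v, edge_type in typed_edges}
)
edge_types = sorted({edge_type for _, _, edge_type in typed_edges})
node_types = sorted(set(node_type.values()))
```

Two edges with the same edge type but different endpoint node types would produce two distinct message types, which is why message types are finer-grained than edge types.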

static negative_sampling(edge_index: Dict[str, torch.tensor], num_nodes=None, num_neg_samples: Dict[str, int] = None)[source]¶

Samples random negative edges of a heterogeneous graph given by edge_index.

Parameters

edge_index (LongTensor) – The edge indices.
num_nodes (int, optional) – The number of nodes, i.e. max_val + 1 of edge_index. (default: None)
num_neg_samples (int, optional) – The number of negative samples to return. If set to None, will try to return a negative edge for every positive edge. (default: None)
force_undirected (bool, optional) – If set to True, sampled negative edges will be undirected. (default: False)

Return type

torch.LongTensor

property node_types¶: Return list of node types in the heterogeneous graph.

split(task: str = 'node', split_types: Union[str, List[str], tuple, List[tuple]] = None, split_ratio: List[float] = None, edge_split_mode: str = 'exact')[source]¶

Splits the current graph object into a list of graph objects.

Parameters

task (string) – One of node, edge or link_pred.
split_types (list) – Types to split on. Default is None, which splits all the types in the specified task.
split_ratio (array_like) – Array_like ratios [train_ratio, validation_ratio, test_ratio].

Returns

A Python list of Graph objects with specified task.

Return type

list

split_link_pred(split_types: List[tuple], split_ratio: Union[float, List[float]], edge_split_mode: str = 'exact')[source]¶: Splits the graph into len(split_ratio) graphs for link prediction. Internally this splits the edge indices, and the model only computes the loss for the embeddings of nodes in each split graph. This is only used for the transductive link prediction task, in which a different part of the graph is observed in train/val/test. Note: this function is called twice if, during training, the training graph is further split so that message edges and objective edges are different.
