pgl.heter_graph module: Heterogeneous Graph Storage

This module implements a heterogeneous graph structure for handling heterogeneous graph data.

class pgl.heter_graph.HeterGraph(num_nodes, edges, node_types=None, node_feat=None, edge_feat=None)[source]

Bases: object

A simple implementation of a heterogeneous graph structure in pgl.

Parameters
  • num_nodes – Number of nodes in the heterogeneous graph.

  • edges – A dict that maps each edge type to a list of (u, v) tuples.

  • node_types (optional) – A list of (u, node_type) tuples specifying the type of every node.

  • node_feat (optional) – A dict of numpy arrays holding node features.

  • edge_feat (optional) – A dict of dicts holding the edge features of every edge type.

Examples

import numpy as np
from pgl import heter_graph

num_nodes = 4
# Each node is assigned a type as a (node_id, node_type) tuple.
node_types = [(0, 'user'), (1, 'item'), (2, 'item'), (3, 'user')]
# Edges are grouped by edge type; each entry is a (u, v) tuple.
edges = {
    'edges_type1': [(0, 1), (3, 2)],
    'edges_type2': [(1, 2), (3, 1)],
}
node_feat = {'feature': np.random.randn(4, 16)}
# One feature dict per edge type, with one row per edge of that type.
edges_feat = {
    'edges_type1': {'h': np.random.randn(2, 16)},
    'edges_type2': {'h': np.random.randn(2, 16)},
}

g = heter_graph.HeterGraph(
    num_nodes=num_nodes,
    edges=edges,
    node_types=node_types,
    node_feat=node_feat,
    edge_feat=edges_feat)

property edge_feat

Return edge features of all edge types.

edge_feat_info()[source]

Return the information of edge feature for HeterGraphWrapper.

This function returns information about the edge features of all edge types. It is used to help construct a HeterGraphWrapper.

Returns

A dict of lists of (name, shape, dtype) tuples describing every given edge feature.

property edge_types

Return a list of edge types.

edge_types_info()[source]

Return the information of all edge types.

Returns

A list of all edge types.

indegree(nodes=None, edge_type=None)[source]

Return the indegree of the given nodes with the specified edge_type.

Parameters
  • nodes – The nodes whose indegree is returned. If nodes is None, return the indegree of all nodes.

  • edge_type – The edge type to count. If edge_type is None, return the total indegree of the given nodes.

Returns

A numpy.ndarray as the given nodes’ indegree.
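
As a rough illustration of the semantics described above, and not PGL's actual implementation, the indegree can be obtained by counting how often each node appears as the destination v of a (u, v) edge. The helper name indegree_sketch is hypothetical, and a plain Python list stands in for the numpy.ndarray that the real method returns.

```python
from collections import Counter

# Edge dict in the same layout as the HeterGraph example: edge type -> list of (u, v).
edges = {
    'edges_type1': [(0, 1), (3, 2)],
    'edges_type2': [(1, 2), (3, 1)],
}
num_nodes = 4


def indegree_sketch(edges, num_nodes, nodes=None, edge_type=None):
    """Hypothetical helper: count incoming edges per node."""
    # Restrict to one edge type, or use every edge type when edge_type is None.
    selected = edges.values() if edge_type is None else [edges[edge_type]]
    counts = Counter(v for edge_list in selected for _, v in edge_list)
    # Default to all nodes when no subset is given.
    if nodes is None:
        nodes = range(num_nodes)
    return [counts[n] for n in nodes]


# Total indegree of every node across both edge types.
total = indegree_sketch(edges, num_nodes)
# Indegree of nodes 1 and 2 counting only 'edges_type1'.
type1_only = indegree_sketch(edges, num_nodes, nodes=[1, 2], edge_type='edges_type1')
```

The outdegree is the mirror image: count the source u of each edge instead of the destination v.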

node_batch_iter(batch_size, shuffle=True, n_type=None)[source]

Node batch iterator

Iterate all nodes by batch with the specified node type.

Parameters
  • batch_size – The batch size of each batch of nodes.

  • shuffle – Whether to shuffle the nodes.

  • n_type – Iterate the nodes with the specified node type. If n_type is None, iterate all nodes by batch.

Returns

Batch iterator
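
The batching behaviour can be sketched with a small generator. This is only an assumption about the semantics (filter by type, optionally shuffle, then slice into batches), not the library's code. The name node_batch_iter_sketch and the seed argument are hypothetical additions for reproducibility.

```python
import random

node_types = [(0, 'user'), (1, 'item'), (2, 'item'), (3, 'user')]


def node_batch_iter_sketch(node_types, batch_size, shuffle=True, n_type=None, seed=None):
    """Hypothetical helper: yield node ids in batches, optionally filtered by type."""
    # Keep every node when n_type is None, otherwise only nodes of that type.
    nodes = [n for n, t in node_types if n_type is None or t == n_type]
    if shuffle:
        random.Random(seed).shuffle(nodes)
    # The last batch may be smaller than batch_size.
    for start in range(0, len(nodes), batch_size):
        yield nodes[start:start + batch_size]


all_batches = list(node_batch_iter_sketch(node_types, batch_size=3, shuffle=False))
item_batches = list(
    node_batch_iter_sketch(node_types, batch_size=1, shuffle=False, n_type='item'))
```

With shuffle=False the batches follow node id order; with shuffle=True the same nodes are covered in a random order.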

property node_feat

Return a dictionary of node features.

node_feat_info()[source]

Return the information of node feature for HeterGraphWrapper.

This function returns information about the node features of all node types. It is used to help construct a HeterGraphWrapper.

Returns

A list of (name, shape, dtype) tuples describing every given node feature.

property node_types

Return the node types.

property nodes

Return all node ids, from 0 to num_nodes - 1.

property num_edges

Return the number of edges for all edge types.

property num_nodes

Return the number of nodes.

num_nodes_by_type(n_type=None)[source]

Return the number of nodes with the specified node type.
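
For intuition, counting nodes per type amounts to tallying the type labels in the node_types list. The snippet below is a minimal sketch using the node_types layout from the HeterGraph example; it does not call PGL, and the variable names are illustrative.

```python
from collections import Counter

node_types = [(0, 'user'), (1, 'item'), (2, 'item'), (3, 'user')]

# Tally how many nodes carry each type label.
type_counts = Counter(n_type for _, n_type in node_types)

num_users = type_counts['user']
num_items = type_counts['item']
```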

outdegree(nodes=None, edge_type=None)[source]

Return the outdegree of the given nodes with the specified edge_type.

Parameters
  • nodes – The nodes whose outdegree is returned. If nodes is None, return the outdegree of all nodes.

  • edge_type – The edge type to count. If edge_type is None, return the total outdegree of the given nodes.

Returns

A numpy.ndarray as the given nodes’ outdegree.

predecessor(edge_type, nodes=None, return_eids=False)[source]

Find the predecessors of the given nodes with the specified edge_type.

Parameters
  • nodes – The nodes whose predecessors are returned. If nodes is None, return the predecessors of all nodes.

  • edge_type – The edge type used to look up predecessors.

  • return_eids – If True, return the nodes together with the corresponding eids.
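
The lookup can be pictured as grouping the source nodes of one edge type by their destination. The sketch below is an illustration under stated assumptions, not the library's implementation: predecessor_sketch is a hypothetical name, and an edge id is assumed to be the position of the edge within its edge type's list.

```python
edges = {
    'edges_type1': [(0, 1), (3, 2)],
    'edges_type2': [(1, 2), (3, 1)],
}


def predecessor_sketch(edges, edge_type, nodes, return_eids=False):
    """Hypothetical helper: list the source nodes of edges pointing at each node."""
    preds, eids = {}, {}
    # Assumption: the edge id is the index of the edge in its type's edge list.
    for eid, (u, v) in enumerate(edges[edge_type]):
        preds.setdefault(v, []).append(u)
        eids.setdefault(v, []).append(eid)
    pred_lists = [preds.get(n, []) for n in nodes]
    if return_eids:
        return pred_lists, [eids.get(n, []) for n in nodes]
    return pred_lists


pred, pred_eids = predecessor_sketch(edges, 'edges_type2', nodes=[1, 2], return_eids=True)
```

Successors follow the same pattern with the roles of u and v swapped.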

sample_nodes(sample_num, n_type=None)[source]

Sample nodes with the specified n_type from the graph

This function helps to sample nodes with the specified n_type from the graph. If n_type is None, this function will sample nodes from all nodes. Nodes might be duplicated.

Parameters
  • sample_num – The number of samples

  • n_type – The type of the nodes to be sampled.

Returns

A list of nodes
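
Because sampled nodes might be duplicated, the behaviour resembles sampling with replacement. The following sketch assumes exactly that and uses Python's random.choices; sample_nodes_sketch and its seed argument are hypothetical and are not part of the pgl API.

```python
import random

node_types = [(0, 'user'), (1, 'item'), (2, 'item'), (3, 'user')]


def sample_nodes_sketch(node_types, sample_num, n_type=None, seed=None):
    """Hypothetical helper: draw sample_num node ids with replacement."""
    # Candidates are all nodes, or only nodes of the requested type.
    candidates = [n for n, t in node_types if n_type is None or t == n_type]
    rng = random.Random(seed)
    # random.choices samples with replacement, so ids can repeat.
    return rng.choices(candidates, k=sample_num)


# Five draws from only two 'user' nodes must contain duplicates.
users = sample_nodes_sketch(node_types, sample_num=5, n_type='user', seed=0)
```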

sample_predecessor(edge_type, nodes, max_degree, return_eids=False, shuffle=False)[source]

Sample predecessors of given nodes with the specified edge_type.

Parameters
  • edge_type – The specified edge_type.

  • nodes – Given nodes whose predecessors will be sampled.

  • max_degree – The maximum number of predecessors sampled for each node.

  • return_eids – Whether to return the corresponding eids.

Returns

A list of numpy.ndarray, where each numpy.ndarray holds the sampled predecessor ids of one given node for the specified edge type. If return_eids=True, an additional list of numpy.ndarray is returned, where each numpy.ndarray holds the eids of the edges connecting that node to its sampled predecessors.
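
One way to picture the effect of max_degree is shown below. The sketch assumes that a node with more than max_degree predecessors gets a random subset drawn without replacement, while nodes with fewer predecessors keep all of them; the source does not confirm this. The helper name, the seed argument, and the sample edges are hypothetical, and plain lists stand in for numpy.ndarray.

```python
import random

# Node 2 has three predecessors (0, 1, 3); node 1 has one; node 3 has none.
edges = {'edges_type1': [(0, 2), (1, 2), (3, 2), (0, 1)]}


def sample_predecessor_sketch(edges, edge_type, nodes, max_degree, seed=None):
    """Hypothetical helper: keep at most max_degree predecessors per node."""
    rng = random.Random(seed)
    preds = {}
    for u, v in edges[edge_type]:
        preds.setdefault(v, []).append(u)
    sampled = []
    for n in nodes:
        candidates = preds.get(n, [])
        # Assumption: subsample without replacement only when over the cap.
        if len(candidates) > max_degree:
            candidates = rng.sample(candidates, max_degree)
        sampled.append(list(candidates))
    return sampled


sampled = sample_predecessor_sketch(
    edges, 'edges_type1', nodes=[1, 2, 3], max_degree=2, seed=0)
```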

sample_successor(edge_type, nodes, max_degree, return_eids=False, shuffle=False)[source]

Sample successors of given nodes with the specified edge_type.

Parameters
  • edge_type – The specified edge_type.

  • nodes – Given nodes whose successors will be sampled.

  • max_degree – The maximum number of successors sampled for each node.

  • return_eids – Whether to return the corresponding eids.

Returns

A list of numpy.ndarray, where each numpy.ndarray holds the sampled successor ids of one given node for the specified edge type. If return_eids=True, an additional list of numpy.ndarray is returned, where each numpy.ndarray holds the eids of the edges connecting that node to its sampled successors.

successor(edge_type, nodes=None, return_eids=False)[source]

Find the successors of the given nodes with the specified edge_type.

Parameters
  • nodes – The nodes whose successors are returned. If nodes is None, return the successors of all nodes.

  • edge_type – The edge type used to look up successors. If edge_type is None, return the successors over all edge types; eids are invalid in this case.

  • return_eids – If True, return the nodes together with the corresponding eids.

class pgl.heter_graph.SubHeterGraph(num_nodes, edges, node_types=None, node_feat=None, edge_feat=None, reindex=None)[source]

Bases: pgl.heter_graph.HeterGraph

Implementation of SubHeterGraph in pgl.

SubHeterGraph inherits from HeterGraph.

Parameters
  • num_nodes – number of nodes in a heterogeneous graph

  • edges – dict, every element in dict is a list of (u, v) tuples.

  • node_types (optional) – list of (u, node_type) tuples to specify the node type of every node

  • node_feat (optional) – a dict of numpy array as node features

  • edge_feat (optional) – a dict of dict as edge features for every edge type

  • reindex – A dictionary that maps node ids in the parent HeterGraph to node ids in the SubHeterGraph.

reindex_from_parrent_nodes(nodes)[source]

Map the given parent graph node ids to subgraph node ids.

Parameters

nodes – A list of nodes from parent graph.

Returns

A list of subgraph ids.

reindex_to_parrent_nodes(nodes)[source]

Map the given subgraph node ids to parent graph node ids.

Parameters

nodes – A list of nodes in this subgraph.

Returns

A list of node ids in parent graph.
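
The two mappings above can be thought of as a dictionary lookup and its inverse. The sketch below assumes reindex is a one-to-one dict from parent node id to subgraph node id, as described in the parameters. The function names from_parent and to_parent and the sample ids are hypothetical stand-ins, not the pgl methods themselves.

```python
# Assumed layout: parent graph node id -> subgraph node id.
reindex = {10: 0, 42: 1, 7: 2}


def from_parent(reindex, nodes):
    """Hypothetical helper: translate parent ids into subgraph ids."""
    return [reindex[n] for n in nodes]


def to_parent(reindex, nodes):
    """Hypothetical helper: translate subgraph ids back into parent ids."""
    # Invert the mapping; this relies on reindex being one-to-one.
    inverse = {sub_id: parent_id for parent_id, sub_id in reindex.items()}
    return [inverse[n] for n in nodes]


sub_ids = from_parent(reindex, [42, 10])
parent_ids = to_parent(reindex, [2, 0])
```

Applying to_parent to the result of from_parent returns the original parent ids, which is a quick sanity check for any reindex dictionary.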