pgl.graph module: Graph Storage

This package implements graph structures for handling graph data.

class pgl.graph.Graph(num_nodes, edges=None, node_feat=None, edge_feat=None)[source]

Bases: object

A simple implementation of a graph structure in pgl.

Parameters
  • num_nodes – number of nodes in a graph

  • edges – list of (u, v) tuples

  • node_feat (optional) – a dict of numpy arrays as node features

  • edge_feat (optional) – a dict of numpy arrays as edge features (must be in the same order as edges)

Examples

import numpy as np
from pgl.graph import Graph

num_nodes = 5
edges = [(0, 1), (1, 2), (3, 4)]
feature = np.random.randn(5, 100)       # one row per node
edge_feature = np.random.randn(3, 100)  # one row per edge, in edge order
graph = Graph(num_nodes=num_nodes,
              edges=edges,
              node_feat={
                  "feature": feature
              },
              edge_feat={
                  "edge_feature": edge_feature
              })
property adj_dst_index

Return an EdgeIndex object for dst.

property adj_src_index

Return an EdgeIndex object for src.

dump(path)[source]
property edge_feat

Return a dictionary of edge features.

edge_feat_info()[source]

Return the information of edge feature for GraphWrapper.

This function returns the information of edge features and is used to help construct a GraphWrapper.

Returns

A list of (name, shape, dtype) tuples, one for each edge feature.

Examples

import numpy as np
from pgl.graph import Graph

num_nodes = 5
edges = [(0, 1), (1, 2), (3, 4)]
# Cast to float32 so the dtype matches the output shown below.
feature = np.random.randn(3, 100).astype("float32")
graph = Graph(num_nodes=num_nodes,
              edges=edges,
              edge_feat={
                  "feature": feature
              })
print(graph.edge_feat_info())

The output will be:

[("feature", [None, 100], "float32")]
property edges

Return all edges in numpy.ndarray with shape (num_edges, 2).

property graph_lod

Return the graph LoD index for Paddle computation.

has_edges_between(u, v)[source]

Check whether the given edges exist in the graph.

Parameters
  • u – a numpy.array of src nodes ID.

  • v – a numpy.array of dst nodes ID.

Returns

exists – A numpy.array of bool with the same shape as u and v. exists[i] is True if (u[i], v[i]) is an edge in the graph, False otherwise.
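The element-wise semantics of the returned array can be illustrated without pgl. The following is a minimal sketch in plain NumPy; has_edges_between_sketch is a hypothetical helper written for this illustration and is not the library's implementation.

```python
import numpy as np


def has_edges_between_sketch(edges, u, v):
    """Hypothetical helper: exists[i] is True if (u[i], v[i]) is in edges."""
    # Edges are directed, so (0, 1) and (1, 0) are different edges.
    edge_set = {(int(s), int(d)) for s, d in edges}
    return np.array(
        [(int(a), int(b)) in edge_set for a, b in zip(u, v)], dtype=bool)


edges = [(0, 1), (1, 2), (3, 4)]
u = np.array([0, 1, 2])  # src node ids
v = np.array([1, 0, 3])  # dst node ids
exists = has_edges_between_sketch(edges, u, v)
print(exists)  # [ True False False]
```

Only the pair (0, 1) is an edge of the example graph; (1, 0) is the reverse of an existing edge and (2, 3) does not exist.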

indegree(nodes=None)[source]

Return the indegree of the given nodes.

Parameters

nodes – Nodes whose indegree is returned. If nodes is None, the indegree of all nodes is returned.

Returns

A numpy.ndarray containing the indegree of each given node.
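As an illustration of what the indegree means for an edge list, the counts can be reproduced with plain NumPy. This is a sketch of the concept only, and it does not call pgl.

```python
import numpy as np

edges = np.array([(0, 1), (1, 2), (3, 4)])
num_nodes = 5

# Indegree: number of edges that end at each node (dst is column 1).
indegree_all = np.bincount(edges[:, 1], minlength=num_nodes)
# Outdegree: number of edges that start at each node (src is column 0).
outdegree_all = np.bincount(edges[:, 0], minlength=num_nodes)

# Selecting a subset of nodes is a plain index operation.
nodes = np.array([1, 4])
indegree_some = indegree_all[nodes]

print(indegree_all)   # [0 1 1 0 1]
print(outdegree_all)  # [1 1 0 1 0]
print(indegree_some)  # [1 1]
```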

node2vec_random_walk(nodes, max_depth, p=1.0, q=1.0)[source]

Implementation of node2vec-style random walk.

Reference paper: https://cs.stanford.edu/~jure/pubs/node2vec-kdd16.pdf.

Parameters
  • nodes – Walk starting from nodes

  • max_depth – Max walking depth

  • p – Return parameter

  • q – In-out parameter

Returns

A list of walks.
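To show how the return parameter p and the in-out parameter q bias a walk, here is a minimal pure-Python sketch of the second-order transition rule from the referenced paper. The function node2vec_walk_sketch, the adjacency-dict input format, and the choice to treat max_depth as the maximum number of nodes in a walk are all assumptions made for this illustration; they are not taken from pgl.

```python
import random


def node2vec_walk_sketch(adj, start, max_depth, p=1.0, q=1.0, rng=None):
    """Hypothetical helper: one biased walk over adj (node -> successors)."""
    rng = rng or random.Random()
    walk = [start]
    while len(walk) < max_depth:
        cur = walk[-1]
        neighbors = adj.get(cur, [])
        if not neighbors:
            break  # dead end: stop the walk early
        if len(walk) == 1:
            # No previous node yet, so the first step is uniform.
            walk.append(rng.choice(neighbors))
            continue
        prev = walk[-2]
        prev_neighbors = set(adj.get(prev, []))
        weights = []
        for nxt in neighbors:
            if nxt == prev:
                weights.append(1.0 / p)  # step back to the previous node
            elif nxt in prev_neighbors:
                weights.append(1.0)      # stay close to the previous node
            else:
                weights.append(1.0 / q)  # move away from the previous node
        walk.append(rng.choices(neighbors, weights=weights, k=1)[0])
    return walk


# Small undirected example graph stored as successor lists.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
rng = random.Random(0)
walks = [node2vec_walk_sketch(adj, n, max_depth=4, p=0.5, q=2.0, rng=rng)
         for n in [0, 3]]
print(walks)
```

With p=1.0 and q=1.0 all weights are equal, so the sketch reduces to a uniform random walk.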

node_batch_iter(batch_size, shuffle=True)[source]

Node batch iterator.

Iterate over all nodes in batches.

Parameters
  • batch_size – The number of nodes in each batch.

  • shuffle – Whether to shuffle the nodes.

Returns

Batch iterator
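The batching behaviour can be pictured as slicing a (possibly shuffled) array of node ids. The generator below is a hedged sketch in NumPy; node_batch_iter_sketch and its seed argument are hypothetical and exist only to make the example reproducible.

```python
import numpy as np


def node_batch_iter_sketch(num_nodes, batch_size, shuffle=True, seed=None):
    """Hypothetical helper: yield node ids in batches of batch_size."""
    perm = np.arange(num_nodes)
    if shuffle:
        np.random.default_rng(seed).shuffle(perm)
    for start in range(0, num_nodes, batch_size):
        # The last batch may be smaller than batch_size.
        yield perm[start:start + batch_size]


batches = [b.tolist() for b in node_batch_iter_sketch(5, 2, shuffle=False)]
print(batches)  # [[0, 1], [2, 3], [4]]

shuffled = [b.tolist() for b in node_batch_iter_sketch(5, 2, seed=0)]
```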

property node_feat

Return a dictionary of node features.

node_feat_info()[source]

Return the information of node feature for GraphWrapper.

This function returns the information of node features and is used to help construct a GraphWrapper.

Returns

A list of (name, shape, dtype) tuples, one for each node feature.

Examples

import numpy as np
from pgl.graph import Graph

num_nodes = 5
edges = [(0, 1), (1, 2), (3, 4)]
# Cast to float32 so the dtype matches the output shown below.
feature = np.random.randn(5, 100).astype("float32")
graph = Graph(num_nodes=num_nodes,
              edges=edges,
              node_feat={
                  "feature": feature
              })
print(graph.node_feat_info())

The output will be:

[("feature", [None, 100], "float32")]
property nodes

Return all node ids, from 0 to num_nodes - 1.

property num_edges

Return the number of edges.

property num_graph

Return the number of graphs.

property num_nodes

Return the number of nodes.

outdegree(nodes=None)[source]

Return the outdegree of the given nodes.

Parameters

nodes – Nodes whose outdegree is returned. If nodes is None, the outdegree of all nodes is returned.

Returns

A numpy.array containing the outdegree of each given node.

predecessor(nodes=None, return_eids=False)[source]

Find the predecessors of the given nodes.

Parameters
  • nodes – Nodes whose predecessors are returned. If nodes is None, the predecessors of all nodes are returned.

  • return_eids – If True, also return the corresponding edge ids (eids).

Returns

A list of numpy.ndarray, where each numpy.ndarray holds the predecessor ids of one given node. If return_eids=True, an additional list of numpy.ndarray is returned, where each numpy.ndarray holds the eids of the edges connecting that node to its predecessors.

Example

import numpy as np
from pgl.graph import Graph

num_nodes = 5
edges = [(0, 1), (1, 2), (3, 4)]
graph = Graph(num_nodes=num_nodes,
              edges=edges)
pred, pred_eid = graph.predecessor(return_eids=True)

This will give the output:

pred:
      [[],
       [0],
       [1],
       [],
       [3]]

pred_eid:
      [[],
       [0],
       [1],
       [],
       [2]]
random_walk(nodes, max_depth)[source]

Implementation of random walk.

This function generates random walk paths for the given nodes and depth.

Parameters
  • nodes – Walk starting from nodes

  • max_depth – Max walking depth

Returns

A list of walks.

sample_edges(sample_num, replace=False)[source]

Sample edges from the graph

This function helps to sample edges from all edges.

Parameters
  • sample_num – The number of samples

  • replace – Boolean; whether to sample with replacement.

Returns

(u, v), eid, each of which is a numpy.array with the same shape.
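The shape of the result can be illustrated with NumPy alone: sample edge ids, then look up their endpoints. This is a sketch of the idea under the assumption that edges are stored as a (num_edges, 2) array; it is not the library's code.

```python
import numpy as np

edges = np.array([(0, 1), (1, 2), (3, 4)])
rng = np.random.default_rng(0)
sample_num = 2

# Sample edge ids without replacement, then gather src (u) and dst (v).
eid = rng.choice(len(edges), size=sample_num, replace=False)
u, v = edges[eid, 0], edges[eid, 1]

# u, v and eid all have shape (sample_num,).
print(u.shape, v.shape, eid.shape)  # (2,) (2,) (2,)
```

Sampling without replacement requires sample_num to be at most the number of edges; with replace=True the same edge id may appear more than once.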

sample_nodes(sample_num)[source]

Sample nodes from the graph

This function helps to sample nodes from all nodes. Nodes might be duplicated.

Parameters

sample_num – The number of samples

Returns

A list of nodes

sample_predecessor(nodes, max_degree, return_eids=False, shuffle=False)[source]

Sample predecessor of given nodes.

Parameters
  • nodes – Given nodes whose predecessor will be sampled.

  • max_degree – The maximum number of predecessors sampled for each node.

  • return_eids – Whether to return the corresponding eids.

Returns

A list of numpy.ndarray, where each numpy.ndarray holds the sampled predecessor ids of one given node. If return_eids=True, an additional list of numpy.ndarray is returned, where each numpy.ndarray holds the eids of the edges connecting that node to its sampled predecessors.
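The effect of max_degree can be sketched as "keep every predecessor if there are at most max_degree, otherwise draw max_degree of them". The helper below is hypothetical and uses plain NumPy; the way pgl actually draws the sample is not described in this document.

```python
import numpy as np


def sample_predecessor_sketch(edges, nodes, max_degree, rng):
    """Hypothetical helper: sample at most max_degree predecessors per node."""
    edges = np.asarray(edges)
    sampled, sampled_eids = [], []
    for node in nodes:
        # Ids of the edges whose dst is this node.
        eids = np.flatnonzero(edges[:, 1] == node)
        if len(eids) > max_degree:
            eids = rng.choice(eids, size=max_degree, replace=False)
        sampled.append(edges[eids, 0])
        sampled_eids.append(eids)
    return sampled, sampled_eids


edges = [(0, 3), (1, 3), (2, 3), (3, 4)]
rng = np.random.default_rng(0)
pred, pred_eid = sample_predecessor_sketch(edges, [3, 4, 0], 2, rng)
print([len(p) for p in pred])  # [2, 1, 0]
```

Node 3 has three predecessors and is cut down to two, node 4 keeps its single predecessor, and node 0 has none. Sampling successors is symmetric: match on the src column and gather the dst column.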

sample_successor(nodes, max_degree, return_eids=False, shuffle=False)[source]

Sample successors of given nodes.

Parameters
  • nodes – Given nodes whose successors will be sampled.

  • max_degree – The maximum number of successors sampled for each node.

  • return_eids – Whether to return the corresponding eids.

Returns

A list of numpy.ndarray, where each numpy.ndarray holds the sampled successor ids of one given node. If return_eids=True, an additional list of numpy.ndarray is returned, where each numpy.ndarray holds the eids of the edges connecting that node to its sampled successors.

sorted_edges(sort_by='src')[source]

Return edges sorted by source or destination node.

If sort_by="src", the edges are sorted by src node; otherwise they are sorted by dst node.

Parameters

sort_by – The key to sort the edges by ("src" or "dst").

Returns

A tuple of (sorted_src, sorted_dst, sorted_eid).
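The relationship between the three returned arrays can be shown with an argsort over the src column. This is a NumPy sketch of the idea, assuming edge ids are simply positions in the original edge list.

```python
import numpy as np

edges = np.array([(3, 4), (0, 1), (1, 2)])
src, dst = edges[:, 0], edges[:, 1]

# sorted_eid[i] is the original edge id of the i-th edge after sorting.
sorted_eid = np.argsort(src, kind="stable")
sorted_src, sorted_dst = src[sorted_eid], dst[sorted_eid]

print(sorted_src)  # [0 1 3]
print(sorted_dst)  # [1 2 4]
print(sorted_eid)  # [1 2 0]
```

Sorting by "dst" would use np.argsort(dst) instead and reorder both columns the same way.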

subgraph(nodes, eid=None, edges=None, edge_feats=None, with_node_feat=True, with_edge_feat=True)[source]

Generate a subgraph from node and edge ids.

This function will generate a pgl.graph.SubGraph object and copy all corresponding node and edge features. Nodes and edges will be reindexed from 0. eid and edges can't both be None.

WARNING: All endpoint nodes of the edges selected by eid must be included in nodes.

Parameters
  • nodes – Node ids which will be included in the subgraph.

  • eid (optional) – Edge ids which will be included in the subgraph.

  • edges (optional) – A list of (src, dst) edges which will be included in the subgraph.

  • with_node_feat – Whether to inherit node features from parent graph.

  • with_edge_feat – Whether to inherit edge features from parent graph.

Returns

A pgl.graph.SubGraph object.
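The reindexing step, and the reason every edge endpoint must appear in nodes, can be illustrated in a few lines of Python. This is a conceptual sketch only; the variable names are chosen for the illustration and the code does not call pgl.

```python
import numpy as np

edges = np.array([(0, 1), (1, 2), (3, 4)])  # parent graph edges
nodes = [1, 2, 4]  # parent node ids to keep
eid = [1]          # parent edge ids to keep; edge 1 is (1, 2)

# Map each kept parent node id to a new id starting from 0.
reindex = {parent: sub for sub, parent in enumerate(nodes)}

# Translate the kept edges into subgraph ids. An endpoint that is
# missing from nodes would raise a KeyError here.
sub_edges = [(reindex[int(s)], reindex[int(d)]) for s, d in edges[eid]]

print(reindex)    # {1: 0, 2: 1, 4: 2}
print(sub_edges)  # [(0, 1)]
```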

successor(nodes=None, return_eids=False)[source]

Find the successors of the given nodes.

Parameters
  • nodes – Nodes whose successors are returned. If nodes is None, the successors of all nodes are returned.

  • return_eids – If True, also return the corresponding edge ids (eids).

Returns

A list of numpy.ndarray, where each numpy.ndarray holds the successor ids of one given node. If return_eids=True, an additional list of numpy.ndarray is returned, where each numpy.ndarray holds the eids of the edges connecting that node to its successors.

Example

import numpy as np
from pgl.graph import Graph

num_nodes = 5
edges = [(0, 1), (1, 2), (3, 4)]
graph = Graph(num_nodes=num_nodes,
              edges=edges)
succ, succ_eid = graph.successor(return_eids=True)

This will give the output:

succ:
      [[1],
       [2],
       [],
       [4],
       []]

succ_eid:
      [[0],
       [1],
       [],
       [2],
       []]
class pgl.graph.SubGraph(num_nodes, edges=None, node_feat=None, edge_feat=None, reindex=None)[source]

Bases: pgl.graph.Graph

Implementation of SubGraph in pgl.

SubGraph inherits from Graph. The best way to construct a SubGraph is to call the Graph.subgraph method.

Parameters
  • num_nodes – number of nodes in a graph

  • edges – list of (u, v) tuples

  • node_feat (optional) – a dict of numpy arrays as node features

  • edge_feat (optional) – a dict of numpy arrays as edge features (must be in the same order as edges)

  • reindex – A dictionary that maps parent graph node id to subgraph node id.

reindex_from_parrent_nodes(nodes)[source]

Map the given parent graph node id to subgraph id.

Parameters

nodes – A list of nodes from parent graph.

Returns

A list of subgraph ids.

reindex_to_parrent_nodes(nodes)[source]

Map the given subgraph node id to parent graph id.

Parameters

nodes – A list of nodes in this subgraph.

Returns

A list of node ids in parent graph.
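The two mappings are inverses of each other. Assuming reindex is a dict from parent node id to subgraph node id, as described in the parameters above, the round trip can be sketched in plain Python (the ids used here are made up for the illustration):

```python
# Parent node id -> subgraph node id (example values).
reindex = {10: 0, 20: 1, 30: 2}
# Subgraph node id -> parent node id.
inverse = {sub: parent for parent, sub in reindex.items()}

parent_nodes = [30, 10]
sub_nodes = [reindex[n] for n in parent_nodes]  # parent -> subgraph
round_trip = [inverse[n] for n in sub_nodes]    # subgraph -> parent

print(sub_nodes)   # [2, 0]
print(round_trip)  # [30, 10]
```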

class pgl.graph.MultiGraph(graph_list)[source]

Bases: pgl.graph.Graph

Implementation of a structure that holds multiple disjoint graphs in pgl.

Parameters

graph_list – A list of Graph Instances

Examples

# graph1, graph2 and graph3 are existing Graph instances.
batch_graph = MultiGraph([graph1, graph2, graph3])
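One common way to hold several disjoint graphs in a single structure is to shift the node ids of each graph by the number of nodes that precede it. The sketch below illustrates that idea with plain tuples standing in for Graph instances; whether MultiGraph uses exactly this layout internally is an assumption, not something stated in this document.

```python
import numpy as np

# Each entry is (num_nodes, edges); a stand-in for a Graph instance.
graphs = [
    (3, [(0, 1), (1, 2)]),
    (2, [(0, 1)]),
]

# offsets[i] is the first node id of graph i in the merged id space;
# the last entry is the total number of nodes.
offsets = np.cumsum([0] + [n for n, _ in graphs])

merged_edges = [
    (u + int(off), v + int(off))
    for (_, edges), off in zip(graphs, offsets)
    for u, v in edges
]

print(offsets.tolist())  # [0, 3, 5]
print(merged_edges)      # [(0, 1), (1, 2), (3, 4)]
```

Because no edge crosses an offset boundary, the merged edge list still describes disjoint graphs, and the offsets are enough to recover which graph a node belongs to.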