pgl.graph module: Graph Storage
This package implements a graph structure for handling graph data.
-
class pgl.graph.Graph(num_nodes, edges=None, node_feat=None, edge_feat=None)
Bases: object
A simple implementation of a graph structure in pgl.
- Parameters
num_nodes – Number of nodes in the graph.
edges – List of (u, v) tuples.
node_feat (optional) – A dict of numpy arrays used as node features.
edge_feat (optional) – A dict of numpy arrays used as edge features (must follow the same order as edges).
Examples
    import numpy as np
    from pgl.graph import Graph

    num_nodes = 5
    edges = [(0, 1), (1, 2), (3, 4)]
    feature = np.random.randn(5, 100)       # one row per node
    edge_feature = np.random.randn(3, 100)  # one row per edge, same order as edges
    graph = Graph(num_nodes=num_nodes,
                  edges=edges,
                  node_feat={"feature": feature},
                  edge_feat={"edge_feature": edge_feature})
-
property adj_dst_index
Return an EdgeIndex object for dst.
-
property adj_src_index
Return an EdgeIndex object for src.
-
property edge_feat
Return a dictionary of edge features.
-
edge_feat_info()
Return the information of edge features for GraphWrapper.
This function returns the information of the edge features and is used to help construct a GraphWrapper.
- Returns
A list of (name, shape, dtype) tuples, one for each edge feature.
Examples
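    import numpy as np
    from pgl.graph import Graph

    num_nodes = 5
    edges = [(0, 1), (1, 2), (3, 4)]
    # np.random.randn returns float64; cast so the dtype is float32.
    feature = np.random.randn(3, 100).astype("float32")
    graph = Graph(num_nodes=num_nodes,
                  edges=edges,
                  edge_feat={"feature": feature})
    print(graph.edge_feat_info())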
The output will be:
[("feature", [None, 100], "float32")]
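The shape of the returned tuples can be illustrated without pgl. The following is a minimal NumPy sketch, not the library's implementation: the helper name `feat_info` is hypothetical, and it assumes that the leading (per-edge or per-node) dimension is reported as None and that the dtype is reported as a string.

```python
import numpy as np


def feat_info(feat_dict):
    """Hypothetical helper: summarise a dict of feature arrays."""
    info = []
    for name, value in feat_dict.items():
        # Assumption: the first dimension (number of edges or nodes) is
        # variable, so it is replaced by None in the reported shape.
        shape = [None] + list(value.shape[1:])
        info.append((name, shape, str(value.dtype)))
    return info


edge_feat = {"feature": np.random.randn(3, 100).astype("float32")}
info = feat_info(edge_feat)
print(info)
```

The same helper applies unchanged to a dict of node features.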
-
property edges
Return all edges as a numpy.ndarray with shape (num_edges, 2).
-
property graph_lod
Return the graph LoD index for Paddle computation.
-
has_edges_between(u, v)
Check whether the given edges exist in the graph.
- Parameters
u – A numpy.array of src node IDs.
v – A numpy.array of dst node IDs.
- Returns
exists, a numpy.array of bool with the same shape as u and v. exists[i] is True if (u[i], v[i]) is an edge in the graph, False otherwise.
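The element-wise semantics can be sketched in plain NumPy. This is an illustrative stand-alone function under the assumption that edges are directed (src, dst) pairs; it is not how pgl implements the method.

```python
import numpy as np


def has_edges_between(edges, u, v):
    """Hypothetical stand-alone version of the edge membership test."""
    # A set of (src, dst) tuples gives constant-time lookups.
    edge_set = {(int(s), int(d)) for s, d in edges}
    return np.array(
        [(int(a), int(b)) in edge_set for a, b in zip(u, v)], dtype=bool
    )


edges = [(0, 1), (1, 2), (3, 4)]
u = np.array([0, 1, 4])
v = np.array([1, 0, 3])
exists = has_edges_between(edges, u, v)
print(exists)
```

Only (0, 1) is present as written; (1, 0) and (4, 3) are the reverse of existing edges and are therefore reported as missing in this directed reading.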
-
indegree(nodes=None)
Return the indegree of the given nodes.
- Parameters
nodes – The nodes whose indegree is returned. If nodes is None, return the indegree of all nodes.
- Returns
A numpy.ndarray containing the indegree of the given nodes.
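As a rough illustration of what is being counted, the indegree can be computed from an edge list with `np.bincount` on the destination column. The function below is a stand-alone sketch with a hypothetical signature, not the pgl method itself.

```python
import numpy as np


def indegree(num_nodes, edges, nodes=None):
    """Hypothetical helper: count incoming edges per node."""
    edges = np.asarray(edges, dtype="int64").reshape(-1, 2)
    # Column 1 holds the dst node of every edge.
    deg = np.bincount(edges[:, 1], minlength=num_nodes)
    if nodes is None:
        return deg
    return deg[np.asarray(nodes)]


edges = [(0, 1), (1, 2), (3, 4)]
all_deg = indegree(5, edges)
some_deg = indegree(5, edges, nodes=[2, 0])
print(all_deg, some_deg)
```

Counting on column 0 instead of column 1 gives the outdegree in the same way.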
-
node2vec_random_walk(nodes, max_depth, p=1.0, q=1.0)
Implementation of node2vec-style random walk.
Reference paper: https://cs.stanford.edu/~jure/pubs/node2vec-kdd16.pdf
- Parameters
nodes – Nodes from which the walks start.
max_depth – Maximum walk depth.
p – Return parameter.
q – In-out parameter.
- Returns
A list of walks.
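The roles of p and q can be sketched as follows, based on the transition weights described in the referenced paper: 1/p for returning to the previous node, 1 for moving to a common neighbor of the previous node, and 1/q otherwise. This is a minimal single-walk sketch over a hypothetical adjacency dict. It assumes that max_depth counts steps and that a walk stops early at a node without neighbors, and it is not the pgl implementation.

```python
import numpy as np


def node2vec_walk(adj, start, max_depth, p=1.0, q=1.0, rng=None):
    """Hypothetical helper: one second-order biased walk from `start`."""
    rng = np.random.default_rng() if rng is None else rng
    walk = [start]
    for _ in range(max_depth):
        cur = walk[-1]
        nbrs = adj.get(cur, [])
        if not nbrs:
            break  # dead end: stop the walk early
        if len(walk) == 1:
            # No previous node yet, so the first step is uniform.
            nxt = nbrs[rng.integers(len(nbrs))]
        else:
            prev = walk[-2]
            prev_nbrs = set(adj.get(prev, []))
            weights = np.array([
                1.0 / p if x == prev            # return to previous node
                else 1.0 if x in prev_nbrs      # stay close to previous node
                else 1.0 / q                    # move further away
                for x in nbrs
            ])
            nxt = nbrs[rng.choice(len(nbrs), p=weights / weights.sum())]
        walk.append(nxt)
    return walk


adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
rng = np.random.default_rng(0)
walks = [node2vec_walk(adj, s, max_depth=4, p=0.5, q=2.0, rng=rng)
         for s in [0, 3]]
print(walks)
```

With p = q = 1 all weights are equal and the walk reduces to a uniform random walk.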
-
node_batch_iter(batch_size, shuffle=True)
Iterate over all nodes in batches.
- Parameters
batch_size – The number of nodes in each batch.
shuffle – Whether to shuffle the nodes.
- Returns
A batch iterator.
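A batch iterator of this kind can be sketched as a generator over a (possibly shuffled) array of node ids. The version below is an assumption-laden illustration: the stand-alone signature and the `rng` argument are hypothetical, and the last batch is assumed to be allowed to be smaller than batch_size.

```python
import numpy as np


def node_batch_iter(num_nodes, batch_size, shuffle=True, rng=None):
    """Hypothetical helper: yield node ids in batches."""
    rng = np.random.default_rng() if rng is None else rng
    perm = np.arange(num_nodes)
    if shuffle:
        rng.shuffle(perm)  # in-place permutation of the node ids
    for start in range(0, num_nodes, batch_size):
        yield perm[start:start + batch_size]


batches = list(node_batch_iter(5, batch_size=2,
                               rng=np.random.default_rng(0)))
for batch in batches:
    print(batch)
```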
-
property node_feat
Return a dictionary of node features.
-
node_feat_info()
Return the information of node features for GraphWrapper.
This function returns the information of the node features and is used to help construct a GraphWrapper.
- Returns
A list of (name, shape, dtype) tuples, one for each node feature.
Examples
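    import numpy as np
    from pgl.graph import Graph

    num_nodes = 5
    edges = [(0, 1), (1, 2), (3, 4)]
    # np.random.randn returns float64; cast so the dtype is float32.
    feature = np.random.randn(5, 100).astype("float32")
    graph = Graph(num_nodes=num_nodes,
                  edges=edges,
                  node_feat={"feature": feature})
    print(graph.node_feat_info())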
The output will be:
[("feature", [None, 100], "float32")]
-
property nodes
Return all node ids, from 0 to num_nodes - 1.
-
property num_edges
Return the number of edges.
-
property num_graph
Return the number of graphs.
-
property num_nodes
Return the number of nodes.
-
outdegree(nodes=None)
Return the outdegree of the given nodes.
- Parameters
nodes – The nodes whose outdegree is returned. If nodes is None, return the outdegree of all nodes.
- Returns
A numpy.ndarray containing the outdegree of the given nodes.
-
predecessor(nodes=None, return_eids=False)
Find the predecessors of the given nodes.
- Parameters
nodes – The nodes whose predecessors are returned. If nodes is None, return the predecessors of all nodes.
return_eids – If True, also return the corresponding edge ids.
- Returns
A list of numpy.ndarray, where each numpy.ndarray holds the predecessor ids of one given node. If return_eids=True, an additional list of numpy.ndarray is returned, where each numpy.ndarray holds the ids of the edges that connect the node to its predecessors.
Example
    import numpy as np
    from pgl.graph import Graph

    num_nodes = 5
    edges = [(0, 1), (1, 2), (3, 4)]
    graph = Graph(num_nodes=num_nodes, edges=edges)
    pred, pred_eid = graph.predecessor(return_eids=True)
This gives the following output:
    pred: [[], [0], [1], [], [3]]
    pred_eid: [[], [0], [1], [], [2]]
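The output above can be reproduced from the edge list alone, which may help in reading it: edge ids are positions in the edge list, and each edge (src, dst) contributes src to the predecessor list of dst. The function below is a stand-alone sketch with a hypothetical name, not the pgl implementation.

```python
import numpy as np


def predecessors(num_nodes, edges):
    """Hypothetical helper: predecessor ids and edge ids for every node."""
    pred = [[] for _ in range(num_nodes)]
    pred_eid = [[] for _ in range(num_nodes)]
    for eid, (src, dst) in enumerate(edges):
        pred[dst].append(src)      # src points at dst
        pred_eid[dst].append(eid)  # eid is the edge's position in `edges`
    return ([np.array(p, dtype="int64") for p in pred],
            [np.array(e, dtype="int64") for e in pred_eid])


edges = [(0, 1), (1, 2), (3, 4)]
pred, pred_eid = predecessors(5, edges)
print([p.tolist() for p in pred])
print([e.tolist() for e in pred_eid])
```

Swapping the roles of src and dst in the loop yields successor lists in the same layout.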
-
random_walk(nodes, max_depth)
Implementation of random walk.
This function generates random walk paths for the given nodes and depth.
- Parameters
nodes – Nodes from which the walks start.
max_depth – Maximum walk depth.
- Returns
A list of walks.
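A uniform random walk over successors can be sketched as below. The successor dict, the `rng` argument, and the early stop at nodes without successors are assumptions made for illustration; the sketch also assumes that max_depth counts steps, so a walk holds at most max_depth + 1 nodes.

```python
import numpy as np


def random_walk(succ, nodes, max_depth, rng=None):
    """Hypothetical helper: one uniform walk per start node."""
    rng = np.random.default_rng() if rng is None else rng
    walks = [[int(n)] for n in nodes]
    for walk in walks:
        for _ in range(max_depth):
            nbrs = succ.get(walk[-1], [])
            if not nbrs:
                break  # no outgoing edge: the walk ends here
            walk.append(nbrs[rng.integers(len(nbrs))])
    return walks


# Successor lists for edges (0, 1), (1, 2), (3, 4).
succ = {0: [1], 1: [2], 3: [4]}
walks = random_walk(succ, nodes=[0, 3], max_depth=3,
                    rng=np.random.default_rng(0))
print(walks)
```

Every node in this small graph has at most one successor, so the walks are deterministic and both stop before reaching max_depth.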
-
sample_edges(sample_num, replace=False)
Sample edges from the graph.
This function samples edges from the set of all edges.
- Parameters
sample_num – The number of samples.
replace – Boolean; whether to sample with replacement.
- Returns
(u, v), eid, where each is a numpy.array with the same shape.
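The return layout can be illustrated with a short NumPy sketch that samples edge ids and then looks up their endpoints. The stand-alone signature and the `rng` argument are hypothetical; this is not the pgl implementation.

```python
import numpy as np


def sample_edges(edges, sample_num, replace=False, rng=None):
    """Hypothetical helper: sample edge ids, then gather their endpoints."""
    rng = np.random.default_rng() if rng is None else rng
    edges = np.asarray(edges, dtype="int64").reshape(-1, 2)
    eid = rng.choice(len(edges), size=sample_num, replace=replace)
    u, v = edges[eid, 0], edges[eid, 1]
    return (u, v), eid


edges = [(0, 1), (1, 2), (3, 4)]
(u, v), eid = sample_edges(edges, sample_num=2,
                           rng=np.random.default_rng(0))
print(u, v, eid)
```

Without replacement, sample_num cannot exceed the number of edges; `rng.choice` raises a ValueError in that case.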
-
sample_nodes(sample_num)
Sample nodes from the graph.
This function samples nodes from the set of all nodes. The sampled nodes may contain duplicates.
- Parameters
sample_num – The number of samples.
- Returns
A list of nodes.
-
sample_predecessor(nodes, max_degree, return_eids=False, shuffle=False)
Sample predecessors of the given nodes.
- Parameters
nodes – The nodes whose predecessors will be sampled.
max_degree – The maximum number of predecessors sampled for each node.
return_eids – Whether to return the corresponding edge ids.
- Returns
A list of numpy.ndarray, where each numpy.ndarray holds the sampled predecessor ids of one given node. If return_eids=True, an additional list of numpy.ndarray is returned, where each numpy.ndarray holds the ids of the edges that connect the node to its predecessors.
-
sample_successor(nodes, max_degree, return_eids=False, shuffle=False)
Sample successors of the given nodes.
- Parameters
nodes – The nodes whose successors will be sampled.
max_degree – The maximum number of successors sampled for each node.
return_eids – Whether to return the corresponding edge ids.
- Returns
A list of numpy.ndarray, where each numpy.ndarray holds the sampled successor ids of one given node. If return_eids=True, an additional list of numpy.ndarray is returned, where each numpy.ndarray holds the ids of the edges that connect the node to its successors.
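Degree-capped neighbor sampling can be sketched as follows. The sketch assumes that a node with at most max_degree successors keeps all of them and that larger neighborhoods are sampled without replacement; the dict-based inputs and the `rng` argument are hypothetical, and the behavior of the shuffle flag is not modeled.

```python
import numpy as np


def sample_successor(succ, succ_eid, nodes, max_degree, rng=None):
    """Hypothetical helper: keep at most `max_degree` successors per node."""
    rng = np.random.default_rng() if rng is None else rng
    out_nodes, out_eids = [], []
    for n in nodes:
        nbrs = np.asarray(succ.get(n, []), dtype="int64")
        eids = np.asarray(succ_eid.get(n, []), dtype="int64")
        if len(nbrs) > max_degree:
            # Use the same index subset for ids and eids to keep them aligned.
            keep = rng.choice(len(nbrs), size=max_degree, replace=False)
            nbrs, eids = nbrs[keep], eids[keep]
        out_nodes.append(nbrs)
        out_eids.append(eids)
    return out_nodes, out_eids


# Successors and edge ids for edges [(0, 1), (0, 2), (0, 3), (1, 2)].
succ = {0: [1, 2, 3], 1: [2]}
succ_eid = {0: [0, 1, 2], 1: [3]}
s, e = sample_successor(succ, succ_eid, nodes=[0, 1, 2], max_degree=2,
                        rng=np.random.default_rng(0))
print([x.tolist() for x in s])
print([x.tolist() for x in e])
```

The same routine applied to predecessor lists gives the predecessor variant.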
-
sorted_edges(sort_by='src')
Return edges sorted by source or destination node.
If sort_by="src", the edges are sorted by src nodes; otherwise they are sorted by dst nodes.
- Parameters
sort_by – The key used to sort the edges ("src" or "dst").
- Returns
A tuple of (sorted_src, sorted_dst, sorted_eid).
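The relationship between the three returned arrays can be sketched with `np.argsort`: the sort order doubles as the array of original edge ids. This stand-alone function is an illustration with a hypothetical signature; the stable sort is an assumption chosen so that ties keep their original order.

```python
import numpy as np


def sorted_edges(edges, sort_by="src"):
    """Hypothetical helper: sort an edge list by src or dst node."""
    if sort_by not in ("src", "dst"):
        raise ValueError("sort_by should be 'src' or 'dst'")
    edges = np.asarray(edges, dtype="int64").reshape(-1, 2)
    key = edges[:, 0] if sort_by == "src" else edges[:, 1]
    # A stable sort keeps edges with equal keys in their original order.
    order = np.argsort(key, kind="stable")
    return edges[order, 0], edges[order, 1], order


edges = [(3, 4), (0, 1), (1, 2), (0, 4)]
src, dst, eid = sorted_edges(edges, sort_by="dst")
print(src, dst, eid)
```

Indexing any per-edge feature array with the returned eid array puts the features in the same sorted order.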
-
subgraph(nodes, eid=None, edges=None, edge_feats=None, with_node_feat=True, with_edge_feat=True)
Generate a subgraph from nodes and edge ids.
This function generates a pgl.graph.SubGraph object and copies all corresponding node and edge features. Nodes and edges are reindexed from 0. eid and edges can't both be None.
WARNING: All endpoint nodes of the edges in eid must be included in nodes.
- Parameters
nodes – Node ids to include in the subgraph.
eid (optional) – Edge ids to include in the subgraph.
edges (optional) – List of (src, dst) edges to include in the subgraph.
with_node_feat – Whether to inherit node features from the parent graph.
with_edge_feat – Whether to inherit edge features from the parent graph.
- Returns
A pgl.graph.SubGraph object.
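The reindexing step and the warning above can be illustrated with a small sketch that maps parent node ids to consecutive subgraph ids and rewrites the selected edges. The function name and return layout are hypothetical and feature copying is omitted; raising a ValueError for an edge whose endpoint is missing from nodes is a choice made for this sketch, not documented pgl behavior.

```python
import numpy as np


def induced_subgraph(edges, nodes, eid):
    """Hypothetical helper: reindex selected nodes and edges from 0."""
    edges = np.asarray(edges, dtype="int64").reshape(-1, 2)
    # Map each parent node id to its new id in the subgraph.
    reindex = {int(n): i for i, n in enumerate(nodes)}
    sub_edges = []
    for e in eid:
        src, dst = (int(x) for x in edges[e])
        if src not in reindex or dst not in reindex:
            raise ValueError(
                "edge %d touches a node that is not in nodes" % e)
        sub_edges.append((reindex[src], reindex[dst]))
    return len(nodes), sub_edges, reindex


edges = [(0, 1), (1, 2), (3, 4)]
num_nodes, sub_edges, reindex = induced_subgraph(
    edges, nodes=[1, 2, 4], eid=[1])
print(num_nodes, sub_edges, reindex)
```

Here parent edge 1, which is (1, 2), becomes (0, 1) in the subgraph. Selecting edge 2, which is (3, 4), would fail because node 3 is not in nodes.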
-
successor(nodes=None, return_eids=False)
Find the successors of the given nodes.
- Parameters
nodes – The nodes whose successors are returned. If nodes is None, return the successors of all nodes.
return_eids – If True, also return the corresponding edge ids.
- Returns
A list of numpy.ndarray, where each numpy.ndarray holds the successor ids of one given node. If return_eids=True, an additional list of numpy.ndarray is returned, where each numpy.ndarray holds the ids of the edges that connect the node to its successors.
Example
    import numpy as np
    from pgl.graph import Graph

    num_nodes = 5
    edges = [(0, 1), (1, 2), (3, 4)]
    graph = Graph(num_nodes=num_nodes, edges=edges)
    succ, succ_eid = graph.successor(return_eids=True)
This gives the following output:
    succ: [[1], [2], [], [4], []]
    succ_eid: [[0], [1], [], [2], []]
-
class pgl.graph.SubGraph(num_nodes, edges=None, node_feat=None, edge_feat=None, reindex=None)
Bases: pgl.graph.Graph
Implementation of SubGraph in pgl.
SubGraph inherits from Graph. The best way to construct a subgraph is to use the Graph.subgraph method, which generates a SubGraph object.
- Parameters
num_nodes – Number of nodes in the graph.
edges – List of (u, v) tuples.
node_feat (optional) – A dict of numpy arrays used as node features.
edge_feat (optional) – A dict of numpy arrays used as edge features (must follow the same order as edges).
reindex – A dictionary that maps parent graph node ids to subgraph node ids.
-
class pgl.graph.MultiGraph(graph_list)
Bases: pgl.graph.Graph
A simple implementation of a structure holding multiple disjoint graphs in pgl.
- Parameters
graph_list – A list of Graph instances.
Examples
batch_graph = MultiGraph([graph1, graph2, graph3])
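One common way to hold several disjoint graphs in a single structure is to shift the node ids of each graph by the number of nodes that precede it. The sketch below illustrates that idea on plain (num_nodes, edges) pairs. It is an assumption about the general technique, not a description of how MultiGraph is implemented, and the cumulative offset array is only a plausible analogue of the graph_lod property.

```python
import numpy as np


def batch_graphs(graphs):
    """Hypothetical helper: merge (num_nodes, edges) pairs into one graph."""
    # offsets[i] is the first node id of graph i in the merged graph.
    offsets = np.cumsum([0] + [n for n, _ in graphs])
    merged = []
    for (_, edges), off in zip(graphs, offsets[:-1]):
        e = np.asarray(edges, dtype="int64").reshape(-1, 2)
        merged.append(e + off)  # shift both endpoints by the graph's offset
    if merged:
        merged_edges = np.concatenate(merged, axis=0)
    else:
        merged_edges = np.zeros((0, 2), dtype="int64")
    return int(offsets[-1]), merged_edges, offsets


g1 = (3, [(0, 1), (1, 2)])
g2 = (2, [(0, 1)])
total, edges, offsets = batch_graphs([g1, g2])
print(total)
print(edges.tolist())
print(offsets.tolist())
```

Because the id ranges do not overlap, no edge connects two different input graphs, and the offsets are enough to split per-node results back by graph.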