pyg_lib.sampler

neighbor_sample(rowptr: Tensor, col: Tensor, seed: Tensor, num_neighbors: List[int], time: Optional[Tensor] = None, seed_time: Optional[Tensor] = None, csc: bool = False, replace: bool = False, directed: bool = True, disjoint: bool = False, temporal_strategy: str = 'uniform', return_edge_id: bool = True) → Tuple[Tensor, Tensor, Tensor, Optional[Tensor]]

Recursively samples neighbors from all node indices in seed in the graph given by (rowptr, col).

Note

For temporal sampling, the col vector needs to be sorted according to time within individual neighborhoods since we use binary search to find neighbors that fulfill temporal constraints.
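The sorting requirement above can be sketched in plain Python. This is only an illustration, not the library's implementation: lists stand in for tensors, the helper names are hypothetical, and the inclusive comparison (`<=`) in the binary search is an assumption.

```python
from bisect import bisect_right


def sort_neighborhoods_by_time(rowptr, col, time):
    """Return a copy of col with each node's neighbors ordered by timestamp."""
    sorted_col = []
    for v in range(len(rowptr) - 1):
        neighborhood = col[rowptr[v]:rowptr[v + 1]]
        sorted_col.extend(sorted(neighborhood, key=lambda n: time[n]))
    return sorted_col


def num_valid_neighbors(rowptr, col, time, v, seed_time):
    """Count neighbors of v with timestamp <= seed_time (assumed inclusive).

    Binary search is only valid because the neighborhood is time-sorted.
    """
    times = [time[n] for n in col[rowptr[v]:rowptr[v + 1]]]
    return bisect_right(times, seed_time)


# Toy graph: node 0 has neighbors 2, 3, 1; node 1 has neighbor 0.
rowptr = [0, 3, 4, 4, 4]
col = [2, 3, 1, 0]
time = [5, 2, 9, 1]  # one timestamp per node

col_sorted = sort_neighborhoods_by_time(rowptr, col, time)
valid = num_valid_neighbors(rowptr, col_sorted, time, v=0, seed_time=5)
```

Reordering col changes edge positions, so any per-edge data would need the same permutation applied.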

Parameters

rowptr (torch.Tensor) – Compressed source node indices.
col (torch.Tensor) – Target node indices.
seed (torch.Tensor) – The seed node indices.
num_neighbors (List[int]) – The number of neighbors to sample for each node in each iteration. If an entry is set to -1, all neighbors will be included.
time (torch.Tensor, optional) – Timestamps for the nodes in the graph. If set, temporal sampling will be used such that neighbors are guaranteed to fulfill temporal constraints, i.e. neighbors have an earlier timestamp than the seed node. If used, the col vector needs to be sorted according to time within individual neighborhoods. Requires disjoint=True. (default: None)
seed_time (torch.Tensor, optional) – Optional values to override the timestamp for seed nodes. If not set, will use timestamps in time as default for seed nodes. (default: None)
csc (bool, optional) – If set to True, assumes that the graph is given in CSC format (colptr, row). (default: False)
replace (bool, optional) – If set to True, will sample with replacement. (default: False)
directed (bool, optional) – If set to False, will include all edges between all sampled nodes. (default: True)
disjoint (bool, optional) – If set to True, will create disjoint subgraphs for every seed node. (default: False)
temporal_strategy (string, optional) – The sampling strategy when using temporal sampling ("uniform", "last"). (default: "uniform")
return_edge_id (bool, optional) – If set to False, will not return the indices of edges of the original graph. (default: True)
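As a rough illustration of the (rowptr, col) layout expected by the parameters above, the following sketch builds compressed source indices from an edge list. It uses plain Python lists in place of tensors, and edges_to_csr is a hypothetical helper, not part of pyg_lib.

```python
def edges_to_csr(num_nodes, edges):
    """Convert (source, target) pairs into compressed sparse row form."""
    edges = sorted(edges)  # group edges by source node
    rowptr = [0] * (num_nodes + 1)
    for src, _ in edges:
        rowptr[src + 1] += 1  # count outgoing edges per source
    for i in range(num_nodes):
        rowptr[i + 1] += rowptr[i]  # prefix sum turns counts into offsets
    col = [dst for _, dst in edges]
    return rowptr, col


rowptr, col = edges_to_csr(3, [(0, 1), (0, 2), (1, 2), (2, 0)])
# Neighbors of node v are col[rowptr[v]:rowptr[v + 1]].
neighbors_of_0 = col[rowptr[0]:rowptr[1]]
```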

Returns

Row indices, col indices of the returned subtree/subgraph, as well as original node indices for all nodes sampled. In addition, may return the indices of edges of the original graph.

Return type

(torch.Tensor, torch.Tensor, torch.Tensor, Optional[torch.Tensor])
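The recursive sampling described above can be approximated by the following pure-Python sketch. It covers only the simplest case (no replacement, no temporal constraints, not disjoint), and the ordering of its outputs is an assumption rather than a guarantee about what neighbor_sample() returns. The function name and the rng argument are hypothetical.

```python
import random


def sample_neighbors_sketch(rowptr, col, seed, num_neighbors, rng=None):
    """Multi-hop neighbor sampling on a CSR graph (illustrative only)."""
    rng = rng or random.Random()
    nodes = list(dict.fromkeys(seed))  # sampled nodes in discovery order
    local = {n: i for i, n in enumerate(nodes)}  # original id -> local index
    rows, cols, edge_ids = [], [], []
    frontier = list(nodes)
    for k in num_neighbors:  # one entry per hop
        next_frontier = []
        for v in frontier:
            candidates = list(range(rowptr[v], rowptr[v + 1]))
            if 0 <= k < len(candidates):  # k == -1 keeps all neighbors
                candidates = rng.sample(candidates, k)
            for e in candidates:
                w = col[e]
                if w not in local:
                    local[w] = len(nodes)
                    nodes.append(w)
                    next_frontier.append(w)
                rows.append(local[v])
                cols.append(local[w])
                edge_ids.append(e)
        frontier = next_frontier
    return rows, cols, nodes, edge_ids


rowptr = [0, 2, 3, 4]
col = [1, 2, 2, 0]
rows, cols, nodes, edge_ids = sample_neighbors_sketch(
    rowptr, col, seed=[0], num_neighbors=[-1, -1])
```

Here rows and cols index into nodes, which maps local positions back to original node indices.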

hetero_neighbor_sample(rowptr_dict: Dict[Tuple[str, str, str], Tensor], col_dict: Dict[Tuple[str, str, str], Tensor], seed_dict: Dict[str, Tensor], num_neighbors_dict: Dict[Tuple[str, str, str], List[int]], time_dict: Optional[Dict[str, Tensor]] = None, seed_time_dict: Optional[Dict[str, Tensor]] = None, csc: bool = False, replace: bool = False, directed: bool = True, disjoint: bool = False, temporal_strategy: str = 'uniform', return_edge_id: bool = True) → Tuple[Dict[Tuple[str, str, str], Tensor], Dict[Tuple[str, str, str], Tensor], Dict[str, Tensor], Optional[Dict[Tuple[str, str, str], Tensor]]]

Recursively samples neighbors from all node indices in seed_dict in the heterogeneous graph given by (rowptr_dict, col_dict).

Note

Similar to neighbor_sample(), but expects a dictionary of node types (str) and edge types (Tuple[str, str, str]) for each non-boolean argument.

Parameters: kwargs – Arguments of neighbor_sample().
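To make the dictionary layout concrete, here is a minimal sketch of inputs for a toy heterogeneous graph, keyed as in the signature above. The node types, edge types, and values are hypothetical, plain lists stand in for tensors, and whether every node type must appear in seed_dict is not stated here.

```python
# Hypothetical toy graph: 2 authors, 3 papers.
writes = ("author", "writes", "paper")
cites = ("paper", "cites", "paper")

rowptr_dict = {
    writes: [0, 2, 3],    # one entry per author, plus one
    cites: [0, 1, 1, 2],  # one entry per paper, plus one
}
col_dict = {
    writes: [0, 1, 2],    # target paper indices
    cites: [1, 0],
}
seed_dict = {"author": [0, 1]}  # keyed by node type
num_neighbors_dict = {          # keyed by edge type, one entry per hop
    writes: [2, 2],
    cites: [2, 2],
}

# Basic consistency check: the last rowptr entry equals the number of edges.
consistent = all(
    rowptr_dict[edge_type][-1] == len(col_dict[edge_type])
    for edge_type in rowptr_dict
)
```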

subgraph(rowptr: Tensor, col: Tensor, nodes: Tensor, return_edge_id: bool = True) → Tuple[Tensor, Tensor, Optional[Tensor]]

Returns the induced subgraph of the graph given by (rowptr, col), containing only the nodes in nodes.

Parameters

rowptr (torch.Tensor) – Compressed source node indices.
col (torch.Tensor) – Target node indices.
nodes (torch.Tensor) – Node indices of the induced subgraph.
return_edge_id (bool, optional) – If set to False, will not return the indices of edges of the original graph contained in the induced subgraph. (default: True)

Returns

Compressed source node indices and target node indices of the induced subgraph. In addition, may return the indices of edges of the original graph.

Return type

(torch.Tensor, torch.Tensor, Optional[torch.Tensor])
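The induced-subgraph operation can be sketched in pure Python as follows. This is an illustration under assumptions, not the library's code: nodes are relabeled by their position in nodes, lists stand in for tensors, and induced_subgraph_sketch is a hypothetical name.

```python
def induced_subgraph_sketch(rowptr, col, nodes):
    """Keep only edges whose source and target are both in nodes."""
    local = {n: i for i, n in enumerate(nodes)}  # original id -> new index
    out_rowptr, out_col, edge_ids = [0], [], []
    for v in nodes:
        for e in range(rowptr[v], rowptr[v + 1]):
            w = col[e]
            if w in local:
                out_col.append(local[w])  # relabeled target
                edge_ids.append(e)        # position in the original col
        out_rowptr.append(len(out_col))
    return out_rowptr, out_col, edge_ids


rowptr = [0, 2, 3, 4]
col = [1, 2, 2, 0]
sub_rowptr, sub_col, sub_edge_ids = induced_subgraph_sketch(
    rowptr, col, nodes=[0, 2])
```

For this toy graph, only the edges 0 → 2 and 2 → 0 survive, since node 1 is excluded.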

random_walk(rowptr: Tensor, col: Tensor, seed: Tensor, walk_length: int, p: float = 1.0, q: float = 1.0) → Tensor

Samples random walks of length walk_length from all node indices in seed in the graph given by (rowptr, col), as described in the “node2vec: Scalable Feature Learning for Networks” paper.

Parameters

rowptr (torch.Tensor) – Compressed source node indices.
col (torch.Tensor) – Target node indices.
seed (torch.Tensor) – Seed node indices from where random walks start.
walk_length (int) – The walk length of a random walk.
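p (float, optional) – Return parameter controlling the likelihood of immediately revisiting a node in the walk. Following the node2vec paper, lower values make an immediate revisit more likely. (default: 1.0)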
q (float, optional) – Control parameter to interpolate between breadth-first strategy and depth-first strategy. (default: 1.0)
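The roles of p and q can be illustrated with the unnormalized transition weights from the node2vec paper. The sketch below is not taken from pyg_lib; node2vec_weight and its arguments are hypothetical names used for illustration only.

```python
def node2vec_weight(prev, candidate, neighbors_of_prev, p=1.0, q=1.0):
    """Unnormalized weight for stepping to candidate, having arrived from prev."""
    if candidate == prev:
        return 1.0 / p  # step back to the previous node
    if candidate in neighbors_of_prev:
        return 1.0      # stay close to prev (breadth-first flavor)
    return 1.0 / q      # move further away (depth-first flavor)


# The walk came from node 0, whose neighbors are {1}; candidates are 0, 1, 2.
weights = [node2vec_weight(0, c, {1}, p=2.0, q=0.5) for c in (0, 1, 2)]
total = sum(weights)
probabilities = [w / total for w in weights]
```

With p = q = 1.0 all weights are equal, and the walk reduces to a uniform random walk.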

Returns

A tensor of shape [seed.size(0), walk_length + 1], holding the node indices of each walk for each seed node.

Return type

torch.Tensor
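The output shape can be seen in a minimal pure-Python sketch of the uniform case (p = q = 1.0). It is not the library's implementation: lists stand in for tensors, the function name is hypothetical, and the handling of nodes without outgoing edges (the walk stays in place) is an assumption.

```python
import random


def uniform_random_walk_sketch(rowptr, col, seed, walk_length, rng=None):
    """Return one walk of walk_length + 1 nodes per seed node."""
    rng = rng or random.Random()
    walks = []
    for start in seed:
        walk = [start]
        for _ in range(walk_length):
            v = walk[-1]
            begin, end = rowptr[v], rowptr[v + 1]
            if end > begin:
                walk.append(col[rng.randrange(begin, end)])
            else:
                walk.append(v)  # assumption: dead ends repeat the node
        walks.append(walk)
    return walks


# Directed 3-cycle 0 -> 1 -> 2 -> 0, so every walk is deterministic.
rowptr = [0, 1, 2, 3]
col = [1, 2, 0]
walks = uniform_random_walk_sketch(rowptr, col, seed=[0, 1], walk_length=3)
```

Each row starts with its seed node, which is why the second dimension is walk_length + 1.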