gliner.modeling.multitask.relations_layers module

gliner.modeling.multitask.relations_layers.compute_degree(A)

Compute the degree of each node (the diagonal of the degree matrix) from an adjacency matrix.

The degree of node i is defined as D_ii = Σ_j A_ij, representing the sum of edge weights connected to that node.

Parameters:

A (Tensor) – Adjacency matrix of shape (B, E, E) where B is batch size and E is the number of entities/nodes.

Returns:

Degree vector of shape (B, E) containing the degree for each node. Values are clamped to a minimum of 1e-6 to avoid division by zero.

Return type:

Tensor
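
The row-sum and clamp described above can be illustrated with a minimal sketch. The name degree_sketch and the eps argument are hypothetical and not part of the library; only the formula D_ii = Σ_j A_ij and the 1e-6 floor come from the description.

```python
import torch


def degree_sketch(A: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    """Hypothetical stand-in: row-sum a (B, E, E) adjacency into (B, E) degrees."""
    # D_ii = sum_j A_ij, then clamp so isolated nodes never produce a zero degree.
    return A.sum(dim=-1).clamp(min=eps)


# One graph with three nodes; the last node has no outgoing edge weight.
A = torch.tensor([[[0.0, 1.0, 0.5],
                   [1.0, 0.0, 0.0],
                   [0.0, 0.0, 0.0]]])
deg = degree_sketch(A)  # shape (1, 3)
```

The clamp matters when the degrees are later raised to a negative power, as in symmetric GCN normalization.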

gliner.modeling.multitask.relations_layers.dot_product_adjacency(X, mask=None, normalize=False)

Compute adjacency matrix using dot-product or cosine similarity.

Computes pairwise similarities between entity embeddings using either normalized (cosine similarity) or unnormalized dot products, followed by sigmoid activation.

Parameters:

X (Tensor) – Entity embeddings of shape (B, E, D) where B is batch size, E is number of entities, and D is embedding dimension.
mask (Tensor | None) – Optional mask of shape (B, E) indicating valid entities.
normalize (bool) – If True, L2-normalize embeddings before computing similarity (results in cosine similarity). Defaults to False.

Returns:

Adjacency matrix of shape (B, E, E) with values in (0, 1).

Return type:

Tensor
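
A minimal sketch of the computation described for this function, not the library's implementation. The function name is hypothetical, and the way the mask is applied (zeroing pairs that involve a padded entity) is an assumption, since the page does not say how masked entries are filled.

```python
import torch
import torch.nn.functional as F


def dot_adjacency_sketch(X, mask=None, normalize=False):
    """Hypothetical stand-in: sigmoid of pairwise (optionally cosine) similarity."""
    if normalize:
        # L2-normalize each embedding so the dot product becomes cosine similarity.
        X = F.normalize(X, p=2, dim=-1)
    scores = torch.bmm(X, X.transpose(1, 2))  # (B, E, E) pairwise dot products
    A = torch.sigmoid(scores)
    if mask is not None:
        # Assumption: an edge is kept only when both endpoints are valid entities.
        pair_mask = mask.unsqueeze(1) * mask.unsqueeze(2)
        A = A * pair_mask.to(A.dtype)
    return A


torch.manual_seed(0)
X = torch.randn(2, 4, 8)
mask = torch.tensor([[1, 1, 1, 0], [1, 1, 0, 0]])
A_masked = dot_adjacency_sketch(X, mask)
A_cos = dot_adjacency_sketch(X, normalize=True)
```

With normalize=True every diagonal entry equals sigmoid(1), because each embedding has cosine similarity 1 with itself.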

class gliner.modeling.multitask.relations_layers.MLPDecoder(in_dim, hidden_dim)

Bases: Module

MLP-based adjacency decoder using concatenated node pairs.

This decoder concatenates embeddings of node pairs and passes them through an MLP to predict edge existence. It models pairwise interactions explicitly.

Parameters:

in_dim (int) – Input embedding dimension.
hidden_dim (int) – Hidden layer dimension for the MLP.

__init__(in_dim, hidden_dim)

Initialize the MLP decoder.

Parameters:

in_dim (int) – Input embedding dimension.
hidden_dim (int) – Hidden layer dimension.

forward(X, mask=None)

Compute adjacency matrix using MLP on concatenated node pairs.

Parameters:

X (Tensor) – Entity embeddings of shape (B, E, D).
mask (Tensor | None) – Optional mask of shape (B, E) indicating valid entities.

Returns:

Adjacency matrix of shape (B, E, E) with values in (0, 1).

Return type:

Tensor
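
The pair-concatenation idea can be sketched as follows. This is an illustration under assumptions, not the library's code: the class name PairMLPSketch, the two-layer MLP with ReLU, and the mask handling are all hypothetical.

```python
import torch
import torch.nn as nn


class PairMLPSketch(nn.Module):
    """Hypothetical stand-in: score every ordered node pair with a small MLP."""

    def __init__(self, in_dim, hidden_dim):
        super().__init__()
        # Assumed architecture: Linear -> ReLU -> Linear producing one logit per pair.
        self.mlp = nn.Sequential(
            nn.Linear(2 * in_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, 1),
        )

    def forward(self, X, mask=None):
        B, E, D = X.shape
        Xi = X.unsqueeze(2).expand(B, E, E, D)  # embedding of row entity i
        Xj = X.unsqueeze(1).expand(B, E, E, D)  # embedding of column entity j
        pairs = torch.cat([Xi, Xj], dim=-1)     # (B, E, E, 2D)
        A = torch.sigmoid(self.mlp(pairs).squeeze(-1))
        if mask is not None:
            # Assumption: zero out pairs that involve a padded entity.
            A = A * (mask.unsqueeze(1) * mask.unsqueeze(2)).to(A.dtype)
        return A


torch.manual_seed(0)
decoder = PairMLPSketch(in_dim=8, hidden_dim=16)
X = torch.randn(2, 5, 8)
mask = torch.tensor([[1, 1, 1, 1, 0], [1, 1, 1, 0, 0]])
A = decoder(X, mask)
```

Unlike a dot-product score, the concatenation [x_i, x_j] is order-sensitive, so the resulting adjacency need not be symmetric. The cost is an intermediate tensor of size B × E × E × 2D.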

class gliner.modeling.multitask.relations_layers.AttentionAdjacency(d_model, nhead)

Bases: Module

Adjacency matrix derived from multi-head attention weights.

Uses PyTorch’s multi-head attention mechanism to compute pairwise attention scores, which are averaged across heads to form the adjacency matrix.

Parameters:

d_model (int) – Model dimension (embedding size).
nhead (int) – Number of attention heads.

__init__(d_model, nhead)

Initialize the attention-based adjacency module.

Parameters:

d_model (int) – Model dimension for attention.
nhead (int) – Number of attention heads.

forward(X, mask=None)

Compute adjacency matrix from attention weights.

Parameters:

X (Tensor) – Entity embeddings of shape (B, E, D).
mask (Tensor | None) – Optional mask of shape (B, E) where 1 indicates valid entities.

Returns:

Adjacency matrix of shape (B, E, E) computed from averaged attention weights.

Return type:

Tensor
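
A minimal sketch of reading an adjacency matrix off head-averaged attention weights with torch.nn.MultiheadAttention. The class name is hypothetical, and converting the mask to a key_padding_mask is an assumption about how padding is handled; the average_attn_weights argument requires a reasonably recent PyTorch release.

```python
import torch
import torch.nn as nn


class AttentionAdjacencySketch(nn.Module):
    """Hypothetical stand-in: self-attention weights, averaged over heads."""

    def __init__(self, d_model, nhead):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, nhead, batch_first=True)

    def forward(self, X, mask=None):
        # key_padding_mask expects True at positions that should be ignored,
        # which is the inverse of a "1 = valid" mask.
        key_padding_mask = None if mask is None else (mask == 0)
        _, weights = self.attn(
            X, X, X,
            key_padding_mask=key_padding_mask,
            need_weights=True,
            average_attn_weights=True,  # mean over heads -> (B, E, E)
        )
        return weights


torch.manual_seed(0)
layer = AttentionAdjacencySketch(d_model=16, nhead=4)
X = torch.randn(2, 5, 16)
mask = torch.tensor([[1, 1, 1, 1, 0], [1, 1, 1, 0, 0]])
A = layer(X, mask)
```

Because the weights come from a softmax, each row sums to 1 over the valid keys, so the values behave like a row-stochastic matrix rather than independent edge probabilities. In this sketch padded query rows are not zeroed.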

class gliner.modeling.multitask.relations_layers.BilinearDecoder(in_dim, latent_dim)

Bases: Module

Bilinear decoder for adjacency prediction.

Projects embeddings to a latent space and computes adjacency as the sigmoid of the bilinear product Z @ Z^T.

Parameters:

in_dim (int) – Input embedding dimension.
latent_dim (int) – Latent projection dimension.

__init__(in_dim, latent_dim)

Initialize the bilinear decoder.

Parameters:

in_dim (int) – Input embedding dimension.
latent_dim (int) – Dimension of the latent projection space.

forward(X, mask=None)

Compute adjacency using bilinear projection.

Parameters:

X (Tensor) – Entity embeddings of shape (B, E, D).
mask (Tensor | None) – Optional mask of shape (B, E) indicating valid entities.

Returns:

Adjacency matrix of shape (B, E, E) with values in (0, 1).

Return type:

Tensor
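
The projection followed by sigmoid(Z @ Z^T) can be sketched as below. The class name, the use of a single nn.Linear for the projection, and the mask handling are assumptions made for illustration.

```python
import torch
import torch.nn as nn


class BilinearSketch(nn.Module):
    """Hypothetical stand-in: project to a latent space, then sigmoid(Z Z^T)."""

    def __init__(self, in_dim, latent_dim):
        super().__init__()
        self.proj = nn.Linear(in_dim, latent_dim)  # assumed form of the projection

    def forward(self, X, mask=None):
        Z = self.proj(X)                          # (B, E, latent_dim)
        A = torch.sigmoid(Z @ Z.transpose(1, 2))  # (B, E, E)
        if mask is not None:
            # Assumption: zero out pairs that involve a padded entity.
            A = A * (mask.unsqueeze(1) * mask.unsqueeze(2)).to(A.dtype)
        return A


torch.manual_seed(0)
decoder = BilinearSketch(in_dim=8, latent_dim=4)
X = torch.randn(2, 5, 8)
A = decoder(X)
```

Since Z @ Z^T is symmetric, this form always yields a symmetric adjacency, which suits undirected relations but cannot distinguish the direction of an edge.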

class gliner.modeling.multitask.relations_layers.SimpleGCNLayer(in_dim, out_dim)

Bases: Module

Simple Graph Convolutional Network layer with symmetric normalization.

Implements the GCN propagation rule: H = ReLU(D^(-1/2) A D^(-1/2) X W) where D is the degree matrix, A is the adjacency with self-loops, and W is a learnable weight matrix.

Parameters:

in_dim (int) – Input feature dimension.
out_dim (int) – Output feature dimension.

__init__(in_dim, out_dim)

Initialize the GCN layer.

Parameters:

in_dim (int) – Input feature dimension.
out_dim (int) – Output feature dimension.

forward(X, A, mask=None)

Apply graph convolution with symmetric normalization.

Parameters:

X (Tensor) – Node features of shape (B, E, D).
A (Tensor) – Adjacency matrix of shape (B, E, E).
mask (Tensor | None) – Optional mask of shape (B, E). Self-loops are added only to valid (non-masked) nodes.

Returns:

Updated node features of shape (B, E, out_dim).

Return type:

Tensor
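
A minimal sketch of the propagation rule H = ReLU(D^(-1/2) A D^(-1/2) X W) with self-loops added only to valid nodes. The class name and the bias-free linear layer standing in for W are assumptions; the 1e-6 clamp mirrors the one described for compute_degree.

```python
import torch
import torch.nn as nn


class GCNLayerSketch(nn.Module):
    """Hypothetical stand-in for a symmetric-normalized GCN layer."""

    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.linear = nn.Linear(in_dim, out_dim, bias=False)  # plays the role of W

    def forward(self, X, A, mask=None):
        B, E, _ = A.shape
        eye = torch.eye(E, device=A.device, dtype=A.dtype).expand(B, E, E)
        if mask is not None:
            # Self-loops only on valid (non-masked) nodes.
            eye = eye * mask.unsqueeze(-1).to(A.dtype)
        A_hat = A + eye
        deg = A_hat.sum(dim=-1).clamp(min=1e-6)  # (B, E) node degrees
        d_inv_sqrt = deg.pow(-0.5)
        # D^(-1/2) A_hat D^(-1/2), written with broadcasting instead of diag matrices.
        A_norm = d_inv_sqrt.unsqueeze(-1) * A_hat * d_inv_sqrt.unsqueeze(-2)
        return torch.relu(A_norm @ self.linear(X))


torch.manual_seed(0)
layer = GCNLayerSketch(in_dim=8, out_dim=6)
X = torch.randn(2, 4, 8)
A_empty = torch.zeros(2, 4, 4)  # no edges: only self-loops remain
H = layer(X, A_empty)
```

With an empty adjacency the normalized matrix reduces to the identity, so the layer degenerates to ReLU(X W). This makes a convenient sanity check.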

class gliner.modeling.multitask.relations_layers.GCNDecoder(in_dim, hidden_dim)

Bases: Module

GCN-based adjacency decoder.

First computes an initial adjacency using dot-product similarity, applies a GCN layer to update node representations, then predicts the final adjacency from the updated representations.

Parameters:

in_dim (int) – Input embedding dimension.
hidden_dim (int) – Hidden dimension for GCN and projection layers.

__init__(in_dim, hidden_dim)

Initialize the GCN decoder.

Parameters:

in_dim (int) – Input embedding dimension.
hidden_dim (int) – Hidden dimension for the GCN layer and projection.

forward(X, mask=None)

Compute adjacency using GCN refinement.

Parameters:

X (Tensor) – Entity embeddings of shape (B, E, D).
mask (Tensor | None) – Optional mask of shape (B, E) indicating valid entities.

Returns:

Adjacency matrix of shape (B, E, E) with values in (0, 1).

Return type:

Tensor
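
The three stages (initial dot-product adjacency, one GCN update, final adjacency) can be sketched end to end. Everything beyond that outline is an assumption: the class name, the inlined GCN step, the linear projection, and the choice of sigmoid(Z @ Z^T) as the final predictor are hypothetical, since the page does not specify how the final adjacency is computed.

```python
import torch
import torch.nn as nn


class GCNDecoderSketch(nn.Module):
    """Hypothetical stand-in: dot-product adjacency -> GCN update -> adjacency."""

    def __init__(self, in_dim, hidden_dim):
        super().__init__()
        self.gcn_weight = nn.Linear(in_dim, hidden_dim, bias=False)
        self.proj = nn.Linear(hidden_dim, hidden_dim)  # assumed output projection

    def forward(self, X, mask=None):
        B, E, _ = X.shape
        valid = torch.ones(B, E, dtype=X.dtype, device=X.device) if mask is None else mask.to(X.dtype)
        pair = valid.unsqueeze(1) * valid.unsqueeze(2)

        # Stage 1: initial adjacency from dot-product similarity.
        A0 = torch.sigmoid(X @ X.transpose(1, 2)) * pair

        # Stage 2: one symmetric-normalized GCN step with self-loops on valid nodes.
        A_hat = A0 + torch.eye(E, dtype=X.dtype, device=X.device) * valid.unsqueeze(-1)
        d_inv_sqrt = A_hat.sum(dim=-1).clamp(min=1e-6).pow(-0.5)
        A_norm = d_inv_sqrt.unsqueeze(-1) * A_hat * d_inv_sqrt.unsqueeze(-2)
        H = torch.relu(A_norm @ self.gcn_weight(X))

        # Stage 3 (assumed): project and score pairs again on refined features.
        Z = self.proj(H)
        return torch.sigmoid(Z @ Z.transpose(1, 2)) * pair


torch.manual_seed(0)
decoder = GCNDecoderSketch(in_dim=8, hidden_dim=6)
X = torch.randn(2, 5, 8)
mask = torch.tensor([[1, 1, 1, 1, 0], [1, 1, 1, 0, 0]])
A = decoder(X, mask)
```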

class gliner.modeling.multitask.relations_layers.GATDecoder(d_model, nhead, hidden_dim)

Bases: Module

Graph Attention Network (GAT) based adjacency decoder.

Uses multi-head attention to update node representations, then predicts adjacency from the transformed features.

Parameters:

d_model (int) – Model dimension for attention.
nhead (int) – Number of attention heads.
hidden_dim (int) – Hidden dimension for the final projection.

__init__(d_model, nhead, hidden_dim)

Initialize the GAT decoder.

Parameters:

d_model (int) – Model dimension for attention mechanism.
nhead (int) – Number of attention heads.
hidden_dim (int) – Hidden dimension for the output projection.

forward(X, mask=None)

Compute adjacency using GAT refinement.

Parameters:

X (Tensor) – Entity embeddings of shape (B, E, D).
mask (Tensor | None) – Optional mask of shape (B, E) indicating valid entities.

Returns:

Adjacency matrix of shape (B, E, E) with values in (0, 1).

Return type:

Tensor
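
A rough sketch of attention-based refinement followed by adjacency prediction. The class name, the single nn.MultiheadAttention update, the linear projection, and the sigmoid(Z @ Z^T) predictor are assumptions for illustration. The page states only that multi-head attention updates the node representations before adjacency is predicted.

```python
import torch
import torch.nn as nn


class GATDecoderSketch(nn.Module):
    """Hypothetical stand-in: attention update, projection, pairwise scoring."""

    def __init__(self, d_model, nhead, hidden_dim):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, nhead, batch_first=True)
        self.proj = nn.Linear(d_model, hidden_dim)  # assumed output projection

    def forward(self, X, mask=None):
        # True marks padded positions that attention should ignore.
        key_padding_mask = None if mask is None else (mask == 0)
        H, _ = self.attn(X, X, X, key_padding_mask=key_padding_mask, need_weights=False)
        Z = self.proj(H)
        A = torch.sigmoid(Z @ Z.transpose(1, 2))  # assumed final predictor
        if mask is not None:
            # Assumption: zero out pairs that involve a padded entity.
            A = A * (mask.unsqueeze(1) * mask.unsqueeze(2)).to(A.dtype)
        return A


torch.manual_seed(0)
decoder = GATDecoderSketch(d_model=16, nhead=4, hidden_dim=8)
X = torch.randn(2, 5, 16)
mask = torch.tensor([[1, 1, 1, 1, 0], [1, 1, 1, 0, 0]])
A = decoder(X, mask)
```

Note that d_model must be divisible by nhead for nn.MultiheadAttention to construct successfully.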

class gliner.modeling.multitask.relations_layers.RelationsRepLayer(in_dim, relation_mode, **kwargs)

Bases: Module

Unified wrapper for different adjacency computation methods.

This layer provides a common interface for various approaches to computing adjacency matrices from entity embeddings, including:

- ‘dot’: Dot-product/cosine similarity
- ‘mlp’: MLP-based pairwise decoder
- ‘attention’/‘attn’: Multi-head attention weights
- ‘bilinear’: Bilinear projection
- ‘gcn’: Graph convolutional refinement
- ‘gat’: Graph attention network

All methods support masked inputs for handling variable-length sequences.

Parameters:

in_dim (int) – Input embedding dimension.
relation_mode (str) – String specifying the adjacency computation method. One of: ‘dot’, ‘mlp’, ‘attention’, ‘attn’, ‘bilinear’, ‘gcn’, ‘gat’.
**kwargs (Any) – Additional arguments passed to specific decoders:

- hidden_dim (int): For ‘mlp’, ‘gcn’, ‘gat’. Defaults to in_dim.
- nhead (int): For ‘attention’/‘attn’ and ‘gat’. Defaults to 8.
- latent_dim (int): For ‘bilinear’. Defaults to in_dim.

Raises:

ValueError – If relation_mode is not one of the supported methods.

Example

>>> import torch
>>> from gliner.modeling.multitask.relations_layers import RelationsRepLayer
>>> layer = RelationsRepLayer(in_dim=128, relation_mode="gcn", hidden_dim=64)
>>> X = torch.randn(4, 10, 128)  # (batch=4, entities=10, dim=128)
>>> mask = torch.ones(4, 10)  # All entities valid
>>> A = layer(X, mask)  # (4, 10, 10) adjacency matrix
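
How the documented keyword defaults and the ValueError might fit together can be sketched with a small helper. The function resolve_decoder_kwargs and the SUPPORTED_MODES tuple are hypothetical and do not exist in the library; only the mode names and the defaults (hidden_dim and latent_dim fall back to in_dim, nhead to 8) are taken from the parameter description above.

```python
from typing import Any, Dict

# Mode names as listed in the parameter description.
SUPPORTED_MODES = ("dot", "mlp", "attention", "attn", "bilinear", "gcn", "gat")


def resolve_decoder_kwargs(in_dim: int, relation_mode: str, **kwargs: Any) -> Dict[str, Any]:
    """Hypothetical helper: pick the keyword arguments relevant to one mode."""
    if relation_mode not in SUPPORTED_MODES:
        raise ValueError(f"Unknown relation_mode: {relation_mode!r}")
    resolved: Dict[str, Any] = {}
    if relation_mode in ("mlp", "gcn", "gat"):
        resolved["hidden_dim"] = kwargs.get("hidden_dim", in_dim)
    if relation_mode in ("attention", "attn", "gat"):
        resolved["nhead"] = kwargs.get("nhead", 8)
    if relation_mode == "bilinear":
        resolved["latent_dim"] = kwargs.get("latent_dim", in_dim)
    return resolved


gcn_kwargs = resolve_decoder_kwargs(128, "gcn", hidden_dim=64)
gat_kwargs = resolve_decoder_kwargs(128, "gat")
```

Under these assumptions, ‘dot’ needs no extra arguments, and ‘gat’ is the only mode that consumes both hidden_dim and nhead.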

__init__(in_dim, relation_mode, **kwargs)

Initialize the relations representation layer.

Parameters:

in_dim (int) – Input embedding dimension.
relation_mode (str) – Adjacency computation method. One of: ‘dot’, ‘mlp’, ‘attention’, ‘attn’, ‘bilinear’, ‘gcn’, ‘gat’.
**kwargs (Any) – Method-specific arguments (hidden_dim, nhead, latent_dim).

Raises:

ValueError – If relation_mode is not recognized.

forward(X, mask=None, *args, **kwargs)

Compute adjacency matrix from entity embeddings.

Parameters:

X (Tensor) – Entity/mention embeddings of shape (B, E, D) where B is batch size, E is number of entities, and D is embedding dimension.
mask (Tensor | None) – Optional mask of shape (B, E) where 1 indicates valid entities and 0 indicates padding.
*args (Any) – Additional positional arguments (unused, for compatibility).
**kwargs (Any) – Additional keyword arguments (unused, for compatibility).

Returns:

Adjacency matrix of shape (B, E, E) with values in [0, 1]. Entries A[b, i, j] represent the predicted edge weight from entity i to entity j in batch b.

Return type:

Tensor