gliner.onnx.model module
ONNX Runtime inference models for GLiNER.
This module provides ONNX Runtime implementations of various GLiNER model architectures, including uni-encoder and bi-encoder variants for both span-level and token-level named entity recognition, as well as relation extraction models.
- class gliner.onnx.model.BaseORTModel(session)
Bases: ABC
Base class for ONNX Runtime inference models.
Provides common functionality for preparing inputs, running inference, and managing ONNX session I/O. All concrete ORT model implementations should inherit from this class.
- session
ONNX Runtime inference session.
- input_names
Dictionary mapping input names to their indices.
- output_names
Dictionary mapping output names to their indices.
- __init__(session)
Initialize the ONNX Runtime model.
- Parameters:
session (InferenceSession) – ONNX Runtime inference session.
- prepare_inputs(inputs)
Prepare inputs for ONNX model inference.
Converts PyTorch tensors to numpy arrays and filters out inputs that are not expected by the ONNX model.
- Parameters:
inputs (Dict[str, Tensor]) – Dictionary of input names and PyTorch tensors.
- Returns:
Dictionary of input names and numpy arrays ready for ONNX inference.
- Raises:
ValueError – If inputs is not a dictionary.
- Return type:
Dict[str, ndarray]
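The conversion and filtering described for prepare_inputs can be pictured with the following minimal sketch. It is not the library's code: the function name prepare_inputs_sketch and its expected_names argument are hypothetical, and tensors are detected by duck typing so the example runs with numpy alone.

```python
from typing import Any, Dict, Iterable

import numpy as np


def prepare_inputs_sketch(inputs: Dict[str, Any], expected_names: Iterable[str]) -> Dict[str, np.ndarray]:
    """Hypothetical stand-in for prepare_inputs; an illustration only."""
    if not isinstance(inputs, dict):
        raise ValueError("inputs must be a dictionary")
    expected = set(expected_names)
    prepared: Dict[str, np.ndarray] = {}
    for name, value in inputs.items():
        if name not in expected:
            continue  # drop inputs the ONNX graph does not declare
        if hasattr(value, "detach"):
            # Duck-typed PyTorch tensor: detach, move to CPU, convert to numpy.
            value = value.detach().cpu().numpy()
        prepared[name] = np.asarray(value)
    return prepared


example = prepare_inputs_sketch(
    {"input_ids": [[101, 7592, 102]], "attention_mask": [[1, 1, 1]], "labels": [[0]]},
    expected_names=["input_ids", "attention_mask"],
)
```

Here the "labels" entry is dropped because it is not among the expected names, while the other two entries come back as numpy arrays.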
- run_inference(inputs)
Run the ONNX model inference.
- Parameters:
inputs (Dict[str, ndarray]) – Prepared inputs for the model as numpy arrays.
- Returns:
Dictionary mapping output names to their corresponding numpy arrays.
- Return type:
Dict[str, ndarray]
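One plausible way to obtain a name-to-array mapping from an ONNX Runtime session is sketched below. FakeSession and FakeNodeArg are hypothetical stubs that mimic only the get_outputs() and run() methods of InferenceSession, so the example runs without a model file; run_inference_sketch is likewise an illustrative name, not the library's method.

```python
from typing import Dict, List

import numpy as np


class FakeNodeArg:
    """Hypothetical stand-in for an ONNX Runtime output descriptor."""

    def __init__(self, name: str) -> None:
        self.name = name


class FakeSession:
    """Hypothetical stub mimicking the InferenceSession methods used here."""

    def get_outputs(self) -> List[FakeNodeArg]:
        return [FakeNodeArg("logits")]

    def run(self, output_names, input_feed):
        # Pretend model: one zero score per input token.
        return [np.zeros(input_feed["input_ids"].shape, dtype=np.float32)]


def run_inference_sketch(session, inputs: Dict[str, np.ndarray]) -> Dict[str, np.ndarray]:
    names = [out.name for out in session.get_outputs()]
    # Passing None as output_names requests every output in declaration order.
    values = session.run(None, inputs)
    return dict(zip(names, values))


outputs = run_inference_sketch(FakeSession(), {"input_ids": np.array([[101, 7592, 102]])})
```

With a real InferenceSession in place of the stub, the same zip of output names and returned arrays would apply.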
- abstract forward(input_ids, attention_mask, **kwargs)
Perform forward pass through the model.
Abstract method that must be implemented by subclasses to define model-specific forward pass logic.
- Parameters:
input_ids – Input token IDs.
attention_mask – Attention mask for input tokens.
**kwargs – Additional model-specific arguments.
- Returns:
Dictionary containing model outputs.
- Return type:
Dict[str, Any]
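The abstract-method contract can be illustrated with a miniature of the pattern. BaseSketch and EchoModel below are hypothetical classes written for this example, not part of gliner; they show only that the base class cannot be instantiated until forward is implemented.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict


class BaseSketch(ABC):
    """Hypothetical miniature of the base-class pattern."""

    def __init__(self, session: Any) -> None:
        self.session = session

    @abstractmethod
    def forward(self, input_ids, attention_mask, **kwargs) -> Dict[str, Any]:
        ...


class EchoModel(BaseSketch):
    def forward(self, input_ids, attention_mask, **kwargs) -> Dict[str, Any]:
        # A real subclass would prepare inputs and call the ONNX session here.
        return {"logits": [[0.0 for _ in row] for row in input_ids]}


try:
    BaseSketch(session=None)
    base_is_instantiable = True
except TypeError:
    # ABC refuses to instantiate a class with unimplemented abstract methods.
    base_is_instantiable = False

result = EchoModel(session=None).forward([[101, 102]], [[1, 1]], ignored=True)
```

Extra keyword arguments such as ignored=True are absorbed by **kwargs, which mirrors the signature documented above.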
- class gliner.onnx.model.UniEncoderSpanORTModel(session)
Bases: BaseORTModel
ONNX Runtime model for uni-encoder span-level NER.
Uses a single encoder to process both text and entity labels, performing span-level entity recognition.
Initialize the ONNX Runtime model.
- Parameters:
session (InferenceSession) – ONNX Runtime inference session.
- forward(input_ids, attention_mask, words_mask, text_lengths, span_idx, span_mask, **kwargs)
Forward pass for span model using ONNX inference.
- Parameters:
input_ids (Tensor) – Tensor of shape (batch_size, seq_len) containing input token IDs.
attention_mask (Tensor) – Tensor of shape (batch_size, seq_len) with 1s for real tokens and 0s for padding.
words_mask (Tensor) – Tensor of shape (batch_size, seq_len) indicating word boundaries.
text_lengths (Tensor) – Tensor of shape (batch_size,) containing the actual length of each text sequence.
span_idx (Tensor) – Tensor containing indices of spans to classify.
span_mask (Tensor) – Tensor indicating which spans are valid (not padding).
**kwargs – Additional arguments (ignored).
- Returns:
GLiNERBaseOutput containing logits for span classification.
- Return type:
Dict[str, Any]
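To make span_idx and span_mask concrete, the sketch below enumerates candidate spans for a single text. The layout is an assumption made for illustration (inclusive word-level start and end indices, with max_width candidates per start position); build_spans is a hypothetical helper, and the exact layout a given exported model expects should be checked against its preprocessing code.

```python
import numpy as np


def build_spans(num_words: int, max_width: int):
    """Enumerate candidate (start, end) spans with inclusive word indices."""
    span_idx = np.array(
        [(start, start + width) for start in range(num_words) for width in range(max_width)],
        dtype=np.int64,
    ).reshape(-1, 2)
    # A span is valid only if its end stays inside the text.
    span_mask = span_idx[:, 1] < num_words
    return span_idx, span_mask


span_idx, span_mask = build_spans(num_words=3, max_width=2)
```

For three words and a maximum width of two this yields six candidates, of which the last one, (2, 3), runs past the end of the text and is masked out.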
- class gliner.onnx.model.BiEncoderSpanORTModel(session)
Bases: BaseORTModel
ONNX Runtime model for bi-encoder span-level NER.
Uses separate encoders for text and entity labels, performing span-level entity recognition with bi-encoder architecture.
Initialize the ONNX Runtime model.
- Parameters:
session (InferenceSession) – ONNX Runtime inference session.
- forward(input_ids, attention_mask, words_mask, text_lengths, span_idx, span_mask, labels_embeds=None, labels_input_ids=None, labels_attention_mask=None, **kwargs)
Forward pass for bi-encoder span model using ONNX inference.
- Parameters:
input_ids (Tensor) – Tensor of shape (batch_size, seq_len) containing input token IDs.
attention_mask (Tensor) – Tensor of shape (batch_size, seq_len) with 1s for real tokens and 0s for padding.
words_mask (Tensor) – Tensor of shape (batch_size, seq_len) indicating word boundaries.
text_lengths (Tensor) – Tensor of shape (batch_size,) containing the actual length of each text sequence.
span_idx (Tensor) – Tensor containing indices of spans to classify.
span_mask (Tensor) – Tensor indicating which spans are valid (not padding).
labels_embeds (Tensor | None) – Optional pre-computed embeddings for entity labels. If provided, labels_input_ids and labels_attention_mask are ignored.
labels_input_ids (FloatTensor | None) – Optional tensor containing token IDs for entity labels. Used when labels_embeds is not provided.
labels_attention_mask (LongTensor | None) – Optional attention mask for entity label tokens. Used when labels_embeds is not provided.
**kwargs – Additional arguments (ignored).
- Returns:
GLiNERBaseOutput containing logits for span classification.
- Return type:
Dict[str, Any]
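The precedence between labels_embeds and the tokenized label inputs can be sketched as a small selection helper. select_label_inputs is a hypothetical function for illustration; in particular, raising ValueError when neither form is complete is an assumption of this sketch, not documented behaviour.

```python
from typing import Any, Dict, Optional


def select_label_inputs(
    labels_embeds: Optional[Any] = None,
    labels_input_ids: Optional[Any] = None,
    labels_attention_mask: Optional[Any] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: choose which label inputs to forward."""
    if labels_embeds is not None:
        # Pre-computed embeddings win; the tokenized label inputs are ignored.
        return {"labels_embeds": labels_embeds}
    if labels_input_ids is None or labels_attention_mask is None:
        # Assumption of this sketch: an incomplete pair is treated as an error.
        raise ValueError("provide labels_embeds, or both labels_input_ids and labels_attention_mask")
    return {
        "labels_input_ids": labels_input_ids,
        "labels_attention_mask": labels_attention_mask,
    }


with_embeds = select_label_inputs(labels_embeds=[[0.1, 0.2]], labels_input_ids=[[5, 6]])
with_tokens = select_label_inputs(labels_input_ids=[[5, 6]], labels_attention_mask=[[1, 1]])
```

Pre-computing the label embeddings once and passing them as labels_embeds avoids re-encoding a fixed label set on every call.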
- class gliner.onnx.model.UniEncoderTokenORTModel(session)
Bases: BaseORTModel
ONNX Runtime model for uni-encoder token-level NER.
Uses a single encoder to process both text and entity labels, performing token-level entity recognition.
Initialize the ONNX Runtime model.
- Parameters:
session (InferenceSession) – ONNX Runtime inference session.
- forward(input_ids, attention_mask, words_mask, text_lengths, **kwargs)
Forward pass for token model using ONNX inference.
- Parameters:
input_ids (Tensor) – Tensor of shape (batch_size, seq_len) containing input token IDs.
attention_mask (Tensor) – Tensor of shape (batch_size, seq_len) with 1s for real tokens and 0s for padding.
words_mask (Tensor) – Tensor of shape (batch_size, seq_len) indicating word boundaries.
text_lengths (Tensor) – Tensor of shape (batch_size,) containing the actual length of each text sequence.
**kwargs – Additional arguments (ignored).
- Returns:
GLiNERBaseOutput containing logits for token classification.
- Return type:
Dict[str, Any]
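As an illustration of what a words_mask "indicating word boundaries" might look like, the sketch below marks the first sub-token of each word with that word's 1-based index and sets every other position to 0. This encoding is an assumption made for the example, and build_words_mask is a hypothetical helper; the word_ids list imitates the per-token word indices that fast tokenizers commonly expose, with None for special tokens.

```python
from typing import List, Optional


def build_words_mask(word_ids: List[Optional[int]]) -> List[int]:
    """Mark the first sub-token of each word with its 1-based word index."""
    mask: List[int] = []
    previous: Optional[int] = None
    for word_id in word_ids:
        if word_id is not None and word_id != previous:
            mask.append(word_id + 1)  # first sub-token of a new word
        else:
            mask.append(0)  # special token or continuation piece
        previous = word_id
    return mask


# Six sub-tokens: [CLS], word 0, word 1 (two pieces), word 2, [SEP].
word_ids = [None, 0, 1, 1, 2, None]
words_mask = build_words_mask(word_ids)
# Under this encoding the text length in words is the largest mask value.
text_length = max(words_mask)
```

Under this assumed encoding, text_lengths counts words rather than sub-tokens, which is why it can be smaller than seq_len.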
- class gliner.onnx.model.BiEncoderTokenORTModel(session)
Bases: BaseORTModel
ONNX Runtime model for bi-encoder token-level NER.
Uses separate encoders for text and entity labels, performing token-level entity recognition with bi-encoder architecture.
Initialize the ONNX Runtime model.
- Parameters:
session (InferenceSession) – ONNX Runtime inference session.
- forward(input_ids, attention_mask, words_mask, text_lengths, labels_embeds=None, labels_input_ids=None, labels_attention_mask=None, **kwargs)
Forward pass for bi-encoder token model using ONNX inference.
- Parameters:
input_ids (Tensor) – Tensor of shape (batch_size, seq_len) containing input token IDs.
attention_mask (Tensor) – Tensor of shape (batch_size, seq_len) with 1s for real tokens and 0s for padding.
words_mask (Tensor) – Tensor of shape (batch_size, seq_len) indicating word boundaries.
text_lengths (Tensor) – Tensor of shape (batch_size,) containing the actual length of each text sequence.
labels_embeds (Tensor | None) – Optional pre-computed embeddings for entity labels. If provided, labels_input_ids and labels_attention_mask are ignored.
labels_input_ids (FloatTensor | None) – Optional tensor containing token IDs for entity labels. Used when labels_embeds is not provided.
labels_attention_mask (LongTensor | None) – Optional attention mask for entity label tokens. Used when labels_embeds is not provided.
**kwargs – Additional arguments (ignored).
- Returns:
GLiNERBaseOutput containing logits for token classification.
- Return type:
Dict[str, Any]
- class gliner.onnx.model.UniEncoderSpanRelexORTModel(session)
Bases: BaseORTModel
ONNX Runtime model for uni-encoder span-level relation extraction.
Uses a single encoder to process text and perform both entity recognition and relation extraction at the span level.
Initialize the ONNX Runtime model.
- Parameters:
session (InferenceSession) – ONNX Runtime inference session.
- forward(input_ids, attention_mask, words_mask, text_lengths, span_idx, span_mask, **kwargs)
Forward pass for span relation extraction model using ONNX inference.
- Parameters:
input_ids (Tensor) – Tensor of shape (batch_size, seq_len) containing input token IDs.
attention_mask (Tensor) – Tensor of shape (batch_size, seq_len) with 1s for real tokens and 0s for padding.
words_mask (Tensor) – Tensor of shape (batch_size, seq_len) indicating word boundaries.
text_lengths (Tensor) – Tensor of shape (batch_size,) containing the actual length of each text sequence.
span_idx (Tensor) – Tensor containing indices of spans to classify.
span_mask (Tensor) – Tensor indicating which spans are valid (not padding).
**kwargs – Additional arguments (ignored).
- Returns:
GLiNERRelexOutput containing logits for span classification, relation indices, relation logits, and relation mask.
- Return type:
Dict[str, Any]