gliner package¶
- class gliner.GLiNER(*args, **kwargs)[source]¶
Bases:
Module, PyTorchModelHubMixin

Meta GLiNER class that automatically instantiates the appropriate GLiNER variant.
This class provides a unified interface for all GLiNER models, automatically dispatching to the appropriate specialized model type based on the model configuration. It supports several NER architectures, including uni-encoder, bi-encoder, decoder-based, and relation extraction models.
- The class automatically detects the model type based on:
span_mode: Token-level vs span-level
labels_encoder: Uni-encoder vs bi-encoder
labels_decoder: Standard vs decoder-based
relations_layer: NER-only vs joint entity-relation extraction
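The dispatch above can be sketched as a simple check on which configuration fields are set. This is a minimal illustration only; the variant names returned here are hypothetical labels, not the library's actual class names:

```python
# Minimal sketch of GLiNER's automatic variant detection.
# The returned names are illustrative, not the real internal class names.
def detect_variant(config: dict) -> str:
    if config.get("relations_layer"):
        return "relation-extraction"   # joint entity-relation model
    if config.get("labels_decoder"):
        return "decoder-based"         # labels produced by a decoder
    if config.get("labels_encoder"):
        return "bi-encoder"            # separate encoder for label texts
    # span_mode distinguishes token-level from span-level uni-encoders;
    # "token_level" is used here as a placeholder value.
    if config.get("span_mode") == "token_level":
        return "uni-encoder/token"
    return "uni-encoder/span"
```

Checks are ordered from most to least specialized, so a config that sets both relations_layer and labels_encoder resolves to the relation-extraction variant.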
- model¶
The loaded GLiNER model instance (automatically typed).
- config¶
Model configuration.
- data_processor¶
Data processor for the model.
- decoder¶
Decoder for predictions.
Examples
Load a pretrained uni-encoder span model:
>>> model = GLiNER.from_pretrained("urchade/gliner_small-v2.1")
Load a bi-encoder model:
>>> model = GLiNER.from_pretrained("knowledgator/gliner-bi-small-v1.0")
Load from local configuration:
>>> config = GLiNERConfig.from_pretrained("config.json")
>>> model = GLiNER.from_config(config)
Initialize from scratch:
>>> config = GLiNERConfig(model_name="microsoft/deberta-v3-small")
>>> model = GLiNER(config)
Initialize a GLiNER model with automatic type detection.
This constructor determines the appropriate GLiNER variant based on the configuration and replaces itself with an instance of that variant.
- Parameters:
config (str | Path | GLiNERConfig) – Model configuration (GLiNERConfig object, path to config file, or dict).
**kwargs – Additional arguments passed to the specific GLiNER variant.
Examples
>>> config = GLiNERConfig(model_name="bert-base-cased")
>>> model = GLiNER(config)
>>> model = GLiNER("path/to/gliner_config.json")
- __init__(config, **kwargs)[source]¶
Initialize a GLiNER model with automatic type detection.
This constructor determines the appropriate GLiNER variant based on the configuration and replaces itself with an instance of that variant.
- Parameters:
config (str | Path | GLiNERConfig) – Model configuration (GLiNERConfig object, path to config file, or dict).
**kwargs – Additional arguments passed to the specific GLiNER variant.
Examples
>>> config = GLiNERConfig(model_name="bert-base-cased")
>>> model = GLiNER(config)
>>> model = GLiNER("path/to/gliner_config.json")
- classmethod from_pretrained(model_id, revision=None, cache_dir=None, force_download=False, proxies=None, resume_download=False, local_files_only=False, token=None, map_location='cpu', strict=False, load_tokenizer=None, resize_token_embeddings=True, compile_torch_model=False, load_onnx_model=False, onnx_model_file='model.onnx', max_length=None, max_width=None, post_fusion_schema=None, _attn_implementation=None, **model_kwargs)[source]¶
Load a pretrained GLiNER model with automatic type detection.
This method loads the configuration, determines the appropriate GLiNER variant, and delegates to that variant’s from_pretrained method.
- Parameters:
model_id (str) – Model identifier or local path.
revision (str | None) – Model revision.
cache_dir (str | Path | None) – Cache directory.
force_download (bool) – Force redownload.
proxies (dict | None) – Proxy configuration.
resume_download (bool) – Resume interrupted downloads.
local_files_only (bool) – Only use local files.
token (str | bool | None) – HF token for private repos.
map_location (str) – Device to map model to.
strict (bool) – Enforce strict state_dict loading.
load_tokenizer (bool | None) – Whether to load tokenizer.
resize_token_embeddings (bool | None) – Whether to resize embeddings.
compile_torch_model (bool | None) – Whether to compile with torch.compile.
load_onnx_model (bool | None) – Whether to load ONNX model instead of PyTorch.
onnx_model_file (str | None) – Path to ONNX model file.
max_length (int | None) – Override max_length in config.
max_width (int | None) – Override max_width in config.
post_fusion_schema (str | None) – Override post_fusion_schema in config.
_attn_implementation (str | None) – Override attention implementation.
**model_kwargs – Additional model initialization arguments.
- Returns:
Appropriate GLiNER model instance.
Examples
>>> model = GLiNER.from_pretrained("urchade/gliner_small-v2.1")
>>> model = GLiNER.from_pretrained("knowledgator/gliner-bi-small-v1.0")
>>> model = GLiNER.from_pretrained("path/to/local/model")
- classmethod from_config(config, cache_dir=None, load_tokenizer=True, resize_token_embeddings=True, backbone_from_pretrained=True, compile_torch_model=False, map_location='cpu', max_length=None, max_width=None, post_fusion_schema=None, _attn_implementation=None, **model_kwargs)[source]¶
Create a GLiNER model from configuration.
- Parameters:
config (GLiNERConfig | str | Path | dict) – Model configuration (GLiNERConfig object, path to config file, or dict).
cache_dir (str | Path | None) – Cache directory for downloads.
load_tokenizer (bool) – Whether to load tokenizer.
resize_token_embeddings (bool) – Whether to resize token embeddings.
backbone_from_pretrained (bool) – Whether to load the backbone encoder from pretrained weights.
compile_torch_model (bool) – Whether to compile with torch.compile.
map_location (str) – Device to map model to.
max_length (int | None) – Override max_length in config.
max_width (int | None) – Override max_width in config.
post_fusion_schema (str | None) – Override post_fusion_schema in config.
_attn_implementation (str | None) – Override attention implementation.
**model_kwargs – Additional model initialization arguments.
- Returns:
Initialized GLiNER model instance.
Examples
>>> config = GLiNERConfig(model_name="microsoft/deberta-v3-small")
>>> model = GLiNER.from_config(config)
>>> model = GLiNER.from_config("path/to/gliner_config.json")
- property model_map: dict[str, dict[str, Any]]¶
Map configuration patterns to their corresponding GLiNER classes.
- Returns:
Dictionary mapping model types to their classes and descriptions.
- class gliner.GLiNERConfig(labels_encoder=None, labels_decoder=None, relations_layer=None, **kwargs)[source]¶
Bases:
BaseGLiNERConfig

Legacy configuration class that auto-detects model type.
This class provides backward compatibility by automatically determining the appropriate model type based on the provided configuration parameters.
- labels_encoder¶
Name of the encoder for entity labels (bi-encoder).
- Type:
str
- labels_decoder¶
Name of the decoder for label generation.
- Type:
str
- relations_layer¶
Layer configuration for relation extraction.
- Type:
str
Initialize GLiNERConfig.
- Parameters:
labels_encoder (str, optional) – Labels encoder for bi-encoder models. Defaults to None.
labels_decoder (str, optional) – Decoder for label generation. Defaults to None.
relations_layer (str, optional) – Relations layer for relation extraction. Defaults to None.
**kwargs – Additional keyword arguments passed to BaseGLiNERConfig.
- __init__(labels_encoder=None, labels_decoder=None, relations_layer=None, **kwargs)[source]¶
Initialize GLiNERConfig.
- Parameters:
labels_encoder (str, optional) – Labels encoder for bi-encoder models. Defaults to None.
labels_decoder (str, optional) – Decoder for label generation. Defaults to None.
relations_layer (str, optional) – Relations layer for relation extraction. Defaults to None.
**kwargs – Additional keyword arguments passed to BaseGLiNERConfig.
- property model_type¶
Auto-detect model type based on configuration.
- class gliner.InferencePackingConfig(max_length, sep_token_id=None, streams_per_batch=1)[source]¶
Bases:
object

Configuration describing how sequences should be packed.
- max_length¶
Maximum number of tokens allowed in a packed stream.
- Type:
int
- sep_token_id¶
Optional separator token ID to insert between sequences. Currently not used in the implementation.
- Type:
int | None
- streams_per_batch¶
Number of streams to create per batch. Must be >= 1.
- Type:
int
- max_length: int¶
- sep_token_id: int | None = None¶
- streams_per_batch: int = 1¶
- __init__(max_length, sep_token_id=None, streams_per_batch=1)¶
- class gliner.PackedBatch(input_ids, attention_mask, pair_attention_mask, segment_ids, map_out, offsets, lengths)[source]¶
Bases:
object

Container describing a packed collection of requests.
- input_ids¶
Tensor of shape (num_streams, max_len) containing packed token IDs.
- Type:
torch.LongTensor
- attention_mask¶
Tensor of shape (num_streams, max_len) with 1s for valid tokens and 0s for padding.
- Type:
torch.LongTensor
- pair_attention_mask¶
Boolean tensor of shape (num_streams, max_len, max_len) representing block-diagonal attention mask.
- Type:
torch.BoolTensor
- segment_ids¶
Tensor of shape (num_streams, max_len) with unique IDs for each packed segment within a stream.
- Type:
torch.LongTensor
- map_out¶
List of lists mapping each segment in each stream back to its original request index.
- Type:
List[List[int]]
- offsets¶
List of lists containing the starting offset of each segment within each stream.
- Type:
List[List[int]]
- lengths¶
List of lists containing the length of each segment within each stream.
- Type:
List[List[int]]
- input_ids: LongTensor¶
- attention_mask: LongTensor¶
- pair_attention_mask: BoolTensor¶
- segment_ids: LongTensor¶
- map_out: List[List[int]]¶
- offsets: List[List[int]]¶
- lengths: List[List[int]]¶
- __init__(input_ids, attention_mask, pair_attention_mask, segment_ids, map_out, offsets, lengths)¶
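The pair_attention_mask relates to segment_ids in a direct way: two positions in a stream may attend to each other only when they carry the same segment ID, and padding attends to nothing. A minimal pure-Python sketch of this relationship, using nested lists in place of torch tensors and -1 as an assumed padding marker:

```python
# Build a block-diagonal attention mask from per-token segment IDs.
# Padding positions are marked with segment ID -1 and attend to nothing.
def block_diagonal_mask(segment_ids):
    n = len(segment_ids)
    return [
        [segment_ids[i] == segment_ids[j] and segment_ids[i] != -1
         for j in range(n)]
        for i in range(n)
    ]
```

For segment_ids = [0, 0, 1, 1, -1], positions 0-1 form one attention block and positions 2-3 another, while the padding position is fully masked.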
- gliner.pack_requests(requests, cfg, pad_token_id)[source]¶
Pack a collection of requests into one or more streams.
Groups multiple short sequences into contiguous token streams to reduce padding overhead. Each request’s tokens are placed into streams using a first-fit strategy. A block-diagonal attention mask ensures tokens from different requests cannot attend to each other.
- Parameters:
requests (List[Dict[str, Any]]) – List of request dictionaries. Each must contain an 'input_ids' key with a sequence of token IDs.
cfg (InferencePackingConfig) – Configuration specifying packing parameters (max_length, etc.).
pad_token_id (int) – Token ID to use for padding positions.
- Returns:
PackedBatch object containing packed tensors and metadata needed to unpack results back to original request ordering.
- Raises:
ValueError – If requests list is empty or configuration is invalid.
KeyError – If any request is missing the required 'input_ids' key.
- Return type:
PackedBatch
Example
>>> requests = [
...     {"input_ids": [1, 2, 3]},
...     {"input_ids": [4, 5]},
... ]
>>> cfg = InferencePackingConfig(max_length=10)
>>> batch = pack_requests(requests, cfg, pad_token_id=0)
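The first-fit strategy described above can be sketched without torch: each request goes into the first stream with enough remaining capacity, and a new stream is opened when none fits. This is a simplified illustration only; the real pack_requests additionally builds the attention tensors and honors streams_per_batch:

```python
# First-fit packing of variable-length requests into fixed-capacity streams.
def first_fit_pack(requests, max_length):
    streams = []   # token IDs per stream
    map_out = []   # original request index per segment, per stream
    offsets = []   # start offset of each segment within its stream
    for idx, req in enumerate(requests):
        ids = req["input_ids"]
        if len(ids) > max_length:
            raise ValueError("request longer than max_length")
        # First-fit: scan existing streams for one with room left.
        for s, stream in enumerate(streams):
            if len(stream) + len(ids) <= max_length:
                break
        else:
            # No stream fits; open a new one.
            streams.append([]); map_out.append([]); offsets.append([])
            s = len(streams) - 1
        offsets[s].append(len(streams[s]))
        streams[s].extend(ids)
        map_out[s].append(idx)
    return streams, map_out, offsets
```

With max_length=6 and requests of lengths 3, 2, and 4, the first two requests share a stream (3 + 2 = 5 tokens) and the third opens a second stream.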
- gliner.unpack_spans(per_token_outputs, packed)[source]¶
Unpack encoder outputs back to the original request layout.
Takes per-token outputs from a packed batch and redistributes them back to match the original request ordering. Handles requests that were split across multiple streams by concatenating their segments.
- Parameters:
per_token_outputs (Any) – Tensor or array of shape (num_streams, max_len, …) containing per-token outputs from the encoder.
packed (PackedBatch) – PackedBatch object containing metadata about how requests were packed (from pack_requests).
- Returns:
List of tensors or arrays (one per original request) containing the unpacked outputs. If input was a NumPy array, outputs will be NumPy arrays; if PyTorch tensor, outputs will be PyTorch tensors.
- Raises:
ValueError – If per_token_outputs is not at least 2-dimensional.
TypeError – If per_token_outputs is neither a PyTorch tensor nor NumPy array.
- Return type:
List[Any]
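Unpacking reverses the packing bookkeeping: per-token outputs are sliced out of each stream using offsets and lengths, and map_out routes each slice back to its original request. A pure-Python sketch over nested lists; the real unpack_spans operates on torch tensors or NumPy arrays, with the same concatenation rule for requests split across streams:

```python
# Route per-token outputs from packed streams back to original request order.
def unpack(per_token_outputs, map_out, offsets, lengths, num_requests):
    results = [[] for _ in range(num_requests)]
    for stream_out, reqs, offs, lens in zip(per_token_outputs, map_out, offsets, lengths):
        for req_idx, off, length in zip(reqs, offs, lens):
            # Segments of a request split across streams concatenate in order.
            results[req_idx].extend(stream_out[off:off + length])
    return results
```

Given the two streams from the packing example (requests of lengths 3, 2, and 2 packed as [0, 1] and [2]), each request recovers exactly its own slice of the per-token outputs.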
Subpackages¶
- gliner.data_processing package
- gliner.decoding package
- gliner.evaluation package
- gliner.modeling package
- gliner.modeling.multitask package
- gliner.modeling.base module
- gliner.modeling.decoder module
- gliner.modeling.encoder module
- gliner.modeling.layers module
- gliner.modeling.loss_functions module
- gliner.modeling.outputs module
- gliner.modeling.scorers module
- gliner.modeling.span_rep module
- gliner.modeling.utils module
- gliner.multitask package
- gliner.onnx package
- gliner.training package