gliner.model module

class gliner.model.BaseGLiNER(*args, **kwargs)[source]

Bases: ABC, Module, PyTorchModelHubMixin

Initialize a BaseGLiNER model.

Parameters:
  • config (BaseGLiNERConfig) – Model configuration object.

  • model (BaseModel | None) – Pre-initialized model instance. If None, creates a new model.

  • tokenizer (BaseModel | None) – Pre-initialized tokenizer. If None, creates a new tokenizer.

  • data_processor (BaseProcessor | None) – Pre-initialized data processor. If None, creates a new processor.

  • backbone_from_pretrained (bool | None) – Whether to load the backbone from pretrained weights.

  • cache_dir (str | Path | None) – Directory for caching downloaded models.

  • **kwargs – Additional keyword arguments passed to model creation.

config_class: type = None
model_class: type = None
ort_model_class: type = None
data_processor_class: type = None
data_collator_class: type = None
decoder_class: type = None
__init__(config, model=None, tokenizer=None, data_processor=None, backbone_from_pretrained=False, cache_dir=None, **kwargs)[source]

abstract resize_embeddings()[source]
abstract inference()[source]
abstract evaluate()[source]
forward(*args, **kwargs)[source]

Forward pass through the model.

Parameters:
  • *args – Positional arguments passed to the model.

  • **kwargs – Keyword arguments passed to the model.

Returns:

Model output from the forward pass.

property device

Get the device where the model is located.

Returns:

Torch device object (CPU or CUDA).

configure_inference_packing(config)[source]

Configure default packing behavior for inference calls.

Passing None disables packing by default. Individual inference methods accept a packing_config argument to override this setting on a per-call basis.

Parameters:

config (InferencePackingConfig | None) – Inference packing configuration or None to disable packing.

compile()[source]

Compile the model using torch.compile for optimization.

prepare_state_dict(state_dict)[source]

Prepare state dict for saving, handling torch.compile artifacts.

Parameters:

state_dict – Original state dictionary from the model.

Returns:

Cleaned state dictionary with torch.compile prefixes removed.
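As a rough illustration of the cleanup described above, the sketch below strips a compile prefix from state-dict keys. It assumes the prefix is `_orig_mod.`, which is what torch.compile-wrapped modules typically add; the helper name `strip_compile_prefix` is hypothetical and the library's own implementation may differ.

```python
from collections import OrderedDict

# Assumption: torch.compile wraps a module so that its state-dict keys
# gain this prefix; the library may handle other cases as well.
COMPILE_PREFIX = "_orig_mod."


def strip_compile_prefix(state_dict):
    """Return a copy of state_dict with the compile prefix removed from keys."""
    cleaned = OrderedDict()
    for key, value in state_dict.items():
        # Remove every occurrence so nested wrapped submodules are handled too.
        cleaned[key.replace(COMPILE_PREFIX, "")] = value
    return cleaned


# Plain integers stand in for tensors so the sketch runs without PyTorch.
compiled_keys = OrderedDict(
    [("_orig_mod.encoder.weight", 1), ("_orig_mod.encoder.bias", 2), ("head.weight", 3)]
)
cleaned = strip_compile_prefix(compiled_keys)
```

Keys without the prefix pass through unchanged, so the same helper can be applied whether or not the model was compiled.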

save_pretrained(save_directory, *, config=None, repo_id=None, push_to_hub=False, safe_serialization=False, **push_to_hub_kwargs)[source]

Save model weights and configuration to a local directory.

Parameters:
  • save_directory (str | Path) – Path to directory for saving.

  • config (BaseGLiNERConfig | None) – Model configuration. Uses self.config if None.

  • repo_id (str | None) – Repository ID for hub upload.

  • push_to_hub (bool) – Whether to push to HuggingFace Hub.

  • safe_serialization (bool) – Whether to use safetensors format.

  • **push_to_hub_kwargs – Additional arguments for push_to_hub.

Returns:

Repository URL if pushed to hub, None otherwise.

Return type:

str | None

classmethod load_from_config(config, cache_dir=None, load_tokenizer=True, resize_token_embeddings=True, backbone_from_pretrained=True, compile_torch_model=False, map_location='cpu', max_length=None, max_width=None, post_fusion_schema=None, _attn_implementation=None, **model_kwargs)[source]

Initialize a model from configuration without loading pretrained weights.

This method creates a new model instance from scratch using the provided configuration. The backbone encoder can optionally be loaded from pretrained weights, but the GLiNER-specific layers are always randomly initialized.

Parameters:
  • config (str | Path | GLiNERConfig | dict) – Model configuration (GLiNERConfig object, path to config file, or dict).

  • cache_dir (str | Path | None) – Cache directory for downloads.

  • load_tokenizer (bool) – Whether to load tokenizer.

  • resize_token_embeddings (bool) – Whether to resize token embeddings.

  • backbone_from_pretrained (bool) – Whether to load the backbone encoder from pretrained weights.

  • compile_torch_model (bool) – Whether to compile with torch.compile.

  • map_location (str) – Device to map model to.

  • max_length (int | None) – Override max_length in config.

  • max_width (int | None) – Override max_width in config.

  • post_fusion_schema (str | None) – Override post_fusion_schema in config.

  • _attn_implementation (str | None) – Override attention implementation.

  • **model_kwargs – Additional model initialization arguments.

Returns:

Initialized model instance with randomly initialized weights (except backbone if specified).

Examples

>>> config = GLiNERConfig(model_name="microsoft/deberta-v3-small")
>>> model = GLiNER.load_from_config(config)
>>> model = GLiNER.load_from_config("path/to/gliner_config.json")
>>> # Load with pretrained backbone but random GLiNER layers
>>> model = GLiNER.load_from_config(config, backbone_from_pretrained=True)
classmethod from_pretrained(model_id, model_dir=None, revision=None, cache_dir=None, force_download=False, proxies=None, resume_download=False, local_files_only=False, token=None, map_location='cpu', strict=False, load_tokenizer=None, resize_token_embeddings=True, compile_torch_model=False, load_onnx_model=False, onnx_model_file='model.onnx', session_options=None, max_length=None, max_width=None, post_fusion_schema=None, _attn_implementation=None, **model_kwargs)[source]

Load pretrained model from HuggingFace Hub or local directory.

Parameters:
  • model_id (str) – Model identifier or local path.

  • model_dir (str | None) – Override model directory path.

  • revision (str | None) – Model revision.

  • cache_dir (str | Path | None) – Cache directory.

  • force_download (bool) – Force redownload.

  • proxies (dict | None) – Proxy configuration.

  • resume_download (bool) – Resume interrupted downloads.

  • local_files_only (bool) – Only use local files.

  • token (str | bool | None) – HF token for private repos.

  • map_location (str) – Device to map model to.

  • strict (bool) – Enforce strict state_dict loading.

  • load_tokenizer (bool | None) – Whether to load tokenizer.

  • resize_token_embeddings (bool | None) – Whether to resize embeddings.

  • compile_torch_model (bool | None) – Whether to compile with torch.compile.

  • load_onnx_model (bool | None) – Whether to load ONNX model instead of PyTorch.

  • onnx_model_file (str | None) – Path to ONNX model file.

  • session_options – ONNX runtime session options.

  • max_length (int | None) – Override max_length in config.

  • max_width (int | None) – Override max_width in config.

  • post_fusion_schema (str | None) – Override post_fusion_schema in config.

  • _attn_implementation (str | None) – Override attention implementation.

  • **model_kwargs – Additional model initialization arguments.

Returns:

Loaded model instance.

export_to_onnx(save_dir, onnx_filename='model.onnx', quantized_filename='model_quantized.onnx', quantize=False, opset=19, **export_kwargs)[source]

Unified ONNX export method using specifications from child classes.

Parameters:
  • save_dir (str | Path) – Directory to save ONNX files.

  • onnx_filename (str) – Name of the ONNX model file.

  • quantized_filename (str) – Name of the quantized model file.

  • quantize (bool) – Whether to create a quantized version.

  • opset (int) – ONNX opset version.

  • **export_kwargs – Additional export arguments (model-specific).

Returns:

Dictionary with paths to exported models:

  • onnx_path: Path to standard ONNX model

  • quantized_path: Path to quantized model (if quantize=True)

freeze_component(component_name)[source]

Freeze a specific component of the model.

Parameters:

component_name (str) – Name of component to freeze (e.g., 'text_encoder', 'labels_encoder', 'decoder')

unfreeze_component(component_name)[source]

Unfreeze a specific component of the model.

Parameters:

component_name (str) – Name of component to unfreeze

classmethod create_training_args(output_dir, learning_rate=5e-05, weight_decay=0.01, others_lr=None, others_weight_decay=None, focal_loss_alpha=-1, focal_loss_gamma=0.0, focal_loss_prob_margin=0.0, loss_reduction='sum', negatives=1.0, masking='none', lr_scheduler_type='linear', warmup_ratio=0.1, per_device_train_batch_size=8, per_device_eval_batch_size=8, max_grad_norm=1.0, max_steps=10000, save_steps=1000, save_total_limit=10, logging_steps=10, use_cpu=False, bf16=True, dataloader_num_workers=1, report_to='none', **kwargs)[source]

Create training arguments with sensible defaults.

Parameters:
  • output_dir (str | Path) – Directory to save model checkpoints.

  • learning_rate (float) – Learning rate for main parameters.

  • weight_decay (float) – Weight decay for main parameters.

  • others_lr (float | None) – Learning rate for other parameters.

  • others_weight_decay (float | None) – Weight decay for other parameters.

  • focal_loss_alpha (float) – Alpha for focal loss.

  • focal_loss_gamma (float) – Gamma for focal loss.

  • focal_loss_prob_margin (float) – Probability margin for focal loss.

  • loss_reduction (str) – Loss reduction method.

  • negatives (float) – Negative sampling ratio.

  • masking (str) – Masking strategy.

  • lr_scheduler_type (str) – Learning rate scheduler type.

  • warmup_ratio (float) – Warmup ratio.

  • per_device_train_batch_size (int) – Training batch size.

  • per_device_eval_batch_size (int) – Evaluation batch size.

  • max_grad_norm (float) – Maximum gradient norm.

  • max_steps (int) – Maximum training steps.

  • save_steps (int) – Save checkpoint every N steps.

  • save_total_limit (int) – Maximum number of checkpoints to keep.

  • logging_steps (int) – Log every N steps.

  • use_cpu (bool) – Whether to use CPU.

  • bf16 (bool) – Whether to use bfloat16.

  • dataloader_num_workers (int) – Number of dataloader workers.

  • report_to (str) – Where to report metrics.

  • **kwargs – Additional training arguments.

Returns:

TrainingArguments instance.

Return type:

TrainingArguments

train_model(train_dataset, eval_dataset, training_args=None, freeze_components=None, compile_model=False, output_dir=None, **training_kwargs)[source]

Train the model.

Parameters:
  • train_dataset – Training dataset.

  • eval_dataset – Evaluation dataset.

  • training_args (TrainingArguments | None) – Training arguments (created with defaults if None).

  • freeze_components (list[str] | None) – List of component names to freeze (e.g., ['text_encoder', 'decoder']).

  • compile_model (bool) – Whether to compile model with torch.compile.

  • output_dir (str | Path | None) – Output directory (required if training_args is None).

  • **training_kwargs – Additional kwargs for creating training args.

Returns:

Trained Trainer instance.

Return type:

Trainer

class gliner.model.BaseEncoderGLiNER(*args, **kwargs)[source]

Bases: BaseGLiNER

Initialize a BaseGLiNER model.

Parameters:
  • config – Model configuration object.

  • model – Pre-initialized model instance. If None, creates a new model.

  • tokenizer – Pre-initialized tokenizer. If None, creates a new tokenizer.

  • data_processor – Pre-initialized data processor. If None, creates a new processor.

  • backbone_from_pretrained – Whether to load the backbone from pretrained weights.

  • cache_dir – Directory for caching downloaded models.

  • **kwargs – Additional keyword arguments passed to model creation.

set_class_indices()[source]

Set the class token index in the configuration based on tokenizer vocabulary.

resize_embeddings(set_class_token_index=True)[source]

Resize token embeddings to match tokenizer vocabulary size.

Parameters:

set_class_token_index – Whether to update the class token index.

prepare_inputs(texts)[source]

Prepare inputs for the model by tokenizing and creating index mappings.

Parameters:

texts (List[str]) – The input texts to process.

Returns:

Tuple containing:

  • all_tokens: List of tokenized texts

  • all_start_token_idx_to_text_idx: Start position mappings

  • all_end_token_idx_to_text_idx: End position mappings

prepare_base_input(all_tokens)[source]

Prepare base input format for data collation.

Parameters:

all_tokens (List[List[str]]) – List of tokenized texts.

Returns:

List of input dictionaries ready for collation.

Return type:

List[Dict[str, Any]]

inference(texts, labels, flat_ner=True, threshold=0.5, multi_label=False, batch_size=8, packing_config=None, **external_inputs)[source]

Predict entities for a batch of texts.

Parameters:
  • texts (str | List[str]) – A single text string or a list of input texts to predict entities for.

  • labels (List[str]) – A list of labels to predict.

  • flat_ner (bool) – Whether to use flat NER. Defaults to True.

  • threshold (float) – Confidence threshold for predictions. Defaults to 0.5.

  • multi_label (bool) – Whether to allow multiple labels per token. Defaults to False.

  • batch_size (int) – Batch size for processing. Defaults to 8.

  • packing_config (InferencePackingConfig | None) – Configuration describing how to pack encoder inputs. When None the instance-level configuration set via configure_inference_packing is used.

  • **external_inputs – Additional inputs to pass to the model.

Returns:

List of lists with predicted entities, where each entity is a dictionary containing:

  • start: Start character position

  • end: End character position

  • text: Entity text

  • label: Entity type

  • score: Confidence score
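To make the flat_ner option more concrete, here is a minimal sketch of one common way to obtain non-overlapping ("flat") entities: keep the highest-scoring spans greedily and drop any candidate that overlaps one already kept. This is an illustration only, not necessarily how the library's decoder works. The function name `greedy_flat_spans`, the sample scores, and the assumption that `end` is exclusive are all hypothetical.

```python
def greedy_flat_spans(entities):
    """Keep the highest-scoring spans that do not overlap one another."""
    selected = []
    # Visit candidates from most to least confident.
    for candidate in sorted(entities, key=lambda e: e["score"], reverse=True):
        overlaps = any(
            candidate["start"] < kept["end"] and kept["start"] < candidate["end"]
            for kept in selected
        )
        if not overlaps:
            selected.append(candidate)
    # Return the survivors in reading order.
    return sorted(selected, key=lambda e: e["start"])


text = "New York City is in the USA"
# Illustrative candidates (made-up scores) using the documented entity keys;
# "end" is treated as exclusive here, which is an assumption.
candidates = [
    {"start": 0, "end": 8, "text": "New York", "label": "city", "score": 0.91},
    {"start": 0, "end": 13, "text": "New York City", "label": "city", "score": 0.84},
    {"start": 24, "end": 27, "text": "USA", "label": "country", "score": 0.77},
]
flat = greedy_flat_spans(candidates)
```

With nested output (flat_ner=False), both "New York" and "New York City" could be returned; the greedy pass keeps only the more confident of the two.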

predict_entities(text, labels, flat_ner=True, threshold=0.5, multi_label=False, **kwargs)[source]

Predict entities for a single text input.

Parameters:
  • text (str) – The input text to predict entities for.

  • labels (List[str]) – The labels to predict.

  • flat_ner (bool) – Whether to use flat NER. Defaults to True.

  • threshold (float) – Confidence threshold for predictions. Defaults to 0.5.

  • multi_label (bool) – Whether to allow multiple labels per entity. Defaults to False.

  • **kwargs – Additional arguments passed to inference.

Returns:

List of entity predictions as dictionaries.

Return type:

List[Dict[str, Any]]
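The returned list of dictionaries is easy to post-process. The sketch below groups entity texts by label and applies an extra score cut-off; the helper `group_by_label` and the sample data are hypothetical, and the sample is hand-written in the documented shape (start, end, text, label, score) rather than real model output.

```python
from collections import defaultdict


def group_by_label(entities, min_score=0.0):
    """Map each label to the entity texts found for it, in input order."""
    grouped = defaultdict(list)
    for entity in entities:
        if entity["score"] >= min_score:
            grouped[entity["label"]].append(entity["text"])
    return dict(grouped)


# Hand-written sample in the documented output shape; not real model output.
sample = [
    {"start": 0, "end": 12, "text": "Ada Lovelace", "label": "person", "score": 0.95},
    {"start": 23, "end": 38, "text": "Charles Babbage", "label": "person", "score": 0.88},
    {"start": 42, "end": 48, "text": "London", "label": "location", "score": 0.55},
]
by_label = group_by_label(sample, min_score=0.6)
```

Filtering after the call is useful when one low threshold is passed to the model and stricter, per-application cut-offs are applied later.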

batch_predict_entities(texts, labels, flat_ner=True, threshold=0.5, multi_label=False, **kwargs)[source]

Predict entities for multiple texts.

DEPRECATED: Use inference instead.

This method will be removed in a future release. It now forwards to GLiNER.inference(…).

Parameters:
  • texts (List[str]) – Input texts.

  • labels (List[str]) – Labels to predict.

  • flat_ner (bool) – Use flat NER. Defaults to True.

  • threshold (float) – Confidence threshold. Defaults to 0.5.

  • multi_label (bool) – Allow multiple labels per token/entity. Defaults to False.

  • **kwargs – Extra arguments forwarded to inference (e.g., batch_size).

Returns:

List of entity predictions for each text.

Return type:

List[List[Dict[str, Any]]]

evaluate(test_data, flat_ner=False, multi_label=False, threshold=0.5, batch_size=12)[source]

Evaluate the model on a given test dataset.

Parameters:
  • test_data (List[Dict[str, Any]]) – The test data containing text and entity annotations.

  • flat_ner (bool) – Whether to use flat NER. Defaults to False.

  • multi_label (bool) – Whether to use multi-label classification. Defaults to False.

  • threshold (float) – The threshold for predictions. Defaults to 0.5.

  • batch_size (int) – The batch size for evaluation. Defaults to 12.

Returns:

Tuple containing the evaluation output and the F1 score.

Return type:

Tuple[Any, float]

class gliner.model.BaseBiEncoderGLiNER(*args, **kwargs)[source]

Bases: BaseEncoderGLiNER

Initialize a BaseGLiNER model.

Parameters:
  • config – Model configuration object.

  • model – Pre-initialized model instance. If None, creates a new model.

  • tokenizer – Pre-initialized tokenizer. If None, creates a new tokenizer.

  • data_processor – Pre-initialized data processor. If None, creates a new processor.

  • backbone_from_pretrained – Whether to load the backbone from pretrained weights.

  • cache_dir – Directory for caching downloaded models.

  • **kwargs – Additional keyword arguments passed to model creation.

resize_embeddings(**kwargs)[source]

Resize token embeddings to match tokenizer vocabulary size.

Parameters:

set_class_token_index – Whether to update the class token index.

encode_labels(labels, batch_size=8)[source]

Compute embeddings for labels using the label encoder.

Parameters:
  • labels (List[str]) – A list of labels to encode.

  • batch_size (int) – Batch size for processing labels.

Returns:

Tensor containing label embeddings with shape (num_labels, hidden_size).

Raises:

NotImplementedError – If the model doesn’t have a label encoder.

Return type:

FloatTensor

batch_predict_with_embeds(texts, labels_embeddings, labels, flat_ner=True, threshold=0.5, multi_label=False, batch_size=8, packing_config=None)[source]

Predict entities for a batch of texts using pre-computed label embeddings.

Parameters:
  • texts (List[str]) – A list of input texts to predict entities for.

  • labels_embeddings (Tensor) – Pre-computed embeddings for the labels.

  • labels (List[str]) – List of label strings corresponding to the embeddings.

  • flat_ner (bool) – Whether to use flat NER. Defaults to True.

  • threshold (float) – Confidence threshold for predictions. Defaults to 0.5.

  • multi_label (bool) – Whether to allow multiple labels per token. Defaults to False.

  • batch_size (int) – Batch size for processing. Defaults to 8.

  • packing_config (InferencePackingConfig | None) – Configuration describing how to pack encoder inputs. When None the instance-level configuration set via configure_inference_packing is used.

Returns:

List of lists with predicted entities.

Return type:

List[List[Dict[str, Any]]]

predict_with_embeds(text, labels_embeddings, labels, flat_ner=True, threshold=0.5, multi_label=False, **kwargs)[source]

Predict entities for a single text input using pre-computed label embeddings.

Parameters:
  • text – The input text to predict entities for.

  • labels_embeddings – Pre-computed embeddings for the labels.

  • labels – List of label strings corresponding to the embeddings.

  • flat_ner – Whether to use flat NER. Defaults to True.

  • threshold – Confidence threshold for predictions. Defaults to 0.5.

  • multi_label – Whether to allow multiple labels per entity. Defaults to False.

  • **kwargs – Additional arguments passed to batch_predict_with_embeds.

Returns:

List of entity predictions.
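A typical reason to use the pre-computed embedding methods is to encode a fixed label set once and reuse it across many calls. The sketch below shows that pattern using only the documented signatures of encode_labels and batch_predict_with_embeds. The `LabelEmbeddingCache` wrapper and the `_StubBiEncoder` stand-in are hypothetical; the stub exists only so the example runs without model weights.

```python
class LabelEmbeddingCache:
    """Encode a label set once and reuse the embeddings across calls."""

    def __init__(self, model, labels, batch_size=8):
        self.model = model
        self.labels = list(labels)
        # encode_labels(labels, batch_size) follows the documented signature.
        self.embeddings = model.encode_labels(self.labels, batch_size=batch_size)

    def predict(self, texts, threshold=0.5, **kwargs):
        # Extra kwargs should be limited to documented options such as
        # flat_ner, multi_label, batch_size, or packing_config.
        return self.model.batch_predict_with_embeds(
            texts, self.embeddings, self.labels, threshold=threshold, **kwargs
        )


class _StubBiEncoder:
    """Hypothetical stand-in so the sketch runs without model weights."""

    def __init__(self):
        self.encode_calls = 0

    def encode_labels(self, labels, batch_size=8):
        self.encode_calls += 1
        return [[float(len(label))] for label in labels]

    def batch_predict_with_embeds(self, texts, labels_embeddings, labels, **kwargs):
        # A real model would return entity dictionaries; the stub returns none.
        return [[] for _ in texts]


stub = _StubBiEncoder()
cache = LabelEmbeddingCache(stub, ["person", "organization"])
first = cache.predict(["Ada Lovelace wrote notes."])
second = cache.predict(["Alan Turing", "Grace Hopper"])
```

With a real bi-encoder model in place of the stub, the labels are encoded once in the constructor and every later predict call skips that step.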

class gliner.model.UniEncoderSpanGLiNER(*args, **kwargs)[source]

Bases: BaseEncoderGLiNER

Initialize a BaseGLiNER model.

Parameters:
  • config – Model configuration object.

  • model – Pre-initialized model instance. If None, creates a new model.

  • tokenizer – Pre-initialized tokenizer. If None, creates a new tokenizer.

  • data_processor – Pre-initialized data processor. If None, creates a new processor.

  • backbone_from_pretrained – Whether to load the backbone from pretrained weights.

  • cache_dir – Directory for caching downloaded models.

  • **kwargs – Additional keyword arguments passed to model creation.

config_class

alias of UniEncoderSpanConfig

model_class

alias of UniEncoderSpanModel

ort_model_class

alias of UniEncoderSpanORTModel

data_processor_class

alias of UniEncoderSpanProcessor

data_collator_class

alias of UniEncoderSpanDataCollator

decoder_class

alias of SpanDecoder

class gliner.model.UniEncoderTokenGLiNER(*args, **kwargs)[source]

Bases: BaseEncoderGLiNER

Initialize a BaseGLiNER model.

Parameters:
  • config – Model configuration object.

  • model – Pre-initialized model instance. If None, creates a new model.

  • tokenizer – Pre-initialized tokenizer. If None, creates a new tokenizer.

  • data_processor – Pre-initialized data processor. If None, creates a new processor.

  • backbone_from_pretrained – Whether to load the backbone from pretrained weights.

  • cache_dir – Directory for caching downloaded models.

  • **kwargs – Additional keyword arguments passed to model creation.

config_class

alias of UniEncoderTokenConfig

model_class

alias of UniEncoderTokenModel

ort_model_class

alias of UniEncoderTokenORTModel

data_processor_class

alias of UniEncoderTokenProcessor

data_collator_class

alias of UniEncoderTokenDataCollator

decoder_class

alias of TokenDecoder

class gliner.model.BiEncoderSpanGLiNER(*args, **kwargs)[source]

Bases: BaseBiEncoderGLiNER

Initialize a BaseGLiNER model.

Parameters:
  • config – Model configuration object.

  • model – Pre-initialized model instance. If None, creates a new model.

  • tokenizer – Pre-initialized tokenizer. If None, creates a new tokenizer.

  • data_processor – Pre-initialized data processor. If None, creates a new processor.

  • backbone_from_pretrained – Whether to load the backbone from pretrained weights.

  • cache_dir – Directory for caching downloaded models.

  • **kwargs – Additional keyword arguments passed to model creation.

config_class

alias of BiEncoderSpanConfig

model_class

alias of BiEncoderSpanModel

ort_model_class

alias of BiEncoderSpanORTModel

data_processor_class

alias of BiEncoderSpanProcessor

data_collator_class

alias of BiEncoderSpanDataCollator

decoder_class

alias of SpanDecoder

class gliner.model.BiEncoderTokenGLiNER(*args, **kwargs)[source]

Bases: BaseBiEncoderGLiNER

Initialize a BaseGLiNER model.

Parameters:
  • config – Model configuration object.

  • model – Pre-initialized model instance. If None, creates a new model.

  • tokenizer – Pre-initialized tokenizer. If None, creates a new tokenizer.

  • data_processor – Pre-initialized data processor. If None, creates a new processor.

  • backbone_from_pretrained – Whether to load the backbone from pretrained weights.

  • cache_dir – Directory for caching downloaded models.

  • **kwargs – Additional keyword arguments passed to model creation.

config_class

alias of BiEncoderTokenConfig

model_class

alias of BiEncoderTokenModel

ort_model_class

alias of BiEncoderTokenORTModel

data_processor_class

alias of BiEncoderTokenProcessor

data_collator_class

alias of BiEncoderTokenDataCollator

decoder_class

alias of TokenDecoder

class gliner.model.UniEncoderSpanDecoderGLiNER(*args, **kwargs)[source]

Bases: BaseEncoderGLiNER

GLiNER model with span-based encoding and label decoding capabilities.

Supports generating textual labels for entities.

Initialize a BaseGLiNER model.

Parameters:
  • config – Model configuration object.

  • model – Pre-initialized model instance. If None, creates a new model.

  • tokenizer – Pre-initialized tokenizer. If None, creates a new tokenizer.

  • data_processor – Pre-initialized data processor. If None, creates a new processor.

  • backbone_from_pretrained – Whether to load the backbone from pretrained weights.

  • cache_dir – Directory for caching downloaded models.

  • **kwargs – Additional keyword arguments passed to model creation.

config_class

alias of UniEncoderSpanDecoderConfig

model_class

alias of UniEncoderSpanDecoderModel

ort_model_class: type = None
data_processor_class

alias of UniEncoderSpanDecoderProcessor

data_collator_class

alias of UniEncoderSpanDecoderDataCollator

decoder_class

alias of SpanGenerativeDecoder

set_labels_trie(labels)[source]

Initialize the labels trie for constrained generation.

Parameters:

labels (List[str]) – Labels that will be used for constrained generation.

Returns:

Trie structure for constrained beam search.

Raises:

NotImplementedError – If the model doesn’t have a decoder.
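For readers unfamiliar with trie-constrained generation, the sketch below shows the general idea: store each allowed label as a token path, then at every decoding step permit only tokens that continue some stored path. It is a conceptual illustration, not the library's trie. The functions `build_trie` and `allowed_next` are hypothetical, and whitespace splitting stands in for the model tokenizer.

```python
def build_trie(token_sequences):
    """Build a nested-dict trie; the key None marks the end of a sequence."""
    root = {}
    for tokens in token_sequences:
        node = root
        for token in tokens:
            node = node.setdefault(token, {})
        node[None] = {}
    return root


def allowed_next(trie, prefix):
    """Return the tokens that may follow prefix, or [] if prefix is invalid."""
    node = trie
    for token in prefix:
        if token not in node:
            return []
        node = node[token]
    return sorted(t for t in node if t is not None)


# Whitespace splitting stands in for the model tokenizer (an assumption).
labels = ["person", "political party", "political leader"]
trie = build_trie(label.split() for label in labels)
first_step = allowed_next(trie, [])
second_step = allowed_next(trie, ["political"])
```

During beam search, a lookup like `allowed_next` would be called with the tokens generated so far, so the decoder can only emit strings from the supplied label set.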

generate_labels(model_output, **gen_kwargs)[source]

Generate textual class labels for each entity span.

Parameters:
  • model_output – Model output containing decoder_embedding and decoder_embedding_mask.

  • **gen_kwargs – Generation parameters (max_new_tokens, temperature, etc.).

Returns:

List of generated label strings.

inference(texts, labels, flat_ner=True, threshold=0.5, multi_label=False, batch_size=8, gen_constraints=None, num_gen_sequences=1, packing_config=None, **gen_kwargs)[source]

Predict entities with optional label generation.

Parameters:
  • texts (str | List[str]) – Input texts (string or list of strings).

  • labels (List[str]) – Entity type labels.

  • flat_ner (bool) – Whether to use flat NER.

  • threshold (float) – Confidence threshold.

  • multi_label (bool) – Allow multiple labels per span.

  • batch_size (int) – Batch size for processing.

  • gen_constraints (List[str] | None) – Labels to constrain generation.

  • num_gen_sequences (int) – Number of label sequences to generate per span.

  • packing_config (InferencePackingConfig | None) – Inference packing configuration.

  • **gen_kwargs – Additional generation parameters.

Returns:

List of entity predictions with optional generated labels.

Return type:

List[List[Dict[str, Any]]]

export_to_onnx(save_dir, onnx_filename='model.onnx', quantized_filename='model_quantized.onnx', quantize=False, opset=19)[source]

ONNX export not supported for encoder-decoder models.

Raises:

NotImplementedError – Always raised as this model type cannot be exported to ONNX

class gliner.model.UniEncoderSpanRelexGLiNER(*args, **kwargs)[source]

Bases: BaseEncoderGLiNER

GLiNER model for both entity recognition and relation extraction.

Performs joint entity and relation prediction, allowing the model to simultaneously detect entities and the relationships between them in a single forward pass.

Initialize a BaseGLiNER model.

Parameters:
  • config – Model configuration object.

  • model – Pre-initialized model instance. If None, creates a new model.

  • tokenizer – Pre-initialized tokenizer. If None, creates a new tokenizer.

  • data_processor – Pre-initialized data processor. If None, creates a new processor.

  • backbone_from_pretrained – Whether to load the backbone from pretrained weights.

  • cache_dir – Directory for caching downloaded models.

  • **kwargs – Additional keyword arguments passed to model creation.

config_class

alias of UniEncoderSpanRelexConfig

model_class

alias of UniEncoderSpanRelexModel

ort_model_class

alias of UniEncoderSpanRelexORTModel

data_processor_class

alias of RelationExtractionSpanProcessor

data_collator_class

alias of RelationExtractionSpanDataCollator

decoder_class

alias of SpanRelexDecoder

set_class_indices()[source]

Set the class token indices for entities and relations in the configuration.

inference(texts, labels, relations, flat_ner=True, threshold=0.5, adjacency_threshold=None, relation_threshold=None, multi_label=False, batch_size=8, packing_config=None, return_relations=True)[source]

Predict entities and relations.

Parameters:
  • texts (str | List[str]) – A single input text or a list of input texts.

  • labels (List[str]) – Entity type labels.

  • relations (List[str]) – Relation type labels.

  • flat_ner (bool) – Whether to use flat NER (no nested entities).

  • threshold (float) – Confidence threshold for entities.

  • adjacency_threshold (float | None) – Confidence threshold for adjacency matrix reconstruction (defaults to threshold).

  • relation_threshold (float | None) – Confidence threshold for relations (defaults to threshold).

  • multi_label (bool) – Allow multiple labels per span.

  • batch_size (int) – Batch size for processing.

  • packing_config (InferencePackingConfig | None) – Inference packing configuration.

  • return_relations (bool) – Whether to return relation predictions.

Returns:

Tuple of (entities, relations) if return_relations=True, else just entities.

Return type:

List[List[Dict[str, Any]]] | Tuple[List[List[Dict[str, Any]]], List[List[Dict[str, Any]]]]
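The fallback rule for the two optional thresholds can be written out in a few lines. This sketch only restates the documented defaults (adjacency_threshold and relation_threshold fall back to threshold when None); the helper name `resolve_thresholds` is hypothetical.

```python
def resolve_thresholds(threshold=0.5, adjacency_threshold=None, relation_threshold=None):
    """Fill unset relation-side thresholds from the entity threshold."""
    if adjacency_threshold is None:
        adjacency_threshold = threshold
    if relation_threshold is None:
        relation_threshold = threshold
    return threshold, adjacency_threshold, relation_threshold


all_default = resolve_thresholds(0.4)
stricter_relations = resolve_thresholds(0.5, relation_threshold=0.8)
```

Setting only relation_threshold, as in the second call, keeps entity and adjacency filtering at the shared value while making relation predictions stricter.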

evaluate(test_data, flat_ner=False, multi_label=False, threshold=0.5, adjacency_threshold=None, relation_threshold=None, batch_size=12)[source]

Evaluate the model on both NER and relation extraction tasks.

Parameters:
  • test_data (List[Dict[str, Any]]) – The test data containing text, entity, and relation annotations.

  • flat_ner (bool) – Whether to use flat NER. Defaults to False.

  • multi_label (bool) – Whether to use multi-label classification. Defaults to False.

  • threshold (float) – The threshold for entity predictions. Defaults to 0.5.

  • adjacency_threshold (float | None) – Threshold for adjacency matrix reconstruction. Defaults to threshold.

  • relation_threshold (float | None) – The threshold for relation predictions. Defaults to threshold.

  • batch_size (int) – The batch size for evaluation. Defaults to 12.

Returns:

Tuple of ((ner_output, ner_f1), (rel_output, rel_f1)) containing:

  • ner_output: Formatted string with NER P, R, F1

  • ner_f1: NER F1 score

  • rel_output: Formatted string with relation extraction P, R, F1

  • rel_f1: Relation extraction F1 score

class gliner.model.GLiNER(*args, **kwargs)[source]

Bases: Module, PyTorchModelHubMixin

Meta GLiNER class that automatically instantiates the appropriate GLiNER variant.

This class provides a unified interface for all GLiNER models, automatically switching to specialized model types based on the model configuration. It supports various NER architectures including uni-encoder, bi-encoder, decoder-based, and relation extraction models.

The class automatically detects the model type based on:
  • span_mode: Token-level vs span-level

  • labels_encoder: Uni-encoder vs bi-encoder

  • labels_decoder: Standard vs decoder-based

  • relations_layer: NER-only vs joint entity-relation extraction
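The four fields above suggest a simple dispatch. The sketch below is one plausible reading of it, returning the name of a variant class documented in this module. The order of the checks, the treatment of unset fields as falsy, and the `"token_level"` value for span_mode are assumptions; the actual logic lives in the library's model_map and may differ.

```python
def pick_variant(config):
    """Choose a variant class name from four configuration fields."""
    # Assumption: span_mode == "token_level" selects token-level models and
    # any other value selects span-level models.
    token_level = config.get("span_mode") == "token_level"
    has_labels_encoder = bool(config.get("labels_encoder"))
    has_labels_decoder = bool(config.get("labels_decoder"))
    has_relations = bool(config.get("relations_layer"))

    if has_labels_decoder:
        return "UniEncoderSpanDecoderGLiNER"
    if has_relations:
        return "UniEncoderSpanRelexGLiNER"
    if has_labels_encoder:
        return "BiEncoderTokenGLiNER" if token_level else "BiEncoderSpanGLiNER"
    return "UniEncoderTokenGLiNER" if token_level else "UniEncoderSpanGLiNER"


# Placeholder configuration values, for illustration only.
span_uni = pick_variant({"span_mode": "markerV0"})
token_bi = pick_variant({"span_mode": "token_level", "labels_encoder": "some/encoder"})
relex = pick_variant({"span_mode": "markerV0", "relations_layer": "some-layer"})
```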

model

The loaded GLiNER model instance (automatically typed).

config

Model configuration.

data_processor

Data processor for the model.

decoder

Decoder for predictions.

Examples

Load a pretrained uni-encoder span model:

>>> model = GLiNER.from_pretrained("urchade/gliner_small-v2.1")

Load a bi-encoder model:

>>> model = GLiNER.from_pretrained("knowledgator/gliner-bi-small-v1.0")

Load from local configuration:

>>> config = GLiNERConfig.from_pretrained("config.json")
>>> model = GLiNER.from_config(config)

Initialize from scratch:

>>> config = GLiNERConfig(model_name="microsoft/deberta-v3-small")
>>> model = GLiNER(config)

Initialize a GLiNER model with automatic type detection.

This constructor determines the appropriate GLiNER variant based on the configuration and replaces itself with an instance of that variant.

Parameters:
  • config (str | Path | GLiNERConfig) – Model configuration (GLiNERConfig object, path to config file, or dict).

  • **kwargs – Additional arguments passed to the specific GLiNER variant.

Examples

>>> config = GLiNERConfig(model_name="bert-base-cased")
>>> model = GLiNER(config)
>>> model = GLiNER("path/to/gliner_config.json")
__init__(config, **kwargs)[source]

classmethod from_pretrained(model_id, revision=None, cache_dir=None, force_download=False, proxies=None, resume_download=False, local_files_only=False, token=None, map_location='cpu', strict=False, load_tokenizer=None, resize_token_embeddings=True, compile_torch_model=False, load_onnx_model=False, onnx_model_file='model.onnx', max_length=None, max_width=None, post_fusion_schema=None, _attn_implementation=None, **model_kwargs)[source]

Load a pretrained GLiNER model with automatic type detection.

This method loads the configuration, determines the appropriate GLiNER variant, and delegates to that variant’s from_pretrained method.

Parameters:
  • model_id (str) – Model identifier or local path.

  • revision (str | None) – Model revision.

  • cache_dir (str | Path | None) – Cache directory.

  • force_download (bool) – Force redownload.

  • proxies (dict | None) – Proxy configuration.

  • resume_download (bool) – Resume interrupted downloads.

  • local_files_only (bool) – Only use local files.

  • token (str | bool | None) – HF token for private repos.

  • map_location (str) – Device to map model to.

  • strict (bool) – Enforce strict state_dict loading.

  • load_tokenizer (bool | None) – Whether to load tokenizer.

  • resize_token_embeddings (bool | None) – Whether to resize embeddings.

  • compile_torch_model (bool | None) – Whether to compile with torch.compile.

  • load_onnx_model (bool | None) – Whether to load ONNX model instead of PyTorch.

  • onnx_model_file (str | None) – Path to ONNX model file.

  • max_length (int | None) – Override max_length in config.

  • max_width (int | None) – Override max_width in config.

  • post_fusion_schema (str | None) – Override post_fusion_schema in config.

  • _attn_implementation (str | None) – Override attention implementation.

  • **model_kwargs – Additional model initialization arguments.

Returns:

Appropriate GLiNER model instance.

Examples

>>> model = GLiNER.from_pretrained("urchade/gliner_small-v2.1")
>>> model = GLiNER.from_pretrained("knowledgator/gliner-bi-small-v1.0")
>>> model = GLiNER.from_pretrained("path/to/local/model")
classmethod from_config(config, cache_dir=None, load_tokenizer=True, resize_token_embeddings=True, backbone_from_pretrained=True, compile_torch_model=False, map_location='cpu', max_length=None, max_width=None, post_fusion_schema=None, _attn_implementation=None, **model_kwargs)[source]

Create a GLiNER model from configuration.

Parameters:
  • config (GLiNERConfig | str | Path | dict) – Model configuration (GLiNERConfig object, path to config file, or dict).

  • cache_dir (str | Path | None) – Cache directory for downloads.

  • load_tokenizer (bool) – Whether to load tokenizer.

  • resize_token_embeddings (bool) – Whether to resize token embeddings.

  • backbone_from_pretrained (bool) – Whether to load the backbone encoder from pretrained weights.

  • compile_torch_model (bool) – Whether to compile with torch.compile.

  • map_location (str) – Device to map model to.

  • max_length (int | None) – Override max_length in config.

  • max_width (int | None) – Override max_width in config.

  • post_fusion_schema (str | None) – Override post_fusion_schema in config.

  • _attn_implementation (str | None) – Override attention implementation.

  • **model_kwargs – Additional model initialization arguments.

Returns:

Initialized GLiNER model instance.

Examples

>>> config = GLiNERConfig(model_name="microsoft/deberta-v3-small")
>>> model = GLiNER.from_config(config)
>>> model = GLiNER.from_config("path/to/gliner_config.json")
property model_map: dict[str, dict[str, Any]]

Map configuration patterns to their corresponding GLiNER classes.

Returns:

Dictionary mapping model types to their classes and descriptions.

get_model_type()[source]

Get the type of the current model instance.

Returns:

String identifier of the model type

Return type:

str