gliner.model module
- class gliner.model.BaseGLiNER(*args, **kwargs)[source]
Bases: ABC, Module, PyTorchModelHubMixin
Initialize a BaseGLiNER model.
- Parameters:
config (BaseGLiNERConfig) – Model configuration object.
model (BaseModel | None) – Pre-initialized model instance. If None, creates a new model.
tokenizer (BaseModel | None) – Pre-initialized tokenizer. If None, creates a new tokenizer.
data_processor (BaseProcessor | None) – Pre-initialized data processor. If None, creates a new processor.
backbone_from_pretrained (bool | None) – Whether to load the backbone from pretrained weights.
cache_dir (str | Path | None) – Directory for caching downloaded models.
**kwargs – Additional keyword arguments passed to model creation.
- config_class: type = None
- model_class: type = None
- ort_model_class: type = None
- data_processor_class: type = None
- data_collator_class: type = None
- decoder_class: type = None
- __init__(config, model=None, tokenizer=None, data_processor=None, backbone_from_pretrained=False, cache_dir=None, **kwargs)[source]
Initialize a BaseGLiNER model.
- Parameters:
config (BaseGLiNERConfig) – Model configuration object.
model (BaseModel | None) – Pre-initialized model instance. If None, creates a new model.
tokenizer (BaseModel | None) – Pre-initialized tokenizer. If None, creates a new tokenizer.
data_processor (BaseProcessor | None) – Pre-initialized data processor. If None, creates a new processor.
backbone_from_pretrained (bool | None) – Whether to load the backbone from pretrained weights.
cache_dir (str | Path | None) – Directory for caching downloaded models.
**kwargs – Additional keyword arguments passed to model creation.
- forward(*args, **kwargs)[source]
Forward pass through the model.
- Parameters:
*args – Positional arguments passed to the model.
**kwargs – Keyword arguments passed to the model.
- Returns:
Model output from the forward pass.
- property device
Get the device where the model is located.
- Returns:
Torch device object (CPU or CUDA).
- configure_inference_packing(config)[source]
Configure default packing behavior for inference calls.
Passing None disables packing by default. Individual inference methods accept a packing_config argument to override this setting on a per-call basis.
- Parameters:
config (InferencePackingConfig | None) – Inference packing configuration, or None to disable packing.
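Toggling the default can be sketched as follows (a minimal sketch; the import path of InferencePackingConfig is an assumption and may differ in your installed version):

```python
def configure_packing(model, enable: bool) -> None:
    # Toggle the instance-level default.  A packing_config passed to an
    # individual inference call still overrides whatever is set here.
    if enable:
        # NOTE: the import path of InferencePackingConfig is an assumption.
        from gliner.config import InferencePackingConfig
        model.configure_inference_packing(InferencePackingConfig())
    else:
        model.configure_inference_packing(None)  # disable packing by default
```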
- compile()[source]
Compile the model using torch.compile for optimization.
Uses dynamic=True to generate shape-generic kernels, which avoids recompilation on variable-length NER inputs. Also enables capture_scalar_outputs to trace through data-dependent shape operations (e.g., computing the max number of entity types per batch).
Best combined with quantize() for maximum throughput (~1.9x over fp32).
When FlashDeBERTa is active, its custom Triton kernels are incompatible with torch.compile tracing. The encoder forward is automatically wrapped with torch.compiler.disable so the rest of the model (span representation, scoring, etc.) still benefits from compilation.
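The combination recommended above can be sketched as follows (the relative call order of quantize() and compile() is an assumption; check your version's docs):

```python
def optimize_for_inference(model):
    # Per the notes above: int8 quantization plus torch.compile together
    # yield roughly 1.9x throughput over fp32.
    model.quantize("int8")  # torchao on GPU, FBGEMM on CPU
    model.compile()         # shape-generic kernels; no recompiles on new lengths
    return model
```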
- quantize(dtype='int8')[source]
Apply int8 quantization to the model.
Only "int8" is accepted; for precision changes (fp16/bf16), use dtype= on GLiNER.from_pretrained() or model.to(torch_dtype) – those are downcasts, not quantization, and were removed from this API.
- Parameters:
dtype (str) – Must be "int8". On CPU, uses PyTorch's built-in dynamic quantization with FBGEMM int8 kernels (~1.6x speedup). On GPU, uses torchao int8 weight-only quantization (~50% memory reduction, no speed gain; requires the torchao package). Stock DeBERTa-based models lose accuracy with int8; use this with models fine-tuned with quantization-aware training (QAT).
- Raises:
RuntimeError – If the model is an ONNX model (use ONNX quantization instead).
ValueError – If dtype is not "int8". Precision aliases (fp16/bf16) raise with a migration message pointing at dtype= / model.to(...).
ImportError – If torchao is not installed and int8 on GPU is requested.
Examples
>>> model = GLiNER.from_pretrained("urchade/gliner_small-v2.1", map_location="cuda")
>>> model.quantize("int8")  # int8 (torchao on GPU, FBGEMM on CPU)
>>> # For precision-only changes, prefer:
>>> model = GLiNER.from_pretrained("urchade/gliner_small-v2.1", dtype="bf16")
- prepare_state_dict(state_dict)[source]
Prepare a state dict for saving, handling torch.compile artifacts.
- Parameters:
state_dict – Original state dictionary from the model.
- Returns:
Cleaned state dictionary with torch.compile prefixes removed.
- save_pretrained(save_directory, *, config=None, repo_id=None, push_to_hub=False, safe_serialization=False, **push_to_hub_kwargs)[source]
Save model weights and configuration to a local directory.
- Parameters:
save_directory (str | Path) – Path to directory for saving.
config (BaseGLiNERConfig | None) – Model configuration. Uses self.config if None.
repo_id (str | None) – Repository ID for hub upload.
push_to_hub (bool) – Whether to push to the HuggingFace Hub.
safe_serialization (bool) – Whether to use the safetensors format.
**push_to_hub_kwargs – Additional arguments for push_to_hub.
- Returns:
Repository URL if pushed to hub, None otherwise.
- Return type:
str | None
- classmethod load_from_config(config, cache_dir=None, load_tokenizer=True, resize_token_embeddings=True, backbone_from_pretrained=True, compile_torch_model=False, quantize=None, map_location='cpu', max_length=None, max_width=None, post_fusion_schema=None, _attn_implementation=None, **model_kwargs)[source]
Initialize a model from a configuration without loading pretrained weights.
This method creates a new model instance from scratch using the provided configuration. The backbone encoder can optionally be loaded from pretrained weights, but the GLiNER-specific layers are always randomly initialized.
- Parameters:
config (str | Path | GLiNERConfig | dict) – Model configuration (GLiNERConfig object, path to config file, or dict).
cache_dir (str | Path | None) – Cache directory for downloads.
load_tokenizer (bool) – Whether to load the tokenizer.
resize_token_embeddings (bool) – Whether to resize token embeddings.
backbone_from_pretrained (bool) – Whether to load the backbone encoder from pretrained weights.
compile_torch_model (bool) – Whether to compile with torch.compile.
quantize (str | None) – Only "int8" is accepted (int8 dynamic quantization: torchao on GPU, FBGEMM on CPU). For precision-only changes (fp16/bf16), use dtype=. None to disable.
map_location (str) – Device to map the model to.
max_length (int | None) – Override max_length in config.
max_width (int | None) – Override max_width in config.
post_fusion_schema (str | None) – Override post_fusion_schema in config.
_attn_implementation (str | None) – Override attention implementation.
**model_kwargs – Additional model initialization arguments.
- Returns:
Initialized model instance with randomly initialized weights (except the backbone if specified).
Examples
>>> config = GLiNERConfig(model_name="microsoft/deberta-v3-small")
>>> model = GLiNER.load_from_config(config)
>>> model = GLiNER.load_from_config("path/to/gliner_config.json")
>>> # Load with pretrained backbone but random GLiNER layers
>>> model = GLiNER.load_from_config(config, backbone_from_pretrained=True)
- classmethod from_pretrained(model_id, model_dir=None, revision=None, cache_dir=None, force_download=False, proxies=None, resume_download=False, local_files_only=False, token=None, map_location='cpu', strict=False, load_tokenizer=None, resize_token_embeddings=True, compile_torch_model=False, quantize=None, dtype=None, low_cpu_mem_usage=False, variant=None, load_onnx_model=False, onnx_model_file='model.onnx', session_options=None, max_length=None, max_width=None, post_fusion_schema=None, _attn_implementation=None, **model_kwargs)[source]
Load a pretrained model from the HuggingFace Hub or a local directory.
- Parameters:
model_id (str) – Model identifier or local path.
model_dir (str | None) – Override model directory path.
revision (str | None) – Model revision.
cache_dir (str | Path | None) – Cache directory.
force_download (bool) – Force redownload.
proxies (dict | None) – Proxy configuration.
resume_download (bool) – Resume interrupted downloads.
local_files_only (bool) – Only use local files.
token (str | bool | None) – HF token for private repos.
map_location (str) – Device to map the model to.
strict (bool) – Enforce strict state_dict loading.
load_tokenizer (bool | None) – Whether to load the tokenizer.
resize_token_embeddings (bool | None) – Whether to resize embeddings.
compile_torch_model (bool | None) – Whether to compile with torch.compile.
quantize (str | None) – Only "int8" is accepted (int8 dynamic quantization: torchao on GPU, FBGEMM on CPU). For precision-only changes (fp16/bf16), use dtype=. None to disable.
dtype (str | dtype | None) – Target floating-point dtype for the loaded weights (e.g. torch.bfloat16, "bf16", "fp16"). When set, the model shell is pre-cast and each state-dict tensor is cast during reading, so the full fp32 copy is never materialized – peak host memory is roughly half of the default path for bf16/fp16. Prefer this over quantize for plain precision changes.
low_cpu_mem_usage (bool) – If True, build the model under torch.device("meta") and use load_state_dict(assign=True) to swap loaded tensors into place. Skips the random-init compute, the fp32 random-init shell, and the post-init cast pass – the model goes from "shape descriptor" to "loaded weights" in one shot. Non-persistent buffers (e.g. DeBERTa's position_ids) are re-materialized after the load. Default False for now (opt-in); enable for cold-start / serverless deployments where every 100 ms matters.
variant (str | None) – If set ("fp16" or "bf16"), prefer model.{variant}.safetensors over the default fp32 file. Best-effort: the loader probes the Hub (or local path) for the variant file before downloading. If it is published, only the variant file is fetched (~half the bytes vs fp32) and loaded directly. If it is not published, a UserWarning is emitted and the loader falls back to the default fp32 file plus an in-memory cast – the same outcome as dtype={variant!r} alone, no I/O win, no error. dtype is inferred from variant when not set; passing both with mismatched precisions raises. None (default) preserves the prior behavior verbatim.
load_onnx_model (bool | None) – Whether to load an ONNX model instead of PyTorch.
onnx_model_file (str | None) – Path to the ONNX model file.
session_options – ONNX runtime session options.
max_length (int | None) – Override max_length in config.
max_width (int | None) – Override max_width in config.
post_fusion_schema (str | None) – Override post_fusion_schema in config.
_attn_implementation (str | None) – Override attention implementation.
**model_kwargs – Additional model initialization arguments.
- Returns:
Loaded model instance.
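The documented variant/dtype interaction can be sketched with an illustrative helper (resolve_dtype is not part of the API; it only mirrors the rules stated above, and the model id is one of the examples used elsewhere on this page):

```python
from typing import Optional

def resolve_dtype(variant: Optional[str], dtype: Optional[str]) -> Optional[str]:
    # Mirrors the documented rules: dtype is inferred from variant when
    # not set; passing both with mismatched precisions raises.
    if variant is not None and dtype is None:
        return variant  # inferred from the variant file
    if variant is not None and dtype is not None and variant != dtype:
        raise ValueError(f"variant={variant!r} conflicts with dtype={dtype!r}")
    return dtype

def load(model_id: str = "urchade/gliner_small-v2.1"):
    from gliner import GLiNER  # imported lazily; requires the gliner package
    # Prefer the published bf16 weights file; falls back to fp32 + in-memory
    # cast (with a UserWarning) when no variant file exists on the Hub.
    return GLiNER.from_pretrained(model_id, variant="bf16",
                                  low_cpu_mem_usage=True)
```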
- export_to_onnx(save_dir, onnx_filename='model.onnx', quantized_filename='model_quantized.onnx', quantize=False, opset=19, **export_kwargs)[source]
Unified ONNX export method using specifications from child classes.
- Parameters:
save_dir (str | Path) – Directory to save ONNX files.
onnx_filename (str) – Name of the ONNX model file.
quantized_filename (str) – Name of the quantized model file.
quantize (bool) – Whether to create a quantized version.
opset (int) – ONNX opset version.
**export_kwargs – Additional export arguments (model-specific).
- Returns:
onnx_path: Path to the standard ONNX model
quantized_path: Path to the quantized model (if quantize=True)
- Return type:
Dictionary with paths to exported models
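A usage sketch of the export and the shape of its return value (the key names come from the Returns section above; whether the values are strings or Path objects is an assumption):

```python
from pathlib import Path

def expected_paths(save_dir: str, quantize: bool) -> dict:
    # Shape of the returned dictionary, per the Returns section above,
    # using the default onnx_filename / quantized_filename values.
    paths = {"onnx_path": str(Path(save_dir) / "model.onnx")}
    if quantize:
        paths["quantized_path"] = str(Path(save_dir) / "model_quantized.onnx")
    return paths

def export(model, save_dir: str = "onnx_out"):
    # Export both a standard and a quantized ONNX graph.
    return model.export_to_onnx(save_dir, quantize=True, opset=19)
```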
- freeze_component(component_name)[source]
Freeze a specific component of the model.
- Parameters:
component_name (str) – Name of the component to freeze (e.g., "text_encoder", "labels_encoder", "decoder").
- unfreeze_component(component_name)[source]
Unfreeze a specific component of the model.
- Parameters:
component_name (str) – Name of the component to unfreeze.
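The freeze/unfreeze pair lends itself to a small context manager (an illustrative sketch, not part of the API; the component names are the examples given above):

```python
from contextlib import contextmanager

@contextmanager
def frozen_components(model, names=("text_encoder",)):
    # Temporarily freeze components (names per the docstring examples:
    # "text_encoder", "labels_encoder", "decoder"), then restore them.
    for name in names:
        model.freeze_component(name)
    try:
        yield model
    finally:
        for name in names:
            model.unfreeze_component(name)
```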
- classmethod create_training_args(output_dir, learning_rate=5e-05, weight_decay=0.01, others_lr=None, others_weight_decay=None, focal_loss_alpha=-1, focal_loss_gamma=0.0, rel_focal_loss_alpha=None, rel_focal_loss_gamma=None, focal_loss_prob_margin=0.0, loss_reduction='sum', negatives=1.0, masking='none', lr_scheduler_type='linear', warmup_ratio=0.1, per_device_train_batch_size=8, per_device_eval_batch_size=8, max_grad_norm=1.0, max_steps=10000, save_steps=1000, save_total_limit=10, logging_steps=10, use_cpu=False, bf16=False, dataloader_num_workers=1, report_to='none', **kwargs)[source]
Create training arguments with sensible defaults.
- Parameters:
output_dir (str | Path) – Directory to save model checkpoints.
learning_rate (float) – Learning rate for main parameters.
weight_decay (float) – Weight decay for main parameters.
others_lr (float | None) – Learning rate for other parameters.
others_weight_decay (float | None) – Weight decay for other parameters.
focal_loss_alpha (float) – Alpha for focal loss.
focal_loss_gamma (float) – Gamma for focal loss.
rel_focal_loss_alpha (float | None) – Alpha for relation focal loss. Defaults to the entity alpha.
rel_focal_loss_gamma (float | None) – Gamma for relation focal loss. Defaults to the entity gamma.
focal_loss_prob_margin (float) – Probability margin for focal loss.
loss_reduction (str) – Loss reduction method.
negatives (float) – Negative sampling ratio.
masking (str) – Masking strategy.
lr_scheduler_type (str) – Learning rate scheduler type.
warmup_ratio (float) – Warmup ratio.
per_device_train_batch_size (int) – Training batch size.
per_device_eval_batch_size (int) – Evaluation batch size.
max_grad_norm (float) – Maximum gradient norm.
max_steps (int) – Maximum training steps.
save_steps (int) – Save a checkpoint every N steps.
save_total_limit (int) – Maximum number of checkpoints to keep.
logging_steps (int) – Log every N steps.
use_cpu (bool) – Whether to use the CPU.
bf16 (bool) – Whether to use bfloat16.
dataloader_num_workers (int) – Number of dataloader workers.
report_to (str) – Where to report metrics.
**kwargs – Additional training arguments.
- Returns:
TrainingArguments instance.
- Return type:
TrainingArguments
- train_model(train_dataset, eval_dataset, training_args=None, freeze_components=None, compile_model=False, output_dir=None, **training_kwargs)[source]
Train the model.
- Parameters:
train_dataset – Training dataset.
eval_dataset – Evaluation dataset.
training_args (TrainingArguments | None) – Training arguments (created with defaults if None).
freeze_components (list[str] | None) – List of component names to freeze (e.g., ["text_encoder", "decoder"]).
compile_model (bool) – Whether to compile the model with torch.compile.
output_dir (str | Path | None) – Output directory (required if training_args is None).
**training_kwargs – Additional kwargs for creating training args.
- Returns:
Trained Trainer instance.
- Return type:
Trainer
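The two training methods compose as in the following sketch (hyperparameter values here are illustrative choices, not recommendations; dataset objects are assumed to already be in GLiNER's training format):

```python
def finetune(model, train_ds, eval_ds):
    # create_training_args supplies sensible defaults; only output_dir
    # is required.  Freezing the text encoder adapts only GLiNER layers.
    args = model.create_training_args(
        output_dir="checkpoints",
        learning_rate=5e-5,
        others_lr=1e-5,   # separate learning rate for non-backbone params
        max_steps=2000,
        bf16=True,
    )
    return model.train_model(
        train_ds, eval_ds,
        training_args=args,
        freeze_components=["text_encoder"],
    )
```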
- class gliner.model.BaseEncoderGLiNER(*args, **kwargs)[source]
Bases: BaseGLiNER
Initialize a BaseGLiNER model.
- Parameters:
config – Model configuration object.
model – Pre-initialized model instance. If None, creates a new model.
tokenizer – Pre-initialized tokenizer. If None, creates a new tokenizer.
data_processor – Pre-initialized data processor. If None, creates a new processor.
backbone_from_pretrained – Whether to load the backbone from pretrained weights.
cache_dir – Directory for caching downloaded models.
**kwargs – Additional keyword arguments passed to model creation.
- set_class_indices()[source]
Set the class token index in the configuration based on the tokenizer vocabulary.
- resize_embeddings(set_class_token_index=True)[source]
Resize token embeddings to match the tokenizer vocabulary size.
- Parameters:
set_class_token_index – Whether to update the class token index.
- prepare_inputs(texts)[source]
Prepare inputs for the model by tokenizing and creating index mappings.
- Parameters:
texts (List[str]) – The input texts to process.
- Returns:
all_tokens: List of tokenized texts
all_start_token_idx_to_text_idx: Start position mappings
all_end_token_idx_to_text_idx: End position mappings
- Return type:
Tuple containing
- prepare_base_input(all_tokens)[source]
Prepare the base input format for data collation.
- Parameters:
all_tokens (List[List[str]]) – List of tokenized texts.
- Returns:
List of input dictionaries ready for collation.
- Return type:
List[Dict[str, Any]]
- prepare_batch(texts, labels, input_spans=None, **kwargs)[source]
Prepare raw inputs for inference (tokenization and normalization).
This method handles text normalization, tokenization, and span conversion. Use this as the first step in the inference pipeline.
- Parameters:
texts (str | List[str]) – Single text string or list of texts.
labels (str | List[str] | List[List[str]]) – Entity labels – a string, list of strings, or per-text label lists.
input_spans (List[List[Dict]] | None) – Optional pre-defined spans to classify (character positions).
**kwargs – Additional keyword arguments passed to the data processor.
- Returns:
input_x: List of input dicts ready for collation
tokens: Tokenized texts
start_token_map: Per-text mapping from token idx to char start
end_token_map: Per-text mapping from token idx to char end
word_input_spans: Spans converted to word indices (or None)
entity_types: Normalized entity types
valid_texts: Non-empty texts that will be processed
valid_to_orig_idx: Mapping from valid indices to original indices
num_original: Total number of original texts
- Return type:
Dictionary containing
- collate_batch(input_x, entity_types, collator=None)[source]
Collate prepared inputs into a tensor batch.
- Parameters:
input_x (List[Dict[str, Any]]) – List of input dicts from prepare_batch.
entity_types (List[str] | List[List[str]]) – Entity type labels.
collator (Any | None) – Optional pre-created collator instance. If None, creates one.
- Returns:
Collated batch dictionary with tensors ready for the model.
- Return type:
Dict[str, Any]
- run_batch(batch, threshold=0.5, packing_config=None, move_to_device=True, **external_inputs)[source]
Run the model forward pass on a collated batch.
- Parameters:
batch (Dict[str, Any]) – Collated batch from collate_batch.
threshold (float) – Confidence threshold for predictions.
packing_config (InferencePackingConfig | None) – Optional inference packing configuration.
move_to_device (bool) – Whether to move tensors to the model device.
**external_inputs – Additional inputs to pass to the model.
- Returns:
Model output containing logits and span information.
- Return type:
Any
- decode_batch(model_output, batch, threshold=0.5, flat_ner=True, multi_label=False, return_class_probs=False, input_spans=None)[source]
Decode model output into entity predictions.
- Parameters:
model_output (Any) – Output from run_batch.
batch (Dict[str, Any]) – The collated batch (needs "tokens" and "id_to_classes").
threshold (float) – Confidence threshold for predictions.
flat_ner (bool) – Whether to use flat NER (no overlapping entities).
multi_label (bool) – Whether to allow multiple labels per span.
return_class_probs (bool) – Whether to include class probabilities.
input_spans (List[List[Tuple[int, int]]] | None) – Optional word-level input spans to classify.
- Returns:
List of entity lists (one per text in the batch).
- Return type:
List[List[Any]]
- map_entities_to_text(decoded, valid_texts, valid_to_orig_idx, start_token_map, end_token_map, num_original)[source]
Map decoded entities back to character positions in the original texts.
- Parameters:
decoded (List[List[Any]]) – Decoded entity spans from decode_batch.
valid_texts (List[str]) – List of valid (non-empty) texts.
valid_to_orig_idx (List[int]) – Mapping from valid indices to original indices.
start_token_map (List[List[int]]) – Per-text token-to-char-start mapping.
end_token_map (List[List[int]]) – Per-text token-to-char-end mapping.
num_original (int) – Total number of original texts.
- Returns:
List of entity dicts aligned with the original input texts.
- Return type:
List[List[Dict[str, Any]]]
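The five low-level steps above compose into a single pipeline, which is essentially what inference() does internally. A simplified sketch (method names are from this page; exact keyword handling may differ from the real implementation):

```python
def predict_pipeline(model, texts, labels, threshold=0.5):
    # 1. Tokenize and normalize the raw inputs.
    prep = model.prepare_batch(texts, labels)
    # 2. Collate into a tensor batch.
    batch = model.collate_batch(prep["input_x"], prep["entity_types"])
    # 3. Forward pass (run_batch moves tensors to the model device).
    out = model.run_batch(batch, threshold=threshold)
    # 4. Decode logits into word-level spans.
    decoded = model.decode_batch(out, batch, threshold=threshold)
    # 5. Map spans back to character offsets in the original texts.
    return model.map_entities_to_text(
        decoded,
        prep["valid_texts"],
        prep["valid_to_orig_idx"],
        prep["start_token_map"],
        prep["end_token_map"],
        prep["num_original"],
    )
```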
- create_collator()[source]
Create a data collator instance for batch collation.
Useful for serve.py to create a reusable collator.
- Returns:
Configured data collator instance.
- Return type:
Any
- inference(texts, labels, flat_ner=True, threshold=0.5, multi_label=False, batch_size=8, packing_config=None, input_spans=None, return_class_probs=False, **external_inputs)[source]
Predict entities for a batch of texts.
- Parameters:
texts (str | List[str]) – A list of input texts to predict entities for, or a single text string.
labels (List[str]) – A list of labels to predict.
flat_ner (bool) – Whether to use flat NER. Defaults to True.
threshold (float) – Confidence threshold for predictions. Defaults to 0.5.
multi_label (bool) – Whether to allow multiple labels per token. Defaults to False.
batch_size (int) – Batch size for processing. Defaults to 8.
packing_config (InferencePackingConfig | None) – Configuration describing how to pack encoder inputs. When None, the instance-level configuration set via configure_inference_packing is used.
input_spans (List[List[Dict]] | None) – Input entity spans that should be classified by the model.
return_class_probs (bool) – Whether to include class probabilities in the output. Defaults to False.
**external_inputs – Additional inputs to pass to the model.
- Returns:
start: Start character position
end: End character position
text: Entity text
label: Entity type
score: Confidence score
class_probs: (optional) Dictionary mapping class names to probabilities (top 5)
- Return type:
List of lists with predicted entities, where each entity is a dictionary containing
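A usage sketch, with a small post-processing helper over the entity dictionaries described above (the helper is illustrative, not part of the API; the model id is the one used in this page's examples):

```python
from collections import defaultdict

def group_by_label(entities: list) -> dict:
    # Group one text's predictions (dicts with "start", "end", "text",
    # "label", "score" keys, per the return description above) by label.
    grouped = defaultdict(list)
    for ent in entities:
        grouped[ent["label"]].append(ent["text"])
    return dict(grouped)

def run(texts, labels):
    from gliner import GLiNER  # requires the gliner package
    model = GLiNER.from_pretrained("urchade/gliner_small-v2.1")
    results = model.inference(texts, labels, threshold=0.5, batch_size=8)
    return [group_by_label(ents) for ents in results]
```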
- predict_entities(text, labels, flat_ner=True, threshold=0.5, multi_label=False, return_class_probs=False, **kwargs)[source]
Predict entities for a single text input.
- Parameters:
text (str) – The input text to predict entities for.
labels (List[str]) – The labels to predict.
flat_ner (bool) – Whether to use flat NER. Defaults to True.
threshold (float) – Confidence threshold for predictions. Defaults to 0.5.
multi_label (bool) – Whether to allow multiple labels per entity. Defaults to False.
return_class_probs (bool) – Whether to include class probabilities in the output. Defaults to False.
**kwargs – Additional arguments passed to inference.
- Returns:
List of entity predictions as dictionaries.
- Return type:
List[Dict[str, Any]]
- batch_predict_entities(texts, labels, flat_ner=True, threshold=0.5, multi_label=False, **kwargs)[source]
Predict entities for multiple texts.
DEPRECATED: Use inference instead.
This method will be removed in a future release. It now forwards to GLiNER.inference(...) to perform inference.
- Parameters:
texts (List[str]) – Input texts.
labels (List[str]) – Labels to predict.
flat_ner (bool) – Use flat NER. Defaults to True.
threshold (float) – Confidence threshold. Defaults to 0.5.
multi_label (bool) – Allow multiple labels per token/entity. Defaults to False.
**kwargs – Extra arguments forwarded to inference (e.g., batch_size).
- Returns:
List of entity predictions for each text.
- Return type:
List[List[Dict[str, Any]]]
- evaluate(test_data, flat_ner=False, multi_label=False, threshold=0.5, batch_size=12, entity_types=None)[source]
Evaluate the model on a given test dataset.
- Parameters:
test_data (List[Dict[str, Any]]) – The test data containing text and entity annotations.
flat_ner (bool) – Whether to use flat NER. Defaults to False.
multi_label (bool) – Whether to use multi-label classification. Defaults to False.
threshold (float) – The threshold for predictions. Defaults to 0.5.
batch_size (int) – The batch size for evaluation. Defaults to 12.
entity_types (List[str] | None) – Optional list of entity types to evaluate. If None, extracts them from the test data. Defaults to None.
- Returns:
Tuple containing the evaluation output and the F1 score.
- Return type:
Tuple[Any, float]
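An evaluation sketch. The helper mirrors the documented default (extracting the label set from the test data when entity_types is None); the {"ner": [[start, end, label], ...]} record layout is an assumption based on the standard GLiNER training-data format:

```python
def collect_entity_types(test_data: list) -> list:
    # Gather the sorted set of labels found in the annotations.
    # NOTE: the {"ner": [[start, end, label], ...]} layout is an assumption.
    types = set()
    for record in test_data:
        for *_span, label in record.get("ner", []):
            types.add(label)
    return sorted(types)

def run_eval(model, test_data):
    # Evaluate only on the types actually present in the test set.
    return model.evaluate(test_data, threshold=0.5, batch_size=12,
                          entity_types=collect_entity_types(test_data))
```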
- compress_prompt_embeddings(texts, labels, rel_labels=None, batch_size=8, distill=False, distill_threshold=0.3, distill_epochs=3, distill_lr=1e-05, distill_batch_size=None, distill_output_dir='./distill_ckpt', distill_train_kwargs=None)[source]
Precompute averaged prompt embeddings for each label.
Runs the normal forward pass over (texts, labels) pairs, extracts the per-label prompt embedding from each example, and stores the mean per label on the underlying model. Sets config.precomputed_prompts_mode to True so subsequent inference/training will skip label-prepending and look up the stored embeddings instead. Relation labels are supported for relation-extraction models via rel_labels.
When distill=True, the raw (pre-compression) model first generates pseudo-labels over texts; the method then compresses prompt embeddings and fine-tunes the compressed model on those pseudo-labels so quality recovers end-to-end in a single call.
- Parameters:
texts (List[str]) – List of raw input texts used as contexts for averaging.
labels (List[str]) – Entity labels to compress.
rel_labels (List[str] | None) – Optional relation labels (relex models only).
batch_size (int) – Batch size used while running the model.
distill (bool) – If True, generate pseudo-labels with the raw model over texts and fine-tune the compressed model on them.
distill_threshold (float) – Confidence threshold for pseudo-label generation.
distill_epochs (int) – Number of fine-tuning epochs.
distill_lr (float) – Fine-tuning learning rate.
distill_batch_size (int | None) – Batch size for fine-tuning (defaults to batch_size).
distill_output_dir (str) – Output directory passed to train_model.
distill_train_kwargs (Dict[str, Any] | None) – Extra kwargs forwarded to train_model.
- class gliner.model.BaseBiEncoderGLiNER(*args, **kwargs)[source]
Bases: BaseEncoderGLiNER
Initialize a BaseGLiNER model.
- Parameters:
config – Model configuration object.
model – Pre-initialized model instance. If None, creates a new model.
tokenizer – Pre-initialized tokenizer. If None, creates a new tokenizer.
data_processor – Pre-initialized data processor. If None, creates a new processor.
backbone_from_pretrained – Whether to load the backbone from pretrained weights.
cache_dir – Directory for caching downloaded models.
**kwargs – Additional keyword arguments passed to model creation.
- resize_embeddings(**kwargs)[source]
Resize token embeddings to match the tokenizer vocabulary size.
- Parameters:
set_class_token_index – Whether to update the class token index.
- encode_labels(labels, batch_size=8)[source]
Compute embeddings for labels using the label encoder.
- Parameters:
labels (List[str]) – A list of labels to encode.
batch_size (int) – Batch size for processing labels.
- Returns:
Tensor containing label embeddings with shape (num_labels, hidden_size).
- Raises:
NotImplementedError – If the model doesn't have a label encoder.
- Return type:
FloatTensor
- batch_predict_with_embeds(texts, labels_embeddings, labels, flat_ner=True, threshold=0.5, multi_label=False, batch_size=8, packing_config=None, input_spans=None, return_class_probs=False)[source]
Predict entities for a batch of texts using pre-computed label embeddings.
- Parameters:
texts (List[str]) – A list of input texts to predict entities for.
labels_embeddings (Tensor) – Pre-computed embeddings for the labels.
labels (List[str]) – List of label strings corresponding to the embeddings.
flat_ner (bool) – Whether to use flat NER. Defaults to True.
threshold (float) – Confidence threshold for predictions. Defaults to 0.5.
multi_label (bool) – Whether to allow multiple labels per token. Defaults to False.
batch_size (int) – Batch size for processing. Defaults to 8.
packing_config (InferencePackingConfig | None) – Configuration describing how to pack encoder inputs. When None, the instance-level configuration set via configure_inference_packing is used.
input_spans (List[List[Dict]] | None) – Input entity spans to limit predictions to. Each span is a dict with "start" and "end" character positions.
return_class_probs (bool) – Whether to include class probabilities in the output. Defaults to False.
- Returns:
List of lists with predicted entities.
- Return type:
List[List[Dict[str, Any]]]
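The point of the bi-encoder split is that label embeddings can be computed once and reused across many calls. A caching sketch (illustrative, not part of the API):

```python
def cached_predict(model, texts, labels, _cache={}):
    # Encode each distinct label set once via encode_labels, then reuse
    # the embeddings across calls with batch_predict_with_embeds.
    # (The shared mutable default acts as a process-wide cache here;
    # pass an explicit dict to control its lifetime.)
    key = tuple(labels)
    if key not in _cache:
        _cache[key] = model.encode_labels(labels, batch_size=8)
    return model.batch_predict_with_embeds(texts, _cache[key], labels)
```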
- predict_with_embeds(text, labels_embeddings, labels, flat_ner=True, threshold=0.5, multi_label=False, return_class_probs=False, **kwargs)[source]
Predict entities for a single text input using pre-computed label embeddings.
- Parameters:
text – The input text to predict entities for.
labels_embeddings – Pre-computed embeddings for the labels.
labels – List of label strings corresponding to the embeddings.
flat_ner – Whether to use flat NER. Defaults to True.
threshold – Confidence threshold for predictions. Defaults to 0.5.
multi_label – Whether to allow multiple labels per entity. Defaults to False.
return_class_probs – Whether to include class probabilities in the output. Defaults to False.
**kwargs – Additional arguments passed to batch_predict_with_embeds.
- Returns:
List of entity predictions.
- class gliner.model.UniEncoderSpanGLiNER(*args, **kwargs)[source]
Bases: BaseEncoderGLiNER
Initialize a BaseGLiNER model.
- Parameters:
config – Model configuration object.
model – Pre-initialized model instance. If None, creates a new model.
tokenizer – Pre-initialized tokenizer. If None, creates a new tokenizer.
data_processor – Pre-initialized data processor. If None, creates a new processor.
backbone_from_pretrained – Whether to load the backbone from pretrained weights.
cache_dir – Directory for caching downloaded models.
**kwargs – Additional keyword arguments passed to model creation.
- config_class
alias of UniEncoderSpanConfig
- model_class
alias of UniEncoderSpanModel
- ort_model_class
alias of UniEncoderSpanORTModel
- data_processor_class
alias of UniEncoderSpanProcessor
- data_collator_class
alias of UniEncoderSpanDataCollator
- decoder_class
alias of SpanDecoder
- class gliner.model.UniEncoderTokenGLiNER(*args, **kwargs)[source]
Bases: BaseEncoderGLiNER
Initialize a BaseGLiNER model.
- Parameters:
config – Model configuration object.
model – Pre-initialized model instance. If None, creates a new model.
tokenizer – Pre-initialized tokenizer. If None, creates a new tokenizer.
data_processor – Pre-initialized data processor. If None, creates a new processor.
backbone_from_pretrained – Whether to load the backbone from pretrained weights.
cache_dir – Directory for caching downloaded models.
**kwargs – Additional keyword arguments passed to model creation.
- config_class
alias of UniEncoderTokenConfig
- model_class
alias of UniEncoderTokenModel
- ort_model_class
alias of UniEncoderTokenORTModel
- data_processor_class
alias of UniEncoderTokenProcessor
- data_collator_class
alias of UniEncoderTokenDataCollator
- decoder_class
alias of TokenDecoder
- class gliner.model.BiEncoderSpanGLiNER(*args, **kwargs)[source]
Bases: BaseBiEncoderGLiNER
Initialize a BaseGLiNER model.
- Parameters:
config – Model configuration object.
model – Pre-initialized model instance. If None, creates a new model.
tokenizer – Pre-initialized tokenizer. If None, creates a new tokenizer.
data_processor – Pre-initialized data processor. If None, creates a new processor.
backbone_from_pretrained – Whether to load the backbone from pretrained weights.
cache_dir – Directory for caching downloaded models.
**kwargs – Additional keyword arguments passed to model creation.
- config_class
alias of BiEncoderSpanConfig
- model_class
alias of BiEncoderSpanModel
- ort_model_class
alias of BiEncoderSpanORTModel
- data_processor_class
alias of BiEncoderSpanProcessor
- data_collator_class
alias of BiEncoderSpanDataCollator
- decoder_class
alias of SpanDecoder
- class gliner.model.BiEncoderTokenGLiNER(*args, **kwargs)[source]
Bases: BaseBiEncoderGLiNER
Initialize a BaseGLiNER model.
- Parameters:
config – Model configuration object.
model – Pre-initialized model instance. If None, creates a new model.
tokenizer – Pre-initialized tokenizer. If None, creates a new tokenizer.
data_processor – Pre-initialized data processor. If None, creates a new processor.
backbone_from_pretrained – Whether to load the backbone from pretrained weights.
cache_dir – Directory for caching downloaded models.
**kwargs – Additional keyword arguments passed to model creation.
- config_class
alias of BiEncoderTokenConfig
- model_class
alias of BiEncoderTokenModel
- ort_model_class
alias of BiEncoderTokenORTModel
- data_processor_class
alias of BiEncoderTokenProcessor
- data_collator_class
alias of BiEncoderTokenDataCollator
- decoder_class
alias of TokenDecoder
- class gliner.model.UniEncoderSpanDecoderGLiNER(*args, **kwargs)[source]ΒΆ
Bases:
BaseEncoderGLiNER
GLiNER model with span-based encoding and label decoding capabilities.
Supports generating textual labels for entities.
Initialize a BaseGLiNER model.
- Parameters:
config – Model configuration object.
model – Pre-initialized model instance. If None, creates a new model.
tokenizer – Pre-initialized tokenizer. If None, creates a new tokenizer.
data_processor – Pre-initialized data processor. If None, creates a new processor.
backbone_from_pretrained – Whether to load the backbone from pretrained weights.
cache_dir – Directory for caching downloaded models.
**kwargs – Additional keyword arguments passed to model creation.
- config_classΒΆ
alias of
UniEncoderSpanDecoderConfig
- model_classΒΆ
alias of
UniEncoderSpanDecoderModel
- ort_model_class: type = NoneΒΆ
- data_processor_classΒΆ
alias of
UniEncoderSpanDecoderProcessor
- data_collator_classΒΆ
alias of
UniEncoderSpanDecoderDataCollator
- decoder_classΒΆ
alias of
SpanGenerativeDecoder
- set_labels_trie(labels)[source]ΒΆ
Initialize the labels trie for constrained generation.
- Parameters:
labels (List[str]) – Labels that will be used for constrained generation.
- Returns:
Trie structure for constrained beam search.
- Raises:
NotImplementedError – If the model doesn't have a decoder.
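Conceptually, the trie is a nested mapping over the token-id sequences of the allowed labels: during constrained beam search, only children of the current prefix are valid next tokens. A minimal sketch (the token ids and trie layout here are illustrative, not the library's actual trie class):

```python
# Minimal prefix trie over token-id sequences, illustrating how a labels
# trie constrains generation: at each step, only the children of the
# current prefix are allowed as next tokens.
from typing import Dict, List


class LabelsTrie:
    def __init__(self, sequences: List[List[int]]):
        self.root: Dict[int, dict] = {}
        for seq in sequences:
            node = self.root
            for tok in seq:
                node = node.setdefault(tok, {})

    def allowed_tokens(self, prefix: List[int]) -> List[int]:
        """Token ids that may follow the prefix (empty if prefix is invalid)."""
        node = self.root
        for tok in prefix:
            if tok not in node:
                return []
            node = node[tok]
        return list(node)


# Suppose "person" tokenizes to [10, 11] and "place" to [10, 12, 13].
trie = LabelsTrie([[10, 11], [10, 12, 13]])
print(trie.allowed_tokens([]))       # only token 10 can start a label
print(trie.allowed_tokens([10]))     # 11 or 12 may follow
print(trie.allowed_tokens([10, 12]))
```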
- generate_labels(model_output, **gen_kwargs)[source]ΒΆ
Generate textual class labels for each entity span.
- Parameters:
model_output – Model output containing decoder_embedding and decoder_embedding_mask.
**gen_kwargs – Generation parameters (max_new_tokens, temperature, etc.).
- Returns:
List of generated label strings.
- run_batch(batch, threshold=0.5, packing_config=None, move_to_device=True, gen_constraints=None, num_gen_sequences=1, **gen_kwargs)[source]ΒΆ
Run model forward pass on a collated batch with label generation.
- Parameters:
batch (Dict[str, Any]) – Collated batch from collate_batch.
threshold (float) – Confidence threshold for predictions.
packing_config (InferencePackingConfig | None) – Optional inference packing configuration.
move_to_device (bool) – Whether to move tensors to model device.
gen_constraints (List[str] | None) – Labels to constrain generation.
num_gen_sequences (int) – Number of label sequences to generate per span.
**gen_kwargs – Additional generation parameters.
- Returns:
Model output with generated labels attached.
- Return type:
Any
- decode_batch(model_output, batch, threshold=0.5, flat_ner=True, multi_label=False, return_class_probs=False, input_spans=None)[source]ΒΆ
Decode model output into entity predictions with generated labels.
- Parameters:
model_output (Any) – Output from run_batch (includes gen_labels).
batch (Dict[str, Any]) – The collated batch (needs "tokens" and "id_to_classes").
threshold (float) – Confidence threshold for predictions.
flat_ner (bool) – Whether to use flat NER (no overlapping entities).
multi_label (bool) – Whether to allow multiple labels per span.
return_class_probs (bool) – Whether to include class probabilities.
input_spans (List[List[Tuple[int, int]]] | None) – Optional word-level input spans to classify.
- Returns:
List of entity lists (one per text in batch).
- Return type:
List[List[Any]]
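With flat_ner=True, overlapping candidate spans have to be resolved. A common strategy, sketched here on made-up candidates (not necessarily the library's exact decoder logic), is greedy selection by score:

```python
from typing import List, Tuple

# Each candidate: (start_token, end_token, label, score), token indices inclusive.
Span = Tuple[int, int, str, float]


def greedy_flat_ner(candidates: List[Span]) -> List[Span]:
    """Keep the highest-scoring spans, dropping any that overlap a kept span."""
    kept: List[Span] = []
    for span in sorted(candidates, key=lambda s: s[3], reverse=True):
        start, end = span[0], span[1]
        # A span is admitted only if it is disjoint from every kept span.
        if all(end < k[0] or start > k[1] for k in kept):
            kept.append(span)
    return sorted(kept, key=lambda s: s[0])


candidates = [
    (0, 1, "person", 0.91),
    (1, 2, "organization", 0.60),  # overlaps the span above at token 1
    (4, 5, "location", 0.85),
]
print(greedy_flat_ner(candidates))  # the overlapping lower-scoring span is dropped
```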
- map_entities_to_text(decoded, valid_texts, valid_to_orig_idx, start_token_map, end_token_map, num_original)[source]ΒΆ
Map decoded entities back to character positions with generated labels.
- Parameters:
decoded (List[List[Any]]) – Decoded entity spans from decode_batch.
valid_texts (List[str]) – List of valid (non-empty) texts.
valid_to_orig_idx (List[int]) – Mapping from valid indices to original indices.
start_token_map (List[List[int]]) – Per-text token-to-char-start mapping.
end_token_map (List[List[int]]) – Per-text token-to-char-end mapping.
num_original (int) – Total number of original texts.
- Returns:
List of entity dicts aligned with original input texts.
- Return type:
List[List[Dict[str, Any]]]
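The token-to-character maps are per-text lists where entry i holds the character offset of token i, so mapping a decoded span reduces to two lookups plus a slice. A self-contained sketch (the entity dict keys are assumed for illustration):

```python
from typing import Dict, List, Tuple


def spans_to_char_entities(
    token_spans: List[Tuple[int, int, str]],
    text: str,
    start_token_map: List[int],
    end_token_map: List[int],
) -> List[Dict]:
    """Convert (start_token, end_token, label) spans to character-level entity dicts."""
    entities = []
    for start_tok, end_tok, label in token_spans:
        start_char = start_token_map[start_tok]  # where the first token begins
        end_char = end_token_map[end_tok]        # where the last token ends
        entities.append({
            "start": start_char,
            "end": end_char,
            "text": text[start_char:end_char],
            "label": label,
        })
    return entities


text = "Ada Lovelace wrote programs"
# Tokens: ["Ada", "Lovelace", "wrote", "programs"]
start_map = [0, 4, 13, 19]  # char offset where each token starts
end_map = [3, 12, 18, 27]   # char offset where each token ends
print(spans_to_char_entities([(0, 1, "person")], text, start_map, end_map))
```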
- inference(texts, labels, flat_ner=True, threshold=0.5, multi_label=False, batch_size=8, gen_constraints=None, num_gen_sequences=1, packing_config=None, input_spans=None, return_class_probs=False, **gen_kwargs)[source]ΒΆ
Predict entities with optional label generation.
- Parameters:
texts (str | List[str]) – Input texts (string or list of strings).
labels (List[str]) – Entity type labels.
flat_ner (bool) – Whether to use flat NER.
threshold (float) – Confidence threshold.
multi_label (bool) – Allow multiple labels per span.
batch_size (int) – Batch size for processing.
gen_constraints (List[str] | None) – Labels to constrain generation.
num_gen_sequences (int) – Number of label sequences to generate per span.
packing_config (InferencePackingConfig | None) – Inference packing configuration.
input_spans (List[List[Dict]] | None) – Input entity spans to limit predictions to. Each span is a dict with "start" and "end" character positions.
return_class_probs (bool) – Whether to include class probabilities in output. Defaults to False.
**gen_kwargs – Additional generation parameters.
- Returns:
List of entity predictions with optional generated labels.
- Return type:
List[List[Dict[str, Any]]]
- predict_entities(text, labels, flat_ner=True, threshold=0.5, multi_label=False, gen_constraints=None, num_gen_sequences=1, return_class_probs=False, **gen_kwargs)[source]ΒΆ
Predict entities for a single text input with optional label generation.
- Parameters:
text (str) – The input text to predict entities for.
labels (List[str]) – The labels to predict.
flat_ner (bool) – Whether to use flat NER. Defaults to True.
threshold (float) – Confidence threshold for predictions. Defaults to 0.5.
multi_label (bool) – Whether to allow multiple labels per entity. Defaults to False.
gen_constraints (List[str] | None) – Labels to constrain generation.
num_gen_sequences (int) – Number of label sequences to generate per span.
return_class_probs (bool) – Whether to include class probabilities in output. Defaults to False.
**gen_kwargs – Additional generation parameters.
- Returns:
List of entity predictions as dictionaries.
- Return type:
List[Dict[str, Any]]
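The multi_label flag controls whether a span keeps every label whose score clears the threshold or only the single best one. A small sketch of that selection step (illustrative; not the library's decoder):

```python
from typing import Dict, List


def select_labels(class_scores: Dict[str, float],
                  threshold: float,
                  multi_label: bool) -> List[str]:
    """Pick labels for one span from its per-class scores."""
    above = {label: s for label, s in class_scores.items() if s >= threshold}
    if not above:
        return []  # span is discarded entirely
    if multi_label:
        # Every label above threshold, best first.
        return sorted(above, key=above.get, reverse=True)
    # Single-label mode: only the top-scoring label survives.
    return [max(above, key=above.get)]


scores = {"person": 0.9, "organization": 0.7, "location": 0.2}
print(select_labels(scores, threshold=0.5, multi_label=False))  # best label only
print(select_labels(scores, threshold=0.5, multi_label=True))   # all above threshold
```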
- class gliner.model.UniEncoderTokenDecoderGLiNER(*args, **kwargs)[source]ΒΆ
Bases:
UniEncoderSpanDecoderGLiNER
GLiNER model with token-based encoding and label decoding capabilities.
Combines token-level BIO tagging with a decoder that generates entity type labels autoregressively.
Initialize a BaseGLiNER model.
- Parameters:
config – Model configuration object.
model – Pre-initialized model instance. If None, creates a new model.
tokenizer – Pre-initialized tokenizer. If None, creates a new tokenizer.
data_processor – Pre-initialized data processor. If None, creates a new processor.
backbone_from_pretrained – Whether to load the backbone from pretrained weights.
cache_dir – Directory for caching downloaded models.
**kwargs – Additional keyword arguments passed to model creation.
- config_classΒΆ
alias of
UniEncoderTokenDecoderConfig
- model_classΒΆ
alias of
UniEncoderTokenDecoderModel
- ort_model_class: type = NoneΒΆ
- data_processor_classΒΆ
alias of
UniEncoderTokenDecoderProcessor
- data_collator_classΒΆ
alias of
UniEncoderTokenDecoderDataCollator
- decoder_classΒΆ
alias of
TokenGenerativeDecoder
- class gliner.model.UniEncoderSpanRelexGLiNER(*args, **kwargs)[source]ΒΆ
Bases:
BaseEncoderGLiNER
GLiNER model for both entity recognition and relation extraction.
Performs joint entity and relation prediction, allowing the model to simultaneously detect entities and the relationships between them in a single forward pass.
Initialize a BaseGLiNER model.
- Parameters:
config – Model configuration object.
model – Pre-initialized model instance. If None, creates a new model.
tokenizer – Pre-initialized tokenizer. If None, creates a new tokenizer.
data_processor – Pre-initialized data processor. If None, creates a new processor.
backbone_from_pretrained – Whether to load the backbone from pretrained weights.
cache_dir – Directory for caching downloaded models.
**kwargs – Additional keyword arguments passed to model creation.
- config_classΒΆ
alias of
UniEncoderSpanRelexConfig
- model_classΒΆ
alias of
UniEncoderSpanRelexModel
- ort_model_classΒΆ
alias of
UniEncoderSpanRelexORTModel
- data_processor_classΒΆ
alias of
RelationExtractionSpanProcessor
- data_collator_classΒΆ
alias of
RelationExtractionSpanDataCollator
- decoder_classΒΆ
alias of
SpanRelexDecoder
- set_class_indices()[source]ΒΆ
Set the class token indices for entities and relations in the configuration.
- prepare_batch(texts, labels, input_spans=None, relations=None, **kwargs)[source]ΒΆ
Prepare raw inputs for inference including relation types.
- Parameters:
texts (str | List[str]) – Single text string or list of texts.
labels (str | List[str] | List[List[str]]) – Entity labels - string, list of strings, or per-text label lists.
input_spans (List[List[Dict]] | None) – Optional pre-defined spans to classify (character positions).
relations (str | List[str] | List[List[str]] | None) – Relation type labels - string, list of strings, or per-text label lists.
**kwargs – Additional keyword arguments passed to the parent prepare_batch.
- Returns:
Dictionary containing prepared inputs plus relation_types.
- Return type:
Dict[str, Any]
- collate_batch(input_x, entity_types, collator=None, relation_types=None)[source]ΒΆ
Collate prepared inputs into a tensor batch with relation types.
- Parameters:
input_x (List[Dict[str, Any]]) – List of input dicts from prepare_batch.
entity_types (List[str] | List[List[str]]) – Entity type labels.
collator (Any | None) – Optional pre-created collator instance.
relation_types (List[str] | List[List[str]] | None) – Relation type labels (list or per-text lists).
- Returns:
Collated batch dictionary with tensors ready for the model.
- Return type:
Dict[str, Any]
- create_collator()[source]ΒΆ
Create a data collator instance for relation extraction.
- Returns:
Configured data collator instance.
- Return type:
Any
- run_batch(batch, threshold=0.5, adjacency_threshold=None, packing_config=None, move_to_device=True, **external_inputs)[source]ΒΆ
Run model forward pass on a collated batch.
- Parameters:
batch (Dict[str, Any]) – Collated batch from collate_batch.
threshold (float) – Confidence threshold for predictions.
adjacency_threshold (float | None) – Threshold for adjacency matrix reconstruction.
packing_config (InferencePackingConfig | None) – Optional inference packing configuration.
move_to_device (bool) – Whether to move tensors to model device.
**external_inputs – Additional inputs to pass to the model.
- Returns:
Model output containing logits and relation information.
- Return type:
Any
- decode_batch(model_output, batch, threshold=0.5, relation_threshold=None, flat_ner=True, multi_label=False, return_class_probs=False, input_spans=None)[source]ΒΆ
Decode model output into entity and relation predictions.
- Parameters:
model_output (Any) – Output from run_batch.
batch (Dict[str, Any]) – The collated batch.
threshold (float) – Confidence threshold for entity predictions.
relation_threshold (float | None) – Confidence threshold for relation predictions.
flat_ner (bool) – Whether to use flat NER.
multi_label (bool) – Whether to allow multiple labels per span.
return_class_probs (bool) – Whether to include class probabilities.
input_spans (List[List[Tuple[int, int]]] | None) – Optional word-level input spans to classify.
- Returns:
Tuple of (entity_outputs, relation_outputs) where each is a list per text.
- Return type:
Tuple[List[List[Any]], List[List[Any]]]
- map_entities_to_text(decoded, valid_texts, valid_to_orig_idx, start_token_map, end_token_map, num_original)[source]ΒΆ
Map decoded entities back to character positions in original texts.
- Parameters:
decoded (List[List[Any]]) – Decoded entity spans from decode_batch.
valid_texts (List[str]) – List of valid (non-empty) texts.
valid_to_orig_idx (List[int]) – Mapping from valid indices to original indices.
start_token_map (List[List[int]]) – Per-text token-to-char-start mapping.
end_token_map (List[List[int]]) – Per-text token-to-char-end mapping.
num_original (int) – Total number of original texts.
- Returns:
List of entity dicts aligned with original input texts.
- Return type:
List[List[Dict[str, Any]]]
- map_relations_to_text(relation_outputs, entity_outputs, valid_texts, valid_to_orig_idx, start_token_map, end_token_map, num_original)[source]ΒΆ
Map relation predictions back to character positions.
- Parameters:
relation_outputs (List[List[Any]]) – Decoded relations per text.
entity_outputs (List[List[Any]]) – Decoded entities per text (for getting span info).
valid_texts (List[str]) – List of valid (non-empty) texts.
valid_to_orig_idx (List[int]) – Mapping from valid indices to original indices.
start_token_map (List[List[int]]) – Per-text token-to-char-start mapping.
end_token_map (List[List[int]]) – Per-text token-to-char-end mapping.
num_original (int) – Total number of original texts.
- Returns:
List of relation dicts aligned with original input texts.
- Return type:
List[List[Dict[str, Any]]]
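Since relations reference entities rather than text positions, mapping them back to text can reuse the already-mapped entity dicts. A sketch under the assumption that a decoded relation is a (head_index, tail_index, label, score) tuple:

```python
from typing import Dict, List, Tuple


def relations_to_text(
    relations: List[Tuple[int, int, str, float]],
    entities: List[Dict],
) -> List[Dict]:
    """Resolve (head_idx, tail_idx, label, score) relations against mapped entities."""
    out = []
    for head_idx, tail_idx, label, score in relations:
        out.append({
            "head": entities[head_idx]["text"],  # head entity surface form
            "tail": entities[tail_idx]["text"],  # tail entity surface form
            "label": label,
            "score": score,
        })
    return out


entities = [
    {"start": 0, "end": 12, "text": "Ada Lovelace", "label": "person"},
    {"start": 19, "end": 27, "text": "programs", "label": "work"},
]
print(relations_to_text([(0, 1, "wrote", 0.8)], entities))
```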
- inference(texts, labels, relations=[], flat_ner=True, threshold=0.5, adjacency_threshold=None, relation_threshold=None, multi_label=False, batch_size=8, packing_config=None, input_spans=None, return_relations=True, return_class_probs=False)[source]ΒΆ
Predict entities and relations.
- Parameters:
texts (str | List[str]) – Input texts (str or List[str]).
labels (str | List[str] | List[List[str]]) – Entity type labels - string, list of strings, or per-text label lists.
relations (str | List[str] | List[List[str]]) – Relation type labels - string, list of strings, or per-text label lists.
flat_ner (bool) – Whether to use flat NER (no nested entities).
threshold (float) – Confidence threshold for entities.
adjacency_threshold (float | None) – Confidence threshold for adjacency matrix reconstruction (defaults to threshold).
relation_threshold (float | None) – Confidence threshold for relations (defaults to threshold).
multi_label (bool) – Allow multiple labels per span.
batch_size (int) – Batch size for processing.
packing_config (InferencePackingConfig | None) – Inference packing configuration.
input_spans (List[List[Dict]] | None) – Input entity spans to limit predictions to. Each span is a dict with "start" and "end" character positions.
return_relations (bool) – Whether to return relation predictions.
return_class_probs (bool) – Whether to include class probabilities in output. Defaults to False.
- Returns:
Tuple of (entities, relations) if return_relations=True, else just entities.
- Return type:
List[List[Dict[str, Any]]] | Tuple[List[List[Dict[str, Any]]], List[List[Dict[str, Any]]]]
- predict_entities(text, labels, relations=[], flat_ner=True, threshold=0.5, adjacency_threshold=None, multi_label=False, return_class_probs=False, **kwargs)[source]ΒΆ
Predict entities for a single text input.
- Parameters:
text (str) – The input text to predict entities for.
labels (List[str]) – The entity labels to predict.
relations (List[str]) – The relation labels (used for context but entities only returned).
flat_ner (bool) – Whether to use flat NER. Defaults to True.
threshold (float) – Confidence threshold for predictions. Defaults to 0.5.
adjacency_threshold (float | None) – Threshold for adjacency matrix reconstruction. Defaults to threshold.
multi_label (bool) – Whether to allow multiple labels per entity. Defaults to False.
return_class_probs (bool) – Whether to include class probabilities in output. Defaults to False.
**kwargs – Additional arguments passed to inference.
- Returns:
List of entity predictions as dictionaries.
- Return type:
List[Dict[str, Any]]
- predict_relations(text, labels, relations, flat_ner=True, threshold=0.5, adjacency_threshold=None, relation_threshold=None, multi_label=False, **kwargs)[source]ΒΆ
Predict entities and relations for a single text input.
- Parameters:
text (str) – The input text to predict entities and relations for.
labels (List[str]) – The entity labels to predict.
relations (List[str]) – The relation labels to predict.
flat_ner (bool) – Whether to use flat NER. Defaults to True.
threshold (float) – Confidence threshold for entities. Defaults to 0.5.
adjacency_threshold (float | None) – Threshold for adjacency matrix reconstruction. Defaults to threshold.
relation_threshold (float | None) – Confidence threshold for relations. Defaults to threshold.
multi_label (bool) – Whether to allow multiple labels per entity. Defaults to False.
**kwargs – Additional arguments passed to inference.
- Returns:
Tuple of (entities, relations) for the single text.
- Return type:
Tuple[List[Dict[str, Any]], List[Dict[str, Any]]]
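Throughout this class, relation_threshold and adjacency_threshold fall back to threshold when left as None. A small sketch of that defaulting combined with relation filtering (dict keys assumed for illustration):

```python
from typing import Dict, List, Optional


def filter_relations(
    relations: List[Dict],
    threshold: float,
    relation_threshold: Optional[float] = None,
) -> List[Dict]:
    """Keep relations scoring at or above relation_threshold (defaults to threshold)."""
    effective = relation_threshold if relation_threshold is not None else threshold
    return [r for r in relations if r["score"] >= effective]


relations = [{"label": "wrote", "score": 0.6}, {"label": "born_in", "score": 0.3}]
print(filter_relations(relations, threshold=0.5))                       # falls back to 0.5
print(filter_relations(relations, threshold=0.5, relation_threshold=0.25))
```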
- evaluate(test_data, flat_ner=False, multi_label=False, threshold=0.5, adjacency_threshold=None, relation_threshold=None, batch_size=12, entity_types=None)[source]ΒΆ
Evaluate the model on both NER and relation extraction tasks.
- Parameters:
test_data (List[Dict[str, Any]]) – The test data containing text, entity, and relation annotations.
flat_ner (bool) – Whether to use flat NER. Defaults to False.
multi_label (bool) – Whether to use multi-label classification. Defaults to False.
threshold (float) – The threshold for entity predictions. Defaults to 0.5.
adjacency_threshold (float | None) – Threshold for adjacency matrix reconstruction. Defaults to threshold.
relation_threshold (float | None) – The threshold for relation predictions. Defaults to threshold.
batch_size (int) – The batch size for evaluation. Defaults to 12.
entity_types (List[str] | None) – Optional list of entity types to evaluate. If None, extracts from test data. Defaults to None.
- Returns:
Tuple of ((ner_output, ner_f1), (rel_output, rel_f1)) containing:
ner_output: Formatted string with NER P, R, F1
ner_f1: NER F1 score
rel_output: Formatted string with relation extraction P, R, F1
rel_f1: Relation extraction F1 score
- Return type:
Tuple[Tuple[str, float], Tuple[str, float]]
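Both scores are precision/recall/F1 over exact matches between predicted and gold annotations. A self-contained sketch of micro-averaged F1 (illustrative; not the library's evaluator):

```python
from typing import Set, Tuple


def micro_f1(predicted: Set[Tuple], gold: Set[Tuple]) -> Tuple[float, float, float]:
    """Micro-averaged precision, recall, F1 over exact (start, end, label) matches."""
    tp = len(predicted & gold)  # true positives: exact matches
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(gold) if gold else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


gold = {(0, 3, "person"), (10, 15, "location")}
pred = {(0, 3, "person"), (20, 25, "location")}
print(micro_f1(pred, gold))  # (0.5, 0.5, 0.5)
```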
- class gliner.model.UniEncoderTokenRelexGLiNER(*args, **kwargs)[source]ΒΆ
Bases:
UniEncoderSpanRelexGLiNER
GLiNER model for both entity recognition and relation extraction.
Performs joint entity and relation prediction, allowing the model to simultaneously detect entities and the relationships between them in a single forward pass.
Initialize a BaseGLiNER model.
- Parameters:
config – Model configuration object.
model – Pre-initialized model instance. If None, creates a new model.
tokenizer – Pre-initialized tokenizer. If None, creates a new tokenizer.
data_processor – Pre-initialized data processor. If None, creates a new processor.
backbone_from_pretrained – Whether to load the backbone from pretrained weights.
cache_dir – Directory for caching downloaded models.
**kwargs – Additional keyword arguments passed to model creation.
- config_classΒΆ
alias of
UniEncoderTokenRelexConfig
- model_classΒΆ
alias of
UniEncoderTokenRelexModel
- ort_model_classΒΆ
alias of
UniEncoderTokenRelexORTModel
- data_processor_classΒΆ
alias of
RelationExtractionTokenProcessor
- data_collator_classΒΆ
alias of
RelationExtractionTokenDataCollator
- decoder_classΒΆ
alias of
TokenRelexDecoder
- class gliner.model.GLiNER(*args, **kwargs)[source]ΒΆ
Bases:
Module, PyTorchModelHubMixin
Meta GLiNER class that automatically instantiates the appropriate GLiNER variant.
This class provides a unified interface for all GLiNER models, automatically switching to specialized model types based on the model configuration. It supports various NER architectures including uni-encoder, bi-encoder, decoder-based, and relation extraction models.
- The class automatically detects the model type based on:
span_mode: Token-level vs span-level
labels_encoder: Uni-encoder vs bi-encoder
labels_decoder: Standard vs decoder-based
relations_layer: NER-only vs joint entity-relation extraction
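A toy sketch of this dispatch logic (the variant class names are real, but the attribute values, precedence order, and helper function here are assumptions for illustration):

```python
# Toy dispatch mirroring the config checks listed above. The real GLiNER
# class inspects the loaded config object; this only shows the decision shape.
# The "token_level" value and the precedence order are assumptions.
def pick_variant(span_mode: str, has_labels_encoder: bool,
                 has_labels_decoder: bool, has_relations_layer: bool) -> str:
    token = span_mode == "token_level"
    if has_relations_layer:  # joint entity-relation extraction
        return "UniEncoderTokenRelexGLiNER" if token else "UniEncoderSpanRelexGLiNER"
    if has_labels_decoder:   # generative label decoding
        return "UniEncoderTokenDecoderGLiNER" if token else "UniEncoderSpanDecoderGLiNER"
    if has_labels_encoder:   # separate labels encoder => bi-encoder
        return "BiEncoderTokenGLiNER" if token else "BiEncoderSpanGLiNER"
    return "UniEncoderTokenGLiNER" if token else "UniEncoderSpanGLiNER"


print(pick_variant("span_level", False, False, False))
print(pick_variant("token_level", True, False, False))
```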
- modelΒΆ
The loaded GLiNER model instance (automatically typed).
- configΒΆ
Model configuration.
- data_processorΒΆ
Data processor for the model.
- decoderΒΆ
Decoder for predictions.
Examples
Load a pretrained uni-encoder span model:
>>> model = GLiNER.from_pretrained("urchade/gliner_small-v2.1")
Load a bi-encoder model:
>>> model = GLiNER.from_pretrained("knowledgator/gliner-bi-small-v1.0")
Load from local configuration:
>>> config = GLiNERConfig.from_pretrained("config.json")
>>> model = GLiNER.from_config(config)
Initialize from scratch:
>>> config = GLiNERConfig(model_name="microsoft/deberta-v3-small")
>>> model = GLiNER(config)
Initialize a GLiNER model with automatic type detection.
This constructor determines the appropriate GLiNER variant based on the configuration and replaces itself with an instance of that variant.
- Parameters:
config (str | Path | GLiNERConfig) – Model configuration (GLiNERConfig object, path to config file, or dict).
**kwargs – Additional arguments passed to the specific GLiNER variant.
Examples
>>> config = GLiNERConfig(model_name="bert-base-cased")
>>> model = GLiNER(config)
>>> model = GLiNER("path/to/gliner_config.json")
- __init__(config, **kwargs)[source]ΒΆ
Initialize a GLiNER model with automatic type detection.
This constructor determines the appropriate GLiNER variant based on the configuration and replaces itself with an instance of that variant.
- Parameters:
config (str | Path | GLiNERConfig) β Model configuration (GLiNERConfig object, path to config file, or dict).
**kwargs β Additional arguments passed to the specific GLiNER variant.
Examples
>>> config = GLiNERConfig(model_name="bert-base-cased")
>>> model = GLiNER(config)
>>> model = GLiNER("path/to/gliner_config.json")
- classmethod from_pretrained(model_id, revision=None, cache_dir=None, force_download=False, proxies=None, resume_download=False, local_files_only=False, token=None, map_location='cpu', strict=False, load_tokenizer=None, resize_token_embeddings=True, compile_torch_model=False, quantize=None, dtype=None, low_cpu_mem_usage=False, variant=None, load_onnx_model=False, onnx_model_file='model.onnx', max_length=None, max_width=None, post_fusion_schema=None, _attn_implementation=None, **model_kwargs)[source]ΒΆ
Load a pretrained GLiNER model with automatic type detection.
This method loads the configuration, determines the appropriate GLiNER variant, and delegates to that variantβs from_pretrained method.
- Parameters:
model_id (str) – Model identifier or local path.
revision (str | None) – Model revision.
cache_dir (str | Path | None) – Cache directory.
force_download (bool) – Force redownload.
proxies (dict | None) – Proxy configuration.
resume_download (bool) – Resume interrupted downloads.
local_files_only (bool) – Only use local files.
token (str | bool | None) – HF token for private repos.
map_location (str) – Device to map model to.
strict (bool) – Enforce strict state_dict loading.
load_tokenizer (bool | None) – Whether to load tokenizer.
resize_token_embeddings (bool | None) – Whether to resize embeddings.
compile_torch_model (bool | None) – Whether to compile with torch.compile.
quantize (str | None) – Only "int8" is accepted (int8 dynamic quantization: torchao on GPU, FBGEMM on CPU). For precision-only changes (fp16/bf16), use dtype=. None to disable.
dtype (str | dtype | None) – Target floating-point dtype for the loaded weights (e.g. torch.bfloat16, "bf16", "fp16"). When set, weights are cast during the state-dict read so the fp32 copy is never fully materialized; prefer this over quantize for plain precision changes.
low_cpu_mem_usage (bool) – If True, build the model under torch.device("meta") and use load_state_dict(assign=True), skipping the random-init compute and the fp32 shell allocation. See the base-class docstring for the full contract.
variant (str | None) – "fp16"/"bf16" to prefer model.{variant}.safetensors over the default fp32 file. Best-effort with graceful fallback: if the publisher uploaded the variant, only that file is fetched; if not, warns and falls back to fp32 + cast on read. See the base-class from_pretrained docstring for the full contract. None (default) preserves prior behavior.
load_onnx_model (bool | None) – Whether to load ONNX model instead of PyTorch.
onnx_model_file (str | None) – Path to ONNX model file.
max_length (int | None) – Override max_length in config.
max_width (int | None) – Override max_width in config.
post_fusion_schema (str | None) – Override post_fusion_schema in config.
_attn_implementation (str | None) – Override attention implementation.
**model_kwargs – Additional model initialization arguments.
- Returns:
Appropriate GLiNER model instance.
Examples
>>> model = GLiNER.from_pretrained("urchade/gliner_small-v2.1")
>>> model = GLiNER.from_pretrained("knowledgator/gliner-bi-small-v1.0")
>>> model = GLiNER.from_pretrained("path/to/local/model", quantize="int8")
>>> model = GLiNER.from_pretrained("urchade/gliner_small-v2.1", dtype="bf16")
>>> # If the repo publishes model.bf16.safetensors, download only that:
>>> model = GLiNER.from_pretrained("org/gliner_bf16-v1", variant="bf16")
- classmethod from_config(config, cache_dir=None, load_tokenizer=True, resize_token_embeddings=True, backbone_from_pretrained=True, compile_torch_model=False, quantize=None, map_location='cpu', max_length=None, max_width=None, post_fusion_schema=None, _attn_implementation=None, **model_kwargs)[source]ΒΆ
Create a GLiNER model from configuration.
- Parameters:
config (GLiNERConfig | str | Path | dict) – Model configuration (GLiNERConfig object, path to config file, or dict).
cache_dir (str | Path | None) – Cache directory for downloads.
load_tokenizer (bool) – Whether to load tokenizer.
resize_token_embeddings (bool) – Whether to resize token embeddings.
backbone_from_pretrained (bool) – Whether to load the backbone encoder from pretrained weights.
compile_torch_model (bool) – Whether to compile with torch.compile.
quantize (str | None) – Only "int8" is accepted (int8 dynamic quantization: torchao on GPU, FBGEMM on CPU). For precision-only changes (fp16/bf16), use dtype=. None to disable.
map_location (str) – Device to map model to.
max_length (int | None) – Override max_length in config.
max_width (int | None) – Override max_width in config.
post_fusion_schema (str | None) – Override post_fusion_schema in config.
_attn_implementation (str | None) – Override attention implementation.
**model_kwargs – Additional model initialization arguments.
- Returns:
Initialized GLiNER model instance.
Examples
>>> config = GLiNERConfig(model_name="microsoft/deberta-v3-small")
>>> model = GLiNER.from_config(config)
>>> model = GLiNER.from_config("path/to/gliner_config.json")
- property model_map: dict[str, dict[str, Any]]ΒΆ
Map configuration patterns to their corresponding GLiNER classes.
- Returns:
Dictionary mapping model types to their classes and descriptions.