Urchade Zaratiana

Welcome to Urchade Zaratiana's Webpage


I am a Member of Technical Staff at Fastino 🦊🇺🇸, where I lead the Data Structuring team 🧠. I'm currently based in Île de la Réunion 🇫🇷🇷🇪.

I hold a PhD from the Laboratoire Informatique de Paris Nord (LIPN) 🏫, completed under the supervision of Nadi Tomeh and Thierry Charnois. My doctoral research focused on structured prediction for Natural Language Processing 🧠📊.

I'm passionate about deep learning, particularly in areas such as domain generalization 🌐, zero/few-shot/instruction learning 🛠️📚, learning under noisy data/labels 📉🔍, scaling laws 📈, and simplicity bias 🎛️.

During my PhD, I worked on the following research problems:

Structured decoding (2021-)

Named Entity Recognition as Structured Span Prediction (Zaratiana et al., EMNLP 2022 UM-IoS)

EnriCO: Constrained decoding of information extraction using logical rules (Zaratiana et al., ArXiv 2024)

Graph structure learning (2021-)

GraphER: End-to-end graph structure learning of joint entity and relation extraction (Zaratiana et al., ArXiv 2024)

GNNer: Using graphs (and Graph Neural Networks) to implicitly constrain the output of a neural network (Zaratiana et al., ACL 2022 SRW)

Structured loss functions (2022-)

Filtered Semi-Markov CRF (Zaratiana et al., EMNLP 2023)

Global Span Selection (Zaratiana et al., EMNLP 2022 UM-IoS)

Constrained decoding of language model (2022-)

Autoregressive text-to-graph model for Information extraction (Zaratiana et al., AAAI 2024)

Zero-shot Learning for Information Extraction (2023-)

GLiNER (Zaratiana et al., NAACL 2024): Zero-shot Named Entity Recognition

GraphER (Zaratiana et al., ArXiv 2024): An end-to-end model for zero-shot joint entity and relation extraction

Arabic Information Extraction (2023-)

Cross-Dialectal Named Entity Recognition in Arabic (El Khbir et al., ArabNLP 2023)

A Span-Based Approach for Flat and Nested Arabic Named Entity Recognition: Top 1 system at Wojood NER shared task (El Khbir et al., ArabNLP 2023)

Language model pretraining (2024-)

Training a BERT-like model without positional encoding (Zaratiana et al., ICLR 2024 Tiny Paper)

Detection of machine-generated text (2024-)

Our system ranked in the top 2 at the SemEval "Multidomain, Multimodal and Multilingual Machine-Generated Text Detection" shared task using a 300M-parameter model (Ben Fares et al., 2024)

Selected OSS Projects

GLiNER (1.9k ⭐)

A lightweight model for Named Entity Recognition (NER) using a BERT-like transformer.

GraphER (68 ⭐)

End-to-end zero-shot entity and relation extraction.

ATG (51 ⭐)

An autoregressive text-to-graph framework for joint entity and relation extraction.

struct_ie (2 ⭐)

Structured Information Extraction with Large Language Models.