Urchade Zaratiana

Welcome to Urchade Zaratiana's Webpage

Portrait of Urchade Zaratiana

I am a PhD student at Laboratoire Informatique de Paris Nord (LIPN) 🏫, under the supervision of Nadi Tomeh and Thierry Charnois. My research focuses on structured prediction for Natural Language Processing 🧠📊.

I am also a Member of Technical Staff at Fastino 🦊🇺🇸, where I lead the Data Structuring team 🧠. I am currently based in Île de la Réunion 🇫🇷🇷🇪.

I am passionate about the science of deep learning, with particular interest in topics such as domain generalization 🌐, zero/few-shot/instruction learning 🛠️📚, learning under noisy data/labels 📉🔍, scaling laws 📈, and simplicity bias 🎛️.

During my PhD, I have worked on the following research problems:

Structured decoding (2021-)

Named Entity Recognition as Structured Span Prediction (Zaratiana et al., EMNLP 2022 UM-IoS)

EnriCO: Constrained decoding for information extraction using logical rules (Zaratiana et al., arXiv 2024)

Graph structure learning (2021-)

GraphER: End-to-end graph structure learning for joint entity and relation extraction (Zaratiana et al., arXiv 2024)

GNNer: Using graphs (and Graph Neural Networks) to implicitly constrain the output of a neural network (Zaratiana et al., ACL 2022 SRW)

Structured loss functions (2022-)

Filtered Semi-Markov CRF (Zaratiana et al., EMNLP 2023)

Global Span Selection (Zaratiana et al., EMNLP 2022 UM-IoS)

Constrained decoding of language models (2022-)

Autoregressive text-to-graph model for information extraction (Zaratiana et al., AAAI 2024)

Zero-shot Learning for Information Extraction (2023-)

GLiNER (Zaratiana et al., NAACL 2024): Zero-shot Named Entity Recognition

GraphER (Zaratiana et al., arXiv 2024): An end-to-end model for zero-shot joint entity and relation extraction

Arabic Information Extraction (2023-)

Cross-Dialectal Named Entity Recognition in Arabic (El Khbir et al., ArabicNLP 2023)

A Span-Based Approach for Flat and Nested Arabic Named Entity Recognition: first-place system at the Wojood NER shared task (El Khbir et al., ArabicNLP 2023)

Language model pretraining (2024-)

Training a BERT-like model without positional encoding (Zaratiana et al., ICLR 2024 Tiny Paper)

Detection of machine-generated text (2024-)

Our system ranked in the top 2 at the SemEval "Multidomain, Multimodal and Multilingual Machine-Generated Text Detection" shared task, using a 300M-parameter model (Ben Fares et al., 2024)

Selected OSS Projects

GLiNER (1.9k ⭐)

A lightweight model for Named Entity Recognition (NER) using a BERT-like transformer.
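
For a sense of how GLiNER is used in practice, here is a minimal usage sketch, assuming the gliner Python package is installed; the checkpoint name and the example sentence are illustrative, and any GLiNER checkpoint from the Hugging Face Hub should work.

from gliner import GLiNER

# Load a pretrained GLiNER checkpoint (name is an assumption, for illustration only).
model = GLiNER.from_pretrained("urchade/gliner_medium-v2.1")

text = "Urchade Zaratiana is a PhD student at LIPN, based in Île de la Réunion."

# Zero-shot NER: entity types are supplied at inference time as plain strings.
labels = ["person", "organization", "location"]

entities = model.predict_entities(text, labels, threshold=0.5)
for entity in entities:
    print(entity["text"], "=>", entity["label"])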

GraphER (68 ⭐)

End-to-end zero-shot entity and relation extraction.

ATG (51 ⭐)

An autoregressive text-to-graph framework for joint entity and relation extraction.

struct_ie (2 ⭐)

Structured Information Extraction with Large Language Models.