Blog

Latest

The Hidden Truth About LLM Performance: Why Your Benchmark Results Might Be Misleading

July 28, 2025

There are two systematic issues with benchmarks, which were already present in the community before the advent of LLMs but became worse…

2024

Exploring Advanced Prompt Engineering with Google’s New Gemma Models

Today, Google launched a new set of models named Gemma. These models are based on the same tech and research used for creating the Gemini…

2023

🚀 Launching Napolab: The Natural Portuguese Language Benchmark 📊

Napolab is here: a curated collection of Portuguese datasets designed for easy evaluation of language models.

2023

Hashformers v2.0.0 is out! 🚀

Hashtag segmentation, the task of adding spaces between words in a hashtag, can now be done with Large Language Models (LLMs).

2023

A Simple Method to Detect In-Demand Tech Skills

It’s no secret that the tech landscape is dynamic and ever-evolving. New technologies are born, they mature, and then, often, they are…

2023

Hashformers: Hashtag Segmentation Applications in Abusive Language Detection

Abusive language detection, a critical aspect of modern NLP research, is often challenged by the lack of generalization across different…

2023

📢 New Portuguese NLP Model Alert! 🇵🇹 🇧🇷

I am thrilled to announce the latest milestone in the advancement of Portuguese language technology: the Albertina PT! This breakthrough…

Archive

2025

Jul 28

The Hidden Truth About LLM Performance: Why Your Benchmark Results Might Be Misleading

There are two systematic issues with benchmarks, which were already present in the community before the advent of LLMs but became worse…

2024

Feb 21

Exploring Advanced Prompt Engineering with Google’s New Gemma Models

Today, Google launched a new set of models named Gemma. These models are based on the same tech and research used for creating the Gemini…

2023

Sep 11

2022

Mar 09

Archive

15 Datasets for Word Segmentation on the Hugging Face Hub

The cold start problem in NLP

The word ‘had’ and the calendar in Finnegans Wake

Integrating Ray Tune, Hugging Face Transformers and W&B