Hi there, I’m Ben 👋

I’m interested in NLP&IR, with a special focus on developing cool, smaller models to power LLM-based applications. You may know me from my frequent Twitter rants about using late interaction over dense vectors, or my love for encoders.

I do R&D at Answer.AI 👀, where every quest is a side-quest and scope-creep is encouraged. Most recently, I’ve helped gather a really cool crowd of people to bring BERT to 2024 👀!

I’ve also built 🪤RAGatouille, a Python library whose grand aim is to bridge the gap between state-of-the-art research code and commonly used practices, as well as Rerankers, a library that makes it really easy to use just about any common reranking method and to swap them in and out painlessly.

Feel free to reach out if you’d like to chat!

Working while sick

I never really write online about personal things (except a few times, generally about my wife, who is awesome). Not that I don’t want to, I’m not particularly privacy-minded, but more that I never feel like it’s that interesting to share a ton about my pretty mundane day-to-day. There’s one thing that I do off-handedly publicly mention, maybe more than I realise: health. Generally in the context of “sorry about RAGatouille, I’ve had less time to spend on it than I thought because of health”, or some other similar phrasing....

February 13, 2025 · Ben

[Answer.AI] Small but Mighty: Introducing answerai-colbert-small

Say hello to answerai-colbert-small-v1, a tiny ColBERT model that punches well above its weight.

August 13, 2024 · Ben

[Answer.AI] JaColBERTv2.5🇯🇵: Optimising Retrieval Training for Lower-Resources Languages

Introducing JaColBERTv2.5🇯🇵, the new best Japanese retrieval model. Through this release, we present a thorough analysis to better understand what helps in training a good multi-vector retrieval model.

August 2, 2024 · Ben

[Answer.AI] A little pooling goes a long way for multi-vector representations

Blog post offering a quick overview of ColBERT and how it works, and introducing an efficient pooling trick to alleviate the issues it faces.

April 8, 2024 · Ben

Questions & Answer(s): thoughts and joining Answer.AI

If you’re old school, you can find a raw HTML version of this post here. This is a fairly long, stream-of-thought post about how I currently (Sunday Feb 4, 2024) feel about the broader ML/NLP/IR ecosystem and its future. Everything here’s on-the-fly opinion and can & will change. In summary: I think ML developments are incredibly exciting, and we need to continue to work on bridging the gap between ML-as-a-commodity-for-ML-practitioners to ML-as-a-commodity-for-everyone....

February 6, 2024 · Ben