Reducing Time to Value for Data Science Projects: Part 3

Parts 1 and 2 of this series focussed on the technical aspect of improving the experimentation process. This started with ...
Hitchhiker’s Guide to RAG: From Tiny Files to Tolstoy with OpenAI’s API and LangChain

, I walked you through setting up a very simple RAG pipeline in Python, using OpenAI’s API, LangChain, and your ...
Are You Being Unfair to LLMs?

hype surrounding AI, some ill-informed ideas about the nature of LLM intelligence are floating around, and I’d like to address ...
Building a Custom MCP Chatbot | Towards Data Science

a method to standardise communication between AI applications and external tools or data sources. This standardisation helps to reduce the ...
The Crucial Role of NUMA Awareness in High-Performance Deep Learning

world of deep learning training, the role of the ML developer can be likened to that of the conductor of ...
Work Data Is the Next Frontier for GenAI

, the work output of knowledge workers, is the single most valuable data source for LLM training, uniquely capable of ...
How to Fine-Tune Small Language Models to Think with Reinforcement Learning

in fashion. DeepSeek-R1, Gemini-2.5-Pro, OpenAI’s O-series models, Anthropic’s Claude, Magistral, and Qwen3 — there is a new one every month. ...
Run Your Python Code up to 80x Faster Using the Cython Library

excellent language for rapid prototyping and code development, but one thing I often hear people say about using it is ...
The Five-Second Fingerprint: Inside Shazam’s Instant Song ID

This post continues Behind the Tap, a series exploring the hidden mechanics of everyday tech — from Uber to Spotify to search ...
GraphRAG in Action: A Simple Agent for Know-Your-Customer Investigations

the world of financial services, Know-Your-Customer (KYC) and Anti-Money Laundering (AML) are critical defense lines against illicit activities. KYC is ...