LLM-as-a-Judge: A Practical Guide | Towards Data Science
If you build features powered by LLMs, you already know how important evaluation is. Getting a model to say something is easy, ...
Beyond Model Stacking: The Architecture Principles That Make Multimodal AI Systems Work
1. It with a Vision While rewatching Iron Man, I found myself captivated by how deeply JARVIS could understand a ...
Regularisation: A Deep Dive into Theory, Implementation, and Practical Insights
This blog is a deep dive into regularisation techniques, intended to give you simple intuitions, mathematical foundations, and implementation details. ...
AI Is Not a Black Box (Relatively Speaking)
Summary: Opinion piece for the general TDS audience. I argue that AI is more transparent than humans in tangible ways. ...
Connecting the Dots for Better Movie Recommendations
promises of retrieval-augmented generation (RAG) is that it allows AI systems to answer questions using up-to-date or domain-specific information, without ...
How I Automated My Machine Learning Workflow with Just 10 Lines of Python
is magical — until you’re stuck trying to decide which model to use for your dataset. Should you go with a random ...
Prescriptive Modeling Unpacked: A Complete Guide to Intervention With Bayesian Modeling.
In this article, I will demonstrate how to move from simply forecasting outcomes to actively intervening in systems to steer ...
Perplexity’s CEO Sees AI Agents as the Next Web Battleground
Wait though … Perplexity—like other AI search engines—has been criticized for hallucinating and getting things wrong. We welcome this criticism, ...
Data Drift Is Not the Actual Problem: Your Monitoring Strategy Is
is an approach to accuracy that devours data, learns patterns, and predicts. However, with the best models, even those predictions ...
Evaluating LLMs for Inference, or Lessons from Teaching for Machine Learning
opportunities recently to work on the task of evaluating LLM Inference performance, and I think it’s a good topic to ...