...

How to Perform Comprehensive Large Scale LLM Validation

and evaluations are critical to ensuring robust, high-performing LLM applications. However, such topics are often overlooked in the greater scheme ...

How to Use LLMs for Powerful Automatic Evaluations

discuss how you can perform automatic evaluations using LLM as a judge. LLMs are widely used today for a variety ...

Agentic AI: On Evaluations | Towards Data Science

It’s not the most exciting topic, but more and more companies are paying attention. So it’s worth digging ...

Evaluation-Driven Development for LLM-Powered Products: Lessons from Building in Healthcare

in the field of large language models (LLM) and their applications is extraordinarily rapid. Costs are coming down and foundation ...

LLM-as-a-Judge: A Practical Guide | Towards Data Science

If features powered by LLMs, you already know how important evaluation is. Getting a model to say something is easy, ...

Evaluating LLMs for Inference, or Lessons from Teaching for Machine Learning

opportunities recently to work on the task of evaluating LLM Inference performance, and I think it’s a good topic to ...