Developer Day
Jan 31, 2024
3:00 pm - 3:55 pm

Evaluate & Customize LLMs with Airtrain.ai

Emmanuel will guide participants through Airtrain.ai's features for evaluating and customizing LLMs. Learn how to score model outputs, validate JSON schema compliance, and compare models using unsupervised metrics. Gain practical insights into configuring evaluation jobs and analyzing model performance.

Workshop 3 | The Hibernia | 1 Jones St, San Francisco, CA 94102 (times in America/Los_Angeles)

About this session

In this one-hour workshop, participants will explore Airtrain.ai's features for evaluating, scoring, and comparing LLMs. Airtrain.ai is a no-code compute platform for batch evaluation workloads: users upload datasets, select the models to evaluate, and design benchmark metrics tailored to their applications. The workshop will cover three evaluation methods: LLM-assisted evaluation, which scores model outputs against plain-English task descriptions; JSON schema validation, which checks that model outputs comply with application requirements; and unsupervised metrics for straightforward model comparisons. Attendees will gain practical insights into configuring evaluation jobs and using Airtrain.ai's visualization tools to analyze model performance across various metrics.
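For readers who want a feel for two of these ideas before the session, here is a minimal Python sketch. It is not Airtrain.ai's API (the platform is described as no-code); the field names, the `REQUIRED_FIELDS` mapping, and the helper functions are all hypothetical. The compliance check is a simplified stand-in for full JSON Schema validation: it only confirms that each output parses as a JSON object containing a few required fields of the expected type. Mean output length stands in for an unsupervised metric, since it needs no reference answers.

```python
import json

# Hypothetical application requirement: each model output must be a JSON
# object with these keys, holding values of these Python types.
REQUIRED_FIELDS = {"title": str, "tags": list}


def is_compliant(raw_output: str) -> bool:
    """Return True if raw_output is a JSON object with the required typed fields."""
    try:
        parsed = json.loads(raw_output)
    except json.JSONDecodeError:
        return False
    if not isinstance(parsed, dict):
        return False
    return all(
        key in parsed and isinstance(parsed[key], expected_type)
        for key, expected_type in REQUIRED_FIELDS.items()
    )


def compliance_rate(outputs: list[str]) -> float:
    """Fraction of outputs that pass the compliance check (0.0 for an empty list)."""
    if not outputs:
        return 0.0
    return sum(is_compliant(output) for output in outputs) / len(outputs)


def mean_length(outputs: list[str]) -> float:
    """Average output length in characters; a simple reference-free metric."""
    if not outputs:
        return 0.0
    return sum(len(output) for output in outputs) / len(outputs)


if __name__ == "__main__":
    # Made-up outputs from two hypothetical models on the same prompts.
    model_a = ['{"title": "Login bug", "tags": ["ui"]}', "Sure! Here is the JSON you asked for."]
    model_b = ['{"title": "Login bug", "tags": ["ui"]}', '{"title": "Crash", "tags": []}']
    for name, outputs in (("model_a", model_a), ("model_b", model_b)):
        print(name, compliance_rate(outputs), mean_length(outputs))
```

Comparing the two rates side by side mirrors, in a very reduced form, the kind of per-model comparison the session describes.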

More sessions

Developer Day
Jan 31, 2024
12:45 pm

AI-Powered AppSec With Qwiet AI

Stuart, CEO of Qwiet AI, presents the latest features of the company's AI-powered AppSec platform and unveils its roadmap for 2024.

Developer Day
Jan 31, 2024
2:25 pm

Automating Codebase Migrations With Second

Eric, founder and CEO of Second, will showcase the latest features revolutionizing complex codebase migrations. Discover the cutting-edge advancements empowering developers worldwide. Get an exclusive sneak peek into Second's ambitious roadmap for 2024.

Developer Day
Jan 31, 2024
3:40 pm

Building Reliable AI Agents for Software Development

Explore how Grit combines LLMs and compilers to deliver reliable migrations at scale.