AI Workflows
Auto-Optimized
Observe — Evaluate — Improve

Epigentic transforms your AI workflows with automatic feedback-driven refinement

From Feedback to Evolution

Epigentic doesn't just evaluate your workflow: it uses the results of every run to improve it

1. Define your workflow

Define your workflow using your favorite AI framework. Epigentic tracks the program flow and configuration values, such as prompt templates, so they can be adjusted during the optimization phase.

story_agent.py
import asyncio
from agents import Agent, Runner
from epigentic import track, TrackingContext, AgentTracker, expand_template

@track()
async def write_story(ctx: TrackingContext, input: str) -> str:
    """
    Writes a story with a robust plot and a main character with
    a rich background based on the input.
    """
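    background = await get_character_background(ctx, input)
    # NOTE: `plot` (used below) is not defined in this excerpt. It is assumed
    # to be produced by a similar tracked step that is not shown here.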

    story_writer = Agent(
        name="write_story",
        instructions=expand_template(
            ctx.get_template_config("template")["value"],
            {
                "input": input,
                "plot": plot,
                "background_description": background,
            },
        ),
        model=ctx.get_string_config("model"),
        hooks=AgentTracker(ctx),
    )

    result = await Runner.run(story_writer, input)
    return result.final_output


@track()
async def get_character_background(ctx: TrackingContext, input: str) -> str:
    """
    Creates the background information for the input character.
    """
    agent = Agent(
        name="get_character_background",
        instructions=expand_template(
            ctx.get_template_config("template")["value"], {"input": input}
        ),
        model=ctx.get_string_config("model"),
        hooks=AgentTracker(ctx),
    )
    result = await Runner.run(agent, input)
    return result.final_output

2. Write your Evaluation Suite

Define an evaluation suite to evaluate and optimize your workflow. A single suite can evaluate multiple agents across multiple evaluators simultaneously.

story_agent.py
from typing import Any, Dict, Optional
from epigentic import (Case, Criteria, DataRow, DataSet,
                       Suite, llm_as_judge_evaluator)

def is_of_correct_length_evaluator(num_paragraphs: int):
    return llm_as_judge_evaluator(
        id="is_of_correct_length_evaluator",
        description="Evaluate whether the output has the correct number of paragraphs",
        input_and_output_to_template_values=lambda input, output, params: {
            "output": output,
            "num_paragraphs": params["num_paragraphs"],
        },
        params={"num_paragraphs": num_paragraphs},
    )

Suite("write_story").add(
    Case[str, str, None](
        name="case_1",
        function=write_story,
        dataset=DataSet(
            data=[DataRow(input="a corgi dog", specs=None)]
        ),
        criteria=[
            Criteria(
                target_agent_name_regex=write_story.__name__,
                evaluator=is_of_correct_length_evaluator(5),
                threshold=1,
            ),
            Criteria(
                target_agent_name_regex=get_character_background.__name__,
                evaluator=is_of_correct_length_evaluator(3),
                threshold=1,
            ),
        ],
    )
).run()
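
For intuition about how a score relates to `threshold`, here is a deterministic stand-in for the length check. It is only a sketch: it assumes paragraphs are separated by blank lines, that the evaluator yields 1 or 0, and that a criterion passes when its score is at least the threshold. None of these rules are confirmed by the text above, and the real evaluator delegates the judgment to an LLM. The names `count_paragraphs`, `paragraph_length_score`, and `criterion_passes` are hypothetical.

```python
import re


def count_paragraphs(text: str) -> int:
    """Count blocks of text separated by one or more blank lines."""
    blocks = re.split(r"\n\s*\n", text.strip())
    return sum(1 for block in blocks if block.strip())


def paragraph_length_score(output: str, num_paragraphs: int) -> int:
    """Return 1 if the output has exactly num_paragraphs paragraphs, else 0."""
    return 1 if count_paragraphs(output) == num_paragraphs else 0


def criterion_passes(score: float, threshold: float) -> bool:
    """Assumed rule (not confirmed by the source): pass when score >= threshold."""
    return score >= threshold


story = "First paragraph.\n\nSecond paragraph.\n\nThird paragraph."
score = paragraph_length_score(story, 3)
print(score, criterion_passes(score, threshold=1))
```

Under these assumptions, `threshold=1` with a 0-or-1 score means the criterion passes only when the paragraph count matches exactly.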

3. Evaluate your workflow

Evaluate your workflow to see which criteria pass and which fail

$ python example.py evaluate               
Hydration and evaluation:  [Elapsed time: 0:00:41] ████████████████████████████ ETA:  00:00:00)
...
Running evaluator: is_of_correct_length_evaluator
Target: write_story

Running evaluator: is_of_correct_length_evaluator
Target: get_character_background
...
Evaluation of: write_story
FAILED case_1

Passed: 0, Failed: 1, Skipped: 0

View the evaluation:
https://epigenticai.com/app/evaluations?evaluation=3f90f2cd-b877-4cf1-bd73-33effe2a6696
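
The summary line in the output above can be read as a roll-up of per-criterion results. The sketch below shows one plausible way to compute it, assuming a case fails as soon as any of its criteria scores below its threshold and is skipped when no evaluator ran. `CriterionResult`, `case_status`, and `summarize` are hypothetical names, and the scores are illustrative values, not Epigentic output.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class CriterionResult:
    target: str
    score: Optional[float]  # None means the evaluator did not run
    threshold: float


def case_status(results: List[CriterionResult]) -> str:
    """Assumed rule: SKIPPED if nothing ran, FAILED if any score is below its threshold."""
    ran = [r for r in results if r.score is not None]
    if not ran:
        return "SKIPPED"
    return "PASSED" if all(r.score >= r.threshold for r in ran) else "FAILED"


def summarize(statuses: List[str]) -> str:
    """Format case statuses in the same shape as the summary line."""
    return (
        f"Passed: {statuses.count('PASSED')}, "
        f"Failed: {statuses.count('FAILED')}, "
        f"Skipped: {statuses.count('SKIPPED')}"
    )


# Illustrative scores only: one criterion misses its threshold, so the case fails.
case_1 = [
    CriterionResult("write_story", score=0, threshold=1),
    CriterionResult("get_character_background", score=1, threshold=1),
]
print(summarize([case_status(case_1)]))
```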

4. View the Results

See which criteria passed and which failed, and analyze the workflow's complete data flow

5. Add Feedback

Add manual feedback to critique any part of the workflow

6. Auto-Optimize

Run optimization to automatically improve the agent based on feedback

$ python example.py optimize --from 3f90f2cd-b877-4cf1-bd73-33effe2a6696
Optimizing case 1 of 1
Starting function write_story
...
Optimizing sub function: write_story
* Feedback: The main character's name needs to be Cheddar if it is a Corgi dog.
* Failed score reasons: The story contains 22 paragraphs. For a score of 1, the story needed to be 2 or 3 paragraphs long. Since 22 is 4 or more paragraphs, the score is 0.
Getting failed scores
Getting feedback items
...
Thinking: Here are the problems to address: (1) The story output is much too long (22 paragraphs, rather than 2-3)
and (2) the main character must be named Cheddar if it's a Corgi dog. These issues should be resolved by updating
the prompt template for the write_story function. I'll edit the write_story template so that it explicitly requests
a story of only 2-3 paragraphs and ensures the main character is named Cheddar if the story is about a Corgi. Then,
I'll re-evaluate to see if both issues are fixed.
...
Updating configuration for function write_story
...
Re-evaluating the function get_character_background
...
For get_character_background, updating the template configuration
For write_story, updating the template configuration
Optimization complete
View the optimizer tracks at https://agenticai.com/app/tracks?id=f652f995-635b-42f1-b68e-5bd0ad738ec4
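
The log above shows the optimizer merging failed score reasons and manual feedback into one numbered list of problems before editing the template. A minimal sketch of that merging step follows. `build_problem_summary` is a hypothetical helper written for illustration, and the way Epigentic actually assembles this list is not described in the source.

```python
from typing import List


def build_problem_summary(failed_reasons: List[str], feedback: List[str]) -> str:
    """Number every failed-score reason and feedback item in a single list."""
    problems = failed_reasons + feedback
    if not problems:
        return "No problems to address."
    lines = [f"({i}) {text}" for i, text in enumerate(problems, start=1)]
    return "Here are the problems to address:\n" + "\n".join(lines)


summary = build_problem_summary(
    failed_reasons=["The story contains 22 paragraphs; 2 or 3 were required."],
    feedback=["The main character's name needs to be Cheddar if it is a Corgi dog."],
)
print(summary)
```

Keeping automated failures and human feedback in one list means the optimizer can address both in a single template revision, as the "Thinking" step in the log does.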

7. Verify Success

Re-run evaluation to confirm the agent now passes all criteria

$ python example.py evaluate               
Hydration and evaluation:  [Elapsed time: 0:00:32] ████████████████████████████ ETA:  00:00:00)
...
Running evaluator: is_of_correct_length_evaluator
Target: write_story

Running evaluator: is_of_correct_length_evaluator
Target: get_character_background
...
Evaluation of: write_story
PASSED case_1

Passed: 1, Failed: 0, Skipped: 0

View the evaluation:
https://epigenticai.com/app/evaluations?evaluation=2e70f3ef-c562-1af2-ca72-a1bcde57bae2

Template Evolution

Templates auto-optimized to meet requirements and adapt to feedback

Before
write_story_initial.txt
Write a story about a [[input]].

After
write_story_optimized.txt
Write a story about a [[input]].

Use the following plot:
<plot>
[[plot]]
</plot>
And background description of the main character:
<background_description>
[[background_description]]
</background_description>

IMPORTANT: The story must be exactly 2 or 3 paragraphs long—do NOT write more than 3
paragraphs total. If the main character is a Corgi dog, its name must be Cheddar. The
story should make this explicit and use this name for the Corgi main character.
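
The `[[name]]` placeholders in these templates suggest plain substitution from a dictionary of values. The sketch below illustrates that idea under the assumption that placeholders are word characters wrapped in double square brackets. `expand_placeholders` is a hypothetical stand-in, not Epigentic's `expand_template`, whose actual behavior may differ.

```python
import re
from typing import Mapping

# Hypothetical stand-in for illustration only; NOT Epigentic's expand_template.
_PLACEHOLDER = re.compile(r"\[\[(\w+)\]\]")


def expand_placeholders(template: str, values: Mapping[str, str]) -> str:
    """Replace each [[name]] in the template with values[name]."""

    def substitute(match: "re.Match[str]") -> str:
        name = match.group(1)
        if name not in values:
            # Fail loudly rather than leave a raw placeholder in the prompt.
            raise KeyError(f"no value supplied for placeholder [[{name}]]")
        return values[name]

    return _PLACEHOLDER.sub(substitute, template)


initial = "Write a story about a [[input]]."
print(expand_placeholders(initial, {"input": "corgi dog"}))
```

Raising on a missing value is a design choice in this sketch: a prompt that silently contains `[[plot]]` would be harder to diagnose than an early error.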

Planned Features

A glimpse at what we’re building next

  • Relive & Debug with Workflow Replays
  • Auto-generate fine-tuning datasets
  • Seamless Integrations Across Frameworks
  • Intelligent Model Matching at Every Stage
  • Effortless Golden Eval Set Creation
  • And even more...
Want something prioritized? Tell us on the waitlist and we’ll reach out.