4 Comments
Scott Ashton

Great insights! I know you open sourced some of your repo but I haven't checked: have you guys open sourced your agent loop?

I ask because where I work in fintech, the agent-directed loops produced utter garbage, and that was literally one week ago. It wasn't until we scripted the path, inferred intermediate inputs, and really just scoped out the probabilistic DAG that we were able to get decent results.
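To make the "scripted path" idea concrete, here is a minimal, dependency-free sketch of what that could look like: the DAG of steps is hard-coded up front and the model only fills in each node, rather than deciding its own next step. The step names, the fintech-flavored pipeline, and the stubbed `run` functions are all hypothetical stand-ins for real LLM calls.

```typescript
// Hypothetical sketch: the control flow is scripted as a DAG; the model
// (stubbed here as deterministic functions) only fills in each node.
type Step = {
  name: string;
  deps: string[];                                   // steps whose outputs this node needs
  run: (inputs: Record<string, string>) => string;  // stand-in for a scoped LLM call
};

function runDag(steps: Step[]): Record<string, string> {
  const results: Record<string, string> = {};
  const done = new Set<string>();
  // Execute any step whose dependencies are all satisfied, until all are done.
  while (done.size < steps.length) {
    const ready = steps.find(
      (s) => !done.has(s.name) && s.deps.every((d) => done.has(d))
    );
    if (!ready) throw new Error("cycle or unsatisfiable dependency in DAG");
    const inputs: Record<string, string> = {};
    for (const d of ready.deps) inputs[d] = results[d];
    results[ready.name] = ready.run(inputs);
    done.add(ready.name);
  }
  return results;
}

// Toy pipeline: extract -> classify -> report, each a fixed, narrowly scoped prompt.
const out = runDag([
  { name: "extract", deps: [], run: () => "amount=1200;currency=USD" },
  { name: "classify", deps: ["extract"], run: (i) => `category(wire) from ${i.extract}` },
  { name: "report", deps: ["classify"], run: (i) => `REPORT: ${i.classify}` },
]);
```

The point of the scripted shape is that each node gets exactly the inputs it should, so a bad intermediate answer can't derail the overall path the way it can in a fully open-ended loop.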

I think the big differentiator is likely some form of automated feedback: does the code compile, do the tests pass, etc. That's not really possible in our domain.

Scott Ashton

We used OpenAI's Agents SDK, but I'll have to give Vercel's a try! I'm still leery of needless abstraction, especially this early in the evolution of agentic loops, so I'm more inclined to roll our own tool-calling loop. I'd be interested to hear your reasoning on using their agent SDK vs. the competitors vs. rolling your own!
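For what "rolling your own" amounts to, here is a minimal, dependency-free sketch of a hand-rolled tool-calling loop. The `callModel` stub stands in for a real chat-completions call, and the message shape, tool names, and arguments are assumptions, not any particular vendor's API.

```typescript
// Hypothetical hand-rolled tool-calling loop with a stubbed model.
type Message = { role: "user" | "assistant" | "tool"; content: string };
type ToolCall = { tool: string; args: Record<string, unknown> } | null;

// Registry of tools the model may call (names and behavior are made up).
const tools: Record<string, (args: Record<string, unknown>) => string> = {
  get_balance: (args) => `balance(${args.account}) = 4200`,
};

// Stub model: requests a tool once, then answers from the tool result.
function callModel(history: Message[]): { text: string; toolCall: ToolCall } {
  const lastTool = [...history].reverse().find((m) => m.role === "tool");
  if (lastTool) return { text: `Final answer: ${lastTool.content}`, toolCall: null };
  return { text: "", toolCall: { tool: "get_balance", args: { account: "ops" } } };
}

function agentLoop(prompt: string, maxIters = 5): string {
  const history: Message[] = [{ role: "user", content: prompt }];
  for (let i = 0; i < maxIters; i++) {
    const { text, toolCall } = callModel(history);
    if (!toolCall) return text;                     // no tool requested: model is done
    const result = tools[toolCall.tool](toolCall.args);
    history.push({ role: "assistant", content: JSON.stringify(toolCall) });
    history.push({ role: "tool", content: result }); // feed the result back in
  }
  return "stopped: iteration budget exhausted";      // hard cap keeps the loop bounded
}

const answer = agentLoop("What is the ops account balance?");
```

The whole loop is about twenty lines; the abstraction an SDK buys you is mostly around streaming, retries, and step-level observability rather than the loop itself.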

Also, in a future post, I’d love to hear you dig into more details on how your evals have improved since switching to a more open ended agentic loop!

dal

Our agent is built with Vercel's AI SDK, so the loop basically comes out of the box when you run it with function calling. Here's some info for digging into that: https://ai-sdk.dev/docs/agents/building-agents#building-agents

It gives you the granularity to dig into each iteration the LLM takes.

It's not always perfect, but I would totally try to think through ways you can validate results, and put real effort into building that.

Our visualization logic is what some call "bi-as-code": just a yml file with the specs, plus type checks and logical checks on top of it. The Zod library in TypeScript has good functionality for that kind of type validation.
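To illustrate the two layers of checking, here is a dependency-free sketch of what validating a parsed chart spec could look like. It's a hand-rolled stand-in for what Zod's `safeParse` gives you, and the chart-spec fields (`type`, `x`, `y`, `limit`) are hypothetical, not the actual yml schema.

```typescript
// Hand-rolled stand-in for Zod-style safeParse; fields are hypothetical.
type ChartSpec = { type: "bar" | "line"; x: string; y: string; limit?: number };

function validateSpec(
  raw: unknown
): { ok: true; spec: ChartSpec } | { ok: false; errors: string[] } {
  const errors: string[] = [];
  const o = raw as Record<string, unknown>;
  // Type checks: the right shapes in the right places.
  if (o?.type !== "bar" && o?.type !== "line") errors.push("type must be 'bar' or 'line'");
  if (typeof o?.x !== "string") errors.push("x must be a column name (string)");
  if (typeof o?.y !== "string") errors.push("y must be a column name (string)");
  // Logical check: a spec can be well-typed and still nonsensical.
  if (typeof o?.limit === "number" && o.limit <= 0) errors.push("limit must be positive");
  return errors.length ? { ok: false, errors } : { ok: true, spec: o as ChartSpec };
}

const good = validateSpec({ type: "bar", x: "month", y: "revenue" });
const bad = validateSpec({ type: "pie", x: "month", y: "revenue", limit: -1 });
```

With Zod you would express the same thing declaratively (`z.object({...}).safeParse(raw)`) and get structured error paths for free; the value either way is that a bad spec is rejected before anything tries to render it.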

This is also a great resource on building agents: https://github.com/humanlayer/12-factor-agents