With LangSmith LLM Gateway, detection, investigation, and remediation happen on the same surface where agents are built.
✅ Fewer tools
✅ Fewer context switches
✅ Policy events that arrive next to the trace data that explains them
Learn more: langchain.com/blog/introduci…
The next phase of agent development in financial services will be measured by trust, control, and production readiness.
In our latest guide, we look at how @jpmorgan, @Chime, and Bridgewater are approaching production agents across research, member experience, and investment
Read the guide to learn what teams are prioritizing across observability, evals, governance, security, and human-in-the-loop reviews. info.langchain.com/guide/definiti…?
another banger from Sydney! i think this whole hierarchy of loops is still super early but some primitives we know work
ex: verification as a primitive is so ridiculously important for non-slop semi-long-horizon work, it’s worth spending days to weeks making sure the
What are Online Evals?
Most agent evals run "offline": a premade dataset of inputs goes through the agent, and an intermediate step or final output gets scored. They answer "is this version better than the last?"
Online evals answer a different question: "is the agent still
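A minimal sketch of the offline pattern described above, assuming a toy setup: a fixed dataset goes through two agent versions and each final output is scored. Everything here (`run_offline_eval`, `exact_match`, the two stand-in agents, the dataset) is hypothetical and written for illustration; none of it is a LangSmith API.

```python
from typing import Callable, Dict, List

# Hypothetical premade dataset: each example pairs an input with a reference answer.
DATASET: List[Dict[str, str]] = [
    {"input": "2+2", "expected": "4"},
    {"input": "3*3", "expected": "9"},
    {"input": "10-7", "expected": "3"},
]


def exact_match(output: str, expected: str) -> float:
    """Score a final output: 1.0 if it matches the reference, else 0.0."""
    return 1.0 if output.strip() == expected.strip() else 0.0


def run_offline_eval(
    agent: Callable[[str], str],
    dataset: List[Dict[str, str]],
    scorer: Callable[[str, str], float],
) -> float:
    """Run every dataset input through the agent and return the mean score."""
    scores = [scorer(agent(ex["input"]), ex["expected"]) for ex in dataset]
    return sum(scores) / len(scores)


# Two stand-in "agent versions" (assumptions, not real agents).
def agent_v1(prompt: str) -> str:
    return "4"  # always gives the same answer


_ANSWERS = {"2+2": "4", "3*3": "9", "10-7": "3"}


def agent_v2(prompt: str) -> str:
    return _ANSWERS.get(prompt, "")


if __name__ == "__main__":
    # Comparing the two means answers "is this version better than the last?"
    print("v1:", run_offline_eval(agent_v1, DATASET, exact_match))
    print("v2:", run_offline_eval(agent_v2, DATASET, exact_match))
```

The same scorer could in principle be pointed at an intermediate step instead of the final output; the sketch scores only final outputs to stay short.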
Agent governance isn't something that should be bolted onto agentic systems.
LangSmith LLM Gateway lets you enforce the rules in the same platform where agents are built, observed, and evaluated.
LangSmith Sandboxes are the right layer when your agent needs to do something, not just say something. Here's when you want to reach for them 🧵
✅ Your agent generates code and you want it to verify that code runs before responding
✅ You're building a coding assistant, CI
✅ You're running multi-step workflows where state needs to persist across tool calls
✅ You need burst capacity (e.g., thousands of parallel environments for RL training or evals) that has to scale from zero in seconds
✅ You're accepting any user-supplied input that could end
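To make the first item above concrete, here is a rough sketch of "verify that code runs before responding". It is an assumption-heavy stand-in: it uses Python's standard `subprocess` module, not the LangSmith Sandboxes API, and a local subprocess provides no real isolation, so it should not be used on untrusted input. The helper name `code_runs` is hypothetical.

```python
import os
import subprocess
import sys
import tempfile
from typing import Tuple


def code_runs(source: str, timeout_s: float = 5.0) -> Tuple[bool, str]:
    """Run generated Python in a child process and report whether it succeeded.

    Returns (ok, output): stdout when the exit code is 0, stderr otherwise.
    NOTE: a plain subprocess is NOT a sandbox; this only illustrates the check.
    """
    with tempfile.TemporaryDirectory() as workdir:
        path = os.path.join(workdir, "candidate.py")
        with open(path, "w", encoding="utf-8") as f:
            f.write(source)
        try:
            proc = subprocess.run(
                [sys.executable, path],
                capture_output=True,
                text=True,
                timeout=timeout_s,
                cwd=workdir,
            )
        except subprocess.TimeoutExpired:
            return False, "timed out"
    if proc.returncode == 0:
        return True, proc.stdout
    return False, proc.stderr


if __name__ == "__main__":
    ok, output = code_runs("print(1 + 1)")
    print(ok, output.strip())
    ok, output = code_runs("raise ValueError('boom')")
    print(ok)
```

An agent loop could call a check like this on its draft code and only respond once `ok` is true, feeding stderr back into the next attempt otherwise.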