<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd" xmlns:googleplay="http://www.google.com/schemas/play-podcasts/1.0"><channel><title><![CDATA[The AI Blueprint: Practical Content for AI Builders : ⚙️ BuildAIers’ Toolkit ⚙️]]></title><description><![CDATA[🛠️ In-Depth Reviews & Technical Breakdowns for AI Builders 🛠️

We know it can be overwhelming for any builder to navigate the AI tool stack. So here, we deep dive into the tools, frameworks, and APIs that power modern AI.

We offer hands-on reviews, feature breakdowns, and performance benchmarks to help you—a buildAIer—pick the right tools for the job.]]></description><link>https://neurlcreators.substack.com/s/buildaiers-toolkit</link><image><url>https://substackcdn.com/image/fetch/$s_!6udc!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff5b15961-5020-4a71-b040-30f9b3d3f232_256x256.png</url><title>The AI Blueprint: Practical Content for AI Builders : ⚙️ BuildAIers’ Toolkit ⚙️</title><link>https://neurlcreators.substack.com/s/buildaiers-toolkit</link></image><generator>Substack</generator><lastBuildDate>Sun, 03 May 2026 17:09:47 GMT</lastBuildDate><atom:link href="https://neurlcreators.substack.com/feed" rel="self" type="application/rss+xml"/><copyright><![CDATA[Neurl LLC.]]></copyright><language><![CDATA[en]]></language><webMaster><![CDATA[neural-blueprint@substack.com]]></webMaster><itunes:owner><itunes:email><![CDATA[neural-blueprint@substack.com]]></itunes:email><itunes:name><![CDATA[Neurl Creators]]></itunes:name></itunes:owner><itunes:author><![CDATA[Neurl Creators]]></itunes:author><googleplay:owner><![CDATA[neural-blueprint@substack.com]]></googleplay:owner><googleplay:email><![CDATA[neural-blueprint@substack.com]]></googleplay:email><googleplay:author><![CDATA[Neurl Creators]]></googleplay:author><itunes:block><![CDATA[Yes]]></itunes:block><item><title><![CDATA[Understanding OpenClaw’s Hook: The Key to Evaluating Agents Properly]]></title><description><![CDATA[Instrument your OpenClaw Agent with Arize]]></description><link>https://neurlcreators.substack.com/p/understanding-openclaws-hook-the</link><guid isPermaLink="false">https://neurlcreators.substack.com/p/understanding-openclaws-hook-the</guid><dc:creator><![CDATA[Neurl Creators]]></dc:creator><pubDate>Tue, 21 Apr 2026 15:31:37 GMT</pubDate><enclosure 
url="https://substack-post-media.s3.amazonaws.com/public/images/982f8614-9c66-4080-9c3a-4be13897b87f_1456x1048.png" length="0" type="image/png"/><content:encoded><![CDATA[<div id="youtube2-SQUS4-cQWfs" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;SQUS4-cQWfs&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/SQUS4-cQWfs?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><div><hr></div><p>Working with a tool like OpenClaw can sometimes feel like working with a black box, with many moving parts that make it hard to understand what is happening under the hood. However, OpenClaw&#8217;s Hook functionality changes that by giving you clear visibility into your agent&#8217;s behavior.</p><p>OpenClaw Hooks let you monitor events as they occur in real time, such as when a message is received or a tool call is made, providing detailed insight into your agent&#8217;s execution flow. 
When combined with an LLM observability tool, they make it easier to debug issues, understand decision-making, and optimize performance.</p><p>In this article, we&#8217;ll break down how Hooks work and then explore how to integrate them with <a href="https://arize.com/">Arize</a> for monitoring and improving your agent.</p><div><hr></div><p style="text-align: center;">Get your agent to set up the code for this article using the Agent&#8217;s skill</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://github.com/Neurl-LLC/OpenClaw_Observability_plugin_with_Arize/blob/main/Skill.md&quot;,&quot;text&quot;:&quot;Get Skill&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://github.com/Neurl-LLC/OpenClaw_Observability_plugin_with_Arize/blob/main/Skill.md"><span>Get Skill</span></a></p><div><hr></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!NQy2!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe1473d24-1fa9-4705-971e-617466e9fdbf_2232x2590.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!NQy2!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe1473d24-1fa9-4705-971e-617466e9fdbf_2232x2590.png 424w, https://substackcdn.com/image/fetch/$s_!NQy2!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe1473d24-1fa9-4705-971e-617466e9fdbf_2232x2590.png 848w, 
https://substackcdn.com/image/fetch/$s_!NQy2!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe1473d24-1fa9-4705-971e-617466e9fdbf_2232x2590.png 1272w, https://substackcdn.com/image/fetch/$s_!NQy2!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe1473d24-1fa9-4705-971e-617466e9fdbf_2232x2590.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!NQy2!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe1473d24-1fa9-4705-971e-617466e9fdbf_2232x2590.png" width="2232" height="2590" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e1473d24-1fa9-4705-971e-617466e9fdbf_2232x2590.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:2590,&quot;width&quot;:2232,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:818072,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://neurlcreators.substack.com/i/194493311?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0217ce59-a3ca-42f2-8866-2bebad12b064_2232x3189.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!NQy2!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe1473d24-1fa9-4705-971e-617466e9fdbf_2232x2590.png 424w, 
https://substackcdn.com/image/fetch/$s_!NQy2!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe1473d24-1fa9-4705-971e-617466e9fdbf_2232x2590.png 848w, https://substackcdn.com/image/fetch/$s_!NQy2!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe1473d24-1fa9-4705-971e-617466e9fdbf_2232x2590.png 1272w, https://substackcdn.com/image/fetch/$s_!NQy2!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe1473d24-1fa9-4705-971e-617466e9fdbf_2232x2590.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" 
y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2>Hooks in OpenClaw</h2><p>Within OpenClaw, whenever an action occurs, an event is fired to signal that the action has taken place. For example, when a user sends a message, an event is triggered to indicate this interaction. Similarly, when the agent responds or uses a tool, corresponding events are fired.</p><p>OpenClaw Hooks allow you to attach additional logic to these events as they occur. In essence, hooks enable you to tap into OpenClaw&#8217;s event-driven system, giving you the ability to customize and extend the system&#8217;s behavior.</p><p>OpenClaw hooks can be divided into the following categories:</p><ul><li><p><strong>Agent lifecycle:</strong> Hooks triggered during the agent&#8217;s execution.</p></li><li><p><strong>Message flow:</strong> Hooks related to messages entering and leaving the system.</p></li><li><p><strong>Tool execution:</strong> Hooks triggered before and after tool execution.</p></li><li><p><strong>Subagent coordination:</strong> Hooks triggered when subagents are invoked.</p></li><li><p><strong>Gateway lifecycle:</strong> Hooks triggered during the OpenClaw gateway&#8217;s operation.</p></li></ul><p>With all these hooks at our disposal, we gain visibility into a system that would otherwise be a black box.</p><h2>The Need for Observability in Agents</h2><p>Observability is essential for understanding how an AI agent or LLM behaves and why it performs tasks in a certain way. Without it, you&#8217;re effectively running your agent blind, constantly guessing what went wrong.</p><p>OpenClaw is no exception. 
In fact, it requires strong observability for a few key reasons:</p><ul><li><p><strong>Sensitive operations:</strong> It often runs directly on a user&#8217;s machine and may handle sensitive tasks, making visibility critical.</p></li><li><p><strong>Autonomous behavior:</strong> It can operate proactively through features like <a href="https://docs.openclaw.ai/gateway/heartbeat">heartbeats</a> or <a href="https://docs.openclaw.ai/automation/cron-jobs">cron jobs</a>, executing tasks without direct user input.</p></li><li><p><strong>System complexity:</strong> A single interaction may involve multiple tool calls and internal steps, making it harder to track what&#8217;s happening under the hood.</p></li></ul><p>To address these challenges, observability becomes a necessity. Observing a system like OpenClaw requires a dedicated observability tool, and this is where <a href="https://arize.com/">Arize</a> comes in.</p><p>Arize is a leading platform for LLM and agent observability, enabling developers to instrument their applications and capture every input and output flowing through the system.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!HNk9!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3fb3b9dd-26f6-403c-b6ac-0343106f7a12_1536x435.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!HNk9!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3fb3b9dd-26f6-403c-b6ac-0343106f7a12_1536x435.png 424w, 
https://substackcdn.com/image/fetch/$s_!HNk9!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3fb3b9dd-26f6-403c-b6ac-0343106f7a12_1536x435.png 848w, https://substackcdn.com/image/fetch/$s_!HNk9!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3fb3b9dd-26f6-403c-b6ac-0343106f7a12_1536x435.png 1272w, https://substackcdn.com/image/fetch/$s_!HNk9!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3fb3b9dd-26f6-403c-b6ac-0343106f7a12_1536x435.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!HNk9!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3fb3b9dd-26f6-403c-b6ac-0343106f7a12_1536x435.png" width="1456" height="412" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3fb3b9dd-26f6-403c-b6ac-0343106f7a12_1536x435.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:412,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!HNk9!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3fb3b9dd-26f6-403c-b6ac-0343106f7a12_1536x435.png 424w, 
https://substackcdn.com/image/fetch/$s_!HNk9!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3fb3b9dd-26f6-403c-b6ac-0343106f7a12_1536x435.png 848w, https://substackcdn.com/image/fetch/$s_!HNk9!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3fb3b9dd-26f6-403c-b6ac-0343106f7a12_1536x435.png 1272w, https://substackcdn.com/image/fetch/$s_!HNk9!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3fb3b9dd-26f6-403c-b6ac-0343106f7a12_1536x435.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line 
x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">How Arize Works</figcaption></figure></div><p>This captured data is then sent to Arize using the <a href="https://opentelemetry.io/docs/specs/otel/protocol/">OpenTelemetry protocol</a>, where it can be monitored and analyzed in detail.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!rnZU!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F048ac314-8e8e-4d70-bd72-38b16f12fe35_1844x1080.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!rnZU!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F048ac314-8e8e-4d70-bd72-38b16f12fe35_1844x1080.jpeg 424w, https://substackcdn.com/image/fetch/$s_!rnZU!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F048ac314-8e8e-4d70-bd72-38b16f12fe35_1844x1080.jpeg 848w, https://substackcdn.com/image/fetch/$s_!rnZU!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F048ac314-8e8e-4d70-bd72-38b16f12fe35_1844x1080.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!rnZU!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F048ac314-8e8e-4d70-bd72-38b16f12fe35_1844x1080.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!rnZU!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F048ac314-8e8e-4d70-bd72-38b16f12fe35_1844x1080.jpeg" width="1456" height="853" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/048ac314-8e8e-4d70-bd72-38b16f12fe35_1844x1080.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:853,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!rnZU!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F048ac314-8e8e-4d70-bd72-38b16f12fe35_1844x1080.jpeg 424w, https://substackcdn.com/image/fetch/$s_!rnZU!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F048ac314-8e8e-4d70-bd72-38b16f12fe35_1844x1080.jpeg 848w, https://substackcdn.com/image/fetch/$s_!rnZU!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F048ac314-8e8e-4d70-bd72-38b16f12fe35_1844x1080.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!rnZU!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F048ac314-8e8e-4d70-bd72-38b16f12fe35_1844x1080.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">The Arize dashboard being used for observability</figcaption></figure></div><p>Instrumentation is relatively straightforward when building applications from scratch, as Arize provides <a href="https://arize.com/docs/ax/integrations">integrations</a> for many LLM and agent frameworks. However, it becomes more challenging when working with an existing system like OpenClaw.</p><p>Fortunately, OpenClaw&#8217;s hooks provide a solution. By leveraging hooks, we can instrument the agent and gain the visibility needed to properly monitor and evaluate its behavior.</p><div><hr></div><p style="text-align: center;">Need creative, high-quality technical content? Happy to chat! Book a call with our Creative Engineers&#128071;&#127998;</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://calendar.app.google/s6ekF2rDbPjKWLLQ8&quot;,&quot;text&quot;:&quot;Let&#8217;s Chat! 
&#128222;&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://calendar.app.google/s6ekF2rDbPjKWLLQ8"><span>Let&#8217;s Chat! &#128222;</span></a></p><div><hr></div><h2>Creating Hooks in OpenClaw</h2><p>OpenClaw provides two ways to create hooks: directly within the project or through plugins.</p><p>The <strong>direct method</strong> involves creating a hook directory containing two key files:</p><ul><li><p><strong>HOOK.md</strong> &#8211; defines the hook&#8217;s metadata and documentation.</p></li><li><p><strong>handler.ts</strong> &#8211; implements the hook&#8217;s logic.</p></li></ul><p>The <strong>second method</strong> is to create a <strong>plugin</strong> that utilizes hooks. In this article, we will focus on <a href="https://docs.openclaw.ai/automation/hooks#plugin-hooks">plugin hooks</a> because they offer greater flexibility and are much easier to distribute and reuse.</p><p>To learn more about creating hooks directly, refer to the <a href="https://docs.openclaw.ai/automation/hooks">OpenClaw documentation</a>.</p><h3>Building a Simple Plugin Hook</h3><p>Let&#8217;s create a simple plugin that leverages hooks.</p><h4><strong>1. Initialize the Project</strong></h4><p>Start by creating a TypeScript npm project named arize_openclaw_hook. 
Then install OpenClaw as a dependency:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;bash&quot;,&quot;nodeId&quot;:&quot;ae2da095-0e74-49c1-8674-608768e88bc9&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-bash">npm install openclaw</code></pre></div><p>Ensure that OpenClaw is already installed and properly configured on your system or in the environment where it is running.</p><p>Below is an example of what your package.json might look like:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;json&quot;,&quot;nodeId&quot;:&quot;a37a250c-e5f0-4977-8bad-f6c453caed99&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-json">{
  "name": "arize_openclaw_hook",
  "version": "1.0.0",
  "description": "",
  "main": "index.ts",
  "scripts": {
    "test": "echo \"Error: no test specified\" &amp;&amp; exit 1"
  },
  "openclaw": {
    "extensions": [
      "./index.ts"
    ],
    "compat": {
      "pluginApi": "&gt;=2026.3.24-beta.2",
      "minGatewayVersion": "2026.3.24-beta.2"
    },
    "build": {
      "openclawVersion": "2026.3.24-beta.2",
      "pluginSdkVersion": "2026.3.24-beta.2"
    }
  },
  "keywords": [],
  "author": "",
  "license": "ISC",
  "type": "module",
  "dependencies": {
    "openclaw": "^2026.3.28"
  }
}
</code></pre></div><p>We&#8217;ve also included additional metadata such as the OpenClaw version and Plugin SDK version.</p><h4><strong>2. Create the Plugin Manifest</strong></h4><p>All OpenClaw plugins require a manifest file named Openclaw.plugin.json, which contains metadata about the plugin:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;json&quot;,&quot;nodeId&quot;:&quot;d0b3b13c-7b87-4f21-b269-fc932dfd12fd&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-json">{
  "id": "arize_openclaw_hook",
  "name": "Arize OpenClaw Hook",
  "description": "Observability plugin for OpenClaw with Arize.",
  "configSchema": {
    "type": "object",
    "additionalProperties": false
  }
}
</code></pre></div><blockquote><p><strong>Note:</strong> The plugin id should match the name of the npm project.</p></blockquote><h4><strong>3. Define the Plugin Entry</strong></h4><p>Next, create an index.ts file to define the plugin:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;typescript&quot;,&quot;nodeId&quot;:&quot;b42368dc-2a6c-4b8f-9304-984f8d0949f2&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-typescript">import { definePluginEntry } from "openclaw/plugin-sdk/plugin-entry";

export default definePluginEntry({
    id: "arize_openclaw_hook",
    name: "Arize OpenClaw Hook",
    description: "Observability plugin for OpenClaw with Arize.",
    register(api) {
        api.registerService(arizeService(api));
    },
});
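
// This is the file listed under the "extensions" array of the "openclaw"
// section in package.json above, which is how the plugin entry is wired up.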
</code></pre></div><p>The <code>definePluginEntry </code>function accepts an object with the plugin&#8217;s metadata and a <code>register</code> method. This method is used to register the plugin&#8217;s functionality with OpenClaw.</p><h4><strong>4. Implement the Plugin Service</strong></h4><p>We will register our plugin as a <strong>background service</strong> using api.registerService. A service must return an OpenClawPluginService object:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;typescript&quot;,&quot;nodeId&quot;:&quot;a708b7d6-2953-4ad2-a7b5-046ca9b19907&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-typescript">export type OpenClawPluginService = {
    id: string;
    start: (ctx: OpenClawPluginServiceContext) =&gt; void | Promise&lt;void&gt;;
    stop?: (ctx: OpenClawPluginServiceContext) =&gt; void | Promise&lt;void&gt;;
};
</code></pre></div><ul><li><p><strong>id:</strong> Uniquely identifies the service.</p></li><li><p><strong>start:</strong> Executed when the plugin starts.</p></li><li><p><strong>stop:</strong> Optional method executed during shutdown for cleanup.</p></li></ul><p>Here is a minimal implementation:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;typescript&quot;,&quot;nodeId&quot;:&quot;1b799ecb-d727-452f-b622-60ddd5f27f04&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-typescript">function arizeService(api: OpenClawPluginApi): OpenClawPluginService {
    
    api.logger.info("Initializing Arize OpenClaw Hook service...");
    
    return {
        id: "arize-openclaw",
        async start(ctx: OpenClawPluginServiceContext) {
            const log = { info: ctx.logger.info.bind(ctx.logger) };

            log.info("Arize OpenClaw Hook service is starting...");
        },
        async stop(ctx: OpenClawPluginServiceContext) {
            const log = { info: ctx.logger.info.bind(ctx.logger) };

            log.info("Arize OpenClaw Hook service is stopping...");
        }
    };
}
</code></pre></div><p>This service simply logs messages using the OpenClaw logger. The <code>api.logger</code> is available during registration, while <code>ctx.logger</code> is used within the <code>start</code> and <code>stop</code> lifecycle methods.</p><h4><strong>5. Adding Hooks to the Service</strong></h4><p>Next, we attach hooks to listen for specific OpenClaw events:</p><ul><li><p><strong>llm_input</strong> &#8211; Triggered when the LLM receives a prompt.</p></li><li><p><strong>llm_output</strong> &#8211; Triggered when the LLM generates a response.</p></li><li><p><strong>before_tool_call</strong> &#8211; Triggered before the agent invokes a tool.</p></li></ul><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;typescript&quot;,&quot;nodeId&quot;:&quot;b0ff9cab-49bf-4cff-8c25-19081c1a0192&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-typescript">import { definePluginEntry, OpenClawPluginApi, OpenClawPluginService, OpenClawPluginServiceContext } from "openclaw/plugin-sdk/plugin-entry";


function arizeService(api: OpenClawPluginApi): OpenClawPluginService {

    api.logger.info("Initializing Arize OpenClaw Hook service...");

    api.on("llm_input", (event) =&gt; {
        api.logger.info(`Prompt: ${event.prompt}`);
        api.logger.info(`Model name: ${event.model}`);
        api.logger.info(`Model provider: ${event.provider}`);
    });

    api.on("llm_output", (event) =&gt; {
        api.logger.info(`Assistant Texts: ${event.assistantTexts.toString()}`);
    });

    api.on("before_tool_call", (event) =&gt; {
        api.logger.info(`Tool Name: ${event.toolName}`);
        api.logger.info(`Tool Call ID: ${event.toolCallId}`);
    });

    return {
        id: "arize-openclaw",
        async start(ctx: OpenClawPluginServiceContext) {
            const log = { info: ctx.logger.info.bind(ctx.logger) };
            log.info("Arize OpenClaw Hook service is starting...");
        },
        async stop(ctx: OpenClawPluginServiceContext) {
            const log = { info: ctx.logger.info.bind(ctx.logger) };
            log.info("Arize OpenClaw Hook service is stopping...");
        }
    };
}

export default definePluginEntry({
    id: "arize_openclaw_hook",
    name: "Arize OpenClaw Hook",
    description: "Observability plugin for OpenClaw with Arize.",
    register(api) {
        api.registerService(arizeService(api));
    },
});
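
// --- Optional sketch (not part of the plugin above): forwarding hook data to
// --- Arize over OpenTelemetry instead of logging it. Arize ingests OTLP
// --- traces, so each handler could open a span and record the event data as
// --- attributes. The package and attribute names below are illustrative
// --- assumptions; these calls would replace the logger calls in arizeService.
//
// import { trace } from "@opentelemetry/api";
//
// const tracer = trace.getTracer("arize_openclaw_hook");
//
// api.on("llm_input", (event) =&gt; {
//     const span = tracer.startSpan("llm_call");
//     span.setAttribute("llm.prompt", event.prompt);
//     span.setAttribute("llm.model_name", event.model);
//     span.setAttribute("llm.provider", event.provider);
//     span.end();
// });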
</code></pre></div><h5><strong>Event Data Overview</strong></h5><ul><li><p><strong>llm_input</strong>: Contains the prompt, model name, and provider.</p></li><li><p><strong>llm_output</strong>: Includes the assistant&#8217;s generated response.</p></li><li><p><strong>before_tool_call</strong>: Provides details about the tool being invoked, such as its name and call ID.</p></li></ul><h4><strong>6. Installing and Testing the Plugin</strong></h4><p>Once the implementation is complete, install the plugin locally:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;bash&quot;,&quot;nodeId&quot;:&quot;b68eed1b-3393-4040-94f4-e13c50095cd2&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-bash">openclaw plugins install .</code></pre></div><p>After installation, restart the OpenClaw gateway to activate the plugin:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;bash&quot;,&quot;nodeId&quot;:&quot;ecd20e6b-62e7-4b55-8425-ca6514d21c09&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-bash">openclaw gateway restart</code></pre></div><p>Next, we need to open the OpenClaw dashboard to communicate with the agent. 
We can do this using the following command:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;bash&quot;,&quot;nodeId&quot;:&quot;8844d616-bed6-49ba-bd60-14e639fab11d&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-bash">openclaw dashboard</code></pre></div><p>Once the dashboard is open, you can send a message to the agent from the Chat section in the sidebar.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!-2I7!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F38b3eb79-75a5-4c94-b103-3e656ce00cd1_1600x777.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!-2I7!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F38b3eb79-75a5-4c94-b103-3e656ce00cd1_1600x777.png 424w, https://substackcdn.com/image/fetch/$s_!-2I7!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F38b3eb79-75a5-4c94-b103-3e656ce00cd1_1600x777.png 848w, https://substackcdn.com/image/fetch/$s_!-2I7!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F38b3eb79-75a5-4c94-b103-3e656ce00cd1_1600x777.png 1272w, https://substackcdn.com/image/fetch/$s_!-2I7!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F38b3eb79-75a5-4c94-b103-3e656ce00cd1_1600x777.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!-2I7!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F38b3eb79-75a5-4c94-b103-3e656ce00cd1_1600x777.png" width="1456" height="707" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/38b3eb79-75a5-4c94-b103-3e656ce00cd1_1600x777.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:707,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!-2I7!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F38b3eb79-75a5-4c94-b103-3e656ce00cd1_1600x777.png 424w, https://substackcdn.com/image/fetch/$s_!-2I7!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F38b3eb79-75a5-4c94-b103-3e656ce00cd1_1600x777.png 848w, https://substackcdn.com/image/fetch/$s_!-2I7!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F38b3eb79-75a5-4c94-b103-3e656ce00cd1_1600x777.png 1272w, https://substackcdn.com/image/fetch/$s_!-2I7!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F38b3eb79-75a5-4c94-b103-3e656ce00cd1_1600x777.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft 
icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">The OpenClaw Dashboard</figcaption></figure></div><p>Next, go to Logs in the sidebar and verify that the data was logged.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!u1ww!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F394b08d6-543f-4bf7-81bf-34953f32a7aa_1600x777.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!u1ww!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F394b08d6-543f-4bf7-81bf-34953f32a7aa_1600x777.png 
424w, https://substackcdn.com/image/fetch/$s_!u1ww!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F394b08d6-543f-4bf7-81bf-34953f32a7aa_1600x777.png 848w, https://substackcdn.com/image/fetch/$s_!u1ww!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F394b08d6-543f-4bf7-81bf-34953f32a7aa_1600x777.png 1272w, https://substackcdn.com/image/fetch/$s_!u1ww!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F394b08d6-543f-4bf7-81bf-34953f32a7aa_1600x777.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!u1ww!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F394b08d6-543f-4bf7-81bf-34953f32a7aa_1600x777.png" width="1456" height="707" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/394b08d6-543f-4bf7-81bf-34953f32a7aa_1600x777.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:707,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!u1ww!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F394b08d6-543f-4bf7-81bf-34953f32a7aa_1600x777.png 424w, 
https://substackcdn.com/image/fetch/$s_!u1ww!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F394b08d6-543f-4bf7-81bf-34953f32a7aa_1600x777.png 848w, https://substackcdn.com/image/fetch/$s_!u1ww!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F394b08d6-543f-4bf7-81bf-34953f32a7aa_1600x777.png 1272w, https://substackcdn.com/image/fetch/$s_!u1ww!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F394b08d6-543f-4bf7-81bf-34953f32a7aa_1600x777.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line 
x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Logs in the OpenClaw Dashboard</figcaption></figure></div><p>From the image above, we can see that the data was successfully logged, confirming that our plugin works as expected. Next, we add instrumentation so that the data is sent to the Arize server instead of only being logged locally.</p><h2>Hooks + Observability = Better Agent</h2><p>In this section, we&#8217;ll extend the plugin from the previous section and add instrumentation so we can observe the agent&#8217;s behavior in Arize instead of relying only on local logs.</p><h3>Installing Instrumentation Dependencies</h3><p>First, install the required OpenTelemetry and Arize instrumentation packages:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;bash&quot;,&quot;nodeId&quot;:&quot;bf70ac87-c163-429e-ad7b-16157bcdb2e2&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-bash">npm install 
@opentelemetry/api @opentelemetry/exporter-trace-otlp-proto @opentelemetry/resources @opentelemetry/sdk-trace-base @opentelemetry/sdk-trace-node @opentelemetry/semantic-conventions @arizeai/openinference-semantic-conventions openai</code></pre></div><h3>Configuring Arize Credentials</h3><p>Next, we need to configure the credentials required to send traces to Arize:</p><ul><li><p>ARIZE_API_KEY: Your Arize API key</p></li><li><p>ARIZE_SPACE_ID: Your Arize space ID</p></li><li><p>ARIZE_PROJECT_NAME: The name of your project</p></li></ul><p>Normally, these would be stored in an environment file. However, since we are working within OpenClaw, the better approach is to expose them as configurable plugin values via the manifest file.</p><p>We update <code>Openclaw.plugin.json</code> to include these configuration values:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;json&quot;,&quot;nodeId&quot;:&quot;21824612-d5ec-4987-abc5-468accb8bd9b&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-json">{
  "id": "arize_openclaw_hook",
  "name": "Arize OpenClaw Hook",
  "description": "Observability plugin for OpenClaw with Arize.",
  "configSchema": {
    "type": "object",
    "additionalProperties": false,
    "properties": {
      "spaceID": {
        "type": "string"
      },
      "apiKey": {
        "type": "string"
      },
      "projectName": {
        "type": "string"
      }
    }
  },
  "uiHints": {
    "apiKey": {
      "label": "Arize API Key",
      "placeholder": "your-api-key",
      "sensitive": true
    },
    "spaceID": {
      "label": "Arize Space ID",
      "placeholder": "your-space-id",
      "help": "The ID of the Arize space to use."
    },
    "projectName": {
      "label": "Project Name",
      "placeholder": "openclaw"
    }
  }
}
</code></pre></div><p>These values can now be set either in <code>openclaw.json</code> or directly from the dashboard thanks to the UI hints defined above.</p><h3>Adding Instrumentation</h3><p>Next, we implement instrumentation to connect OpenClaw to Arize. This involves initializing an OpenTelemetry <a href="https://opentelemetry.io/docs/concepts/signals/traces/">tracer</a> that sends <a href="https://opentelemetry.io/docs/concepts/signals/traces/#spans">spans</a> to the Arize server.</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;typescript&quot;,&quot;nodeId&quot;:&quot;634fedd0-dbb7-47c8-a38d-65d4e656f539&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-typescript">function arizeService(api: OpenClawPluginApi): OpenClawPluginService {
   
    const COLLECTOR_ENDPOINT = "https://otlp.arize.com";
    const SERVICE_NAME = process.env.ARIZE_PROJECT_NAME || "";
    
    api.logger.info(`Initializing Arize OpenClaw Hook with service name: ${SERVICE_NAME}`);
    
    const provider = new NodeTracerProvider({
        resource: resourceFromAttributes({
            [ATTR_SERVICE_NAME]: SERVICE_NAME,
            [SEMRESATTRS_PROJECT_NAME]: SERVICE_NAME,
        }),
        spanProcessors: [
            new SimpleSpanProcessor(
                new OTLPTraceExporter({
                    url: `${COLLECTOR_ENDPOINT}/v1/traces`,
                    headers: {
                        'space_id': process.env.ARIZE_SPACE_ID || "",
                        'api_key': process.env.ARIZE_API_KEY || "",
                    },
                })
            ),
        ],
    });

    provider.register();

    const tracer = trace.getTracer(SERVICE_NAME); 
    
    return {
        id: "arize-openclaw",
        async start(ctx: OpenClawPluginServiceContext) {
            const log = { info: ctx.logger.info.bind(ctx.logger) };

            log.info("Arize OpenClaw Hook service is starting...");
        },
        async stop(ctx: OpenClawPluginServiceContext) {
            const log = { info: ctx.logger.info.bind(ctx.logger) };
            log.info("Arize OpenClaw Hook service is stopping...");
            // Flush any spans still queued in the exporter so the final
            // traces reach Arize before the gateway process exits.
            await provider.shutdown();
        }
    };
}
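
// Note on the exporter above: SimpleSpanProcessor sends each span in its
// own request the moment it ends, which is convenient during development.
// For heavier traffic, @opentelemetry/sdk-trace-base also provides
// BatchSpanProcessor, which queues spans and exports them in batches;
// swapping it in only changes the spanProcessors entry above, e.g.:
//
//   spanProcessors: [new BatchSpanProcessor(new OTLPTraceExporter({ ... }))]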
</code></pre></div><p>At this point, we have instrumentation configured, but no hooks attached yet.</p><h3>Connecting Hooks to Traces</h3><p>In the code below, we added our hooks as before, but this time we use the tracer to create spans that instrument all calls and send them to Arize.</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;typescript&quot;,&quot;nodeId&quot;:&quot;86909071-11b7-41ad-b70f-ae1a30b7de0e&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-typescript">function arizeService(api: OpenClawPluginApi): OpenClawPluginService {
    
    // Remaining code above
    
    // Keep a reference to the span for the in-flight LLM call so the
    // "llm_output" handler can close it. The output listener is registered
    // once here; registering it inside the "llm_input" handler would add a
    // new listener on every call and re-touch spans that already ended.
    let activeLlmSpan: ReturnType&lt;typeof tracer.startSpan&gt; | null = null;

    api.on("llm_input", (event) =&gt; {
        const span = tracer.startSpan("call-llm");
        span.setAttribute(SemanticConventions.OPENINFERENCE_SPAN_KIND, OpenInferenceSpanKind.CHAIN);
        span.setAttribute(INPUT_VALUE, event.prompt);
        span.setAttribute(LLM_MODEL_NAME, event.model);
        span.setAttribute(LLM_PROVIDER, event.provider);
        activeLlmSpan = span;
    });

    api.on("llm_output", (event) =&gt; {
        if (!activeLlmSpan) return;
        activeLlmSpan.setAttribute(OUTPUT_VALUE, event.assistantTexts.toString());
        activeLlmSpan.setStatus({ code: SpanStatusCode.OK });
        activeLlmSpan.end();
        activeLlmSpan = null;
    });

    api.on("before_tool_call", (event) =&gt; {
        const span = tracer.startSpan("tool-call");
        span.setAttribute(SemanticConventions.OPENINFERENCE_SPAN_KIND, OpenInferenceSpanKind.TOOL);
        span.setAttribute(TOOL_NAME, event.toolName);
        span.setAttribute(TOOL_CALL_ID, event.toolCallId || "unknown");
        span.setStatus({ code: SpanStatusCode.OK });
        span.end();
    });
    
    // Remaining code below
}
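
// Aside (illustration only, not part of the plugin): why "llm_output"
// should not be subscribed from inside the "llm_input" handler. Each
// input event would add one more output listener, so a single response
// would run every accumulated callback and re-touch ended spans.
const outputListeners: Array&lt;() =&gt; void&gt; = [];
const onOutput = (fn: () =&gt; void) =&gt; outputListeners.push(fn);
const emitOutput = () =&gt; outputListeners.forEach((fn) =&gt; fn());

let calls = 0;
onOutput(() =&gt; { calls += 1; }); // subscribed during 1st "llm_input"
onOutput(() =&gt; { calls += 1; }); // subscribed again during 2nd "llm_input"
emitOutput();                      // one output event fires both: calls === 2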

</code></pre></div><p>Here is the full implementation:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;typescript&quot;,&quot;nodeId&quot;:&quot;03e85055-5b27-4471-8c58-62fac2237d7b&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-typescript">import { definePluginEntry } from "openclaw/plugin-sdk/plugin-entry";
import type { OpenClawPluginApi, OpenClawPluginService , OpenClawPluginServiceContext } from "openclaw/plugin-sdk";
import { OTLPTraceExporter } from "@opentelemetry/exporter-trace-otlp-proto";
import { resourceFromAttributes } from "@opentelemetry/resources";
import { SimpleSpanProcessor } from "@opentelemetry/sdk-trace-base";
import { NodeTracerProvider } from "@opentelemetry/sdk-trace-node";
import { ATTR_SERVICE_NAME } from "@opentelemetry/semantic-conventions";
import { SEMRESATTRS_PROJECT_NAME } from "@arizeai/openinference-semantic-conventions";


import { trace, SpanStatusCode } from "@opentelemetry/api";
import {
    INPUT_VALUE, OUTPUT_VALUE, LLM_MODEL_NAME, LLM_PROVIDER, SemanticConventions, TOOL_NAME, TOOL_PARAMETERS, TOOL_CALL_ID,
    OpenInferenceSpanKind
} from "@arizeai/openinference-semantic-conventions";
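
// Hypothetical shapes of the hook event payloads, inferred from how the
// handlers below read them. These are assumptions for readability only;
// the real OpenClaw plugin SDK types may differ.
interface LlmInputEvent { prompt: string; model: string; provider: string; }
interface LlmOutputEvent { assistantTexts: string[]; }
interface BeforeToolCallEvent { toolName: string; toolCallId?: string; }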


function arizeService(api: OpenClawPluginApi): OpenClawPluginService {

    const COLLECTOR_ENDPOINT = "https://otlp.arize.com";
    const SERVICE_NAME = process.env.ARIZE_PROJECT_NAME || "";
    
    api.logger.info(`Initializing Arize OpenClaw Hook with service name: ${SERVICE_NAME}`);
    
    const provider = new NodeTracerProvider({
        resource: resourceFromAttributes({
            [ATTR_SERVICE_NAME]: SERVICE_NAME,
            [SEMRESATTRS_PROJECT_NAME]: SERVICE_NAME,
        }),
        spanProcessors: [
            new SimpleSpanProcessor(
                new OTLPTraceExporter({
                    url: `${COLLECTOR_ENDPOINT}/v1/traces`,
                    headers: {
                        'space_id': process.env.ARIZE_SPACE_ID || "",
                        'api_key': process.env.ARIZE_API_KEY || "",
                    },
                })
            ),
        ],
    });

    provider.register();

    const tracer = trace.getTracer(SERVICE_NAME);


    // Keep a reference to the span for the in-flight LLM call so the
    // "llm_output" handler can close it. The output listener is registered
    // once here; registering it inside the "llm_input" handler would add a
    // new listener on every call and re-touch spans that already ended.
    let activeLlmSpan: ReturnType&lt;typeof tracer.startSpan&gt; | null = null;

    api.on("llm_input", (event) =&gt; {
        const span = tracer.startSpan("call-llm");
        span.setAttribute(SemanticConventions.OPENINFERENCE_SPAN_KIND, OpenInferenceSpanKind.CHAIN);
        span.setAttribute(INPUT_VALUE, event.prompt);
        span.setAttribute(LLM_MODEL_NAME, event.model);
        span.setAttribute(LLM_PROVIDER, event.provider);
        activeLlmSpan = span;
    });

    api.on("llm_output", (event) =&gt; {
        if (!activeLlmSpan) return;
        activeLlmSpan.setAttribute(OUTPUT_VALUE, event.assistantTexts.toString());
        activeLlmSpan.setStatus({ code: SpanStatusCode.OK });
        activeLlmSpan.end();
        activeLlmSpan = null;
    });

    api.on("before_tool_call", (event) =&gt; {
        const span = tracer.startSpan("tool-call");
        span.setAttribute(SemanticConventions.OPENINFERENCE_SPAN_KIND, OpenInferenceSpanKind.TOOL);
        span.setAttribute(TOOL_NAME, event.toolName);
        span.setAttribute(TOOL_CALL_ID, event.toolCallId || "unknown");
        span.setStatus({ code: SpanStatusCode.OK });
        span.end();
    });


    return {
        id: "arize-openclaw",
        async start(ctx: OpenClawPluginServiceContext) {
            const log = { info: ctx.logger.info.bind(ctx.logger) };

            log.info("Arize OpenClaw Hook service is starting...");
        },
        async stop(ctx: OpenClawPluginServiceContext) {
            const log = { info: ctx.logger.info.bind(ctx.logger) };
            log.info("Arize OpenClaw Hook service is stopping...");
            // Flush any spans still queued in the exporter so the final
            // traces reach Arize before the gateway process exits.
            await provider.shutdown();
        }
    };
}

export default definePluginEntry({
    id: "arize_openclaw_hook",
    name: "Arize OpenClaw Hook",
    description: "Observability plugin for OpenClaw with Arize.",
    register(api) {
        api.registerService(arizeService(api));
    },
});
</code></pre></div><h3>Testing the Plugin</h3><p>Now that we have observability in place, let&#8217;s test the plugin again.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!NqCS!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F89e52a24-ae24-48dd-977f-db944cd663f5_1920x889.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!NqCS!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F89e52a24-ae24-48dd-977f-db944cd663f5_1920x889.png 424w, https://substackcdn.com/image/fetch/$s_!NqCS!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F89e52a24-ae24-48dd-977f-db944cd663f5_1920x889.png 848w, https://substackcdn.com/image/fetch/$s_!NqCS!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F89e52a24-ae24-48dd-977f-db944cd663f5_1920x889.png 1272w, 
https://substackcdn.com/image/fetch/$s_!NqCS!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F89e52a24-ae24-48dd-977f-db944cd663f5_1920x889.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!NqCS!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F89e52a24-ae24-48dd-977f-db944cd663f5_1920x889.png" width="1456" height="674" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/89e52a24-ae24-48dd-977f-db944cd663f5_1920x889.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:674,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!NqCS!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F89e52a24-ae24-48dd-977f-db944cd663f5_1920x889.png 424w, https://substackcdn.com/image/fetch/$s_!NqCS!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F89e52a24-ae24-48dd-977f-db944cd663f5_1920x889.png 848w, https://substackcdn.com/image/fetch/$s_!NqCS!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F89e52a24-ae24-48dd-977f-db944cd663f5_1920x889.png 1272w, 
https://substackcdn.com/image/fetch/$s_!NqCS!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F89e52a24-ae24-48dd-977f-db944cd663f5_1920x889.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Tool calls in the OpenClaw dashboard</figcaption></figure></div><p>In the OpenClaw dashboard, I created a new chat session. 
When a session starts, OpenClaw loads memory and other contextual data using internal tools.</p><p>Switching to the Arize dashboard, we can see that all tool calls and the new session events are being sent successfully.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!HfM5!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F76f6c97e-96b3-4f16-8c65-5451d8e14891_1920x889.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!HfM5!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F76f6c97e-96b3-4f16-8c65-5451d8e14891_1920x889.png 424w, https://substackcdn.com/image/fetch/$s_!HfM5!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F76f6c97e-96b3-4f16-8c65-5451d8e14891_1920x889.png 848w, https://substackcdn.com/image/fetch/$s_!HfM5!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F76f6c97e-96b3-4f16-8c65-5451d8e14891_1920x889.png 1272w, https://substackcdn.com/image/fetch/$s_!HfM5!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F76f6c97e-96b3-4f16-8c65-5451d8e14891_1920x889.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!HfM5!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F76f6c97e-96b3-4f16-8c65-5451d8e14891_1920x889.png" width="1456" height="674" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/76f6c97e-96b3-4f16-8c65-5451d8e14891_1920x889.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:674,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!HfM5!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F76f6c97e-96b3-4f16-8c65-5451d8e14891_1920x889.png 424w, https://substackcdn.com/image/fetch/$s_!HfM5!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F76f6c97e-96b3-4f16-8c65-5451d8e14891_1920x889.png 848w, https://substackcdn.com/image/fetch/$s_!HfM5!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F76f6c97e-96b3-4f16-8c65-5451d8e14891_1920x889.png 1272w, https://substackcdn.com/image/fetch/$s_!HfM5!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F76f6c97e-96b3-4f16-8c65-5451d8e14891_1920x889.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path 
d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Traces of the OpenClaw Agent&#8217;s action in arize</figcaption></figure></div><p>If we inspect the session creation event, we can see:</p><ul><li><p>Input message captured</p></li><li><p>Output recorded</p></li><li><p>Additional metadata such as latency and execution details</p></li></ul><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!uL5y!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F29c84740-0b57-4b77-a86e-832036203f26_1920x889.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!uL5y!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F29c84740-0b57-4b77-a86e-832036203f26_1920x889.png 424w, 
https://substackcdn.com/image/fetch/$s_!uL5y!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F29c84740-0b57-4b77-a86e-832036203f26_1920x889.png 848w, https://substackcdn.com/image/fetch/$s_!uL5y!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F29c84740-0b57-4b77-a86e-832036203f26_1920x889.png 1272w, https://substackcdn.com/image/fetch/$s_!uL5y!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F29c84740-0b57-4b77-a86e-832036203f26_1920x889.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!uL5y!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F29c84740-0b57-4b77-a86e-832036203f26_1920x889.png" width="1456" height="674" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/29c84740-0b57-4b77-a86e-832036203f26_1920x889.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:674,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!uL5y!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F29c84740-0b57-4b77-a86e-832036203f26_1920x889.png 424w, 
https://substackcdn.com/image/fetch/$s_!uL5y!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F29c84740-0b57-4b77-a86e-832036203f26_1920x889.png 848w, https://substackcdn.com/image/fetch/$s_!uL5y!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F29c84740-0b57-4b77-a86e-832036203f26_1920x889.png 1272w, https://substackcdn.com/image/fetch/$s_!uL5y!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F29c84740-0b57-4b77-a86e-832036203f26_1920x889.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line 
x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Detailed Overview of chat message in Arize</figcaption></figure></div><blockquote><p>You can get the full source code implementation <a href="https://github.com/Neurl-LLC/OpenClaw_Observability_plugin_with_Arize">here</a> on github.</p></blockquote><div><hr></div><p style="text-align: center;"><em>Join the conversation and share your experiences in the comments below!</em></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://neurlcreators.substack.com/p/understanding-openclaws-hook-the/comments&quot;,&quot;text&quot;:&quot;Leave a comment&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://neurlcreators.substack.com/p/understanding-openclaws-hook-the/comments"><span>Leave a comment</span></a></p><div><hr></div><h2>Conclusion</h2><p>OpenClaw&#8217;s hooks are a powerful feature that let you tap directly into the system&#8217;s internal events. By wrapping hooks inside a plugin, you can fully customize your OpenClaw agent and extend its capabilities in meaningful ways.</p><p>In this article, we used that flexibility to add observability to our agent using Arize. Observability is critical when working with systems like OpenClaw, where agents are complex, autonomous, and often operate behind multiple layers of abstraction. 
</p><p>With the right instrumentation in place, you move from guessing what your agent is doing to clearly understanding its behavior in real time.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://neurlcreators.substack.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption"><em>Subscribe to <a href="https://neurlcreators.substack.com/">The Neural Blueprint</a>&#128071;&#127998; for hands-on guides! &#129761; Follow us on <a href="http://www.youtube.com/@neurlcreators">YouTube</a>, <a href="https://x.com/NeurlCreators">X</a>, and <a href="https://www.linkedin.com/showcase/neurl-creators/">LinkedIn</a>.</em></p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><p></p>]]></content:encoded></item><item><title><![CDATA[Building A Voice AI Agent with OpenClaw and AssemblyAI]]></title><description><![CDATA[Customize Your OpenClaw Agent&#8217;s Media Understanding with Universal-3 Pro]]></description><link>https://neurlcreators.substack.com/p/building-a-voice-ai-agent-with-openclaw</link><guid isPermaLink="false">https://neurlcreators.substack.com/p/building-a-voice-ai-agent-with-openclaw</guid><dc:creator><![CDATA[Eteimorde Youdiowei]]></dc:creator><pubDate>Thu, 26 Mar 2026 13:31:49 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/637e8655-2575-455f-ac20-3e665ffcc3b3_1456x1048.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div id="youtube2-wf5bE15JO_w" class="youtube-wrap" 
data-attrs="{&quot;videoId&quot;:&quot;wf5bE15JO_w&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/wf5bE15JO_w?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><div><hr></div><p>OpenClaw went viral this year because of its simplicity in allowing users to communicate with AI agents running on their own hardware. It wasn&#8217;t a new model, a new architecture, or a new agentic protocol. Instead, it demonstrated a new way people could work with AI agents using technology they were already familiar with.</p><p>OpenClaw allows users to communicate with their agents through chat apps such as Telegram and WhatsApp. A user can simply pick up their device and send a text message to the AI agent. This ease of use is what attracted many users to OpenClaw.</p><p>But can this interaction be simplified even further? Yes, it can. One way to do this is by turning your OpenClaw agent into a <a href="https://www.assemblyai.com/blog/ai-voice-agents">voice AI agent</a>. Since OpenClaw already works through chat apps, and most of these apps support voice notes, voice interaction becomes a natural extension of the system.</p><p>In this article, we will show you how to set up OpenClaw as a voice AI agent. We will also demonstrate how to bring your own speech-to-text model and integrate it with OpenClaw. The model we will use is <strong><a href="https://www.assemblyai.com/universal-3-pro">Universal-3 Pro</a></strong>, and we will explore how its prompting capabilities can be used to create a more customized voice interaction experience.</p><h2>OpenClaw in a nutshell</h2><p>There is a lot of confusion surrounding OpenClaw. 
Even the name itself <a href="https://www.forbes.com/sites/ronschmelzer/2026/01/30/moltbot-molts-again-and-becomes-openclaw-pushback-and-concerns-grow/">creates confusion</a>. So in this section, we will do a quick breakdown of what OpenClaw is and how it works.</p><h3>What is OpenClaw?</h3><p>You can think of OpenClaw as a gateway between your chat app and your AI agent.</p><p>The chat app can be Telegram, WhatsApp, or Slack. The AI agent can be powered by cloud-based Large Language Models (<a href="https://www.ibm.com/think/topics/large-language-models">LLMs</a>) such as those provided by Anthropic or OpenAI, or even by a locally hosted model.</p><p>The AI agent also has access to a computer system. This could be your personal computer (though this is <a href="https://www.pcworld.com/article/3064874/openclaw-ai-is-going-viral-dont-install-it.html">not advisable</a>), a Mac Mini, a Raspberry Pi, or a cloud server.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!c8Tt!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F53be0fd9-5000-42bf-b497-f3595cbab294_1600x1267.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!c8Tt!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F53be0fd9-5000-42bf-b497-f3595cbab294_1600x1267.png 424w, https://substackcdn.com/image/fetch/$s_!c8Tt!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F53be0fd9-5000-42bf-b497-f3595cbab294_1600x1267.png 848w, 
https://substackcdn.com/image/fetch/$s_!c8Tt!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F53be0fd9-5000-42bf-b497-f3595cbab294_1600x1267.png 1272w, https://substackcdn.com/image/fetch/$s_!c8Tt!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F53be0fd9-5000-42bf-b497-f3595cbab294_1600x1267.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!c8Tt!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F53be0fd9-5000-42bf-b497-f3595cbab294_1600x1267.png" width="1456" height="1153" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/53be0fd9-5000-42bf-b497-f3595cbab294_1600x1267.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1153,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!c8Tt!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F53be0fd9-5000-42bf-b497-f3595cbab294_1600x1267.png 424w, https://substackcdn.com/image/fetch/$s_!c8Tt!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F53be0fd9-5000-42bf-b497-f3595cbab294_1600x1267.png 848w, 
https://substackcdn.com/image/fetch/$s_!c8Tt!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F53be0fd9-5000-42bf-b497-f3595cbab294_1600x1267.png 1272w, https://substackcdn.com/image/fetch/$s_!c8Tt!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F53be0fd9-5000-42bf-b497-f3595cbab294_1600x1267.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">OpenClaw&#8217;s setup consists of a chat application that the user uses to communicate with an agent that has 
access to a computer.</figcaption></figure></div><p>The OpenClaw setup consists of the following:</p><ul><li><p>The <strong>chat application</strong> serves as the user interface</p></li><li><p><strong>OpenClaw</strong> acts as the orchestrator</p></li><li><p>The <strong>AI agent</strong> is the brain</p></li><li><p>The <strong>computer</strong> serves as the universal tool</p></li></ul><h3>What makes OpenClaw different?</h3><p>Agents are not a new concept. Agents that have access to a computer are not new either, and chatting with AI is certainly not new.</p><p>However, two things make OpenClaw stand out.</p><p>The first is the <strong>medium of communication</strong>. Unlike many chatbots that require you to use a separate app or a dedicated website, OpenClaw allows you to communicate with your agent through the chat apps you already use.</p><p>The second difference is that the OpenClaw agent is <strong>more proactive</strong>. It is not just another chat session. The agent can maintain memory, send reminders about tasks it is working on, and interact with the computer it has access to.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!sPl-!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feeeeaa79-7263-40be-bfe6-c344e6a9c646_1600x1267.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!sPl-!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feeeeaa79-7263-40be-bfe6-c344e6a9c646_1600x1267.png 424w, https://substackcdn.com/image/fetch/$s_!sPl-!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feeeeaa79-7263-40be-bfe6-c344e6a9c646_1600x1267.png 848w, 
https://substackcdn.com/image/fetch/$s_!sPl-!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feeeeaa79-7263-40be-bfe6-c344e6a9c646_1600x1267.png 1272w, https://substackcdn.com/image/fetch/$s_!sPl-!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feeeeaa79-7263-40be-bfe6-c344e6a9c646_1600x1267.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!sPl-!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feeeeaa79-7263-40be-bfe6-c344e6a9c646_1600x1267.png" width="1456" height="1153" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/eeeeaa79-7263-40be-bfe6-c344e6a9c646_1600x1267.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1153,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!sPl-!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feeeeaa79-7263-40be-bfe6-c344e6a9c646_1600x1267.png 424w, https://substackcdn.com/image/fetch/$s_!sPl-!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feeeeaa79-7263-40be-bfe6-c344e6a9c646_1600x1267.png 848w, 
https://substackcdn.com/image/fetch/$s_!sPl-!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feeeeaa79-7263-40be-bfe6-c344e6a9c646_1600x1267.png 1272w, https://substackcdn.com/image/fetch/$s_!sPl-!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feeeeaa79-7263-40be-bfe6-c344e6a9c646_1600x1267.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">OpenClaw equips the agent with several capabilities that extend its functionality, such as access to tools and 
skills, as well as markdown files that define the agent&#8217;s behavior and even its memory.</figcaption></figure></div><p>Since the agent has access to the system, it can perform actions such as reading files, editing files, and running commands. In many ways, OpenClaw feels like giving a personal computer to an AI assistant.</p><h3>Setting up OpenClaw</h3><p>When it comes to installing OpenClaw, there are several options to choose from. The easiest is to install it on your personal computer, but this is not advisable: the AI agent will have full control over your machine, and security experts have warned about <a href="https://www.xda-developers.com/please-stop-using-openclaw/">several security vulnerabilities</a> in OpenClaw.</p><p>That said, running it on your personal computer is the fastest way to experiment. Another option is to dedicate a computer to OpenClaw, such as a Mac mini or a Raspberry Pi. You can also run OpenClaw inside a Docker container so that it is sandboxed.</p><p>If you&#8217;re on Mac or Linux, you can install OpenClaw with this one-liner:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;shell&quot;,&quot;nodeId&quot;:&quot;196f748d-1eca-4da7-9d7b-18c9f0b8b320&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-shell">curl -fsSL https://openclaw.ai/install.sh | bash</code></pre></div><p>Then set it up by running:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;shell&quot;,&quot;nodeId&quot;:&quot;b699bcf6-95d0-4755-a4d3-39cbe4f285f1&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-shell">openclaw onboard --install-daemon</code></pre></div><p>This will prompt you to configure your model. 
The <code>--install-daemon</code> flag sets up OpenClaw as a background service, so it runs automatically whenever your device starts.</p><p>Once setup is complete, you can confirm everything is running with:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;shell&quot;,&quot;nodeId&quot;:&quot;1bf0a62c-431f-422b-beac-2877598e79b7&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-shell">openclaw gateway status</code></pre></div><p>For other installation methods, refer to the official OpenClaw <a href="https://docs.openclaw.ai/install">installation guide</a>.</p><h3>Setting up a channel for communication</h3><p>Once OpenClaw is installed, the next step is to set up the channel of communication. This is the chat app you will use to communicate with OpenClaw.</p><p>All channels support communication via text, but since we are building a voice agent, we want one that supports other media types, such as audio. Telegram is perfect for this: it offers the easiest setup of all the channels, and you can send voice notes to OpenClaw through it.</p><p>Follow the <a href="https://docs.openclaw.ai/channels/telegram">Telegram setup guide</a> in the OpenClaw documentation.</p><div class="pullquote"><p>Need creative, high-quality technical content? Happy to chat! Book a call with our Creative Engineers&#128071;&#127998;</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://calendar.app.google/s6ekF2rDbPjKWLLQ8&quot;,&quot;text&quot;:&quot;Let&#8217;s Chat! &#128222;&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://calendar.app.google/s6ekF2rDbPjKWLLQ8"><span>Let&#8217;s Chat! 
&#128222;</span></a></p></div><h2>OpenClaw&#8217;s media understanding capabilities</h2><p>OpenClaw&#8217;s <a href="https://docs.openclaw.ai/nodes/media-understanding">media understanding</a> capabilities allow it to process more than just text. When it receives a media file, such as an image or audio, it can use one of its model providers to transform it into a format the agent can understand.</p><p>For example, if OpenClaw receives a voice note from a channel like Telegram, it will use a speech-to-text (<a href="https://www.assemblyai.com/blog/speech-to-text-ai-a-complete-guide-to-modern-speech-recognition-technology">STT</a>) model to convert the audio into text before passing it to the LLM. Similarly, if it receives an image, it can summarize the content using an image model and send that information to the agent. In this article, we are focusing on the audio understanding aspect of OpenClaw.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!nXkq!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0491c4f7-1b5d-48d1-b7e1-603cd3c0b6da_1600x1267.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!nXkq!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0491c4f7-1b5d-48d1-b7e1-603cd3c0b6da_1600x1267.png 424w, https://substackcdn.com/image/fetch/$s_!nXkq!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0491c4f7-1b5d-48d1-b7e1-603cd3c0b6da_1600x1267.png 848w, 
https://substackcdn.com/image/fetch/$s_!nXkq!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0491c4f7-1b5d-48d1-b7e1-603cd3c0b6da_1600x1267.png 1272w, https://substackcdn.com/image/fetch/$s_!nXkq!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0491c4f7-1b5d-48d1-b7e1-603cd3c0b6da_1600x1267.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!nXkq!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0491c4f7-1b5d-48d1-b7e1-603cd3c0b6da_1600x1267.png" width="1456" height="1153" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0491c4f7-1b5d-48d1-b7e1-603cd3c0b6da_1600x1267.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1153,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!nXkq!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0491c4f7-1b5d-48d1-b7e1-603cd3c0b6da_1600x1267.png 424w, https://substackcdn.com/image/fetch/$s_!nXkq!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0491c4f7-1b5d-48d1-b7e1-603cd3c0b6da_1600x1267.png 848w, 
https://substackcdn.com/image/fetch/$s_!nXkq!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0491c4f7-1b5d-48d1-b7e1-603cd3c0b6da_1600x1267.png 1272w, https://substackcdn.com/image/fetch/$s_!nXkq!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0491c4f7-1b5d-48d1-b7e1-603cd3c0b6da_1600x1267.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">OpenClaw&#8217;s Media Understanding Using a Default Transcription Model</figcaption></figure></div><p>By default, 
OpenClaw supports a limited set of STT providers, including OpenAI, Mistral Voxtral, and Deepgram. In this guide, we&#8217;ll go a step further by integrating a custom STT model, extending OpenClaw beyond its built-in options.</p><div class="digest-post-embed" data-attrs="{&quot;nodeId&quot;:&quot;9a803284-dea9-4f87-86d6-7d0cc4fa32b5&quot;,&quot;caption&quot;:&quot;In the courtroom, every word matters. Transcripts form the official record of legal proceedings, capturing exactly what was said, by whom, and when.&quot;,&quot;cta&quot;:&quot;Read full story&quot;,&quot;showBylines&quot;:true,&quot;size&quot;:&quot;sm&quot;,&quot;isEditorNode&quot;:true,&quot;title&quot;:&quot;Generating Court Transcriptions with Deepgram's Nova-3&quot;,&quot;publishedBylines&quot;:[{&quot;id&quot;:49908626,&quot;name&quot;:&quot;Eteimorde Youdiowei&quot;,&quot;bio&quot;:&quot;Simplifying complex ideas is my passion.....&quot;,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/90f6ea8f-0227-42b7-8c09-47e819b7f743_661x661.jpeg&quot;,&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:null},{&quot;id&quot;:280510396,&quot;name&quot;:&quot;Neurl Creators&quot;,&quot;bio&quot;:null,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/1b0dc913-49dd-4e59-ac39-bbbf289e1744_256x256.png&quot;,&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:null}],&quot;post_date&quot;:&quot;2025-08-09T17:00:31.591Z&quot;,&quot;cover_image&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/fc818fb3-8875-4913-a2fb-c97d2a6bfe42_1456x1048.png&quot;,&quot;cover_image_alt&quot;:null,&quot;canonical_url&quot;:&quot;https://neurlcreators.substack.com/p/generating-court-transcriptions-with&quot;,&quot;section_name&quot;:&quot;&#9881;&#65039; BuildAIers&#8217; Toolkit 
&#9881;&#65039;&quot;,&quot;video_upload_id&quot;:null,&quot;id&quot;:170541739,&quot;type&quot;:&quot;newsletter&quot;,&quot;reaction_count&quot;:1,&quot;comment_count&quot;:0,&quot;publication_id&quot;:3228552,&quot;publication_name&quot;:&quot;The AI Blueprint: Practical Content for AI Builders &quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!6udc!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff5b15961-5020-4a71-b040-30f9b3d3f232_256x256.png&quot;,&quot;belowTheFold&quot;:true,&quot;youtube_url&quot;:null,&quot;show_links&quot;:null,&quot;feed_url&quot;:null}"></div><h2>Bring your own Speech To Text model</h2><p>There are several ways to extend OpenClaw&#8217;s capabilities, one of which is via <a href="https://docs.openclaw.ai/plugins/building-plugins">plugins</a>. While writing a plugin to perform media understanding is possible, it is often overkill. OpenClaw already provides a built-in way to extend media understanding using a custom script.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!GPcZ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F38195adf-3fe1-4214-a571-877fcb73c00b_1600x1267.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!GPcZ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F38195adf-3fe1-4214-a571-877fcb73c00b_1600x1267.png 424w, https://substackcdn.com/image/fetch/$s_!GPcZ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F38195adf-3fe1-4214-a571-877fcb73c00b_1600x1267.png 848w, 
https://substackcdn.com/image/fetch/$s_!GPcZ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F38195adf-3fe1-4214-a571-877fcb73c00b_1600x1267.png 1272w, https://substackcdn.com/image/fetch/$s_!GPcZ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F38195adf-3fe1-4214-a571-877fcb73c00b_1600x1267.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!GPcZ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F38195adf-3fe1-4214-a571-877fcb73c00b_1600x1267.png" width="1456" height="1153" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/38195adf-3fe1-4214-a571-877fcb73c00b_1600x1267.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1153,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!GPcZ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F38195adf-3fe1-4214-a571-877fcb73c00b_1600x1267.png 424w, https://substackcdn.com/image/fetch/$s_!GPcZ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F38195adf-3fe1-4214-a571-877fcb73c00b_1600x1267.png 848w, 
https://substackcdn.com/image/fetch/$s_!GPcZ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F38195adf-3fe1-4214-a571-877fcb73c00b_1600x1267.png 1272w, https://substackcdn.com/image/fetch/$s_!GPcZ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F38195adf-3fe1-4214-a571-877fcb73c00b_1600x1267.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">OpenClaw&#8217;s Media Understanding Using a custom script</figcaption></figure></div><p>With a custom script, you 
simply tell OpenClaw that whenever it receives an audio file, it should run the script. The script processes the audio and returns the transcribed text. OpenClaw handles the integration plumbing of invoking the script and collecting its output. You just need to write the script and configure <code>openclaw.json</code>.</p><p>Since we get to write the script, we can choose any STT model provider. In this guide, we will use <a href="https://www.assemblyai.com/">AssemblyAI</a>.</p><h3><strong>Step 1: Set Up Your Environment</strong></h3><p>It is best to create a dedicated Python environment first. 
Then, install the AssemblyAI SDK:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;shell&quot;,&quot;nodeId&quot;:&quot;89640d6f-7fa1-4764-a73a-c26be54e10b2&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-shell">pip install assemblyai</code></pre></div><p>Next, create an AssemblyAI API key and store it in an environment variable:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;shell&quot;,&quot;nodeId&quot;:&quot;05ea3123-388c-42cd-8416-05bc604c4b11&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-shell">export ASSEMBLYAI_API_KEY="your_api_key_here"</code></pre></div><p>For global access, it is recommended to add this line to your <code>.bashrc</code> or <code>.zshrc</code> file.</p><h3><strong>Step 2: Create the Transcription Script</strong></h3><p>Create a Python file called main.py and add the following:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;python&quot;,&quot;nodeId&quot;:&quot;a2139f0d-7b35-4eca-b172-810ec2f99bbb&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-python">import argparse
import os
import sys
import assemblyai as aai


def main():
    # 1. Set up the argument parser
    parser = argparse.ArgumentParser(
        description="Transcribe an audio file using AssemblyAI."
    )
    
    # Add positional argument for the audio file path
    parser.add_argument(
        "audio_file", 
        type=str, 
        help="Path to the audio file you want to transcribe (e.g., ./voice_note.ogg)"
    )
    
    # Add optional argument for the API key
    parser.add_argument(
        "--api-key", 
        type=str, 
        help="Your AssemblyAI API key (can also be set via ASSEMBLYAI_API_KEY env variable)",
        default=None
    )

    args = parser.parse_args()

    # 2. Configure API Key
    api_key = args.api_key or os.environ.get("ASSEMBLYAI_API_KEY")
    if not api_key:
        # Keep error messages on stderr so they never mix with the
        # transcript that the script emits on stdout.
        print("Error: API key is missing.", file=sys.stderr)
        print("Please set the ASSEMBLYAI_API_KEY environment variable or pass it via --api-key.", file=sys.stderr)
        sys.exit(1)
        
    aai.settings.api_key = api_key

    # 3. Configure and run the transcription
    config = aai.TranscriptionConfig(
        speech_models=["universal-3-pro"],
        language_detection=True,
        prompt="Transcribe the audio. Make sure to include fillers and stutters in the transcript.",
    )

    # Status messages go to stderr; stdout is reserved for the transcript.
    print(f"Transcribing '{args.audio_file}'... Please wait.", file=sys.stderr)

    try:
        transcript = aai.Transcriber(config=config).transcribe(args.audio_file)

        if transcript.status == "error":
            raise RuntimeError(f"Transcription failed: {transcript.error}")

        # Print only the transcript to stdout. OpenClaw captures the
        # script's output as the transcription, so decorative markers
        # would end up inside the text the agent receives.
        print(transcript.text)

    except Exception as e:
        print(f"\nAn error occurred: {e}", file=sys.stderr)
        sys.exit(1)

if __name__ == "__main__":
    main()
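
# Example invocation, mirroring how OpenClaw's "cli" media model calls the
# script with {{MediaPath}} substituted (the audio file path here is a
# hypothetical illustration):
#
#   python main.py ./voice_note.ogg
#   python main.py ./voice_note.ogg --api-key "your_api_key_here"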
</code></pre></div><p>This script takes the path to an audio file, transcribes it using AssemblyAI, and prints the result.</p><h3><strong>Step 3: Configure OpenClaw</strong></h3><p>Next, integrate the script with OpenClaw by editing <code>openclaw.json</code>:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;json&quot;,&quot;nodeId&quot;:&quot;d50e9c56-d972-4026-8136-fb7c99743209&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-json">  "tools":{
    "media": {
      "audio": {
        "enabled": true,
        "models": [
          {
            "type": "cli",
            "command": "python",
            "args": ["/PATH/TO/SCRIPT/main.py", "{{MediaPath}}"]
          }
        ]
      }
    }
  }</code></pre></div><p>This configuration tells OpenClaw to enable audio understanding, but instead of using a default provider, it will run your custom script.</p><blockquote><p>Tip: If you are using a Python virtual environment, set the <code>command</code> to the full path of your environment&#8217;s Python binary. With the environment activated, you can find it with:</p></blockquote><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;shell&quot;,&quot;nodeId&quot;:&quot;35c2bb96-164b-4fdd-994b-0f48ce66fd98&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-shell">which python</code></pre></div><h3><strong>Step 4: Restart OpenClaw</strong></h3><p>After setup, restart OpenClaw to apply the changes:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;shell&quot;,&quot;nodeId&quot;:&quot;3f5c7624-bf1e-406c-b742-111d29ec3d5b&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-shell">openclaw daemon restart</code></pre></div><p>With this setup, you now have full control over OpenClaw&#8217;s audio understanding capabilities.</p><blockquote><p>You can find the complete implementation in the <a href="https://github.com/Neurl-LLC/OpenClaw_AssemblyAI_Voice_AI_Agent">GitHub repository</a>.</p></blockquote><h2>Why Use Universal-3 Pro with OpenClaw?</h2><p>A common question when working with OpenClaw&#8217;s media capabilities is: <em>why switch to a </em>different <em>STT model?</em> After all, don&#8217;t all STT models just convert speech to text?</p><p>The answer is no. 
Different STT models have different strengths and trade-offs, for example:</p><ul><li><p><strong>Speed:</strong> Some models prioritize fast transcription, making them suitable for real-time applications.</p></li><li><p><strong>Accuracy (<a href="https://www.assemblyai.com/blog/word-error-rate">WER</a>):</strong> Others focus on achieving a low Word Error Rate, improving transcription quality.</p></li><li><p><strong>Domain specialization:</strong> Certain models are optimized for specific areas such as medicine, legal, or customer support.</p></li><li><p><strong>Customization:</strong> Some models allow fine-tuning or prompting to handle unique names, jargon, or phrases.</p></li><li><p><strong>Deployment preference:</strong> Developers may prefer local models for privacy, control, or cost reasons.</p></li></ul><p>This flexibility allows you to create a voice AI agent that is tailored to your specific use case, rather than relying solely on OpenClaw&#8217;s built-in media understanding.</p><p>In this article, we use <strong>AssemblyAI&#8217;s Universal-3 Pro</strong> because of its powerful <strong><a href="https://www.assemblyai.com/docs/pre-recorded-audio/universal-3-pro/prompting">prompting capabilities</a></strong>. For example, my name is <em>Eteimorde</em>. It is not an English name and rarely appears in standard datasets. </p><p>While building my personal voice AI agent with OpenClaw, I noticed that default STT models consistently misheard my name. To solve this, I used Universal-3 Pro&#8217;s <strong><a href="https://www.assemblyai.com/docs/keyterms-prompting">keyterm prompting</a></strong> feature to explicitly define my name as an important term:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;python&quot;,&quot;nodeId&quot;:&quot;a079f462-1dfe-46e5-86cf-a161745fa5d8&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-python">config = aai.TranscriptionConfig(
    speech_models=["universal-3-pro"], 
    language_detection=True, 
    keyterms_prompt=["Eteimorde"]
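    # keyterms_prompt accepts a list, so several terms can be supplied at
    # once. The extra entries below are hypothetical examples, not from
    # the article:
    # keyterms_prompt=["Eteimorde", "OpenClaw", "AssemblyAI"]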
)</code></pre></div><div class="digest-post-embed" data-attrs="{&quot;nodeId&quot;:&quot;6ee43a20-88d6-41e1-b9fe-9583b22f792c&quot;,&quot;caption&quot;:&quot;For a while, I had been experimenting with real-time speech-to-text (STT), mostly through Deepgram&#8217;s API. I built a real-time transcription React app powered by Deepgram&#8217;s Nova 3. It worked well, but at some point, curiosity kicked in. I started asking myself what actually happens under the hood of a real-time STT service.&quot;,&quot;cta&quot;:&quot;Read full story&quot;,&quot;showBylines&quot;:true,&quot;size&quot;:&quot;sm&quot;,&quot;isEditorNode&quot;:true,&quot;title&quot;:&quot;Mini-App: Build a Real-Time Speech-to-Text (STT) API with OpenAI&#8217;s Whisper&quot;,&quot;publishedBylines&quot;:[{&quot;id&quot;:49908626,&quot;name&quot;:&quot;Eteimorde Youdiowei&quot;,&quot;bio&quot;:&quot;Simplifying complex ideas is my passion.....&quot;,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/90f6ea8f-0227-42b7-8c09-47e819b7f743_661x661.jpeg&quot;,&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:null},{&quot;id&quot;:280510396,&quot;name&quot;:&quot;Neurl Creators&quot;,&quot;bio&quot;:null,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/1b0dc913-49dd-4e59-ac39-bbbf289e1744_256x256.png&quot;,&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:null}],&quot;post_date&quot;:&quot;2025-12-18T11:20:23.354Z&quot;,&quot;cover_image&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/2e14f5ea-dcab-44c1-a3f2-98c00ac2358d_1456x1048.png&quot;,&quot;cover_image_alt&quot;:null,&quot;canonical_url&quot;:&quot;https://neurlcreators.substack.com/p/how-do-you-build-a-real-time-speech&quot;,&quot;section_name&quot;:&quot;&#9881;&#65039; BuildAIers&#8217; Toolkit 
&#9881;&#65039;&quot;,&quot;video_upload_id&quot;:null,&quot;id&quot;:181768689,&quot;type&quot;:&quot;newsletter&quot;,&quot;reaction_count&quot;:4,&quot;comment_count&quot;:0,&quot;publication_id&quot;:3228552,&quot;publication_name&quot;:&quot;The AI Blueprint: Practical Content for AI Builders &quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!6udc!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff5b15961-5020-4a71-b040-30f9b3d3f232_256x256.png&quot;,&quot;belowTheFold&quot;:true,&quot;youtube_url&quot;:null,&quot;show_links&quot;:null,&quot;feed_url&quot;:null}"></div><h3><strong>Additional Capabilities of Universal-3 Pro via Prompting</strong></h3><p>Universal-3 Pro provides advanced features that can be easily leveraged through <strong>prompting</strong>. You can customize the behavior of the model by updating the <code>prompt</code> in the transcription configuration:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;python&quot;,&quot;nodeId&quot;:&quot;9329e59d-d642-4ddb-9b14-b54765eef46f&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-python">config = aai.TranscriptionConfig(
    speech_models=["universal-3-pro"], 
    language_detection=True, 
    prompt="YOUR_PROMPT_GOES_HERE"
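    # A few hypothetical prompts you might substitute for the placeholder
    # above, matching the capabilities this section describes:
    #   "Transcribe verbatim, keeping filler words and self-corrections."
    #   "Tag non-speech events such as [laughter] or [applause]."
    #   "Write all numbers and measurements as digits."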
)</code></pre></div><p>Using prompting, the model can perform the following tasks:</p><ol><li><p><strong>Verbatim transcription and disfluencies<br></strong> Preserve natural speech patterns such as filler words, repetitions, and self-corrections.</p></li><li><p><strong>Audio event tagging<br></strong> Mark non-speech sounds like laughter, music, applause, or background noise.</p></li><li><p><strong>Crosstalk labeling<br></strong> Identify overlapping speech, interruptions, and multiple speakers talking at once.</p></li><li><p><strong>Numbers and measurements formatting<br></strong> Control how numbers, percentages, and measurements are represented.</p></li><li><p><strong>Context-aware clues<br></strong> Improve transcription for domain-specific terms, names, and jargon by providing relevant hints in the prompt.</p></li><li><p><strong>Speaker attribution<br></strong> Detect and label different speakers in a conversation.</p></li><li><p><strong>PII redaction<br></strong> Tag personal identifiable information such as names, addresses, and contact details, useful for limiting what the agent can access.</p></li></ol><p>By using <strong>prompting</strong>, these capabilities allow your OpenClaw voice agent to become <strong>more accurate, context-aware, and personalized</strong>, going beyond the default transcription behavior.</p><div class="pullquote"><p>Join the conversation and share your experiences in the comments below!</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://neurlcreators.substack.com/p/building-a-voice-ai-agent-with-openclaw/comments&quot;,&quot;text&quot;:&quot;Leave a comment&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://neurlcreators.substack.com/p/building-a-voice-ai-agent-with-openclaw/comments"><span>Leave a comment</span></a></p></div><h2>Conclusion</h2><p>OpenClaw makes it easy to run AI agents through chat apps you already use, and 
adding voice capabilities takes the interaction to a whole new level. By integrating your own speech-to-text models, such as <strong>Universal-3 Pro</strong>, you unlock features beyond OpenClaw&#8217;s built-in media understanding. </p><p>Its prompting capabilities allow users to customize how the model transcribes audio, accurately recognize custom keyterms, and leverage features like <strong>verbatim transcription </strong>to preserve natural speech and <strong>audio event tagging</strong> to capture non-speech context such as background noise or laughter.</p><p>With this setup, your OpenClaw agent behaves more like a true personal assistant. It can remember context, send proactive reminders, and leverage system tools to perform tasks. Voice interaction, combined with Universal-3 Pro&#8217;s advanced prompting features, transforms the agent from a simple chat companion into a more robust, seamless, and highly personalized experience.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://neurlcreators.substack.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption"><em>Subscribe to <a href="https://neurlcreators.substack.com/">The Neural Blueprint</a>&#128071;&#127998; for hands-on guides! 
&#129761; Follow us on <a href="http://www.youtube.com/@neurlcreators">YouTube</a>, <a href="https://x.com/NeurlCreators">X</a>, and <a href="https://www.linkedin.com/showcase/neurl-creators/">LinkedIn</a>.</em></p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><p></p>]]></content:encoded></item><item><title><![CDATA[FORM Your Images With JSON Prompting]]></title><description><![CDATA[Give JSON Prompting a Better Interface with Forms]]></description><link>https://neurlcreators.substack.com/p/form-your-images-with-json-prompting</link><guid isPermaLink="false">https://neurlcreators.substack.com/p/form-your-images-with-json-prompting</guid><dc:creator><![CDATA[Neurl Creators]]></dc:creator><pubDate>Thu, 26 Feb 2026 11:30:51 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/e766fdad-30c1-4724-9cb8-b883bf5d02d6_1456x1048.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>When working with an image generation model, the most common approach is to prompt it with plain text. This method is simple, straightforward, and natural: provide a description, and get an image in return. But is this really the best way to work with image generation models?</p><p>Instead of relying on free-form text, would it be better to structure the input before passing it to the model?</p><p>That&#8217;s where JSON prompting comes in. With JSON prompting, all prompts provided to the image generation model are structured. 
So instead of writing:</p><p>&#8220;A person walking down the beach during the sunset&#8221;</p><p>You would format the prompt like this:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;json&quot;,&quot;nodeId&quot;:&quot;44884c31-b964-431d-8d25-723b258122a8&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-json">{ "subject": "person", 
  "action": "walking along the shore", 
  "environment": "beach", 
  "time_of_day": "sunset" 
}</code></pre></div><p>Although JSON prompting is gaining prominence in the image generation space, it does come with drawbacks. One of the main challenges is complexity. Outside the developer community, relatively few people are comfortable working directly with JSON. It introduces structure, but it also introduces friction.</p><p>However, there is one structured format that almost everyone is familiar with: forms.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!tZUL!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60b8e77e-d3ac-4745-9c27-246d330e921b_1600x863.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!tZUL!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60b8e77e-d3ac-4745-9c27-246d330e921b_1600x863.png 424w, https://substackcdn.com/image/fetch/$s_!tZUL!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60b8e77e-d3ac-4745-9c27-246d330e921b_1600x863.png 848w, https://substackcdn.com/image/fetch/$s_!tZUL!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60b8e77e-d3ac-4745-9c27-246d330e921b_1600x863.png 1272w, https://substackcdn.com/image/fetch/$s_!tZUL!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60b8e77e-d3ac-4745-9c27-246d330e921b_1600x863.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!tZUL!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60b8e77e-d3ac-4745-9c27-246d330e921b_1600x863.png" width="1456" height="785" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/60b8e77e-d3ac-4745-9c27-246d330e921b_1600x863.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:785,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!tZUL!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60b8e77e-d3ac-4745-9c27-246d330e921b_1600x863.png 424w, https://substackcdn.com/image/fetch/$s_!tZUL!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60b8e77e-d3ac-4745-9c27-246d330e921b_1600x863.png 848w, https://substackcdn.com/image/fetch/$s_!tZUL!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60b8e77e-d3ac-4745-9c27-246d330e921b_1600x863.png 1272w, https://substackcdn.com/image/fetch/$s_!tZUL!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60b8e77e-d3ac-4745-9c27-246d330e921b_1600x863.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft 
icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Using form to generate an image</figcaption></figure></div><p>In this article, we take JSON prompting one step further by showing how users can literally form their images. Instead of writing raw JSON, users interact with structured inputs behind the scenes. By the end, you will see whether structured prompting is truly necessary and whether it is something you should incorporate into your own image generation workflows.</p><h2>Structured vs Unstructured Text in Image Generation</h2><p>One of the major breakthroughs in image generation came when models gained the ability to generate images directly from text prompts. 
That moment marked the real beginning of modern text-to-image AI, bringing natural language understanding into the visual domain.</p><p>Today&#8217;s image generation models, such as Nano Banana, can produce accurate images even when prompts contain misspellings or loosely written descriptions. Within this space, prompting approaches can generally be divided into two categories:</p><ul><li><p>Unstructured Text for Image Generation</p></li><li><p>Structured Text for Image Generation</p></li></ul><h3>Unstructured Text for Image Generation</h3><p>Unstructured text is the natural way humans communicate. When we express intent, we do not usually follow a rigid format. We describe things as they come to mind. This is one of the main reasons it remains the default method for working with image generation models. Anyone, regardless of technical background, can type a sentence and generate an image.</p><p>Unstructured text is not only natural for humans, it is also the default format used during model training. Image generation models are typically trained on large datasets consisting of image&#8211;caption pairs. Most captions found across the internet are written in free-form, unstructured language.</p><p>Because of this, many image generation models perform exceptionally well with unstructured text. That is the format they were primarily trained on. The majority of large-scale datasets contain images paired with natural language descriptions, not structured schemas.</p><h3>Structured Text for Image Generation</h3><p>While unstructured text is natural for both humans and machines, structured text has been gaining prominence in the image generation space. This has to do with the fact that most image generation models are paired with a language model. 
This gives them <a href="https://medium.com/@liechticonsulting/native-image-generation-using-llms-219bdc8d428c">native image generation</a> capabilities.</p><p>Conceptually, the language model is first given a prompt, which it converts into image tokens that are then used to generate the image. This means that image generation models carry the world knowledge of LLMs and apply it when generating images.</p><p>Since LLMs have a strong understanding of structured formats like JSON, they can easily interpret intent when generating images, regardless of the format in which the prompt is provided. This is where <strong>JSON prompting</strong> comes in.</p><p>By structuring prompts in formats such as JSON, we can guide the model more explicitly while still leveraging the native image generation capabilities powered by the underlying language model.</p><p>Researchers are also exploring the use of structured captions paired with images to further improve the generation capabilities of open-source image models. For example, the paper <em><a href="https://arxiv.org/abs/2511.06876">Generating an Image From 1,000 Words: Enhancing Text-to-Image With Structured Captions</a></em> introduced <a href="https://huggingface.co/briaai/FIBO">FIBO</a>, a model trained on structured text.</p><div class="digest-post-embed" data-attrs="{&quot;nodeId&quot;:&quot;06f7a16c-7db2-4e18-be34-04f30e367085&quot;,&quot;caption&quot;:&quot;When GPT-4o Image launched, it took the world by storm. Within just a week, users had generated over 700 million images. Social media was flooded with Studio Ghibli-style portraits, and everyone seemed to be trying it out. In contrast, the release of Google&#8217;s Nano Banana felt almost secretive. 
Yet, despite its quiet debut, its capabilities quickly made &#8230;&quot;,&quot;cta&quot;:&quot;Read full story&quot;,&quot;showBylines&quot;:true,&quot;size&quot;:&quot;md&quot;,&quot;isEditorNode&quot;:true,&quot;title&quot;:&quot;How Nano Banana Compares to GPT-4o image&quot;,&quot;publishedBylines&quot;:[{&quot;id&quot;:49908626,&quot;name&quot;:&quot;Eteimorde Youdiowei&quot;,&quot;bio&quot;:&quot;Simplifying complex ideas is my passion.....&quot;,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/90f6ea8f-0227-42b7-8c09-47e819b7f743_661x661.jpeg&quot;,&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:null},{&quot;id&quot;:280510396,&quot;name&quot;:&quot;Neurl Creators&quot;,&quot;bio&quot;:null,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/1b0dc913-49dd-4e59-ac39-bbbf289e1744_256x256.png&quot;,&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:null}],&quot;post_date&quot;:&quot;2025-09-12T10:00:22.019Z&quot;,&quot;cover_image&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/346d4034-d31a-4daa-bd3e-392e69da49e4_1456x1048.png&quot;,&quot;cover_image_alt&quot;:null,&quot;canonical_url&quot;:&quot;https://neurlcreators.substack.com/p/how-nano-banana-compares-to-gpt-4o&quot;,&quot;section_name&quot;:&quot;&#9881;&#65039; BuildAIers&#8217; Toolkit &#9881;&#65039;&quot;,&quot;video_upload_id&quot;:null,&quot;id&quot;:173385653,&quot;type&quot;:&quot;newsletter&quot;,&quot;reaction_count&quot;:1,&quot;comment_count&quot;:0,&quot;publication_id&quot;:3228552,&quot;publication_name&quot;:&quot;The Neural Blueprint: Practical Content for AI Builders 
&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!6udc!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff5b15961-5020-4a71-b040-30f9b3d3f232_256x256.png&quot;,&quot;belowTheFold&quot;:true,&quot;youtube_url&quot;:null,&quot;show_links&quot;:null,&quot;feed_url&quot;:null}"></div><p></p><h2>Why JSON Prompting?</h2><p>JSON prompting can sound like extra work. Instead of writing a simple sentence, you now have to think in terms of fields, attributes, and structure. So why bother?</p><p>Because it offers several practical advantages:</p><ul><li><p>Detailed image creation</p></li><li><p>Image consistency</p></li><li><p>Image recreation</p></li></ul><h3><strong>Detailed Image Creation</strong></h3><p>JSON prompting makes it possible to generate highly detailed images in a controlled way. Instead of relying on a single descriptive paragraph, you can explicitly define elements such as:</p><ul><li><p>Subject</p></li><li><p>Lighting</p></li><li><p>Camera angle</p></li><li><p>Style</p></li><li><p>Environment</p></li><li><p>Mood</p></li><li><p>Composition</p></li></ul><p>By structuring these components, you reduce ambiguity. Free-form text can be vague and often relies on unstated assumptions. Structured prompts force clarity. Each aspect of the image is intentionally defined, which makes it easier to produce rich and precise outputs.</p><h3>Image Consistency</h3><p>With JSON prompting, maintaining consistency across multiple generated images becomes much easier. 
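As a minimal sketch (the field names here are illustrative, not a schema any particular model requires), those components can be kept in one structured object and serialized only when you generate:

```python
import json

# A reusable structured prompt: every attribute is explicit,
# so nothing is left to the model's unstated assumptions.
base_prompt = {
    "subject": "a red fox sitting on a mossy rock",
    "environment": "misty pine forest at dawn",
    "lighting": "soft golden backlight",
    "camera": {"angle": "low angle", "focal_length": "85mm"},
    "style": "photorealistic",
    "mood": "serene",
}

# Serialize to JSON text only at generation time.
prompt_text = json.dumps(base_prompt, indent=2)
print(prompt_text)
```

Keeping the object, rather than the serialized string, as the source of truth is what makes this kind of reuse cheap. 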
Because the structure is explicit and reusable, you can:</p><ul><li><p>Keep the same character traits across generations</p></li><li><p>Maintain consistent lighting or camera settings</p></li><li><p>Reuse stylistic attributes</p></li><li><p>Change only specific fields, such as the environment or time of day</p></li></ul><p>For example, you can keep a character&#8217;s appearance identical while swapping out different scenes. Or you can maintain the same scene while experimenting with different lighting conditions. The structure gives you control over what changes and what stays the same.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ABV9!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb1e25f7d-803a-4fd2-b498-5595a668010a_1600x900.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ABV9!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb1e25f7d-803a-4fd2-b498-5595a668010a_1600x900.png 424w, https://substackcdn.com/image/fetch/$s_!ABV9!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb1e25f7d-803a-4fd2-b498-5595a668010a_1600x900.png 848w, https://substackcdn.com/image/fetch/$s_!ABV9!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb1e25f7d-803a-4fd2-b498-5595a668010a_1600x900.png 1272w, https://substackcdn.com/image/fetch/$s_!ABV9!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb1e25f7d-803a-4fd2-b498-5595a668010a_1600x900.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!ABV9!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb1e25f7d-803a-4fd2-b498-5595a668010a_1600x900.png" width="1456" height="819" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b1e25f7d-803a-4fd2-b498-5595a668010a_1600x900.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:819,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!ABV9!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb1e25f7d-803a-4fd2-b498-5595a668010a_1600x900.png 424w, https://substackcdn.com/image/fetch/$s_!ABV9!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb1e25f7d-803a-4fd2-b498-5595a668010a_1600x900.png 848w, https://substackcdn.com/image/fetch/$s_!ABV9!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb1e25f7d-803a-4fd2-b498-5595a668010a_1600x900.png 1272w, https://substackcdn.com/image/fetch/$s_!ABV9!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb1e25f7d-803a-4fd2-b498-5595a668010a_1600x900.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft 
icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">JSON prompted image from both Nano Banana and GPT-Image [<a href="https://www.fofr.ai/prompting-with-json">Source</a>]</figcaption></figure></div><p>This consistency can also extend across models that support JSON prompting. A well-defined structured prompt can be reused with different image generation systems, helping you achieve similar outputs without rewriting everything from scratch.</p><div class="pullquote"><p><em>Need creative, high-quality technical content? Happy to chat! Book a call with our Creative Engineers&#128071;&#127998;</em></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://calendar.app.google/s6ekF2rDbPjKWLLQ8&quot;,&quot;text&quot;:&quot;Let&#8217;s Chat! 
&#128222;&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://calendar.app.google/s6ekF2rDbPjKWLLQ8"><span>Let&#8217;s Chat! &#128222;</span></a></p></div><h3>Image Recreation</h3><p>JSON prompting also enables image recreation. Given an existing image, a vision model can generate a detailed JSON description of that image, capturing elements such as:</p><ul><li><p>Objects in the scene</p></li><li><p>Their positions</p></li><li><p>Lighting conditions</p></li><li><p>Camera perspective</p></li><li><p>Style and composition</p></li></ul><p>This structured representation preserves the essential characteristics of the original image.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!2mj1!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe7ccb687-ff0b-4599-85f3-adcf5b38b35a_1600x900.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!2mj1!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe7ccb687-ff0b-4599-85f3-adcf5b38b35a_1600x900.png 424w, https://substackcdn.com/image/fetch/$s_!2mj1!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe7ccb687-ff0b-4599-85f3-adcf5b38b35a_1600x900.png 848w, https://substackcdn.com/image/fetch/$s_!2mj1!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe7ccb687-ff0b-4599-85f3-adcf5b38b35a_1600x900.png 1272w, 
https://substackcdn.com/image/fetch/$s_!2mj1!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe7ccb687-ff0b-4599-85f3-adcf5b38b35a_1600x900.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!2mj1!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe7ccb687-ff0b-4599-85f3-adcf5b38b35a_1600x900.png" width="1456" height="819" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e7ccb687-ff0b-4599-85f3-adcf5b38b35a_1600x900.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:819,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!2mj1!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe7ccb687-ff0b-4599-85f3-adcf5b38b35a_1600x900.png 424w, https://substackcdn.com/image/fetch/$s_!2mj1!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe7ccb687-ff0b-4599-85f3-adcf5b38b35a_1600x900.png 848w, https://substackcdn.com/image/fetch/$s_!2mj1!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe7ccb687-ff0b-4599-85f3-adcf5b38b35a_1600x900.png 1272w, 
https://substackcdn.com/image/fetch/$s_!2mj1!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe7ccb687-ff0b-4599-85f3-adcf5b38b35a_1600x900.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Image Recreation with JSON prompting [<a href="https://www.fofr.ai/prompting-with-json">Source</a>] </figcaption></figure></div><p>The generated JSON prompt can then be passed to an image generation model to reproduce the image with a similar level of detail and structure. 
Instead of loosely describing what you see, you are reconstructing it in a format that is both machine-readable and reusable.</p><div class="digest-post-embed" data-attrs="{&quot;nodeId&quot;:&quot;d828f9cd-588f-4631-a75a-1e95a3d610c8&quot;,&quot;caption&quot;:&quot;Your LLM promises JSON, but delivers &#8220;almost-JSON&#8221; &#8212; extra commas, missing fields, wrong types.&quot;,&quot;cta&quot;:&quot;Read full story&quot;,&quot;showBylines&quot;:true,&quot;size&quot;:&quot;sm&quot;,&quot;isEditorNode&quot;:true,&quot;title&quot;:&quot;[Infographics] JSON Contracts: Make Tool Outputs Parse Every Time&quot;,&quot;publishedBylines&quot;:[{&quot;id&quot;:280510396,&quot;name&quot;:&quot;Neurl Creators&quot;,&quot;bio&quot;:null,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/1b0dc913-49dd-4e59-ac39-bbbf289e1744_256x256.png&quot;,&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:null}],&quot;post_date&quot;:&quot;2025-09-24T16:30:37.340Z&quot;,&quot;cover_image&quot;:&quot;https://substackcdn.com/image/fetch/$s_!Ehc2!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff4fd6587-927e-4581-bffc-7a8825639f24_2232x3158.png&quot;,&quot;cover_image_alt&quot;:null,&quot;canonical_url&quot;:&quot;https://neurlcreators.substack.com/p/infographics-json-contracts-make&quot;,&quot;section_name&quot;:&quot;&#128202; Visuals for BuildAIers &#128202;&quot;,&quot;video_upload_id&quot;:null,&quot;id&quot;:174420699,&quot;type&quot;:&quot;newsletter&quot;,&quot;reaction_count&quot;:0,&quot;comment_count&quot;:0,&quot;publication_id&quot;:3228552,&quot;publication_name&quot;:&quot;The Neural Blueprint: Practical Content for AI Builders 
&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!6udc!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff5b15961-5020-4a71-b040-30f9b3d3f232_256x256.png&quot;,&quot;belowTheFold&quot;:true,&quot;youtube_url&quot;:null,&quot;show_links&quot;:null,&quot;feed_url&quot;:null}"></div><h2>Does JSON Prompting Actually Work?</h2><p>One question arises with JSON prompting: does it actually work?</p><p>Given that most image generation models are trained on plain text, is JSON prompting really necessary?</p><p><a href="https://chasejarvis.com/blog/does-json-prompting-actually-work-tested-with-nano-banana/">Critics of JSON</a> prompting have raised several concerns:</p><ul><li><p>It is complex</p></li><li><p>There is no real difference between JSON prompting and writing a detailed plain-text prompt</p></li><li><p>You are adding unnecessary tokens in the form of brackets, quotation marks, and field names</p></li></ul><p>These criticisms are not entirely unreasonable.</p><p>Personally, while using JSON prompting, I encountered many of these same issues. I took a structured JSON prompt and converted it into a carefully written detailed paragraph. 
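The conversion itself is mechanical. As a rough sketch (the field values are made up for illustration), flattening a structured prompt into an equivalent paragraph looks like this:

```python
import json

structured = {
    "subject": "an astronaut drinking coffee",
    "environment": "a retro diner on Mars",
    "lighting": "warm neon glow",
    "style": "1950s retro-futurism",
}

# The prompt as JSON text, exactly as a model would receive it.
json_prompt = json.dumps(structured)

# The same information rewritten as one detailed sentence.
plain_prompt = (
    f"{structured['subject']} in {structured['environment']}, "
    f"lit by a {structured['lighting']}, in a {structured['style']} style."
)
```

Both versions carry identical information, which goes a long way toward explaining the next observation. 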
In many cases, the results were almost identical.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!2zld!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcb95e0d7-de47-4b0e-a8b8-97f030704329_1024x559.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!2zld!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcb95e0d7-de47-4b0e-a8b8-97f030704329_1024x559.png 424w, https://substackcdn.com/image/fetch/$s_!2zld!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcb95e0d7-de47-4b0e-a8b8-97f030704329_1024x559.png 848w, https://substackcdn.com/image/fetch/$s_!2zld!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcb95e0d7-de47-4b0e-a8b8-97f030704329_1024x559.png 1272w, https://substackcdn.com/image/fetch/$s_!2zld!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcb95e0d7-de47-4b0e-a8b8-97f030704329_1024x559.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!2zld!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcb95e0d7-de47-4b0e-a8b8-97f030704329_1024x559.png" width="1024" height="559" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/cb95e0d7-de47-4b0e-a8b8-97f030704329_1024x559.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:559,&quot;width&quot;:1024,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!2zld!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcb95e0d7-de47-4b0e-a8b8-97f030704329_1024x559.png 424w, https://substackcdn.com/image/fetch/$s_!2zld!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcb95e0d7-de47-4b0e-a8b8-97f030704329_1024x559.png 848w, https://substackcdn.com/image/fetch/$s_!2zld!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcb95e0d7-de47-4b0e-a8b8-97f030704329_1024x559.png 1272w, https://substackcdn.com/image/fetch/$s_!2zld!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcb95e0d7-de47-4b0e-a8b8-97f030704329_1024x559.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path 
d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">An image generated from a detailed prompt using Nano Banana</figcaption></figure></div><p>Some argue that JSON prompting creates a kind of placebo effect. And in certain situations, that might be true. But not all the time.</p><p>JSON prompting provides something that plain text does not: structure that is easy to manipulate. It allows you to introduce prompt templates in a clean and organized way. For example, if you want to add a specific camera view to your image, you can simply introduce a &#8220;camera&#8221; field rather than rewriting an entire paragraph.</p><p>You can also swap attributes quickly. This ability to make targeted edits is where JSON prompting really stands out. 
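For example (a sketch with an illustrative schema), producing a night-time variant of a prompt means touching two keys and nothing else:

```python
import copy

day_prompt = {
    "subject": "a vintage sports car",
    "camera": {"angle": "eye level", "focal_length": "35mm"},
    "lighting": "overcast daylight",
}

# A targeted edit: deep-copy the prompt, then change only the
# fields you care about; every other attribute stays identical.
night_prompt = copy.deepcopy(day_prompt)
night_prompt["lighting"] = "neon-lit night scene"
night_prompt["camera"]["angle"] = "low angle"
```

The deep copy keeps the original intact, so both variants remain available for side-by-side comparison. 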
You can search for a specific attribute and modify it without scanning through long blocks of descriptive text.</p><p>You can change the subject, lighting, or environment in seconds, with minimal friction.</p><p>As for the concern about extra tokens from brackets and quotation marks, that limitation can be mitigated by using alternative structured formats such as YAML, which tends to be more compact while preserving structure.</p><p>The biggest issue, however, remains complexity. Writing raw JSON can feel tedious and technical, especially for non-developers. But that problem is not inherent to structured prompting itself. It is an interface problem. And that is where forms come in.</p><h2>Use a Form Instead of JSON</h2><p>Instead of typing JSON manually, users can interact with structured fields through a form-based interface that generates the JSON behind the scenes. 
This preserves the benefits of structured prompting while removing the friction of writing it by hand.</p><p>To explore this idea, I built a simple interface that lets you literally form your images.</p><p>Let&#8217;s walk through how it works.</p><div id="youtube2-yTZ1jjJjUZU" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;yTZ1jjJjUZU&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/yTZ1jjJjUZU?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p>First, the interface allows you to create fields that accept text values. These fields represent different attributes of the image, such as subject, environment, style, or mood.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!dVcP!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f612549-2969-488f-bbf4-bc72b6f2fee7_1600x863.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!dVcP!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f612549-2969-488f-bbf4-bc72b6f2fee7_1600x863.png 424w, https://substackcdn.com/image/fetch/$s_!dVcP!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f612549-2969-488f-bbf4-bc72b6f2fee7_1600x863.png 848w, 
https://substackcdn.com/image/fetch/$s_!dVcP!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f612549-2969-488f-bbf4-bc72b6f2fee7_1600x863.png 1272w, https://substackcdn.com/image/fetch/$s_!dVcP!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f612549-2969-488f-bbf4-bc72b6f2fee7_1600x863.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!dVcP!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f612549-2969-488f-bbf4-bc72b6f2fee7_1600x863.png" width="1456" height="785" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/9f612549-2969-488f-bbf4-bc72b6f2fee7_1600x863.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:785,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!dVcP!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f612549-2969-488f-bbf4-bc72b6f2fee7_1600x863.png 424w, https://substackcdn.com/image/fetch/$s_!dVcP!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f612549-2969-488f-bbf4-bc72b6f2fee7_1600x863.png 848w, 
https://substackcdn.com/image/fetch/$s_!dVcP!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f612549-2969-488f-bbf4-bc72b6f2fee7_1600x863.png 1272w, https://substackcdn.com/image/fetch/$s_!dVcP!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f612549-2969-488f-bbf4-bc72b6f2fee7_1600x863.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Text fields in FORM</figcaption></figure></div><p>You can also create nested fields. 
This is important for building prompts with nested JSON objects, where certain attributes belong inside broader categories. For example, a camera object might contain focal length, angle, and depth of field.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!f9kq!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F78da18fe-fd08-4b3c-be44-5030b8836574_1600x863.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!f9kq!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F78da18fe-fd08-4b3c-be44-5030b8836574_1600x863.png 424w, https://substackcdn.com/image/fetch/$s_!f9kq!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F78da18fe-fd08-4b3c-be44-5030b8836574_1600x863.png 848w, https://substackcdn.com/image/fetch/$s_!f9kq!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F78da18fe-fd08-4b3c-be44-5030b8836574_1600x863.png 1272w, https://substackcdn.com/image/fetch/$s_!f9kq!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F78da18fe-fd08-4b3c-be44-5030b8836574_1600x863.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!f9kq!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F78da18fe-fd08-4b3c-be44-5030b8836574_1600x863.png" width="1456" height="785" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/78da18fe-fd08-4b3c-be44-5030b8836574_1600x863.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:785,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!f9kq!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F78da18fe-fd08-4b3c-be44-5030b8836574_1600x863.png 424w, https://substackcdn.com/image/fetch/$s_!f9kq!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F78da18fe-fd08-4b3c-be44-5030b8836574_1600x863.png 848w, https://substackcdn.com/image/fetch/$s_!f9kq!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F78da18fe-fd08-4b3c-be44-5030b8836574_1600x863.png 1272w, https://substackcdn.com/image/fetch/$s_!f9kq!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F78da18fe-fd08-4b3c-be44-5030b8836574_1600x863.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">Object fields in FORM</figcaption></figure></div><p>There is also a prompt template feature that allows you to load predefined templates, such as camera setups, lighting configurations, environments, and more. 
A user can select one of these templates and merge it directly into their existing prompt structure instead of rebuilding it from scratch.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!P5Je!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feb3deb9f-c78d-4b3c-8775-bce2ba4e594e_1600x830.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!P5Je!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feb3deb9f-c78d-4b3c-8775-bce2ba4e594e_1600x830.png 424w, https://substackcdn.com/image/fetch/$s_!P5Je!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feb3deb9f-c78d-4b3c-8775-bce2ba4e594e_1600x830.png 848w, https://substackcdn.com/image/fetch/$s_!P5Je!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feb3deb9f-c78d-4b3c-8775-bce2ba4e594e_1600x830.png 1272w, https://substackcdn.com/image/fetch/$s_!P5Je!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feb3deb9f-c78d-4b3c-8775-bce2ba4e594e_1600x830.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!P5Je!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feb3deb9f-c78d-4b3c-8775-bce2ba4e594e_1600x830.png" width="1456" height="755" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/eb3deb9f-c78d-4b3c-8775-bce2ba4e594e_1600x830.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:755,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!P5Je!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feb3deb9f-c78d-4b3c-8775-bce2ba4e594e_1600x830.png 424w, https://substackcdn.com/image/fetch/$s_!P5Je!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feb3deb9f-c78d-4b3c-8775-bce2ba4e594e_1600x830.png 848w, https://substackcdn.com/image/fetch/$s_!P5Je!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feb3deb9f-c78d-4b3c-8775-bce2ba4e594e_1600x830.png 1272w, https://substackcdn.com/image/fetch/$s_!P5Je!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feb3deb9f-c78d-4b3c-8775-bce2ba4e594e_1600x830.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">Structured template prompts in FORM</figcaption></figure></div><p>When the user clicks Generate, the form data is converted into JSON and sent to the image generation model. 
If needed, the generated JSON can also be copied and reused in any other image generation interface that supports structured prompting.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!PNJn!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6a8b538c-9a3f-4491-8ac4-dd462682fea0_1600x855.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!PNJn!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6a8b538c-9a3f-4491-8ac4-dd462682fea0_1600x855.png 424w, https://substackcdn.com/image/fetch/$s_!PNJn!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6a8b538c-9a3f-4491-8ac4-dd462682fea0_1600x855.png 848w, https://substackcdn.com/image/fetch/$s_!PNJn!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6a8b538c-9a3f-4491-8ac4-dd462682fea0_1600x855.png 1272w, https://substackcdn.com/image/fetch/$s_!PNJn!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6a8b538c-9a3f-4491-8ac4-dd462682fea0_1600x855.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!PNJn!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6a8b538c-9a3f-4491-8ac4-dd462682fea0_1600x855.png" width="1456" height="778" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/6a8b538c-9a3f-4491-8ac4-dd462682fea0_1600x855.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:778,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!PNJn!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6a8b538c-9a3f-4491-8ac4-dd462682fea0_1600x855.png 424w, https://substackcdn.com/image/fetch/$s_!PNJn!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6a8b538c-9a3f-4491-8ac4-dd462682fea0_1600x855.png 848w, https://substackcdn.com/image/fetch/$s_!PNJn!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6a8b538c-9a3f-4491-8ac4-dd462682fea0_1600x855.png 1272w, https://substackcdn.com/image/fetch/$s_!PNJn!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6a8b538c-9a3f-4491-8ac4-dd462682fea0_1600x855.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">A generated image using FORM</figcaption></figure></div><p>Editing is just as simple. 
Instead of scanning through long paragraphs of text, you can modify a specific field.</p><p>For example, in the image below we were able to change the subject from a person to a dog in seconds, without touching the rest of the prompt.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!nKYP!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f5fcdf7-88f9-49ef-837f-aef8058ecd0f_1600x863.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!nKYP!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f5fcdf7-88f9-49ef-837f-aef8058ecd0f_1600x863.png 424w, https://substackcdn.com/image/fetch/$s_!nKYP!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f5fcdf7-88f9-49ef-837f-aef8058ecd0f_1600x863.png 848w, https://substackcdn.com/image/fetch/$s_!nKYP!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f5fcdf7-88f9-49ef-837f-aef8058ecd0f_1600x863.png 1272w, https://substackcdn.com/image/fetch/$s_!nKYP!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f5fcdf7-88f9-49ef-837f-aef8058ecd0f_1600x863.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!nKYP!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f5fcdf7-88f9-49ef-837f-aef8058ecd0f_1600x863.png" width="1456" height="785" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3f5fcdf7-88f9-49ef-837f-aef8058ecd0f_1600x863.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:785,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!nKYP!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f5fcdf7-88f9-49ef-837f-aef8058ecd0f_1600x863.png 424w, https://substackcdn.com/image/fetch/$s_!nKYP!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f5fcdf7-88f9-49ef-837f-aef8058ecd0f_1600x863.png 848w, https://substackcdn.com/image/fetch/$s_!nKYP!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f5fcdf7-88f9-49ef-837f-aef8058ecd0f_1600x863.png 1272w, https://substackcdn.com/image/fetch/$s_!nKYP!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f5fcdf7-88f9-49ef-837f-aef8058ecd0f_1600x863.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>By leveraging this kind of interface, we retain the structure and control of JSON prompting while restoring the simplicity and accessibility that plain-text prompting originally offered.</p><blockquote><p>The full implementation is available on GitHub: <a href="https://github.com/Neurl-LLC/FORM">https://github.com/Neurl-LLC/FORM</a></p></blockquote><p>You can clone the repository, run it locally, and adapt it to your own image generation projects.</p><div class="pullquote"><p>Join the conversation and share your experiences in the comments below!</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://neurlcreators.substack.com/p/form-your-images-with-json-prompting/comments&quot;,&quot;text&quot;:&quot;Leave a comment&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://neurlcreators.substack.com/p/form-your-images-with-json-prompting/comments"><span>Leave a comment</span></a></p></div><h2>Conclusion</h2><p>Structured prompting is a 
way of organizing your prompts when working with image generation models. While many practitioners prefer using JSON as the structured format, there is still room to experiment with other structured formats to see what works best.</p><p>In this article, we used forms as the interface for working with JSON prompts. As we have seen, this approach improves the user experience and makes working with JSON prompting simpler and more practical.</p><div class="pullquote"><p><em>Subscribe to <a href="https://neurlcreators.substack.com/">The Neural Blueprint</a>&#128071;&#127998; for hands-on guides! &#129761; Follow us on <a href="http://www.youtube.com/@neurlcreators">YouTube</a>, <a href="https://x.com/NeurlCreators">X</a>, and <a href="https://www.linkedin.com/showcase/neurl-creators/">LinkedIn</a>.</em></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://neurlcreators.substack.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://neurlcreators.substack.com/subscribe?"><span>Subscribe now</span></a></p></div><p></p>]]></content:encoded></item><item><title><![CDATA[Combining Retrieval Augmented Generation with Image Generation (RAGE)]]></title><description><![CDATA[Extending RAG Beyond Text with In-Context Image Generation]]></description><link>https://neurlcreators.substack.com/p/combining-retrieval-augmented-generation</link><guid isPermaLink="false">https://neurlcreators.substack.com/p/combining-retrieval-augmented-generation</guid><dc:creator><![CDATA[Neurl Creators]]></dc:creator><pubDate>Thu, 05 Feb 2026 12:30:26 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/65ada00c-77c8-4645-8067-45569ad6c507_1456x1048.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Retrieval Augmented Generation (<a 
href="https://aws.amazon.com/what-is/retrieval-augmented-generation/">RAG</a>) is one of the most popular ways of extending the capabilities of Large Language Models (<a href="https://www.ibm.com/think/topics/large-language-models">LLMs</a>). Before RAG, LLMs were stuck with static knowledge, struggled with <a href="https://cloud.google.com/discover/what-are-ai-hallucinations">hallucinations</a>, and had no real way to access up-to-date information without fine-tuning. RAG changed that and reshaped how we build and use language models.</p><p>Now the obvious question is: if RAG works so well for text, can we apply the same idea to other parts of the generative AI landscape? One place where this makes a lot of sense is image generation.</p><p>Modern image generation models like <a href="https://blog.google/products-and-platforms/products/gemini/how-nano-banana-got-its-name/">Nano Banana</a> and <a href="https://openai.com/index/new-chatgpt-images-is-here/">GPT-Image</a> are not just good at generating images from text prompts. They can also generate images using one or more reference images provided in-context. The catch is that this process is usually manual. The user has to find the right images, pass them in, and then write a prompt that ties everything together.</p><p>But what if we didn&#8217;t have to do this manually? That&#8217;s where combining retrieval augmented generation with image generation comes in. Several researchers have already explored this idea, with approaches like <a href="https://arxiv.org/abs/2502.09411">ImageRAG</a> and <a href="https://arxiv.org/abs/2505.21956">Cross-Modal RAG</a>.</p><p>In this article, we introduce the concept of <strong>Retrieval Augmented Generation for Images</strong>. Let&#8217;s call it <strong>RAGE</strong> for short. 
We walk through how it works, show a working implementation, and explore practical use cases.</p><h2>Understanding In-Context Learning in Image Generation Models</h2><p>The paper <em><a href="https://arxiv.org/abs/2005.14165">Language Models are Few-Shot Learners</a></em> introduced the idea of in-context learning for Large Language Models. In the paper, researchers from OpenAI showed that GPT-3 could learn to perform tasks like translation and question answering simply by seeing <a href="https://www.promptingguide.ai/techniques/fewshot">few-shot examples</a> in its context.</p><p>In-context learning made it possible to extend a model&#8217;s capabilities without fine-tuning.</p><p>Fast forward to today, and in-context learning is no longer limited to language models. Image generation models are now capable of it as well.</p><p>Modern image generation models can generate images not only from a text prompt, but also from images provided directly in their context.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!p2nP!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21ee13f0-3e18-4870-87da-7edce7c9b785_1600x1222.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!p2nP!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21ee13f0-3e18-4870-87da-7edce7c9b785_1600x1222.png 424w, https://substackcdn.com/image/fetch/$s_!p2nP!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21ee13f0-3e18-4870-87da-7edce7c9b785_1600x1222.png 848w, 
https://substackcdn.com/image/fetch/$s_!p2nP!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21ee13f0-3e18-4870-87da-7edce7c9b785_1600x1222.png 1272w, https://substackcdn.com/image/fetch/$s_!p2nP!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21ee13f0-3e18-4870-87da-7edce7c9b785_1600x1222.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!p2nP!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21ee13f0-3e18-4870-87da-7edce7c9b785_1600x1222.png" width="1456" height="1112" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/21ee13f0-3e18-4870-87da-7edce7c9b785_1600x1222.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1112,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!p2nP!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21ee13f0-3e18-4870-87da-7edce7c9b785_1600x1222.png 424w, https://substackcdn.com/image/fetch/$s_!p2nP!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21ee13f0-3e18-4870-87da-7edce7c9b785_1600x1222.png 848w, 
https://substackcdn.com/image/fetch/$s_!p2nP!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21ee13f0-3e18-4870-87da-7edce7c9b785_1600x1222.png 1272w, https://substackcdn.com/image/fetch/$s_!p2nP!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21ee13f0-3e18-4870-87da-7edce7c9b785_1600x1222.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">In-context learning for image generation. 
Given a prompt and four reference images, the model generates a new image using both as context.</figcaption></figure></div><p>Given multiple reference images and a text prompt, an image generation model can produce a new image that blends both the prompt and the provided images. This is the model learning in-context.</p><p>Image generation models like Nano Banana showed how several in-context images can be combined to blend subjects, change scenery, and even mix styles, with no task-specific fine-tuning required.</p><p>This ability of image generation models to learn from context is what allows us to combine them with RAG to form RAGE.</p><div class="digest-post-embed" data-attrs="{&quot;nodeId&quot;:&quot;a92e5575-2fde-4285-a9df-150357fa1608&quot;,&quot;caption&quot;:&quot;When GPT-4o Image launched, it took the world by storm. Within just a week, users had generated over 700 million images. Social media was flooded with Studio Ghibli-style portraits, and everyone seemed to be trying it out. In contrast, the release of Google&#8217;s Nano Banana felt almost secretive. 
Yet, despite its quiet debut, its capabilities quickly made &#8230;&quot;,&quot;cta&quot;:&quot;Read full story&quot;,&quot;showBylines&quot;:true,&quot;size&quot;:&quot;sm&quot;,&quot;isEditorNode&quot;:true,&quot;title&quot;:&quot;How Nano Banana Compares to GPT-4o image&quot;,&quot;publishedBylines&quot;:[{&quot;id&quot;:49908626,&quot;name&quot;:&quot;Eteimorde Youdiowei&quot;,&quot;bio&quot;:&quot;Simplifying complex ideas is my passion.....&quot;,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/90f6ea8f-0227-42b7-8c09-47e819b7f743_661x661.jpeg&quot;,&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:null},{&quot;id&quot;:280510396,&quot;name&quot;:&quot;Neurl Creators&quot;,&quot;bio&quot;:null,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/1b0dc913-49dd-4e59-ac39-bbbf289e1744_256x256.png&quot;,&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:null}],&quot;post_date&quot;:&quot;2025-09-12T10:00:22.019Z&quot;,&quot;cover_image&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/346d4034-d31a-4daa-bd3e-392e69da49e4_1456x1048.png&quot;,&quot;cover_image_alt&quot;:null,&quot;canonical_url&quot;:&quot;https://neurlcreators.substack.com/p/how-nano-banana-compares-to-gpt-4o&quot;,&quot;section_name&quot;:&quot;&#9881;&#65039; BuildAIers&#8217; Toolkit &#9881;&#65039;&quot;,&quot;video_upload_id&quot;:null,&quot;id&quot;:173385653,&quot;type&quot;:&quot;newsletter&quot;,&quot;reaction_count&quot;:1,&quot;comment_count&quot;:0,&quot;publication_id&quot;:3228552,&quot;publication_name&quot;:&quot;The Neural Blueprint: Practical Content for AI Builders 
&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!6udc!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff5b15961-5020-4a71-b040-30f9b3d3f232_256x256.png&quot;,&quot;belowTheFold&quot;:true,&quot;youtube_url&quot;:null,&quot;show_links&quot;:null,&quot;feed_url&quot;:null}"></div><h2>Automating the Process: Combining RAG with Image Generation</h2><p>RAG was introduced in the paper <a href="https://arxiv.org/abs/2005.11401">Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks</a>, and it marked a major shift in how we build language-model systems.</p><p>The core idea was simple: instead of relying only on a model&#8217;s internal knowledge, we retrieve relevant external information and use it directly in-context to improve generation.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!_No8!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff95185f4-29fa-4681-bb00-ee38eb11537a_1600x676.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!_No8!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff95185f4-29fa-4681-bb00-ee38eb11537a_1600x676.png 424w, https://substackcdn.com/image/fetch/$s_!_No8!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff95185f4-29fa-4681-bb00-ee38eb11537a_1600x676.png 848w, https://substackcdn.com/image/fetch/$s_!_No8!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff95185f4-29fa-4681-bb00-ee38eb11537a_1600x676.png 1272w, 
https://substackcdn.com/image/fetch/$s_!_No8!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff95185f4-29fa-4681-bb00-ee38eb11537a_1600x676.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!_No8!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff95185f4-29fa-4681-bb00-ee38eb11537a_1600x676.png" width="1456" height="615" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f95185f4-29fa-4681-bb00-ee38eb11537a_1600x676.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:615,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!_No8!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff95185f4-29fa-4681-bb00-ee38eb11537a_1600x676.png 424w, https://substackcdn.com/image/fetch/$s_!_No8!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff95185f4-29fa-4681-bb00-ee38eb11537a_1600x676.png 848w, https://substackcdn.com/image/fetch/$s_!_No8!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff95185f4-29fa-4681-bb00-ee38eb11537a_1600x676.png 1272w, 
https://substackcdn.com/image/fetch/$s_!_No8!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff95185f4-29fa-4681-bb00-ee38eb11537a_1600x676.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">Retrieval-Augmented Generation for LLMs</figcaption></figure></div><p>At a high level, RAG automates in-context learning for language models and consists of two stages:</p><ul><li><p>Indexing</p></li><li><p>Retrieval</p></li></ul><p>The same idea can be applied to image generation, which leads us to RAGE.</p><h3>Image Indexing</h3><p>To apply 
RAG to images, we first need a way to index them. Each image must be converted into a vector representation.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Oe3y!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5fcbb30e-b3bd-4c9f-bd6c-d3865b646435_1600x1267.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Oe3y!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5fcbb30e-b3bd-4c9f-bd6c-d3865b646435_1600x1267.png 424w, https://substackcdn.com/image/fetch/$s_!Oe3y!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5fcbb30e-b3bd-4c9f-bd6c-d3865b646435_1600x1267.png 848w, https://substackcdn.com/image/fetch/$s_!Oe3y!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5fcbb30e-b3bd-4c9f-bd6c-d3865b646435_1600x1267.png 1272w, https://substackcdn.com/image/fetch/$s_!Oe3y!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5fcbb30e-b3bd-4c9f-bd6c-d3865b646435_1600x1267.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Oe3y!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5fcbb30e-b3bd-4c9f-bd6c-d3865b646435_1600x1267.png" width="1456" height="1153" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/5fcbb30e-b3bd-4c9f-bd6c-d3865b646435_1600x1267.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1153,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Oe3y!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5fcbb30e-b3bd-4c9f-bd6c-d3865b646435_1600x1267.png 424w, https://substackcdn.com/image/fetch/$s_!Oe3y!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5fcbb30e-b3bd-4c9f-bd6c-d3865b646435_1600x1267.png 848w, https://substackcdn.com/image/fetch/$s_!Oe3y!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5fcbb30e-b3bd-4c9f-bd6c-d3865b646435_1600x1267.png 1272w, https://substackcdn.com/image/fetch/$s_!Oe3y!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5fcbb30e-b3bd-4c9f-bd6c-d3865b646435_1600x1267.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">To index an image from its caption, a vision model first generates the caption, which is then converted into a vector embedding and stored in a vector database.</figcaption></figure></div><p>One approach is to generate a text description of an image using a <a href="https://www.ibm.com/think/topics/vision-language-models">vision model</a>, then pass that description through an <a href="https://www.coursera.org/articles/embedding-model">embedding model</a> and store the resulting vector representation in a vector database.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ycKE!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa7ad8983-53b0-4110-8716-5e5eccdee7f4_1600x1267.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" 
srcset="https://substackcdn.com/image/fetch/$s_!ycKE!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa7ad8983-53b0-4110-8716-5e5eccdee7f4_1600x1267.png 424w, https://substackcdn.com/image/fetch/$s_!ycKE!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa7ad8983-53b0-4110-8716-5e5eccdee7f4_1600x1267.png 848w, https://substackcdn.com/image/fetch/$s_!ycKE!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa7ad8983-53b0-4110-8716-5e5eccdee7f4_1600x1267.png 1272w, https://substackcdn.com/image/fetch/$s_!ycKE!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa7ad8983-53b0-4110-8716-5e5eccdee7f4_1600x1267.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!ycKE!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa7ad8983-53b0-4110-8716-5e5eccdee7f4_1600x1267.png" width="1456" height="1153" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a7ad8983-53b0-4110-8716-5e5eccdee7f4_1600x1267.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1153,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!ycKE!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa7ad8983-53b0-4110-8716-5e5eccdee7f4_1600x1267.png 424w, https://substackcdn.com/image/fetch/$s_!ycKE!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa7ad8983-53b0-4110-8716-5e5eccdee7f4_1600x1267.png 848w, https://substackcdn.com/image/fetch/$s_!ycKE!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa7ad8983-53b0-4110-8716-5e5eccdee7f4_1600x1267.png 1272w, https://substackcdn.com/image/fetch/$s_!ycKE!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa7ad8983-53b0-4110-8716-5e5eccdee7f4_1600x1267.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" 
stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">An image can be indexed directly using a vision encoder to create a vector representation that is stored in a vector database.</figcaption></figure></div><p>Another option is to use a <a href="https://www.emergentmind.com/topics/vision-encoder">vision encoder</a> that generates embeddings directly from images, skipping the text step entirely.</p><p>This process is repeated for as many images as needed, building an image index that can later be queried.</p><h3>Image Retrieval and Generation</h3><p>Once indexing is complete, the system moves to the retrieval and generation stage.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!AW35!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb0b99d46-c637-4334-97cc-5af972b447ee_1600x1435.png" data-component-name="Image2ToDOM"><div 
class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!AW35!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb0b99d46-c637-4334-97cc-5af972b447ee_1600x1435.png 424w, https://substackcdn.com/image/fetch/$s_!AW35!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb0b99d46-c637-4334-97cc-5af972b447ee_1600x1435.png 848w, https://substackcdn.com/image/fetch/$s_!AW35!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb0b99d46-c637-4334-97cc-5af972b447ee_1600x1435.png 1272w, https://substackcdn.com/image/fetch/$s_!AW35!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb0b99d46-c637-4334-97cc-5af972b447ee_1600x1435.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!AW35!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb0b99d46-c637-4334-97cc-5af972b447ee_1600x1435.png" width="1456" height="1306" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b0b99d46-c637-4334-97cc-5af972b447ee_1600x1435.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1306,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!AW35!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb0b99d46-c637-4334-97cc-5af972b447ee_1600x1435.png 424w, https://substackcdn.com/image/fetch/$s_!AW35!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb0b99d46-c637-4334-97cc-5af972b447ee_1600x1435.png 848w, https://substackcdn.com/image/fetch/$s_!AW35!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb0b99d46-c637-4334-97cc-5af972b447ee_1600x1435.png 1272w, https://substackcdn.com/image/fetch/$s_!AW35!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb0b99d46-c637-4334-97cc-5af972b447ee_1600x1435.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" 
stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Retrieval Augmented Generation for Images involves retrieving the most relevant images that match a text prompt and using them as context to generate a new image.</figcaption></figure></div><p>Here, a user provides a text prompt. From that prompt, the system retrieves the images from the vector database that best match the user&#8217;s prompt.</p><p>These retrieved images, together with the original prompt, are then provided in-context to an image generation model, which uses both the text and visual context to generate a new image.</p><p>This end-to-end flow is what defines RAGE: combining retrieval with in-context image generation to automate what would otherwise be a manual process.</p><div class="digest-post-embed" data-attrs="{&quot;nodeId&quot;:&quot;34f6051d-f84c-40f4-84ec-962dfe497a2f&quot;,&quot;caption&quot;:&quot;Vector databases have become an indispensable tool in modern AI workflows, particularly in retrieval-augmented generation (RAG), semantic search, recommendation systems, and multimodal applications.&quot;,&quot;cta&quot;:&quot;Read full story&quot;,&quot;showBylines&quot;:true,&quot;size&quot;:&quot;sm&quot;,&quot;isEditorNode&quot;:true,&quot;title&quot;:&quot;Top 6 AI Vector Databases Compared (2025): Which One Should You Choose as an AI Builder?&quot;,&quot;publishedBylines&quot;:[{&quot;id&quot;:280510396,&quot;name&quot;:&quot;Neurl 
Creators&quot;,&quot;bio&quot;:null,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/1b0dc913-49dd-4e59-ac39-bbbf289e1744_256x256.png&quot;,&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:null},{&quot;id&quot;:85806,&quot;name&quot;:&quot;Stephen FIYINFOLUWA Oladele&quot;,&quot;bio&quot;:&quot;AI Engineer | Technical Creator | Building Tech for Creatives | Faith is the principal thing &#128170;&#127997; | Athletics | The Modern Day Generalist (TMG) &#127963;&#65039; | Make everything you do beautiful; make them art &#127912; | Ecclesiastes 9:10 &#10013;&#65039;&quot;,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/15119b47-1c09-4ae1-b6d6-df7bffdbab04_2160x2160.jpeg&quot;,&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:null}],&quot;post_date&quot;:&quot;2025-02-16T22:43:08.679Z&quot;,&quot;cover_image&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/cac77e32-25d5-4804-9567-b93ce28cb288_1920x1080.png&quot;,&quot;cover_image_alt&quot;:null,&quot;canonical_url&quot;:&quot;https://neurlcreators.substack.com/p/comparing-vector-databases-in-2025&quot;,&quot;section_name&quot;:&quot;&#9881;&#65039; BuildAIers&#8217; Toolkit &#9881;&#65039;&quot;,&quot;video_upload_id&quot;:null,&quot;id&quot;:157275167,&quot;type&quot;:&quot;newsletter&quot;,&quot;reaction_count&quot;:1,&quot;comment_count&quot;:0,&quot;publication_id&quot;:3228552,&quot;publication_name&quot;:&quot;The Neural Blueprint: Practical Content for AI Builders &quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!6udc!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff5b15961-5020-4a71-b040-30f9b3d3f232_256x256.png&quot;,&quot;belowTheFold&quot;:true,&quot;youtube_url&quot;:null,&quot;show_links&quot;:null,&quot;feed_url&quot;:null}"></div><h2>Real-World Implementation</h2><p>Now that we&#8217;ve covered the 
core idea behind RAGE, let&#8217;s walk through a working demo application that brings it to life.</p><div id="youtube2-fQHHGeZ4yVQ" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;fQHHGeZ4yVQ&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/fQHHGeZ4yVQ?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p>The video above shows a fully functional RAGE application. The workflow is simple:</p><ol><li><p>The user provides a text prompt.</p></li><li><p>Images that match the prompt are retrieved from the image store.</p></li><li><p>The user selects one or more images from the retrieved results.</p></li><li><p>Once selected, the user clicks <strong>Generate</strong> to create a new image.</p></li></ol><p>The demo is built using <a href="https://www.trychroma.com/">ChromaDB</a> as the vector database and <a href="https://platform.openai.com/docs/guides/image-generation">OpenAI GPT-Image</a> as the image generation model.</p><p>Each part of the stack is modular, making it easy to swap components and customize the overall experience.</p><blockquote><p>Explore the full implementation on GitHub: <a href="https://github.com/Neurl-LLC/RAGE">RAGE</a></p></blockquote><h3>Use Cases for RAGE</h3><p>Let&#8217;s explore some real-world ways RAGE can be used in practice.</p><h3>Marketing and Creative Asset Generation</h3><p>RAGE can be used to generate new marketing images from an existing catalogue of brand assets. 
Instead of starting from scratch, the system retrieves relevant product images, brand visuals, or campaign materials and uses them as context to generate fresh content.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!TMF_!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F917575bd-11e8-4060-8982-24357a0b84b9_1600x859.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!TMF_!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F917575bd-11e8-4060-8982-24357a0b84b9_1600x859.png 424w, https://substackcdn.com/image/fetch/$s_!TMF_!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F917575bd-11e8-4060-8982-24357a0b84b9_1600x859.png 848w, https://substackcdn.com/image/fetch/$s_!TMF_!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F917575bd-11e8-4060-8982-24357a0b84b9_1600x859.png 1272w, https://substackcdn.com/image/fetch/$s_!TMF_!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F917575bd-11e8-4060-8982-24357a0b84b9_1600x859.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!TMF_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F917575bd-11e8-4060-8982-24357a0b84b9_1600x859.png" width="1456" height="782" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/917575bd-11e8-4060-8982-24357a0b84b9_1600x859.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:782,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!TMF_!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F917575bd-11e8-4060-8982-24357a0b84b9_1600x859.png 424w, https://substackcdn.com/image/fetch/$s_!TMF_!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F917575bd-11e8-4060-8982-24357a0b84b9_1600x859.png 848w, https://substackcdn.com/image/fetch/$s_!TMF_!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F917575bd-11e8-4060-8982-24357a0b84b9_1600x859.png 1272w, https://substackcdn.com/image/fetch/$s_!TMF_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F917575bd-11e8-4060-8982-24357a0b84b9_1600x859.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path 
d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">RAGE used for marketing and creative asset generation</figcaption></figure></div><p>In the example above, when the user prompted, <em>&#8220;A realistic image of a salesman trying to sell lotion and soap to a group of ladies,&#8221;</em> the retriever pulled images of the lotion and soap from the image store. These retrieved images were then used in-context to generate a scene where the salesman and the ladies are holding the same lotion and soap. Without those in-context images, the model would have generated completely different products that would not match the brand.</p><h3>Image Editing and Enhancement</h3><p>RAGE can also be used to retrieve and edit existing images using natural language prompts. 
This includes tasks such as changing backgrounds, adjusting scenery, modifying objects, enhancing image quality, or combining multiple images.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!OiQi!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F927bc7bd-2ca2-4fd5-9d0f-6e0bd4ee13e7_1600x859.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!OiQi!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F927bc7bd-2ca2-4fd5-9d0f-6e0bd4ee13e7_1600x859.png 424w, https://substackcdn.com/image/fetch/$s_!OiQi!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F927bc7bd-2ca2-4fd5-9d0f-6e0bd4ee13e7_1600x859.png 848w, https://substackcdn.com/image/fetch/$s_!OiQi!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F927bc7bd-2ca2-4fd5-9d0f-6e0bd4ee13e7_1600x859.png 1272w, https://substackcdn.com/image/fetch/$s_!OiQi!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F927bc7bd-2ca2-4fd5-9d0f-6e0bd4ee13e7_1600x859.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!OiQi!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F927bc7bd-2ca2-4fd5-9d0f-6e0bd4ee13e7_1600x859.png" width="1456" height="782" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/927bc7bd-2ca2-4fd5-9d0f-6e0bd4ee13e7_1600x859.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:782,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!OiQi!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F927bc7bd-2ca2-4fd5-9d0f-6e0bd4ee13e7_1600x859.png 424w, https://substackcdn.com/image/fetch/$s_!OiQi!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F927bc7bd-2ca2-4fd5-9d0f-6e0bd4ee13e7_1600x859.png 848w, https://substackcdn.com/image/fetch/$s_!OiQi!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F927bc7bd-2ca2-4fd5-9d0f-6e0bd4ee13e7_1600x859.png 1272w, https://substackcdn.com/image/fetch/$s_!OiQi!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F927bc7bd-2ca2-4fd5-9d0f-6e0bd4ee13e7_1600x859.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">RAGE used for Image Editing and Enhancement</figcaption></figure></div><p>Because the model has access to the original images in-context, edits stay grounded in the source material. In the example above, we prompted RAGE with <em>&#8220;Combine the image of the lotion and soap,&#8221;</em> which retrieved the relevant images and merged them into a single generated output.</p><div class="pullquote"><p><em>Need creative, high-quality technical content? Happy to chat! Book a call with our Creative Engineers&#128071;&#127998;</em></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://calendar.app.google/s6ekF2rDbPjKWLLQ8&quot;,&quot;text&quot;:&quot;Let&#8217;s Chat! &#128222;&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://calendar.app.google/s6ekF2rDbPjKWLLQ8"><span>Let&#8217;s Chat! 
&#128222;</span></a></p></div><h3>Style Transfer and Visual Consistency</h3><p>By storing different visual styles in the image index, RAGE enables style transfer without any fine-tuning. The system retrieves images that represent a specific aesthetic and applies that style to new generations.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!qfoF!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb9138c3-a03a-42ac-ad5b-45d6c31bf458_1600x859.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!qfoF!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb9138c3-a03a-42ac-ad5b-45d6c31bf458_1600x859.png 424w, https://substackcdn.com/image/fetch/$s_!qfoF!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb9138c3-a03a-42ac-ad5b-45d6c31bf458_1600x859.png 848w, https://substackcdn.com/image/fetch/$s_!qfoF!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb9138c3-a03a-42ac-ad5b-45d6c31bf458_1600x859.png 1272w, https://substackcdn.com/image/fetch/$s_!qfoF!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb9138c3-a03a-42ac-ad5b-45d6c31bf458_1600x859.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!qfoF!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb9138c3-a03a-42ac-ad5b-45d6c31bf458_1600x859.png" width="1456" height="782" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/db9138c3-a03a-42ac-ad5b-45d6c31bf458_1600x859.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:782,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!qfoF!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb9138c3-a03a-42ac-ad5b-45d6c31bf458_1600x859.png 424w, https://substackcdn.com/image/fetch/$s_!qfoF!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb9138c3-a03a-42ac-ad5b-45d6c31bf458_1600x859.png 848w, https://substackcdn.com/image/fetch/$s_!qfoF!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb9138c3-a03a-42ac-ad5b-45d6c31bf458_1600x859.png 1272w, https://substackcdn.com/image/fetch/$s_!qfoF!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb9138c3-a03a-42ac-ad5b-45d6c31bf458_1600x859.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">RAGE used for visual consistency</figcaption></figure></div><p>This is especially useful for maintaining visual consistency across illustrations, artwork, or branded assets. In the example above, a scented candle image was generated to match the same style as the retrieved images.</p><h3>Product and Design Prototyping</h3><p>Design teams can use RAGE to explore new product ideas by retrieving existing designs, sketches, or reference images. 
This allows for rapid experimentation while staying aligned with previous designs and constraints.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!PM_9!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F518a7936-cc1a-4926-99f4-43da7236dd0b_1600x859.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!PM_9!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F518a7936-cc1a-4926-99f4-43da7236dd0b_1600x859.png 424w, https://substackcdn.com/image/fetch/$s_!PM_9!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F518a7936-cc1a-4926-99f4-43da7236dd0b_1600x859.png 848w, https://substackcdn.com/image/fetch/$s_!PM_9!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F518a7936-cc1a-4926-99f4-43da7236dd0b_1600x859.png 1272w, https://substackcdn.com/image/fetch/$s_!PM_9!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F518a7936-cc1a-4926-99f4-43da7236dd0b_1600x859.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!PM_9!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F518a7936-cc1a-4926-99f4-43da7236dd0b_1600x859.png" width="1456" height="782" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/518a7936-cc1a-4926-99f4-43da7236dd0b_1600x859.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:782,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!PM_9!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F518a7936-cc1a-4926-99f4-43da7236dd0b_1600x859.png 424w, https://substackcdn.com/image/fetch/$s_!PM_9!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F518a7936-cc1a-4926-99f4-43da7236dd0b_1600x859.png 848w, https://substackcdn.com/image/fetch/$s_!PM_9!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F518a7936-cc1a-4926-99f4-43da7236dd0b_1600x859.png 1272w, https://substackcdn.com/image/fetch/$s_!PM_9!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F518a7936-cc1a-4926-99f4-43da7236dd0b_1600x859.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">RAGE used for product experimentation</figcaption></figure></div><p>In the example above, we experimented with a transparent product design by retrieving an incense kit from the image store and generating a transparent version of the kit to visualize how it might look.</p><div class="pullquote"><p><em>Join the conversation and share your experiences in the comments below!</em></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://neurlcreators.substack.com/p/combining-retrieval-augmented-generation/comments&quot;,&quot;text&quot;:&quot;Leave a comment&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://neurlcreators.substack.com/p/combining-retrieval-augmented-generation/comments"><span>Leave a comment</span></a></p></div><h2>Conclusion</h2><p>RAGE takes the core idea of RAG, retrieving relevant context and using it in generation, and applies it to images. 
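Put end to end, the loop is small. A hedged sketch with stand-in embed, search, and generate callables (none of these name a real RAGE API; they simply mark where your embedding model, vector index, and image generator plug in):

```python
# Sketch of the retrieve-then-generate loop. `embed`, `search`, and
# `generate` are stand-ins for an embedding model, a vector index, and
# an image-generation backend; all names here are illustrative only.
def rage_generate(prompt, embed, search, generate, k=3):
    query_vec = embed(prompt)             # embed the text prompt
    refs = search(query_vec, k=k)         # top-k reference images from the store
    return generate(prompt, reference_images=refs)  # generate with in-context refs

# Trivial stubs, just to show the control flow:
result = rage_generate(
    "A salesman selling lotion and soap",
    embed=lambda text: [0.0],
    search=lambda vec, k: ["lotion.png", "soap.png"][:k],
    generate=lambda prompt, reference_images: {"prompt": prompt, "refs": reference_images},
)
print(result["refs"])  # → ['lotion.png', 'soap.png']
```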
Instead of manually feeding reference images into an image generator, RAGE automates the process by searching a database of visuals and using the most relevant ones to guide generation. This makes image generation more consistent, controllable, and scalable.</p><p>The use cases in this article are just a glimpse of what&#8217;s possible when Retrieval Augmented Generation is combined with image generation. By connecting an image generation model to an image store, you can unlock entirely new creative possibilities.</p><div class="pullquote"><p><em>Subscribe to <a href="https://neurlcreators.substack.com/">The Neural Blueprint</a>&#128071;&#127998; for hands-on guides! &#129761; Follow us on <a href="http://www.youtube.com/@neurlcreators">YouTube</a>, <a href="https://x.com/NeurlCreators">X</a>, and <a href="https://www.linkedin.com/showcase/neurl-creators/">LinkedIn</a>.</em></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://neurlcreators.substack.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://neurlcreators.substack.com/subscribe?"><span>Subscribe now</span></a></p></div>]]></content:encoded></item><item><title><![CDATA[Convert Your PDFs into a Queryable, Agent-Ready Catalog with fenic]]></title><description><![CDATA[A workflow that parses PDFs, builds a usable sections index and exposes it as clean MCP tools, so agents (and humans) can ask real questions instead of skimming page by page.]]></description><link>https://neurlcreators.substack.com/p/fenic-pdf-to-mcp-tools</link><guid isPermaLink="false">https://neurlcreators.substack.com/p/fenic-pdf-to-mcp-tools</guid><dc:creator><![CDATA[Stephen FIYINFOLUWA Oladele]]></dc:creator><pubDate>Tue, 23 Dec 2025 11:34:20 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/aa915642-a927-42ad-9384-6a6659650dd7_1456x1048.png" 
length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2><strong>TL;DR</strong></h2><ul><li><p>Turn messy PDFs into a clean <strong>sections table</strong> (H1&#8211;H3 chunks with paths) and a <strong>topics index</strong> (e.g., <em>training</em>, <em>SOC</em>, <em>governance</em>).</p></li><li><p>Publish your curated tables as <strong>minimal MCP tools</strong> (list_whitepapers, sections_by_topic, and search_sections) so any MCP-capable host (or your own client) can query them.</p></li><li><p>Keep it small, deterministic, and agent-ready; add vectors/LLM ranking later as an optional search layer.</p></li></ul><blockquote><p><strong>Try it: </strong><a href="https://github.com/typedef-ai/fenic-examples/tree/main/pdf_catalog_agent">Open the demo repo</a><strong>&#8599;&#65039;</strong> and point it at a small set of your PDFs first.</p></blockquote><div><hr></div><h2><strong>Introduction</strong></h2><p>Most teams eventually inherit a pile of PDF whitepapers (e.g., governance frameworks, training standards, SOC roadmaps) scattered across folders. 
You know the answers are &#8220;in there somewhere&#8221;, but searching the raw PDFs is slow, token-hungry, and brittle for agents.</p><p>In our new demo, we&#8217;ll build a compact, production-shaped pipeline with <strong>fenic</strong> that:</p><p>(1) parses PDFs into structured <strong>sections</strong>,</p><p>(2) indexes headings by <strong>topics of interest</strong>, and</p><p>(3) exposes the curated dataset as <strong>three MCP tools</strong>.</p><p>The goal is to <strong>prove the value of curation</strong> with simple tools over clean tables.</p><p><strong>What you&#8217;ll learn:</strong></p><ul><li><p>How to split Markdown&#8217;d PDFs into <strong>H1&#8211;H3 sections</strong> with stable paths.</p></li><li><p>How to build a <strong>topics index</strong> (e.g., <em>training</em>, <em>SOC</em>, <em>governance</em>) from earlier classifications.</p></li><li><p>How to <strong>persist</strong> tables and expose them as <strong>MCP tools</strong> you can call in a few lines.</p></li><li><p>An optional tool to extend with vectors or LLM ranking <em>later</em>.</p></li></ul><blockquote><p><a href="https://github.com/typedef-ai/fenic-examples/tree/main/pdf_catalog_agent">Open and star &#127775; the demo repo</a><strong>&#8599;&#65039;</strong></p></blockquote><h2><strong>Why not &#8220;just let the agent read my PDFs&#8221;?</strong></h2><p>Short answer: you&#8217;ll get <strong>inconsistent</strong> answers, <strong>higher</strong> latency/cost, and <strong>chaotic</strong> control over what the model reads. 
A curated dataset fixes that:</p><ul><li><p><strong>Deterministic scope.</strong> You decide the rows and columns with no accidental 200-page token floods.</p></li><li><p><strong>Human-legible structure.</strong> Sections (heading, level, path) are explainable, journalable, and testable.</p></li><li><p><strong>Reusability.</strong> The same tables support search, analytics, and agents without re-ingesting PDFs.</p></li></ul><p>Agents still have a place, just on top of a <strong>small, intentional surface area</strong>.</p><h2><strong>What we&#8217;re building</strong></h2><p><strong>Data products:</strong></p><ul><li><p><code>whitepaper_sections</code>, one row per section (H1&#8211;H3):<br><code>id, name, heading, level, content, full_path</code></p></li><li><p><code>whitepaper_topics</code>, one row per (doc, heading, topic):<br><code>doc_id, name, topic, heading</code></p></li></ul><p><strong>MCP tools:</strong></p><ol><li><p><code>list_whitepapers()</code>: PDF names + section counts</p></li><li><p><code>sections_by_topic(topic)</code>: rows from the curated topic index</p></li><li><p><code>search_sections(query)</code>: simple, case-insensitive substring match</p></li></ol><p>That&#8217;s it. Three tools, three mental models, and everything flows from the tables.</p><h2><strong>Step 1. 
Normalize the source</strong></h2><p>We start from a normalized fenic DataFrame (see deduped_docs_pdf_content_final in the demo) with:</p><ul><li><p><code>id</code>: numeric or stable id per document</p></li><li><p><code>name</code>: filename/title</p></li><li><p><code>markdown_content</code>: the full body as Markdown (from your PDF-to-MD step)</p></li><li><p>optional: <code>toc</code>, <code>page_count</code>, earlier <code>content_categorization</code> output</p></li></ul><p>If you&#8217;re evaluating your own content, add a unique <code>source_uri</code> and <code>ingested_at</code> for provenance.</p><p><strong>Tip:</strong> Keep <strong>Markdown</strong> as the canonical text. <a href="https://docs.fenic.ai/latest/examples/markdown_processing/">fenic&#8217;s Markdown utilities</a> are built for it, and you&#8217;ll avoid HTML cleanup.</p><div class="pullquote"><p>Know someone who might need this? Share this post with your network and friends!</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://neurlcreators.substack.com/p/fenic-pdf-to-mcp-tools?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://neurlcreators.substack.com/p/fenic-pdf-to-mcp-tools?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share</span></a></p></div><h2><strong>Step 2. Split Markdown into sections (H1&#8211;H3)</strong></h2><p>The core trick is to chunk each whitepaper by its headings. 
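The chunking logic is simple enough to sketch in plain Python (this is not fenic's API, just the shape of the sections output, including the full_path breadcrumb):

```python
import re

# Plain-Python sketch of H1-H3 splitting. A real pipeline would use
# fenic's markdown operators; this only illustrates the logic.
def split_sections(markdown_text):
    sections, path = [], {}
    current = None
    for line in markdown_text.splitlines():
        m = re.match(r"^(#{1,3})\s+(.*)", line)
        if m:
            if current:
                sections.append(current)
            level = len(m.group(1))
            path[level] = m.group(2).strip()
            # Drop deeper breadcrumbs when we move back up the tree.
            for deeper in (2, 3):
                if deeper > level:
                    path.pop(deeper, None)
            crumb = " > ".join(path[l] for l in sorted(path))
            current = {"heading": m.group(2).strip(), "level": level,
                       "content": "", "full_path": crumb}
        elif current:
            current["content"] += line + "\n"
    if current:
        sections.append(current)
    return sections

doc = "# Intro\nhello\n## Training\ndetails\n### Data\nrows\n## SOC\nroadmap\n"
for s in split_sections(doc):
    print(s["level"], s["full_path"])
# 1 Intro
# 2 Intro > Training
# 3 Intro > Training > Data
# 2 Intro > SOC
```

Note that an H4 or deeper heading simply stays inside its parent H3 section's content, which keeps the table at the H1-H3 sweet spot described below.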
With fenic&#8217;s markdown operator you can extract header-scoped chunks at specific levels, then union them into a single column of section structs:</p><ul><li><p><code>heading</code>: the section title</p></li><li><p><code>level</code>: 1, 2, or 3</p></li><li><p><code>content</code>: the body of that section</p></li><li><p><code>full_path</code>: breadcrumb string like H1 &gt; H2 &gt; H3</p></li></ul><p>Why H1&#8211;H3? In practice, this is the <strong>sweet spot</strong>: it preserves structure without exploding into paragraphs. You can always tighten/loosen later.</p><p><strong>Quality checks you can print instead of schemas:</strong></p><ul><li><p>Count of sections per doc.</p></li><li><p>Distribution by <code>level</code>.</p></li><li><p>A quick sample of <code>full_path</code> strings (valid breadcrumbs).</p></li></ul><h2><strong>Step 3. Build a topics index from earlier classification</strong></h2><p>Earlier in the notebook, you likely produced a <code>content_categorization</code> object per document. Turn those lists (e.g., <code>sections_about_model_training</code>) into a <strong>long form &#8220;topics&#8221; table</strong> with one row per (<code>doc_id</code>, <code>topic</code>, <code>heading</code>).</p><p>The result: <code>whitepaper_topics</code> can answer &#8220;show me every <em>Training</em> heading across the catalog,&#8221; using clear group-by and joins.</p><div class="pullquote"><p><em>Need creative, high-quality technical content? Happy to chat! Book a call with our Creative Engineers&#128071;&#127998;</em></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://calendar.app.google/s6ekF2rDbPjKWLLQ8&quot;,&quot;text&quot;:&quot;Let&#8217;s Chat! &#128222;&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://calendar.app.google/s6ekF2rDbPjKWLLQ8"><span>Let&#8217;s Chat! &#128222;</span></a></p></div><h2><strong>Step 4. 
Persist the tables (idempotent)</strong></h2><p>Save both DataFrames as tables so your MCP tools can reference them by name:</p><ul><li><p><code>whitepaper_sections</code></p></li><li><p><code>whitepaper_topics</code></p></li></ul><p>Two rules keep things robust for demos and CI:</p><ol><li><p><strong>Idempotent writes</strong> (e.g., <code>mode=&quot;overwrite&quot;</code> for dev runs).</p></li><li><p><strong>No UDFs inside tool definitions.</strong> Keep tool plans to built-ins so they serialize cleanly.</p></li></ol><p>This avoids the common &#8220;UDFExpr cannot be serialized&#8221; error when saving catalog tools.</p><h2><strong>Step 5. Publish MCP tools</strong></h2><p>With the tables saved, register three MCP tools:</p><ol><li><p><code>list_whitepapers()</code><strong><br></strong>Group by <code>id</code> and <code>name</code>, count sections, and sort by name.<br><strong>Why it matters:</strong> gives users a friendly &#8220;index&#8221; and builds confidence that your catalog is concrete.</p></li><li><p><code>sections_by_topic(topic)</code><strong><br></strong>Join <code>whitepaper_topics</code> to <code>whitepaper_sections</code> on (<code>doc_id</code>, <code>heading</code>), filter by normalized topic, and sort by (<code>name</code>, <code>full_path</code>, <code>level</code>).<br><strong>Why it matters:</strong> demonstrates your curated taxonomy from Step 3.</p></li><li><p><code>search_sections(query)</code><strong><br></strong>Case-insensitive substring match on heading or content.<br><strong>Why it matters:</strong> the simplest possible search. 
Predictable, explainable, and fast.</p><div class="pullquote"><p>Join the conversation and share your experiences in the comments below!</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://neurlcreators.substack.com/p/fenic-pdf-to-mcp-tools/comments&quot;,&quot;text&quot;:&quot;Leave a comment&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://neurlcreators.substack.com/p/fenic-pdf-to-mcp-tools/comments"><span>Leave a comment</span></a></p></div></li></ol><h2><strong>Step 6. Smoke-test in a few lines</strong></h2><p>Use the simple test client from the notebook:</p><ol><li><p><code>list_whitepapers()</code></p></li><li><p><code>sections_by_topic(&quot;training&quot;)</code></p></li><li><p><code>search_sections(&quot;latency&quot;)</code></p></li></ol><p>Each prints a short markdown table (15 rows max) with sensible columns. Done. That&#8217;s enough to understand &#8220;what&#8217;s in the box&#8221; and for you to wire the tools into any MCP-capable host or a thin UI.</p><h2><strong>What &#8220;done&#8221; looks like</strong></h2><ul><li><p>A printed line like:<br><code>&#9989; MCP HTTP server ready at http://127.0.0.1:54217/mcp</code></p></li><li><p><code>whitepaper_sections</code> with thousands of rows (H1&#8211;H3 blocks) and consistent <code>full_path</code> values.</p></li><li><p><code>whitepaper_topics</code> with a handful of topics and the headings they touch.</p></li><li><p>Three tools you can <strong>demo interactively</strong> and wire into an MCP host later.</p></li></ul><p>From here, product and compliance users can ask concrete questions: </p><blockquote><p><em>&#8220;Show me all Governance sections across vendor A and B,&#8221;</em> </p></blockquote><p>or </p><blockquote><p><em>&#8220;Where do we talk about training data retention?&#8221;</em> </p></blockquote><p>&#8230; without opening a single PDF.</p><h2><strong>Extending the demo (when 
you&#8217;re ready)</strong></h2><p>These are intentionally <strong>out of the core</strong> demo but easy to layer on when you need them:</p><ol><li><p><strong>Vector ranking (</strong><code>similar_sections</code><strong>)<br></strong>Precompute an embedding for a <code>joined_text</code> column (e.g., <code>heading + two newlines + content</code>) with <a href="https://docs.fenic.ai/latest/reference/fenic/api/functions/semantic/#fenic.api.functions.semantic.embed">semantic.embed</a> and rank with cosine similarity. This is great for &#8220;find me nearby&#8221; retrieval, but it&#8217;s a <em>bonus</em>. Keep it capped (<code>result_limit</code>) so payloads stay small.</p></li><li><p><strong>LLM scoring (</strong><code>qa_sections_llm</code><strong>)<br></strong>If you want the model to &#8220;vote&#8221; on the best leaf for a question, map a short instruction over a candidate set and parse a numeric score. A common pitfall: ensure your <a href="https://docs.fenic.ai/latest/reference/fenic/api/functions/semantic/#fenic.api.functions.semantic.map">semantic.map</a> instruction is a string literal (the prompt), and if you must use UDFs, <strong>don&#8217;t</strong> embed them inside tool plans; materialize their outputs first.</p></li><li><p><strong>Contact signals.<br></strong>Regex UDFs for emails/URLs/phones can be useful for compliance and outreach. Again, do the extraction in the table, but keep the tool definitions UDF-free to avoid serialization errors.</p></li><li><p><strong>Richer taxonomies.</strong></p></li></ol><blockquote><p>Expand from three topics to your real taxonomy. As it grows, the <code>sections_by_topic</code> join stays the same.</p></blockquote><h2><strong>Performance, cost, and safety notes</strong></h2><ul><li><p><strong>Keep rows short:</strong> sections (vs. 
full PDFs) dramatically reduce tokens if you later add LLM steps.</p></li><li><p><strong>Cache deliberately:</strong> when computing embeddings or heavy transforms, cache the intermediate DF that you&#8217;ll reuse.</p></li><li><p><strong>Idempotency matters:</strong> overwrite in dev; use versioned table names in CI.</p></li><li><p><strong>No UDFs in tools:</strong> build tool queries from built-in expressions so they serialize cleanly into the catalog.</p></li><li><p><strong>Deterministic joins:</strong> when you link topics to sections, join on (<code>doc_id</code>, <code>heading</code>) and sort by (<code>name</code>, <code>full_path</code>, <code>level</code>) for tidy, predictable outputs.</p></li></ul><h2><strong>How teams can use this</strong></h2><ol><li><p><strong>Security and compliance search<br></strong>Your security team asks, &#8220;Where do we promise data retention limits?&#8221; Instead of skimming PDFs, they call:</p><ul><li><p><code>search_sections(&quot;retention&quot;)</code></p></li><li><p>If they need curated context: <code>sections_by_topic(&quot;governance&quot;)</code>.</p></li></ul></li><li><p><strong>Audit prep/due diligence<br></strong>During vendor reviews, you can export slices to CSV and hand them to legal or partners. The failure you&#8217;re preventing is always the same: <em>&#8220;Someone missed a paragraph on page 63.&#8221;</em> </p></li><li><p><strong>Agent surfaces<br></strong>Whether you use a CLI host or a UI, the agent doesn&#8217;t freestyle answers. It calls deterministic tools and renders results with no hallucinated policy claims.</p></li><li><p><strong>Docs Ops/editorial<br></strong>Basic counts (<code>list_whitepapers</code>) often reveal <em>&#8220;we have 3 policy docs with 300+ sections; we should consolidate&#8221;</em> or <em>&#8220;governance sections are thin compared to training. 
Time to write.&#8221;</em></p></li></ol><h2><strong>Adopting this in your stack</strong></h2><ol><li><p><a href="https://colab.research.google.com/github/typedef-ai/fenic-examples/blob/main/pdf_catalog_agent/fenic_pdf_catalog_agent.ipynb">Point the Colab</a><strong>&#8599;&#65039;</strong> at your PDFs (or markdown).</p></li><li><p>Keep the header levels small at first (H1&#8211;H3 usually suffice).</p></li><li><p>Save the two tables and wire up the MCP tools.</p></li><li><p>Share the MCP endpoints with your security, docs, and support teams.</p></li><li><p>Iterate on the topics index with real stakeholders (that&#8217;s where the value clicks).</p></li><li><p>Only then consider vectors or LLM rankings as optional upgrades (see the <em><a href="https://colab.research.google.com/github/typedef-ai/fenic-examples/blob/main/pdf_catalog_agent/fenic_pdf_catalog_agent.ipynb#scrollTo=rocuc8P1p-At">Next Steps section</a></em> of the notebook).</p></li></ol><h2><strong>Conclusion and closing thoughts</strong></h2><p>In <a href="https://github.com/typedef-ai/fenic-examples/tree/main/pdf_catalog_agent">this demo</a>, you have taken raw PDFs, imposed a small set of consistent structures, and <strong>published</strong> them behind three MCP tools. That&#8217;s the shape agents, analysts, and engineers can all agree on.</p><p>With <strong><a href="https://fenic.ai/">fenic</a></strong>, the whole pipeline stays in a single, readable DataFrame flow. There&#8217;s no glue code, no batching scripts, and no mystery prompts hiding in strings.</p><p>When you&#8217;re ready, add vectors or LLM scoring as <strong>optional</strong> layers.
But ship the curated tables first.</p><div><hr></div><h3><strong>Try the demo, then iterate</strong></h3><ul><li><p><strong><a href="https://github.com/typedef-ai/fenic-examples/tree/main/pdf_catalog_agent">Open the demo repo</a>&#8599;&#65039;:</strong> run the notebook and replace the sample PDFs with your own slice.</p></li><li><p><strong><a href="https://docs.fenic.ai/latest/">Docs</a>&#8599;&#65039;:</strong> browse fenic&#8217;s Markdown, text, and semantic operators to extend the pipeline.</p></li><li><p><strong><a href="https://github.com/typedef-ai/fenic/tree/main/examples">Examples</a>&#8599;&#65039;:</strong> look at other fenic demos for inspiration, then keep your surface area small and focused.</p></li></ul><div><hr></div><div class="digest-post-embed" data-attrs="{&quot;nodeId&quot;:&quot;d2260ba0-117c-4e12-8e02-12a04ea15709&quot;,&quot;caption&quot;:&quot;This technical guide is a repost from what originally appeared on Typedef AI&#8217;s blog as AI Content Pipeline for Search, Recommenders &amp; Agents with fenic&quot;,&quot;cta&quot;:&quot;Read full story&quot;,&quot;showBylines&quot;:true,&quot;size&quot;:&quot;md&quot;,&quot;isEditorNode&quot;:true,&quot;title&quot;:&quot;Ship a Practical AI Content Pipeline for Search, Recommendations, and Agents with fenic&quot;,&quot;publishedBylines&quot;:[{&quot;id&quot;:85806,&quot;name&quot;:&quot;Stephen FIYINFOLUWA Oladele&quot;,&quot;bio&quot;:&quot;AI Engineer | Technical Creator | Building Tech for Creatives | Faith is the principal thing &#128170;&#127997; | Athletics | The Modern Day Generalist (TMG) &#127963;&#65039; | Make everything you do beautiful; make them art &#127912; | Ecclesiastes 9:10 &#10013;&#65039;&quot;,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/15119b47-1c09-4ae1-b6d6-df7bffdbab04_2160x2160.jpeg&quot;,&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:null},{&quot;id&quot;:280510396,&quot;name&quot;:&quot;Neurl 
Creators&quot;,&quot;bio&quot;:null,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/1b0dc913-49dd-4e59-ac39-bbbf289e1744_256x256.png&quot;,&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:null}],&quot;post_date&quot;:&quot;2025-12-12T11:15:35.125Z&quot;,&quot;cover_image&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/68516f0d-53e7-42f8-b933-a36eb7d896ed_1456x1048.png&quot;,&quot;cover_image_alt&quot;:null,&quot;canonical_url&quot;:&quot;https://neurlcreators.substack.com/p/build-fenic-pipeline-for-content-intelligence&quot;,&quot;section_name&quot;:&quot;&#9881;&#65039; BuildAIers&#8217; Toolkit &#9881;&#65039;&quot;,&quot;video_upload_id&quot;:null,&quot;id&quot;:181317982,&quot;type&quot;:&quot;newsletter&quot;,&quot;reaction_count&quot;:2,&quot;comment_count&quot;:0,&quot;publication_id&quot;:3228552,&quot;publication_name&quot;:&quot;The Neural Blueprint: Practical Content for AI Builders &quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!6udc!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff5b15961-5020-4a71-b040-30f9b3d3f232_256x256.png&quot;,&quot;belowTheFold&quot;:true,&quot;youtube_url&quot;:null,&quot;show_links&quot;:null,&quot;feed_url&quot;:null}"></div><div class="pullquote"><p><em>Subscribe to <a href="https://neurlcreators.substack.com/">The Neural Blueprint</a>&#128071;&#127998; for hands-on guides! 
&#129761; Follow us on <a href="http://www.youtube.com/@neurlcreators">YouTube</a>, <a href="https://x.com/NeurlCreators">X</a>, and <a href="https://www.linkedin.com/showcase/neurl-creators/">LinkedIn</a></em></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://neurlcreators.substack.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://neurlcreators.substack.com/subscribe?"><span>Subscribe now</span></a></p></div>]]></content:encoded></item><item><title><![CDATA[Mini-App: Build a Real-Time Speech-to-Text (STT) API with OpenAI’s Whisper]]></title><description><![CDATA[From coding to vibing, here&#8217;s how I built a real-time speech-to-text API]]></description><link>https://neurlcreators.substack.com/p/how-do-you-build-a-real-time-speech</link><guid isPermaLink="false">https://neurlcreators.substack.com/p/how-do-you-build-a-real-time-speech</guid><dc:creator><![CDATA[Eteimorde Youdiowei]]></dc:creator><pubDate>Thu, 18 Dec 2025 11:20:23 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/2e14f5ea-dcab-44c1-a3f2-98c00ac2358d_1456x1048.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>For a while, I had been experimenting with real-time speech-to-text (STT), mostly through Deepgram&#8217;s API. I built a <a href="https://deepgram.com/learn/build-a-real-time-transcription-app-with-react-and-deepgram">real-time transcription React app</a> powered by Deepgram&#8217;s <a href="https://deepgram.com/learn/introducing-nova-3-speech-to-text-api">Nova 3</a>. It worked well, but at some point, curiosity kicked in. I started asking myself what actually happens under the hood of a real-time STT service.</p><p>To answer that question, I decided to build a real-time speech-to-text API myself.</p><p>At first, I thought it would be straightforward. 
My assumption was simple: take an open source speech model like <a href="https://openai.com/index/whisper/">Whisper</a>, put it behind a server, and call it a day. I was wrong. Building a real-time speech-to-text API is not just about the speech model.</p><p>I ran into challenges I hadn&#8217;t anticipated, from audio streaming and chunking to latency, buffering, and handling partial transcripts. Working through these problems forced me to rethink what &#8220;real time&#8221; actually means in an STT system.</p><p>In this article, I&#8217;ll show you how I built a real-time transcription API that runs locally using OpenAI&#8217;s Whisper. I&#8217;ll walk you through the experiments that failed, the ones that worked, and how they eventually came together into a working system.</p><p>By the end, you&#8217;ll have a clearer understanding of how leading STT providers like Deepgram and AssemblyAI implement real-time transcription and whether building your own solution is worth the trade-off compared to relying on third-party services.</p><blockquote><p>See the <a href="https://github.com/Neurl-LLC/real-time-voice-asr">full code implementation on GitHub</a>.</p></blockquote><h2>What is the fundamental theory for real-time Speech-to-Text (STT)?</h2><p>In the world of <a href="https://deepgram.com/ai-glossary/speech-to-text-models">STT models</a>, there are two main approaches to transcription. The first is the <strong>prerecorded approach</strong>, where a recorded audio file is passed to an STT model to produce a complete transcript. 
The second is the <strong>real-time approach</strong>.</p><div id="youtube2-s4A6C0EGAmA" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;s4A6C0EGAmA&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/s4A6C0EGAmA?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p>Unlike the prerecorded approach, real-time STT is not just about the model itself. It relies on a modular system made up of several coordinated components:</p><ul><li><p>Collect audio data from an input source</p></li><li><p>Convert analog audio into a digital format</p></li><li><p>Stream the audio data to a server</p></li><li><p>Transcribe the audio and return text in real time using an STT model</p></li></ul><p>Each of these steps introduces its challenges, and together they form the foundation of any real-time transcription system.</p><h3>Input Sources</h3><p>The input source is a critical stage in the pipeline, as it captures the audio data used in subsequent steps. Common approaches include:</p><ul><li><p><strong>Microphone input:</strong> The most common scenario for real-time transcription, like seeing captions appear during a meeting as you speak.</p></li><li><p><strong>Live streaming from an external audio source:</strong> For transcribing live broadcasts, online radio, or other streaming audio in real time.</p></li><li><p><strong>Streaming from an audio file:</strong> Useful for testing and debugging, as prerecorded audio can be streamed in chunks to simulate a live feed.</p></li></ul><h3><strong>Analogue-to-Digital Conversion</strong></h3><p>Audio naturally exists as an analog signal, which computers and STT models cannot directly process. 
To work with STT models, we convert the audio into a digital format using <strong>Pulse Code Modulation (PCM)</strong>.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!n6ca!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb52ac692-aa70-4edd-9f52-8cfca406aee5_1159x1600.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!n6ca!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb52ac692-aa70-4edd-9f52-8cfca406aee5_1159x1600.png 424w, https://substackcdn.com/image/fetch/$s_!n6ca!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb52ac692-aa70-4edd-9f52-8cfca406aee5_1159x1600.png 848w, https://substackcdn.com/image/fetch/$s_!n6ca!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb52ac692-aa70-4edd-9f52-8cfca406aee5_1159x1600.png 1272w, https://substackcdn.com/image/fetch/$s_!n6ca!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb52ac692-aa70-4edd-9f52-8cfca406aee5_1159x1600.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!n6ca!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb52ac692-aa70-4edd-9f52-8cfca406aee5_1159x1600.png" width="1159" height="1600" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b52ac692-aa70-4edd-9f52-8cfca406aee5_1159x1600.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1600,&quot;width&quot;:1159,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!n6ca!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb52ac692-aa70-4edd-9f52-8cfca406aee5_1159x1600.png 424w, https://substackcdn.com/image/fetch/$s_!n6ca!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb52ac692-aa70-4edd-9f52-8cfca406aee5_1159x1600.png 848w, https://substackcdn.com/image/fetch/$s_!n6ca!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb52ac692-aa70-4edd-9f52-8cfca406aee5_1159x1600.png 1272w, https://substackcdn.com/image/fetch/$s_!n6ca!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb52ac692-aa70-4edd-9f52-8cfca406aee5_1159x1600.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Analog data converted into PCM</figcaption></figure></div><p>PCM works by sampling the audio at a fixed rate, commonly 16,000 samples per second, allowing the STT model to process the data accurately.</p><div class="pullquote"><p><em>Know someone who might need this? Share this post with your network and friends!</em></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://neurlcreators.substack.com/p/how-do-you-build-a-real-time-speech?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://neurlcreators.substack.com/p/how-do-you-build-a-real-time-speech?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share</span></a></p></div><h3>Data Transmission</h3><p>Once audio is digitized, it needs to be transmitted to the server. 
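</p><p>Before looking at how, it helps to sanity-check how much data a raw PCM stream actually carries. The arithmetic below is a back-of-envelope sketch; it assumes 16 kHz mono float32 samples, the format the client later in this article uses:</p>

```python
# Back-of-envelope data rates for streaming raw PCM audio.
# Assumptions: 16 kHz sample rate, mono, 32-bit float samples.
SAMPLE_RATE = 16_000       # samples per second
BYTES_PER_SAMPLE = 4       # float32 PCM
CHANNELS = 1

bytes_per_second = SAMPLE_RATE * BYTES_PER_SAMPLE * CHANNELS
chunk_ms = 250                                    # send a chunk every 250 ms
chunk_bytes = bytes_per_second * chunk_ms // 1000

print(bytes_per_second, chunk_bytes)  # 64000 16000
```

<p>At roughly 64 KB per second, raw PCM is cheap enough to stream uncompressed over a WebSocket; the chunk size mostly trades latency against per-message overhead.</p><p>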
In real-time systems, the audio is divided into small chunks, which are sent over a real-time protocol like WebSocket. This chunking ensures low latency and continuous transcription as the audio plays.</p><h3>Server-Side Processing</h3><p>On the server, an OpenAI Whisper model receives the audio chunks, transcribes them, and sends the text back to the client. This back-and-forth continues until the audio stream ends, enabling real-time transcription.</p><h2>Implementing the Client</h2><p>In this section, we describe what it takes to implement the client for the server. This client serves as the input source described in the theory section. It captures audio, streams it to the server, and receives transcription results in real time. The client is based on Deepgram&#8217;s <a href="https://developers.deepgram.com/docs/getting-started-with-the-streaming-test-suite">live streaming kit</a>, repurposed to work with our server.</p><p>The client depends on two third-party Python libraries:</p><ul><li><p><strong><a href="https://pypi.org/project/websockets/">websockets</a></strong>, which handles the WebSocket communication</p></li><li><p><strong><a href="https://pypi.org/project/PyAudio/">PyAudio</a></strong>, which captures audio from the microphone</p></li></ul><p>We also need <strong><a href="https://www.ffmpeg.org/">FFmpeg</a></strong> installed on the system. 
FFmpeg allows us to stream and convert audio from files and URLs into the raw PCM format expected by the server.</p><p>Let&#8217;s walk through a few key code snippets that are required for the client to function.</p><p>The main run() function that will handle connecting to the server and coordinating audio sending and receiving:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;python&quot;,&quot;nodeId&quot;:&quot;11ec751c-2860-4386-b339-bcb394f2a3a1&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-python">async def run(method, source, host):
    print(f"&#9898; Connecting to {host}...")
    
    async with websockets.connect(host, ping_interval=None, ping_timeout=None) as ws:
        print(f"&#128994; Connected. Mode: {method.upper()}")
        
        # Queue to pass audio between threads and async tasks
        audio_queue = asyncio.Queue()
        stop_event = threading.Event()</code></pre></div><p>Inside run(), we first create the <strong>sender task</strong>. This task takes audio chunks from the queue and sends them to the server:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;python&quot;,&quot;nodeId&quot;:&quot;5d0f4fe8-7dbb-4294-86a8-d68073fcf8e6&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-python">async def sender():
            while not stop_event.is_set():
                try:
                    data = await asyncio.wait_for(audio_queue.get(), timeout=0.5)
                    await ws.send(data)
                except asyncio.TimeoutError:
                    continue</code></pre></div><p>Then we create the <strong>receiver task</strong>, which prints out transcriptions from the server in real time:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;python&quot;,&quot;nodeId&quot;:&quot;f4434c28-0c10-409e-abef-3b0a75a6b293&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-python">        async def receiver():
            try:
                async for msg in ws:
                    print(f"&#128221; {msg}")
            except websockets.exceptions.ConnectionClosed:
                print("&#128308; Connection closed by server.")
                stop_event.set()</code></pre></div><p>Now we handle audio sources. First, microphone input using PyAudio:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;python&quot;,&quot;nodeId&quot;:&quot;21dfd2f4-68d7-4665-b75e-703a7a054a3b&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-python">        def mic_process():
            p = pyaudio.PyAudio()
            
            def callback(in_data, frame_count, time_info, status):
                # Runs on PyAudio's own thread. asyncio.Queue is not strictly
                # thread-safe; loop.call_soon_threadsafe is the safer route,
                # but put_nowait works for this single-producer demo.
                audio_queue.put_nowait(in_data)
                return (None, pyaudio.paContinue)

            stream = p.open(
                format=pyaudio.paFloat32,
                channels=1,
                rate=SERVER_RATE,
                input=True,
                frames_per_buffer=CHUNK_SIZE // 4,
                stream_callback=callback
            )
            
            print("&#127908; Microphone Active.")
            stream.start_stream()
            
            # Keep this thread alive until the client shuts down; the
            # PyAudio callback keeps feeding the queue in the background.
            while not stop_event.is_set():
                stop_event.wait(0.1)

            stream.stop_stream()
            stream.close()
            p.terminate()</code></pre></div><p>For URLs, use FFmpeg to convert audio to PCM and stream it:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;python&quot;,&quot;nodeId&quot;:&quot;86004d94-2111-4bb1-be9d-d08f93e6fd0d&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-python">
        def ffmpeg_stream_process(url):
            command = [
                'ffmpeg',
                '-i', url,                 # input: file path or URL
                '-f', 'f32le',             # raw 32-bit float little-endian PCM
                '-ac', '1',                # downmix to mono
                '-ar', str(SERVER_RATE),   # resample to the server's rate
                '-vn',                     # drop any video stream
                '-loglevel', 'error',
                '-'                        # write to stdout
            ]
            print(f"&#127911; Starting FFmpeg stream for: {url}")
            process = subprocess.Popen(command, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
            
            try:
                while not stop_event.is_set():
                    data = process.stdout.read(CHUNK_SIZE)
                    if not data:
                        break
                    audio_queue.put_nowait(data)
            finally:
                process.terminate()
                print("&#128721; FFmpeg stopped.")</code></pre></div><p>Finally, start the appropriate audio source in a thread and run the sender and receiver tasks:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;python&quot;,&quot;nodeId&quot;:&quot;455e3e18-f530-4b03-b099-694f6466ddb9&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-python">        if method == "url":
            source_thread = threading.Thread(target=ffmpeg_stream_process, args=(source,))
        elif method == "mic":
            source_thread = threading.Thread(target=mic_process)
        else:
            # A local file path: also streamed through FFmpeg.
            source_thread = threading.Thread(target=ffmpeg_stream_process, args=(source,))
            
        source_thread.start()
        await asyncio.gather(sender(), receiver())
        stop_event.set()
        source_thread.join()</code></pre></div><p>Wrap everything with a simple command-line interface so you can run it with a microphone, file, or URL.</p><blockquote><p>See the <a href="https://github.com/Neurl-LLC/real-time-voice-asr">full code implementation</a>.</p></blockquote><p>With this, the client is ready! You can now stream audio from your microphone, a file, or a live URL, and receive real-time transcriptions from the server.</p><div class="pullquote"><p><em>Need creative, high-quality technical content? Happy to chat! Book a call with our Creative Engineers&#128071;&#127998;</em></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://calendar.app.google/s6ekF2rDbPjKWLLQ8&quot;,&quot;text&quot;:&quot;Let&#8217;s Chat! &#128222;&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://calendar.app.google/s6ekF2rDbPjKWLLQ8"><span>Let&#8217;s Chat! &#128222;</span></a></p></div><h2>My Attempts at Building the Real-Time API</h2><p>I made several attempts at building the real-time API. In this section, I will show you the approaches I took before I eventually got something that worked. I relied on the client scripts for testing, and I simulated a real-time stream by streaming this audio:</p><div class="native-audio-embed" data-component-name="AudioPlaceholder" data-attrs="{&quot;label&quot;:null,&quot;mediaUploadId&quot;:&quot;70d518ad-4c23-49a3-9f08-f25e37101dff&quot;,&quot;duration&quot;:164.57143,&quot;downloadable&quot;:true,&quot;isEditorNode&quot;:true}"></div><h3>Experiment One: The Simple Implementation</h3><p>For my first attempt at implementing the server, I built a WebSocket server that buffered one second of audio, sent it to Whisper for transcription, and returned the text. Simple, right? In practice, it failed.
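</p><p>For reference, that first server looked roughly like the sketch below. The names (<code>handle</code>, <code>transcribe</code>) are placeholders rather than the exact code from the repo; <code>transcribe</code> stands in for the blocking Whisper call:</p>

```python
# Sketch of the naive server: buffer ~1 s of PCM, transcribe, reply.
# `transcribe` is a placeholder for the (slow, blocking) Whisper call.
SAMPLE_RATE = 16_000
BYTES_PER_SECOND = SAMPLE_RATE * 4        # float32 mono

async def handle(ws, transcribe):
    buffer = b""
    async for chunk in ws:                # raw PCM chunks from the client
        buffer += chunk
        if len(buffer) >= BYTES_PER_SECOND:
            text = transcribe(buffer)     # blocks the event loop here
            buffer = b""
            await ws.send(text)
```

<p>The flaw is the blocking <code>transcribe</code> call: while it runs, the event loop can do nothing else, including answering WebSocket keepalives.</p><p>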
Whisper took too long to respond, which caused the client connections to time out and disconnect.</p><div class="native-video-embed" data-component-name="VideoPlaceholder" data-attrs="{&quot;mediaUploadId&quot;:&quot;4b492d38-d07d-4de0-a990-c3b3bf418c60&quot;,&quot;duration&quot;:null}"></div><h3><strong>Experiment Two: GPT-Assisted Implementation</strong></h3><p>After the first experiment failed, I tried a different approach. I &#8220;vibe-coded&#8221; with ChatGPT to make it work: introducing multithreading to handle transcription in the background. This way, the server wouldn&#8217;t block while Whisper processed audio, and clients wouldn&#8217;t disconnect.</p><p>The idea was simple: collect audio continuously, run Whisper in a background thread whenever enough new audio arrived, and send the transcribed text back to the client. A rolling buffer kept the last few seconds for context, and an event system notified the transcription worker when new audio was ready.</p><div class="native-video-embed" data-component-name="VideoPlaceholder" data-attrs="{&quot;mediaUploadId&quot;:&quot;27438369-6d12-4604-945c-880fed5f0372&quot;,&quot;duration&quot;:null}"></div><p>This approach fixed the disconnection problem, but it introduced new issues. The transcription accuracy dropped noticeably, and the code became much more complex. 
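</p><p>The rolling-buffer-plus-worker idea can be sketched with stdlib primitives alone. This is an illustration of the pattern, not the actual server code; the names are invented:</p>

```python
# Pattern from experiment two: the network loop feeds bytes, and a
# background worker transcribes whenever enough new audio has arrived,
# so the server itself never blocks on Whisper.
import threading

class TranscriptionWorker:
    def __init__(self, transcribe, min_bytes=64_000):
        self.transcribe = transcribe      # blocking STT call
        self.min_bytes = min_bytes        # ~1 s of float32 mono at 16 kHz
        self.buffer = b""
        self.results = []
        self.new_audio = threading.Event()
        self.lock = threading.Lock()

    def feed(self, chunk):
        """Called from the network loop; cheap and non-blocking."""
        with self.lock:
            self.buffer += chunk
        self.new_audio.set()              # wake the worker

    def run_once(self):
        """One iteration of the worker thread's loop."""
        self.new_audio.wait(timeout=0.5)
        self.new_audio.clear()
        with self.lock:
            if len(self.buffer) < self.min_bytes:
                return                    # not enough new audio yet
            audio, self.buffer = self.buffer, b""
        self.results.append(self.transcribe(audio))
```

<p>In a full server, <code>run_once</code> would sit in a <code>while</code> loop on a background thread, with results sent back over the WebSocket rather than collected in a list.</p><p>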
It worked, but I realized this wasn&#8217;t the &#8220;simplicity and reliability&#8221; I was looking for.</p><div class="pullquote"><p><em>Join the conversation and share your experiences in the comments below!</em></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://neurlcreators.substack.com/p/how-do-you-build-a-real-time-speech/comments&quot;,&quot;text&quot;:&quot;Leave a comment&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://neurlcreators.substack.com/p/how-do-you-build-a-real-time-speech/comments"><span>Leave a comment</span></a></p></div><h3><strong>Experiment Three: The Gemini Approach</strong></h3><p>After my attempt with ChatGPT didn&#8217;t deliver the results I was looking for, I decided to try <a href="https://deepmind.google/models/gemini/pro/">Gemini 3 Pro</a> to see if it could produce a more efficient real-time STT server. The first thing Gemini pointed out was that I shouldn&#8217;t be using OpenAI&#8217;s original Whisper implementation at all. Instead, it recommended <strong><a href="https://github.com/SYSTRAN/faster-whisper">Faster Whisper</a></strong>, a significantly more optimized version of Whisper designed for low-latency inference.</p><p>Gemini&#8217;s initial implementation still ran into the same disconnection issues when the model took too long to respond. However, when I suggested introducing threading, Gemini produced a much cleaner and more efficient solution.</p><p>This version maintained excellent transcription accuracy while keeping the code far simpler than the previous ChatGPT-assisted implementation.</p><div class="native-video-embed" data-component-name="VideoPlaceholder" data-attrs="{&quot;mediaUploadId&quot;:&quot;abdc474e-26fd-4c17-b398-2369a96238ab&quot;,&quot;duration&quot;:null}"></div><p>Switching to <strong>Faster Whisper</strong> also made a noticeable difference. 
Compared to the original Whisper implementation, Faster Whisper runs inference more efficiently, reducing latency and making real-time transcription on CPU far more practical.</p><p>Together, these two changes, non-blocking transcription and a faster Whisper backend, finally resulted in a stable, accurate, and responsive real-time speech-to-text server.</p><blockquote><p>You can find the complete code for all experiments on <a href="https://github.com/Neurl-LLC/real-time-voice-asr">GitHub</a>.</p></blockquote><h2>Whisper Radio: A real-world implementation</h2><p>After implementing the API server and arriving at a version that performed reliably, I decided to test it in a real-world scenario. The result was <strong>Whisper Radio</strong>, a web application that allows users to listen to live radio while seeing real-time transcripts of what is being broadcast.</p><div class="native-video-embed" data-component-name="VideoPlaceholder" data-attrs="{&quot;mediaUploadId&quot;:&quot;e295be92-d89a-4e41-b36c-92cd95780c30&quot;,&quot;duration&quot;:null}"></div><p>As users tune into a radio station, the audio stream is sent to our real-time Whisper server. The server transcribes the audio on the fly and streams the text back to the Whisper Radio application, allowing listeners to follow along with the broadcast in text form.</p><p>This is just one example of how a real-time speech-to-text API can be applied in practice. The Whisper Radio code lives in the <a href="https://github.com/Neurl-LLC/real-time-voice-asr">same repository</a> as the rest of the experiments explored in this project.</p><h2>Further Experiments and Enhancements</h2><p>There are several ways this real-time speech-to-text service can be extended and improved. Here are a few areas worth exploring:</p><ul><li><p><strong>Multi-Client Handling:</strong> The current implementation was only tested with a single client. 
Experimenting with multiple simultaneous clients will reveal how the server scales and how performance is affected.</p></li><li><p><strong>Voice Activity Detection (VAD):</strong> VAD can detect when the user is speaking before sending audio for transcription. While the current Whisper model has basic VAD, enhancing it or moving it to the client side could reduce unnecessary data transmission and improve efficiency.</p></li><li><p><strong>Alternative Open-Source Models:</strong> Test other speech-to-text models beyond Whisper to compare accuracy, speed, and latency.</p></li><li><p><strong>Word Error Rate Monitoring:</strong> Use pre-captioned audio to track transcription accuracy and identify areas for improvement.</p></li><li><p><strong>Timestamps and Diarization:</strong> Add timestamps to transcripts and separate speakers with diarization for more detailed and usable output.</p></li><li><p><strong>Voice Agent Development:</strong> Extend this system to build a real-time voice assistant capable of responding to commands or queries.</p></li></ul><div class="digest-post-embed" data-attrs="{&quot;nodeId&quot;:&quot;51c33109-80cf-42db-ab57-f12bca54ffda&quot;,&quot;caption&quot;:&quot;TL;DR Model Context Protocol (MCP) offers a way to extend large models with tools and resources&quot;,&quot;cta&quot;:&quot;Read full story&quot;,&quot;showBylines&quot;:true,&quot;size&quot;:&quot;sm&quot;,&quot;isEditorNode&quot;:true,&quot;title&quot;:&quot;Build Better AI Agents with MCP and OpenAI&#8217;s Agent SDK&quot;,&quot;publishedBylines&quot;:[{&quot;id&quot;:49908626,&quot;name&quot;:&quot;Eteimorde Youdiowei&quot;,&quot;bio&quot;:&quot;Simplifying complex ideas is my passion.....&quot;,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/90f6ea8f-0227-42b7-8c09-47e819b7f743_661x661.jpeg&quot;,&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:null},{&quot;id&quot;:280510396,&quot;name&quot;:&quot;Neurl 
Creators&quot;,&quot;bio&quot;:null,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/1b0dc913-49dd-4e59-ac39-bbbf289e1744_256x256.png&quot;,&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:null}],&quot;post_date&quot;:&quot;2025-04-29T15:02:28.676Z&quot;,&quot;cover_image&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/252d9784-eaf4-43c0-b28d-ad25408cb324_1456x1048.png&quot;,&quot;cover_image_alt&quot;:null,&quot;canonical_url&quot;:&quot;https://neurlcreators.substack.com/p/mcp-openai-agent-sdk-better-ai-agents&quot;,&quot;section_name&quot;:&quot;&#9881;&#65039; BuildAIers&#8217; Toolkit &#9881;&#65039;&quot;,&quot;video_upload_id&quot;:null,&quot;id&quot;:162388040,&quot;type&quot;:&quot;newsletter&quot;,&quot;reaction_count&quot;:5,&quot;comment_count&quot;:0,&quot;publication_id&quot;:3228552,&quot;publication_name&quot;:&quot;The Neural Blueprint: Practical Content for AI Builders &quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!6udc!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff5b15961-5020-4a71-b040-30f9b3d3f232_256x256.png&quot;,&quot;belowTheFold&quot;:true,&quot;youtube_url&quot;:null,&quot;show_links&quot;:null,&quot;feed_url&quot;:null}"></div><h2>Conclusion</h2><p>Building a real-time speech-to-text service turned out to be far more challenging than I initially expected. With a bit of experimentation and some guidance from AI, I was able to create a system that works surprisingly well. Beyond just getting it to run, this project offered a glimpse into what happens behind the scenes in large-scale speech-to-text platforms.</p><p>The potential improvements, like multi-client support, advanced voice activity detection, diarization, and creating a full voice agent, make the project even more exciting. 
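</p><p>The VAD piece in particular can start small: gate audio on frame energy before it ever leaves the client, then graduate to a model-based detector. A minimal sketch (the 0.02 threshold is an illustrative assumption, not a tuned value):</p>

```python
import math

def is_speech(frame, threshold=0.02):
    # Energy-based VAD: flag a frame as speech if its RMS amplitude
    # exceeds a threshold. Production systems typically use a trained
    # model instead, but the client-side gating idea is the same.
    rms = math.sqrt(sum(s * s for s in frame) / len(frame))
    return rms >= threshold

silence = [0.0] * 160  # 10 ms of silence at 16 kHz
tone = [0.1 * math.sin(2 * math.pi * 440 * i / 16000) for i in range(160)]
print(is_speech(silence), is_speech(tone))  # False True
```

<p>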
I&#8217;m looking forward to exploring these enhancements and seeing how far this system can be pushed.</p><div class="pullquote"><p><em>Subscribe to <a href="https://neurlcreators.substack.com/">The Neural Blueprint</a>&#128071;&#127998; for hands-on guides! &#129761; Follow us on <a href="http://www.youtube.com/@neurlcreators">YouTube</a>, <a href="https://x.com/NeurlCreators">X</a>, and <a href="https://www.linkedin.com/showcase/neurl-creators/">LinkedIn</a>.</em></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://neurlcreators.substack.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://neurlcreators.substack.com/subscribe?"><span>Subscribe now</span></a></p></div>]]></content:encoded></item><item><title><![CDATA[Ship a Practical AI Content Pipeline for Search, Recommendations, and Agents with fenic]]></title><description><![CDATA[An agent-ready workflow for turning raw articles into labeled clusters, intent, grounded summaries, & exportable features with fenic DataFrames & semantic operators for content intelligence]]></description><link>https://neurlcreators.substack.com/p/build-fenic-pipeline-for-content-intelligence</link><guid isPermaLink="false">https://neurlcreators.substack.com/p/build-fenic-pipeline-for-content-intelligence</guid><dc:creator><![CDATA[Stephen FIYINFOLUWA Oladele]]></dc:creator><pubDate>Fri, 12 Dec 2025 11:15:35 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/05cb6ace-809a-489b-ab3e-74a4d6992dad_1456x1048.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="pullquote"><p><em>This technical guide is a repost from what originally appeared on Typedef AI&#8217;s blog as <a href="https://www.typedef.ai/blog/ai-content-pipeline-for-search-recommenders-and-agents-with-fenic">AI Content Pipeline for Search, Recommenders &amp; 
Agents with fenic</a></em></p></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Suf-!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4f401864-fb3a-47b9-99fb-6c07daae1ad1_1352x1600.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Suf-!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4f401864-fb3a-47b9-99fb-6c07daae1ad1_1352x1600.png 424w, https://substackcdn.com/image/fetch/$s_!Suf-!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4f401864-fb3a-47b9-99fb-6c07daae1ad1_1352x1600.png 848w, https://substackcdn.com/image/fetch/$s_!Suf-!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4f401864-fb3a-47b9-99fb-6c07daae1ad1_1352x1600.png 1272w, https://substackcdn.com/image/fetch/$s_!Suf-!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4f401864-fb3a-47b9-99fb-6c07daae1ad1_1352x1600.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Suf-!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4f401864-fb3a-47b9-99fb-6c07daae1ad1_1352x1600.png" width="1352" height="1600" 
data-attrs="{&quot;src&quot;:&quot;https://substackcdn.com/image/fetch/$s_!Suf-!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4f401864-fb3a-47b9-99fb-6c07daae1ad1_1352x1600.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1600,&quot;width&quot;:1352,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!Suf-!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4f401864-fb3a-47b9-99fb-6c07daae1ad1_1352x1600.png 424w, https://substackcdn.com/image/fetch/$s_!Suf-!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4f401864-fb3a-47b9-99fb-6c07daae1ad1_1352x1600.png 848w, https://substackcdn.com/image/fetch/$s_!Suf-!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4f401864-fb3a-47b9-99fb-6c07daae1ad1_1352x1600.png 1272w, https://substackcdn.com/image/fetch/$s_!Suf-!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4f401864-fb3a-47b9-99fb-6c07daae1ad1_1352x1600.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" 
stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p><strong>Try it yourself:</strong></p><p>For the code and copy-paste-ables, check the Colab and demo repo &#11015;&#65039;</p><ul><li><p><strong><a href="https://colab.research.google.com/github/typedef-ai/fenic-examples/blob/main/ai_feature_engineering/fenic_ai_feature_engineering.ipynb">Open Colab</a></strong></p></li><li><p><strong><a href="https://github.com/typedef-ai/fenic-examples/tree/main/ai_feature_engineering">Open Repo</a></strong></p></li></ul><p>This post focuses on the <em>why</em> and <em>what</em>, so you can decide quickly and click through.</p><div><hr></div><p>Most teams hit the same wall with their content:</p><p>You&#8217;ve shipped <strong>hundreds of blog posts, tutorials, and case studies</strong> over the years. Some of them still convert. Some are outdated. Some overlap so hard they cannibalize each other. 
And when someone asks a simple question like:</p><blockquote><p>&#8220;What are our <em>best</em> beginner tutorials on clustering with code?&#8221;</p></blockquote><p>&#8230;you open a spreadsheet, search the blog, and start skimming <strong>one page at a time</strong>.</p><p>This guide walks through a different approach.</p><p>Instead of manually tagging and skimming, we:</p><ol><li><p>Ingest a small corpus of articles (titles, URLs, text snippets).</p></li><li><p>Enrich each row with <strong>cheap features</strong> like length and &#8220;has code.&#8221;</p></li><li><p>Use <strong>semantic operators</strong> to add embeddings and cluster labels.</p></li><li><p>Classify each article&#8217;s <strong>narrative intent</strong> (tutorial, thinkpiece, case study, etc.).</p></li><li><p>Estimate a <strong>complexity bucket</strong> (beginner/intermediate/advanced).</p></li><li><p>Build a <strong>cluster report</strong> with exemplars and LLM-generated summaries.</p></li><li><p>Export a clean <strong>feature table</strong> for BI tools, agents, or downstream ranking.</p></li></ol><p>All of it is built with <strong><a href="https://docs.fenic.ai/latest/">fenic</a></strong>. An opinionated, PySpark-inspired DataFrame framework designed for AI and agentic applications, with first-class semantic operators like <code>semantic.classify, semantic.extract</code>, and clustering built in.</p><p>If you&#8217;re responsible for docs, developer education, or marketing content, this is the kind of &#8220;content intelligence&#8221; you can actually use.</p><div class="pullquote"><p>Know someone who might need this? 
Share this post with your network and friends!</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://neurlcreators.substack.com/p/build-fenic-pipeline-for-content-intelligence?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://neurlcreators.substack.com/p/build-fenic-pipeline-for-content-intelligence?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share</span></a></p></div><h2><strong>What &#8220;done&#8221; looks like</strong></h2><p>By the end of the notebook, you have two main artifacts:</p><p><strong>A feature table</strong> (<code>features.parquet/features.csv</code>) with one row per article:</p><pre><code>url
title
body_clip          # short text snippet for fast semantic work
char_len           # length of full body
clip_len           # length of snippet
has_code           # boolean
complexity_bucket  # e.g. &#8216;beginner&#8217;, &#8216;practitioner&#8217;, &#8216;advanced&#8217;
intent             # e.g. &#8216;tutorial/how-to&#8217;, &#8216;news/announcement&#8217;, ...
cluster            # numeric cluster id
cluster_label      # human-friendly cluster label (e.g. &#8216;K-Means basics&#8217;)</code></pre><ol><li><p>A <strong>cluster report</strong> (<code>cluster_report.csv</code>) with ~10 clusters:</p><ul><li><p>Cluster ID + label</p></li><li><p>Exemplar article (title + URL) closest to the centroid</p></li><li><p>LLM-generated bullets summarizing what that cluster is &#8220;about&#8221;</p></li></ul></li></ol><p>Once you have this, you can ask questions like:</p><ul><li><p>&#8220;Show me all <strong>tutorial</strong> posts that are <strong>beginner</strong> level and <strong>include code</strong>.&#8221;</p></li><li><p>&#8220;What are our <strong>top 10 clusters</strong> of content, by article count?&#8221;</p></li><li><p>&#8220;Find <strong>opinion pieces</strong> about <strong>AI in business</strong> that link well with this new announcement.&#8221;</p></li></ul><p>Or from an agent&#8217;s perspective:</p><blockquote><p><em>&#8220;Given this developer&#8217;s last 3 reads, suggest 2 more articles that match their level and intent.&#8221;</em></p></blockquote><p>The nice part is: all of this emerges from <strong>one fenic pipeline</strong> (load &#10145;&#65039; enrich &#10145;&#65039; embed &#10145;&#65039; cluster &#10145;&#65039; classify &#10145;&#65039; export).</p><div class="digest-post-embed" data-attrs="{&quot;nodeId&quot;:&quot;494d3fcf-6a9e-4962-8450-4a98950f2d98&quot;,&quot;caption&quot;:&quot;This technical guide is a repost from what originally appeared on Typedef AI&#8217;s blog as Build an LLM Agent for Log Clustering &amp; Triage.&quot;,&quot;cta&quot;:&quot;Read full story&quot;,&quot;showBylines&quot;:true,&quot;size&quot;:&quot;lg&quot;,&quot;isEditorNode&quot;:true,&quot;title&quot;:&quot;Build an End-to-End Log Triage Pipeline with Fenic and LangChain&quot;,&quot;publishedBylines&quot;:[{&quot;id&quot;:85806,&quot;name&quot;:&quot;Stephen FIYINFOLUWA Oladele&quot;,&quot;bio&quot;:&quot;AI Engineer | Technical Creator | Building Tech for 
Creatives | Faith is the principal thing &#128170;&#127997; | Athletics | The Modern Day Generalist (TMG) &#127963;&#65039; | Make everything you do beautiful; make them art &#127912; | Ecclesiastes 9:10 &#10013;&#65039;&quot;,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/15119b47-1c09-4ae1-b6d6-df7bffdbab04_2160x2160.jpeg&quot;,&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:null},{&quot;id&quot;:280510396,&quot;name&quot;:&quot;Neurl Creators&quot;,&quot;bio&quot;:null,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/1b0dc913-49dd-4e59-ac39-bbbf289e1744_256x256.png&quot;,&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:null}],&quot;post_date&quot;:&quot;2025-11-18T11:16:03.924Z&quot;,&quot;cover_image&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8bb64dab-b2f6-44ca-a4c9-2496f4172013_1456x1048.png&quot;,&quot;cover_image_alt&quot;:null,&quot;canonical_url&quot;:&quot;https://neurlcreators.substack.com/p/log-triage-pipeline-with-fenic-langchain&quot;,&quot;section_name&quot;:&quot;&#9881;&#65039; BuildAIers&#8217; Toolkit &#9881;&#65039;&quot;,&quot;video_upload_id&quot;:null,&quot;id&quot;:178775171,&quot;type&quot;:&quot;newsletter&quot;,&quot;reaction_count&quot;:1,&quot;comment_count&quot;:0,&quot;publication_id&quot;:3228552,&quot;publication_name&quot;:&quot;The Neural Blueprint: Practical Content for AI Builders &quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!6udc!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff5b15961-5020-4a71-b040-30f9b3d3f232_256x256.png&quot;,&quot;belowTheFold&quot;:true,&quot;youtube_url&quot;:null,&quot;show_links&quot;:null,&quot;feed_url&quot;:null}"></div><h2><strong>Why fenic is a good fit for this problem</strong></h2><p>Content intelligence is messy. 
You deal with:</p><ul><li><p>Unstructured text (markdown, HTML, scraped pages)</p></li><li><p>A mix of cheap heuristics (string length, regex) and semantic signals (embeddings, classification)</p></li><li><p>The need to batch model calls and keep costs predictable</p></li></ul><p>You <em>could</em> wire this together with Pandas, a few bespoke scripts, and some async model calls. But it gets hard to:</p><ul><li><p>Explain what&#8217;s happening</p></li><li><p>Reproduce the run later</p></li><li><p>Move from notebook to scheduled job to agentic tools</p></li></ul><p>fenic&#8217;s value here is that <strong>everything is a DataFrame transformation</strong>:</p><ul><li><p><a href="https://docs.fenic.ai/latest/reference/fenic/api/functions/semantic/#fenic.api.functions.semantic.embed">semantic.embed</a> to add an <code>emb</code> column</p></li><li><p><a href="https://docs.fenic.ai/latest/reference/fenic/api/functions/semantic/#fenic.api.functions.semantic.classify">semantic.classify</a> for narrative intent</p></li><li><p><a href="https://docs.fenic.ai/latest/reference/fenic/api/functions/semantic/#fenic.api.functions.semantic.extract">semantic.extract</a> to produce typed summaries</p></li><li><p><code>write.parquet/write.csv</code> for export</p></li></ul><p>You describe the pipeline once, and it can run in Colab today and as a scheduled job later. There is no need for custom batching, model SDK glue, or a separate ETL job.</p><div class="pullquote"><p><em>Need creative, high-quality technical content? Happy to chat! Book a call with our Creative Engineers&#128071;&#127998;</em></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://calendar.app.google/s6ekF2rDbPjKWLLQ8&quot;,&quot;text&quot;:&quot;Let&#8217;s Chat! &#128222;&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://calendar.app.google/s6ekF2rDbPjKWLLQ8"><span>Let&#8217;s Chat! 
&#128222;</span></a></p></div><h2><strong>Architecture at a glance</strong></h2><p>The notebook is deliberately small and linear. Here&#8217;s the flow you&#8217;ll run in Colab:</p><ol><li><p><strong>Load and normalise<br></strong>Bring a CSV or table with at least <code>url</code>,<code> title</code>,<code> body</code>. Create a short <strong>snippet</strong> (<code>body_clip</code>) that keeps model calls cheap and deterministic.</p></li><li><p><strong>Cheap features first<br></strong>Compute <code>char_len</code>,<code> clip_len</code>, and a simple <strong>has_code</strong> flag (look for code fences/backticks/language tags). These already answer useful questions before any embeddings.</p></li><li><p><strong>Semantic clusters<br></strong> Embed <code>body_clip</code> and group articles into <strong>K</strong> coherent clusters. For each cluster:</p><ul><li><p>Pick the <strong>exemplar</strong> (closest to centroid)</p></li><li><p>Generate a concise <strong>cluster label</strong> (human-readable)</p></li></ul></li><li><p><strong>Narrative intent<br></strong>Use <strong>few-shot </strong><code>semantic.classify</code> to tag each article as tutorial/announcement/explainer/opinion/case study. <em>Crucially,</em> build examples from <strong>your</strong> corpus (not toy prompts) so labels reflect your house style.</p></li><li><p><strong>Complexity buckets<br></strong>Bucket into <code>intro</code>/<code>intermediate</code>/<code>advanced</code> with a transparent rule over <code>char_len</code> and <code>has_code</code>.</p></li><li><p><strong>Cluster summaries<br></strong>For the ~10 clusters, run <strong>grounded extraction</strong> to get 1&#8211;3 bullets on audience + tone (typed via a mini Pydantic schema). These bullets anchor editorial decisions and agent prompts.</p></li><li><p><strong>Export<br></strong>Write <code>features.parquet</code> and <code>cluster_report.csv</code> with fenic&#8217;s native writers. 
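</p><p>Steps 2 and 5 are deliberately transparent rules rather than model calls; in plain Python they might look like the following (the markers and thresholds are illustrative assumptions, not fenic&#8217;s API):</p>

```python
CODE_MARKERS = ("`", "~~~")  # illustrative fence/backtick heuristic

def has_code(text: str) -> bool:
    # Step 2: cheap "has code" flag over the raw text
    return any(m in text for m in CODE_MARKERS)

def complexity_bucket(char_len: int, has_code_flag: bool) -> str:
    # Step 5: transparent rule over length and code presence
    if char_len < 3000 and not has_code_flag:
        return "intro"
    if char_len < 8000:
        return "intermediate"
    return "advanced"

print(has_code("run `pip install fenic`"), complexity_bucket(1500, False))
# True intro
```

<p>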
Done.</p></li></ol><p><strong>All of the above is implemented in the <a href="https://colab.research.google.com/github/typedef-ai/fenic-examples/blob/main/ai_feature_engineering/fenic_ai_feature_engineering.ipynb">Colab notebook</a>.</strong></p><h2><strong>Where this becomes useful</strong></h2><p>Once you have <code>features.parquet</code> and <code>cluster_report.csv</code>, there are several easy wins.</p><h3><strong>1. Smarter RAG and content search</strong></h3><p>Instead of feeding your LLM the &#8220;top 5 BM25 matches,&#8221; you can:</p><ul><li><p>Prefer <strong>tutorial/how-to</strong> articles when the question sounds like &#8220;how do I&#8230;?&#8221;</p></li><li><p>Prefer <strong>case studies</strong> when the question is about business impact</p></li><li><p>Filter to <strong>beginner-level</strong> content for newer users</p></li></ul><p>You already have the features; you just plug them into your retrieval and ranking logic.</p><h3><strong>2. Content recommendations (&#8220;read next&#8221;)</strong></h3><p>Given an article the user just read:</p><ul><li><p>Use the <strong>embedding</strong> and <strong>cluster</strong> to find semantically similar pieces</p></li><li><p>Use <strong>intent</strong> and <strong>complexity</strong> to avoid repetitive or mismatched suggestions</p></li></ul><p><strong>Example:</strong></p><blockquote><p>If a user just finished a &#8220;beginner&#8221; explainer in cluster &#8220;K-Means basics,&#8221; recommend one &#8220;practitioner&#8221; tutorial and one related opinion piece from the same cluster.</p></blockquote><h3><strong>3. 
Editorial analytics and gaps</strong></h3><p>Your content/product teams can now ask:</p><ul><li><p>&#8220;How many <strong>advanced</strong> tutorials do we have on observability?&#8221;</p></li><li><p>&#8220;Are most of our <strong>opinion pieces</strong> clustered around the same theme?&#8221;</p></li><li><p>&#8220;Which clusters have no case studies yet?&#8221;</p></li></ul><p>Because everything is just a table, you can answer these with a few fenic <a href="https://docs.fenic.ai/latest/reference/fenic/#fenic.DataFrame.group_by">group_by</a>/<a href="https://docs.fenic.ai/latest/reference/fenic/#fenic.GroupedData.agg">agg</a> calls or plug the parquet into your BI tool of choice.</p><h3><strong>4. Agent surfaces</strong></h3><p>Finally, this is a perfect substrate for agents.</p><p>You can expose read-only <a href="https://docs.fenic.ai/latest/reference/fenic/api/mcp/tools/">MCP tools</a> like:</p><ul><li><p><code>list_articles(intent, complexity)</code></p></li><li><p><code>similar_articles(url)</code></p></li><li><p><code>cluster_overview(cluster_id)</code></p></li></ul><p>Each tool is a <strong>deterministic fenic query</strong> over the feature table. 
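</p><p>As a concrete illustration, here is a pure-Python stand-in for <code>list_articles</code> over the exported feature table; the rows are hypothetical, and in practice the filter would be a fenic query exposed as an MCP tool:</p>

```python
# Hypothetical rows from features.parquet, reduced to plain dicts
FEATURES = [
    {"url": "/kmeans-basics", "intent": "tutorial/how-to",
     "complexity_bucket": "beginner"},
    {"url": "/ai-in-business", "intent": "opinion",
     "complexity_bucket": "intermediate"},
    {"url": "/observability-deep-dive", "intent": "tutorial/how-to",
     "complexity_bucket": "advanced"},
]

def list_articles(intent, complexity):
    # Deterministic filter: the agent queries the catalog instead of guessing
    return [row["url"] for row in FEATURES
            if row["intent"] == intent
            and row["complexity_bucket"] == complexity]

print(list_articles("tutorial/how-to", "beginner"))  # ['/kmeans-basics']
```

<p>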
Your agent becomes a thin layer that:</p><ol><li><p>Interprets the user&#8217;s request</p></li><li><p>Calls the relevant tool(s)</p></li><li><p>Renders the result in a friendly way</p></li></ol><p>No need to let the model &#8220;<a href="https://openai.com/index/why-language-models-hallucinate/">hallucinate</a>&#8221; your catalog; it just queries it.</p><h2><strong>Cost, latency, and control</strong></h2><p>Because fenic handles batching and keeps the API surface minimal, it&#8217;s straightforward to reason about cost:</p><ul><li><p><strong>Embeddings</strong>: main paid step; we embed <code>body_clip</code> once per article</p></li><li><p><strong>Classification</strong>: one <code>semantic.classify</code> call per row, typically via a small &#8220;mini&#8221; LLM</p></li><li><p><strong>Extraction</strong>: only for ~10 cluster exemplars, so negligible</p></li></ul><p>You can tune:</p><ul><li><p>The snippet length (<code>body_clip</code>)</p></li><li><p>The subset of rows you classify (e.g., skip archival content)</p></li><li><p>The model aliases used in your <code>semantic_cfg</code></p></li></ul><p>Everything else (feature engineering, clustering, exports) runs locally.</p><h2><strong>Adopting this pattern in your own stack</strong></h2><p>If you want to replicate this with your docs or blog:</p><ol><li><p><strong>Load your corpus</strong> into fenic with at least <code>url</code>, <code>title</code>, and <code>body</code>.</p></li><li><p>Add cheap features (<code>char_len</code>, <code>has_code</code>, maybe tags or product names).</p></li><li><p>Use <code>with_embeddings</code> and <code>with_cluster_labels</code> to create semantic clusters.</p></li><li><p>Summarise clusters with <code>semantic.extract</code> so humans can reason about them.</p></li><li><p>Define a small, opinionated <strong>intent taxonomy</strong> and build few-shot examples from your corpus.</p></li><li><p>Add a lightweight <strong>complexity bucket</strong> signal that fits your 
audience.</p></li><li><p>Export <code>features.parquet</code> and integrate it into RAG, search, agents, or analytics.</p></li></ol><p>You don&#8217;t need to adopt agents on day one. Even just having a <strong>clean content feature table</strong> will pay off in search, analytics, and planning.</p><blockquote><p><strong>&#128161; Maybe useful:</strong> <a href="https://www.typedef.ai/blog/from-threads-to-themes-how-we-built-a-fast-hn-research-agent-with-fenic-pydantic-ai">From Threads to Themes &#8211; How We Built a Fast HN Research Agent with fenic + Pydantic-AI</a></p></blockquote><h2><strong>Conclusion and closing thoughts</strong></h2><p>If your team is sitting on a large, messy content archive and you&#8217;re not sure what to do next, the pipeline is a very practical way to <strong>see the shape of what you already have</strong>.</p><p>fenic&#8217;s sweet spot is exactly this kind of work: <em>turning unstructured text into structured, semantic, agent-ready tables</em> without giving up the simplicity of a DataFrame pipeline.</p><p>fenic gives you the DataFrame-style ergonomics you&#8217;re used to, plus semantic tools built for this kind of AI-adjacent work.</p><p>The rest is just deciding what questions you care about and letting the pipeline answer them.</p><div class="pullquote"><p><em>Join the conversation and share your experiences in the comments below!</em></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://neurlcreators.substack.com/p/build-fenic-pipeline-for-content-intelligence/comments&quot;,&quot;text&quot;:&quot;Leave a comment&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://neurlcreators.substack.com/p/build-fenic-pipeline-for-content-intelligence/comments"><span>Leave a comment</span></a></p></div><h2><strong>Try the demo, then make it yours</strong></h2><ul><li><p><strong><a 
href="https://colab.research.google.com/github/typedef-ai/fenic-examples/blob/main/ai_feature_engineering/fenic_ai_feature_engineering.ipynb">Clone the demo</a>:</strong> Use our Colab and point it at your own small corpus slice first. Keep <code>body_clip</code> short, then grow from there.</p></li><li><p><strong><a href="https://docs.fenic.ai/latest/">Docs</a>:</strong> semantic operators, text utilities, batch inference</p></li><li><p><strong><a href="https://github.com/typedef-ai/fenic/tree/main/examples">Examples</a>:</strong> GitHub &#8594; <code>typedef-ai/fenic</code> (look for other applications and MCP tool demos).</p></li></ul><div class="pullquote"><p><em>Subscribe to <a href="https://neurlcreators.substack.com/">The Neural Blueprint</a>&#128071;&#127998; for hands-on guides! &#129761; Follow us on <a href="http://www.youtube.com/@neurlcreators">YouTube</a>, <a href="https://x.com/NeurlCreators">X</a>, and <a href="https://www.linkedin.com/showcase/neurl-creators/">LinkedIn</a>.</em></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://neurlcreators.substack.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://neurlcreators.substack.com/subscribe?"><span>Subscribe now</span></a></p></div>]]></content:encoded></item><item><title><![CDATA[Small Language Models (SLMs): LLMs You Can Run on Your CPU Without Quantization]]></title><description><![CDATA[Comprehensive look at the Small Language Models (SLMs) with minimal memory footprint and blazing-fast inference.]]></description><link>https://neurlcreators.substack.com/p/llms-you-can-run-on-your-cpu-without</link><guid isPermaLink="false">https://neurlcreators.substack.com/p/llms-you-can-run-on-your-cpu-without</guid><dc:creator><![CDATA[Eteimorde Youdiowei]]></dc:creator><pubDate>Wed, 26 Nov 2025 16:16:12 GMT</pubDate><enclosure 
url="https://substack-post-media.s3.amazonaws.com/public/images/0ae048b2-8475-4d71-8628-e0b737af569c_1456x1048.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>It&#8217;s wild how quickly the LLM ecosystem has evolved. A few years ago, the idea of running a billion-parameter model on a regular laptop felt completely out of reach. Then quantization arrived, and suddenly, CPUs became a viable playground for LLMs. </p><p>But here&#8217;s the twist: models aren&#8217;t just smaller now; they&#8217;re smarter, more efficient, and in some cases capable enough to run in full precision without any compression at all. Which leads to the big question: </p><blockquote><p><strong>What models today can run unquantized directly on your CPU?</strong></p></blockquote><p>That&#8217;s what this article is all about. We&#8217;ll walk through the small but powerful models you can run locally without quantization, focusing on three key categories:</p><ul><li><p><strong>Language models</strong></p></li><li><p><strong>Reasoning models</strong></p></li><li><p><strong>Agent-ready models with tool-use capabilities</strong></p></li></ul><p>Along the way, we&#8217;ll compare them using two practical criteria: how much memory they require and how fast they run during inference.</p><h2>Gemma 3 270M: The perfect Small Language Model</h2><p><a href="https://huggingface.co/google/gemma-3-270m-it">Gemma 3 270M</a> is one of the smallest modern language models available, and its tiny size makes it ideal for CPU inference. 
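</p><p>A back-of-envelope calculation shows why: at full 16-bit (bf16) precision, the weights alone occupy roughly half a gigabyte, leaving plenty of headroom for activations and the KV cache:</p>

```python
# Weight memory for a 270M-parameter model at full 16-bit precision
params = 270_000_000
bytes_per_param = 2  # bf16/fp16, i.e. no quantization
weight_gb = params * bytes_per_param / 1e9
print(round(weight_gb, 2))  # 0.54
```

<p>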
It delivers surprisingly fast generation speeds and has a very small memory footprint, making it easy to run on machines with as little as <strong>8 GB of RAM</strong>.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!-wP8!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6ef407f2-2105-41e8-8504-d8a1e2ad1531_1600x900.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!-wP8!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6ef407f2-2105-41e8-8504-d8a1e2ad1531_1600x900.png 424w, https://substackcdn.com/image/fetch/$s_!-wP8!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6ef407f2-2105-41e8-8504-d8a1e2ad1531_1600x900.png 848w, https://substackcdn.com/image/fetch/$s_!-wP8!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6ef407f2-2105-41e8-8504-d8a1e2ad1531_1600x900.png 1272w, https://substackcdn.com/image/fetch/$s_!-wP8!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6ef407f2-2105-41e8-8504-d8a1e2ad1531_1600x900.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!-wP8!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6ef407f2-2105-41e8-8504-d8a1e2ad1531_1600x900.png" width="1456" height="819" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/6ef407f2-2105-41e8-8504-d8a1e2ad1531_1600x900.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:819,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!-wP8!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6ef407f2-2105-41e8-8504-d8a1e2ad1531_1600x900.png 424w, https://substackcdn.com/image/fetch/$s_!-wP8!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6ef407f2-2105-41e8-8504-d8a1e2ad1531_1600x900.png 848w, https://substackcdn.com/image/fetch/$s_!-wP8!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6ef407f2-2105-41e8-8504-d8a1e2ad1531_1600x900.png 1272w, https://substackcdn.com/image/fetch/$s_!-wP8!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6ef407f2-2105-41e8-8504-d8a1e2ad1531_1600x900.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a><figcaption class="image-caption">Gemma 3 270M Performance on the Instruction-Following Evaluation [<a href="https://developers.googleblog.com/en/introducing-gemma-3-270m/">Source</a>]</figcaption></figure></div><p>The model comes in two variants:</p><ul><li><p><strong>Gemma 3 270M PT:</strong> The pretrained version</p></li><li><p><strong>Gemma 3 270M IT:</strong> The instruction-tuned and chat-focused version</p></li></ul><p>The instruction-tuned version is exciting because it can maintain natural conversations in a way that feels much closer to larger models. Despite its small parameter count, it handles dialogue and everyday tasks better than many older models that were several times its size.</p><p>Thanks to its compact architecture, Gemma 3 270M is also perfect for fine-tuning experiments. You can fine-tune it on a GPU in Google Colab, export the resulting model, and then run it on your CPU <strong>without any quantization</strong>. 
This versatility makes it one of the most flexible and accessible small LLMs available today.</p><div class="pullquote"><p>Know someone who might need this? Share this post with your network and friends!</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://neurlcreators.substack.com/p/llms-you-can-run-on-your-cpu-without?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://neurlcreators.substack.com/p/llms-you-can-run-on-your-cpu-without?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share</span></a></p></div><h3>Running the model</h3><p>Running the unquantized version of Gemma 3 270M is straightforward. Once you have access to the model on Hugging Face, you can load it directly using the <a href="https://huggingface.co/docs/transformers/en/index">Transformers library</a>:</p><pre><code>from transformers import pipeline
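# Note: the Gemma models are gated, so downloads need an authenticated
# Hugging Face session. A minimal sketch (assumptions: the huggingface_hub
# package is installed and your token is in the HF_TOKEN env var):
import os
try:
    from huggingface_hub import login
    if os.environ.get("HF_TOKEN"):
        login(token=os.environ["HF_TOKEN"])
except ImportError:
    pass  # hub client missing; Transformers falls back to cached credentials
status = "authenticated" if os.environ.get("HF_TOKEN") else "anonymous (gated downloads may fail)"
print("HF session:", status)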

pipe = pipeline("text-generation", model="google/gemma-3-270m-it")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)</code></pre><p>Although the model is free to use, it is <strong><a href="https://huggingface.co/docs/hub/en/models-gated">gated</a></strong>, meaning you must request and receive permission on Hugging Face before you can download or load it.</p><h2>Gemma 3 1B: Bigger but Still Small</h2><p>If you want something noticeably more capable than Gemma 3 270M, <strong><a href="https://huggingface.co/google/gemma-3-1b-it">Gemma 3 1B</a></strong> is the next best step up. It consistently outperforms the 270M model across benchmarks while still maintaining a small memory footprint and fast inference speeds. And just like its smaller sibling, it comes in two variants: a <strong>pretrained (PT)</strong> version and an <strong>instruction-tuned (IT)</strong> version.</p><p>Despite being larger, Gemma 3 1B remains lightweight enough to run unquantized on many CPUs, making it a great middle ground between tiny models and full-scale LLMs.</p><h3>Running the model</h3><p>Running Gemma 3 1B is identical to running the 270M version: simply reference the new model name when loading it through the Transformers pipeline:</p><pre><code>from transformers import pipeline

pipe = pipeline("text-generation", model="google/gemma-3-1b-it")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)</code></pre><h2>Qwen 3 0.6B: A Thinking Agent in a Small Package</h2><p><a href="https://huggingface.co/Qwen/Qwen3-0.6B">Qwen 3 0.6B</a> is another model that can be run fully on the CPU without quantization, but unlike the Gemma models, it offers much more than standard text generation. Qwen 3 0.6B is both a <strong><a href="https://platform.openai.com/docs/guides/reasoning">reasoning model</a></strong> and a <strong><a href="https://platform.openai.com/docs/guides/tools">tool-using model</a></strong>, giving it capabilities typically found in much larger LLMs.</p><p>This makes it especially impressive for a model that still maintains a very small memory footprint. The only trade-off is that reasoning mode is slower, since generating internal thoughts produces more tokens.</p><p>Qwen 3 0.6B is a great example of how far small models have come. They&#8217;re no longer limited to simple text generation: they can reason, plan, and interact with tools, making them suitable for lightweight agent setups that run entirely on a CPU.</p><div class="pullquote"><p><em>Need creative, high-quality technical content? Happy to chat! Book a call with our Creative Engineers&#128071;&#127998;</em></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://calendar.app.google/s6ekF2rDbPjKWLLQ8&quot;,&quot;text&quot;:&quot;Let&#8217;s Chat! &#128222;&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://calendar.app.google/s6ekF2rDbPjKWLLQ8"><span>Let&#8217;s Chat! &#128222;</span></a></p></div><h3>Running the model</h3><p>Just like the Gemma models, Qwen 3 0.6B is available on Hugging Face. Below is an example showing how to enable the model&#8217;s <strong>thinking mode</strong>:</p><pre><code>from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Qwen/Qwen3-0.6B"

# load the tokenizer and the model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)

# prepare the model input
prompt = "How many R are in Blueberry?"
messages = [
    {"role": "user", "content": prompt}
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=True # Switches between thinking and non-thinking modes. Default is True.
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

# conduct text completion
generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=32768
)
output_ids = generated_ids[0][len(model_inputs.input_ids[0]):].tolist()
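# How fast is CPU inference? A tiny helper (hypothetical, not part of
# Transformers) for measuring generation throughput in tokens per second:
import time

def tokens_per_second(generate_fn, n_new_tokens):
    t0 = time.perf_counter()
    generate_fn()
    return n_new_tokens / (time.perf_counter() - t0)

# e.g. tokens_per_second(lambda: model.generate(**model_inputs, max_new_tokens=64), 64)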

# parsing thinking content
try:
    # rindex finding 151668 (&lt;/think&gt;)
    index = len(output_ids) - output_ids[::-1].index(151668)
except ValueError:
    index = 0

thinking_content = tokenizer.decode(output_ids[:index], skip_special_tokens=True).strip("\n")
content = tokenizer.decode(output_ids[index:], skip_special_tokens=True).strip("\n")

print("thinking content:", thinking_content)
print("content:", content)</code></pre><p>Below is the model&#8217;s output:</p><pre><code>thinking content: &lt;think&gt;
Okay, so I need to figure out how many times the letter &#8220;R&#8221; appears in the word &#8220;blueberry&#8221;. Let me start by writing down the word and looking at each letter one by one.

The word is &#8220;blueberry&#8221;. Let me spell it out: B-L-U-E-B-R-R-Y. Now, I need to check each letter for the letter &#8220;R&#8221;. Let me go through each position:

First letter: B &#8211; not R.
Second: L &#8211; nope.
Third: U &#8211; no.
Fourth: E &#8211; no.
Fifth: B &#8211; no.
Sixth: R &#8211; here we go! The first &#8220;R&#8221; is at position 6. Then the next letter is &#8220;R&#8221; again at position 11. So that&#8217;s two instances.

Wait, let me check again to make sure I didn&#8217;t miss anything. Let&#8217;s break it down:

B, L, U, E, B, R, R, Y. So positions 6 and 11. So yes, two &#8220;R&#8221;s. I think that&#8217;s it. I don&#8217;t see any other &#8220;R&#8221;s in the word. So the answer should be 2.
&lt;/think&gt;
content: How many R is in blueberry?

The word &#8220;blueberry&#8221; is spelled B-L-U-E-B-R-R-Y. 

Looking at each letter:
- The first R is at position 6.
- The second R is at position 11.

Therefore, the number of R&#8217;s in &#8220;blueberry&#8221; is **2**.</code></pre><p>Tool use can be combined with thinking mode. Here&#8217;s an example:</p><pre><code>import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "Qwen/Qwen3-0.6B"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint, dtype="auto", device_map="auto")

def get_current_temperature(location: str, unit: str):
    """
    Get the current temperature at a location.

    Args:
        location: The location to get the temperature for, in the format "City, Country"
        unit: The unit to return the temperature in. (choices: ["celsius", "fahrenheit"])
    """
    return 42.

def get_current_wind_speed(location: str):
    """
    Get the current wind speed in km/h at a given location.

    Args:
        location: The location to get the wind speed for, in the format "City, Country"
    """
    return 6.

tools = [get_current_temperature, get_current_wind_speed]

messages = [ {"role": "user", "content": "Hey, what's the temperature in Paris right now?"} ]

inputs = tokenizer.apply_chat_template(messages, tools=tools, add_generation_prompt=True, return_dict=True, return_tensors="pt")
outputs = model.generate(**inputs.to(model.device), max_new_tokens=128)
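# Illustrative post-processing (not from the Qwen docs): extract the JSON
# from the <tool_call> block and dispatch it to a Python function. `sample`
# hard-codes the completion format the model emits, and the stub lambdas
# mirror the functions defined above.
import json, re

sample = ('<tool_call>\n{"name": "get_current_temperature", '
          '"arguments": {"location": "Paris, France", "unit": "celsius"}}\n</tool_call>')
match = re.search(r"<tool_call>\s*(\{.*\})\s*</tool_call>", sample, re.DOTALL)
if match:
    call = json.loads(match.group(1))
    registry = {"get_current_temperature": lambda location, unit: 42.,
                "get_current_wind_speed": lambda location: 6.}
    result = registry[call["name"]](**call["arguments"])
    print("tool result:", result)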
print(tokenizer.decode(outputs[0][len(inputs["input_ids"][0]):]))</code></pre><p>Model&#8217;s output:</p><pre><code>&lt;think&gt;
Okay, the user is asking for the current temperature in Paris. Let me check the tools available. There&#8217;s get_current_temperature which requires location and unit. The user didn&#8217;t specify the unit, but maybe I should assume Celsius unless told otherwise. The location is Paris, so I need to call the function with location &#8220;Paris, France&#8221; and unit &#8220;celsius&#8221;. Let me make sure the parameters are correctly formatted. Yes, that should do it.
&lt;/think&gt;

&lt;tool_call&gt;
{"name": "get_current_temperature", "arguments": {"location": "Paris, France", "unit": "celsius"}}
&lt;/tool_call&gt;&lt;|im_end|&gt;</code></pre><h2>Qwen 3 1.7B: A Billion Parameters Bigger but Still Small</h2><p><a href="https://huggingface.co/Qwen/Qwen3-1.7B">Qwen 3 1.7B</a> takes everything the 0.6B model can do and scales it up by an extra billion parameters. The result is a model that still runs on CPU <strong>without quantization</strong>, but with noticeably stronger reasoning, improved tool use, and better overall performance.</p><p>Despite being the largest model on this list, it remains lightweight enough to run on machines with <strong>8 GB of RAM</strong>, making it an impressive &#8220;upper-limit&#8221; option for small-model experimentation on consumer hardware.</p><p>The trade-off?<br><br>Inference can be slow on CPU, especially when <strong>thinking mode</strong> is enabled. But if you don&#8217;t mind waiting a bit longer, it consistently produces solid reasoning and remains one of the most capable CPU-friendly LLMs you can run today.</p><div class="pullquote"><p><em>Join the conversation and share your experiences in the comments below!</em></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://neurlcreators.substack.com/p/llms-you-can-run-on-your-cpu-without/comments&quot;,&quot;text&quot;:&quot;Leave a comment&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://neurlcreators.substack.com/p/llms-you-can-run-on-your-cpu-without/comments"><span>Leave a comment</span></a></p></div><h3>Running the model</h3><p>Running Qwen 3 1.7B is identical to running Qwen 3 0.6B:</p><pre><code>from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Qwen/Qwen3-1.7B"

# load the tokenizer and the model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)
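# Why these models fit in 8 GB of RAM: a back-of-envelope check (pure
# arithmetic; parameter counts are approximate, and inference adds
# activations plus KV cache on top of the raw weights).
bytes_per_param = 2  # bfloat16/float16, i.e. unquantized as shipped
model_params = {"gemma-3-270m": 270e6, "gemma-3-1b": 1e9,
                "qwen3-0.6b": 0.6e9, "qwen3-1.7b": 1.7e9}
weight_gb = {name: p * bytes_per_param / 1e9 for name, p in model_params.items()}
for name, gb in weight_gb.items():
    print(f"{name}: ~{gb:.1f} GB of weights")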

# prepare the model input
prompt = "How K is in Strawberry?"
messages = [
    {"role": "user", "content": prompt}
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=True # Switches between thinking and non-thinking modes. Default is True.
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

# conduct text completion
generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=32768
)
output_ids = generated_ids[0][len(model_inputs.input_ids[0]):].tolist()

# parsing thinking content
try:
    # rindex finding 151668 (&lt;/think&gt;)
    index = len(output_ids) - output_ids[::-1].index(151668)
except ValueError:
    index = 0

thinking_content = tokenizer.decode(output_ids[:index], skip_special_tokens=True).strip("\n")
content = tokenizer.decode(output_ids[index:], skip_special_tokens=True).strip("\n")

print("thinking content:", thinking_content)
print("content:", content)</code></pre><p>Example Output:</p><pre><code>thinking content: &lt;think&gt;
Okay, the user is asking &#8220;How K is in Strawberry?&#8221; which seems a bit confusing. Let me break this down.

First, &#8220;Strawberry&#8221; is a common fruit, but the question is about the letter &#8220;K&#8221; in it. So maybe they&#8217;re asking how the letter &#8220;K&#8221; appears in the word &#8220;Strawberry.&#8221;

Let me check the spelling of &#8220;Strawberry.&#8221; S-T-R-A-W-B-E-R-R-Y. Wait, is there a &#8220;K&#8221; in there? Let me spell it again: S-T-R-A-W-B-E-R-R-Y. No, there&#8217;s no &#8220;K&#8221; in &#8220;Strawberry.&#8221; So the answer would be that there&#8217;s no &#8220;K&#8221; in the word.

But maybe the user is thinking of a different word. Could it be a typo? Like &#8220;Strawberry&#8221; might be misspelled, or maybe they meant another word. Alternatively, maybe they&#8217;re referring to a different context where &#8220;K&#8221; is involved. For example, in the name &#8220;Strawberry&#8221; in a different language, but I don&#8217;t think that&#8217;s the case here.

Alternatively, maybe the user is asking about the position of the letter &#8220;K&#8221; in the word, but since there&#8217;s no &#8220;K,&#8221; the answer would be that there&#8217;s no &#8220;K&#8221; in &#8220;Strawberry.&#8221;

Wait, but maybe they&#8217;re thinking of a different word. For example, &#8220;strawberry&#8221; in some other language. Let me check. In French, &#8220;strawberry&#8221; is &#8220;framboise,&#8221; which doesn&#8217;t have a &#8220;K.&#8221; In Spanish, &#8220;fresa,&#8221; which also doesn&#8217;t have a &#8220;K.&#8221; In German, &#8220;Birne&#8221; is a different fruit, but that&#8217;s not strawberry. So no &#8220;K&#8221; there either.

Alternatively, maybe the user is referring to the letter &#8220;K&#8221; in the name &#8220;Strawberry&#8221; as a brand or something else. But I don&#8217;t think that&#8217;s the case. The question seems straightforward, so the answer is that there&#8217;s no &#8220;K&#8221; in &#8220;Strawberry.&#8221;

I should also consider if there&#8217;s any possible way the user might have made a mistake. For example, maybe they meant &#8220;Strawberry&#8221; with a &#8220;K&#8221; in another part of the word. But as written, &#8220;Strawberry&#8221; doesn&#8217;t have a &#8220;K.&#8221; So the answer is that there&#8217;s no &#8220;K&#8221; in the word.
&lt;/think&gt;
content: The letter **K** does not appear in the word **&#8221;Strawberry&#8221;**. 

**Explanation:**  
The spelling of &#8220;Strawberry&#8221; is **S-T-R-A-W-B-E-R-R-Y**, which contains no &#8220;K&#8221; sound or letter. 

If you&#8217;re referring to a different word or context where &#8220;K&#8221; might be involved, feel free to clarify! &#128522;</code></pre><p>Even with thinking mode disabled, Qwen 3 1.7B still performs tool calls accurately:</p><pre><code>import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "Qwen/Qwen3-1.7B"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint, dtype="auto", device_map="auto")

def get_current_temperature(location: str, unit: str):
    """
    Get the current temperature at a location.

    Args:
        location: The location to get the temperature for, in the format "City, Country"
        unit: The unit to return the temperature in. (choices: ["celsius", "fahrenheit"])
    """
    return 42.

def get_current_wind_speed(location: str):
    """
    Get the current wind speed in km/h at a given location.

    Args:
        location: The location to get the wind speed for, in the format "City, Country"
    """
    return 6.

tools = [get_current_temperature, get_current_wind_speed]

messages = [ {"role": "user", "content": "Hey, what's the temperature in Paris right now?"} ]

inputs = tokenizer.apply_chat_template(messages, tools=tools, add_generation_prompt=True, return_dict=True, enable_thinking=False, return_tensors="pt")
outputs = model.generate(**inputs.to(model.device), max_new_tokens=128)
print(tokenizer.decode(outputs[0][len(inputs["input_ids"][0]):]))</code></pre><p>Output without thinking:</p><pre><code>&lt;tool_call&gt;
{"name": "get_current_temperature", "arguments": {"location": "Paris, France", "unit": "celsius"}}
&lt;/tool_call&gt;&lt;|im_end|&gt;</code></pre><h2>Conclusion: LLMs You Can Run on Your CPU Without Quantization</h2><p>In this article, we&#8217;ve seen how far large language models have come, becoming <strong>smaller, more efficient, and still surprisingly capable</strong>. </p><p>Today, it&#8217;s possible to run sophisticated LLMs entirely on a CPU <strong>without any quantization</strong>, opening up new possibilities for experimentation, fine-tuning, and lightweight deployment. </p><p>This marks a significant milestone in the evolution of the LLM ecosystem, showing just how accessible powerful AI has become.</p><p>Our goal isn&#8217;t to suggest abandoning quantized models; they remain essential for extremely large models or resource-constrained environments. </p><p>Rather, this exploration highlights the performance of small, unquantized models straight out of the box and encourages a new way of thinking: <strong>for many tasks, do you really need a massive, quantized model, or could a smaller model deliver what you need more efficiently?</strong></p><p>By understanding the strengths of these compact models, developers and researchers can make smarter choices, balancing <strong>speed, memory, and capability</strong> in ways that were not possible just a few years ago.</p><div class="pullquote"><p><em>Subscribe to <a href="https://neurlcreators.substack.com/">The Neural Blueprint</a>&#128071;&#127998; for hands-on guides! 
&#129761; Follow us on <a href="http://www.youtube.com/@neurlcreators">YouTube</a>, <a href="https://x.com/NeurlCreators">X</a>, and <a href="https://www.linkedin.com/showcase/neurl-creators/">LinkedIn</a>.</em></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://neurlcreators.substack.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://neurlcreators.substack.com/subscribe?"><span>Subscribe now</span></a></p></div>]]></content:encoded></item><item><title><![CDATA[Build an End-to-End Log Triage Pipeline with Fenic and LangChain]]></title><description><![CDATA[Build an end-to-end pipeline that provides insights using your system/app logs. Ingest your logs and use natural language to get insights into app failures through an agentic surface.]]></description><link>https://neurlcreators.substack.com/p/log-triage-pipeline-with-fenic-langchain</link><guid isPermaLink="false">https://neurlcreators.substack.com/p/log-triage-pipeline-with-fenic-langchain</guid><dc:creator><![CDATA[Stephen FIYINFOLUWA Oladele]]></dc:creator><pubDate>Tue, 18 Nov 2025 11:16:03 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/81e80791-d989-4867-96a8-923a5b7a7788_1456x1048.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="pullquote"><p><em>This technical guide is a repost from what originally appeared on Typedef AI&#8217;s blog as <a href="https://www.typedef.ai/blog/build-an-llm-agent-for-log-clustering-and-triage">Build an LLM Agent for Log Clustering &amp; Triage</a>.</em></p></div><p>There&#8217;s a point in every incident where the problem isn&#8217;t <em>finding</em> logs but in <em>understanding</em> them. 
Formats drift, error messages mutate, and even well-tuned dashboards can feel like they&#8217;re describing yesterday&#8217;s system.</p><p>In the middle of the scramble, engineers ask the same questions again and again:</p><p><strong>What broke? How widespread is it? Where should we look first?</strong></p><p>This article lays out a pragmatic, agent-first way to answer those questions with <strong><a href="https://github.com/typedef-ai/fenic">Fenic</a></strong>. Fenic is an opinionated, PySpark-inspired DataFrame framework built for AI/agentic applications.</p><p>You&#8217;ll learn how we turn raw logs into severity-aware clusters, expose read-only tools via MCP, and let a LangGraph agent respond to natural-language questions like <em>&#8220;only errors in payments&#8221;</em> or <em>&#8220;show me the raw lines for cluster 7.&#8221;</em></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!eP9E!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7e6f23ad-951d-406e-a6b6-93f5dbe4fcb1_1038x281.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!eP9E!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7e6f23ad-951d-406e-a6b6-93f5dbe4fcb1_1038x281.png 424w, https://substackcdn.com/image/fetch/$s_!eP9E!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7e6f23ad-951d-406e-a6b6-93f5dbe4fcb1_1038x281.png 848w, https://substackcdn.com/image/fetch/$s_!eP9E!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7e6f23ad-951d-406e-a6b6-93f5dbe4fcb1_1038x281.png 1272w, 
https://substackcdn.com/image/fetch/$s_!eP9E!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7e6f23ad-951d-406e-a6b6-93f5dbe4fcb1_1038x281.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!eP9E!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7e6f23ad-951d-406e-a6b6-93f5dbe4fcb1_1038x281.png" width="1038" height="281" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/7e6f23ad-951d-406e-a6b6-93f5dbe4fcb1_1038x281.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:281,&quot;width&quot;:1038,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!eP9E!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7e6f23ad-951d-406e-a6b6-93f5dbe4fcb1_1038x281.png 424w, https://substackcdn.com/image/fetch/$s_!eP9E!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7e6f23ad-951d-406e-a6b6-93f5dbe4fcb1_1038x281.png 848w, https://substackcdn.com/image/fetch/$s_!eP9E!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7e6f23ad-951d-406e-a6b6-93f5dbe4fcb1_1038x281.png 1272w, 
https://substackcdn.com/image/fetch/$s_!eP9E!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7e6f23ad-951d-406e-a6b6-93f5dbe4fcb1_1038x281.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p>The full runnable demo is in a Colab/Repo. 
In this blog, you&#8217;ll see the <em>why</em> and the <em>what</em> so you can decide quickly whether this solution fits your stack.</p><blockquote><p><strong>Try it now &#8594;</strong> <a href="https://colab.research.google.com/github/typedef-ai/fenic-examples/blob/main/oncall_triage_agent/fenic_oncall_triage_agent.ipynb">Open the Colab</a> | <strong>Repo &#8594;</strong> <a href="https://github.com/typedef-ai/fenic-examples/tree/main/oncall_triage_agent">typedef-ai/fenic-examples</a> | <strong>Docs &#8594;</strong><a href="https://docs.fenic.ai/latest/"> https://docs.fenic.ai/latest/</a></p></blockquote><h2><strong>Why a different approach to log triage</strong></h2><p>Most of us have tried two extremes:</p><ol><li><p><strong>Regex everywhere.</strong> It starts clean and ends brittle. One library upgrade or format tweak and your &#8220;error&#8221; dashboard quietly loses half its signal. You might not notice until the night you needed that signal most.</p></li><li><p><strong>One-click LLM summarization.</strong> Tempting, but operationally awkward: cost is difficult to predict, behavior drifts with prompts, and there&#8217;s no clean hand-off back into the tooling your SREs actually use.</p></li></ol><p>The middle path is boring on purpose, but it works: Keep the <strong>DataFrame ergonomics</strong> engineers already use, add <strong>semantic operators</strong> where they pay off, and make the &#8220;AI bits&#8221; visible, testable, and schedulable.</p><p>That&#8217;s the ethos behind Fenic. You manipulate logs like a table, but you also get first-class <strong>text extraction, embeddings, and LLM utilities</strong>. 
This is all with explicit configuration and batch-friendly behavior.</p><h2><strong>The promised end-state (what &#8220;done&#8221; looks like)</strong></h2><p>Imagine asking your system:</p><ul><li><p>&#8220;<strong>Top clusters above WARN</strong> in the last hour.&#8221;</p></li><li><p>&#8220;<strong>Only ERRORs</strong> for <code>payment-api</code>.&#8221;</p></li><li><p>&#8220;<strong>Assignments for cluster 7</strong>&#8212;I want the raw lines.&#8221;</p></li><li><p>&#8220;<strong>Coverage check</strong>&#8212;are we parsing reliably today?&#8221;</p></li></ul><p>The agent answers in seconds because it isn&#8217;t guessing. It&#8217;s calling deterministic <a href="https://neurlcreators.substack.com/p/how-ai-agents-use-tools-mcp-architecture-visual-explainer?utm_source=publication-search">MCP tools</a> backed by DataFrames you just computed:</p><ul><li><p><strong>Severity-aware clusters</strong> of similar failures (stable across parameter noise).</p></li><li><p><strong>Exemplars</strong> that read like summaries, not stack dumps.</p></li><li><p><strong>Evidence</strong> via fingerprints so a human can sanity-check the grouping.</p></li><li><p><strong>Coverage</strong>: a simple metric that tells you whether your parsing kept up with reality.</p></li></ul><p>There is no need to sift through pages of stack traces. 
The agent uses the read-only, queryable views you produced during the pipeline run.</p><div id="youtube2-D5a1jAheATo" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;D5a1jAheATo&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/D5a1jAheATo?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p>Behind the scenes, you also keep operator-ready artifacts (CSV/JSON) for dashboards and a compact Markdown report for humans. Those artifacts are the audit trail; the agent is the interface.</p><p><strong>Interested in Markdown Processing?</strong> Run the <a href="https://colab.research.google.com/github/typedef-ai/fenic/blob/main/examples/markdown_processing/markdown_processing.ipynb">extensive example in Colab</a>.</p><h2><strong>Architecture at a glance (why it stays maintainable)</strong></h2><p>The pipeline that powers those tools is deliberately small and linear. 
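</p><p>Stripped of Fenic specifics, that linear shape fits in a few plain-Python stand-ins (every name and the single template below are illustrative, not the library&#8217;s API):</p>

```python
import re

# Plain-Python stand-ins for the four transforms described below.
# The real pipeline expresses each one as a Fenic DataFrame operation.
TEMPLATE = re.compile(r"^(?P<timestamp>\S+) (?P<level>[A-Z]+) (?P<service>[\w-]+) (?P<message>.*)$")

def parse(line):
    m = TEMPLATE.match(line)
    return m.groupdict() if m else {"message": line}  # unmatched lines fall back to raw text

def severity(row):
    return {"ERROR": "error", "FATAL": "error", "WARN": "warn"}.get(row.get("level"), "info")

def fingerprint(row):
    stem = re.sub(r"\d+", "N", row["message"])  # drop volatile numbers
    return f"{row.get('service', '?')}|{stem}"

def triage(lines):
    rows = [parse(l) for l in lines]
    for r in rows:
        r["severity"], r["fingerprint"] = severity(r), fingerprint(r)
    return rows

row = triage(["2026-01-01T00:00:01 ERROR payment-api timeout after 512ms"])[0]
print(row["severity"], row["fingerprint"])  # error payment-api|timeout after Nms
```

<p>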
Each step is a DataFrame transform you can diff, test, and run headless.</p><h3><strong>1) Parse without brittleness</strong></h3><p>Real systems emit a mix of ISO-ish timestamps, syslog-style messages, Python logging lines, and occasionally JSON.</p><p>With Fenic, you define a few templates (named fields, not brittle capture groups), <a href="https://docs.fenic.ai/latest/reference/fenic/?h=unnest#fenic.DataFrame.unnest">unnest</a> them into candidate columns, and <strong><a href="https://docs.fenic.ai/latest/reference/fenic/?h=coalesce#fenic.coalesce">coalesce</a></strong> to a canonical schema:</p><ul><li><p><code>timestamp</code> (normalize later if needed)</p></li><li><p><code>level</code> (<code>INFO/WARN/ERROR/...</code>)</p></li><li><p><code>service</code> (e.g., <code>payment-api</code>)</p></li><li><p><code>message</code> (payload after the header)</p></li><li><p>optional <code>trace_id</code></p></li></ul><p>A line that doesn&#8217;t match any template doesn&#8217;t vanish; it falls back to raw text. That single choice makes coverage transparent and drift visible.</p><h3><strong>2) Fingerprint for stability</strong></h3><p>Clustering succeeds or fails on the grouping key. We want a fingerprint that ignores volatile tokens (IDs, ports, timings) but preserves the cause. A practical pattern is:</p><pre><code>service | symbol | file#function | stem</code></pre><ul><li><p><strong>symbol</strong>: recognizable error label (<code>TimeoutError</code>, <code>OperationalError</code>, <code>unique_constraint_failed</code>).</p></li><li><p><strong>file#function</strong>: optional call site (useful for stack-tracey languages).</p></li><li><p><strong>stem</strong>: a normalized message that keeps the <em>idea</em> of the failure while stripping noise.</p></li></ul><p>You can extract these with rules or a tiny LLM call. 
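</p><p>The rules route is smaller than it sounds. A sketch assuming the canonical columns from step 1 and a hand-kept list of error symbols (both the list and the call-site pattern are illustrative):</p>

```python
import re

# Rule-based fingerprint: service | symbol | file#function | stem.
# SYMBOLS and CALLSITE are illustrative; grow them as your logs dictate,
# or swap in a tiny LLM call that produces the same fields.
SYMBOLS = ("TimeoutError", "OperationalError", "unique_constraint_failed")
CALLSITE = re.compile(r'File "(?P<file>[^"]+)", line \d+, in (?P<function>\w+)')

def stem(message):
    s = re.sub(r"\s+", " ", message.lower()).strip()
    return re.sub(r"(?<![a-z])\d+", "N", s)  # mask IDs/ports/timings, keep /v1-style tokens

def fingerprint(service, message):
    symbol = next((sym for sym in SYMBOLS if sym in message), "unknown")
    m = CALLSITE.search(message)
    callsite = f"{m['file']}#{m['function']}" if m else "-"
    return f"{service}|{symbol}|{callsite}|{stem(message)}"

print(fingerprint("payment-api", "TimeoutError on /v1/users after 512ms"))
# payment-api|TimeoutError|-|timeouterror on /v1/users after Nms
```

<p>Swapping the rules for a small LLM call changes only how <code>symbol</code> and <code>stem</code> are produced, not the shape of the fingerprint. 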
Either way, the fingerprint stays explainable because you keep the pieces next to it in the table.</p><h3><strong>3) Tag severity deterministically</strong></h3><p>Before we embed anything, we separate signal from noise with cheap rules:</p><ul><li><p>Hard failures (level=ERROR/FATAL/CRIT, timeout, connection refused, HTTP 5xx, nginx upstream errors) &#8594; <strong>error</strong></p></li><li><p>Soft indicators (retry, latency, degraded) &#8594; <strong>warn</strong></p></li><li><p>Everything else &#8594; <strong>info</strong></p></li></ul><p>This stops <code>INFO</code> chatter from drowning out real problems and lets you cluster inside severity buckets. You can layer an LLM adjudicator later for ambiguous messages, but the baseline is deterministic and fast.</p><h3><strong>4) Cluster semantically (per severity)</strong></h3><p>Now the semantic bit pays off. You embed enriched text, like:</p><pre><code>[svc:payment-api] [sev:error] TimeoutError on /v1/users after Xms</code></pre><p>Clustering <strong>within severity buckets</strong> keeps <code>WARN</code> chatter from bleeding into <code>ERROR</code>s. Choose <a href="https://docs.fenic.ai/latest/reference/fenic/api/dataframe/?h=k+means#fenic.api.dataframe.SemanticExtensions.with_cluster_labels">K-Means</a> (with a small heuristic for K) or a density approach (HDBSCAN) for larger bins <strong>within</strong> each severity.</p><p>For each cluster, pick the exemplar closest to the centroid; that exemplar often reads like the sentence you&#8217;d write in a postmortem.</p><p>The result is a handful of clusters, each with a count, an exemplar, and a fingerprint, which is precisely what humans need to triage.</p><div class="pullquote"><p><em>Know someone who might need this? 
Share this post with your network and friends!</em></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://neurlcreators.substack.com/p/log-triage-pipeline-with-fenic-langchain?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://neurlcreators.substack.com/p/log-triage-pipeline-with-fenic-langchain?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share</span></a></p></div><h3><strong>5) Publish read-only tools</strong></h3><p>Persist the three useful views (<code>triage</code>, <code>clusters</code>, and <code>assignments</code>) and register <a href="https://docs.fenic.ai/latest/topics/fenic-mcp/">MCP tools with Fenic</a>:</p><ul><li><p><code>list_clusters(severity_floor)</code> &#8594; ranked by severity-weighted score</p></li><li><p><code>clusters_by_severity(severity)</code> &#8594; a single lane</p></li><li><p><code>assignments_for_cluster(cluster_id)</code> &#8594; raw lines and evidence</p></li><li><p><code>coverage_metrics()</code> &#8594; processed vs total lines</p></li></ul><p>These are just <strong>queries over tables</strong>; they return rows, not plain-English (narrative) summaries.</p><p>A <a href="https://www.langchain.com/agents">LangGraph agent</a> sits on top and chooses the right tool based on the question.</p><div class="embedded-post-wrap" data-attrs="{&quot;id&quot;:168006343,&quot;url&quot;:&quot;https://neurlcreators.substack.com/p/langgraph-agent-state-machine-review&quot;,&quot;publication_id&quot;:3228552,&quot;publication_name&quot;:&quot;The Neural Blueprint: Practical Content for AI Builders 
&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!6udc!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff5b15961-5020-4a71-b040-30f9b3d3f232_256x256.png&quot;,&quot;title&quot;:&quot;In-Depth Review of LangGraph: The Agentic State Machine&quot;,&quot;truncated_body_text&quot;:&quot;At the inaugural LangChain Interrupt conference this year, the spotlight wasn&#8217;t on LangChain itself; it was on LangGraph. The event focused heavily on AI agents and showcased how major companies, such as Uber, LinkedIn, and Replit, are deploying them in production.&quot;,&quot;date&quot;:&quot;2025-07-10T20:45:21.074Z&quot;,&quot;like_count&quot;:0,&quot;comment_count&quot;:0,&quot;bylines&quot;:[{&quot;id&quot;:49908626,&quot;name&quot;:&quot;Eteimorde Youdiowei&quot;,&quot;handle&quot;:&quot;eteims&quot;,&quot;previous_name&quot;:null,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/90f6ea8f-0227-42b7-8c09-47e819b7f743_661x661.jpeg&quot;,&quot;bio&quot;:&quot;Simplifying complex ideas is my passion.....&quot;,&quot;profile_set_up_at&quot;:&quot;2025-01-10T07:28:20.374Z&quot;,&quot;reader_installed_at&quot;:&quot;2025-01-10T07:28:07.491Z&quot;,&quot;publicationUsers&quot;:[{&quot;id&quot;:3823561,&quot;user_id&quot;:49908626,&quot;publication_id&quot;:3750355,&quot;role&quot;:&quot;admin&quot;,&quot;public&quot;:true,&quot;is_primary&quot;:true,&quot;publication&quot;:{&quot;id&quot;:3750355,&quot;name&quot;:&quot;Eteimorde Youdiowei&quot;,&quot;subdomain&quot;:&quot;eteims&quot;,&quot;custom_domain&quot;:null,&quot;custom_domain_optional&quot;:false,&quot;hero_text&quot;:&quot;Just messing 
around...&quot;,&quot;logo_url&quot;:null,&quot;author_id&quot;:49908626,&quot;primary_user_id&quot;:49908626,&quot;theme_var_background_pop&quot;:&quot;#FF6719&quot;,&quot;created_at&quot;:&quot;2025-01-13T22:20:31.445Z&quot;,&quot;email_from_name&quot;:null,&quot;copyright&quot;:&quot;Eteimorde Youdiowei&quot;,&quot;founding_plan_name&quot;:null,&quot;community_enabled&quot;:true,&quot;invite_only&quot;:false,&quot;payments_state&quot;:&quot;disabled&quot;,&quot;language&quot;:null,&quot;explicit&quot;:false,&quot;homepage_type&quot;:&quot;profile&quot;,&quot;is_personal_mode&quot;:true}},{&quot;id&quot;:3818981,&quot;user_id&quot;:49908626,&quot;publication_id&quot;:3228552,&quot;role&quot;:&quot;contributor&quot;,&quot;public&quot;:true,&quot;is_primary&quot;:false,&quot;publication&quot;:{&quot;id&quot;:3228552,&quot;name&quot;:&quot;The Neural Blueprint: Practical Content for AI Builders &quot;,&quot;subdomain&quot;:&quot;neurlcreators&quot;,&quot;custom_domain&quot;:null,&quot;custom_domain_optional&quot;:false,&quot;hero_text&quot;:&quot;Blueprints for AI Builders who code, compute, and create at scale &#128640;.\n\nWe&#8217;re AI Engineers obsessed with the art of building practical solutions and crafting immersive, technical stories about that journey through great content.\n\nSounds fun? Subscribe! 
&#129761;&quot;,&quot;logo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f5b15961-5020-4a71-b040-30f9b3d3f232_256x256.png&quot;,&quot;author_id&quot;:280510396,&quot;primary_user_id&quot;:280510396,&quot;theme_var_background_pop&quot;:&quot;#FF6719&quot;,&quot;created_at&quot;:&quot;2024-10-25T15:49:25.618Z&quot;,&quot;email_from_name&quot;:null,&quot;copyright&quot;:&quot;Neurl LLC.&quot;,&quot;founding_plan_name&quot;:null,&quot;community_enabled&quot;:true,&quot;invite_only&quot;:false,&quot;payments_state&quot;:&quot;disabled&quot;,&quot;language&quot;:null,&quot;explicit&quot;:false,&quot;homepage_type&quot;:&quot;newspaper&quot;,&quot;is_personal_mode&quot;:false}}],&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:null,&quot;status&quot;:{&quot;bestsellerTier&quot;:null,&quot;subscriberTier&quot;:null,&quot;leaderboard&quot;:null,&quot;vip&quot;:false,&quot;badge&quot;:null,&quot;paidPublicationIds&quot;:[],&quot;subscriber&quot;:null}},{&quot;id&quot;:280510396,&quot;name&quot;:&quot;Neurl Creators&quot;,&quot;handle&quot;:&quot;neurlcreators&quot;,&quot;previous_name&quot;:&quot;Stephen Oladele&quot;,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/1b0dc913-49dd-4e59-ac39-bbbf289e1744_256x256.png&quot;,&quot;bio&quot;:null,&quot;profile_set_up_at&quot;:&quot;2024-10-25T15:48:45.741Z&quot;,&quot;reader_installed_at&quot;:&quot;2025-03-23T15:26:15.979Z&quot;,&quot;publicationUsers&quot;:[{&quot;id&quot;:3288365,&quot;user_id&quot;:280510396,&quot;publication_id&quot;:3228552,&quot;role&quot;:&quot;admin&quot;,&quot;public&quot;:true,&quot;is_primary&quot;:true,&quot;publication&quot;:{&quot;id&quot;:3228552,&quot;name&quot;:&quot;The Neural Blueprint: Practical Content for AI Builders &quot;,&quot;subdomain&quot;:&quot;neurlcreators&quot;,&quot;custom_domain&quot;:null,&quot;custom_domain_optional&quot;:false,&quot;hero_text&quot;:&quot;Blueprints for AI Builders who code, compute, and 
create at scale &#128640;.\n\nWe&#8217;re AI Engineers obsessed with the art of building practical solutions and crafting immersive, technical stories about that journey through great content.\n\nSounds fun? Subscribe! &#129761;&quot;,&quot;logo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f5b15961-5020-4a71-b040-30f9b3d3f232_256x256.png&quot;,&quot;author_id&quot;:280510396,&quot;primary_user_id&quot;:280510396,&quot;theme_var_background_pop&quot;:&quot;#FF6719&quot;,&quot;created_at&quot;:&quot;2024-10-25T15:49:25.618Z&quot;,&quot;email_from_name&quot;:null,&quot;copyright&quot;:&quot;Neurl LLC.&quot;,&quot;founding_plan_name&quot;:null,&quot;community_enabled&quot;:true,&quot;invite_only&quot;:false,&quot;payments_state&quot;:&quot;disabled&quot;,&quot;language&quot;:null,&quot;explicit&quot;:false,&quot;homepage_type&quot;:&quot;newspaper&quot;,&quot;is_personal_mode&quot;:false}}],&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:null,&quot;status&quot;:{&quot;bestsellerTier&quot;:null,&quot;subscriberTier&quot;:null,&quot;leaderboard&quot;:null,&quot;vip&quot;:false,&quot;badge&quot;:null,&quot;paidPublicationIds&quot;:[],&quot;subscriber&quot;:null}}],&quot;utm_campaign&quot;:null,&quot;belowTheFold&quot;:true,&quot;type&quot;:&quot;newsletter&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="EmbeddedPostToDOM"><a class="embedded-post" native="true" href="https://neurlcreators.substack.com/p/langgraph-agent-state-machine-review?utm_source=substack&amp;utm_campaign=post_embed&amp;utm_medium=web"><div class="embedded-post-header"><img class="embedded-post-publication-logo" src="https://substackcdn.com/image/fetch/$s_!6udc!,w_56,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff5b15961-5020-4a71-b040-30f9b3d3f232_256x256.png" loading="lazy"><span class="embedded-post-publication-name">The Neural Blueprint: Practical Content for AI Builders 
</span></div><div class="embedded-post-title-wrapper"><div class="embedded-post-title">In-Depth Review of LangGraph: The Agentic State Machine</div></div><div class="embedded-post-body">At the inaugural LangChain Interrupt conference this year, the spotlight wasn&#8217;t on LangChain itself; it was on LangGraph. The event focused heavily on AI agents and showcased how major companies, such as Uber, LinkedIn, and Replit, are deploying them in production&#8230;</div><div class="embedded-post-cta-wrapper"><span class="embedded-post-cta">Read more</span></div><div class="embedded-post-meta">10 months ago &#183; Eteimorde Youdiowei and Neurl Creators</div></a></div><h2><strong>Agent-first: the interface humans actually use</strong></h2><p>The trick to a good agent lies mostly in well-chosen tools. With deterministic MCP calls, the agent becomes a thin orchestration layer:</p><ul><li><p><strong>Question &#8594; tool selection</strong> (e.g., &#8220;top&#8221; &#8594; <code>list_clusters</code> with warn)</p></li><li><p><strong>Fetch rows</strong> &#8594; <strong>render clearly</strong> (markdown tables are enough)</p></li><li><p><strong>Drill-down</strong> &#8594; if the human asks, call <code>assignments_for_cluster</code> with the id in the current row</p></li></ul><p>There&#8217;s no hidden magic. If the agent faces an ambiguous question, it can add a short summary of the top three clusters using a small, cheap LLM. But the default behavior is to <strong>show the facts</strong> and let the responder decide.</p><p>This is why the agent replaces the daily digest: it answers the <em>current</em> question, at the <em>moment of curiosity</em>, with <em>just enough</em> detail to act.</p><h2><strong>What &#8220;good&#8221; looks like in production</strong></h2><p>After a week of daily runs, you&#8217;ll notice:</p><ul><li><p><strong>Coverage is stable</strong> (e.g., 80&#8211;90%). 
If it dips, that&#8217;s a crisp signal you need a new template or a parsing tweak.</p></li><li><p><strong>Clusters are consistent</strong> day-to-day. The exemplar for &#8220;connection timeouts to db-prod&#8221; still points at the same root cause. If a cluster explodes in size, you know where to look fast.</p></li><li><p><strong>On-call asks the agent first.</strong> Instead of searching dashboards, responders query: &#8220;top clusters above warn,&#8221; &#8220;only errors,&#8221; &#8220;assignments for #7.&#8221; They get compact tables, not walls of text.</p></li><li><p><strong>Artifacts fit right in.</strong> CSV/JSON slot into dashboards and data lakes; Markdown renders well in tickets and wikis.</p></li></ul><p>The effect isn&#8217;t flashy, but it helps you respond faster: less guesswork, more progress.</p><div class="pullquote"><p>Need creative, high-quality technical content? Happy to chat! Book a call with our Creative Engineers&#128071;&#127998;</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://calendar.app.google/s6ekF2rDbPjKWLLQ8&quot;,&quot;text&quot;:&quot;Let&#8217;s Chat! &#128222;&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://calendar.app.google/s6ekF2rDbPjKWLLQ8"><span>Let&#8217;s Chat! &#128222;</span></a></p></div><h2><strong>Why this design holds up in day-2 reality</strong></h2><ul><li><p><strong>Adoption stays high.</strong> DataFrame ergonomics mean less training and faster pull requests. You can add a template or tweak a rule without changing the whole system.</p></li><li><p><strong>Determinism where you want it.</strong> Severity, fingerprints, and clusters stay stable because the operators are explicit. 
If a cluster changes, you can usually point to the reason (new message variant, new template).</p></li><li><p><strong>Costs don&#8217;t surprise you.</strong> Deduplicate before embedding, embed once per unique fingerprint, and summarize only exemplars if you need narrative text. Your spend tracks <em>distinct issues</em>, not <em>raw volume</em>.</p></li><li><p><strong>It&#8217;s easy to run headless.</strong> The same notebook steps translate into a small script; artifacts are files; tools are queries. Nothing exotic.</p></li><li><p><strong>The agent is an interface, not a black box.</strong> It calls small, auditable tools over your tables. You can inspect the tables, test the tools, and trust the response.</p></li></ul><h2><strong>Adopting this incrementally</strong></h2><p>A common failure mode is trying to &#8220;agentify&#8221; the entire logging stack at once. Don&#8217;t. Here&#8217;s a low-friction rollout that respects your time and budget:</p><ol><li><p><strong>Run the demo</strong> against a small, representative slice of your logs.</p></li><li><p><strong>Check coverage.</strong> If it&#8217;s low, add a template (JSON is a big win if you have it) and normalize service names (merge aliases).</p></li><li><p><strong>Turn on the agent</strong> and use it during on-call. Let people ask natural-language questions and see cluster movement.</p></li><li><p><strong>Stabilize cluster IDs</strong> if needed: enrich the fingerprint prefix or seed K-Means with yesterday&#8217;s centroids so exemplars stay familiar.</p></li><li><p><strong>Only then</strong> consider LLM summaries for the top few clusters if responders want them. Keep it bounded.</p></li><li><p><strong>Integrate light observability.</strong> Track coverage and cluster counts; alert on sudden dips or spikes.</p></li></ol><p>At no point do you need to replace your existing log tools. 
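</p><p>Step 6 in that rollout needs almost nothing: persist yesterday&#8217;s coverage and per-cluster counts, then compare. A minimal sketch (the thresholds and snapshot fields are invented for illustration):</p>

```python
def drift_alerts(today, yesterday, coverage_drop=0.1, spike_factor=2.0):
    """Compare two daily snapshots shaped like {'coverage': float, 'clusters': {id: count}}."""
    alerts = []
    if yesterday["coverage"] - today["coverage"] > coverage_drop:
        alerts.append("coverage dipped: add or fix a parse template")
    for cid, count in today["clusters"].items():
        base = yesterday["clusters"].get(cid, 0)
        if base and count / base >= spike_factor:
            alerts.append(f"cluster {cid} spiked {count / base:.1f}x")
    return alerts

today = {"coverage": 0.65, "clusters": {"7": 40, "12": 5}}
yesterday = {"coverage": 0.85, "clusters": {"7": 10, "12": 5}}
print(drift_alerts(today, yesterday))
# ['coverage dipped: add or fix a parse template', 'cluster 7 spiked 4.0x']
```

<p>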
This pipeline sits next to them and explains what they&#8217;re already collecting.</p><div class="pullquote"><p><em>Join the conversation and share your experiences in the comments below!</em></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://neurlcreators.substack.com/p/log-triage-pipeline-with-fenic-langchain/comments&quot;,&quot;text&quot;:&quot;Leave a comment&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://neurlcreators.substack.com/p/log-triage-pipeline-with-fenic-langchain/comments"><span>Leave a comment</span></a></p></div><h2><strong>Cost, latency, and control (picking an operating point)</strong></h2><ul><li><p><strong>Parsing</strong> is free aside from CPU. Templates are deterministic and cheap.</p></li><li><p><strong>Fingerprints</strong> are near-free if rule-based; if you choose to extract with an LLM, dedupe prompts by unique message text.</p></li><li><p><strong>Embeddings</strong> are the main paid step. Control spending by deduplicating on the fingerprint and batching aggressively. 
You embed <em>distinct issues</em>, not <em>every log line</em>.</p></li><li><p><strong>Clustering</strong> (K-Means or <a href="https://scikit-learn.org/stable/modules/generated/sklearn.cluster.HDBSCAN.html">HDBSCAN</a>) runs locally and scales well.</p></li><li><p><strong>Summaries</strong> are optional and should be limited to <strong>top-K</strong> clusters, where a human will actually read them.</p></li></ul><p>If you want one sentence to carry to your next platform meeting: rules and embeddings, agent-first, and summaries only if people ask for them.</p><h2><strong>Governance and privacy (don&#8217;t skip this)</strong></h2><p>If there&#8217;s any chance your logs include PII/PHI or secrets, redact before any model call:</p><ul><li><p>Mask emails, tokens, phone numbers, and user IDs in the <code>message</code> field.</p></li><li><p>Keep raw logs inside your private boundary; export only fingerprints, counts, exemplars, and cluster IDs externally.</p></li><li><p>If required, route embeddings to a provider in your VPC, or use a private endpoint.</p></li></ul><p>Fenic doesn&#8217;t force you into any specific model provider, which keeps compliance conversations simpler.</p><h2><strong>What success looks like</strong></h2><p>After a few days, you&#8217;ll notice the conversation change. People stop posting raw stack traces and start asking the agent:</p><ul><li><p>&#8220;only error clusters in payments since 9am?&#8221;</p></li><li><p>&#8220;assignments for cluster 12?&#8221;</p></li><li><p>&#8220;coverage today vs yesterday?&#8221;</p></li></ul><p>You&#8217;ll also recognize a few patterns:</p><ul><li><p>Coverage hovers above the threshold you set (say, 70&#8211;80%). When it dips, a new template or a rename fixes it.</p></li><li><p>Top clusters stay stable unless something genuinely changes in production. 
When an <code>ERROR</code> cluster doubles overnight, it&#8217;s a true signal.</p></li><li><p>The agent becomes the default way people ask for status: SREs, support engineers, and even PMs who want a quick read before stand-up.</p></li><li><p>Code changes that would have broken a regex pipeline are non-events. You coalesce a new template and move on.</p></li></ul><p>And that&#8217;s the real win: fewer firefights over tooling and more time hunting actual root causes.</p><h2><strong>Try it, then make it yours</strong></h2><ul><li><p><strong><a href="https://colab.research.google.com/github/typedef-ai/fenic-examples/blob/main/oncall_triage_agent/fenic_oncall_triage_agent.ipynb">Open the Colab</a></strong> and run the demo with sample logs, or drop in a small batch of your own.</p></li><li><p><strong>Adopt the minimal pipeline</strong> (parse &#8594; fingerprint &#8594; severity &#8594; cluster) and expose the <strong>tools</strong>.</p></li><li><p><strong>Use the agent</strong> during on-call. Adjust templates and fingerprints as reality evolves.</p></li><li><p><strong>Iterate</strong>: If responders ask for more context, add domain columns (endpoint, table, region) into fingerprints and enriched text. If they want prose, summarize exemplars only.</p></li></ul><h3><strong>Grab the Essentials</strong></h3><ul><li><p><strong><a href="https://docs.fenic.ai/latest/">Docs</a>:</strong> Fenic operators, semantic config, and MCP tooling.</p></li><li><p><strong><a href="https://fenic.ai/">GitHub</a>:</strong> clone, star, and open issues/PRs.</p></li></ul><p><a href="https://github.com/typedef-ai/fenic/tree/main/examples">More examples and ready-to-run scripts</a>.</p><div class="pullquote"><p>Subscribe to <em><strong><a href="https://neurlcreators.substack.com/">The Neural Blueprint</a></strong></em>&#128071;&#127998; for hands-on guides! 
&#129761; Follow us on <em><a href="http://www.youtube.com/@neurlcreators">YouTube</a>,</em> <em><a href="https://x.com/NeurlCreators">X</a>, and</em> <em><a href="https://www.linkedin.com/showcase/neurl-creators/">LinkedIn</a>.</em></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://neurlcreators.substack.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://neurlcreators.substack.com/subscribe?"><span>Subscribe now</span></a></p></div><p></p>]]></content:encoded></item><item><title><![CDATA[Unstructured.io: LLM-Ready Document ETL Without Rewriting Your Stack]]></title><description><![CDATA[RAG lives or dies by input quality. Unstructured.io turns messy PDFs, slides, emails, and HTML into clean, chunk-ready JSON. It&#8217;s complete with OCR, table structure, and connectors to your favorite data sources. Is it the right pre-processing backbone for your LLM stack?]]></description><link>https://neurlcreators.substack.com/p/tuesday-tool-review-16-llm-ready</link><guid isPermaLink="false">https://neurlcreators.substack.com/p/tuesday-tool-review-16-llm-ready</guid><dc:creator><![CDATA[Stephen FIYINFOLUWA Oladele]]></dc:creator><pubDate>Fri, 10 Oct 2025 16:20:36 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/1de51ede-5b0b-49c4-ba99-f1aa4a278f71_1456x1048.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Since its first commit in 2022, <strong><a href="https://docs.unstructured.io/welcome">Unstructured</a></strong> has grown from a PDF splitter into a modular ETL stack for unstructured content:</p><ol><li><p><a href="https://github.com/Unstructured-IO/unstructured-api">OSS extraction library</a> (<code>unstructured</code>)</p></li><li><p><a href="https://github.com/Unstructured-IO/unstructured-inference">ML add-ons</a> 
(<code>unstructured-inference</code>)</p></li><li><p><a href="https://github.com/Unstructured-IO/unstructured-ingest">Scalable pipelines</a> (<code>unstructured-ingest</code>)</p></li><li><p><a href="https://docs.unstructured.io/api-reference/overview">Serverless/hosted</a> Partition and Chunking API</p></li></ol><p>That mix is why it shows up in so many production RAG stacks today. Version 0.18 series (summer 2025)<strong> </strong>introduced faster PDF partitioning and a revamped chunking API for RAG pipelines.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!R4IO!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F80722719-9843-4ce1-ad17-584ec611403a_1408x1600.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!R4IO!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F80722719-9843-4ce1-ad17-584ec611403a_1408x1600.png 424w, https://substackcdn.com/image/fetch/$s_!R4IO!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F80722719-9843-4ce1-ad17-584ec611403a_1408x1600.png 848w, https://substackcdn.com/image/fetch/$s_!R4IO!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F80722719-9843-4ce1-ad17-584ec611403a_1408x1600.png 1272w, https://substackcdn.com/image/fetch/$s_!R4IO!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F80722719-9843-4ce1-ad17-584ec611403a_1408x1600.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!R4IO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F80722719-9843-4ce1-ad17-584ec611403a_1408x1600.png" width="1408" height="1600" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/80722719-9843-4ce1-ad17-584ec611403a_1408x1600.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1600,&quot;width&quot;:1408,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!R4IO!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F80722719-9843-4ce1-ad17-584ec611403a_1408x1600.png 424w, https://substackcdn.com/image/fetch/$s_!R4IO!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F80722719-9843-4ce1-ad17-584ec611403a_1408x1600.png 848w, https://substackcdn.com/image/fetch/$s_!R4IO!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F80722719-9843-4ce1-ad17-584ec611403a_1408x1600.png 1272w, https://substackcdn.com/image/fetch/$s_!R4IO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F80722719-9843-4ce1-ad17-584ec611403a_1408x1600.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset 
pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div><hr></div><h2><strong>&#128226; What Is Unstructured?</strong></h2><p>An <strong>open-source ETL framework</strong> that:</p><ol><li><p><strong>Loads</strong> documents from <a href="https://docs.unstructured.io/open-source/ingestion/source-connectors/overview">40+ sources</a> (filesystem, S3, Gmail, Jira, &#8230;).</p></li><li><p><strong>Partitions</strong> each file into logical elements (Title, NarrativeText, Table, ListItem, PageBreak).</p></li><li><p><strong>Enriches</strong> with layout ML (table boundaries, header detection, image OCR).</p></li><li><p><strong>Spits out</strong> JSON, Markdown, HTML, or Arrow, with coordinates, page numbers, languages, SHA hashes.</p></li></ol><p>The same API powers:</p><ul><li><p>&#128013; <strong>Python SDK</strong> 
(<code>from unstructured.partition.pdf import partition_pdf(...)</code>)</p></li><li><p>&#128736;&#65039; <strong>CLI</strong> (<code>unstructured-ingest --strategy hi_res --recursive s3://bucket</code> &#8230;)</p></li><li><p>&#9729;&#65039; <strong>Cloud SaaS/Docker micro-service</strong> (<code>POST /general/v0/general</code>)</p></li></ul><h2><strong>&#9881;&#65039; How the Pieces Fit</strong></h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!bUVC!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc08998f1-9345-4fb2-905d-cb82d5f3cc98_1600x1268.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!bUVC!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc08998f1-9345-4fb2-905d-cb82d5f3cc98_1600x1268.png 424w, https://substackcdn.com/image/fetch/$s_!bUVC!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc08998f1-9345-4fb2-905d-cb82d5f3cc98_1600x1268.png 848w, https://substackcdn.com/image/fetch/$s_!bUVC!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc08998f1-9345-4fb2-905d-cb82d5f3cc98_1600x1268.png 1272w, https://substackcdn.com/image/fetch/$s_!bUVC!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc08998f1-9345-4fb2-905d-cb82d5f3cc98_1600x1268.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!bUVC!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc08998f1-9345-4fb2-905d-cb82d5f3cc98_1600x1268.png" 
width="1456" height="1154" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c08998f1-9345-4fb2-905d-cb82d5f3cc98_1600x1268.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1154,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;&#9881;&#65039; How the Pieces Fit&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="&#9881;&#65039; How the Pieces Fit" title="&#9881;&#65039; How the Pieces Fit" srcset="https://substackcdn.com/image/fetch/$s_!bUVC!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc08998f1-9345-4fb2-905d-cb82d5f3cc98_1600x1268.png 424w, https://substackcdn.com/image/fetch/$s_!bUVC!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc08998f1-9345-4fb2-905d-cb82d5f3cc98_1600x1268.png 848w, https://substackcdn.com/image/fetch/$s_!bUVC!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc08998f1-9345-4fb2-905d-cb82d5f3cc98_1600x1268.png 1272w, https://substackcdn.com/image/fetch/$s_!bUVC!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc08998f1-9345-4fb2-905d-cb82d5f3cc98_1600x1268.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" 
stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption"><strong>&#9881;&#65039; How the Pieces Fit</strong></figcaption></figure></div><ul><li><p><strong>FAST</strong> strategy = pdfminer + heuristic chunking (seconds per doc).</p></li><li><p><strong>HI-RES</strong> (<code>--strategy hi_res</code>) calls the <strong>Chipper</strong> model for page-layout accuracy; slower, GPU optional <a href="https://github.com/Unstructured-IO/unstructured-api">GitHub</a>.</p></li></ul><h2><strong>&#9881;&#65039; Key Features of Unstructured</strong></h2><p>Here are all the core components of Unstructured:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!sGbT!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F82c71b17-08b2-439c-b1dd-2b9a54991ad9_1600x1540.png" 
data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!sGbT!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F82c71b17-08b2-439c-b1dd-2b9a54991ad9_1600x1540.png 424w, https://substackcdn.com/image/fetch/$s_!sGbT!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F82c71b17-08b2-439c-b1dd-2b9a54991ad9_1600x1540.png 848w, https://substackcdn.com/image/fetch/$s_!sGbT!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F82c71b17-08b2-439c-b1dd-2b9a54991ad9_1600x1540.png 1272w, https://substackcdn.com/image/fetch/$s_!sGbT!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F82c71b17-08b2-439c-b1dd-2b9a54991ad9_1600x1540.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!sGbT!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F82c71b17-08b2-439c-b1dd-2b9a54991ad9_1600x1540.png" width="1456" height="1401" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/82c71b17-08b2-439c-b1dd-2b9a54991ad9_1600x1540.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1401,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;&#9881;&#65039; Key Features of Unstructured&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="&#9881;&#65039; Key Features 
of Unstructured" title="&#9881;&#65039; Key Features of Unstructured" srcset="https://substackcdn.com/image/fetch/$s_!sGbT!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F82c71b17-08b2-439c-b1dd-2b9a54991ad9_1600x1540.png 424w, https://substackcdn.com/image/fetch/$s_!sGbT!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F82c71b17-08b2-439c-b1dd-2b9a54991ad9_1600x1540.png 848w, https://substackcdn.com/image/fetch/$s_!sGbT!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F82c71b17-08b2-439c-b1dd-2b9a54991ad9_1600x1540.png 1272w, https://substackcdn.com/image/fetch/$s_!sGbT!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F82c71b17-08b2-439c-b1dd-2b9a54991ad9_1600x1540.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" 
viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption"><strong>&#9881;&#65039; Key Features of Unstructured</strong></figcaption></figure></div><h2><strong>&#128736;&#65039; How Unstructured Works (Under the Hood)</strong></h2><p><strong>Partition &#8594; Elements &#8594; Chunk &#8594; Embed<br></strong>You call a <code>partition_*</code> function (or the API). It returns a list of typed <strong>elements</strong> (e.g., <code>Title, NarrativeText, Table</code>) with metadata (page, coords, <code>text_as_html</code> for tables, etc.). You then <strong>chunk</strong> those elements (e.g., &#8220;by_title&#8221;) and embed the chunks.</p><p><strong>Extraction strategies</strong></p><ul><li><p><code>fast</code>: fastest, fewer heavy dependencies; good for bulk/simple docs.</p></li><li><p><code>hi_res</code>: uses <code>unstructured-inference</code> layout models (Detectron2) for higher fidelity on PDFs/slides; better table/figure handling (needs OS deps).</p></li><li><p><code>vlm</code>: uses a vision-language model for image-heavy docs or screenshots.</p></li></ul><p><strong>Tables and images</strong></p><ul><li><p>Set <code>skip_infer_table_types</code> to False to preserve tables (HTML in <code>metadata.text_as_html</code>).</p></li><li><p>There&#8217;s a guided &#8220;Extract tables as HTML&#8221; and &#8220;Extract images and tables&#8221; <a href="https://docs.unstructured.io/examplecode/codesamples/apioss/table-extraction-from-pdf">how-to in the docs</a>.</p></li></ul><p><strong>Chunking</strong></p><ul><li><p>Core chunkers combine neighboring elements with size limits; <code>by_title</code> respects 
section boundaries (and never mixes a <code>Table</code> with other text). The API adds extra strategies and parameters (<code>combine_under_n_chars, multipage_sections, by_similarity</code>).</p></li></ul><h2><strong>&#128640; Quick Spin-Up (three ways)</strong></h2><h3><strong>A) One-line API (Docker or hosted)</strong></h3><pre><code># Local API (example)
docker pull downloads.unstructured.io/unstructured-io/unstructured-api:latest

# Run locally
docker run -p 8000:8000 -d --rm --name unstructured-api \
  downloads.unstructured.io/unstructured-io/unstructured-api:latest
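# optional sanity check before posting files (assumes the image's documented
# health route; adjust if your API version differs):
# curl http://localhost:8000/healthcheck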

# Send a PDF &#8594; get elements + chunks, hi-res + tables HTML preserved
curl -X POST "http://localhost:8000/general/v0/general" \
  -H "Content-Type: multipart/form-data" \
  -F "files=@sample.pdf" \
  -F "strategy=hi_res" \
  -F "chunking_strategy=by_title" \
  -F "combine_under_n_chars=500" \
  -F "multipage_sections=true" \
  -F "include_orig_elements=true"</code></pre><p><em>Why this way?</em> Fewer OS deps; the newest chunking features are first-class in the API.</p><h3><strong>B) Local Python (OSS): digital PDF, keep it fast</strong></h3><pre><code># pip install "unstructured[pdf]"
from unstructured.partition.pdf import partition_pdf

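# Why "fast"? It relies on pdfminer text extraction plus heuristics (no layout
# model), so it typically runs in seconds per document on digital PDFs.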
elements = partition_pdf(
    filename="report.pdf",
    strategy="fast",            # try fast first; faster on simple PDFs
    extract_images=False
)
# elements is a list of typed blocks: Title, NarrativeText, ListItem, Table, ...</code></pre><p>Unstructured&#8217;s <a href="https://docs.unstructured.io/api-reference/partition/speed-up-large-files-batches">docs recommend</a> trying <code>fast</code> on simple PDFs before reaching for <code>hi_res</code> to reduce processing time.</p><h3><strong>C) Local Python (scanned or mixed layout PDF, include tables)</strong></h3><pre><code># pip install &#8220;unstructured[pdf]&#8221;  # brings layout/ocr deps
from unstructured.partition.pdf import partition_pdf

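# Note: hi_res loads layout-inference models and requires OS packages
# (poppler, tesseract, libmagic); expect much longer runtimes than "fast".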
elements = partition_pdf(
    filename="financials_scanned.pdf",
    strategy="hi_res",                # layout models (Detectron2/YOLOX)
    infer_table_structure=True,       # include HTML table structure
    extract_images=True
)</code></pre><p>(Note: <code>hi_res</code> needs OS deps like poppler, tesseract, libmagic, libreoffice per README.)</p><h2><strong>&#9881;&#65039; Connectors for Real Pipelines</strong></h2><p>Need to pull from SaaS or clouds and push to a lake/vector DB? <strong>Unstructured Ingest</strong> ships <a href="https://docs.unstructured.io/open-source/ingestion/source-connectors/overview">many source</a> and <a href="https://docs.unstructured.io/open-source/ingestion/destination-connectors/overview">destination connectors</a> (S3/GCS/Azure, Confluence, Notion, Salesforce, Airtable, Slack, SharePoint, etc.), and can call the <strong>API</strong> directly (<code>--partition-by-api</code>).</p><p><strong>Example (Airtable &#8594; API &#8594; JSON):</strong></p><pre><code># deps
pip install "unstructured-ingest[airtable]"

# run local API in another shell if you want to avoid hosted keys:
# docker run -p 8000:8000 -d --rm --name unstructured-api \
#   downloads.unstructured.io/unstructured-io/unstructured-api:latest

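# assumes AIRTABLE_PAT is exported with an Airtable personal access token:
# export AIRTABLE_PAT="<your-token>"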
unstructured-ingest \
  airtable \
  --personal-access-token "$AIRTABLE_PAT" \
  --list-of-paths "appr9nKeXLAtg6bgn/tblZ8uT1GY7NLbWit" \
  --output-dir ./local-out \
  --num-processes 2 \
  --reprocess \
  --partition-by-api \
  --partition-endpoint "http://localhost:8000/general/v0/general"</code></pre><h2><strong>&#128293; Why MLOps Engineers Should Care</strong></h2><ol><li><p><strong>Better recall with fewer hallucinations</strong>: <a href="https://docs.unstructured.io/open-source/core-functionality/chunking">Element-aware chunking</a> maintains the section semantics and table structure, resulting in retrieval units that are cleaner than those produced by blind text splits.</p></li><li><p><strong>Switchable fidelity</strong>: <code>fast</code> for throughput; <code>hi_res</code> when layout matters (scientific PDFs, slide decks).</p></li><li><p><strong>Production dataplane</strong>: Use <a href="https://docs.unstructured.io/open-source/ingestion/overview">the API</a> to avoid wrestling with OS deps and to access advanced chunking controls.</p></li><li><p><strong>End-to-end ingestion</strong>: <a href="https://docs.unstructured.io/open-source/ingestion/source-connectors/overview">Connectors</a> + chunking + metadata pave a straight path to embeddings and vector stores.</p></li><li><p><strong>API extras.</strong> The hosted API exposes <a href="https://docs.unstructured.io/api-reference/api-services/chunking">chunking modes</a> and strategies beyond OSS defaults (e.g., <code>vlm</code>).</p></li></ol><h2><strong>&#128201; Gotchas and Caveats (Read This Before You Roll Out)</strong></h2><ol><li><p><code>hi_res</code><strong> is heavier.</strong> Layout inference improves accuracy on tables/figures but increases runtime and system deps; try <code>fast</code> first on digital PDFs.</p></li><li><p><strong>Tables aren&#8217;t &#8220;just there.&#8221;</strong> <a href="https://docs.unstructured.io/examplecode/codesamples/apioss/table-extraction-from-pdf">On PDFs</a> you&#8217;ll want <code>strategy="hi_res"</code> and <code>skip_infer_table_types=False</code> to populate <code>text_as_html</code> on <code>Table</code> elements. (The older <code>pdf_infer_table_structure</code> is deprecated.)</p></li><li><p><strong>API-only capabilities exist.</strong> Some strategies (e.g., <code>vlm</code>) and chunking options are <a href="https://docs.unstructured.io/api-reference/partition/partitioning">exposed in the </a><strong><a href="https://docs.unstructured.io/api-reference/partition/partitioning">Serverless API</a></strong> rather than the OSS path.</p></li><li><p><strong>OCR languages and agents.</strong> For non-English scans, pass <code>languages=[...]</code> (<a href="https://docs.unstructured.io/api-reference/general/summary">API</a>) or set the OCR agent in OSS; you&#8217;ll need language packs.</p></li><li><p><strong>Large-file performance.</strong> Unstructured&#8217;s docs <a href="https://docs.unstructured.io/open-source/how-to/speed-up-large-files-batches">include concrete tuning tips</a> for big PDFs/batches; read them before you queue a 2k-page corpus.</p></li></ol><h2><strong>&#129514; Sanity-Check Scenarios</strong></h2><ul><li><p><strong>RAG for financial filings (scanned + tables). </strong>Use <code>strategy="hi_res"</code> with <code>skip_infer_table_types=False</code>; chunk <strong>by title/semantics</strong>; consider API if you want <code>vlm</code> for image-heavy pages.</p></li><li><p><strong>Policy docs/handbooks (digital PDFs).</strong> Run <code>fast</code>, chunk by titles/sections. Faster, fewer deps, usually enough fidelity.</p></li><li><p><strong>SharePoint/Confluence crawl. 
</strong>Use <a href="https://docs.unstructured.io/open-source/ingestion/ingest-cli">unstructured-ingest</a>; start local, then flip <code>--partition-by-api</code> for scale without changing code.</p></li></ul><h2><strong>&#128483;&#65039; Community Pulse</strong></h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Y0-t!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcce7dda6-5a81-45db-aaa3-2bfd8ede6ef1_1478x448.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Y0-t!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcce7dda6-5a81-45db-aaa3-2bfd8ede6ef1_1478x448.png 424w, https://substackcdn.com/image/fetch/$s_!Y0-t!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcce7dda6-5a81-45db-aaa3-2bfd8ede6ef1_1478x448.png 848w, https://substackcdn.com/image/fetch/$s_!Y0-t!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcce7dda6-5a81-45db-aaa3-2bfd8ede6ef1_1478x448.png 1272w, https://substackcdn.com/image/fetch/$s_!Y0-t!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcce7dda6-5a81-45db-aaa3-2bfd8ede6ef1_1478x448.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Y0-t!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcce7dda6-5a81-45db-aaa3-2bfd8ede6ef1_1478x448.png" width="1456" height="441" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/cce7dda6-5a81-45db-aaa3-2bfd8ede6ef1_1478x448.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:441,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Y0-t!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcce7dda6-5a81-45db-aaa3-2bfd8ede6ef1_1478x448.png 424w, https://substackcdn.com/image/fetch/$s_!Y0-t!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcce7dda6-5a81-45db-aaa3-2bfd8ede6ef1_1478x448.png 848w, https://substackcdn.com/image/fetch/$s_!Y0-t!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcce7dda6-5a81-45db-aaa3-2bfd8ede6ef1_1478x448.png 1272w, https://substackcdn.com/image/fetch/$s_!Y0-t!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcce7dda6-5a81-45db-aaa3-2bfd8ede6ef1_1478x448.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path 
d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p><strong><a href="https://mlops-community.slack.com/archives/C015J2Y9RLM/p1734196244184519?thread_ts=1734152489.432569&amp;cid=C015J2Y9RLM">Thread in MLOps.Community Slack</a>.</strong></p><h2><strong>&#128161; Real-World Use Case: Annual-Report RAG Pipeline</strong></h2><ol><li><p><strong>Ingest</strong> 10k annual reports (PDF) from S3 nightly.</p></li><li><p><strong>Hi-Res</strong> partition with table inference &#8594; JSON &#8594; DuckDB.</p></li><li><p><strong>Chunk and embed</strong> with an OpenAI embedding model (e.g., text-embedding-3-small).</p></li><li><p><strong>Query</strong> via a LangChain retriever in a chatbot.</p></li><li><p><strong>Outcome</strong>: 40% reduction in hallucinated KPIs vs. 
plain pdfminer pipeline.</p></li></ol><h1><strong>&#128202; How Unstructured Stacks Up (2025)</strong></h1><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!A4_D!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49f5125f-9e59-410f-b026-8424fe5e5528_1463x1600.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!A4_D!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49f5125f-9e59-410f-b026-8424fe5e5528_1463x1600.png 424w, https://substackcdn.com/image/fetch/$s_!A4_D!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49f5125f-9e59-410f-b026-8424fe5e5528_1463x1600.png 848w, https://substackcdn.com/image/fetch/$s_!A4_D!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49f5125f-9e59-410f-b026-8424fe5e5528_1463x1600.png 1272w, https://substackcdn.com/image/fetch/$s_!A4_D!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49f5125f-9e59-410f-b026-8424fe5e5528_1463x1600.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!A4_D!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49f5125f-9e59-410f-b026-8424fe5e5528_1463x1600.png" width="1456" height="1592" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/49f5125f-9e59-410f-b026-8424fe5e5528_1463x1600.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1592,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;&#128202; How Unstructured Stacks Up (2025)&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="&#128202; How Unstructured Stacks Up (2025)" title="&#128202; How Unstructured Stacks Up (2025)" srcset="https://substackcdn.com/image/fetch/$s_!A4_D!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49f5125f-9e59-410f-b026-8424fe5e5528_1463x1600.png 424w, https://substackcdn.com/image/fetch/$s_!A4_D!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49f5125f-9e59-410f-b026-8424fe5e5528_1463x1600.png 848w, https://substackcdn.com/image/fetch/$s_!A4_D!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49f5125f-9e59-410f-b026-8424fe5e5528_1463x1600.png 1272w, https://substackcdn.com/image/fetch/$s_!A4_D!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49f5125f-9e59-410f-b026-8424fe5e5528_1463x1600.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" 
stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">How Unstructured Stacks Up (2025)</figcaption></figure></div><p><em>&#10145;&#65039; Third-party rows are directional; choose based on vendor posture, compliance, and TCO.</em></p><div><hr></div><h2><strong>&#129489;&#8205;&#9878;&#65039; Final Verdict: 4.4/5 &#8220;Pragmatic, Proven, and Worth the Slot in Your Stack&#8221;</strong></h2><h3><strong>Rating: &#11088;&#11088;&#11088;&#11088;&#9734; (4.4/5)</strong></h3><p><strong>&#9989; Ship it if&#8230;</strong></p><ul><li><p>Your corpus spans <strong>scans + digital PDFs + HTML/Office</strong>, and you need <strong>typed elements</strong> (esp. 
tables, images).</p></li><li><p>You want one ETL that can run locally now and API later (same JSON shape).</p></li><li><p>You need connectors to move content at scale without writing crawlers.</p></li></ul><p><strong>&#10060; Hold off if&#8230;</strong></p><ul><li><p>Your docs are all digital PDFs with simple text, and a <strong>basic text extractor</strong> suffices.</p></li><li><p>You can&#8217;t accept heavier deps/runtime for hi_res, and no API usage is allowed.</p></li><li><p>You already standardized on a single cloud OCR like <a href="https://developers.llamaindex.ai/python/framework/llama_cloud/llama_parse/">LlamaParse</a>/<a href="https://cloud.llamaindex.ai/">Llama Cloud</a> that meets all needs and don&#8217;t need OSS flexibility.</p></li></ul><p><strong>Bottom line:</strong> For teams serious about production RAG/agents over messy docs, Unstructured gives you a <strong>cohesive OSS toolbox</strong> (from one-liners to a full ETL) with the right escape hatches for fidelity and scale.</p><div class="pullquote"><p>Need creative, high-quality technical content? Happy to chat! Book a call with our Creative Engineers&#128071;&#127998;</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://calendar.app.google/s6ekF2rDbPjKWLLQ8&quot;,&quot;text&quot;:&quot;Let's Chat! &#128222;&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://calendar.app.google/s6ekF2rDbPjKWLLQ8"><span>Let's Chat! 
&#128222;</span></a></p></div><h2><strong>&#128204; Resources to Learn More</strong></h2><ul><li><p>Docs &amp; Quick-starts: <a href="https://docs.unstructured.io/welcome">https://docs.unstructured.io/welcome</a></p></li><li><p>GitHub core library: <a href="https://github.com/Unstructured-IO/unstructured">https://github.com/Unstructured-IO/unstructured</a></p></li><li><p>Ingest connectors: <a href="https://github.com/Unstructured-IO/unstructured-ingest">https://github.com/Unstructured-IO/unstructured-ingest</a></p></li><li><p>Chunking best practices blog: <a href="https://unstructured.io/blog/chunking-for-rag-best-practices">https://unstructured.io/blog/chunking-for-rag-best-practices</a></p></li><li><p>Example notebooks: <a href="https://github.com/Unstructured-IO/notebooks">https://github.com/Unstructured-IO/notebooks</a></p></li></ul><div><hr></div><p>Love this review? Drop your war stories (and weird PDFs) in the comments, or share on X with <strong>#TuesdayToolReview</strong> and tag <span class="mention-wrap" data-attrs="{&quot;name&quot;:&quot;MLOps Community&quot;,&quot;id&quot;:179676708,&quot;type&quot;:&quot;user&quot;,&quot;url&quot;:null,&quot;photo_url&quot;:&quot;https://substackcdn.com/image/fetch/f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbc8b88be-5eab-4a2c-a6a5-430502ebabef_184x184.png&quot;,&quot;uuid&quot;:&quot;c737da97-fc54-491b-a8b2-b682cd2ab4bc&quot;}" data-component-name="MentionToDOM"></span> </p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://neurlcreators.substack.com/p/tuesday-tool-review-16-llm-ready/comments&quot;,&quot;text&quot;:&quot;Leave a comment&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://neurlcreators.substack.com/p/tuesday-tool-review-16-llm-ready/comments"><span>Leave a comment</span></a></p><p>If you want a deeper walkthrough (like parsing, 
chunking strategies, and table-aware retrieval), subscribe to <em><strong><a href="https://neurlcreators.substack.com/">The Neural Blueprint</a></strong></em> &#128071;&#127998; for hands-on guides! &#129761;  Follow us on <em><a href="http://www.youtube.com/@neurlcreators">YouTube</a>,<a href="https://x.com/NeurlCreators"> X</a>, and <a href="https://www.linkedin.com/showcase/neurl-creators/">LinkedIn</a>.</em></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://neurlcreators.substack.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://neurlcreators.substack.com/subscribe?"><span>Subscribe now</span></a></p><p></p>]]></content:encoded></item><item><title><![CDATA[Inspect AI is UK AISI’s Open-Source Framework for Serious LLM Evaluations]]></title><description><![CDATA[From sandboxed agent safety tests to model-graded accuracy benchmarks, Inspect AI is becoming an OSS gold standard for evaluating frontier models. Should it power your next red-team sweep? 
See our no-BS review.]]></description><link>https://neurlcreators.substack.com/p/inspect-ai-evaluation-framework-review</link><guid isPermaLink="false">https://neurlcreators.substack.com/p/inspect-ai-evaluation-framework-review</guid><dc:creator><![CDATA[Stephen FIYINFOLUWA Oladele]]></dc:creator><pubDate>Wed, 17 Sep 2025 10:05:53 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/229a5094-9cd2-4ef1-b19c-1739a65dd663_1456x1048.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Inspect AI is an open-source Python framework from the UK AI Security Institute (AISI) for building and running reproducible LLM evaluations.</p><p>It ships opinionated primitives (dataset &#8594; Task &#8594; Solver &#8594; Scorer), multi-turn/agent workflows with tools, sandboxed execution (Docker built-in, optional Kubernetes/Proxmox adapters), a VS Code log viewer, and a web-based Inspect View.</p><p>Install is a one-liner: <code>pip install inspect-ai</code> (<code>inspect-ai 0.3.130</code> as of Sep 7, 2025; MIT-licensed; Python &#8805; 3.10).</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!R_BJ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa91368cb-7a33-4bdd-847f-a31f6a11cb81_1408x1600.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!R_BJ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa91368cb-7a33-4bdd-847f-a31f6a11cb81_1408x1600.png 424w, https://substackcdn.com/image/fetch/$s_!R_BJ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa91368cb-7a33-4bdd-847f-a31f6a11cb81_1408x1600.png 848w, 
https://substackcdn.com/image/fetch/$s_!R_BJ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa91368cb-7a33-4bdd-847f-a31f6a11cb81_1408x1600.png 1272w, https://substackcdn.com/image/fetch/$s_!R_BJ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa91368cb-7a33-4bdd-847f-a31f6a11cb81_1408x1600.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!R_BJ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa91368cb-7a33-4bdd-847f-a31f6a11cb81_1408x1600.png" width="1408" height="1600" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a91368cb-7a33-4bdd-847f-a31f6a11cb81_1408x1600.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1600,&quot;width&quot;:1408,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!R_BJ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa91368cb-7a33-4bdd-847f-a31f6a11cb81_1408x1600.png 424w, https://substackcdn.com/image/fetch/$s_!R_BJ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa91368cb-7a33-4bdd-847f-a31f6a11cb81_1408x1600.png 848w, 
https://substackcdn.com/image/fetch/$s_!R_BJ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa91368cb-7a33-4bdd-847f-a31f6a11cb81_1408x1600.png 1272w, https://substackcdn.com/image/fetch/$s_!R_BJ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa91368cb-7a33-4bdd-847f-a31f6a11cb81_1408x1600.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div><hr></div><h2><strong>&#128226; What Is Inspect AI (in 2025)?</strong></h2><p>At its core, Inspect is a declarative 
framework for evaluating AI systems via composable &#8220;tasks.&#8221; A task might check:</p><ul><li><p>whether a model refuses dangerous requests,</p></li><li><p>whether an agent leaks system instructions,</p></li><li><p>or whether an AI-to-AI conversation converges on a correct answer.</p></li></ul><p>Inspect handles:</p><p>&#9989; Prompt construction</p><p>&#9989; Response evaluation</p><p>&#9989; Score aggregation</p><p>&#9989; Logging (JSONL, Postgres, Dash UI)</p><p>&#9989; Isolation (via K8s or VMs)</p><p><strong>Runs are explicit and inspectable.</strong> Everything is typed, reproducible, and introspectable, which is a must for frontier-model auditing.</p><p><strong>Backed by:</strong><a href="https://www.aisi.gov.uk/"> UK AI Security Institute</a> (the team driving UK government safety standards for frontier AI).</p><p><strong>Active user base:</strong> Adopted by METR, Apollo Research, other government AISIs, and major safety labs.</p><h2><strong>&#9881;&#65039; Key Features of Inspect AI</strong></h2><p>Here are the core components the framework ships with:</p><ul><li><p><strong>Tasks/Datasets/Solvers/Scorers:</strong> Tasks bring together datasets (load data), solvers that elicit behavior (single/multi-step), and scorers (score outputs).</p></li><li><p><strong>Models layer: </strong><a href="https://inspect.aisi.org.uk/providers.html">One interface</a> over OpenAI, Anthropic, Google, Groq, Mistral, xAI (Grok), AWS Bedrock/AI Inference, Azure AI, Together, Cloudflare, Goodfire, plus local vLLM/Ollama/llama-cpp.</p></li><li><p><strong>Agents:</strong> Built-in ReAct, <a href="https://inspect.aisi.org.uk/agents.html">multi-agent composition</a>, external agent <strong>bridge</strong> (e.g., AutoGen/LangChain).</p></li><li><p><strong>Tools:</strong> Built-ins (bash, python, text-edit, web_search, web_browser, computer) + MCP/<a href="https://inspect.aisi.org.uk/tools-standard.html">Custom tools</a>.</p></li><li><p><strong>Sandboxing and safety:</strong> 
Run untrusted code and browsers in <a href="https://inspect.aisi.org.uk/sandboxing.html#sec-docker-configuration">Docker</a>, <a href="https://k8s-sandbox.aisi.org.uk/">K8s</a> (pods per sample), or <a href="https://github.com/UKGovernmentBEIS/inspect_proxmox_sandbox">Proxmox</a> with <a href="https://inspect.aisi.org.uk/sandboxing.html">sandbox extensions</a> and optional domain/network controls; Tool Approval (human-in-the-loop or policy-based gating).</p></li><li><p><strong>Dev Experience:</strong> <strong><a href="https://marketplace.visualstudio.com/items?itemName=ukaisi.inspect-ai">Inspect View</a></strong> (web log viewer for basic observability) and VS Code extension for run/debug/tune.</p></li><li><p><strong>Scale Knobs:</strong> Caching, batch mode, parallelism, eval-set slicing, retry/resume.</p></li><li><p><strong>Evals Registry: </strong><a href="https://inspect.aisi.org.uk/eval-sets.html">Dozens of canonical evals</a> (reasoning, coding, agent tasks, and cybersecurity).</p></li></ul><p><strong>Bonus:</strong> AISI publishes add-on packages like <code>inspect-cyber</code> for agentic cyber evals (see <a href="https://pypi.org/project/inspect-cyber/">PyPI</a>).</p><h2><strong>&#128736;&#65039; How Inspect AI Works (Under the Hood)</strong></h2><p><strong>1. Define Tasks</strong>: JSON/CSV/HF datasets feed <code>Task</code> objects; mix single-turn Q&amp;A, coding, multi-modal, or open-tool agent prompts.</p><p><strong>2. Elicit Solutions</strong>: <code>inspect_ai.solver</code> provides chain-of-thought, self-critique, tool-use, and MCP calls.</p><p><strong>3. Execute Securely</strong>: Choose <code>sandbox</code> back-ends: process jail, Docker, or the community <a href="https://github.com/UKGovernmentBEIS/inspect_k8s_sandbox">K8s sandbox</a> for scale and isolation.</p><p><strong>4. Score and Aggregate</strong>: Out-of-the-box model-graded QA, F1, pass@k, statistical bootstrap, plus custom metrics.</p><p><strong>5. 
Analyze</strong>: Transform logs into <code>evals_df, samples_df, events_df</code> for Pandas-like slicing or Inspect Viz dashboards.</p><p>Each <code>Task</code> bundles:</p><ul><li><p>a prompt template (e.g., a jailbreak or math question),</p></li><li><p>a model runner (e.g., OpenAI, HuggingFace, Ollama, local endpoint),</p></li><li><p>a scoring function (numeric, boolean, class).</p></li></ul><p>Example CLI command:</p><pre><code>inspect eval my_eval.py --model openai/gpt-4o --limit 100</code></pre><h2><strong>&#128640; Quick Spin-Up (two minutes)</strong></h2><ol><li><p><strong>Install and pick a model provider</strong></p><pre><code>pip install inspect-ai openai
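# Any supported provider follows the same pattern; e.g. (assuming an
# Anthropic account): `pip install inspect-ai anthropic` and set
# ANTHROPIC_API_KEY instead of the OpenAI key below.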
export OPENAI_API_KEY=your-key</code></pre></li><li><p><strong>Hello-World eval (exact match + single call)</strong></p><pre><code>from inspect_ai import Task, task
from inspect_ai.dataset import Sample
from inspect_ai.solver import generate
from inspect_ai.scorer import exact

@task
def hello_world():
    return Task(
        dataset=[Sample(input="Just reply with Hello World", target="Hello World")],
        solver=[generate()],
        scorer=exact(),
    )</code></pre><pre><code>inspect eval hello.py --model openai/gpt-4o

# You can also write logs to a directory, watch them live in the GUI, and load them into Pandas data frames for basic monitoring:
# inspect eval hello.py --model openai/gpt-4o --log-dir runs/hello-world
# inspect view --log-dir runs/hello-world  # live GUI</code></pre></li><li><p><strong>Model-graded scoring</strong> (use an LLM to judge correctness):</p><pre><code>from inspect_ai.scorer import model_graded_fact
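# A fuller sketch in the same shape as the hello-world task (the task name
# and sample below are illustrative): instead of exact string matching,
# model_graded_fact() asks a judge model whether the output states the
# target fact.
from inspect_ai import Task, task
from inspect_ai.dataset import Sample
from inspect_ai.solver import generate

@task
def fact_check():
    return Task(
        dataset=[Sample(
            input="Which organisation develops Inspect AI?",
            target="The UK AI Security Institute (AISI)",
        )],
        solver=[generate()],
        scorer=model_graded_fact(),  # pass model="..." to use a separate judge
    )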
# ... Task(dataset=..., solver=[...], scorer=model_graded_fact())</code></pre></li></ol><p>Examples for multi-choice (HellaSwag), GSM8K few-shot math, and custom math equivalence scorers are in <a href="https://inspect.aisi.org.uk/tutorial.html">the docs tutorial</a>.</p><div class="pullquote"><p>Need creative, high-quality technical content? Happy to chat! Book a call with our Creative Engineers&#128071;&#127998;</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://calendar.app.google/s6ekF2rDbPjKWLLQ8&quot;,&quot;text&quot;:&quot;Let&#8217;s Chat! &#128222;&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://calendar.app.google/s6ekF2rDbPjKWLLQ8"><span>Let&#8217;s Chat! &#128222;</span></a></p></div><h2><strong>&#128293; Why MLOps and AI Safety Teams Should Care</strong></h2><ol><li><p><strong>Single CLI for multi-vendor evals</strong>: No more bespoke scripts per provider.</p></li><li><p><strong>Agent and tool safety</strong>: Isolated K8s pods or VM sandbox plugins let you run ReAct or Auto-GPT-style agents in locked pods. Includes policy limits for <a href="https://inspect.aisi.org.uk/reference/inspect_ai.util.html">tokens</a>, <a href="https://inspect.aisi.org.uk/reference/inspect_ai.html">clock- and working-time</a> (CPU time spent inside model/tool calls), <a href="https://inspect.aisi.org.uk/options.html">messages</a>, and sandbox parallelism/retries.</p></li><li><p><strong>Model-graded scoring</strong>: A unified <a href="https://inspect.aisi.org.uk/reference/">scoring library</a> with bootstrap CIs and pass/fail gates that is vastly cheaper than human eval yet richer than <a href="https://huggingface.co/spaces/evaluate-metric/exact_match">Exact Match</a>.</p></li><li><p><strong>Rich observability</strong>: Inspect View + Inspect Viz render per-step traces, costs, and scorer heatmaps. 
You can also log every prompt, response, and token count for audit.</p></li><li><p><strong>Ecosystem</strong>: k8s-sandbox, Hugging Face dataset bridges, SWE-Bench datasets, <a href="https://join.slack.com/t/inspectcommunity/shared_invite/zt-38v1m03v9-XxQg2QLocyAZLKBemqHqEg">active Slack community</a>, etc.</p></li></ol><p>I found out that you can also add solvers, scorers, or entire toolchains via the <a href="https://inspect.aisi.org.uk/extensions.html">extensions registry</a>.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!1fvd!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff506b39-ae29-4274-bc2d-85032be854d4_1600x1540.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!1fvd!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff506b39-ae29-4274-bc2d-85032be854d4_1600x1540.png 424w, https://substackcdn.com/image/fetch/$s_!1fvd!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff506b39-ae29-4274-bc2d-85032be854d4_1600x1540.png 848w, https://substackcdn.com/image/fetch/$s_!1fvd!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff506b39-ae29-4274-bc2d-85032be854d4_1600x1540.png 1272w, https://substackcdn.com/image/fetch/$s_!1fvd!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff506b39-ae29-4274-bc2d-85032be854d4_1600x1540.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!1fvd!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff506b39-ae29-4274-bc2d-85032be854d4_1600x1540.png" width="1456" height="1401" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ff506b39-ae29-4274-bc2d-85032be854d4_1600x1540.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1401,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!1fvd!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff506b39-ae29-4274-bc2d-85032be854d4_1600x1540.png 424w, https://substackcdn.com/image/fetch/$s_!1fvd!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff506b39-ae29-4274-bc2d-85032be854d4_1600x1540.png 848w, https://substackcdn.com/image/fetch/$s_!1fvd!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff506b39-ae29-4274-bc2d-85032be854d4_1600x1540.png 1272w, https://substackcdn.com/image/fetch/$s_!1fvd!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff506b39-ae29-4274-bc2d-85032be854d4_1600x1540.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset 
pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2><strong>&#128483;&#65039; Community Pulse</strong></h2><p><em>&#8220;We use https://github.com/UKGovernmentBEIS/inspect_ai which is great and comes with model-graded scorers. 
We also built support for reporting the results back to Langtrace AI which we use for observability.&#8221; </em>&#8212; Karthik Kalyanaraman, <a href="https://mlops-community.slack.com/archives/C04T55KFV8S/p1720015554729939?thread_ts=1720014591.699139&amp;cid=C04T55KFV8S">Slack thread</a> (July 2025)</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!bP1f!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0aee3019-beff-43aa-996d-eb78fb4bcd78_1600x657.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!bP1f!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0aee3019-beff-43aa-996d-eb78fb4bcd78_1600x657.png 424w, https://substackcdn.com/image/fetch/$s_!bP1f!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0aee3019-beff-43aa-996d-eb78fb4bcd78_1600x657.png 848w, https://substackcdn.com/image/fetch/$s_!bP1f!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0aee3019-beff-43aa-996d-eb78fb4bcd78_1600x657.png 1272w, https://substackcdn.com/image/fetch/$s_!bP1f!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0aee3019-beff-43aa-996d-eb78fb4bcd78_1600x657.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!bP1f!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0aee3019-beff-43aa-996d-eb78fb4bcd78_1600x657.png" width="1456" height="598" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0aee3019-beff-43aa-996d-eb78fb4bcd78_1600x657.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:598,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!bP1f!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0aee3019-beff-43aa-996d-eb78fb4bcd78_1600x657.png 424w, https://substackcdn.com/image/fetch/$s_!bP1f!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0aee3019-beff-43aa-996d-eb78fb4bcd78_1600x657.png 848w, https://substackcdn.com/image/fetch/$s_!bP1f!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0aee3019-beff-43aa-996d-eb78fb4bcd78_1600x657.png 1272w, https://substackcdn.com/image/fetch/$s_!bP1f!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0aee3019-beff-43aa-996d-eb78fb4bcd78_1600x657.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path 
d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p><em>&#8220;Inspect also has <a href="https://join.slack.com/t/inspectcommunity/shared_invite/zt-38v1m03v9-XxQg2QLocyAZLKBemqHqEg">its own very active Slack community</a>, and the Inspect userbase includes other safety research organisations (other AISIs, Apollo, METR) as well as some of the frontier labs.&#8221;</em> &#8212; Jason Gwartz (Head of Platform, UK AISI), <a href="https://mlops-community.slack.com/archives/C010A328X38/p1752767404547629?thread_ts=1752685284.759859&amp;cid=C010A328X38">Slack thread</a> (July 2025)</p><h2><strong>&#128161; Real-World Use Case: Agentic Safety Red-Team</strong></h2><ol><li><p><strong>Dataset:</strong> 500 complex multi-step jailbreak prompts.</p></li><li><p><strong>Solver chain:</strong> ReAct agent &#8594; tool-calling &#8594; self-critique.</p></li><li><p><strong>Sandbox:</strong> k8s_sandbox spins a pod per sample, capturing file/HTTP output.</p></li><li><p><strong>Scorer:</strong> model-graded policy checker + custom regex.</p></li><li><p><strong>Outcome:</strong> Run 500 &#215; 4 models 
overnight; Inspect Viz dashboard highlights policy breaks with stack traces.</p></li></ol><p>The agent is allowed to browse/execute code inside a K8s sandbox with domain allowlists while completing capture-the-flag tasks; this reduces blast radius if the model tries risky actions. (See <em><a href="https://ukgovernmentbeis.github.io/inspect_evals/evals/cybersecurity/cybench/">Cybench</a></em><a href="https://ukgovernmentbeis.github.io/inspect_evals/evals/cybersecurity/cybench/"> guide</a>.)</p><h1><strong>&#128202; How Inspect AI Stacks Up Against Other Eval Tools (2025)</strong></h1><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!gufZ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Febe87af8-543f-404d-a012-cc27d2d57395_1370x1600.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!gufZ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Febe87af8-543f-404d-a012-cc27d2d57395_1370x1600.png 424w, https://substackcdn.com/image/fetch/$s_!gufZ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Febe87af8-543f-404d-a012-cc27d2d57395_1370x1600.png 848w, https://substackcdn.com/image/fetch/$s_!gufZ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Febe87af8-543f-404d-a012-cc27d2d57395_1370x1600.png 1272w, https://substackcdn.com/image/fetch/$s_!gufZ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Febe87af8-543f-404d-a012-cc27d2d57395_1370x1600.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!gufZ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Febe87af8-543f-404d-a012-cc27d2d57395_1370x1600.png" width="1370" height="1600" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ebe87af8-543f-404d-a012-cc27d2d57395_1370x1600.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1600,&quot;width&quot;:1370,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;How Inspect AI Stacks Up Against Other Eval Tools (2025).&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="How Inspect AI Stacks Up Against Other Eval Tools (2025)." title="How Inspect AI Stacks Up Against Other Eval Tools (2025)." 
srcset="https://substackcdn.com/image/fetch/$s_!gufZ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Febe87af8-543f-404d-a012-cc27d2d57395_1370x1600.png 424w, https://substackcdn.com/image/fetch/$s_!gufZ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Febe87af8-543f-404d-a012-cc27d2d57395_1370x1600.png 848w, https://substackcdn.com/image/fetch/$s_!gufZ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Febe87af8-543f-404d-a012-cc27d2d57395_1370x1600.png 1272w, https://substackcdn.com/image/fetch/$s_!gufZ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Febe87af8-543f-404d-a012-cc27d2d57395_1370x1600.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" 
stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption"><em>How Inspect AI Stacks Up Against Other Eval Tools (2025).</em></figcaption></figure></div><div><hr></div><h2><strong>&#129489;&#8205;&#9878;&#65039; Final Verdict: 4.3/5 &#8220;Production-grade evals &amp; agent sandboxes, batteries included.&#8221;</strong></h2><h3><strong>Rating: &#11088;&#11088;&#11088;&#11088;&#9734; (4.3/5)</strong></h3><p>If your org treats evals as production infrastructure (with agents, tools, and safety constraints), Inspect AI is the most complete OSS choice today. The learning curve and runtime overhead are real, but the auditability, sandboxing, and scorer depth justify it for serious teams.</p><p><strong>&#9989; Ship it if&#8230;</strong></p><ul><li><p>You run recurring evals/regressions across multiple vendors or models (cloud + local).</p></li><li><p>You need to evaluate agents using tools with isolation (Docker/k8s) and approval policies.</p></li><li><p>You want LLM-graded or domain-specific scoring beyond Exact Match.</p></li><li><p>Regulatory or SOC2 requirements demand tamper-proof logs and stats.</p></li></ul><p><strong>&#10060; Hold off if&#8230;</strong></p><ul><li><p>You&#8217;re doing a <strong>one-week</strong> bake-off on static benchmarks. 
Lighter harnesses will be faster.</p></li><li><p>You don&#8217;t plan to test <strong>tool use/agents </strong>(Inspect&#8217;s agent tasks can be spendy)<strong>,</strong> and simple text-in/text-out scoring suffices.</p></li><li><p>You can&#8217;t support even Docker-level isolation yet (agentic evals will be risky).</p></li><li><p>You dislike declarative DSLs (lm-eval-harness may feel lighter).</p></li></ul><div><hr></div><h2><strong>&#128640; Starter Commands You&#8217;ll Actually Use</strong></h2><ul><li><p><strong>Run an eval:</strong> <code>inspect eval my_eval.py --model anthropic/claude-3-5-sonnet-20240620 --limit 100 --log-dir ./logs</code></p></li><li><p><strong>View logs:</strong> <code>inspect view</code> (web UI) or open the Inspect panel in VS Code.</p></li><li><p><strong>Enable sandbox:</strong> <code>--sandbox docker</code> or <code>--sandbox k8s:config.yml </code>(domain/network policy via K8s).</p></li><li><p><strong>Repro tips:</strong> set <code>--seed</code>, use caching, and pin model versions.</p></li></ul><div><hr></div><h2><strong>&#128204; Resources to Learn More</strong></h2><ul><li><p><a href="https://inspect.aisi.org.uk/">Docs</a>/Getting Started/Install.</p></li><li><p><a href="https://inspect.aisi.org.uk/providers.html">Model Providers</a> (OpenAI, Anthropic, Google, xAI, Mistral, HF, Azure, AWS, &#8230;).</p></li><li><p><a href="https://inspect.aisi.org.uk/tutorial.html">Tutorials</a> and Examples (Hello World, GSM8K, HellaSwag, Math, Tools, CTF).</p></li><li><p><a href="https://inspect.aisi.org.uk/sandboxing.html">Sandboxing</a> (Docker built-in; Kubernetes/Proxmox adapters).</p></li><li><p><a href="https://inspect.aisi.org.uk/log-viewer.html">Logs and Viewers</a> (VS Code + Web).</p></li><li><p><a href="https://github.com/UKGovernmentBEIS/inspect_evals">Inspect Evals</a> collection (incl. 
SWE-bench variants).</p></li><li><p><a href="https://pypi.org/project/inspect-ai/">PyPI</a> (latest version, license).</p></li></ul><div class="pullquote"><p>Love this review? Share it in the MLOps Slack, forward it to your EvalOps lead, or post it on X with <strong>#TuesdayToolReview</strong> and tag @mlopscommunity.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://neurlcreators.substack.com/p/inspect-ai-evaluation-framework-review?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://neurlcreators.substack.com/p/inspect-ai-evaluation-framework-review?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share</span></a></p></div><blockquote><p>Want deeper tutorials? Subscribe to <em><strong><a href="https://neurlcreators.substack.com/">The Neural Blueprint</a></strong></em> for hands-on guides &#129761; and follow us on <em><strong><a href="http://www.youtube.com/@neurlcreators">YouTube</a>, <a href="https://x.com/NeurlCreators">X</a></strong></em>, and <em><strong><a href="https://www.linkedin.com/showcase/neurl-creators/">LinkedIn</a></strong></em>.</p></blockquote><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://neurlcreators.substack.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://neurlcreators.substack.com/subscribe?"><span>Subscribe now</span></a></p>]]></content:encoded></item><item><title><![CDATA[How Nano Banana Compares to GPT-4o Image]]></title><description><![CDATA[Is Google&#8217;s new model really worth going bananas over? Let's go over how Nano Banana compares to GPT-4o Image. Time to peel, well, the banana? 
&#127820;]]></description><link>https://neurlcreators.substack.com/p/how-nano-banana-compares-to-gpt-4o</link><guid isPermaLink="false">https://neurlcreators.substack.com/p/how-nano-banana-compares-to-gpt-4o</guid><dc:creator><![CDATA[Eteimorde Youdiowei]]></dc:creator><pubDate>Fri, 12 Sep 2025 10:00:22 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/346d4034-d31a-4daa-bd3e-392e69da49e4_1456x1048.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>When GPT-4o Image launched, it took the world by storm. Within just a week, users had generated over <a href="https://techcrunch.com/2025/04/03/chatgpt-users-have-generated-over-700m-images-since-last-week-openai-says/">700 million images</a>. Social media was flooded with Studio Ghibli-style portraits, and everyone seemed to be trying it out. In contrast, the release of Google&#8217;s Nano Banana felt almost secretive. Yet, despite its quiet debut, its capabilities quickly made the internet <a href="https://techcrunch.com/2025/08/26/google-geminis-ai-image-model-gets-a-bananas-upgrade/">go bananas</a>.</p><p>Nano Banana has since been hailed as the most impressive image generation model released since GPT-4o Image. It wowed users with powerful editing abilities, producing changes that were both consistent and remarkably realistic.</p><p>With two groundbreaking models now in the spotlight, the obvious question arises: which one is superior? That&#8217;s exactly what this article sets out to explore. 
We&#8217;ll compare Nano Banana&#8217;s capabilities against GPT-4o Image, drawing on results from OpenAI&#8217;s original <a href="https://openai.com/index/introducing-4o-image-generation/">announcement blog post</a> and putting Nano Banana through the same set of tests.</p><p>Along the way, we&#8217;ll evaluate performance across 5 key areas:</p><ul><li><p>Instruction following</p></li><li><p>Text rendering</p></li><li><p>In-context learning</p></li><li><p>World knowledge</p></li><li><p>Photorealism</p></li></ul><p>Let&#8217;s peel the banana &#127820;&#127820;&#127820;</p><h2>Useful Image Generation</h2><p>Image generation in AI is nothing new. Long before language models began stealing the spotlight, we had pioneers like DALL&#183;E and Stable Diffusion. These models could produce visually striking images, but &#8220;cool&#8221; didn&#8217;t always mean <em>useful</em>.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!iFqs!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd3c498ee-613e-4682-9bf8-b45706378a05_1345x1600.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!iFqs!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd3c498ee-613e-4682-9bf8-b45706378a05_1345x1600.png 424w, https://substackcdn.com/image/fetch/$s_!iFqs!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd3c498ee-613e-4682-9bf8-b45706378a05_1345x1600.png 848w, https://substackcdn.com/image/fetch/$s_!iFqs!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd3c498ee-613e-4682-9bf8-b45706378a05_1345x1600.png 
1272w, https://substackcdn.com/image/fetch/$s_!iFqs!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd3c498ee-613e-4682-9bf8-b45706378a05_1345x1600.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!iFqs!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd3c498ee-613e-4682-9bf8-b45706378a05_1345x1600.png" width="1345" height="1600" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d3c498ee-613e-4682-9bf8-b45706378a05_1345x1600.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1600,&quot;width&quot;:1345,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;One of the tests from the original GPT-4o image announcement blog post&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="One of the tests from the original GPT-4o image announcement blog post" title="One of the tests from the original GPT-4o image announcement blog post" srcset="https://substackcdn.com/image/fetch/$s_!iFqs!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd3c498ee-613e-4682-9bf8-b45706378a05_1345x1600.png 424w, https://substackcdn.com/image/fetch/$s_!iFqs!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd3c498ee-613e-4682-9bf8-b45706378a05_1345x1600.png 848w, 
https://substackcdn.com/image/fetch/$s_!iFqs!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd3c498ee-613e-4682-9bf8-b45706378a05_1345x1600.png 1272w, https://substackcdn.com/image/fetch/$s_!iFqs!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd3c498ee-613e-4682-9bf8-b45706378a05_1345x1600.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption"><em>One of the tests from the original GPT-4o image announcement blog 
post</em></figcaption></figure></div><p>Earlier image generation models struggled with key limitations:</p><ul><li><p>Poor text rendering</p></li><li><p>Lack of real-world knowledge</p></li><li><p>Difficulty producing truly realistic images</p></li></ul><p>OpenAI addressed these issues with the release of GPT-4o Image. More than just an image generator, it was powered by a multimodal model, giving it a deeper understanding of the world and the ability to create images that were both realistic and contextually relevant.</p><p>Not long after, Google entered the scene with Nano Banana, combining its Gemini Flash model with an image generator to deliver results that rivaled GPT-4o Image.</p><h2>Text Rendering</h2><p>Earlier image generation models like DALL&#183;E 2 struggled to produce meaningful text. While they could create visually convincing images, their text rendering was often impractical and unreliable.</p><p>GPT-4o Image changed that by introducing near-flawless text generation, making it possible to create things like comic panels, posters, or images that seamlessly integrate written content.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!yDyc!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc172bff0-f714-47c6-9656-95df6e90e09d_1345x1600.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!yDyc!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc172bff0-f714-47c6-9656-95df6e90e09d_1345x1600.png 424w, https://substackcdn.com/image/fetch/$s_!yDyc!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc172bff0-f714-47c6-9656-95df6e90e09d_1345x1600.png 
848w, https://substackcdn.com/image/fetch/$s_!yDyc!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc172bff0-f714-47c6-9656-95df6e90e09d_1345x1600.png 1272w, https://substackcdn.com/image/fetch/$s_!yDyc!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc172bff0-f714-47c6-9656-95df6e90e09d_1345x1600.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!yDyc!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc172bff0-f714-47c6-9656-95df6e90e09d_1345x1600.png" width="1345" height="1600" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c172bff0-f714-47c6-9656-95df6e90e09d_1345x1600.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1600,&quot;width&quot;:1345,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!yDyc!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc172bff0-f714-47c6-9656-95df6e90e09d_1345x1600.png 424w, https://substackcdn.com/image/fetch/$s_!yDyc!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc172bff0-f714-47c6-9656-95df6e90e09d_1345x1600.png 848w, 
https://substackcdn.com/image/fetch/$s_!yDyc!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc172bff0-f714-47c6-9656-95df6e90e09d_1345x1600.png 1272w, https://substackcdn.com/image/fetch/$s_!yDyc!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc172bff0-f714-47c6-9656-95df6e90e09d_1345x1600.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Nano Banana also supports text rendering, and in many cases, it does so convincingly, placing text naturally within an image so that 
it looks as if it belongs in the real world.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Uzx2!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F708e29e8-d1df-45ac-ace4-6d908f21b4dc_1345x1600.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Uzx2!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F708e29e8-d1df-45ac-ace4-6d908f21b4dc_1345x1600.png 424w, https://substackcdn.com/image/fetch/$s_!Uzx2!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F708e29e8-d1df-45ac-ace4-6d908f21b4dc_1345x1600.png 848w, https://substackcdn.com/image/fetch/$s_!Uzx2!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F708e29e8-d1df-45ac-ace4-6d908f21b4dc_1345x1600.png 1272w, https://substackcdn.com/image/fetch/$s_!Uzx2!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F708e29e8-d1df-45ac-ace4-6d908f21b4dc_1345x1600.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Uzx2!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F708e29e8-d1df-45ac-ace4-6d908f21b4dc_1345x1600.png" width="1345" height="1600" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/708e29e8-d1df-45ac-ace4-6d908f21b4dc_1345x1600.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1600,&quot;width&quot;:1345,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Uzx2!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F708e29e8-d1df-45ac-ace4-6d908f21b4dc_1345x1600.png 424w, https://substackcdn.com/image/fetch/$s_!Uzx2!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F708e29e8-d1df-45ac-ace4-6d908f21b4dc_1345x1600.png 848w, https://substackcdn.com/image/fetch/$s_!Uzx2!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F708e29e8-d1df-45ac-ace4-6d908f21b4dc_1345x1600.png 1272w, https://substackcdn.com/image/fetch/$s_!Uzx2!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F708e29e8-d1df-45ac-ace4-6d908f21b4dc_1345x1600.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>However, compared to GPT-4o Image, Nano Banana still has some limitations. It can sometimes misalign text, produce gibberish, or generate words that aren&#8217;t clearly legible. 
Mathematical equations, in particular, pose a challenge, as they often fail to render accurately.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!5piV!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04c439cd-510a-4ae0-ac40-61d3415fa2d0_1591x1600.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!5piV!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04c439cd-510a-4ae0-ac40-61d3415fa2d0_1591x1600.png 424w, https://substackcdn.com/image/fetch/$s_!5piV!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04c439cd-510a-4ae0-ac40-61d3415fa2d0_1591x1600.png 848w, https://substackcdn.com/image/fetch/$s_!5piV!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04c439cd-510a-4ae0-ac40-61d3415fa2d0_1591x1600.png 1272w, https://substackcdn.com/image/fetch/$s_!5piV!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04c439cd-510a-4ae0-ac40-61d3415fa2d0_1591x1600.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!5piV!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04c439cd-510a-4ae0-ac40-61d3415fa2d0_1591x1600.png" width="1456" height="1464" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/04c439cd-510a-4ae0-ac40-61d3415fa2d0_1591x1600.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1464,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!5piV!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04c439cd-510a-4ae0-ac40-61d3415fa2d0_1591x1600.png 424w, https://substackcdn.com/image/fetch/$s_!5piV!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04c439cd-510a-4ae0-ac40-61d3415fa2d0_1591x1600.png 848w, https://substackcdn.com/image/fetch/$s_!5piV!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04c439cd-510a-4ae0-ac40-61d3415fa2d0_1591x1600.png 1272w, https://substackcdn.com/image/fetch/$s_!5piV!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04c439cd-510a-4ae0-ac40-61d3415fa2d0_1591x1600.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2>Instruction Following</h2><p>One of the biggest breakthroughs in AI has been the ability of language models to follow instructions. 
The question is simple: if you give a prompt, can the model understand and execute it?</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!L0ri!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F82a8c159-60de-4adc-93ac-e84c4933d5a9_1600x1474.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!L0ri!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F82a8c159-60de-4adc-93ac-e84c4933d5a9_1600x1474.png 424w, https://substackcdn.com/image/fetch/$s_!L0ri!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F82a8c159-60de-4adc-93ac-e84c4933d5a9_1600x1474.png 848w, https://substackcdn.com/image/fetch/$s_!L0ri!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F82a8c159-60de-4adc-93ac-e84c4933d5a9_1600x1474.png 1272w, https://substackcdn.com/image/fetch/$s_!L0ri!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F82a8c159-60de-4adc-93ac-e84c4933d5a9_1600x1474.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!L0ri!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F82a8c159-60de-4adc-93ac-e84c4933d5a9_1600x1474.png" width="1456" height="1341" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/82a8c159-60de-4adc-93ac-e84c4933d5a9_1600x1474.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1341,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!L0ri!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F82a8c159-60de-4adc-93ac-e84c4933d5a9_1600x1474.png 424w, https://substackcdn.com/image/fetch/$s_!L0ri!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F82a8c159-60de-4adc-93ac-e84c4933d5a9_1600x1474.png 848w, https://substackcdn.com/image/fetch/$s_!L0ri!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F82a8c159-60de-4adc-93ac-e84c4933d5a9_1600x1474.png 1272w, https://substackcdn.com/image/fetch/$s_!L0ri!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F82a8c159-60de-4adc-93ac-e84c4933d5a9_1600x1474.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p> Since both Nano Banana and GPT-4o are built on top of large language models, they inherit strong instruction-following abilities. 
This allows users to generate images that closely align with their prompts, even when the requests are detailed or complex.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!iBR_!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F96483404-dd57-45e0-8e48-e32ebb9eaad7_1600x1582.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!iBR_!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F96483404-dd57-45e0-8e48-e32ebb9eaad7_1600x1582.png 424w, https://substackcdn.com/image/fetch/$s_!iBR_!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F96483404-dd57-45e0-8e48-e32ebb9eaad7_1600x1582.png 848w, https://substackcdn.com/image/fetch/$s_!iBR_!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F96483404-dd57-45e0-8e48-e32ebb9eaad7_1600x1582.png 1272w, https://substackcdn.com/image/fetch/$s_!iBR_!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F96483404-dd57-45e0-8e48-e32ebb9eaad7_1600x1582.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!iBR_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F96483404-dd57-45e0-8e48-e32ebb9eaad7_1600x1582.png" width="1456" height="1440" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/96483404-dd57-45e0-8e48-e32ebb9eaad7_1600x1582.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1440,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!iBR_!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F96483404-dd57-45e0-8e48-e32ebb9eaad7_1600x1582.png 424w, https://substackcdn.com/image/fetch/$s_!iBR_!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F96483404-dd57-45e0-8e48-e32ebb9eaad7_1600x1582.png 848w, https://substackcdn.com/image/fetch/$s_!iBR_!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F96483404-dd57-45e0-8e48-e32ebb9eaad7_1600x1582.png 1272w, https://substackcdn.com/image/fetch/$s_!iBR_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F96483404-dd57-45e0-8e48-e32ebb9eaad7_1600x1582.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>In practice, the two models perform quite similarly. Both tend to follow user instructions faithfully, with minimal drift, which sets them apart from older image generation models that often produced irrelevant or inconsistent results.</p><h2>In-Context Learning</h2><p>Like GPT-4o Image, Nano Banana is capable of in-context learning. This means it can use previous inputs, whether text or images, to guide the generation of new outputs. 
Just as large language models can use context to produce relevant text, these image models can leverage context to create consistent and meaningful visuals.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!B15N!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F37522710-02d5-4b90-81d2-01da46a6f1bf_1186x1600.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!B15N!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F37522710-02d5-4b90-81d2-01da46a6f1bf_1186x1600.png 424w, https://substackcdn.com/image/fetch/$s_!B15N!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F37522710-02d5-4b90-81d2-01da46a6f1bf_1186x1600.png 848w, https://substackcdn.com/image/fetch/$s_!B15N!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F37522710-02d5-4b90-81d2-01da46a6f1bf_1186x1600.png 1272w, https://substackcdn.com/image/fetch/$s_!B15N!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F37522710-02d5-4b90-81d2-01da46a6f1bf_1186x1600.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!B15N!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F37522710-02d5-4b90-81d2-01da46a6f1bf_1186x1600.png" width="1186" height="1600" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/37522710-02d5-4b90-81d2-01da46a6f1bf_1186x1600.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1600,&quot;width&quot;:1186,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!B15N!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F37522710-02d5-4b90-81d2-01da46a6f1bf_1186x1600.png 424w, https://substackcdn.com/image/fetch/$s_!B15N!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F37522710-02d5-4b90-81d2-01da46a6f1bf_1186x1600.png 848w, https://substackcdn.com/image/fetch/$s_!B15N!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F37522710-02d5-4b90-81d2-01da46a6f1bf_1186x1600.png 1272w, https://substackcdn.com/image/fetch/$s_!B15N!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F37522710-02d5-4b90-81d2-01da46a6f1bf_1186x1600.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>For example, both Nano Banana and GPT-4o Image were first asked to generate an image of a chainsaw. In the next prompt, they were asked to create an advertisement featuring a grandmother using the chainsaw. 
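</p><p>For builders who want to reproduce this two-step workflow programmatically, here is a minimal sketch using Google&#8217;s <code>google-genai</code> Python SDK. Treat it as a hypothetical sketch: the model id <code>gemini-2.5-flash-image-preview</code> and the response parsing shown in the comments are assumptions, so verify them against the current Gemini API documentation before relying on them.</p>

```python
# Sketch of the two-step chainsaw -> grandmother-ad workflow with Google's
# google-genai Python SDK. The model id and the response handling below are
# ASSUMPTIONS; check the current Gemini API docs before relying on them.

def build_followup_contents(new_prompt, prior_outputs):
    """Package earlier outputs (image parts or text) together with a new
    prompt, so the model can use them as in-context references."""
    return list(prior_outputs) + [new_prompt]

# Hypothetical usage (requires `pip install google-genai` and an API key):
#
#   from google import genai
#   client = genai.Client()
#   first = client.models.generate_content(
#       model="gemini-2.5-flash-image-preview",  # "Nano Banana" (assumed id)
#       contents=["Generate an image of a chainsaw."],
#   )
#   # Assumed: the generated image comes back as an inline-data part.
#   chainsaw = [p for p in first.candidates[0].content.parts
#               if p.inline_data is not None][0]
#   second = client.models.generate_content(
#       model="gemini-2.5-flash-image-preview",
#       contents=build_followup_contents(
#           "Create an advertisement featuring a grandmother using this chainsaw.",
#           [chainsaw],
#       ),
#   )
```

<p>The helper simply places the prior outputs before the new prompt in the request, which is how the second call gets the chainsaw image as context.</p><p>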
Both models handled the task with ease, combining textual and visual context to produce coherent, context-aware results.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!q3tG!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F54a612e5-b2e8-410b-91ac-a9c3d4316ef7_1600x1582.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!q3tG!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F54a612e5-b2e8-410b-91ac-a9c3d4316ef7_1600x1582.png 424w, https://substackcdn.com/image/fetch/$s_!q3tG!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F54a612e5-b2e8-410b-91ac-a9c3d4316ef7_1600x1582.png 848w, https://substackcdn.com/image/fetch/$s_!q3tG!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F54a612e5-b2e8-410b-91ac-a9c3d4316ef7_1600x1582.png 1272w, https://substackcdn.com/image/fetch/$s_!q3tG!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F54a612e5-b2e8-410b-91ac-a9c3d4316ef7_1600x1582.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!q3tG!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F54a612e5-b2e8-410b-91ac-a9c3d4316ef7_1600x1582.png" width="1456" height="1440" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/54a612e5-b2e8-410b-91ac-a9c3d4316ef7_1600x1582.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1440,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!q3tG!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F54a612e5-b2e8-410b-91ac-a9c3d4316ef7_1600x1582.png 424w, https://substackcdn.com/image/fetch/$s_!q3tG!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F54a612e5-b2e8-410b-91ac-a9c3d4316ef7_1600x1582.png 848w, https://substackcdn.com/image/fetch/$s_!q3tG!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F54a612e5-b2e8-410b-91ac-a9c3d4316ef7_1600x1582.png 1272w, https://substackcdn.com/image/fetch/$s_!q3tG!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F54a612e5-b2e8-410b-91ac-a9c3d4316ef7_1600x1582.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>In another case, when given a sketch of a building as input, each model was able to transform it into a realistic image of the same structure.</p><p>These examples highlight that when it comes to in-context learning, Nano Banana is on par with GPT-4o Image.</p><h2>World Knowledge</h2><p>One of the strengths of GPT-4o Image is its deep understanding of the world. 
With just a minimal prompt, it can often fill in the gaps and still generate accurate, meaningful images.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!P7ta!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6bd110ab-89ff-43b9-9a1a-a0802cfa2764_1421x1600.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!P7ta!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6bd110ab-89ff-43b9-9a1a-a0802cfa2764_1421x1600.png 424w, https://substackcdn.com/image/fetch/$s_!P7ta!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6bd110ab-89ff-43b9-9a1a-a0802cfa2764_1421x1600.png 848w, https://substackcdn.com/image/fetch/$s_!P7ta!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6bd110ab-89ff-43b9-9a1a-a0802cfa2764_1421x1600.png 1272w, https://substackcdn.com/image/fetch/$s_!P7ta!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6bd110ab-89ff-43b9-9a1a-a0802cfa2764_1421x1600.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!P7ta!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6bd110ab-89ff-43b9-9a1a-a0802cfa2764_1421x1600.png" width="1421" height="1600" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/6bd110ab-89ff-43b9-9a1a-a0802cfa2764_1421x1600.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1600,&quot;width&quot;:1421,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!P7ta!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6bd110ab-89ff-43b9-9a1a-a0802cfa2764_1421x1600.png 424w, https://substackcdn.com/image/fetch/$s_!P7ta!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6bd110ab-89ff-43b9-9a1a-a0802cfa2764_1421x1600.png 848w, https://substackcdn.com/image/fetch/$s_!P7ta!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6bd110ab-89ff-43b9-9a1a-a0802cfa2764_1421x1600.png 1272w, https://substackcdn.com/image/fetch/$s_!P7ta!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6bd110ab-89ff-43b9-9a1a-a0802cfa2764_1421x1600.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>For example, GPT-4o was asked to create a poster of different whale species without being explicitly told which whales to include. Remarkably, it succeeded. 
Nano Banana demonstrates a similar level of world knowledge, producing results that show it understands context and real-world concepts.</p><p>It&#8217;s the combination of world knowledge and instruction-following that makes these image generation models so powerful and versatile.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!zz-V!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2368e7d8-3545-4fe5-9a6d-242e59e23347_1421x1600.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!zz-V!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2368e7d8-3545-4fe5-9a6d-242e59e23347_1421x1600.png 424w, https://substackcdn.com/image/fetch/$s_!zz-V!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2368e7d8-3545-4fe5-9a6d-242e59e23347_1421x1600.png 848w, https://substackcdn.com/image/fetch/$s_!zz-V!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2368e7d8-3545-4fe5-9a6d-242e59e23347_1421x1600.png 1272w, https://substackcdn.com/image/fetch/$s_!zz-V!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2368e7d8-3545-4fe5-9a6d-242e59e23347_1421x1600.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!zz-V!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2368e7d8-3545-4fe5-9a6d-242e59e23347_1421x1600.png" width="1421" height="1600" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/2368e7d8-3545-4fe5-9a6d-242e59e23347_1421x1600.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1600,&quot;width&quot;:1421,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!zz-V!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2368e7d8-3545-4fe5-9a6d-242e59e23347_1421x1600.png 424w, https://substackcdn.com/image/fetch/$s_!zz-V!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2368e7d8-3545-4fe5-9a6d-242e59e23347_1421x1600.png 848w, https://substackcdn.com/image/fetch/$s_!zz-V!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2368e7d8-3545-4fe5-9a6d-242e59e23347_1421x1600.png 1272w, https://substackcdn.com/image/fetch/$s_!zz-V!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2368e7d8-3545-4fe5-9a6d-242e59e23347_1421x1600.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>That said, Nano Banana isn&#8217;t perfect. While its world knowledge is impressive, it occasionally makes mistakes. On the whale poster, for instance, it misspelled &#8220;orca&#8221; and &#8220;narwhal.&#8221; 
Errors like these tend to stem from the same weaknesses Nano Banana shows in text rendering.</p><h2>Photorealism</h2><p>Like GPT-4o Image, Nano Banana is capable of generating highly photorealistic images that lack the typical &#8220;AI look.&#8221; In many cases, it becomes nearly impossible to distinguish an AI-generated image from a real photograph.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!KKgI!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F648d0d62-7aff-4db4-aeb1-9ce9e29d7ce5_1421x1600.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!KKgI!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F648d0d62-7aff-4db4-aeb1-9ce9e29d7ce5_1421x1600.png 424w, https://substackcdn.com/image/fetch/$s_!KKgI!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F648d0d62-7aff-4db4-aeb1-9ce9e29d7ce5_1421x1600.png 848w, https://substackcdn.com/image/fetch/$s_!KKgI!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F648d0d62-7aff-4db4-aeb1-9ce9e29d7ce5_1421x1600.png 1272w, https://substackcdn.com/image/fetch/$s_!KKgI!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F648d0d62-7aff-4db4-aeb1-9ce9e29d7ce5_1421x1600.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!KKgI!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F648d0d62-7aff-4db4-aeb1-9ce9e29d7ce5_1421x1600.png" width="1421" height="1600" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/648d0d62-7aff-4db4-aeb1-9ce9e29d7ce5_1421x1600.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1600,&quot;width&quot;:1421,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!KKgI!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F648d0d62-7aff-4db4-aeb1-9ce9e29d7ce5_1421x1600.png 424w, https://substackcdn.com/image/fetch/$s_!KKgI!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F648d0d62-7aff-4db4-aeb1-9ce9e29d7ce5_1421x1600.png 848w, https://substackcdn.com/image/fetch/$s_!KKgI!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F648d0d62-7aff-4db4-aeb1-9ce9e29d7ce5_1421x1600.png 1272w, https://substackcdn.com/image/fetch/$s_!KKgI!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F648d0d62-7aff-4db4-aeb1-9ce9e29d7ce5_1421x1600.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>This level of realism, combined with strong instruction-following and world knowledge, allows both models to produce images that look authentic and believable.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!XEIn!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F423235e5-45fb-43f9-9bc9-5488cbd93a17_1591x1600.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!XEIn!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F423235e5-45fb-43f9-9bc9-5488cbd93a17_1591x1600.png 424w, 
https://substackcdn.com/image/fetch/$s_!XEIn!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F423235e5-45fb-43f9-9bc9-5488cbd93a17_1591x1600.png 848w, https://substackcdn.com/image/fetch/$s_!XEIn!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F423235e5-45fb-43f9-9bc9-5488cbd93a17_1591x1600.png 1272w, https://substackcdn.com/image/fetch/$s_!XEIn!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F423235e5-45fb-43f9-9bc9-5488cbd93a17_1591x1600.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!XEIn!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F423235e5-45fb-43f9-9bc9-5488cbd93a17_1591x1600.png" width="1456" height="1464" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/423235e5-45fb-43f9-9bc9-5488cbd93a17_1591x1600.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1464,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!XEIn!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F423235e5-45fb-43f9-9bc9-5488cbd93a17_1591x1600.png 424w, 
https://substackcdn.com/image/fetch/$s_!XEIn!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F423235e5-45fb-43f9-9bc9-5488cbd93a17_1591x1600.png 848w, https://substackcdn.com/image/fetch/$s_!XEIn!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F423235e5-45fb-43f9-9bc9-5488cbd93a17_1591x1600.png 1272w, https://substackcdn.com/image/fetch/$s_!XEIn!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F423235e5-45fb-43f9-9bc9-5488cbd93a17_1591x1600.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" 
y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Since both deliver impressive photorealism, choosing between them often comes down to user preference and the specific use case.</p><h2>Can Nano Banana do Ghibli?</h2><p>One of GPT-4o Image&#8217;s most celebrated capabilities was generating Ghibli-styled images, which raises an obvious question: can Nano Banana do the same? The answer: not really.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!vvPo!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa4854ac5-91b8-4fae-b0f5-11906b72b588_1024x585.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!vvPo!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa4854ac5-91b8-4fae-b0f5-11906b72b588_1024x585.png 424w, https://substackcdn.com/image/fetch/$s_!vvPo!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa4854ac5-91b8-4fae-b0f5-11906b72b588_1024x585.png 848w, https://substackcdn.com/image/fetch/$s_!vvPo!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa4854ac5-91b8-4fae-b0f5-11906b72b588_1024x585.png 1272w, https://substackcdn.com/image/fetch/$s_!vvPo!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa4854ac5-91b8-4fae-b0f5-11906b72b588_1024x585.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!vvPo!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa4854ac5-91b8-4fae-b0f5-11906b72b588_1024x585.png" width="1024" height="585" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a4854ac5-91b8-4fae-b0f5-11906b72b588_1024x585.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:585,&quot;width&quot;:1024,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;A Ghibli generated image of Yusuf Dikec by Nano Banana&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="A Ghibli generated image of Yusuf Dikec by Nano Banana" title="A Ghibli generated image of Yusuf Dikec by Nano Banana" srcset="https://substackcdn.com/image/fetch/$s_!vvPo!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa4854ac5-91b8-4fae-b0f5-11906b72b588_1024x585.png 424w, https://substackcdn.com/image/fetch/$s_!vvPo!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa4854ac5-91b8-4fae-b0f5-11906b72b588_1024x585.png 848w, https://substackcdn.com/image/fetch/$s_!vvPo!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa4854ac5-91b8-4fae-b0f5-11906b72b588_1024x585.png 1272w, https://substackcdn.com/image/fetch/$s_!vvPo!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa4854ac5-91b8-4fae-b0f5-11906b72b588_1024x585.png 1456w" sizes="100vw" 
loading="lazy"></picture></div></a><figcaption class="image-caption"><em>A Ghibli-style image of Yusuf Dikec generated by Nano Banana</em></figcaption></figure></div><p>Most of the Ghibli-style images Nano Banana generates lean toward realism, with an oil-painting look rather than the rounded, cute style of Studio Ghibli.</p><p>This limitation actually gives us a better understanding of these models: everything depends on the data they were trained on.</p><p>Perhaps GPT-4o Image&#8217;s training data included many Ghibli-style images, or perhaps Google, wary of lawsuits, simply doesn&#8217;t want its model to imitate the styles of
animation companies.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!6ohq!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F25fd2723-5b25-4078-a0cb-401092b28510_1024x683.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!6ohq!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F25fd2723-5b25-4078-a0cb-401092b28510_1024x683.png 424w, https://substackcdn.com/image/fetch/$s_!6ohq!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F25fd2723-5b25-4078-a0cb-401092b28510_1024x683.png 848w, https://substackcdn.com/image/fetch/$s_!6ohq!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F25fd2723-5b25-4078-a0cb-401092b28510_1024x683.png 1272w, https://substackcdn.com/image/fetch/$s_!6ohq!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F25fd2723-5b25-4078-a0cb-401092b28510_1024x683.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!6ohq!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F25fd2723-5b25-4078-a0cb-401092b28510_1024x683.png" width="1024" height="683" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/25fd2723-5b25-4078-a0cb-401092b28510_1024x683.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:683,&quot;width&quot;:1024,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;Nano 
Banana attempting to generate the Distracted Boyfriend meme in Ghibli style&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Nano Banana attempting to generate the Distracted Boyfriend meme in Ghibli style" title="Nano Banana attempting to generate the Distracted Boyfriend meme in Ghibli style" srcset="https://substackcdn.com/image/fetch/$s_!6ohq!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F25fd2723-5b25-4078-a0cb-401092b28510_1024x683.png 424w, https://substackcdn.com/image/fetch/$s_!6ohq!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F25fd2723-5b25-4078-a0cb-401092b28510_1024x683.png 848w, https://substackcdn.com/image/fetch/$s_!6ohq!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F25fd2723-5b25-4078-a0cb-401092b28510_1024x683.png 1272w, https://substackcdn.com/image/fetch/$s_!6ohq!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F25fd2723-5b25-4078-a0cb-401092b28510_1024x683.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 
7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Nano Banana attempting to generate the Distracted Boyfriend meme in Ghibli style</figcaption></figure></div><div class="pullquote"><p>Know someone who might need this? Share this post with your network and friends!</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://neurlcreators.substack.com/p/how-nano-banana-compares-to-gpt-4o?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://neurlcreators.substack.com/p/how-nano-banana-compares-to-gpt-4o?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share</span></a></p></div><h2>Nano Banana as an Image Editor</h2><p>One thing to note about Nano Banana: it is not actually advertised as an image generator in the way models such as Imagen or GPT-4o Image are. 
Instead, it is marketed as an image editing model, and it excels at this task.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!1GoM!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe9418122-1b09-4c92-b1ae-92024a6ff7cd_1024x683.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!1GoM!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe9418122-1b09-4c92-b1ae-92024a6ff7cd_1024x683.png 424w, https://substackcdn.com/image/fetch/$s_!1GoM!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe9418122-1b09-4c92-b1ae-92024a6ff7cd_1024x683.png 848w, https://substackcdn.com/image/fetch/$s_!1GoM!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe9418122-1b09-4c92-b1ae-92024a6ff7cd_1024x683.png 1272w, https://substackcdn.com/image/fetch/$s_!1GoM!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe9418122-1b09-4c92-b1ae-92024a6ff7cd_1024x683.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!1GoM!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe9418122-1b09-4c92-b1ae-92024a6ff7cd_1024x683.png" width="1024" height="683" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e9418122-1b09-4c92-b1ae-92024a6ff7cd_1024x683.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:683,&quot;width&quot;:1024,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;Nano Banana showcasing its editing skills by transforming an image originally generated by GPT-4o image&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Nano Banana showcasing its editing skills by transforming an image originally generated by GPT-4o image" title="Nano Banana showcasing its editing skills by transforming an image originally generated by GPT-4o image" srcset="https://substackcdn.com/image/fetch/$s_!1GoM!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe9418122-1b09-4c92-b1ae-92024a6ff7cd_1024x683.png 424w, https://substackcdn.com/image/fetch/$s_!1GoM!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe9418122-1b09-4c92-b1ae-92024a6ff7cd_1024x683.png 848w, https://substackcdn.com/image/fetch/$s_!1GoM!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe9418122-1b09-4c92-b1ae-92024a6ff7cd_1024x683.png 1272w, https://substackcdn.com/image/fetch/$s_!1GoM!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe9418122-1b09-4c92-b1ae-92024a6ff7cd_1024x683.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 
pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption"><em>Nano Banana showcases its editing skills by transforming an image originally generated by GPT-4o Image.</em></figcaption></figure></div><p>It is capable of editing any image, whether a real-life photo or one generated by another model, while maintaining the style and consistency of the original without giving it that "AI feel."</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!3ASs!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc38e90fd-6fdc-4502-8426-230ca852839c_1024x683.png" 
data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!3ASs!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc38e90fd-6fdc-4502-8426-230ca852839c_1024x683.png 424w, https://substackcdn.com/image/fetch/$s_!3ASs!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc38e90fd-6fdc-4502-8426-230ca852839c_1024x683.png 848w, https://substackcdn.com/image/fetch/$s_!3ASs!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc38e90fd-6fdc-4502-8426-230ca852839c_1024x683.png 1272w, https://substackcdn.com/image/fetch/$s_!3ASs!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc38e90fd-6fdc-4502-8426-230ca852839c_1024x683.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!3ASs!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc38e90fd-6fdc-4502-8426-230ca852839c_1024x683.png" width="1024" height="683" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c38e90fd-6fdc-4502-8426-230ca852839c_1024x683.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:683,&quot;width&quot;:1024,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;Nano Banana seamlessly replaced the snail in this image with a sloth&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Nano Banana 
seamlessly replaced the snail in this image with a sloth" title="Nano Banana seamlessly replaced the snail in this image with a sloth" srcset="https://substackcdn.com/image/fetch/$s_!3ASs!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc38e90fd-6fdc-4502-8426-230ca852839c_1024x683.png 424w, https://substackcdn.com/image/fetch/$s_!3ASs!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc38e90fd-6fdc-4502-8426-230ca852839c_1024x683.png 848w, https://substackcdn.com/image/fetch/$s_!3ASs!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc38e90fd-6fdc-4502-8426-230ca852839c_1024x683.png 1272w, https://substackcdn.com/image/fetch/$s_!3ASs!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc38e90fd-6fdc-4502-8426-230ca852839c_1024x683.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container 
view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption"><em>Nano Banana seamlessly replaced the snail in this image with a sloth</em></figcaption></figure></div><p>For example, the image above was originally generated by GPT-4o Image with a snail rather than a sloth. Nano Banana edited it seamlessly, swapping in the sloth without altering anything else in the scene. This highlights the strength of its editing capabilities.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!-PjN!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbbf769eb-918b-4ec6-8b88-b852c62b7d0d_1536x1024.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!-PjN!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbbf769eb-918b-4ec6-8b88-b852c62b7d0d_1536x1024.png 424w, https://substackcdn.com/image/fetch/$s_!-PjN!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbbf769eb-918b-4ec6-8b88-b852c62b7d0d_1536x1024.png 848w, https://substackcdn.com/image/fetch/$s_!-PjN!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbbf769eb-918b-4ec6-8b88-b852c62b7d0d_1536x1024.png 1272w, 
https://substackcdn.com/image/fetch/$s_!-PjN!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbbf769eb-918b-4ec6-8b88-b852c62b7d0d_1536x1024.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!-PjN!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbbf769eb-918b-4ec6-8b88-b852c62b7d0d_1536x1024.png" width="1456" height="971" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/bbf769eb-918b-4ec6-8b88-b852c62b7d0d_1536x1024.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:971,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;GPT-4o Image managed to swap the snail for the sloth, but for some reason it also made the man bald in the first panel.&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="GPT-4o Image managed to swap the snail for the sloth, but for some reason it also made the man bald in the first panel." title="GPT-4o Image managed to swap the snail for the sloth, but for some reason it also made the man bald in the first panel." 
srcset="https://substackcdn.com/image/fetch/$s_!-PjN!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbbf769eb-918b-4ec6-8b88-b852c62b7d0d_1536x1024.png 424w, https://substackcdn.com/image/fetch/$s_!-PjN!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbbf769eb-918b-4ec6-8b88-b852c62b7d0d_1536x1024.png 848w, https://substackcdn.com/image/fetch/$s_!-PjN!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbbf769eb-918b-4ec6-8b88-b852c62b7d0d_1536x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!-PjN!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbbf769eb-918b-4ec6-8b88-b852c62b7d0d_1536x1024.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" 
stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption"><em>GPT-4o Image managed to swap the snail for the sloth, but for some reason, it also made the man bald in the first panel.</em></figcaption></figure></div><div class="pullquote"><p>Need creative, high-quality technical content? Happy to chat! Book a call with our Creative Engineers&#128071;&#127998;</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://calendar.app.google/UK4CoEUU9oosbYw77&quot;,&quot;text&quot;:&quot;Let's Chat! &#128222;&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://calendar.app.google/UK4CoEUU9oosbYw77"><span>Let's Chat! &#128222;</span></a></p></div><p>By contrast, when given the same image, GPT-4o Image struggled with the task. 
Instead of replacing the snail with a sloth, it sometimes substituted it with a slot machine.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!zFwl!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd8928999-c6c6-4ba9-ade3-f10694fbc537_1024x683.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!zFwl!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd8928999-c6c6-4ba9-ade3-f10694fbc537_1024x683.png 424w, https://substackcdn.com/image/fetch/$s_!zFwl!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd8928999-c6c6-4ba9-ade3-f10694fbc537_1024x683.png 848w, https://substackcdn.com/image/fetch/$s_!zFwl!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd8928999-c6c6-4ba9-ade3-f10694fbc537_1024x683.png 1272w, https://substackcdn.com/image/fetch/$s_!zFwl!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd8928999-c6c6-4ba9-ade3-f10694fbc537_1024x683.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!zFwl!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd8928999-c6c6-4ba9-ade3-f10694fbc537_1024x683.png" width="1024" height="683" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d8928999-c6c6-4ba9-ade3-f10694fbc537_1024x683.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:683,&quot;width&quot;:1024,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;Nano Banana can easily edit Ghibli styled images&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Nano Banana can easily edit Ghibli styled images" title="Nano Banana can easily edit Ghibli styled images" srcset="https://substackcdn.com/image/fetch/$s_!zFwl!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd8928999-c6c6-4ba9-ade3-f10694fbc537_1024x683.png 424w, https://substackcdn.com/image/fetch/$s_!zFwl!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd8928999-c6c6-4ba9-ade3-f10694fbc537_1024x683.png 848w, https://substackcdn.com/image/fetch/$s_!zFwl!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd8928999-c6c6-4ba9-ade3-f10694fbc537_1024x683.png 1272w, https://substackcdn.com/image/fetch/$s_!zFwl!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd8928999-c6c6-4ba9-ade3-f10694fbc537_1024x683.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" 
fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption"><em>Nano Banana can easily edit Ghibli-style images</em></figcaption></figure></div><p>So, while Nano Banana may not be able to generate Ghibli-style art on its own, it can certainly edit a Ghibli image you&#8217;ve already created, preserving both consistency and style.</p><div class="pullquote"><p>Join the conversation and share your experiences in the comments below!</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://neurlcreators.substack.com/p/how-nano-banana-compares-to-gpt-4o/comments&quot;,&quot;text&quot;:&quot;Leave a comment&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://neurlcreators.substack.com/p/how-nano-banana-compares-to-gpt-4o/comments"><span>Leave a comment</span></a></p></div><h2>Conclusion</h2><p>Although Nano Banana 
was marketed primarily as an image editing model, it has proven itself to be a highly capable image generator, often matching and sometimes even surpassing GPT-4o Image in certain tests.</p><p>Ultimately, the choice between the two depends on user preference and specific use cases. Some tasks favor GPT-4o&#8217;s generation strengths, especially in text rendering, while others benefit from Nano Banana&#8217;s powerful editing abilities.</p><p>In practice, the best approach may be to combine them: generate your base image with GPT-4o Image, then refine or modify it using Nano Banana. Both models are excellent options, and whichever you choose, you&#8217;re unlikely to go wrong.</p><div class="pullquote"><p><em>For more engaging content, subscribe&#128071;&#127998; and follow us on <a href="http://www.youtube.com/@neurlcreators">YouTube</a>, <a href="https://x.com/NeurlCreators">X</a>, and <a href="https://www.linkedin.com/showcase/neurl-creators/">LinkedIn</a></em></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://neurlcreators.substack.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://neurlcreators.substack.com/subscribe?"><span>Subscribe now</span></a></p></div>]]></content:encoded></item><item><title><![CDATA[GPT-4 Showed Sparks of AGI. 
Has GPT-5 Lit the Fire?]]></title><description><![CDATA[We wanted AGI, instead we got GPT-5?]]></description><link>https://neurlcreators.substack.com/p/gpt-4-showed-sparks-of-agi-has-gpt</link><guid isPermaLink="false">https://neurlcreators.substack.com/p/gpt-4-showed-sparks-of-agi-has-gpt</guid><dc:creator><![CDATA[Eteimorde Youdiowei]]></dc:creator><pubDate>Tue, 26 Aug 2025 16:30:52 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/866fe026-d5ec-4a18-975f-071e2a4b048d_1456x1048.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>In 2023, researchers at Microsoft released a paper titled <em><a href="https://arxiv.org/abs/2303.12712">Sparks of Artificial General Intelligence: Early Experiments with GPT-4</a></em>. The paper suggested that GPT-4 might be edging us closer to Artificial General Intelligence (<a href="https://en.m.wikipedia.org/wiki/Artificial_general_intelligence">AGI</a>). Two years later, with the <a href="https://openai.com/index/introducing-gpt-5/">launch of GPT-5</a>, the natural question is: are we still seeing sparks, or has something more ignited?</p><p>The launch of GPT-5 has not gone entirely as OpenAI envisioned. While it has been described as having <a href="https://www.bbc.com/news/articles/cy5prvgw0r1o">PhD-level</a> intelligence, public sentiment has been mixed at best.</p><p>Early testers reported that the production model underperformed compared to private previews. Some users have openly asked for <a href="https://openai.com/index/hello-gpt-4o/">GPT-4o</a> back, while others have criticized GPT-5 as little more than a clever hack, pointing to its internal <a href="https://www.latent.space/p/gpt5-router">router architecture</a>. 
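The router criticism refers to GPT-5 silently dispatching each request to either a fast model or a slower reasoning model. The idea can be sketched in a few lines (purely illustrative: the keyword heuristic and tier names below are invented for this sketch; the production router is reportedly a trained component, not a keyword match):

```python
def route(prompt: str) -> str:
    """Toy dispatcher: send 'hard' prompts to a reasoning tier, the rest to a fast tier.

    The markers, threshold, and tier names are invented for illustration only.
    """
    hard_markers = ("prove", "derive", "step by step", "plan")
    if len(prompt) > 200 or any(m in prompt.lower() for m in hard_markers):
        return "reasoning-tier"  # hypothetical slow, deliberate model
    return "fast-tier"           # hypothetical quick, cheap model
```

The complaint, in short, is that what feels like one model is really a dispatcher choosing between several.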
GPT-5 is one of the most polarizing models OpenAI has ever released.</p><p>In this article, we will set aside the controversy and focus on a different question: how does GPT-5 perform when measured against the benchmarks outlined in Microsoft&#8217;s <em>Sparks of AGI</em> paper? By the end, we hope to discover whether GPT-5 is still just producing sparks or if it has finally lit the flame of AGI.</p><h2><strong>Sparks Of AGI</strong></h2><p>When <em>Sparks of AGI</em> was released, it marked one of the boldest claims yet about large language models. The researchers argued that GPT-4 exhibited &#8220;sparks&#8221; of general intelligence, not full AGI but behaviors that seemed strikingly close.</p><p>What stood out about the paper was not just its conclusion, but the way the researchers tested GPT-4. They did not simply measure accuracy on benchmarks. Instead, they asked the model to perform across a wide spectrum of tasks, from coding and mathematics to multimodal reasoning and human interaction, looking for flexibility and depth of understanding.</p><p>For this article, I want to revisit those same tests. 
But instead of focusing only on GPT-4 as the Microsoft team did in 2023, I will compare two snapshots: <strong>GPT-4 as it was at the time of that paper, and GPT-5 as it stands today.</strong></p><p>The areas I will be exploring are the same ones Microsoft highlighted:</p><ul><li><p><strong>Multimodal and Interdisciplinary Composition</strong> &#8211; Can the model combine knowledge across domains, or describe ideas that bridge different fields?</p></li><li><p><strong>Coding</strong> &#8211; Does it demonstrate problem-solving as a programmer, not just surface-level pattern matching?</p></li><li><p><strong>Interaction with the World</strong> &#8211; Is the model capable of reasoning about real-world objects, contexts, and physical constraints?</p></li></ul><p>By running GPT-5 through the same kinds of challenges, I hope to see whether we are still witnessing &#8220;sparks&#8221; of general intelligence or whether GPT-5 has taken us closer to something more.</p><h2><strong>Multimodal and interdisciplinary composition</strong></h2><p>A central goal of the <em>Sparks of AGI</em> researchers was to test how well the model could integrate knowledge across disciplines to generate new ideas. For example, they explored whether it could combine insights from poetry and physics to produce novel connections. This capacity is referred to as the model&#8217;s <strong>integrative ability</strong>.</p><h3>Integrative ability</h3><p>Let&#8217;s revisit those same tests to compare GPT-4&#8217;s integrative ability with GPT-5&#8217;s. 
In the original paper, GPT-4 outperformed its predecessor, so how does it measure up against its successor?</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!5EWj!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F92a24684-24e2-407a-b430-5a4c89336711_1345x1600.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!5EWj!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F92a24684-24e2-407a-b430-5a4c89336711_1345x1600.png 424w, https://substackcdn.com/image/fetch/$s_!5EWj!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F92a24684-24e2-407a-b430-5a4c89336711_1345x1600.png 848w, https://substackcdn.com/image/fetch/$s_!5EWj!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F92a24684-24e2-407a-b430-5a4c89336711_1345x1600.png 1272w, https://substackcdn.com/image/fetch/$s_!5EWj!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F92a24684-24e2-407a-b430-5a4c89336711_1345x1600.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!5EWj!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F92a24684-24e2-407a-b430-5a4c89336711_1345x1600.png" width="1345" height="1600" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/92a24684-24e2-407a-b430-5a4c89336711_1345x1600.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1600,&quot;width&quot;:1345,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;Example output from the Sparks of AGI paper where GPT-4 generates play proving why they are infinitely many primes&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Example output from the Sparks of AGI paper where GPT-4 generates play proving why they are infinitely many primes" title="Example output from the Sparks of AGI paper where GPT-4 generates play proving why they are infinitely many primes" srcset="https://substackcdn.com/image/fetch/$s_!5EWj!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F92a24684-24e2-407a-b430-5a4c89336711_1345x1600.png 424w, https://substackcdn.com/image/fetch/$s_!5EWj!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F92a24684-24e2-407a-b430-5a4c89336711_1345x1600.png 848w, https://substackcdn.com/image/fetch/$s_!5EWj!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F92a24684-24e2-407a-b430-5a4c89336711_1345x1600.png 1272w, https://substackcdn.com/image/fetch/$s_!5EWj!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F92a24684-24e2-407a-b430-5a4c89336711_1345x1600.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft 
pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Example output from the Sparks of AGI paper, where GPT-4 generates a play proving that there are infinitely many primes</figcaption></figure></div><p>When GPT-4 was asked to write a play about infinite primes, it produced the one shown above.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!9mB7!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faafb963a-4fdc-4ca7-83de-6ed6385f776b_1206x1600.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" 
srcset="https://substackcdn.com/image/fetch/$s_!9mB7!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faafb963a-4fdc-4ca7-83de-6ed6385f776b_1206x1600.png 424w, https://substackcdn.com/image/fetch/$s_!9mB7!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faafb963a-4fdc-4ca7-83de-6ed6385f776b_1206x1600.png 848w, https://substackcdn.com/image/fetch/$s_!9mB7!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faafb963a-4fdc-4ca7-83de-6ed6385f776b_1206x1600.png 1272w, https://substackcdn.com/image/fetch/$s_!9mB7!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faafb963a-4fdc-4ca7-83de-6ed6385f776b_1206x1600.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!9mB7!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faafb963a-4fdc-4ca7-83de-6ed6385f776b_1206x1600.png" width="1206" height="1600" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/aafb963a-4fdc-4ca7-83de-6ed6385f776b_1206x1600.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1600,&quot;width&quot;:1206,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;GPT-5 generation play proving why they are infinitely many primes&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="GPT-5 generation play proving why they are infinitely many primes" title="GPT-5 generation play proving 
why they are infinitely many primes" srcset="https://substackcdn.com/image/fetch/$s_!9mB7!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faafb963a-4fdc-4ca7-83de-6ed6385f776b_1206x1600.png 424w, https://substackcdn.com/image/fetch/$s_!9mB7!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faafb963a-4fdc-4ca7-83de-6ed6385f776b_1206x1600.png 848w, https://substackcdn.com/image/fetch/$s_!9mB7!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faafb963a-4fdc-4ca7-83de-6ed6385f776b_1206x1600.png 1272w, https://substackcdn.com/image/fetch/$s_!9mB7!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faafb963a-4fdc-4ca7-83de-6ed6385f776b_1206x1600.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" 
stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">GPT-5&#8217;s generation of a play proving that there are infinitely many primes</figcaption></figure></div><p>I gave the same prompt to GPT-5, and it produced the play shown above. Even at a glance, the result feels more engaging: GPT-5 introduced characters like <em>Euclidus</em> and <em>Skepticus</em>, demonstrating not only historical awareness but also skill in wordplay and a clear grasp of Shakespearean style.</p><p>Still, let&#8217;s not be the judges ourselves. In the original paper, each model&#8217;s output was evaluated by a separate instance of GPT-4, which decided which response was better. We will follow the same approach here, except this time the evaluation will be done by a fresh instance of GPT-5.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!wH42!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa13b2219-c702-4461-a881-5e4676116a3b_1102x1600.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!wH42!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa13b2219-c702-4461-a881-5e4676116a3b_1102x1600.png 424w, https://substackcdn.com/image/fetch/$s_!wH42!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa13b2219-c702-4461-a881-5e4676116a3b_1102x1600.png 848w, 
https://substackcdn.com/image/fetch/$s_!wH42!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa13b2219-c702-4461-a881-5e4676116a3b_1102x1600.png 1272w, https://substackcdn.com/image/fetch/$s_!wH42!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa13b2219-c702-4461-a881-5e4676116a3b_1102x1600.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!wH42!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa13b2219-c702-4461-a881-5e4676116a3b_1102x1600.png" width="1102" height="1600" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a13b2219-c702-4461-a881-5e4676116a3b_1102x1600.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1600,&quot;width&quot;:1102,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!wH42!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa13b2219-c702-4461-a881-5e4676116a3b_1102x1600.png 424w, https://substackcdn.com/image/fetch/$s_!wH42!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa13b2219-c702-4461-a881-5e4676116a3b_1102x1600.png 848w, 
https://substackcdn.com/image/fetch/$s_!wH42!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa13b2219-c702-4461-a881-5e4676116a3b_1102x1600.png 1272w, https://substackcdn.com/image/fetch/$s_!wH42!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa13b2219-c702-4461-a881-5e4676116a3b_1102x1600.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Evaluation of the play output from GPT-4 and GPT-5</figcaption></figure></div><p>We labeled GPT-5&#8217;s output as 
<em>Student 1</em> and GPT-4&#8217;s output as <em>Student 2</em>. When a new instance of GPT-5 was asked to evaluate them, it awarded Student 1 an A and Student 2 a B. This suggests that the model&#8217;s ability to integrate knowledge across disciplines has improved.</p><p>Another test from the paper involved asking the model to write a letter, as Mahatma Gandhi, to his wife, explaining his desire to support the electron in its run for the U.S. presidency. Below is a side-by-side comparison of GPT-4&#8217;s and GPT-5&#8217;s responses.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!pg88!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F43c390b5-94dd-469c-95f4-225be552cb40_733x1600.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!pg88!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F43c390b5-94dd-469c-95f4-225be552cb40_733x1600.png 424w, https://substackcdn.com/image/fetch/$s_!pg88!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F43c390b5-94dd-469c-95f4-225be552cb40_733x1600.png 848w, https://substackcdn.com/image/fetch/$s_!pg88!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F43c390b5-94dd-469c-95f4-225be552cb40_733x1600.png 1272w, https://substackcdn.com/image/fetch/$s_!pg88!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F43c390b5-94dd-469c-95f4-225be552cb40_733x1600.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!pg88!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F43c390b5-94dd-469c-95f4-225be552cb40_733x1600.png" width="733" height="1600" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/43c390b5-94dd-469c-95f4-225be552cb40_733x1600.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1600,&quot;width&quot;:733,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!pg88!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F43c390b5-94dd-469c-95f4-225be552cb40_733x1600.png 424w, https://substackcdn.com/image/fetch/$s_!pg88!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F43c390b5-94dd-469c-95f4-225be552cb40_733x1600.png 848w, https://substackcdn.com/image/fetch/$s_!pg88!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F43c390b5-94dd-469c-95f4-225be552cb40_733x1600.png 1272w, https://substackcdn.com/image/fetch/$s_!pg88!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F43c390b5-94dd-469c-95f4-225be552cb40_733x1600.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft 
icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Side-by-side comparison of GPT-4&#8217;s and GPT-5&#8217;s outputs when asked to generate a letter from Gandhi to his wife supporting the electron for president</figcaption></figure></div><p>As in the previous example, we assigned student IDs to the models. GPT-4 was given the ID <em>A</em> and GPT-5 the ID <em>B</em>. 
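This blind grading setup is easy to reproduce with the API. A minimal sketch of the procedure (the helper names are our own; using <code>gpt-5</code> via the <code>openai</code> Python SDK as the judge is the only assumption carried over from the text):

```python
JUDGE_TEMPLATE = """You are a strict grader. Two anonymous students answered the same task.

Task: {task}

Student {id_1}'s answer:
{answer_1}

Student {id_2}'s answer:
{answer_2}

Give each student a letter grade (A-F) and justify your decision."""


def build_judge_prompt(task, id_1, answer_1, id_2, answer_2):
    # Neutral student IDs keep the judge blind to which model wrote what.
    return JUDGE_TEMPLATE.format(
        task=task, id_1=id_1, answer_1=answer_1, id_2=id_2, answer_2=answer_2
    )


def judge(task, answer_a, answer_b):
    # Hypothetical call: requires the openai package and an API key configured.
    from openai import OpenAI
    client = OpenAI()
    resp = client.chat.completions.create(
        model="gpt-5",  # a fresh instance acts as the grader
        messages=[{
            "role": "user",
            "content": build_judge_prompt(task, "A", answer_a, "B", answer_b),
        }],
    )
    return resp.choices[0].message.content
```

Swapping the IDs between runs is a cheap way to check the judge for position bias.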
We then asked a new instance of GPT-5 to compare their responses.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!3JZq!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdf8f1eac-f2f7-4e92-89f7-fc26e6f966aa_1185x1600.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!3JZq!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdf8f1eac-f2f7-4e92-89f7-fc26e6f966aa_1185x1600.png 424w, https://substackcdn.com/image/fetch/$s_!3JZq!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdf8f1eac-f2f7-4e92-89f7-fc26e6f966aa_1185x1600.png 848w, https://substackcdn.com/image/fetch/$s_!3JZq!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdf8f1eac-f2f7-4e92-89f7-fc26e6f966aa_1185x1600.png 1272w, https://substackcdn.com/image/fetch/$s_!3JZq!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdf8f1eac-f2f7-4e92-89f7-fc26e6f966aa_1185x1600.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!3JZq!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdf8f1eac-f2f7-4e92-89f7-fc26e6f966aa_1185x1600.png" width="1185" height="1600" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/df8f1eac-f2f7-4e92-89f7-fc26e6f966aa_1185x1600.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1600,&quot;width&quot;:1185,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!3JZq!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdf8f1eac-f2f7-4e92-89f7-fc26e6f966aa_1185x1600.png 424w, https://substackcdn.com/image/fetch/$s_!3JZq!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdf8f1eac-f2f7-4e92-89f7-fc26e6f966aa_1185x1600.png 848w, https://substackcdn.com/image/fetch/$s_!3JZq!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdf8f1eac-f2f7-4e92-89f7-fc26e6f966aa_1185x1600.png 1272w, https://substackcdn.com/image/fetch/$s_!3JZq!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdf8f1eac-f2f7-4e92-89f7-fc26e6f966aa_1185x1600.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Evaluation of the electron-for-president letters from GPT-4 and GPT-5</figcaption></figure></div><p>The new instance once again preferred GPT-5&#8217;s output and provided a full explanation of its reasoning.</p><h3>Image generation beyond memorization</h3><p>Another way to demonstrate the model&#8217;s ability to integrate knowledge across domains is through image generation. At the time, GPT-4 could not generate images directly, so the researchers asked it to produce TikZ code instead. 
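TikZ is a LaTeX drawing language, so the model emits vector-drawing commands rather than pixels. For a prompt like &#8220;draw a person&#8221;, the returned source looks something along these lines (an illustrative sketch we wrote, not the paper&#8217;s actual output):

```latex
\documentclass[tikz]{standalone}
\begin{document}
\begin{tikzpicture}
  \draw (0,2) circle (0.5);                 % head
  \draw (0,1.5) -- (0,0);                   % torso
  \draw (-0.8,0.8) -- (0,1.2) -- (0.8,0.8); % arms
  \draw (-0.5,-1) -- (0,0) -- (0.5,-1);     % legs
\end{tikzpicture}
\end{document}
```

Compiling such a file with pdflatex renders the drawing, which is how the authors turned text-only outputs into judgeable images.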
The resulting images were then used to evaluate how well the model combined understanding from multiple disciplines.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!U5ip!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1300ea7a-85be-4abf-9b1d-2aefb77be7a4_1076x1600.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!U5ip!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1300ea7a-85be-4abf-9b1d-2aefb77be7a4_1076x1600.png 424w, https://substackcdn.com/image/fetch/$s_!U5ip!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1300ea7a-85be-4abf-9b1d-2aefb77be7a4_1076x1600.png 848w, https://substackcdn.com/image/fetch/$s_!U5ip!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1300ea7a-85be-4abf-9b1d-2aefb77be7a4_1076x1600.png 1272w, https://substackcdn.com/image/fetch/$s_!U5ip!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1300ea7a-85be-4abf-9b1d-2aefb77be7a4_1076x1600.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!U5ip!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1300ea7a-85be-4abf-9b1d-2aefb77be7a4_1076x1600.png" width="1076" height="1600" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/1300ea7a-85be-4abf-9b1d-2aefb77be7a4_1076x1600.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1600,&quot;width&quot;:1076,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!U5ip!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1300ea7a-85be-4abf-9b1d-2aefb77be7a4_1076x1600.png 424w, https://substackcdn.com/image/fetch/$s_!U5ip!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1300ea7a-85be-4abf-9b1d-2aefb77be7a4_1076x1600.png 848w, https://substackcdn.com/image/fetch/$s_!U5ip!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1300ea7a-85be-4abf-9b1d-2aefb77be7a4_1076x1600.png 1272w, https://substackcdn.com/image/fetch/$s_!U5ip!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1300ea7a-85be-4abf-9b1d-2aefb77be7a4_1076x1600.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">GPT-4&#8217;s integrative ability demonstrated by its generation of images from letters.</figcaption></figure></div><p>The example above shows GPT-4&#8217;s output for generating images using letters. 
Now let&#8217;s apply the same prompt to GPT-5 and see what it produces.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!VAub!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9cc6472a-081b-46da-bfea-aae5f9110652_1076x1600.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!VAub!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9cc6472a-081b-46da-bfea-aae5f9110652_1076x1600.png 424w, https://substackcdn.com/image/fetch/$s_!VAub!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9cc6472a-081b-46da-bfea-aae5f9110652_1076x1600.png 848w, https://substackcdn.com/image/fetch/$s_!VAub!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9cc6472a-081b-46da-bfea-aae5f9110652_1076x1600.png 1272w, https://substackcdn.com/image/fetch/$s_!VAub!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9cc6472a-081b-46da-bfea-aae5f9110652_1076x1600.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!VAub!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9cc6472a-081b-46da-bfea-aae5f9110652_1076x1600.png" width="1076" height="1600" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/9cc6472a-081b-46da-bfea-aae5f9110652_1076x1600.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1600,&quot;width&quot;:1076,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!VAub!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9cc6472a-081b-46da-bfea-aae5f9110652_1076x1600.png 424w, https://substackcdn.com/image/fetch/$s_!VAub!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9cc6472a-081b-46da-bfea-aae5f9110652_1076x1600.png 848w, https://substackcdn.com/image/fetch/$s_!VAub!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9cc6472a-081b-46da-bfea-aae5f9110652_1076x1600.png 1272w, https://substackcdn.com/image/fetch/$s_!VAub!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9cc6472a-081b-46da-bfea-aae5f9110652_1076x1600.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">GPT-5&#8217;s integrative ability demonstrated by its generation of images from letters.</figcaption></figure></div><p>GPT-5 produced more creative images from the alphabet, demonstrating an understanding not only of the letters but also of the objects themselves, and combining the two more realistically than GPT-4.</p><div class="pullquote"><p>Need creative, high-quality technical content? Happy to chat! Book a call with our Creative Engineers&#128071;&#127998;</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://calendar.app.google/s6ekF2rDbPjKWLLQ8&quot;,&quot;text&quot;:&quot;Let&#8217;s Chat! &#128222;&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://calendar.app.google/s6ekF2rDbPjKWLLQ8"><span>Let&#8217;s Chat! 
&#128222;</span></a></p></div><h2>Coding</h2><p>It is clear that one of GPT-5&#8217;s strongest selling points is its coding ability, a capability OpenAI is actively highlighting through collaborations with companies like Cursor. Compared to the time when <em>Sparks of AGI</em> was first released, the model&#8217;s coding performance is on an entirely new level.</p><div id="youtube2-PQUcIbSEBCM" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;PQUcIbSEBCM&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/PQUcIbSEBCM?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p>The paper introduced several challenges for GPT-4:</p><ul><li><p>Solving coding problems</p></li><li><p>Applying code to real-world scenarios</p></li><li><p>Understanding existing code</p></li><li><p>Reasoning about code execution</p></li></ul><p>Currently GPT-5 is capable of doing all of this and more. 
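</p><p>To make the &#8220;reasoning about code execution&#8221; challenge concrete, here is the kind of snippet a model is asked to trace without running it; the snippet is our own illustration, not one of the paper&#8217;s tasks. The model must predict the return value by simulating the loop step by step:</p>

```python
def mystery(values):
    """Sums only the values that exceed every value seen before them.
    A model reasoning about execution must track `best` across
    iterations to predict the result."""
    total = 0
    best = float("-inf")
    for v in values:
        if v > best:  # strictly greater than the running maximum
            total += v
            best = v
    return total

result = mystery([3, 1, 4, 1, 5])  # 3 + 4 + 5 = 12
```

<p>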
This isn&#8217;t necessarily unique to GPT-5, since coding ability had already been improving steadily throughout the GPT-4 era.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!mPf4!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa78a5f0e-0b56-4ea6-89d7-daa7d1c0f5d8_849x1600.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!mPf4!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa78a5f0e-0b56-4ea6-89d7-daa7d1c0f5d8_849x1600.png 424w, https://substackcdn.com/image/fetch/$s_!mPf4!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa78a5f0e-0b56-4ea6-89d7-daa7d1c0f5d8_849x1600.png 848w, https://substackcdn.com/image/fetch/$s_!mPf4!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa78a5f0e-0b56-4ea6-89d7-daa7d1c0f5d8_849x1600.png 1272w, https://substackcdn.com/image/fetch/$s_!mPf4!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa78a5f0e-0b56-4ea6-89d7-daa7d1c0f5d8_849x1600.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!mPf4!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa78a5f0e-0b56-4ea6-89d7-daa7d1c0f5d8_849x1600.png" width="849" height="1600" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a78a5f0e-0b56-4ea6-89d7-daa7d1c0f5d8_849x1600.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1600,&quot;width&quot;:849,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!mPf4!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa78a5f0e-0b56-4ea6-89d7-daa7d1c0f5d8_849x1600.png 424w, https://substackcdn.com/image/fetch/$s_!mPf4!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa78a5f0e-0b56-4ea6-89d7-daa7d1c0f5d8_849x1600.png 848w, https://substackcdn.com/image/fetch/$s_!mPf4!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa78a5f0e-0b56-4ea6-89d7-daa7d1c0f5d8_849x1600.png 1272w, https://substackcdn.com/image/fetch/$s_!mPf4!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa78a5f0e-0b56-4ea6-89d7-daa7d1c0f5d8_849x1600.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path 
d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">GPT-4 and GPT-5 showing their ability to reason about code execution</figcaption></figure></div><p>By leveraging GPT-5&#8217;s thinking mode, the model outperforms o3 on benchmarks such as <a href="https://www.swebench.com/original.html">SWE-bench</a> and <a href="https://aider.chat/2024/12/21/polyglot.html#the-polyglot-benchmark">Aider Polyglot</a>. 
Its strong reasoning ability also suggests that its coding skills still have room for further improvement.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://neurlcreators.substack.com/p/gpt-4-showed-sparks-of-agi-has-gpt?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://neurlcreators.substack.com/p/gpt-4-showed-sparks-of-agi-has-gpt?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share</span></a></p><h2>Interaction with the world</h2><p>One of the key criteria highlighted in the original paper was GPT-4&#8217;s ability to use tools that interact with the real world. GPT-4 demonstrated successful tool use, but it struggled when the tools became more complex.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!PR2D!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2804f29b-9577-4931-bbb8-6616a3d8cf3e_579x1600.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!PR2D!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2804f29b-9577-4931-bbb8-6616a3d8cf3e_579x1600.png 424w, https://substackcdn.com/image/fetch/$s_!PR2D!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2804f29b-9577-4931-bbb8-6616a3d8cf3e_579x1600.png 848w, 
https://substackcdn.com/image/fetch/$s_!PR2D!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2804f29b-9577-4931-bbb8-6616a3d8cf3e_579x1600.png 1272w, https://substackcdn.com/image/fetch/$s_!PR2D!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2804f29b-9577-4931-bbb8-6616a3d8cf3e_579x1600.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!PR2D!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2804f29b-9577-4931-bbb8-6616a3d8cf3e_579x1600.png" width="579" height="1600" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/2804f29b-9577-4931-bbb8-6616a3d8cf3e_579x1600.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1600,&quot;width&quot;:579,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!PR2D!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2804f29b-9577-4931-bbb8-6616a3d8cf3e_579x1600.png 424w, https://substackcdn.com/image/fetch/$s_!PR2D!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2804f29b-9577-4931-bbb8-6616a3d8cf3e_579x1600.png 848w, 
https://substackcdn.com/image/fetch/$s_!PR2D!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2804f29b-9577-4931-bbb8-6616a3d8cf3e_579x1600.png 1272w, https://substackcdn.com/image/fetch/$s_!PR2D!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2804f29b-9577-4931-bbb8-6616a3d8cf3e_579x1600.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">GPT-4 and GPT-5 showing their tool usage capabilities</figcaption></figure></div><p>GPT-5 clearly surpasses that 
limitation. It can now use tools through function calling across domains such as airlines, retail, and telecommunications. Combined with its advanced reasoning capabilities, it is even able to handle the challenges of using highly complex tools.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!e8QO!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F91028e86-75f5-458c-b5a7-800eb8de6b48_1350x932.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!e8QO!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F91028e86-75f5-458c-b5a7-800eb8de6b48_1350x932.png 424w, https://substackcdn.com/image/fetch/$s_!e8QO!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F91028e86-75f5-458c-b5a7-800eb8de6b48_1350x932.png 848w, https://substackcdn.com/image/fetch/$s_!e8QO!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F91028e86-75f5-458c-b5a7-800eb8de6b48_1350x932.png 1272w, https://substackcdn.com/image/fetch/$s_!e8QO!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F91028e86-75f5-458c-b5a7-800eb8de6b48_1350x932.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!e8QO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F91028e86-75f5-458c-b5a7-800eb8de6b48_1350x932.png" width="1350" height="932" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/91028e86-75f5-458c-b5a7-800eb8de6b48_1350x932.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:932,&quot;width&quot;:1350,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!e8QO!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F91028e86-75f5-458c-b5a7-800eb8de6b48_1350x932.png 424w, https://substackcdn.com/image/fetch/$s_!e8QO!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F91028e86-75f5-458c-b5a7-800eb8de6b48_1350x932.png 848w, https://substackcdn.com/image/fetch/$s_!e8QO!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F91028e86-75f5-458c-b5a7-800eb8de6b48_1350x932.png 1272w, https://substackcdn.com/image/fetch/$s_!e8QO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F91028e86-75f5-458c-b5a7-800eb8de6b48_1350x932.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path 
d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2>Does GPT-5 address GPT-4&#8217;s limitations?</h2><p>From our assessments, GPT-5 is clearly an improvement over its predecessor. The key question is whether it overcomes the limitations identified in the original <em>Sparks of AGI</em> paper. Those limitations included:</p><ul><li><p>Lack of planning in arithmetic and reasoning problems</p></li><li><p>Lack of planning in text generation</p></li></ul><p>Arithmetic has long been difficult for language models. GPT-5 shows clear progress. It solves more complex arithmetic problems, and with stronger reasoning it handles tasks that GPT-4 often could not solve without a calculator tool.</p><p>Reasoning has also advanced. Progress has been steady since the <a href="https://openai.com/o1/">o1</a> generation, and GPT-5 goes further, outperforming <a href="https://openai.com/index/introducing-o3-and-o4-mini/">o3</a> on key benchmarks.</p><p>GPT-4 often lacked planning in text generation, which led to hallucinations and inconsistencies. 
GPT-5 plans better, writes more coherently, and is more willing to say when it does not know the answer.</p><h2>GPT-5 is Impressive, but it is not AGI</h2><p>The question is simple: does GPT-5 show sparks of AGI as GPT-4 once did? The answer is yes. Is GPT-5 itself AGI? The answer is no. Even OpenAI has acknowledged that GPT-5 is not AGI, and the definition of AGI continues to shift with every new generation of large language models.</p><p>The AI industry has fueled the narrative that AGI is just around the corner, and the public has increasingly come to believe it.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!4i7q!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe1bc573b-e33c-427c-9d55-f0f7facd7961_603x263.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!4i7q!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe1bc573b-e33c-427c-9d55-f0f7facd7961_603x263.png 424w, https://substackcdn.com/image/fetch/$s_!4i7q!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe1bc573b-e33c-427c-9d55-f0f7facd7961_603x263.png 848w, https://substackcdn.com/image/fetch/$s_!4i7q!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe1bc573b-e33c-427c-9d55-f0f7facd7961_603x263.png 1272w, https://substackcdn.com/image/fetch/$s_!4i7q!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe1bc573b-e33c-427c-9d55-f0f7facd7961_603x263.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!4i7q!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe1bc573b-e33c-427c-9d55-f0f7facd7961_603x263.png" width="603" height="263" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e1bc573b-e33c-427c-9d55-f0f7facd7961_603x263.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:263,&quot;width&quot;:603,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!4i7q!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe1bc573b-e33c-427c-9d55-f0f7facd7961_603x263.png 424w, https://substackcdn.com/image/fetch/$s_!4i7q!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe1bc573b-e33c-427c-9d55-f0f7facd7961_603x263.png 848w, https://substackcdn.com/image/fetch/$s_!4i7q!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe1bc573b-e33c-427c-9d55-f0f7facd7961_603x263.png 1272w, https://substackcdn.com/image/fetch/$s_!4i7q!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe1bc573b-e33c-427c-9d55-f0f7facd7961_603x263.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container 
restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>This has created major public expectations, while at the same time diminishing appreciation for the real innovations in the field. If the narrative continues unchecked, it could even trigger another <a href="https://en.wikipedia.org/wiki/AI_winter">AI winter</a>.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://neurlcreators.substack.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://neurlcreators.substack.com/subscribe?"><span>Subscribe now</span></a></p><h2>Conclusion</h2><p>GPT-5 may not have lived up to the wildest expectations fueled by industry hype, but it may well have lit the spark for what comes next. 
Just as GPT-4 laid the groundwork for many of the breakthroughs we have today, GPT-5 could be the foundation for future advancements in the GPT series.</p><p>Even if AGI remains out of reach for now, the progress is undeniable. GPT-5 demonstrates remarkable capabilities, and its practical applications are already reshaping how we work, create, and interact with technology. The pursuit of AGI may continue, but the real value lies in what these models can already do today.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://neurlcreators.substack.com/p/gpt-4-showed-sparks-of-agi-has-gpt/comments&quot;,&quot;text&quot;:&quot;Leave a comment&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://neurlcreators.substack.com/p/gpt-4-showed-sparks-of-agi-has-gpt/comments"><span>Leave a comment</span></a></p><p></p>]]></content:encoded></item><item><title><![CDATA[Generating Court Transcriptions with Deepgram's Nova-3]]></title><description><![CDATA[Let's build an AI court reporter with Deepgram's Nova-3 speech-to-text (STT) model.]]></description><link>https://neurlcreators.substack.com/p/generating-court-transcriptions-with</link><guid isPermaLink="false">https://neurlcreators.substack.com/p/generating-court-transcriptions-with</guid><dc:creator><![CDATA[Eteimorde Youdiowei]]></dc:creator><pubDate>Sat, 09 Aug 2025 17:00:31 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/fc818fb3-8875-4913-a2fb-c97d2a6bfe42_1456x1048.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>In the courtroom, every word matters. Transcripts form the official record of legal proceedings, capturing exactly <strong>what</strong> was said, <strong>by whom</strong>, and <strong>when</strong>. 
</p><p>Traditionally, this task falls to highly trained <a href="https://www.ncra.org/home/the-profession/Court-Reporting">court reporters</a>, experts who transcribe speech in real time using specialized shorthand machines.</p><p>But with rapid advances in speech recognition technology, one question naturally arises: can modern Speech-to-Text models shoulder some of this responsibility?</p><p>Classic Speech-to-Text systems simply convert spoken words into plain text. Today&#8217;s models, however, such as <strong>Deepgram&#8217;s <a href="https://deepgram.com/learn/introducing-nova-3-speech-to-text-api">Nova-3</a></strong> and <strong>AssemblyAI&#8217;s <a href="https://www.assemblyai.com/universal-2">Universal-2</a></strong>, go far beyond basic transcription.</p><p>With features like <strong><a href="https://deepgram.com/learn/what-is-speaker-diarization">speaker diarization</a></strong> and <strong>timestamps</strong>, these models can produce metadata-rich transcripts that mimic the structure, clarity, and reliability of a human court reporter.</p><p>In this article, we&#8217;ll build an <strong>AI court reporter</strong> using <strong><a href="https://developers.deepgram.com/docs/models-languages-overview">Deepgram&#8217;s Nova-3</a></strong> model and explore:</p><ul><li><p>How metadata transforms raw transcripts into legally useful records</p></li><li><p>How to work with the Deepgram API in Python</p></li></ul><div><hr></div><h2><strong>What does it take to build an AI court reporter?</strong></h2><p>Court proceedings, whether a trial, hearing, or deposition, are among the most critical processes in the legal system. During these events, a court reporter is responsible for capturing every spoken word verbatim. Skilled human reporters achieve high accuracy rates and understand complex legal jargon.</p><p>Their role goes beyond simple transcription. They must record who spoke, what was said, and when it was said. 
To replicate this with an AI system, we need a Speech-to-Text model capable of three essential things:</p><ul><li><p><strong>Speaker diarization</strong>: Automatically identifies and labels different speakers in a conversation (e.g., Speaker 0, Speaker 1). This is crucial in courtroom settings where multiple participants, such as judges, lawyers, and witnesses, speak in turn.</p></li><li><p><strong>Timestamps</strong>: Tags each word or sentence with its start and end times. This allows precise alignment with the original audio, enabling features like searchable playback, real-time synchronization, and legally verifiable transcripts.</p></li><li><p><strong>Low Word Error Rate (WER)</strong>: Ensures the AI produces highly accurate transcripts that can be trusted in legal contexts, reducing the risk of misinterpretation or misquotation.</p></li></ul><h2>Building an AI Court Reporter with Deepgram</h2><p>Now that we understand the key capabilities required for an AI court reporter, let&#8217;s put them into action using <strong>Deepgram&#8217;s Nova-3</strong> model, which supports both <strong>speaker diarization</strong> and <strong>timestamps</strong>.</p><p>We&#8217;ll build a simple Python CLI tool that takes either a local audio file or a URL to a court proceeding, then transcribes the audio while labeling each speaker and attaching precise timestamps.</p><h3><strong>Step 1: Install Dependencies</strong></h3><p>Before we start coding, make sure you have the necessary Python packages installed. We&#8217;ll use the <strong><a href="https://pypi.org/project/deepgram-sdk/">Deepgram SDK</a></strong> for transcription and <strong><a href="https://github.com/Textualize/rich">Rich</a></strong> to create a clean, formatted terminal output:</p><pre><code>pip install deepgram-sdk rich</code></pre><h3><strong>Step 2: Import Required Libraries</strong></h3><p>Next, we&#8217;ll import all the libraries needed for our script. 
We&#8217;ll use <strong>argparse</strong> to handle command-line arguments, the <strong>Deepgram SDK</strong> classes to interact with the Deepgram API, and components from <strong>Rich</strong> to format and display results in the terminal.</p><pre><code>import argparse
import json
from itertools import groupby
from operator import itemgetter
from pathlib import Path

from deepgram import (
    DeepgramClient,
    PrerecordedOptions,
    FileSource,
)

from rich.console import Console
from rich.table import Table
from rich.progress import Progress</code></pre><h3><strong>Step 3: Define the Transcription Function</strong></h3><p>We&#8217;ll now create a function that sends audio to Deepgram for transcription. This function accepts either a local file path or a URL, then uses the Nova-3 model with speaker diarization and timestamps enabled. It returns the raw transcription JSON data from Deepgram&#8217;s API.</p><pre><code>def transcribe_audio(audio_path, api_key):
    dg = DeepgramClient(api_key)

    # Request Nova-3 output with smart formatting and speaker diarization
    opts = PrerecordedOptions(
        model="nova-3",
        language="en",
        smart_format=True,
        diarize=True
    )

    # Remote audio: let Deepgram fetch the file from the URL
    if audio_path.startswith("http://") or audio_path.startswith("https://"):
        source = {"url": audio_path}
        res = dg.listen.rest.v("1").transcribe_url(source, opts)
    else:
        # Local audio: read the bytes and upload them with the request
        with open(audio_path, "rb") as f:
            payload: FileSource = {
                "buffer": f.read()
            }
            res = dg.listen.rest.v("1").transcribe_file(payload, opts)

    return res</code></pre><h3><strong>Step 4: Format the Transcription</strong></h3><p>Once we receive the raw output from Deepgram, we use the <code>build_diarized_transcript</code> function to organize it into a clean, readable format. This involves grouping words by speaker, extracting their start and end timestamps, and combining them into speaker-specific segments.</p><pre><code>def build_diarized_transcript(res):
    words = res.results.channels[0].alternatives[0].words
    diarized_segments = []

    # itertools.groupby starts a new group each time the speaker label
    # changes, so consecutive words from one speaker become one segment
    for speaker, group in groupby(words, key=itemgetter("speaker")):
        group = list(group)
        start = group[0]["start"]
        end = group[-1]["end"]
        text = " ".join([w["punctuated_word"] for w in group])
        diarized_segments.append({
            "speaker": f"Speaker {speaker}",
            "start": start,
            "end": end,
            "text": text
        })
    return diarized_segments</code></pre><h3><strong>Step 5: Display the Transcripts</strong></h3><p>We&#8217;ll use <strong>Rich</strong> to neatly format and display the <strong>rendered transcripts</strong> in a table for better readability:</p><pre><code>def print_diarized_table(diarized_segments):
    table = Table(show_header=True, header_style="bold magenta")
    table.add_column("Start", style="cyan")
    table.add_column("End", style="cyan")
    table.add_column("Speaker", style="green")
    table.add_column("Text", style="white", overflow="fold")

    for seg in diarized_segments:
        start_time = f"{seg['start']:.2f}s"
        end_time = f"{seg['end']:.2f}s"
        table.add_row(start_time, end_time, seg["speaker"], seg["text"])

    console = Console()
    console.print("\n[bold underline]Diarized Transcript[/bold underline]\n")
    console.print(table)</code></pre><h3><strong>Step 6: Run the program</strong></h3><p>Let&#8217;s make the script runnable so we can execute it directly from the command line. We&#8217;ll use <code>argparse</code> to accept the audio file path, the Deepgram API key, and an optional output path for saving the raw Deepgram JSON.</p><pre><code>if __name__ == "__main__":
    parser = argparse.ArgumentParser(description="Transcribe audio with Deepgram Nova-3 and diarization.")
    parser.add_argument("audio", help="Path or URL to the audio file")
    parser.add_argument("--api_key", required=True, help="Deepgram API key")
    parser.add_argument("--save_json", help="Optional path to save raw Deepgram JSON output")
    args = parser.parse_args()

    console = Console()

    with Progress() as progress:
        task = progress.add_task("[cyan]Transcribing audio...", total=None)
        res = transcribe_audio(args.audio, args.api_key)
        progress.update(task, completed=1)

    if args.save_json:
        with open(args.save_json, "w") as f:
            json.dump(res.to_dict(), f, indent=2)

    diarized_segments = build_diarized_transcript(res)
    print_diarized_table(diarized_segments)</code></pre><h3><strong>Testing the AI Court Reporter</strong></h3><p>To test the AI Court Reporter, we will use a clip from Better Call Saul to see how well the model captures timestamps and labels speakers.</p><div id="youtube2-DvkYRhu-TP0" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;DvkYRhu-TP0&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/DvkYRhu-TP0?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p>All we have to do is get an audio version of the clip and our Deepgram key, then pass it to the application like this:</p><pre><code>python main.py "Better_Call_Saul.mp3" --api_key YOUR_DEEPGRAM_API_KEY</code></pre><p>This will give us the following output:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!esav!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b7c0692-4741-466a-a351-3ec149adbafd_1600x857.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!esav!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b7c0692-4741-466a-a351-3ec149adbafd_1600x857.png 424w, https://substackcdn.com/image/fetch/$s_!esav!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b7c0692-4741-466a-a351-3ec149adbafd_1600x857.png 848w, 
https://substackcdn.com/image/fetch/$s_!esav!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b7c0692-4741-466a-a351-3ec149adbafd_1600x857.png 1272w, https://substackcdn.com/image/fetch/$s_!esav!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b7c0692-4741-466a-a351-3ec149adbafd_1600x857.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!esav!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b7c0692-4741-466a-a351-3ec149adbafd_1600x857.png" width="1456" height="780" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8b7c0692-4741-466a-a351-3ec149adbafd_1600x857.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:780,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!esav!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b7c0692-4741-466a-a351-3ec149adbafd_1600x857.png 424w, https://substackcdn.com/image/fetch/$s_!esav!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b7c0692-4741-466a-a351-3ec149adbafd_1600x857.png 848w, 
https://substackcdn.com/image/fetch/$s_!esav!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b7c0692-4741-466a-a351-3ec149adbafd_1600x857.png 1272w, https://substackcdn.com/image/fetch/$s_!esav!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b7c0692-4741-466a-a351-3ec149adbafd_1600x857.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p><strong>You can get the source code on GitHub:</strong><a href="https://github.com/Neurl-LLC/Court-Transcripts-With-Nova"> 
Neurl-LLC/Court-Transcripts-With-Nova</a></p><div class="pullquote"><p>Need creative, high-quality technical content? Happy to chat! Book a call with our Creative Engineers&#128071;&#127998;</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://calendar.app.google/s6ekF2rDbPjKWLLQ8&quot;,&quot;text&quot;:&quot;Let&#8217;s Chat! &#128222;&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://calendar.app.google/s6ekF2rDbPjKWLLQ8"><span>Let&#8217;s Chat! &#128222;</span></a></p></div><h2><strong>Drawbacks of Using AI for Court Reporting</strong></h2><p>AI-based transcription is not without limitations. Some of the most common challenges include:</p><ul><li><p><strong>Overlapping speech:</strong> When multiple speakers talk at the same time, AI often struggles to separate their voices accurately.</p></li><li><p><strong>Specialized legal terminology:</strong> Court proceedings often contain Latin phrases, case law references, and technical legal terms that speech-to-text models may not recognize without domain-specific training.</p></li><li><p><strong>Contextual ambiguity:</strong> AI lacks the human judgment to interpret sarcasm, implied meaning, or nuanced tone shifts, which can sometimes be relevant in court.</p></li><li><p><strong>Legal restrictions:</strong> Certain courts do not allow digital devices or permit the recording of audio, making AI transcription impossible in those settings.</p></li></ul><p>For now, these drawbacks mean AI is best used as a <strong>supplementary court reporter</strong>, working alongside humans to improve efficiency and accessibility without replacing the need for human oversight.</p><h2><strong>Conclusion</strong></h2><p>By using speech-to-text models that provide extra metadata such as timestamps and diarization, we unlock untapped potential not only in the legal industry but in any field that requires high-quality text data beyond 
plain transcription. When combined with other AI technologies, this enables capabilities such as:</p><ul><li><p>Automated redaction using language models</p></li><li><p>Speaker identification through voice recognition</p></li><li><p>Advanced search systems for fast, precise retrieval</p></li></ul><p>In the legal industry, these capabilities can streamline case preparation, improve evidence review, and ensure greater accuracy in legal documentation.</p>]]></content:encoded></item><item><title><![CDATA[Fenic: DataFrames for an LLM world]]></title><description><![CDATA[Fenic turns LLM calls into first-class DataFrame ops (semantic.map, classify, extract, join). 
PySpark vibes, Arrow under the hood, Rust speed&#8230; but also a young ecosystem with sharp ed]]></description><link>https://neurlcreators.substack.com/p/fenic-dataframes-for-an-llm-world</link><guid isPermaLink="false">https://neurlcreators.substack.com/p/fenic-dataframes-for-an-llm-world</guid><dc:creator><![CDATA[Stephen FIYINFOLUWA Oladele]]></dc:creator><pubDate>Fri, 08 Aug 2025 17:00:49 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/a44880be-e659-477e-adb8-df00e17c8c52_1456x1048.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Fenic just went public on GitHub (&#9733; ~143, Apache 2.0, <a href="https://github.com/typedef-ai/fenic/releases/tag/v0.2.1">v0.2.1</a>, July 7, 2025). The goal: treat <strong>LLM inference as a first-class DataFrame primitive</strong>, so everything from paraphrasing to schema extraction looks like a familiar <code>df.select().</code></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!FLIs!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff5846202-d6a4-4bcf-9165-b14373425f7a_1140x1600.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!FLIs!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff5846202-d6a4-4bcf-9165-b14373425f7a_1140x1600.png 424w, https://substackcdn.com/image/fetch/$s_!FLIs!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff5846202-d6a4-4bcf-9165-b14373425f7a_1140x1600.png 848w, 
https://substackcdn.com/image/fetch/$s_!FLIs!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff5846202-d6a4-4bcf-9165-b14373425f7a_1140x1600.png 1272w, https://substackcdn.com/image/fetch/$s_!FLIs!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff5846202-d6a4-4bcf-9165-b14373425f7a_1140x1600.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!FLIs!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff5846202-d6a4-4bcf-9165-b14373425f7a_1140x1600.png" width="1140" height="1600" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f5846202-d6a4-4bcf-9165-b14373425f7a_1140x1600.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1600,&quot;width&quot;:1140,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!FLIs!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff5846202-d6a4-4bcf-9165-b14373425f7a_1140x1600.png 424w, https://substackcdn.com/image/fetch/$s_!FLIs!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff5846202-d6a4-4bcf-9165-b14373425f7a_1140x1600.png 848w, 
https://substackcdn.com/image/fetch/$s_!FLIs!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff5846202-d6a4-4bcf-9165-b14373425f7a_1140x1600.png 1272w, https://substackcdn.com/image/fetch/$s_!FLIs!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff5846202-d6a4-4bcf-9165-b14373425f7a_1140x1600.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2><strong>&#128226; What Is Fenic?</strong></h2><blockquote><p>&#8220;Think pandas/Polars, but with 
<code>semantic.extract</code>,<code> semantic.join</code>,<code> semantic.map </code>baked in.&#8221; &#8212; Kostas Pardalis, Fenic creator (<a href="https://mlops-community.slack.com/archives/C018E4N2H9V/p1752102040389509">MLOps Slack launch thread</a>).</p></blockquote><p>Under the hood, Fenic is an <strong>opinionated, PySpark-inspired query engine</strong> built from scratch for AI and agentic applications.</p><p>Fenic treats LLM workflows as built-in DataFrame primitives instead of attaching them to external systems. This approach might make it much easier to build and run your workflows.</p><p>Here are the primary selling points:</p><ul><li><p><strong>Semantic operators</strong> (<code>analyze_sentiment</code>,<code> classify</code>,<code> extract</code>,<code> group_by</code>,<code> join</code>,<code> predicate</code>) are first-class.</p></li><li><p><strong>Native unstructured types</strong>: Markdown, transcripts, JSON, long-form text with auto-chunking.</p></li><li><p><strong>Batch and retry layer</strong> with token counting and cost metrics built-in.</p></li><li><p><strong>Multi-provider</strong> (OpenAI, Anthropic, Gemini) and local/cloud execution modes.</p></li><li><p><strong>Familiar API</strong>: lazy DataFrame, SQL support, PySpark-like chaining.</p></li><li><p><strong>Languages: Python 87%, Rust 13%. 
</strong>Rust core already powers Arrow-native execution, with design choices influenced by Polars for speed and efficiency.</p></li></ul><p>Wes McKinney (pandas creator) <a href="https://www.typedef.ai/">publicly endorsed the concept</a> (&#8220;a natural evolution of the DataFrame abstraction&#8221;).</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!4hXM!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3bd45e94-1fbb-4abe-9fcd-46111edb4801_1560x1600.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!4hXM!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3bd45e94-1fbb-4abe-9fcd-46111edb4801_1560x1600.png 424w, https://substackcdn.com/image/fetch/$s_!4hXM!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3bd45e94-1fbb-4abe-9fcd-46111edb4801_1560x1600.png 848w, https://substackcdn.com/image/fetch/$s_!4hXM!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3bd45e94-1fbb-4abe-9fcd-46111edb4801_1560x1600.png 1272w, https://substackcdn.com/image/fetch/$s_!4hXM!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3bd45e94-1fbb-4abe-9fcd-46111edb4801_1560x1600.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!4hXM!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3bd45e94-1fbb-4abe-9fcd-46111edb4801_1560x1600.png" width="1456" height="1493" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3bd45e94-1fbb-4abe-9fcd-46111edb4801_1560x1600.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1493,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!4hXM!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3bd45e94-1fbb-4abe-9fcd-46111edb4801_1560x1600.png 424w, https://substackcdn.com/image/fetch/$s_!4hXM!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3bd45e94-1fbb-4abe-9fcd-46111edb4801_1560x1600.png 848w, https://substackcdn.com/image/fetch/$s_!4hXM!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3bd45e94-1fbb-4abe-9fcd-46111edb4801_1560x1600.png 1272w, https://substackcdn.com/image/fetch/$s_!4hXM!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3bd45e94-1fbb-4abe-9fcd-46111edb4801_1560x1600.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><h2>&#9889; Quick Spin-Up: Podcast &#8594; Segments &#8594; Extract &#8594; Summaries (Fenic)</h2><p>Here&#8217;s a quick example of how to analyze and extract summaries from a podcast episode with Fenic:</p><pre><code>pip install fenic              # Python 3.10-3.12, Fenic v0.2.1
export OPENAI_API_KEY=...      # Add your OpenAI API key

# -------------------------
from pathlib import Path
from pydantic import BaseModel, Field
import fenic as fc

# 1. ---- Define schemas for structured extraction ----
class SegmentSchema(BaseModel):
    speaker: str = Field(description="Who is talking in this segment")
    start_time: float = Field(description="Start time (seconds)")
    end_time: float = Field(description="End time (seconds)")
    key_points: list[str] = Field(description="Bullet points for this segment")

class EpisodeSummary(BaseModel):
    title: str
    guests: list[str]
    main_topics: list[str]
    actionable_insights: list[str]

# 2. ---- Init a Fenic session with a model alias ----
config = fc.SessionConfig(
    app_name="podcast_quickspin",
    semantic=fc.SemanticConfig(
        language_models={
            "mini": fc.OpenAIModelConfig(model_name="gpt-4o-mini", rpm=300, tpm=150_000)
        }
    ),
)
session = fc.Session.get_or_create(config)

# 3. ---- Load raw transcript/metadata as strings ----
data_dir = Path("data")  # put your JSON/text here
transcript_text = (data_dir / "transcript.json").read_text()
meta_text       = (data_dir / "meta.json").read_text()

df = session.create_dataframe({"meta": [meta_text], "transcript": [transcript_text]})

# 4. ---- Extract structured metadata &amp; segment the transcript ----
processed = (
    df.select(
        "*",
        fc.semantic.extract("meta", EpisodeSummary, model="mini").alias("episode"),
        # Chunk transcript then extract per-chunk info
        fc.semantic.chunk("transcript", max_tokens=1200).alias("chunks"),
    )
    # Explode chunks to rows (one row per chunk)
    .explode("chunks")
    .select(
        fc.col("chunks").alias("chunk"),
        fc.semantic.extract("chunk", SegmentSchema, model="mini").alias("segment"),
    )
)
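# (Aside) Step 4's semantic.chunk splits long text into pieces of roughly
# max_tokens each before per-chunk extraction. A naive whitespace-token
# sketch of the idea (illustrative only -- not Fenic's actual tokenizer):
def naive_chunk(text: str, max_tokens: int) -> list[str]:
    words = text.split()
    # Slide a fixed-size window over the word list and rejoin each window.
    return [" ".join(words[i:i + max_tokens])
            for i in range(0, len(words), max_tokens)]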

# 5. ---- Abstractive recap per speaker/segment &amp; global summary ----
final = (
    processed
    .select(
        "*",
        fc.semantic.map(
            "Summarize this segment in 2 sentences:\n{chunk}", model="mini"
        ).alias("segment_summary"),
    )
    .group_by(fc.col("segment.speaker"))
    .agg(
        fc.semantic.map(
            "Combine these summaries into one clear paragraph:\n{segment_summary}",
            model="mini",
        ).alias("speaker_summary")
    )
)
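# (Aside) The group_by + agg above is a plain "group rows by key, then reduce
# each group" pass, with the reduction done by an LLM prompt. In pure Python
# the grouping half looks like this (illustrative only -- not Fenic's API):
def group_summaries(rows: list[dict]) -> dict[str, list[str]]:
    grouped: dict[str, list[str]] = {}
    for row in rows:
        # Collect every segment summary under its speaker key.
        grouped.setdefault(row["speaker"], []).append(row["summary"])
    return grouped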

final.show(truncate=120)

# Optional: write to parquet/csv
final.write.parquet("podcast_summaries.parquet")

session.stop()</code></pre><h3><strong>Tips to Adapt Quickly</strong></h3><ul><li><p><strong>Different providers</strong>: swap <code>OpenAIModelConfig</code> for <code>AnthropicModelConfig</code>, etc.</p></li><li><p><strong>Bigger files</strong>: bump <code>max_tokens</code> or chunk size; Fenic batches/streams for you.</p></li><li><p><strong>Eval pass</strong>: add another <code>select()</code> with a classifier prompt to tag &#8220;quality: good/needs fix&#8221;.</p></li><li><p><strong>Cost guardrails</strong>: set <code>max_tokens_per_call</code> or inspect <code>session.metrics()</code> after run.</p></li></ul><p>Need a variant for <strong>YouTube transcripts</strong> or <strong>research PDFs</strong>? Simply adjust the loader and schemas, and the pipeline shape will remain unchanged.</p><h2><strong>&#128293; Why You Should Care</strong></h2><ul><li><p><strong>Declarative pipelines</strong> &#8594; push your ETL <em>and</em> inference into the same DAG.</p></li><li><p><strong>Cheaper evaluation loops</strong> &#8594; token and cost metrics are first-class.</p></li><li><p><strong>Semantic joins</strong> &#8594; fuzzy &#8220;does this paper help my research question?&#8221; join in one line via <code>semantic.join</code></p></li><li><p><strong>Structured extraction to Pydantic</strong> &#8594; easier downstream analytics, eval &amp; labeling.</p></li><li><p><strong>Agent synergy</strong> &#8594; pre-batch heavy reasoning offline, feed lean contexts to online agents.</p></li></ul><div class="pullquote"><p>Need creative, high-quality technical content? Happy to chat! Book a call with our Creative Engineers&#128071;&#127998;</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://calendar.app.google/s6ekF2rDbPjKWLLQ8&quot;,&quot;text&quot;:&quot;Let&#8217;s Chat! 
&#128222;&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://calendar.app.google/s6ekF2rDbPjKWLLQ8"><span>Let&#8217;s Chat! &#128222;</span></a></p></div><h2><strong>&#128201; Gotchas and Caveats</strong></h2><ol><li><p><strong>Toolchain friction:</strong> Installing Fenic via pip is straightforward, but developing new features requires the Rust toolchain (rustc, cargo, maturin), which isn&#8217;t fully documented yet.</p></li><li><p><strong>Young ecosystem: </strong>Core support includes Arrow, CSV, and Parquet, with native connectors for Snowflake, BigQuery, and S3 expected soon.</p></li><li><p><strong>Operational maturity:</strong> No proven large-scale benchmarks published; cloud engine <a href="https://www.typedef.ai/blog/typedef-launch">still alpha</a>.</p></li><li><p><strong>Docs still sparse: </strong>docs.fenic.ai exists but is thin; many API details live only in README/examples. A more structured documentation system is in the works, with an MCP server example.</p></li><li><p><strong>Single-node today</strong>: No distributed executor yet; large corpora need chunked runs or Spark/Polars fallback.</p></li></ol><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!2GPc!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc70d4050-f608-4ec9-acd5-019c4baeaa8f_1370x1600.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!2GPc!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc70d4050-f608-4ec9-acd5-019c4baeaa8f_1370x1600.png 424w, 
https://substackcdn.com/image/fetch/$s_!2GPc!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc70d4050-f608-4ec9-acd5-019c4baeaa8f_1370x1600.png 848w, https://substackcdn.com/image/fetch/$s_!2GPc!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc70d4050-f608-4ec9-acd5-019c4baeaa8f_1370x1600.png 1272w, https://substackcdn.com/image/fetch/$s_!2GPc!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc70d4050-f608-4ec9-acd5-019c4baeaa8f_1370x1600.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!2GPc!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc70d4050-f608-4ec9-acd5-019c4baeaa8f_1370x1600.png" width="1370" height="1600" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c70d4050-f608-4ec9-acd5-019c4baeaa8f_1370x1600.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1600,&quot;width&quot;:1370,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!2GPc!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc70d4050-f608-4ec9-acd5-019c4baeaa8f_1370x1600.png 424w, 
https://substackcdn.com/image/fetch/$s_!2GPc!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc70d4050-f608-4ec9-acd5-019c4baeaa8f_1370x1600.png 848w, https://substackcdn.com/image/fetch/$s_!2GPc!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc70d4050-f608-4ec9-acd5-019c4baeaa8f_1370x1600.png 1272w, https://substackcdn.com/image/fetch/$s_!2GPc!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc70d4050-f608-4ec9-acd5-019c4baeaa8f_1370x1600.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" 
y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2><strong>&#129489;&#8205;&#9878;&#65039; Final Verdict: 4/5</strong></h2><h3><strong>Rating: &#11088;&#11088;&#11088;&#11088;&#9734; (4/5)</strong></h3><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://neurlcreators.substack.com/p/fenic-dataframes-for-an-llm-world/comments&quot;,&quot;text&quot;:&quot;Leave a comment&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://neurlcreators.substack.com/p/fenic-dataframes-for-an-llm-world/comments"><span>Leave a comment</span></a></p><p>Fenic nails a gap nobody else covers: <strong>treating LLM inference as a native DataFrame primitive</strong>. If your team already loves SQL/PySpark and spends hours duct-taping looped API calls, Fenic will feel like a super-power.</p><p><em>Ship it if&#8230;</em></p><ul><li><p>You batch-process lots of text and need semantic joins/extractions weekly.</p></li><li><p>You&#8217;re prototyping an RAG/agent pipeline and want repeatable, cost-aware ETL.</p></li><li><p>You can tolerate early-project rough edges and contribute fixes upstream.</p></li></ul><p><em>Hold off if&#8230;</em></p><ul><li><p>You need petabyte-scale, distributed compute <strong>today</strong>.</p></li><li><p>Your workloads are real-time, sub-second.</p></li><li><p>You require enterprise auth/row-level security out of the box.</p></li></ul><h2><strong>&#128204; More Resources</strong></h2><ul><li><p>Docs and Quickstarts &#8594;<a href="https://docs.fenic.ai/latest/"> https://docs.fenic.ai/latest/</a></p></li><li><p>GitHub repo (Apache-2.0) &#8594;<a href="https://github.com/typedef-ai/fenic"> https://github.com/typedef-ai/fenic</a></p></li><li><p>Blog intro (&#8220;PySpark-inspired DataFrame for AI&#8221;) (June 18, 2025) &#8594; https://www.typedef.ai/blog/fenic-open-source</p></li><li><p>Example gallery 
&#8594; <a href="https://github.com/typedef-ai/fenic/blob/main/examples">examples/</a> folder on GitHub</p></li><li><p>Author Q&amp;A in MLOps Slack &#8594;<a href="https://mlops-community.slack.com/archives/C018E4N2H9V/p1752102040389509"> link</a></p></li></ul><div class="pullquote"><p>Love this review? Forward it to your fellow data and MLOps friends, or share on X with <strong>#TuesdayToolReview</strong> and tag @mlopscommunity</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://neurlcreators.substack.com/p/fenic-dataframes-for-an-llm-world?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://neurlcreators.substack.com/p/fenic-dataframes-for-an-llm-world?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share</span></a></p></div><p>Want deeper tutorials? Subscribe to <em><strong><a href="https://neurlcreators.substack.com/">The Neural Blueprint</a></strong></em> for hands-on guides! &#129761;</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://neurlcreators.substack.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://neurlcreators.substack.com/subscribe?"><span>Subscribe now</span></a></p><p></p>]]></content:encoded></item><item><title><![CDATA[Building an Agentic State Machine with LangGraph]]></title><description><![CDATA[LangGraph is a state-machine graph with built-in interrupt handling and a GA cloud platform that is powering the AI agents of Uber and LinkedIn. 
Should it power yours?]]></description><link>https://neurlcreators.substack.com/p/langgraph-agent-state-machine-review</link><guid isPermaLink="false">https://neurlcreators.substack.com/p/langgraph-agent-state-machine-review</guid><dc:creator><![CDATA[Eteimorde Youdiowei]]></dc:creator><pubDate>Thu, 10 Jul 2025 20:45:21 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/213478c7-e639-428c-a895-c6e3f8199a1b_1456x1048.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>At the inaugural <a href="https://youtu.be/DrygcOI-kG8?feature=shared">LangChain Interrupt conference this year</a>, the spotlight wasn&#8217;t on LangChain itself; it was on LangGraph. The event focused heavily on AI agents and showcased how major companies, such as Uber, LinkedIn, and Replit, are deploying them in production.</p><p>A recurring theme across talks was how these organizations are using LangGraph to build robust, production-grade agentic systems.</p><p>With so many insights shared about LangGraph during the conference, now is the perfect time to take a closer look at what LangGraph is, how it works, and the design philosophy behind it.</p><p>LangGraph was <a href="https://blog.langchain.com/langgraph/">introduced in early 2024</a> as a separate library built on top of LangChain. While LangChain helped developers build simple, linear agentic workflows, it fell short when it came to more complex agentic workflows, especially those involving loops or cycles, which are common in real-world agent interactions.</p><p>To address this limitation, the LangChain team created LangGraph: a framework that introduces graph-based agentic workflows for more flexible and powerful agent design.</p><p>In this article, we&#8217;ll dive deep into LangGraph and explore how it brings core computer science concepts like state machines to life in the world of AI agents. 
Specifically, we&#8217;ll cover:</p><ul><li><p>How LangGraph models agentic workflows as graphs</p></li><li><p>How to build these workflows using LangGraph&#8217;s API</p></li><li><p>How to use its high-level abstractions to simplify development</p></li><li><p>Real-world examples of how companies are using LangGraph in production</p></li></ul><p>Let&#8217;s get started. &#128640;</p><h2>Understanding State Machines</h2><p>Before we had modern computers, we had state machines, more specifically, Finite State Machines (<a href="https://en.wikipedia.org/wiki/Finite-state_machine">FSMs</a>). These were simple models of computation that operated using a finite set of states. Each state could transition to another based on specific inputs or actions.</p><p>Let&#8217;s take a real-world example: a subway <a href="https://en.wikipedia.org/wiki/Turnstile">turnstile</a>.</p><div id="youtube2-2xbWwk4-zTs" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;2xbWwk4-zTs&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/2xbWwk4-zTs?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p>Imagine walking into a subway station and approaching a turnstile that only unlocks when you insert a coin. 
This system has two states:</p><ul><li><p>Locked</p></li><li><p>Unlocked</p></li></ul><p>Here&#8217;s how it works:</p><ul><li><p>If you try to push the turnstile without inserting a coin, it stays locked.</p></li><li><p>If you insert a coin, the turnstile transitions to the Unlocked state.</p></li><li><p>Once you push through, it automatically returns to the Locked state.</p></li></ul><p>These movements between states, triggered by actions like inserting a coin or pushing the turnstile, are called <strong>state transitions</strong>. The turnstile system as a whole is a simple example of a state machine.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!SyOZ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7d325ec0-4c76-4353-868b-04a588962a47_1600x905.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!SyOZ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7d325ec0-4c76-4353-868b-04a588962a47_1600x905.png 424w, https://substackcdn.com/image/fetch/$s_!SyOZ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7d325ec0-4c76-4353-868b-04a588962a47_1600x905.png 848w, https://substackcdn.com/image/fetch/$s_!SyOZ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7d325ec0-4c76-4353-868b-04a588962a47_1600x905.png 1272w, https://substackcdn.com/image/fetch/$s_!SyOZ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7d325ec0-4c76-4353-868b-04a588962a47_1600x905.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!SyOZ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7d325ec0-4c76-4353-868b-04a588962a47_1600x905.png" width="1456" height="824" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/7d325ec0-4c76-4353-868b-04a588962a47_1600x905.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:824,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;This is the state machine representation of a turnstile. It has two states: Locked and Unlocked. The machine transitions from the Locked state to the Unlocked state when a coin is inserted. If no coin is inserted, the state remains unchanged.&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="This is the state machine representation of a turnstile. It has two states: Locked and Unlocked. The machine transitions from the Locked state to the Unlocked state when a coin is inserted. If no coin is inserted, the state remains unchanged." title="This is the state machine representation of a turnstile. It has two states: Locked and Unlocked. The machine transitions from the Locked state to the Unlocked state when a coin is inserted. If no coin is inserted, the state remains unchanged." 
srcset="https://substackcdn.com/image/fetch/$s_!SyOZ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7d325ec0-4c76-4353-868b-04a588962a47_1600x905.png 424w, https://substackcdn.com/image/fetch/$s_!SyOZ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7d325ec0-4c76-4353-868b-04a588962a47_1600x905.png 848w, https://substackcdn.com/image/fetch/$s_!SyOZ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7d325ec0-4c76-4353-868b-04a588962a47_1600x905.png 1272w, https://substackcdn.com/image/fetch/$s_!SyOZ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7d325ec0-4c76-4353-868b-04a588962a47_1600x905.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" 
stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption"><em>This is the state machine representation of a turnstile. It has two states: <strong>Locked</strong> and <strong>Unlocked</strong>. The machine transitions from the Locked state to the Unlocked state when a coin is inserted. If no coin is inserted, the state remains unchanged.</em></figcaption></figure></div><p>You can also visualize state machines as graphs, where:</p><ul><li><p>States are represented as nodes</p></li><li><p>Transitions between states are represented as edges</p></li></ul><p>This simple model of computation is what LangGraph builds upon to model AI agent interactions as state machines.</p><h2>Agentic State Machines</h2><p>Now that we understand traditional state machines, let&#8217;s explore the idea of <strong>agentic state machines</strong>.</p><p>In the previous section, we saw that state transitions in a finite state machine are driven by predefined conditions. For example, when a user inserts a coin and pushes a turnstile, the machine transitions from a <em>Locked</em> to an <em>Unlocked</em> state based on those fixed inputs.</p><p>But in an agentic state machine, transitions aren't hardcoded. Instead, <strong>an agent</strong> decides which transition to take based on the current state and context. 
This introduces flexibility and autonomy into the state machine model, which is essential for building intelligent systems.</p><p>An agentic state machine reimagines the traditional model using the following mapping:</p><ul><li><p>Agents and tools as nodes</p></li><li><p>Agent decisions as transitions</p></li><li><p>Conversations as states</p></li></ul><h3>Agents and Tools as Nodes</h3><p>In an agentic state machine, each <strong>node</strong> in the graph represents either an <strong>AI agent</strong> or a <strong>tool</strong>. These components are independent:</p><ul><li><p>Agents are typically language models capable of reasoning and decision-making.</p></li><li><p>Tools are external functions or systems (like a calculator or search engine) that can be invoked when needed.</p></li></ul><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!5-vC!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5d2ab02b-151c-49fc-a46c-43b09a2a35c9_1600x1266.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!5-vC!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5d2ab02b-151c-49fc-a46c-43b09a2a35c9_1600x1266.png 424w, https://substackcdn.com/image/fetch/$s_!5-vC!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5d2ab02b-151c-49fc-a46c-43b09a2a35c9_1600x1266.png 848w, https://substackcdn.com/image/fetch/$s_!5-vC!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5d2ab02b-151c-49fc-a46c-43b09a2a35c9_1600x1266.png 1272w, 
https://substackcdn.com/image/fetch/$s_!5-vC!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5d2ab02b-151c-49fc-a46c-43b09a2a35c9_1600x1266.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!5-vC!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5d2ab02b-151c-49fc-a46c-43b09a2a35c9_1600x1266.png" width="1456" height="1152" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/5d2ab02b-151c-49fc-a46c-43b09a2a35c9_1600x1266.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1152,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;A set of unconnected nodes, consisting of an agent node and two tool nodes: a calculator tool and a time tool.&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="A set of unconnected nodes, consisting of an agent node and two tool nodes: a calculator tool and a time tool." title="A set of unconnected nodes, consisting of an agent node and two tool nodes: a calculator tool and a time tool." 
srcset="https://substackcdn.com/image/fetch/$s_!5-vC!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5d2ab02b-151c-49fc-a46c-43b09a2a35c9_1600x1266.png 424w, https://substackcdn.com/image/fetch/$s_!5-vC!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5d2ab02b-151c-49fc-a46c-43b09a2a35c9_1600x1266.png 848w, https://substackcdn.com/image/fetch/$s_!5-vC!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5d2ab02b-151c-49fc-a46c-43b09a2a35c9_1600x1266.png 1272w, https://substackcdn.com/image/fetch/$s_!5-vC!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5d2ab02b-151c-49fc-a46c-43b09a2a35c9_1600x1266.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" 
stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption"><em>A set of unconnected nodes, consisting of an agent node and two tool nodes: a calculator tool and a time tool.</em></figcaption></figure></div><p>This modularity means each component can operate on its own, but the real power emerges when they are connected meaningfully through transitions.</p><h3>Agent Decision as Transitions</h3><p>To move between nodes, we use <strong>edges</strong>, which represent possible transitions. In an agentic state machine, these transitions are <strong>driven by the agent's decisions</strong>, not by hardcoded logic.</p><p>There are two kinds of edges:</p><ul><li><p><strong>Conditional edges</strong> (represented with dotted lines): The agent chooses whether or not to follow these paths based on its understanding of the current state.</p></li><li><p><strong>Definite edges</strong> (represented with solid lines): These are fixed transitions that are always followed when reached.</p></li></ul><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ivb4!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc473284f-50d1-4ac0-9b98-7cf22902d29e_1600x1266.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ivb4!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc473284f-50d1-4ac0-9b98-7cf22902d29e_1600x1266.png 424w, 
https://substackcdn.com/image/fetch/$s_!ivb4!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc473284f-50d1-4ac0-9b98-7cf22902d29e_1600x1266.png 848w, https://substackcdn.com/image/fetch/$s_!ivb4!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc473284f-50d1-4ac0-9b98-7cf22902d29e_1600x1266.png 1272w, https://substackcdn.com/image/fetch/$s_!ivb4!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc473284f-50d1-4ac0-9b98-7cf22902d29e_1600x1266.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!ivb4!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc473284f-50d1-4ac0-9b98-7cf22902d29e_1600x1266.png" width="1456" height="1152" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c473284f-50d1-4ac0-9b98-7cf22902d29e_1600x1266.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1152,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;A set of connected nodes, the agent node is connected to the tool nodes through conditional edges, while each tool node is connected back to the agent via a definite edge.&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="A set of connected nodes, the agent node is connected to the tool nodes through conditional edges, while each tool node is connected back to the agent via a definite edge." 
title="A set of connected nodes, the agent node is connected to the tool nodes through conditional edges, while each tool node is connected back to the agent via a definite edge." srcset="https://substackcdn.com/image/fetch/$s_!ivb4!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc473284f-50d1-4ac0-9b98-7cf22902d29e_1600x1266.png 424w, https://substackcdn.com/image/fetch/$s_!ivb4!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc473284f-50d1-4ac0-9b98-7cf22902d29e_1600x1266.png 848w, https://substackcdn.com/image/fetch/$s_!ivb4!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc473284f-50d1-4ac0-9b98-7cf22902d29e_1600x1266.png 1272w, https://substackcdn.com/image/fetch/$s_!ivb4!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc473284f-50d1-4ac0-9b98-7cf22902d29e_1600x1266.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption"><em>A set of connected nodes, the agent node is connected to the tool nodes through conditional edges, while each tool node is connected back to the agent via a definite edge.</em></figcaption></figure></div><p>In the illustration above, you&#8217;ll notice that the agent is connected to multiple tools via <strong>conditional edges</strong>, allowing it to decide which tool to use based on the conversation. Once a tool is used, it typically returns to the agent via a <strong>definite edge</strong>.</p><h3>Conversation as States</h3><p>So far, we&#8217;ve defined our nodes (agents and tools) and transitions (edges). But what determines when an agent should make a decision? That&#8217;s where the <strong>state</strong> comes in.</p><p>In agentic state machines, the <strong>state of the system is driven by the conversation</strong> between the agent and the user. As the dialogue progresses, each message updates the state. 
This context is what the agent uses to decide its next move.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!pjSz!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fea8ff509-f25d-42dc-8ff7-839edfd9d0e4_1345x1600.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!pjSz!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fea8ff509-f25d-42dc-8ff7-839edfd9d0e4_1345x1600.png 424w, https://substackcdn.com/image/fetch/$s_!pjSz!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fea8ff509-f25d-42dc-8ff7-839edfd9d0e4_1345x1600.png 848w, https://substackcdn.com/image/fetch/$s_!pjSz!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fea8ff509-f25d-42dc-8ff7-839edfd9d0e4_1345x1600.png 1272w, https://substackcdn.com/image/fetch/$s_!pjSz!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fea8ff509-f25d-42dc-8ff7-839edfd9d0e4_1345x1600.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!pjSz!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fea8ff509-f25d-42dc-8ff7-839edfd9d0e4_1345x1600.png" width="1345" height="1600" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ea8ff509-f25d-42dc-8ff7-839edfd9d0e4_1345x1600.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1600,&quot;width&quot;:1345,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;The conversation between the agent and a human serves as the states in an agentic state machine. When the human makes a request, it's the agent's job to decide which tool node is best suited to resolve it.&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="The conversation between the agent and a human serves as the states in an agentic state machine. When the human makes a request, it's the agent's job to decide which tool node is best suited to resolve it." title="The conversation between the agent and a human serves as the states in an agentic state machine. When the human makes a request, it's the agent's job to decide which tool node is best suited to resolve it." 
srcset="https://substackcdn.com/image/fetch/$s_!pjSz!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fea8ff509-f25d-42dc-8ff7-839edfd9d0e4_1345x1600.png 424w, https://substackcdn.com/image/fetch/$s_!pjSz!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fea8ff509-f25d-42dc-8ff7-839edfd9d0e4_1345x1600.png 848w, https://substackcdn.com/image/fetch/$s_!pjSz!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fea8ff509-f25d-42dc-8ff7-839edfd9d0e4_1345x1600.png 1272w, https://substackcdn.com/image/fetch/$s_!pjSz!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fea8ff509-f25d-42dc-8ff7-839edfd9d0e4_1345x1600.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption"><em>The conversation between the agent and a human serves as the states in an agentic state machine. When the human makes a request, it's the agent's job to decide which tool node is best suited to resolve it.</em></figcaption></figure></div><p>For example, in the image above, the state is updated with the user message: <em>"What is 2 times itself?"</em></p><p>The agent interprets this request and decides to use the calculator tool. The tool returns a result, which the agent can then act upon, perhaps replying to the user or performing another computation.</p><p>At every step, the conversation evolves, and with it, the state of the system. The agent uses this evolving state to determine transitions and actions.</p><p>Now that we understand the theory behind agentic state machines, let&#8217;s move on to building one using <strong>LangGraph</strong>.</p><p><strong>&#128064; Recommended:</strong></p><div class="digest-post-embed" data-attrs="{&quot;nodeId&quot;:&quot;9766f966-437d-4979-b379-0cd62565b33a&quot;,&quot;caption&quot;:&quot;In late 2022, something unexpected took the AI development world by storm, and no, it wasn&#8217;t ChatGPT. It was LangChain. 
On October 24, 2022, Harrison Chase made the first commit to the LangChain repository, introducing a framework like nothing the software industry had seen before.&quot;,&quot;cta&quot;:&quot;Read full story&quot;,&quot;showBylines&quot;:true,&quot;size&quot;:&quot;md&quot;,&quot;isEditorNode&quot;:true,&quot;title&quot;:&quot;Is LangChain Still Worth Using in 2025?&quot;,&quot;publishedBylines&quot;:[{&quot;id&quot;:49908626,&quot;name&quot;:&quot;Eteimorde Youdiowei&quot;,&quot;bio&quot;:&quot;Simplifying complex ideas is my passion.....&quot;,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/90f6ea8f-0227-42b7-8c09-47e819b7f743_661x661.jpeg&quot;,&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:null},{&quot;id&quot;:280510396,&quot;name&quot;:&quot;Neurl Creators&quot;,&quot;bio&quot;:null,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/1b0dc913-49dd-4e59-ac39-bbbf289e1744_256x256.png&quot;,&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:null}],&quot;post_date&quot;:&quot;2025-06-17T15:01:19.823Z&quot;,&quot;cover_image&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/50bc91fa-fc15-4b01-a928-ea05b739b603_1456x1048.png&quot;,&quot;cover_image_alt&quot;:null,&quot;canonical_url&quot;:&quot;https://neurlcreators.substack.com/p/is-langchain-still-worth-using-in&quot;,&quot;section_name&quot;:&quot;&#9881;&#65039; BuildAIers&#8217; Toolkit &#9881;&#65039;&quot;,&quot;video_upload_id&quot;:null,&quot;id&quot;:166147065,&quot;type&quot;:&quot;newsletter&quot;,&quot;reaction_count&quot;:1,&quot;comment_count&quot;:0,&quot;publication_id&quot;:null,&quot;publication_name&quot;:&quot;The Neural Blueprint: Practical Content for AI Builders 
&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!6udc!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff5b15961-5020-4a71-b040-30f9b3d3f232_256x256.png&quot;,&quot;belowTheFold&quot;:true,&quot;youtube_url&quot;:null,&quot;show_links&quot;:null,&quot;feed_url&quot;:null}"></div><h2><strong>Build an Agentic State Machine with LangGraph</strong></h2><p>To build our agentic state machine in LangGraph, let&#8217;s start by installing the required libraries. We&#8217;ll need LangGraph and LangChain, along with the OpenAI dependencies.</p><p>This lets us use OpenAI as our model provider, though you can swap in any provider of your choice.</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;shell&quot;,&quot;nodeId&quot;:&quot;16bab949-2666-49c9-9f1f-4e3b6096a96a&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-shell">pip install langgraph "langchain[openai]"</code></pre></div><p>You can now add your OpenAI API key to the environment variable <code>OPENAI_API_KEY</code>.</p><h3><strong>Defining the States</strong></h3><p>With the setup complete, we can now start building our agentic state machine by defining its states.</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;python&quot;,&quot;nodeId&quot;:&quot;8812c5a0-309f-48c7-ba28-7b294fed0832&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-python">from typing import Annotated, TypedDict
from langgraph.graph.message import add_messages

class State(TypedDict):
    messages: Annotated[list, add_messages]</code></pre></div><p>Here, we define our state as a typed dictionary that stores the messages exchanged between the user and the agent. Each node appends messages to this state as execution moves from one node to the next.</p><h3><strong>Initialize the Graph, Model, and Tool</strong></h3><p>Now that we&#8217;ve defined our states, the next step is to set up the Graph, Agent, and Tool. The <strong>Graph</strong> represents our state machine. It brings together the agents, tools, and transitions that define the agentic workflow.</p><p>We build the graph using the <code>StateGraph</code> class, which takes the state we defined earlier as input.</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;python&quot;,&quot;nodeId&quot;:&quot;139e668b-2881-47ab-b600-36915a42fd3d&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-python">from langgraph.graph import StateGraph
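As a quick aside (an illustrative sketch, not part of the original post): the add_messages reducer merges each node's new messages into the running list rather than overwriting it, so conceptually it behaves like list concatenation:

```python
# Illustrative sketch only: add_messages behaves (roughly) like a
# list-concatenation reducer, merging a node's new messages into the
# existing conversation instead of replacing it.
def append_reducer(existing: list, new: list) -> list:
    return existing + new

state = {"messages": ["user: What is 2 times itself?"]}
update = {"messages": ["agent: calling the calculator tool"]}
state["messages"] = append_reducer(state["messages"], update["messages"])
# state["messages"] now holds both messages, in order
```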

# Build the graph
graph_builder = StateGraph(State)</code></pre></div><p>Next, we need to define the language model that will serve as the <em>reasoning engine</em> for our agent.</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;python&quot;,&quot;nodeId&quot;:&quot;82d29852-ca3f-44a9-a480-d98146ac2b13&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-python">from langchain.chat_models import init_chat_model

# Initialize the chat model

llm = init_chat_model("openai:gpt-4.1")</code></pre></div><p>After setting up the model, the next step is to define a tool. We&#8217;ll create a simple calculator tool that the agent can use.</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;python&quot;,&quot;nodeId&quot;:&quot;f17d21ee-395b-4390-af69-e9a04e11e2ee&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-python"># Define a simple calculator tool
def calculator(operation: str, a: float, b: float) -&gt; float:
    """A simple calculator function that performs basic arithmetic operations.
    Args:
        operation (str): The operation to perform. Can be "add", "subtract", "multiply", or "divide".
        a (float): The first number.
        b (float): The second number.
    Returns:
        float: The result of the operation.
    Raises:
        ValueError: If the operation is unknown.
    """

    if operation == "add":
        return a + b
    elif operation == "subtract":
        return a - b
    elif operation == "multiply":
        return a * b
    elif operation == "divide":
        return a / b
    else:
        raise ValueError("Unknown operation")</code></pre></div><p>With our tool defined, we can now bind it to the language model. This enables the model to understand how to interact with the tool during execution.</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;python&quot;,&quot;nodeId&quot;:&quot;bac421d5-14a8-4f38-b96f-cf1ca2f39afd&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-python"># Bind the calculator tool to the LLM
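Before binding, it can be worth sanity-checking the tool logic directly. Here is a hypothetical dictionary-dispatch variant of the calculator above (the calc_check name is ours, not from the post), which avoids the long if/elif chain:

```python
import operator

# A condensed, dictionary-dispatch rewrite of the calculator above,
# handy for quick checks before wiring the real tool into the graph.
def calc_check(operation: str, a: float, b: float) -> float:
    ops = {
        "add": operator.add,
        "subtract": operator.sub,
        "multiply": operator.mul,
        "divide": operator.truediv,
    }
    if operation not in ops:
        raise ValueError("Unknown operation")
    return ops[operation](a, b)

result = calc_check("multiply", 2.0, 2.0)  # the "2 times itself" example
```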

tools = [calculator]

llm_with_tools = llm.bind_tools(tools)</code></pre></div><h3><strong>Adding the Nodes</strong></h3><p>With the tools and language model defined, we can now add them as nodes in the graph. We'll start with the language model; since it can call tools, it effectively acts as our agent.</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;python&quot;,&quot;nodeId&quot;:&quot;4cde24d9-fe98-477b-bfe2-4e777a1019be&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-python"># Define the agent function that uses the LLM with tools
def agent(state: State):
    return {"messages": [llm_with_tools.invoke(state["messages"])]}

# Add the agent node to the graph
graph_builder.add_node("agent", agent)</code></pre></div><p>We can visualize the graph and see that it currently contains only one node.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!uuvP!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F29cbf2e8-0873-460f-8a73-9869ebb0ee28_1600x1266.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!uuvP!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F29cbf2e8-0873-460f-8a73-9869ebb0ee28_1600x1266.png 424w, https://substackcdn.com/image/fetch/$s_!uuvP!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F29cbf2e8-0873-460f-8a73-9869ebb0ee28_1600x1266.png 848w, https://substackcdn.com/image/fetch/$s_!uuvP!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F29cbf2e8-0873-460f-8a73-9869ebb0ee28_1600x1266.png 1272w, https://substackcdn.com/image/fetch/$s_!uuvP!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F29cbf2e8-0873-460f-8a73-9869ebb0ee28_1600x1266.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!uuvP!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F29cbf2e8-0873-460f-8a73-9869ebb0ee28_1600x1266.png" width="1456" height="1152" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/29cbf2e8-0873-460f-8a73-9869ebb0ee28_1600x1266.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1152,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;The Agent Node in a Graph&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="The Agent Node in a Graph" title="The Agent Node in a Graph" srcset="https://substackcdn.com/image/fetch/$s_!uuvP!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F29cbf2e8-0873-460f-8a73-9869ebb0ee28_1600x1266.png 424w, https://substackcdn.com/image/fetch/$s_!uuvP!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F29cbf2e8-0873-460f-8a73-9869ebb0ee28_1600x1266.png 848w, https://substackcdn.com/image/fetch/$s_!uuvP!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F29cbf2e8-0873-460f-8a73-9869ebb0ee28_1600x1266.png 1272w, https://substackcdn.com/image/fetch/$s_!uuvP!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F29cbf2e8-0873-460f-8a73-9869ebb0ee28_1600x1266.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption"><em>The Agent Node in a Graph</em></figcaption></figure></div><p>So let&#8217;s add the tool as a node as well.</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;python&quot;,&quot;nodeId&quot;:&quot;96efa795-1d0f-49ca-83e5-edf796a71d18&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-python">from langgraph.prebuilt import ToolNode

# Add the tool node to the graph

tool_node = ToolNode(tools=tools)

graph_builder.add_node("tools", tool_node)</code></pre></div><p>We used the <code>ToolNode</code> class to simplify the process of creating a node for the tool.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!1XfY!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff154402-7b7e-4f7c-98ea-67edaf26426f_1600x1266.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!1XfY!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff154402-7b7e-4f7c-98ea-67edaf26426f_1600x1266.png 424w, https://substackcdn.com/image/fetch/$s_!1XfY!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff154402-7b7e-4f7c-98ea-67edaf26426f_1600x1266.png 848w, https://substackcdn.com/image/fetch/$s_!1XfY!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff154402-7b7e-4f7c-98ea-67edaf26426f_1600x1266.png 1272w, https://substackcdn.com/image/fetch/$s_!1XfY!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff154402-7b7e-4f7c-98ea-67edaf26426f_1600x1266.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!1XfY!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff154402-7b7e-4f7c-98ea-67edaf26426f_1600x1266.png" width="1456" height="1152" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ff154402-7b7e-4f7c-98ea-67edaf26426f_1600x1266.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1152,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;The agent and tool node in a Graph&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="The agent and tool node in a Graph" title="The agent and tool node in a Graph" srcset="https://substackcdn.com/image/fetch/$s_!1XfY!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff154402-7b7e-4f7c-98ea-67edaf26426f_1600x1266.png 424w, https://substackcdn.com/image/fetch/$s_!1XfY!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff154402-7b7e-4f7c-98ea-67edaf26426f_1600x1266.png 848w, https://substackcdn.com/image/fetch/$s_!1XfY!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff154402-7b7e-4f7c-98ea-67edaf26426f_1600x1266.png 1272w, https://substackcdn.com/image/fetch/$s_!1XfY!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff154402-7b7e-4f7c-98ea-67edaf26426f_1600x1266.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption"><em>The agent and tool node in a Graph</em></figcaption></figure></div><p>We now have two nodes in our graph, but they&#8217;re not yet connected. Let's link them using edges.</p><h3>Adding Edges</h3><p>To connect the agent to the tool node in LangGraph, we use a <a href="https://langchain-ai.github.io/langgraph/concepts/low_level/?h=conditional+e#conditional-edges">conditional edge</a> because we want the agent to decide whether to transition the state to the tool or simply end.</p><p><strong>Here is how it works: </strong>we define a conditional function that checks whether the agent calls a tool, in this case the calculator. 
If it does, the state transitions to the tool node (named "tools" in our graph), where the calculator is executed.</p><p>If the agent doesn&#8217;t call a tool, the state transitions to the end of the graph, returning the agent&#8217;s response.</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;python&quot;,&quot;nodeId&quot;:&quot;338b3116-a210-443a-b7f5-c8d6592ce3a4&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-python">from langgraph.graph import END, MessagesState

def condition(state: MessagesState):
    messages = state["messages"]
    last_message = messages[-1]
    if last_message.tool_calls:
        return "tools"
    return END</code></pre></div><p>With the conditional function defined, we can now create the edge.</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;python&quot;,&quot;nodeId&quot;:&quot;ee706f8d-6bde-4398-b049-b9247225157a&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-python"># Add a conditional edge to the agent node that checks if tools are needed
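To see how a conditional function like this drives the whole cycle, here is a framework-free miniature of the agent/tools loop (illustrative only; LangGraph does this work for us once the edges are in place):

```python
# Framework-free miniature of the agent/tools cycle (illustrative only).
END = "__end__"

def agent(state: list) -> list:
    # Pretend the LLM requests the tool once, then answers.
    if not any(m.startswith("tool:") for m in state):
        return state + ["agent: call multiply(2, 2)"]
    return state + ["agent: the answer is 4"]

def tools(state: list) -> list:
    return state + ["tool: 4"]

def condition(state: list) -> str:
    # Conditional edge: route to tools only if the agent asked for one.
    return "tools" if "call" in state[-1] else END

def run(state: list) -> list:
    nodes = {"agent": agent, "tools": tools}
    current = "agent"
    while True:
        state = nodes[current](state)
        if current == "agent":
            nxt = condition(state)      # conditional edge
            if nxt == END:
                return state
            current = nxt
        else:
            current = "agent"           # definite edge back to the agent

final = run(["user: What is 2 times itself?"])
```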
graph_builder.add_conditional_edges(
    "agent",
    condition,
)</code></pre></div><p>Here's a visual representation of the graph so far.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!6NOF!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F110a5a77-473f-4dde-bef1-6b264d7dfd3d_1600x1266.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!6NOF!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F110a5a77-473f-4dde-bef1-6b264d7dfd3d_1600x1266.png 424w, https://substackcdn.com/image/fetch/$s_!6NOF!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F110a5a77-473f-4dde-bef1-6b264d7dfd3d_1600x1266.png 848w, https://substackcdn.com/image/fetch/$s_!6NOF!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F110a5a77-473f-4dde-bef1-6b264d7dfd3d_1600x1266.png 1272w, https://substackcdn.com/image/fetch/$s_!6NOF!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F110a5a77-473f-4dde-bef1-6b264d7dfd3d_1600x1266.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!6NOF!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F110a5a77-473f-4dde-bef1-6b264d7dfd3d_1600x1266.png" width="1456" height="1152" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/110a5a77-473f-4dde-bef1-6b264d7dfd3d_1600x1266.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1152,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;Conditional edges linking the agent to both the tool and end nodes. This means the agent has two paths to pick from.&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Conditional edges linking the agent to both the tool and end nodes. This means the agent has two paths to pick from." title="Conditional edges linking the agent to both the tool and end nodes. This means the agent has two paths to pick from." srcset="https://substackcdn.com/image/fetch/$s_!6NOF!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F110a5a77-473f-4dde-bef1-6b264d7dfd3d_1600x1266.png 424w, https://substackcdn.com/image/fetch/$s_!6NOF!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F110a5a77-473f-4dde-bef1-6b264d7dfd3d_1600x1266.png 848w, https://substackcdn.com/image/fetch/$s_!6NOF!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F110a5a77-473f-4dde-bef1-6b264d7dfd3d_1600x1266.png 1272w, https://substackcdn.com/image/fetch/$s_!6NOF!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F110a5a77-473f-4dde-bef1-6b264d7dfd3d_1600x1266.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption"><em>Conditional edges linking the agent to both the tool and end nodes. This means the agent has two paths to pick from.</em></figcaption></figure></div><p>A conditional edge exists between the agent and the tool, but if we transition to the tool node, how do we return to the agent node? This is where we introduce a new edge. 
Unlike the first, this one isn&#8217;t conditional; it&#8217;s a definite transition.</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;python&quot;,&quot;nodeId&quot;:&quot;6d216c9f-b67d-444d-b113-c634bf725b05&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-python"># Add an edge from the tools node to the agent node

graph_builder.add_edge("tools", "agent")</code></pre></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!OWEN!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbd36c416-2efd-47e5-8992-dedce53e46f7_1600x1266.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!OWEN!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbd36c416-2efd-47e5-8992-dedce53e46f7_1600x1266.png 424w, https://substackcdn.com/image/fetch/$s_!OWEN!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbd36c416-2efd-47e5-8992-dedce53e46f7_1600x1266.png 848w, https://substackcdn.com/image/fetch/$s_!OWEN!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbd36c416-2efd-47e5-8992-dedce53e46f7_1600x1266.png 1272w, https://substackcdn.com/image/fetch/$s_!OWEN!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbd36c416-2efd-47e5-8992-dedce53e46f7_1600x1266.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!OWEN!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbd36c416-2efd-47e5-8992-dedce53e46f7_1600x1266.png" width="1456" height="1152" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/bd36c416-2efd-47e5-8992-dedce53e46f7_1600x1266.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1152,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;A definite edge linking the tool node to the agent node. This means the tool node has only one path to follow.&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="A definite edge linking the tool node to the agent node. This means the tool node has only one path to follow." title="A definite edge linking the tool node to the agent node. This means the tool node has only one path to follow." srcset="https://substackcdn.com/image/fetch/$s_!OWEN!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbd36c416-2efd-47e5-8992-dedce53e46f7_1600x1266.png 424w, https://substackcdn.com/image/fetch/$s_!OWEN!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbd36c416-2efd-47e5-8992-dedce53e46f7_1600x1266.png 848w, https://substackcdn.com/image/fetch/$s_!OWEN!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbd36c416-2efd-47e5-8992-dedce53e46f7_1600x1266.png 1272w, https://substackcdn.com/image/fetch/$s_!OWEN!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbd36c416-2efd-47e5-8992-dedce53e46f7_1600x1266.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft 
pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption"><em>A definite edge linking the tool node to the agent node. This means the tool node has only one path to follow.</em></figcaption></figure></div><p>The graph is nearly complete; we simply need to incorporate one final element. Currently, the graph doesn&#8217;t have a starting node. We will add a start node and connect it to the agent node.</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;python&quot;,&quot;nodeId&quot;:&quot;c4d8d591-7f75-43ce-91da-5a21ac7c39cb&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-python">from langgraph.graph import START

# Add the starting edge to the agent node

graph_builder.add_edge(START, "agent")</code></pre></div><p>This means the agent is the first node to receive the state before any other part of the graph.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!UwkR!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc2c18cbd-b289-49bc-9935-0042e1415bab_1600x1266.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!UwkR!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc2c18cbd-b289-49bc-9935-0042e1415bab_1600x1266.png 424w, https://substackcdn.com/image/fetch/$s_!UwkR!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc2c18cbd-b289-49bc-9935-0042e1415bab_1600x1266.png 848w, https://substackcdn.com/image/fetch/$s_!UwkR!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc2c18cbd-b289-49bc-9935-0042e1415bab_1600x1266.png 1272w, https://substackcdn.com/image/fetch/$s_!UwkR!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc2c18cbd-b289-49bc-9935-0042e1415bab_1600x1266.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!UwkR!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc2c18cbd-b289-49bc-9935-0042e1415bab_1600x1266.png" width="1456" height="1152" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c2c18cbd-b289-49bc-9935-0042e1415bab_1600x1266.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1152,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;A complete agentic graph, featuring the start node, agent node, tool node, and end node.&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="A complete agentic graph, featuring the start node, agent node, tool node, and end node." title="A complete agentic graph, featuring the start node, agent node, tool node, and end node." srcset="https://substackcdn.com/image/fetch/$s_!UwkR!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc2c18cbd-b289-49bc-9935-0042e1415bab_1600x1266.png 424w, https://substackcdn.com/image/fetch/$s_!UwkR!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc2c18cbd-b289-49bc-9935-0042e1415bab_1600x1266.png 848w, https://substackcdn.com/image/fetch/$s_!UwkR!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc2c18cbd-b289-49bc-9935-0042e1415bab_1600x1266.png 1272w, https://substackcdn.com/image/fetch/$s_!UwkR!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc2c18cbd-b289-49bc-9935-0042e1415bab_1600x1266.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" 
class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption"><em>A complete agentic graph, featuring the start node, agent node, tool node, and end node.</em></figcaption></figure></div><p>With that, our graph is complete. The final step is to compile it.</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;python&quot;,&quot;nodeId&quot;:&quot;2c7ed895-4595-4d01-b38a-9c954a1fd51f&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-python"># Compile the graph

graph = graph_builder.compile()</code></pre></div><h3>Interacting with the Agentic State Machine</h3><p>With the graph compiled, we can now test the agentic state machine by using the graph&#8217;s invoke method.</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;python&quot;,&quot;nodeId&quot;:&quot;4ba4accc-e3d2-47d3-8502-8de7a4d4df5a&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-python"># Invoke the graph with a user message

result = graph.invoke({"messages": [{"role": "user", "content": "What is a million times four hundred and eight?"}]})
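
# (Optional) Walk the full message trace to inspect each transition.
# Typically: HumanMessage -&gt; AIMessage (tool call) -&gt; ToolMessage -&gt; final AIMessage.
for message in result["messages"]:
    print(type(message).__name__, ":", getattr(message, "content", ""))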

print("Assistant: ", result["messages"][-1].content)</code></pre></div><p>In the code above, we provided the agentic state machine with the initial input:</p><ul><li><p><em>&#8220;What is a million times four hundred and eight?&#8221;</em></p></li></ul><p>The state machine should return a response along the lines of:</p><ul><li><p><em>&#8220;A million times four hundred and eight is 408,000,000.&#8221;</em></p></li></ul><p>Now, let&#8217;s analyze the state transitions that occurred within the agentic state machine.</p><div id="youtube2-xc9s9LBswcs" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;xc9s9LBswcs&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/xc9s9LBswcs?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p>Here&#8217;s how the state transitioned:</p><ol><li><p>The user asked: <em>"What is a million times four hundred and eight?"</em> This became the initial state at the start node, which then transitioned to the agent node.</p></li><li><p>The agent analyzed the input and determined it needed to use the calculator tool, so it initiated a tool call.</p></li><li><p>The conditional function we defined detected the tool call, causing the state to transition to the tool node where the calculator was executed.</p></li><li><p>The result of the calculation was passed back to the agent.</p></li><li><p>The agent updated the state with its final response and transitioned to the end node.</p></li></ol><p>We&#8217;ve now seen how to bring an agentic state machine to life using LangGraph. 
While this low-level approach gives you full control over the system, it can be quite involved and requires more effort to build and maintain.</p><p>To make things easier, LangGraph also provides a set of high-level APIs that simplify the process of building agentic systems.</p><div class="pullquote"><p>Need creative, high-quality technical content? Happy to chat! Book a call with our Creative Engineers&#128071;&#127998;</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://calendar.app.google/s6ekF2rDbPjKWLLQ8&quot;,&quot;text&quot;:&quot;Let&#8217;s Chat! &#128222;&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://calendar.app.google/s6ekF2rDbPjKWLLQ8"><span>Let&#8217;s Chat! &#128222;</span></a></p></div><h2>LangGraph High-Level APIs</h2><p>LangGraph provides several high-level APIs that simplify the process of building AI agents. These abstractions allow you to focus more on behavior and logic, rather than the low-level details of graph construction. Some of the most notable include:</p><ul><li><p><a href="https://langchain-ai.github.io/langgraph/agents/agents/">LangGraph Prebuilt</a></p></li><li><p><a href="https://langchain-ai.github.io/langgraph/tutorials/multi_agent/agent_supervisor/">LangGraph Supervisor</a></p></li><li><p><a href="https://github.com/langchain-ai/langgraph-swarm-py">LangGraph Swarm</a></p></li><li><p><a href="https://github.com/langchain-ai/langchain-mcp-adapters">LangGraph MCP Adapter</a></p></li></ul><h3>LangGraph-Prebuilt</h3><p><strong>LangGraph Prebuilt</strong> offers ready-to-use components for building agentic systems. 
Instead of manually working with graph concepts like nodes and edges, you use higher-level abstractions that encapsulate common agent patterns.</p><p>For example, here&#8217;s how the graph we previously built from scratch can be created using the Prebuilt API:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;python&quot;,&quot;nodeId&quot;:&quot;d87ac4bb-7a1c-44da-921b-898018dde004&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-python">from langgraph.prebuilt import create_react_agent

def calculator(operation: str, a: float, b: float) -&gt; float:
    """A simple calculator function that performs basic arithmetic operations.
    Args:
        operation (str): The operation to perform. Can be "add", "subtract", "multiply", or "divide".
        a (float): The first number.
        b (float): The second number.
    Returns:
        float: The result of the operation.
    Raises:
        ValueError: If the operation is unknown.
    """

    if operation == "add":
        return a + b
    elif operation == "subtract":
        return a - b
    elif operation == "multiply":
        return a * b
    elif operation == "divide":
        return a / b
    else:
        raise ValueError("Unknown operation")
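
# Sanity-check the tool in isolation first -- this needs no model or API key
assert calculator("multiply", 1_000_000, 408) == 408_000_000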

agent = create_react_agent(
    model="openai:gpt-4.1",  
    tools=[calculator],  
    prompt="You are a helpful assistant"  
)

# Run the agent
agent.invoke(
    {"messages": [{"role": "user", "content": "What is a million times four hundred and eight?"}]}
)</code></pre></div><p>This code uses the <a href="http://python.langchain.com/api_reference/langchain/agents/langchain.agents.react.agent.create_react_agent.html">create_react_agent</a> function to create the agent. LangGraph Prebuilt also provides additional components, including tools for managing agent memory and state more effectively.</p><h3>LangGraph Supervisor</h3><p>If you're building a supervisor-style agentic system, where one agent oversees and coordinates the actions of other agents, then LangGraph Supervisor is the ideal API to use. Instead of building the entire system from scratch, you can use this high-level abstraction to simplify the process.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!HGWL!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1a144e9f-1f99-414c-9025-e9aebf8814ed_1600x1266.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!HGWL!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1a144e9f-1f99-414c-9025-e9aebf8814ed_1600x1266.png 424w, https://substackcdn.com/image/fetch/$s_!HGWL!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1a144e9f-1f99-414c-9025-e9aebf8814ed_1600x1266.png 848w, https://substackcdn.com/image/fetch/$s_!HGWL!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1a144e9f-1f99-414c-9025-e9aebf8814ed_1600x1266.png 1272w, 
https://substackcdn.com/image/fetch/$s_!HGWL!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1a144e9f-1f99-414c-9025-e9aebf8814ed_1600x1266.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!HGWL!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1a144e9f-1f99-414c-9025-e9aebf8814ed_1600x1266.png" width="1456" height="1152" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/1a144e9f-1f99-414c-9025-e9aebf8814ed_1600x1266.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1152,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;The supervisor agentic graph.&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="The supervisor agentic graph." title="The supervisor agentic graph." 
srcset="https://substackcdn.com/image/fetch/$s_!HGWL!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1a144e9f-1f99-414c-9025-e9aebf8814ed_1600x1266.png 424w, https://substackcdn.com/image/fetch/$s_!HGWL!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1a144e9f-1f99-414c-9025-e9aebf8814ed_1600x1266.png 848w, https://substackcdn.com/image/fetch/$s_!HGWL!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1a144e9f-1f99-414c-9025-e9aebf8814ed_1600x1266.png 1272w, https://substackcdn.com/image/fetch/$s_!HGWL!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1a144e9f-1f99-414c-9025-e9aebf8814ed_1600x1266.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" 
stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption"><em>The supervisor agentic graph.</em></figcaption></figure></div><p>To use it, you need to install the <code>langgraph-supervisor</code> library. It can be used alongside the <code>langgraph-prebuilt</code> library.</p><h3>LangGraph Swarm</h3><p><a href="https://langchain-ai.github.io/langgraph/agents/multi-agent/#swarm">LangGraph Swarm</a> is a high-level library designed for building collaborative agent systems, where multiple agents work together and can hand off tasks to one another without a central supervisor.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!H2PT!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2d266394-ef5d-4449-a342-b48dd0de51d9_1600x1266.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!H2PT!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2d266394-ef5d-4449-a342-b48dd0de51d9_1600x1266.png 424w, https://substackcdn.com/image/fetch/$s_!H2PT!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2d266394-ef5d-4449-a342-b48dd0de51d9_1600x1266.png 848w, https://substackcdn.com/image/fetch/$s_!H2PT!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2d266394-ef5d-4449-a342-b48dd0de51d9_1600x1266.png 
1272w, https://substackcdn.com/image/fetch/$s_!H2PT!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2d266394-ef5d-4449-a342-b48dd0de51d9_1600x1266.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!H2PT!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2d266394-ef5d-4449-a342-b48dd0de51d9_1600x1266.png" width="1456" height="1152" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/2d266394-ef5d-4449-a342-b48dd0de51d9_1600x1266.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1152,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;The swarm agentic graph.&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="The swarm agentic graph." title="The swarm agentic graph." 
srcset="https://substackcdn.com/image/fetch/$s_!H2PT!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2d266394-ef5d-4449-a342-b48dd0de51d9_1600x1266.png 424w, https://substackcdn.com/image/fetch/$s_!H2PT!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2d266394-ef5d-4449-a342-b48dd0de51d9_1600x1266.png 848w, https://substackcdn.com/image/fetch/$s_!H2PT!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2d266394-ef5d-4449-a342-b48dd0de51d9_1600x1266.png 1272w, https://substackcdn.com/image/fetch/$s_!H2PT!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2d266394-ef5d-4449-a342-b48dd0de51d9_1600x1266.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" 
stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption"><em>The swarm agentic graph.</em></figcaption></figure></div><h3>LangGraph MCP Adapters</h3><p>When building agentic graph systems that need to interact with external tools via the Model Context Protocol (<a href="https://neurlcreators.substack.com/p/how-ai-agents-use-tools-mcp-architecture-visual-explainer">MCP</a>), the langchain-mcp-adapters library simplifies the entire process.</p><p>It essentially acts as an MCP client for your agentic graph system, enabling it to communicate with both <a href="https://neurlcreators.substack.com/i/159944232/what-is-the-model-context-protocol-mcp-architecture">local and remote MCP servers</a>.</p><h2>Who is Using LangGraph in Prod?</h2><p>Every developer asks one question when they encounter a new tool: <a href="https://blog.langchain.dev/is-langgraph-used-in-production/">Is it used in production</a>? Some of the largest companies in the tech industry indeed use LangGraph in production to build AI agents. 
Let&#8217;s explore a few examples.</p><h3>Uber</h3><p>Uber&#8217;s Developer Platform team built AI agents using LangGraph to enhance the developer experience.</p><p>These agents can generate unit tests and ensure adherence to Uber&#8217;s internal coding standards, streamlining development and improving code quality.</p><div id="youtube2-Bugs0dVcNI8" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;Bugs0dVcNI8&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/Bugs0dVcNI8?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><h3>LinkedIn</h3><p>To improve recruiter efficiency, LinkedIn has developed an advanced AI Hiring Agent powered by LangGraph.</p><p>The team designed the agent to streamline the hiring process through automated functions, including performing conversational searches to understand recruiter needs and accurately matching candidates to open roles.</p><div id="youtube2-NmblVxyBhi8" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;NmblVxyBhi8&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/NmblVxyBhi8?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><h3>Elastic</h3><p>Elastic <a href="https://www.elastic.co/blog/building-automatic-import-attack-discovery-langchain">uses LangGraph to orchestrate its AI agents for threat detection scenarios</a>, significantly reducing labor-intensive SecOps tasks.</p><p>The integration improves Elastic's Generative 
AI features, such as the Elastic AI Assistant and Automatic Import, enabling them to understand complex security scenarios, generate queries, and craft accurate data integrations for streamlined security analytics.</p><div class="captioned-button-wrap" data-attrs="{&quot;url&quot;:&quot;https://neurlcreators.substack.com/p/langgraph-agent-state-machine-review?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;}" data-component-name="CaptionedButtonToDOM"><div class="preamble"><p class="cta-caption">Thanks for reading The Neural Blueprint: Practical Content for AI Builders! Know a builder who needs to see this? Share it with your team or community.</p></div><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://neurlcreators.substack.com/p/langgraph-agent-state-machine-review?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://neurlcreators.substack.com/p/langgraph-agent-state-machine-review?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share</span></a></p></div><h2>Conclusion: LangGraph&#8212;The Agentic State Machine</h2><p>LangGraph is quickly becoming the go-to framework for <a href="https://www.youtube.com/watch?v=aHCDrAbH_go">building effective AI agents</a>. It really shines because it takes proven computer science ideas like state machines and graphs, and uses them to create modern AI agents that are much tougher and better at tackling real-world problems.</p><p>Although LangChain has received significant criticism, LangGraph's design has earned widespread praise.</p><p>The big difference? Instead of agents relying on prompts to make decisions (which was a common criticism of LangChain), LangGraph uses clear, controllable, and easy-to-monitor graphs.
This makes the agents far more predictable and reliable.</p><p>Think of this article as just a quick intro; if you want to dive deep and truly understand LangGraph, definitely check out its <a href="https://langchain-ai.github.io/langgraph/concepts/why-langgraph/">official documentation</a>.</p><div><hr></div><p>We created &#128736;&#65039;<strong>BuildAIers Toolkit</strong> to help you cut through the noise and build smarter with the right tools, stacks, and strategies. On bi-weekly Tuesdays, we deliver hands-on insights to help you evaluate, integrate, and scale AI tech that actually works.</p><p>&#128204; <strong>Key Goals for Every Post:</strong></p><p>&#9989; Help AI builders make informed, confident tooling decisions</p><p>&#9989; Highlight workflows, tradeoffs, and real-world use cases</p><p>&#9989; Encourage a builder-led conversation around what tools are shaping the future of AI.</p><p>Used this tool in production? Tell us how it performed&#8212;and what you&#8217;d do differently. &#11015;&#65039;</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://neurlcreators.substack.com/p/langgraph-agent-state-machine-review/comments&quot;,&quot;text&quot;:&quot;Leave a comment&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://neurlcreators.substack.com/p/langgraph-agent-state-machine-review/comments"><span>Leave a comment</span></a></p>]]></content:encoded></item><item><title><![CDATA[LangGraph: The Agentic State-Machine Framework Taking AI Workflows Mainstream]]></title><description><![CDATA[State-machine graphs, built-in interrupt handling, and a GA cloud platform. LangGraph is powering Uber & LinkedIn&#8217;s production agents. Should it power yours? 
Our deep-dive verdict inside.]]></description><link>https://neurlcreators.substack.com/p/langgraph-2025-review</link><guid isPermaLink="false">https://neurlcreators.substack.com/p/langgraph-2025-review</guid><dc:creator><![CDATA[Neurl Creators]]></dc:creator><pubDate>Tue, 01 Jul 2025 21:00:33 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/11c02413-400e-4f3c-b2cb-030a8b6c8dfb_1456x1048.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>LangGraph wowed devs at the 2025 LangChain Interrupt conference: Uber, LinkedIn, and Replit each showcased prod use-cases.</p><p>Why? Because LangGraph lifts agentic workflows from prompt spaghetti to explicit, traceable state machines.</p><p>Here are the core parts of LangGraph in 2025:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!iBz-!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F37304546-60d9-4e77-8049-365e7390808d_1600x1118.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!iBz-!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F37304546-60d9-4e77-8049-365e7390808d_1600x1118.png 424w, https://substackcdn.com/image/fetch/$s_!iBz-!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F37304546-60d9-4e77-8049-365e7390808d_1600x1118.png 848w, https://substackcdn.com/image/fetch/$s_!iBz-!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F37304546-60d9-4e77-8049-365e7390808d_1600x1118.png 1272w, 
https://substackcdn.com/image/fetch/$s_!iBz-!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F37304546-60d9-4e77-8049-365e7390808d_1600x1118.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!iBz-!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F37304546-60d9-4e77-8049-365e7390808d_1600x1118.png" width="1456" height="1017" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/37304546-60d9-4e77-8049-365e7390808d_1600x1118.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1017,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;Core parts of LangGraph in 2025.&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Core parts of LangGraph in 2025." title="Core parts of LangGraph in 2025." 
srcset="https://substackcdn.com/image/fetch/$s_!iBz-!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F37304546-60d9-4e77-8049-365e7390808d_1600x1118.png 424w, https://substackcdn.com/image/fetch/$s_!iBz-!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F37304546-60d9-4e77-8049-365e7390808d_1600x1118.png 848w, https://substackcdn.com/image/fetch/$s_!iBz-!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F37304546-60d9-4e77-8049-365e7390808d_1600x1118.png 1272w, https://substackcdn.com/image/fetch/$s_!iBz-!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F37304546-60d9-4e77-8049-365e7390808d_1600x1118.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" 
stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Core parts of LangGraph in 2025.</figcaption></figure></div><blockquote><p>&#9193;<strong>TL;DR:</strong></p><p><strong>&#9989; Use it</strong> for multi-step agent graphs, fine-grained RAG, model-agnostic tooling, or deep tracing via LangSmith.</p><p><strong>&#128683; Skip it</strong> if you only need a one-vendor chatbot or you can&#8217;t spare the LCEL learning curve.</p><p>&#11088; Verdict: 4.5/5&#8212;the most disciplined path to production-grade agents today.</p></blockquote><div><hr></div><h2><strong>&#128226; What Is LangGraph in 2025?</strong></h2><p>LangGraph is a standalone library (and now a managed platform) that <strong>models agent workflows as explicit state-machine graphs</strong>.
Each node is an LLM or tool; edges define deterministic or agent-chosen transitions&#8212;yielding observability and error-handling you rarely get from prompt-only agents.</p><ul><li><p><strong>v0.4</strong> (<a href="https://changelog.langchain.com/announcements/langgraph-v0-4-working-with-interrupts">Apr 29 2025</a>) added automatic interrupt surfacing for safer long-running graphs.</p></li><li><p><strong>LangGraph Platform GA</strong> (<a href="https://blog.langchain.com/langgraph-platform-ga/">May 14 2025</a>) lets teams deploy, autoscale and monitor stateful agents in one click.</p></li><li><p>Works with 35 + model back-ends (OpenAI, Gemini, Claude, Bedrock, Ollama) via <a href="https://python.langchain.com/api_reference/community/adapters.html">LangChain adapters</a></p></li><li><p>Powers prod agents at Uber (dev-QA bot), LinkedIn (AI Hiring Agent), Elastic (threat-intel ingest), and more.</p></li><li><p>Optional <strong>LangSmith</strong> tracing for latency, token-cost &amp; evals.</p></li></ul><div><hr></div><h2><strong>&#9881;&#65039; Key Architecture Upgrades to LangGraph in 2025</strong></h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!qeBq!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F934f7bff-2288-41d5-a7b9-0b917bb9ceff_1419x1600.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!qeBq!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F934f7bff-2288-41d5-a7b9-0b917bb9ceff_1419x1600.png 424w, https://substackcdn.com/image/fetch/$s_!qeBq!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F934f7bff-2288-41d5-a7b9-0b917bb9ceff_1419x1600.png 
848w, https://substackcdn.com/image/fetch/$s_!qeBq!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F934f7bff-2288-41d5-a7b9-0b917bb9ceff_1419x1600.png 1272w, https://substackcdn.com/image/fetch/$s_!qeBq!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F934f7bff-2288-41d5-a7b9-0b917bb9ceff_1419x1600.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!qeBq!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F934f7bff-2288-41d5-a7b9-0b917bb9ceff_1419x1600.png" width="1419" height="1600" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/934f7bff-2288-41d5-a7b9-0b917bb9ceff_1419x1600.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1600,&quot;width&quot;:1419,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;Key Architecture Upgrades to LangGraph in 2025&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Key Architecture Upgrades to LangGraph in 2025" title="Key Architecture Upgrades to LangGraph in 2025" srcset="https://substackcdn.com/image/fetch/$s_!qeBq!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F934f7bff-2288-41d5-a7b9-0b917bb9ceff_1419x1600.png 424w, https://substackcdn.com/image/fetch/$s_!qeBq!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F934f7bff-2288-41d5-a7b9-0b917bb9ceff_1419x1600.png 848w, 
https://substackcdn.com/image/fetch/$s_!qeBq!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F934f7bff-2288-41d5-a7b9-0b917bb9ceff_1419x1600.png 1272w, https://substackcdn.com/image/fetch/$s_!qeBq!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F934f7bff-2288-41d5-a7b9-0b917bb9ceff_1419x1600.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption"><strong>Key Architecture Upgrades to LangGraph in 
2025</strong></figcaption></figure></div><h2><strong>&#9881;&#65039; How Does LangGraph Work? (2025 Edition)</strong></h2><ol><li><p><strong>Define States = Conversation history</strong> (list of messages).</p></li><li><p><strong>Add Nodes = Agents or Tools</strong> (LLM with bound tools, or a standalone function).</p></li><li><p><strong>Wire Edges = Transitions</strong>.</p><ul><li><p><em>Conditional</em> dotted edges (LLM decides).</p></li><li><p><em>Definite</em> solid edges (always taken).</p></li></ul></li><li><p><strong>Compiler</strong> turns your graph into an async executor with built-in tracing.</p></li></ol><p>The <strong>Graph = Pure Python</strong>: no YAML, no DSL. Use LCEL blocks and type hints.</p><div><hr></div><h2><strong>&#128640; Quick Spin-Up (Agent + Calculator Tool)</strong></h2><pre><code># pip install -U langgraph langchain[openai] langgraph-prebuilt

import os

from langgraph.prebuilt import create_react_agent

os.environ["OPENAI_API_KEY"] = "sk-..."  # &#9888;&#65039; set yours
os.environ["LANGCHAIN_TRACING_V2"] = "true"  # optional: LangSmith traces

# Simple calculator tool; the docstring becomes the tool description the LLM sees
def calc(operation: str, a: float, b: float) -> float:
    """Apply add, sub, mul, or div to two numbers."""
    return {"add": a + b, "sub": a - b, "mul": a * b, "div": a / b}[operation]

agent = create_react_agent(
    model="openai:gpt-4o",
    tools=[calc],
    prompt="You are a helpful and concise math tutor.",
)

result = agent.invoke(
    {"messages": [{"role": "user", "content": "What is 42*999?"}]}
)

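# Under the hood, create_react_agent wires roughly the graph below.
# A hedged sketch only: call_model, tool_node and route_after_agent are
# hypothetical helpers, not defined in this snippet; see the LangGraph
# docs for the exact current API.
#
#   from langgraph.graph import StateGraph, MessagesState, START
#   builder = StateGraph(MessagesState)        # 1. state = message history
#   builder.add_node("agent", call_model)      # 2. node: LLM with bound tools
#   builder.add_node("tools", tool_node)       #    node: tool executor
#   builder.add_edge(START, "agent")           # 3. definite edge
#   builder.add_conditional_edges("agent",     #    conditional edge: did the
#       route_after_agent)                     #    LLM request a tool call?
#   builder.add_edge("tools", "agent")         #    definite edge with result
#   graph = builder.compile()                  # 4. compile -> async executor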
print(result["messages"][-1].content)</code></pre><p>The helper builds a graph with:</p><ol><li><p><strong>Agent node</strong> (GPT-4o)</p></li><li><p><strong>Tool node</strong> (calculator)</p></li><li><p>Conditional edge: agent &#8594; tool if function call detected</p></li><li><p>Definite edge: tool &#8594; agent with result</p></li></ol><p>Interrupts, retries and LangSmith traces are automatic.</p><div><hr></div><h2><strong>&#128293; Why MLOps Engineers Care</strong></h2><ul><li><p><strong>Deterministic control flow</strong>: Graph edges make loops, branches and error paths explicit.</p></li><li><p><strong>Better observability</strong>: LangSmith + OpenTelemetry give step-level traces and spend.</p></li><li><p><strong>Vendor freedom</strong>: Swap GPT-4o for Gemini 1.5, Claude 3 Opus or local Ollama without code rewrites.</p></li><li><p><strong>Production proof</strong>: Uber&#8217;s developer-productivity agents <a href="https://www.youtube.com/watch?v=Bugs0dVcNI8">claim </a><strong><a href="https://www.youtube.com/watch?v=Bugs0dVcNI8">21 k engineer-hours saved</a></strong>; LinkedIn&#8217;s <a href="https://www.youtube.com/watch?v=NmblVxyBhi8">AI Hiring Assistant</a> runs on LangGraph too.</p></li><li><p><strong>MCP adapter</strong>: plug external tool servers (e.g., DeepWiki, private RAG APIs) in one line.</p></li></ul><div><hr></div><h2><strong>&#128201; Gotchas &amp; Caveats</strong></h2><ol><li><p><strong>Learning curve</strong>: need to grok state machines and LCEL syntax.</p></li><li><p><strong>Extra layer</strong>: simple Response-API bots deploy faster + each node/hop adds ~tens of ms vs straight SDK.</p></li><li><p><strong>Ecosystem split</strong>: still relies on LangChain&#8217;s core; if you dislike LCEL, you may prefer DSPy or Pydantic-AI.</p></li><li><p><strong>Over-engineering risk</strong>: for plain Q&amp;A, graphs are unnecessary overhead.</p></li></ol><div><hr></div><h2><strong>&#128483;&#65039; Community Pulse</strong></h2><p><a 
href="https://mlops-community.slack.com/archives/C0861BQ65A7/p1749627594014839?thread_ts=1749573545.316829&amp;cid=C0861BQ65A7">Lily Nicholls (Jun 11th)</a></p><p><em>&#8220;&#8230; I want the agent to be able to decide whether additional retrieval steps are needed or to tweak it's generated response based on a user's input.&#8221;</em></p><p><a href="https://mlops-community.slack.com/archives/C0861BQ65A7/p1749720843653299?thread_ts=1749573545.316829&amp;cid=C0861BQ65A7">Sammer Puran (Jun 12th)</a></p><p><em>&#8220;Langgraph works great for this process. You can add tools to retrieve the information, rewrite the query if the retrieved documents are not relevant, rank the documents, and generate the response. Also, you can write flows as directed graphs and visualize them&#8221;</em></p><p><a href="https://mlops-community.slack.com/archives/C0861BQ65A7/p1749723048243519?thread_ts=1749573545.316829&amp;cid=C0861BQ65A7">Lily Nicholls (Jun 12th)</a></p><p><em>&#8220;Great thank you! I am trying to decide between LangGraph &amp; ADK&#8221;</em></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!uIfJ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F83ffb396-8fa9-4a72-b95d-c02842d89dc1_1356x930.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!uIfJ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F83ffb396-8fa9-4a72-b95d-c02842d89dc1_1356x930.png 424w, https://substackcdn.com/image/fetch/$s_!uIfJ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F83ffb396-8fa9-4a72-b95d-c02842d89dc1_1356x930.png 848w, 
https://substackcdn.com/image/fetch/$s_!uIfJ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F83ffb396-8fa9-4a72-b95d-c02842d89dc1_1356x930.png 1272w, https://substackcdn.com/image/fetch/$s_!uIfJ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F83ffb396-8fa9-4a72-b95d-c02842d89dc1_1356x930.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!uIfJ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F83ffb396-8fa9-4a72-b95d-c02842d89dc1_1356x930.png" width="1356" height="930" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/83ffb396-8fa9-4a72-b95d-c02842d89dc1_1356x930.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:930,&quot;width&quot;:1356,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:291182,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://neurlcreators.substack.com/i/167299279?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F83ffb396-8fa9-4a72-b95d-c02842d89dc1_1356x930.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!uIfJ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F83ffb396-8fa9-4a72-b95d-c02842d89dc1_1356x930.png 424w, 
https://substackcdn.com/image/fetch/$s_!uIfJ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F83ffb396-8fa9-4a72-b95d-c02842d89dc1_1356x930.png 848w, https://substackcdn.com/image/fetch/$s_!uIfJ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F83ffb396-8fa9-4a72-b95d-c02842d89dc1_1356x930.png 1272w, https://substackcdn.com/image/fetch/$s_!uIfJ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F83ffb396-8fa9-4a72-b95d-c02842d89dc1_1356x930.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line 
x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div><hr></div><h2><strong>&#128161; Real-World Use Case: Uber&#8217;s Dev-Rel Copilot</strong></h2><ol><li><p><strong>Graph</strong>: Supervisor agent &#8594; code-generator sub-agent &#8594; test-writer sub-agent</p></li><li><p><strong>Flow</strong>: CL diff &#8594; graph generates unit tests &amp; standards feedback</p></li><li><p><strong>Impact</strong>: 21 k dev-hours saved in 90 days (LangChain Interrupt keynote)</p></li></ol><div id="youtube2-Bugs0dVcNI8" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;Bugs0dVcNI8&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/Bugs0dVcNI8?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><div><hr></div><h2>&#128202; How Does LangGraph Stack Up Against Alternatives in 2025</h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Bjje!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe59557c1-60ad-4c42-86a4-a68d9203647d_1235x1600.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Bjje!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe59557c1-60ad-4c42-86a4-a68d9203647d_1235x1600.png 424w, 
https://substackcdn.com/image/fetch/$s_!Bjje!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe59557c1-60ad-4c42-86a4-a68d9203647d_1235x1600.png 848w, https://substackcdn.com/image/fetch/$s_!Bjje!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe59557c1-60ad-4c42-86a4-a68d9203647d_1235x1600.png 1272w, https://substackcdn.com/image/fetch/$s_!Bjje!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe59557c1-60ad-4c42-86a4-a68d9203647d_1235x1600.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Bjje!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe59557c1-60ad-4c42-86a4-a68d9203647d_1235x1600.png" width="1235" height="1600" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e59557c1-60ad-4c42-86a4-a68d9203647d_1235x1600.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1600,&quot;width&quot;:1235,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;How Does LangGraph Stack Up Against Alternatives in 2025&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="How Does LangGraph Stack Up Against Alternatives in 2025" title="How Does LangGraph Stack Up Against Alternatives in 2025" 
srcset="https://substackcdn.com/image/fetch/$s_!Bjje!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe59557c1-60ad-4c42-86a4-a68d9203647d_1235x1600.png 424w, https://substackcdn.com/image/fetch/$s_!Bjje!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe59557c1-60ad-4c42-86a4-a68d9203647d_1235x1600.png 848w, https://substackcdn.com/image/fetch/$s_!Bjje!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe59557c1-60ad-4c42-86a4-a68d9203647d_1235x1600.png 1272w, https://substackcdn.com/image/fetch/$s_!Bjje!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe59557c1-60ad-4c42-86a4-a68d9203647d_1235x1600.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" 
stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">How Does LangGraph Stack Up Against Alternatives in 2025</figcaption></figure></div><div><hr></div><h2><strong>&#129489;&#8205;&#9878;&#65039; Final Verdict: 4.5 / 5 &#8212; Graphs &gt; Prompts for Complex Agents</strong></h2><h3><strong>&#11088; Rating: &#11088;&#11088;&#11088;&#11088;&#9734; (4.5 / 5)</strong></h3><p><strong>Ship it if&#8230;</strong></p><ul><li><p>You orchestrate multi-step agents, need reliable retries or must mix many tools.</p></li><li><p>You need loops, parallelism, or supervisor patterns.</p></li><li><p>Vendor-agnostic strategy or on-prem LLMs matter.</p></li><li><p>Granular tracing, cost guards and interrupt safety are non-negotiable.</p></li></ul><p><strong>Hold off if&#8230;</strong></p><ul><li><p>You&#8217;re pushing a single-provider chatbot with a 1-week MVP deadline.</p></li><li><p>One-vendor, single-prompt chatbots meet your roadmap.</p></li><li><p>Team can&#8217;t spare time to ramp-up on LCEL/graph mental-model.</p></li><li><p>Ultra-low latency (&lt;100 ms) matters more than complex logic.</p></li></ul><div class="pullquote"><p>Need creative, high-quality technical content? Happy to chat! Book a call with our Creative Engineers&#128071;&#127998;</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://calendar.app.google/s6ekF2rDbPjKWLLQ8&quot;,&quot;text&quot;:&quot;Let&#8217;s Chat! &#128222;&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://calendar.app.google/s6ekF2rDbPjKWLLQ8"><span>Let&#8217;s Chat! 
&#128222;</span></a></p></div><h2><strong>&#128204; More Resources</strong></h2><ul><li><p><strong><a href="https://changelog.langchain.com/announcements/langgraph-v0-4-working-with-interrupts">LangGraph v0.4 Interrupts changelog</a></strong></p></li><li><p><strong><a href="https://langchain-ai.github.io/langgraph/">LangGraph docs</a></strong></p></li><li><p><strong><a href="https://blog.langchain.dev/tag/case-studies/">Case-study hub</a></strong></p></li><li><p><strong><a href="https://blog.langchain.dev/langgraph-platform-ga/">LangGraph Platform GA blog</a></strong></p></li><li><p><strong><a href="https://langchain-ai.github.io/langgraph/agents/mcp">MCP adapter docs</a></strong></p></li><li><p><strong>Uber Interrupt talk (21k hours saved):</strong></p></li></ul><div id="youtube2-Bugs0dVcNI8" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;Bugs0dVcNI8&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/Bugs0dVcNI8?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><ul><li><p><strong>LinkedIn <a href="https://www.youtube.com/watch?v=NmblVxyBhi8">Hiring Agent</a> deep-dive: </strong></p></li></ul><div id="youtube2-NmblVxyBhi8" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;NmblVxyBhi8&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/NmblVxyBhi8?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><div><hr></div><p><strong>Enjoy this? 
</strong>Forward it to an AI&#8209;builder friend or share it on X with #NeuralBlueprint. &#129761;</p><div class="captioned-button-wrap" data-attrs="{&quot;url&quot;:&quot;https://neurlcreators.substack.com/p/langgraph-2025-review?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;}" data-component-name="CaptionedButtonToDOM"><div class="preamble"><p class="cta-caption">Thanks for reading The Neural Blueprint: Practical Content for AI Builders! This post is public, so feel free to share it.</p></div><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://neurlcreators.substack.com/p/langgraph-2025-review?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://neurlcreators.substack.com/p/langgraph-2025-review?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share</span></a></p></div><p></p>]]></content:encoded></item><item><title><![CDATA[Is LangChain Still Worth Using in 2025?]]></title><description><![CDATA[LangChain revolutionized how developers built with LLMs back in 2022. But in 2025, do you really still need it? 
This deep dive compares LangChain with today&#8217;s cutting-edge APIs &#8212; and helps you decide if it&#8217;s still worth using.]]></description><link>https://neurlcreators.substack.com/p/is-langchain-still-worth-using-in</link><guid isPermaLink="false">https://neurlcreators.substack.com/p/is-langchain-still-worth-using-in</guid><dc:creator><![CDATA[Eteimorde Youdiowei]]></dc:creator><pubDate>Tue, 17 Jun 2025 15:01:19 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/bef36adf-300c-4b1b-9e40-6618e629e2b7_1456x1048.png" length="0" type="image/png"/><content:encoded><![CDATA[<p>In late 2022, something unexpected took the AI development world by storm, and no, it wasn&#8217;t ChatGPT. It was LangChain. On October 24, 2022, Harrison Chase made the first commit to the LangChain repository, introducing a framework unlike anything the software industry had seen before.</p><p>At the time, language models were isolated systems. They took in text and returned text, nothing more. There were no built-in tools, no retrieval mechanisms, no memory. LangChain emerged to solve this problem by offering a way to orchestrate LLMs with tools, prompts, memory, and retrieval pipelines.</p><p>But 2022 feels like a lifetime ago.</p><p>Since then, the LLM landscape has changed dramatically. Language models no longer operate in isolation. Major providers now offer built-in capabilities like tool use, function calling, and external data access. 
Innovations such as OpenAI&#8217;s <a href="https://www.datacamp.com/tutorial/open-ai-function-calling-tutorial">function calling</a> and Anthropic&#8217;s Model Context Protocol (<a href="https://neurlcreators.substack.com/p/model-context-protocol-mcp-ai-integration">MCP</a>) have drastically reduced the limitations that LangChain originally set out to overcome.</p><p>So the question now is: Is LangChain still needed in 2025?</p><p>In this article, we&#8217;ll explore that question by comparing LangChain to the latest capabilities of the OpenAI API. With the introduction of the <a href="https://neurlcreators.substack.com/i/164874678/the-response-api-the-next-step-in-openais-api-evolution">Response API</a>, OpenAI has brought orchestration features directly into the platform.</p><p>This article will help you understand how far LLM tooling has come and whether LangChain still deserves a place in your toolbox.</p><p>Here&#8217;s what we&#8217;ll cover:</p><ul><li><p>Prompt management: How prompt handling compares between LangChain and OpenAI</p></li><li><p>Tool usage: How both platforms enable agentic behavior</p></li><li><p>Retrieval-Augmented Generation (RAG): LangChain&#8217;s stack compared to OpenAI&#8217;s native support</p></li><li><p>Multi-model support: How each handles access to different model providers</p></li><li><p>Structured outputs: LangChain&#8217;s output parsers versus OpenAI&#8217;s built-in response formatting</p></li></ul><p>Let&#8217;s dive in.</p><h2>&#127793; LangChain's Origin Story</h2><p>Back in 2022, when language model providers like OpenAI and Cohere first made their APIs publicly available, their capabilities were limited. You could send in a prompt and get back a response. Text in, text out. That was it.</p><p>While the APIs were simple, the research community was already exploring much more. 
Papers, tweets, and GitHub repositories were demonstrating how LLMs could be extended to perform advanced tasks like using calculators, searching the web, or reasoning step by step simply through clever prompting.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!V8uI!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faf1e9b4a-4c06-482e-8832-3770be4a5647_745x734.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!V8uI!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faf1e9b4a-4c06-482e-8832-3770be4a5647_745x734.png 424w, https://substackcdn.com/image/fetch/$s_!V8uI!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faf1e9b4a-4c06-482e-8832-3770be4a5647_745x734.png 848w, https://substackcdn.com/image/fetch/$s_!V8uI!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faf1e9b4a-4c06-482e-8832-3770be4a5647_745x734.png 1272w, https://substackcdn.com/image/fetch/$s_!V8uI!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faf1e9b4a-4c06-482e-8832-3770be4a5647_745x734.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!V8uI!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faf1e9b4a-4c06-482e-8832-3770be4a5647_745x734.png" width="745" height="734" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/af1e9b4a-4c06-482e-8832-3770be4a5647_745x734.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:734,&quot;width&quot;:745,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;A tweet from 2022 showing how to prompt GPT-3 to use a calculator [source: X]&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="A tweet from 2022 showing how to prompt GPT-3 to use a calculator [source: X]" title="A tweet from 2022 showing how to prompt GPT-3 to use a calculator [source: X]" srcset="https://substackcdn.com/image/fetch/$s_!V8uI!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faf1e9b4a-4c06-482e-8832-3770be4a5647_745x734.png 424w, https://substackcdn.com/image/fetch/$s_!V8uI!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faf1e9b4a-4c06-482e-8832-3770be4a5647_745x734.png 848w, https://substackcdn.com/image/fetch/$s_!V8uI!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faf1e9b4a-4c06-482e-8832-3770be4a5647_745x734.png 1272w, https://substackcdn.com/image/fetch/$s_!V8uI!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faf1e9b4a-4c06-482e-8832-3770be4a5647_745x734.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container 
restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption"><em>A tweet from 2022 showing how to prompt GPT-3 to use a calculator [source: <a href="https://x.com/goodside/status/1568448128495534081">X</a>]</em></figcaption></figure></div><p>Research like &#8220;<a href="https://arxiv.org/abs/2210.03350">Self-Ask</a>&#8221; had <a href="https://github.com/ofirpress/self-ask">working implementations</a>, and developers were beginning to realize the hidden potential of LLMs. But all this knowledge was scattered across the internet.</p><p>That&#8217;s when <a href="https://www.forbes.com/profile/harrison-chase/">Harrison Chase</a> stepped in. He began collecting those ideas and turning them into a unified framework. 
What he was working on would soon become LangChain, the first full-featured <a href="https://medium.com/@akankshasinha247/agent-orchestration-when-to-use-langchain-langgraph-autogen-or-build-an-agentic-rag-system-cc298f785ea4">LLM orchestration library</a>.</p><p>At launch, LangChain supported just two providers: OpenAI and Cohere. But it allowed developers to do more than just send prompts: it enabled models to perform tasks like web search and math calculations, addressing the two areas where LLMs struggled most at the time, factual grounding (hallucinations) and arithmetic.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!odC0!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F15bb529a-2fdf-4a81-ad64-acc2c5740d29_1473x561.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!odC0!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F15bb529a-2fdf-4a81-ad64-acc2c5740d29_1473x561.png 424w, https://substackcdn.com/image/fetch/$s_!odC0!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F15bb529a-2fdf-4a81-ad64-acc2c5740d29_1473x561.png 848w, https://substackcdn.com/image/fetch/$s_!odC0!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F15bb529a-2fdf-4a81-ad64-acc2c5740d29_1473x561.png 1272w, https://substackcdn.com/image/fetch/$s_!odC0!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F15bb529a-2fdf-4a81-ad64-acc2c5740d29_1473x561.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!odC0!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F15bb529a-2fdf-4a81-ad64-acc2c5740d29_1473x561.png" width="1456" height="555" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/15bb529a-2fdf-4a81-ad64-acc2c5740d29_1473x561.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:555,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!odC0!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F15bb529a-2fdf-4a81-ad64-acc2c5740d29_1473x561.png 424w, https://substackcdn.com/image/fetch/$s_!odC0!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F15bb529a-2fdf-4a81-ad64-acc2c5740d29_1473x561.png 848w, https://substackcdn.com/image/fetch/$s_!odC0!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F15bb529a-2fdf-4a81-ad64-acc2c5740d29_1473x561.png 1272w, https://substackcdn.com/image/fetch/$s_!odC0!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F15bb529a-2fdf-4a81-ad64-acc2c5740d29_1473x561.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft 
icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">LangChain's interest over time since its inception</figcaption></figure></div><p>LangChain didn&#8217;t get much attention until early 2023, when more developers jumped into the LLM space and quickly ran into the same limitations: hallucinations, weak reasoning, prompt engineering complexity, and lack of tooling.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!JVdI!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8963161d-c8d5-40f4-9ac2-ff8a96955f04_1141x123.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" 
srcset="https://substackcdn.com/image/fetch/$s_!JVdI!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8963161d-c8d5-40f4-9ac2-ff8a96955f04_1141x123.png 424w, https://substackcdn.com/image/fetch/$s_!JVdI!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8963161d-c8d5-40f4-9ac2-ff8a96955f04_1141x123.png 848w, https://substackcdn.com/image/fetch/$s_!JVdI!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8963161d-c8d5-40f4-9ac2-ff8a96955f04_1141x123.png 1272w, https://substackcdn.com/image/fetch/$s_!JVdI!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8963161d-c8d5-40f4-9ac2-ff8a96955f04_1141x123.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!JVdI!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8963161d-c8d5-40f4-9ac2-ff8a96955f04_1141x123.png" width="1141" height="123" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8963161d-c8d5-40f4-9ac2-ff8a96955f04_1141x123.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:123,&quot;width&quot;:1141,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;A user comment from the Patrick Loeber YouTube video on langchain released April 2023 [source: YouTube]&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="A user comment from the Patrick Loeber YouTube video on langchain 
released April 2023 [source: YouTube]" title="A user comment from the Patrick Loeber YouTube video on langchain released April 2023 [source: YouTube]" srcset="https://substackcdn.com/image/fetch/$s_!JVdI!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8963161d-c8d5-40f4-9ac2-ff8a96955f04_1141x123.png 424w, https://substackcdn.com/image/fetch/$s_!JVdI!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8963161d-c8d5-40f4-9ac2-ff8a96955f04_1141x123.png 848w, https://substackcdn.com/image/fetch/$s_!JVdI!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8963161d-c8d5-40f4-9ac2-ff8a96955f04_1141x123.png 1272w, https://substackcdn.com/image/fetch/$s_!JVdI!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8963161d-c8d5-40f4-9ac2-ff8a96955f04_1141x123.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a><figcaption class="image-caption"><em>A user comment from the Patrick Loeber YouTube video on langchain released April 2023 [source: <a href="https://www.youtube.com/watch?v=LbT1yp6quS8">YouTube</a>]</em></figcaption></figure></div><p>By this time, LangChain had rapidly expanded its feature set:</p><ul><li><p>Composable chains for combining multiple LLM calls and tools</p></li><li><p>Retrieval-Augmented Generation (RAG) support</p></li><li><p>Agentic workflows like <a href="https://learnprompting.org/docs/agents/react">ReAct</a> and <a href="https://learnprompting.org/docs/agents/mrkl?srsltid=AfmBOoo25KQxXPn3DG9OV19ST-V-zdjyXPcc1I3xvk-OSZOWeFPsePkf">MRKL</a></p></li><li><p>Memory integration for maintaining conversational context</p></li><li><p>Embedding model support for search and retrieval</p></li></ul><p>For many developers, LangChain became the 
easiest way to start building serious LLM applications. It was a dream toolkit for rapid experimentation.</p><p>But as LangChain grew, so did the criticism. Many in the software community began to argue that LangChain was <a href="https://minimaxir.com/2023/07/langchain-problem/#hello-world-in-langchain-or-more-accurately-hell-world">too complex</a>, <a href="https://safjan.com/problems-with-Langchain-and-how-to-minimize-their-impact/#1-overly-complex-and-unnecessary-abstractions">too abstract</a>, and <a href="https://www.octomind.dev/blog/why-we-no-longer-use-langchain-for-building-our-ai-agents#:~:text=LangChain%20was%20helpful%20at%20first,it%20wasn%E2%80%99t%20a%20good%20sign">not production-ready</a>. Some preferred the simplicity and control of using the OpenAI SDK directly.</p><p>On Reddit, Twitter, and forums, there was no shortage of frustration over its design choices.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Wjw9!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4df1bedb-0731-4c77-bb27-2d523b139f5f_1176x298.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Wjw9!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4df1bedb-0731-4c77-bb27-2d523b139f5f_1176x298.png 424w, https://substackcdn.com/image/fetch/$s_!Wjw9!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4df1bedb-0731-4c77-bb27-2d523b139f5f_1176x298.png 848w, https://substackcdn.com/image/fetch/$s_!Wjw9!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4df1bedb-0731-4c77-bb27-2d523b139f5f_1176x298.png 
1272w, https://substackcdn.com/image/fetch/$s_!Wjw9!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4df1bedb-0731-4c77-bb27-2d523b139f5f_1176x298.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Wjw9!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4df1bedb-0731-4c77-bb27-2d523b139f5f_1176x298.png" width="1176" height="298" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4df1bedb-0731-4c77-bb27-2d523b139f5f_1176x298.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:298,&quot;width&quot;:1176,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;A user expressing their frustration with LangChain [source: discussion thread].&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="A user expressing their frustration with LangChain [source: discussion thread]." title="A user expressing their frustration with LangChain [source: discussion thread]." 
srcset="https://substackcdn.com/image/fetch/$s_!Wjw9!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4df1bedb-0731-4c77-bb27-2d523b139f5f_1176x298.png 424w, https://substackcdn.com/image/fetch/$s_!Wjw9!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4df1bedb-0731-4c77-bb27-2d523b139f5f_1176x298.png 848w, https://substackcdn.com/image/fetch/$s_!Wjw9!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4df1bedb-0731-4c77-bb27-2d523b139f5f_1176x298.png 1272w, https://substackcdn.com/image/fetch/$s_!Wjw9!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4df1bedb-0731-4c77-bb27-2d523b139f5f_1176x298.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" 
stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption"><em>A user expressing their frustration with LangChain [source: <a href="https://github.com/langchain-ai/langchain/discussions/16169">discussion thread</a>].</em></figcaption></figure></div><p>This article isn&#8217;t about those criticisms.</p><p>Instead, we want to focus on a bigger question: <strong>Is LangChain still necessary in 2025?</strong></p><p>LangChain was designed to orchestrate the complex workflows that LLMs couldn&#8217;t handle alone. But now, model providers themselves offer orchestration capabilities, including built-in function calling, retrieval, memory, and even agent-like behaviors.</p><p>So, where does LangChain fit in today&#8217;s LLM stack?</p><p>That&#8217;s what we&#8217;re here to explore.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!EFcc!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F52d468b9-93f0-49e6-9fab-2964fb7c4b82_1600x1205.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!EFcc!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F52d468b9-93f0-49e6-9fab-2964fb7c4b82_1600x1205.png 424w, https://substackcdn.com/image/fetch/$s_!EFcc!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F52d468b9-93f0-49e6-9fab-2964fb7c4b82_1600x1205.png 848w, 
https://substackcdn.com/image/fetch/$s_!EFcc!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F52d468b9-93f0-49e6-9fab-2964fb7c4b82_1600x1205.png 1272w, https://substackcdn.com/image/fetch/$s_!EFcc!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F52d468b9-93f0-49e6-9fab-2964fb7c4b82_1600x1205.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!EFcc!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F52d468b9-93f0-49e6-9fab-2964fb7c4b82_1600x1205.png" width="1456" height="1097" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/52d468b9-93f0-49e6-9fab-2964fb7c4b82_1600x1205.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1097,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;Key architecture upgrades to LangChain since 2024&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Key architecture upgrades to LangChain since 2024" title="Key architecture upgrades to LangChain since 2024" srcset="https://substackcdn.com/image/fetch/$s_!EFcc!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F52d468b9-93f0-49e6-9fab-2964fb7c4b82_1600x1205.png 424w, https://substackcdn.com/image/fetch/$s_!EFcc!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F52d468b9-93f0-49e6-9fab-2964fb7c4b82_1600x1205.png 848w, 
https://substackcdn.com/image/fetch/$s_!EFcc!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F52d468b9-93f0-49e6-9fab-2964fb7c4b82_1600x1205.png 1272w, https://substackcdn.com/image/fetch/$s_!EFcc!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F52d468b9-93f0-49e6-9fab-2964fb7c4b82_1600x1205.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Key architecture upgrades to LangChain since 2024</figcaption></figure></div><h2>&#128450;&#65039; Prompt 
Management</h2><p>One of the selling points of LangChain at the time was its ability to assist developers with prompting. LangChain shipped built-in prompts that helped developers achieve their goals, but now, in 2025, the practice of &#8220;prompt engineering&#8221; is <a href="https://www.wsj.com/articles/the-hottest-ai-job-of-2023-is-already-obsolete-1961b054?mod=cio-journal_lead_pos1">almost non-existent</a>, largely because language models themselves have improved. With simple <a href="https://www.promptingguide.ai/techniques/zeroshot">zero-shot prompting</a>, you can now get an LLM to complete your task with high accuracy.</p><p>When LangChain first came out, its maintainers drew prompting techniques from the research community and carefully crafted new prompts. At the time, before OpenAI released the <a href="https://neurlcreators.substack.com/i/164874678/chat-completions-api">Chat Completions API</a> and chat-tuned models became common, LangChain played a crucial role in simulating chat-style interactions through prompt templates like this:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;plaintext&quot;,&quot;nodeId&quot;:&quot;3055b87b-d3a8-4f10-8842-e03ecee126a7&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-plaintext">"""The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.

Current conversation:
{history}
Human: {input}
AI:"""
</code></pre></div><p>These types of prompt templates became less common with the introduction of the Chat Completions API.</p><p>Back then, prompts were also used to get models to use tools. But that changed when OpenAI introduced function calling, which quickly became the standard. As a result, more model providers began fine-tuning their models to support tool use through function calls, making it much easier for developers to integrate tools without relying on complex prompt patterns.</p><p>There was also a time when prompts like &#8220;think step by step&#8221; were widely used to guide reasoning, but with the rise of <a href="https://sebastianraschka.com/blog/2025/understanding-reasoning-llms.html">reasoning models</a>, many LLMs can now guide themselves without relying on predefined prompt templates.</p><p>With all these improvements, developers today can leverage the power of LLMs more easily and effectively, without needing to depend as heavily on frameworks like LangChain.</p><div class="pullquote"><p>Need creative, high-quality technical content? Happy to chat! Book a call with our Creative Engineers&#128071;&#127998;</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://calendar.app.google/s6ekF2rDbPjKWLLQ8&quot;,&quot;text&quot;:&quot;Let&#8217;s Chat! &#128222;&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://calendar.app.google/s6ekF2rDbPjKWLLQ8"><span>Let&#8217;s Chat! &#128222;</span></a></p></div><h2>&#129520; Tool Usage</h2><p>Tool use was one of the main reasons developers turned to LangChain when it first launched. 
At the time, language models couldn&#8217;t natively use tools, so LangChain filled the gap by offering a way to extend LLM capabilities through tool integration.</p><p>LangChain provided built-in support for connecting models to tools such as:</p><ul><li><p>Web search (e.g., using <a href="https://github.com/serpapi">SerpAPI</a> or Bing Search)</p></li><li><p>Calculators for basic and complex math</p></li><li><p>Python REPLs for running code snippets</p></li></ul><p>These integrations were orchestrated using agentic patterns like ReAct and MRKL, enabling models to decide which tool to use based on the conversation.</p><p>Now, LLM providers support tool usage natively:</p><ul><li><p><strong>Function Calling</strong>: Language models like those from OpenAI allow developers to define functions as schemas. The model can then decide when to use a function during execution.</p></li><li><p><strong>Built-in Tools</strong>: Models from OpenAI, Google, and Anthropic increasingly support built-in tools such as search, code execution, and more.</p></li><li><p><strong>MCP</strong>: Anthropic&#8217;s Model Context Protocol (MCP) provides a standardized way to extend language models with tools and additional context.</p></li></ul><p>LangChain also supports function calling and MCP, and it offers a large collection of tools for developers to choose from. However, tool usage is no longer the framework&#8217;s main value proposition. For complex agentic workflows, even LangChain itself <a href="https://python.langchain.com/docs/how_to/migrate_agent/">recommends using LangGraph</a> instead of relying solely on the core framework.</p><h2>&#128269; Retrieval-Augmented Generation (RAG)</h2><p>Retrieval-Augmented Generation, or <a href="https://www.ibm.com/think/topics/retrieval-augmented-generation">RAG</a>, was one of the hottest topics in AI. Everyone wanted to connect their LLMs to external knowledge sources, largely due to the limited context length of models at the time. 
LangChain quickly became the go-to solution for this. It provided a complete RAG pipeline, including:</p><ul><li><p><strong>Loaders</strong> to pull data from external sources</p></li><li><p><strong>Vector stores</strong> for storing embedded data</p></li><li><p><strong>Retrievers</strong> for managing retrieval logic</p></li><li><p><strong>Chunking strategies</strong> to break down documents into manageable pieces that fit within context limits</p></li></ul><p>Later on, OpenAI introduced <a href="https://neurlcreators.substack.com/i/164874678/file-search-tool">file search tool</a> in their Assistant API (initially in beta) and eventually rolled it into the more stable Response API. This allowed users to upload files to OpenAI&#8217;s built-in vector store, enabling OpenAI to handle the entire retrieval process without requiring users to manage chunking, embeddings, or vector storage.</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;python&quot;,&quot;nodeId&quot;:&quot;ebf24b87-27a0-4a44-ae7a-bfefad9d05f6&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-python">from openai import OpenAI
client = OpenAI()

response = client.responses.create(
    model="gpt-4o-mini",
    input="What is deep research by OpenAI?",
    tools=[{
        "type": "file_search",
        "vector_store_ids": ["&lt;vector_store_id&gt;"]
    }]
)
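# Note: "vector_store_id" is a placeholder for an existing vector store's ID;
# you would create one and upload files first (exposed in the Python SDK under
# client.vector_stores). OpenAI then handles chunking, embedding, and
# retrieval entirely server-side.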
print(response)</code></pre></div><p>While OpenAI&#8217;s approach is much simpler to use, it&#8217;s also less flexible. LangChain, on the other hand, gives developers full control over the RAG pipeline, including choice of vector store, chunking logic, retrievers, and more. This can be critical when working with non-OpenAI models or when you need fine-grained control over retrieval behavior.</p><p>Some may argue that the need for RAG has diminished with the advent of long-context models, but it still plays a vital role in many applications. And when it comes to building a customized RAG pipeline, LangChain remains a strong option.</p><h2>&#127760; Access to Multiple Model Providers</h2><p>OpenAI was the dominant provider when language model APIs first became available. However, LangChain anticipated a future with multiple providers and built its framework to support them from the start.</p><p>As new providers entered the space, LangChain quickly integrated them, making it easy for developers to swap between models. This flexibility was one of the key reasons LangChain gained popularity early on.</p><p>Today, OpenAI&#8217;s own SDK supports multiple model providers, as long as they implement the OpenAI-compatible API standard. 
Providers like <a href="https://ai.google.dev/gemini-api/docs/openai">Google</a>, <a href="https://docs.anthropic.com/en/api/openai-sdk">Anthropic</a>, and <a href="https://ollama.com/blog/openai-compatibility">Ollama</a> have adopted this approach, allowing developers to use their models seamlessly through a unified interface.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!tPZe!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6942227b-06e6-4c51-b655-ebd4cfba7ac6_1080x1327.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!tPZe!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6942227b-06e6-4c51-b655-ebd4cfba7ac6_1080x1327.png 424w, https://substackcdn.com/image/fetch/$s_!tPZe!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6942227b-06e6-4c51-b655-ebd4cfba7ac6_1080x1327.png 848w, https://substackcdn.com/image/fetch/$s_!tPZe!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6942227b-06e6-4c51-b655-ebd4cfba7ac6_1080x1327.png 1272w, https://substackcdn.com/image/fetch/$s_!tPZe!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6942227b-06e6-4c51-b655-ebd4cfba7ac6_1080x1327.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!tPZe!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6942227b-06e6-4c51-b655-ebd4cfba7ac6_1080x1327.png" width="1080" height="1327" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/6942227b-06e6-4c51-b655-ebd4cfba7ac6_1080x1327.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1327,&quot;width&quot;:1080,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;This meme surfaced in early 2025 as many saw it as ironic that Deepseek seemed to be replacing OpenAI while still using their SDK.&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="This meme surfaced in early 2025 as many saw it as ironic that Deepseek seemed to be replacing OpenAI while still using their SDK." title="This meme surfaced in early 2025 as many saw it as ironic that Deepseek seemed to be replacing OpenAI while still using their SDK." 
srcset="https://substackcdn.com/image/fetch/$s_!tPZe!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6942227b-06e6-4c51-b655-ebd4cfba7ac6_1080x1327.png 424w, https://substackcdn.com/image/fetch/$s_!tPZe!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6942227b-06e6-4c51-b655-ebd4cfba7ac6_1080x1327.png 848w, https://substackcdn.com/image/fetch/$s_!tPZe!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6942227b-06e6-4c51-b655-ebd4cfba7ac6_1080x1327.png 1272w, https://substackcdn.com/image/fetch/$s_!tPZe!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6942227b-06e6-4c51-b655-ebd4cfba7ac6_1080x1327.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" 
stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">This meme surfaced in early 2025 as many saw it as ironic that Deepseek seemed to be replacing OpenAI while still using their SDK.</figcaption></figure></div><p>Some newer providers, like DeepSeek, didn&#8217;t even bother building a separate SDK. They simply conformed to the OpenAI spec from the start. This works well for both sides: model providers can focus on training and deploying their models, while users benefit from immediate access using tools they&#8217;re already familiar with.</p><p>In contrast, with LangChain, users often have to wait for official or community-built integrations to be added before they can access a new model, adding friction to the development process.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!7GbE!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4219c3da-1a5a-4cd8-a70a-7b78f5883a4b_1235x1600.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!7GbE!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4219c3da-1a5a-4cd8-a70a-7b78f5883a4b_1235x1600.png 424w, https://substackcdn.com/image/fetch/$s_!7GbE!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4219c3da-1a5a-4cd8-a70a-7b78f5883a4b_1235x1600.png 848w, 
https://substackcdn.com/image/fetch/$s_!7GbE!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4219c3da-1a5a-4cd8-a70a-7b78f5883a4b_1235x1600.png 1272w, https://substackcdn.com/image/fetch/$s_!7GbE!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4219c3da-1a5a-4cd8-a70a-7b78f5883a4b_1235x1600.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!7GbE!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4219c3da-1a5a-4cd8-a70a-7b78f5883a4b_1235x1600.png" width="1235" height="1600" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4219c3da-1a5a-4cd8-a70a-7b78f5883a4b_1235x1600.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1600,&quot;width&quot;:1235,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;How does LangChain stack up against other frameworks in 2025?&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="How does LangChain stack up against other frameworks in 2025?" title="How does LangChain stack up against other frameworks in 2025?" 
srcset="https://substackcdn.com/image/fetch/$s_!7GbE!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4219c3da-1a5a-4cd8-a70a-7b78f5883a4b_1235x1600.png 424w, https://substackcdn.com/image/fetch/$s_!7GbE!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4219c3da-1a5a-4cd8-a70a-7b78f5883a4b_1235x1600.png 848w, https://substackcdn.com/image/fetch/$s_!7GbE!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4219c3da-1a5a-4cd8-a70a-7b78f5883a4b_1235x1600.png 1272w, https://substackcdn.com/image/fetch/$s_!7GbE!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4219c3da-1a5a-4cd8-a70a-7b78f5883a4b_1235x1600.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" 
stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">How does LangChain stack up against other frameworks in 2025?</figcaption></figure></div><h2>&#129513; Structured Outputs</h2><p>The output from LLMs was originally unstructured text, which made it challenging to parse and use reliably. LangChain addressed this by introducing <strong><a href="https://python.langchain.com/docs/concepts/output_parsers/">output parsers</a></strong>, letting developers define expected structures and automatically interpret the model&#8217;s responses.</p><div id="youtube2-yj-wSRJwrrc" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;yj-wSRJwrrc&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/yj-wSRJwrrc?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;python&quot;,&quot;nodeId&quot;:&quot;3f50143f-5a18-4468-b1ac-d523225f637e&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-python">from typing import Literal

from pydantic import BaseModel
from openai import OpenAI

class SentimentResponse(BaseModel):
    sentiment: Literal["positive", "neutral", "negative"]

client = OpenAI()

result = client.responses.parse(
    model="gpt-4o-mini",
    input=[
        {"role": "system", "content": "Classify the review sentiment."},
        {"role": "user", "content": "The service was excellent!"}
    ],
    text_format=SentimentResponse  # schema ensures type-safe output
)
response = result.output_parsed  # a SentimentResponse instance

print(response.sentiment)  # e.g., "positive"</code></pre></div><p>Today, structured outputs are supported natively by LLM providers including OpenAI, Google, and Anthropic.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://neurlcreators.substack.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://neurlcreators.substack.com/subscribe?"><span>Subscribe now</span></a></p><h2>&#129489;&#8205;&#9878;&#65039; Final Verdict: Is LangChain Still Worth Using in 2025?</h2><p>Absolutely. Despite some criticism and the fact that major providers like OpenAI now offer many of LangChain&#8217;s features out of the box, LangChain remains a powerful and relevant tool thanks to several key strengths.</p><h3>Pros:</h3><ul><li><p><strong>Open Source and actively maintained</strong>: LangChain continues to receive frequent updates and improvements from an active community.</p></li><li><p><strong>Multi-model support</strong>: It supports smaller or open-source LLMs that may lack native features like function calling or structured output.</p></li><li><p><strong>Strong RAG toolkit</strong>: LangChain offers a robust pipeline for retrieval-augmented generation, with built-in loaders, vector stores, and customizable retrieval logic.</p></li><li><p><strong>Rich ecosystem</strong>: Beyond its core library, LangChain includes LangGraph for multi-agent workflows and LangSmith for observability and debugging.</p></li></ul><h3>Cons:</h3><ul><li><p><strong>Steeper learning curve</strong>: Its extensive feature set can be overwhelming for beginners.</p></li><li><p><strong>Complex abstractions</strong>: You&#8217;ll need to learn LangChain-specific concepts like LCEL.</p></li><li><p><strong>Documentation navigation</strong>: The abundance of content and component layers can make the docs difficult to navigate</p></li></ul><h2>&#128282; 
Final Take</h2><p>When choosing an AI orchestration approach in 2025, you generally have three options:</p><ul><li><p>Use an open-source orchestration framework like LangChain</p></li><li><p>Rely directly on a model provider&#8217;s SDK, such as OpenAI&#8217;s</p></li><li><p>Build your own orchestration layer from scratch</p></li></ul><p>Each option comes with its own set of trade-offs, ranging from flexibility and control to simplicity and ease of integration. The good news is that, unlike the early days of LLMs, developers today have a rich ecosystem of choices to suit different project needs and levels of complexity.</p><div class="pullquote"><p>What tools or stacks do you want us to break down next? Drop a comment.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://neurlcreators.substack.com/p/is-langchain-still-worth-using-in/comments&quot;,&quot;text&quot;:&quot;Leave a comment&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://neurlcreators.substack.com/p/is-langchain-still-worth-using-in/comments"><span>Leave a comment</span></a></p></div>]]></content:encoded></item><item><title><![CDATA[OpenAI Response API vs Chat Completions: Which Should You Use for Your Next Build?]]></title><description><![CDATA[Navigating OpenAI&#8217;s Evolving API Landscape]]></description><link>https://neurlcreators.substack.com/p/openai-response-api-vs-chat-completions</link><guid isPermaLink="false">https://neurlcreators.substack.com/p/openai-response-api-vs-chat-completions</guid><dc:creator><![CDATA[Eteimorde Youdiowei]]></dc:creator><pubDate>Sat, 31 May 2025 16:02:23 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/853c0078-512b-4aa8-9d47-665c6c8b9b92_1456x1048.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>You're gearing up to build your next AI-powered application and have chosen OpenAI as your language model 
provider. You've likely worked with the powerful GPT models before, and now you're back in the docs, setting up your stack. But right away, you're faced with a fundamental question: Should you stick with the tried-and-true <a href="https://platform.openai.com/docs/quickstart?api-mode=chat">Chat Completions API</a> or embrace the <a href="https://openai.com/index/new-tools-for-building-agents/">new Response API</a> that OpenAI is now recommending for all new projects?</p><p>The Chat Completions API is familiar territory. It has been the backbone of countless AI applications since <a href="https://blog.langchain.dev/chat-models/">early 2023</a>. It&#8217;s flexible, well-documented, and battle-tested. Other companies like Anthropic and Google have also adopted similar message-based interfaces, helping this format become a de facto industry standard. So why change what already works?</p><p>Well, OpenAI has a track record of evolving its APIs to reflect the latest innovations in AI development. The Chat Completions API was introduced when chat models like ChatGPT gained widespread use.</p><p>The Response API follows that same trend, but its focus is on agentic capabilities. It introduces first-class support for tools, simplifies conversation management with built-in state handling, and enhances support for streaming.</p><p>In this article, we will dive deeper into these new features of the Response API and see how it compares with the Chat Completions API. When you are done with the article, you will understand the following:</p><ul><li><p>How the Response API differs from the Chat Completions API</p></li><li><p>How it manages conversational state natively</p></li><li><p>The Response API&#8217;s first-class support for tools</p></li><li><p>The semantic-driven streaming support in the Response API</p></li></ul><p>Let&#8217;s dive in.</p><h2>Evolution of the OpenAI API</h2><p>The OpenAI API has evolved significantly over the last few years. 
When OpenAI first <a href="https://openai.com/index/openai-api/">introduced its API in 2020</a>, it gave developers access to <a href="https://arxiv.org/abs/2005.14165">GPT-3</a> through a simple interface. Since then, OpenAI has continued to refine and expand its API offerings to better support a variety of use cases.</p><p>Here&#8217;s a quick overview of the key APIs OpenAI has released over the years:</p><ul><li><p>The Completions API</p></li><li><p>The Chat Completions API</p></li><li><p>The Assistant API</p></li></ul><h3>The Completions API</h3><p>The Completions API was OpenAI&#8217;s first publicly available API, launched in 2020. It uses a simple "text in, text out" interface. You provide a text prompt, and the model responds by continuing or completing the input. This design reflects the foundational principle of large language models: predicting the next token in a sequence.</p><p>Here&#8217;s an example of how to use the Completions API with the OpenAI Python SDK:</p><pre><code>import openai

openai.api_key = "YOUR_API_KEY"

response = openai.Completion.create(
    model="text-davinci-003",
    prompt="Write a poem about the ocean at night.",
    max_tokens=100,
    temperature=0.7,
)
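# Note: this snippet uses the legacy v0.x SDK interface; text-davinci-003 and
# the Completions API have since been deprecated in favor of newer models and APIs.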

print(response.choices[0].text)</code></pre><p>This API was groundbreaking at the time, as it marked the first introduction of a general-purpose language model that could be prompted to perform a wide range of tasks, including text generation, summarization, question answering, and more.</p><h3>Chat Completions API</h3><p>With the release of ChatGPT in 2022, there was a growing demand for an API that mirrored the conversational style users experienced in the ChatGPT interface. In <a href="https://github.com/openai/openai-python/releases/tag/v0.27.0">March 2023</a>, OpenAI responded by launching the Chat Completions API.</p><p>Unlike the original Completions API, which followed a simple text-in, text-out format, the Chat Completions API was designed around multi-turn conversations. It introduced a structured format using messages, where each message had a role (system, user, or assistant), enabling developers to build more interactive and context-aware experiences.</p><p>Here&#8217;s an example of how to use the Chat Completions API with the Python SDK:</p><pre><code>import openai

openai.api_key = "YOUR_API_KEY"

response = openai.ChatCompletion.create(
    model="gpt-4",
    messages=[
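        # "system" steers behavior; "user" and "assistant" carry the dialogue turns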
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What's a good recipe for homemade pizza?"}
    ],
    temperature=0.7
)

print(response.choices[0].message["content"])</code></pre><p>Over time, OpenAI continued to improve this API by adding features like <a href="https://openai.com/index/function-calling-and-other-api-updates/">function calling</a>, which allowed models to interact with external tools. This introduced agent-like capabilities, but they were not fully integrated into the design of the API.</p><h3>The Assistant API</h3><p>At OpenAI&#8217;s Developer Day 2023, the <strong>Assistant API</strong> was introduced in beta as a step toward building more capable AI agents. It introduced the concept of <strong>assistants</strong>, AI agents that could be created and given access to built-in tools like code execution, file handling, and function calling. It also brought in <strong>threads</strong>, which allowed developers to manage and persist conversation state across interactions.</p><p>Here&#8217;s how the API works in practice. First, you create an assistant and grant it access to a built-in tool like the code interpreter:</p><pre><code>from openai import OpenAI
client = OpenAI()

assistant = client.beta.assistants.create(
  name="Math Tutor",
  instructions="You are a personal math tutor. Write and run code to answer math questions.",
  tools=[{"type": "code_interpreter"}],
  model="gpt-4o",
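  # code_interpreter lets the assistant write and run Python in a hosted sandbox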
)</code></pre><p>Next, you create a thread to manage the ongoing conversation:</p><pre><code>thread = client.beta.threads.create()</code></pre><p>User messages are added to the thread using its ID:</p><pre><code>message = client.beta.threads.messages.create(
  thread_id=thread.id,
  role="user",
  content="I need to solve the equation `3x + 11 = 14`. Can you help me?"
)</code></pre><p>Finally, you initiate a run that ties the assistant and the thread together:</p><pre><code>run = client.beta.threads.runs.create_and_poll(
  thread_id=thread.id,
  assistant_id=assistant.id,
  instructions="Please address the user as Jane Doe. The user has a premium account."
)</code></pre><p>The Assistant API introduced a powerful architecture for building agentic workflows, but it remained experimental and in beta. Many of the ideas it tested, like persistent conversation state and built-in tool support, laid the foundation for what would become the more streamlined and production-ready <strong>Response API</strong>.</p><h2>The Response API: The next step in OpenAI&#8217;s API evolution</h2><p>In March 2025, two years after the launch of the Chat Completions API, OpenAI introduced the Response API, a unified interface that combines the best of both the Chat Completions and Assistant APIs.</p><p>From the Chat Completions API, it inherits its simplicity: you send a list of messages and get a response. There&#8217;s no need to create assistants, manage thread objects, or handle extra orchestration. The Assistant API brings over the features that made agentic workflows possible, including built-in tool support, native state management, and event-driven streaming. The result is an API that&#8217;s lightweight and easy to use, yet powerful enough to support fully capable AI agents.</p><p>With the introduction of the Response API, OpenAI announced plans to deprecate the Assistant API. However, the Chat Completions API isn&#8217;t going anywhere. That means developers will now have two primary APIs to choose from on the OpenAI platform. So, which one should you use?</p><p>OpenAI&#8217;s general recommendation is simple: use the Response API for new projects, and stick with the Chat Completions API for existing ones that are already in production. Now let&#8217;s take a look at some of the features that make the Response API stand out from the Chat Completions API.</p><h2>Conversation State Management</h2><p>The Chat Completions API is stateless; each call is independent and doesn&#8217;t retain memory of previous interactions unless you explicitly pass the full message history every time. 
The Response API changes that by offering optional stateful conversations, managed directly by the OpenAI platform. This works similarly to how threads functioned in the Assistant API, but with a much simpler interface.</p><p>To maintain state between calls, you just pass the <code>previous_response_id</code> of an earlier response. This tells the API to continue the conversation from where it left off:</p><pre><code>from openai import OpenAI
client = OpenAI()

response = client.responses.create(
    model="gpt-4o-mini",
    input="tell me a joke",
)
print(response.output_text)

second_response = client.responses.create(
    model="gpt-4o-mini",
    previous_response_id=response.id,
    input=[{"role": "user", "content": "explain why this is funny."}],
)
print(second_response.output_text)</code></pre><p>If you prefer to manage state manually, as you would in the Chat Completions API, you still can. The Response API is backward-compatible and supports manual message construction.</p><p>You can also opt out of automatic state management entirely by setting the <code>store</code> parameter to <code>False</code>. This disables storage of the conversation state:</p><pre><code>from openai import OpenAI

client = OpenAI()

response = client.responses.create(
    model="gpt-4o-mini",
    input=[
        {"role": "user", "content": "knock knock."},
        {"role": "assistant", "content": "Who's there?"},
        {"role": "user", "content": "Orange."},
    ],
    store=False
)

print(response.output_text)</code></pre><p>In short, the Response API improves on the Chat Completions API by making conversation state optional, seamless, and easier to manage.</p><h2>Built-in Tool Support in the Response API</h2><p>The Response API, like the Assistant API, offers built-in tool support. With the Chat Completions API, you could only access tools via function calling or via dedicated models. Let&#8217;s go through some of the built-in tools the Response API currently supports.</p><h3>Web Search Tool</h3><p>The Response API includes a <a href="https://openai.com/index/introducing-chatgpt-search/">built-in web search tool</a> that allows the model to perform live searches. Using it is straightforward: just pass the tool in the <code>tools</code> parameter by specifying its type as <code>web_search_preview</code>.</p><pre><code>from openai import OpenAI
client = OpenAI()

response = client.responses.create(
    model="gpt-4.1",
    tools=[{"type": "web_search_preview"}],
    input="What was a positive news story from today?"
)

print(response.output_text)</code></pre><p>While the Chat Completions API also supports web search, it requires using specific models like <code>gpt-4o-search-preview</code> or <code>gpt-4o-mini-search-preview</code>. In contrast, the Response API offers greater flexibility: web search can be enabled across a wider range of models, making it easier to integrate into different applications without being locked into special model variants.</p><h3>File Search Tool</h3><p>The Response API introduces support for the <strong>file search tool</strong>. This tool allows the model to retrieve and reference information from files you've uploaded, making it great for use cases like answering questions from documentation, PDFs, knowledge bases, or research papers.</p><p>The Chat Completions API doesn&#8217;t support this tool at all. This is one of the Response API&#8217;s more agentic capabilities, carried over and improved from the Assistant API.</p><p>To use the file search tool, you first upload your files and organize them into a vector store. You then attach this store to your request using the <code>tools</code> parameter.</p><p>Here&#8217;s a basic example of how to use it:</p><pre><code>from openai import OpenAI
client = OpenAI()

response = client.responses.create(
    model="gpt-4o-mini",
    input="What is deep research by OpenAI?",
    tools=[{
        "type": "file_search",
        "vector_store_ids": ["&lt;vector_store_id&gt;"]
    }]
)
print(response)</code></pre><p>This tool gives the model retrieval-augmented generation (RAG) abilities with just a few lines of code.</p><div class="pullquote"><p><em>Need creative, high-quality technical content? Happy to chat! Book a call with our Creative Engineers&#128071;&#127998;</em></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://calendar.app.google/s6ekF2rDbPjKWLLQ8&quot;,&quot;text&quot;:&quot;Let&#8217;s Chat! &#128222;&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://calendar.app.google/s6ekF2rDbPjKWLLQ8"><span>Let&#8217;s Chat! &#128222;</span></a></p></div><h3>Image Generation</h3><p>Another powerful feature of the Response API is its support for image generation as a tool. Instead of requiring a dedicated model like in previous APIs, image generation is now treated as a native tool, just like web search or file search. This gives developers a more unified and consistent interface for triggering different capabilities.</p><p>Behind the scenes, the tool uses the <a href="https://platform.openai.com/docs/models/gpt-image-1">GPT image model</a>, so you still get the full capabilities of <code>gpt-image-1</code>, but without switching models. That means you can generate images while still using <code>gpt-4o</code> or any other supported model in your workflow.</p><p>Here&#8217;s how you can use it:</p><pre><code>from openai import OpenAI
import base64

client = OpenAI() 

response = client.responses.create(
    model="gpt-4.1-mini",
    input="Generate an image of gray tabby cat hugging an otter with an orange scarf",
    tools=[{"type": "image_generation"}],
)

# Save the image to a file
image_data = [
    output.result
    for output in response.output
    if output.type == "image_generation_call"
]
    
if image_data:
    image_base64 = image_data[0]
    with open("otter.png", "wb") as f:
        f.write(base64.b64decode(image_base64))</code></pre><h3>MCP Support</h3><p>The Response API includes <a href="https://openai.com/index/new-tools-and-features-in-the-responses-api/">native support for MCP</a>, enabling the model to directly access <a href="https://neurlcreators.substack.com/i/159944232/remote-communication-http-sse">remote MCP servers</a>. This effectively unlocks access to an unlimited number of external tools and knowledge bases that implement the MCP protocol.</p><p>Here&#8217;s an example of how to connect to an MCP server using the Response API:</p><pre><code>from openai import OpenAI

client = OpenAI()

resp = client.responses.create(
    model="gpt-4.1",
    tools=[
        {
            "type": "mcp",
            "server_label": "deepwiki",
            "server_url": "https://mcp.deepwiki.com/mcp",
            "require_approval": "never",
        },
    ],
    input="What transport protocols are supported in the 2025-03-26 version of the MCP spec?",
)

print(resp.output_text)</code></pre><p>In this example, the MCP tool connects to the <a href="https://deepwiki.org/">deepwiki</a> MCP server, which hosts various documentation. This allows your application to answer detailed questions based on any documentation available within the wiki.</p><p>Given MCP&#8217;s growing adoption, this integration is a game-changer; now developers can easily plug into remote MCP servers without needing to handle the MCP SDK or complex integrations themselves.</p><h3>Additional Tools and What&#8217;s Next</h3><p>OpenAI is expected to continue adding more tools to the Response API over time. So far, they have introduced the Computer Use tool, which builds on their Computer-Using Agent (<a href="https://openai.com/index/computer-using-agent/">CUA</a>) model, as well as the <a href="https://platform.openai.com/docs/guides/tools-code-interpreter">code interpreter</a> tool. These additions will make the Response API even more powerful and agentic.</p><h2>Response API Streaming With Semantic Events</h2><p>The Chat Completions API&#8217;s streaming feature delivered raw text chunks as they were generated. 
In contrast, the Response API introduces <strong>semantic streaming</strong>, where each streamed result is a structured event providing rich information beyond just text.</p><p>For example, the API emits semantic events like:</p><ul><li><p><code>ResponseCreatedEvent</code> &#8212; signals when a streaming response starts</p></li><li><p><code>ResponseCompletedEvent</code> &#8212; signals when the stream finishes</p></li><li><p><code>ResponseOutputTextDelta</code> &#8212; delivers partial text updates as they arrive</p></li></ul><p>Beyond text, the Response API also streams events related to tool usage, such as <code>ResponseFileSearchCallSearching</code> and <code>ResponseFileSearchCallCompleted</code>, enabling developers to track tool interactions in real time.</p><p>This event-driven approach makes streaming more informative and easier to integrate into complex, agentic workflows.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://neurlcreators.substack.com/p/openai-response-api-vs-chat-completions/comments&quot;,&quot;text&quot;:&quot;Leave a comment&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://neurlcreators.substack.com/p/openai-response-api-vs-chat-completions/comments"><span>Leave a comment</span></a></p><h2>Is This Vendor Lock-In?</h2><p>The Response API brings powerful new capabilities, but it also raises an important question: <a href="https://www.maragu.dev/blog/openai-responses-api-is-an-attempt-at-lock-in">Is this a form of vendor lock-in?</a></p><p>Many of the features introduced, like built-in tools, native state management, and semantic streaming, are tightly integrated into the OpenAI platform. These enhancements don't carry over to other model providers. 
So once you start building with the Response API, switching to a different provider may require significant rework, reducing the incentive to move away from OpenAI.</p><p>That said, OpenAI isn&#8217;t forcing developers to use these features. They&#8217;re optional enhancements designed to improve developer experience and agent capabilities. You can still use the Response API in a minimal, model-only fashion, just like Chat Completions, or continue using the Chat Completions API itself.</p><p>This also opens up new possibilities: as the Response API becomes more widely adopted, we may see community-built wrappers that abstract these features and make them usable across different model backends.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://neurlcreators.substack.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://neurlcreators.substack.com/subscribe?"><span>Subscribe now</span></a></p><h2>Conclusion</h2><p>The Response API brings a clear evolution in how AI agents are built. With native support for tools, built-in state management, and a more event-driven interface, it simplifies many of the complexities developers previously had to handle manually. If you're building an agentic application and you're comfortable working within the OpenAI ecosystem, the Response API is the clear choice. It provides more out of the box, with less boilerplate.</p><p>However, if your project needs to remain model-agnostic or you&#8217;re aiming for maximum portability, the Chat Completions API might still be the better fit. Thanks to its simplicity and wide adoption, it remains a solid choice. 
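<p>One way to keep that portability while still trying the Response API is a thin wrapper that hides the vendor-specific call behind your own interface. A minimal sketch (all names here are ours, not from either API):</p>

```python
from typing import Protocol

class ChatBackend(Protocol):
    """Anything that can turn a message list into a reply string."""
    def complete(self, messages: list) -> str: ...

class ResponsesBackend:
    """Adapter for OpenAI's Response API (assumes the openai Python SDK)."""
    def __init__(self, model: str = "gpt-4o-mini"):
        # Imported lazily so alternative backends need no OpenAI SDK installed.
        from openai import OpenAI
        self.client = OpenAI()
        self.model = model

    def complete(self, messages: list) -> str:
        resp = self.client.responses.create(model=self.model, input=messages)
        return resp.output_text
```

<p>Application code written against <code>ChatBackend</code> can later swap in an adapter for another provider without touching its call sites.</p>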
The Response API&#8217;s backward compatibility also means you can still use it in a familiar, lightweight way without adopting the full toolset.</p><p>Ultimately, the Response API represents OpenAI&#8217;s vision for the future of AI development: more structured, more capable, and more agent-driven. Whether that future aligns with your next build depends on your goals. The good news is you&#8217;ve got options, and both APIs are here to stay.</p><div class="pullquote"><p>Know a builder who needs to see this? Share it with your team or community.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://neurlcreators.substack.com/p/openai-response-api-vs-chat-completions?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://neurlcreators.substack.com/p/openai-response-api-vs-chat-completions?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share</span></a></p></div>]]></content:encoded></item><item><title><![CDATA[Building a Hotel Reservation AI Agent Using OpenAI Agent SDK]]></title><description><![CDATA[A hands-on guide to building a multi-agent hotel reservation assistant.]]></description><link>https://neurlcreators.substack.com/p/build-hotel-reservation-agent-with-openai-agent-sdk</link><guid isPermaLink="false">https://neurlcreators.substack.com/p/build-hotel-reservation-agent-with-openai-agent-sdk</guid><dc:creator><![CDATA[Neurl Creators]]></dc:creator><pubDate>Sat, 24 May 2025 18:04:32 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/518cd780-25c9-4ae9-9c4e-53953f5bb879_1456x1048.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Booking a hotel should be easy. 
Yet customer support teams often face an endless stream of repetitive questions like </p><blockquote><p><em>&#8220;What&#8217;s your check-in time?&#8221;</em>, <em>&#8220;Do you have free parking?&#8221;</em> or <em>&#8220;Can I cancel my reservation?&#8221;</em> </p></blockquote><p>&#8230; alongside routine booking and cancellation requests. </p><p>With AI, there&#8217;s a better way. A conversational hotel support agent can provide instant answers, handle reservations, and even manage booking changes without ever involving a human agent. </p><p>And with the <a href="https://openai.github.io/openai-agents-python/">OpenAI Agent SDK</a>, building such a system is not only possible but also remarkably straightforward.</p><p>In this article, we&#8217;ll walk through how to build a fully functional hotel reservation support agent using the Agent SDK. </p><p>You&#8217;ll see how to design multi-agent workflows where responsibilities are cleanly divided: <strong>triaging user intent</strong>, <strong>answering FAQs</strong>, and <strong>processing reservations or cancellations</strong>. 
</p><p>We&#8217;ll show how to connect your agents to real tools like room managers and FAQ providers, and how to maintain conversation state using context.</p><p>By the end, you&#8217;ll understand:</p><ol><li><p>How to define and orchestrate AI agents using the OpenAI Agent SDK</p></li><li><p>How to create and register tools that interact with external systems</p></li><li><p>How to use context to carry user-specific information across turns</p></li></ol><p>Whether you're exploring AI for customer service or want a hands-on project with the Agent SDK, this guide will show you how to build something useful, practical, and production-ready.</p><div><hr></div><h2>Understanding Multi-Agent Systems</h2><p>A Multi-Agent System (<a href="https://www.ibm.com/think/topics/multiagent-system">MAS</a>) is a framework where multiple autonomous agents interact within an environment to achieve individual or collective goals. Each agent operates independently, perceiving its surroundings and making decisions based on its objectives and knowledge. These agents can collaborate, coordinate, or even compete, depending on the system's design and desired outcomes. 
Such systems are particularly effective in handling complex, distributed tasks that are beyond the capabilities of a single agent.</p><p>In the context of hotel reservations, a MAS can be structured with specialized agents handling distinct responsibilities:</p><ul><li><p><strong>Primary Agent</strong>: Acts as the initial point of contact, interpreting user intents and directing queries to the appropriate specialized agent.</p></li><li><p><strong>Reservation Agent</strong>: Manages room bookings, cancellations, and modifications, interfacing with the hotel's reservation system.</p></li><li><p><strong>FAQ Agent</strong>: Provides information on hotel amenities, policies, and other frequently asked questions.</p></li></ul><p>This division of labor ensures that each agent focuses on its area of expertise, leading to more efficient and accurate responses.</p><p>To build these multi-agent systems, developers can use powerful toolkits like <strong>OpenAI Agent SDK</strong>, <strong><a href="https://developers.googleblog.com/en/agent-development-kit-easy-to-build-multi-agent-applications/">Google ADK</a></strong>, and <strong><a href="https://www.crewai.com/">CrewAI</a></strong>. These platforms simplify the process of creating, managing, and coordinating AI agents that work together seamlessly in real-world applications. <strong>In this tutorial, we&#8217;ll be using the OpenAI Agent SDK</strong> to demonstrate how to build a hotel reservation support agent from the ground up.</p><h2>Setting the Stage</h2><p>Before we start building intelligent agents, let&#8217;s set up the foundation of our system. 
In this section, we&#8217;ll walk through how to install the necessary tools, configure your environment, and create the external system that our agents will interact with.</p><h3><strong>Installing the OpenAI Agent SDK</strong></h3><p>You can install the SDK via pip:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;bash&quot;,&quot;nodeId&quot;:&quot;a4134086-ee9d-4d7a-9ec3-6f3926d1c679&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-bash">pip install openai-agents</code></pre></div><p>Next, you'll need an OpenAI API key to run the agent. Store it as an environment variable.</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;bash&quot;,&quot;nodeId&quot;:&quot;ad310489-6a13-44c3-884e-ff435d35ee43&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-bash">export OPENAI_API_KEY="your-api-key-here"</code></pre></div><h3><strong>Building the RoomManager</strong></h3><p>Before diving into agents and SDK features, let&#8217;s first build the <strong>external system</strong> that agents will interact with: a simple in-memory <strong>room reservation manager</strong>.</p><p>This <code>RoomManager</code> class will simulate a hotel&#8217;s inventory of rooms, tracking availability, handling reservations, and releasing bookings. Our agents will call methods from this class via tools in later sections.</p><p>Create a new file named <code>room_manager.py</code>, and add the following code:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;python&quot;,&quot;nodeId&quot;:&quot;a0005338-b26e-4493-9d96-1de3c06a8ae2&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-python">from typing import Optional

class RoomManager:
    def __init__(self):
        self.rooms = {}
        for i in range(1, 21):
            room_type = "standard" if i &lt;= 15 else "deluxe"
            self.rooms[f"room_{i}"] = {"type": room_type, "available": True}

    def find_available_room(self, room_type: str) -&gt; Optional[str]:
        for room_id, info in self.rooms.items():
            if info["type"] == room_type and info["available"]:
                return room_id
        return None

    def reserve_room(self, room_id: str) -&gt; bool:
        if room_id in self.rooms and self.rooms[room_id]["available"]:
            self.rooms[room_id]["available"] = False
            return True
        return False

    def release_room(self, room_id: str) -&gt; bool:
        if room_id in self.rooms and not self.rooms[room_id]["available"]:
            self.rooms[room_id]["available"] = True
            return True
        return False


room_manager = RoomManager()</code></pre></div><p>This class keeps things simple and self-contained. It supports:</p><ul><li><p>Finding the next available room of a given type</p></li><li><p>Reserving a room (marking it as unavailable)</p></li><li><p>Releasing a reserved room (making it available again)</p></li></ul><p>Now that our system is ready to simulate a hotel backend, we can begin integrating it with tools and agents using the Agent SDK.</p><h2>Understanding Context Management in the Agent SDK</h2><p>When working with large language models (<a href="https://aws.amazon.com/what-is/large-language-model/">LLMs</a>), <strong>context</strong> usually means the conversation history or the text the model uses to generate its next response.</p><p>In the <strong>OpenAI Agent SDK</strong>, context means something different. It is a structured Python object that holds important information about the ongoing conversation or session. This context is passed to the <a href="https://openai.github.io/openai-agents-python/running_agents/">agent runner</a>, which shares it among agents and tools as needed.</p><h3>How Context Differs in Agent SDK</h3><ul><li><p><strong>LLM context</strong>: The text-based conversation history that helps the model understand the current interaction.</p></li><li><p><strong>Agent SDK context</strong>: A stateful Python object that stores key information (like a user&#8217;s booking details) and is used by the agent runner to keep track of what has happened so far.</p></li></ul><p>This makes it easy for agents to remember important details across multiple steps. For example, if a user reserves a room, the room ID can be saved in the context. Later, if the user wants to cancel, the cancellation agent can access that same room ID from the context without asking the user again.</p><h3>Defining Our Context Object</h3><p>For our hotel reservation system, the context will store the user&#8217;s reserved room ID. 
Here's how you can define it in a file named <code>context.py</code>:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;python&quot;,&quot;nodeId&quot;:&quot;c8e07496-98c5-4bd3-8881-d9557e9d316a&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-python"># context.py

from dataclasses import dataclass
from typing import Optional

@dataclass
class UserBooking:
    room_id: Optional[str] = None</code></pre></div><p>This <code>UserBooking</code> object is passed to the <strong>agent runner</strong> and shared with agents and tools during the conversation. It helps keep the interaction smooth and natural by remembering the user&#8217;s booking information across multiple turns.</p><h2>Defining the Tools</h2><p>In this section, we define the tools our agents will use. These tools utilize the RoomManager we defined earlier and act as a bridge between the agents and the RoomManager class. They also interact with the context manager to maintain the current user&#8217;s room_id throughout the conversation.</p><p>First, here are the necessary imports you&#8217;ll need at the top of the file:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;python&quot;,&quot;nodeId&quot;:&quot;d5263fed-a110-4702-b684-0f3952223695&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-python">from agents import function_tool, RunContextWrapper
from context import UserBooking
from room_manager import room_manager</code></pre></div><h4>Reserving a Room</h4><p>The <code>reserve_room_by_type</code> function allows the agent to reserve an available room of the requested type (either "standard" or "deluxe"). It uses the <code>RoomManager</code> to find a suitable available room. If a room is found and successfully reserved, it updates the user&#8217;s context with the reserved room&#8217;s ID. This helps keep track of the user&#8217;s current booking throughout the session.</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;python&quot;,&quot;nodeId&quot;:&quot;dde6bd1a-559e-4cae-a8dc-9ac8c0dce84f&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-python">@function_tool
async def reserve_room_by_type(wrapper: RunContextWrapper[UserBooking], room_type: str) -&gt; str:
    "Reserve an available hotel room of a specified type (standard or deluxe)."

    room_id = room_manager.find_available_room(room_type)
    if not room_id:
        return f"No available {room_type} rooms."

    if room_manager.reserve_room(room_id):
        wrapper.context.room_id = room_id
        return f"Room {room_id} has been reserved for you."
    else:
        return f"Failed to reserve room {room_id}."</code></pre></div><h4>Freeing a Reserved Room</h4><p>The <code>free_reserved_room</code> function releases a room that the user has previously reserved. It checks the user&#8217;s context for a currently reserved room ID. If one exists, it uses the <code>RoomManager</code> to release the room and clears the room ID from the context. This keeps the user session consistent and ensures that room availability is updated properly.</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;python&quot;,&quot;nodeId&quot;:&quot;2ae9a433-0dd7-47e1-8417-c154aa28a01d&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-python">@function_tool
async def free_reserved_room(wrapper: RunContextWrapper[UserBooking]) -&gt; str:
    "Releases the currently reserved room for the user."

    room_id = wrapper.context.room_id
    if not room_id:
        return "You don't have any active room reservation."

    if room_manager.release_room(room_id):
        wrapper.context.room_id = None
        return f"Room {room_id} has been released and is now available."
    else:
        return f"Failed to release room {room_id}."</code></pre></div><h4>Providing FAQ Answers</h4><p>The <code>get_faqs</code> function returns a static string containing common hotel FAQs, such as room types, prices, check-in/out times, and amenities. Unlike the other tools, this one does not use the <code>RoomManager</code> or context because it simply serves as an informational resource for the FAQ agent.</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;python&quot;,&quot;nodeId&quot;:&quot;13e52c65-5560-49a6-8af7-711507b0e29f&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-python">@function_tool
async def get_faqs() -&gt; str:
    "Returns answers to common hotel reservation FAQs."

    faqs = """
    1. What types of rooms are available?
    - Standard: Cozy and affordable
    - Deluxe: Spacious with additional amenities

    2. Room prices:
    - Standard: $100/night
    - Deluxe: $180/night

    3. Check-in/out:
    - Check-in: 2:00 PM
    - Check-out: 11:00 AM

    4. Cancellation policy:
    - Free up to 24 hours before check-in

    5. Amenities:
    - Free breakfast and parking
    """
    return faqs</code></pre></div><p>Together, these tools connect the AI agents to the hotel&#8217;s backend room management system and maintain the user&#8217;s booking context during conversations. This separation of concerns keeps the agents focused on dialog and decision-making while delegating business logic to dedicated components.</p><p>Place all this code, including the imports and the three functions, into a file named <code>tools.py</code> in your project directory.</p><div class="pullquote"><p>Need creative, high-quality technical content? Happy to chat! Book a call with our Creative Engineers&#128071;&#127998;</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://calendar.app.google/s6ekF2rDbPjKWLLQ8&quot;,&quot;text&quot;:&quot;Let&#8217;s Chat! &#128222;&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://calendar.app.google/s6ekF2rDbPjKWLLQ8"><span>Let&#8217;s Chat! &#128222;</span></a></p></div><h3>Defining the Agents</h3><p>In this section, we define the core AI agents that will handle different parts of the hotel reservation system. Each agent has a clear role and set of instructions. They use the tools we defined earlier and can hand off conversations between each other to create a seamless multi-agent workflow.</p><p>Start with the necessary imports:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;python&quot;,&quot;nodeId&quot;:&quot;a7691e86-04c3-46ab-a5fb-dc4983cc1c99&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-python">from agents import Agent
from agents.extensions.handoff_prompt import RECOMMENDED_PROMPT_PREFIX
from context import UserBooking
from tools import reserve_room_by_type, free_reserved_room, get_faqs</code></pre></div><h4><strong>FAQ Agent</strong></h4><p>The <code>faq_agent</code> specializes in answering common questions about the hotel, such as room types, pricing, check-in/out policies, and amenities.</p><ul><li><p>It relies entirely on the <code>get_faqs</code> tool to provide answers.</p></li><li><p>It is instructed <strong>not</strong> to use any knowledge outside of this tool.</p></li><li><p>If it encounters a question it can&#8217;t answer or one outside the FAQs, it transfers control back to the primary agent.</p></li></ul><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;python&quot;,&quot;nodeId&quot;:&quot;3743bce8-bc2f-4bc2-b86e-a47a116ab35f&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-python">faq_agent = Agent[UserBooking](
    name="FAQ Agent",
    handoff_description="Agent for answering frequently asked questions about the hotel",
    instructions="""
        You are an FAQ agent for a hotel reservation system. Your job is to answer questions about room types, pricing, check-in times, and other policies.

        # Routine
        1. Use the `get_faqs` tool to answer any customer question.
        2. Do not use your own knowledge &#8212; rely solely on the tool output.
        3. If you cannot answer the question, or it falls outside FAQs, transfer back to the primary agent.""",
    tools=[get_faqs],
)</code></pre></div><h4><strong>Reservation Agent</strong></h4><p>The <code>reservation_agent</code> manages room bookings and cancellations.</p><ul><li><p>It guides the user through either reserving a new room or cancelling an existing booking.</p></li><li><p>For new reservations, it:</p><ul><li><p>Asks for the preferred room type,</p></li><li><p>Uses the <code>reserve_room_by_type</code> tool to find and book a room,</p></li><li><p>Saves the reserved room ID in the user context,</p></li><li><p>Confirms the reservation.</p></li></ul></li><li><p>For cancellations, it:</p><ul><li><p>Checks if the user has a current reservation,</p></li><li><p>Uses the <code>free_reserved_room</code> tool to release the room,</p></li><li><p>Confirms cancellation and clears the booking from the context.</p></li><li><p>Notifies the user if no active reservation exists.</p></li></ul></li><li><p>If the user request does not fit these intents, it hands off to the primary agent.</p></li><li><p>It can hand off conversations to the FAQ agent if needed.</p></li></ul><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;python&quot;,&quot;nodeId&quot;:&quot;a7651893-732a-4020-a409-4aa0cebf7010&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-python">reservation_agent = Agent[UserBooking](
    name="Reservation Agent",
    handoff_description="Agent for handling all hotel reservations and cancellations",
    instructions="""You are a hotel reservation agent. Your job is to help users reserve rooms and cancel existing bookings.
        # Routine
        1. Determine if the user wants to make a new reservation or cancel an existing one.
        2. If making a reservation:
            a. Ask for the preferred room type (standard or deluxe).
            b. Use the `reserve_room_by_type` tool to find and reserve an available room.
            c. Save the booked room ID to the user's context.
            d. Confirm the reservation.
        3. If cancelling:
            a. Check the user's context for an existing room ID.
            b. If found, use the `free_reserved_room` tool to cancel the booking.
            c. Confirm the cancellation and clear the context.
            d. If no room is reserved, inform the user.

        If the request does not match these intents, transfer back to the primary agent.""",
    tools=[reserve_room_by_type, free_reserved_room],
    handoffs=[faq_agent],
)</code></pre></div><h4>Primary Agent</h4><p>The <code>primary_agent</code> is the triage or "front desk" agent.</p><ul><li><p>It interprets user intent and routes the conversation to the appropriate specialist agent:</p><ul><li><p>To the Reservation Agent for booking or cancellation requests,</p></li><li><p>To the FAQ Agent for questions about policies and amenities.</p></li></ul></li><li><p>If the intent is unclear or changes, the primary agent takes control or redirects accordingly.</p></li><li><p>The <code>RECOMMENDED_PROMPT_PREFIX</code> adds system-level context explaining the multi-agent system and how handoffs work, so the agent knows to switch seamlessly between specialists without exposing this to the user.</p></li></ul><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;python&quot;,&quot;nodeId&quot;:&quot;8c664681-6442-47ba-9d4a-851ca0509a40&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-python">primary_agent = Agent[UserBooking](
    name="Primary Agent",
    instructions=(
        f"{RECOMMENDED_PROMPT_PREFIX} "
        "You are the primary triage agent in a hotel reservation system. You understand user intent and hand off to the appropriate specialist.\n"
        "1. If the user wants to reserve or cancel a room, hand off to the Reservation Agent.\n"
        "2. If they ask about hotel amenities, pricing, or policies, hand off to the FAQ Agent.\n"
        "3. If unsure or the request changes, take control or redirect accordingly."
    ),
    handoffs=[reservation_agent, faq_agent],
)</code></pre></div><h4><strong>Circular Handoffs</strong></h4><p>To keep the conversation flowing naturally, both the reservation and FAQ agents also have the primary agent in their <code>handoffs</code> list. This allows them to hand back control if needed.</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;python&quot;,&quot;nodeId&quot;:&quot;c809affb-84bc-4590-b8e1-5baf3d5b79cb&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-python">reservation_agent.handoffs.append(primary_agent)
faq_agent.handoffs.append(primary_agent)</code></pre></div><p>This multi-agent setup creates a natural flow where the primary agent acts as the central coordinator, passing requests to specialist agents that use tailored instructions and tools. This design keeps responsibilities clear and allows each agent to focus on its expertise while providing a smooth user experience.</p><p>Place all this code, including imports and agent definitions, in a file named <code>hotel_agents.py</code> in your project directory.</p><h2><strong>Putting It All Together and Running the Agents</strong></h2><p>Now that we have defined our context, tools, and agents, it&#8217;s time to bring everything together and see our multi-agent system in action.</p><p>The core of the system is the <code>Runner</code> class from the OpenAI Agent SDK, which manages agent execution, context passing, and conversation flow, including seamless handoffs between agents.</p><p>Here is a simple example of how you can set up an interactive loop that:</p><ul><li><p>Takes user input from the console,</p></li><li><p>Passes it to the current agent (starting with the primary agent),</p></li><li><p>Handles multi-turn conversations,</p></li><li><p>Prints out agent responses and any tool calls or handoffs.</p></li></ul><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;python&quot;,&quot;nodeId&quot;:&quot;cb4880ff-1fd8-463f-8a63-cdaaed93810f&quot;}"
data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-python">import asyncio
from agents import Runner, MessageOutputItem, ItemHelpers, HandoffOutputItem, ToolCallItem, ToolCallOutputItem
from context import UserBooking
from hotel_agents import primary_agent

async def main():
    # Initialize user context that keeps track of user state like current room reservation
    context = UserBooking()
    
    # Start with the primary agent, which triages and routes user requests
    current_agent = primary_agent
    
    # Conversation history input items
    input_items = []

    while True:
        # Get user input from the console
        user_input = input("Enter your message: ")
        if user_input.lower() == "exit":
            print("Exiting the hotel reservation assistant.")
            break

        # Append user message to the conversation history
        input_items.append({"content": user_input, "role": "user"})

        # Run the agents system with the current agent, conversation input, and user context
        result = await Runner.run(
            starting_agent=current_agent,
            input=input_items,
            context=context,
        )
        
        # Process and display outputs from the agent runner
        for new_item in result.new_items:
            agent_name = new_item.agent.name
            
            if isinstance(new_item, MessageOutputItem):
                # Agent response message
                print(f"{agent_name}: {ItemHelpers.text_message_output(new_item)}")
            
            elif isinstance(new_item, HandoffOutputItem):
                # Inform user about agent handoff
                print(f"Handed off from {new_item.source_agent.name} to {new_item.target_agent.name}")
                # Update current agent to the new agent after handoff
                current_agent = new_item.target_agent
            
            elif isinstance(new_item, ToolCallItem):
                # Indicate when the agent is calling a tool
                print(f"{agent_name}: Calling a tool...")
            
            elif isinstance(new_item, ToolCallOutputItem):
                # Output from a tool call
                print(f"{agent_name}: Tool call output: {new_item.output}")
            
            else:
                print(f"{agent_name}: Skipping item: {new_item.__class__.__name__}")

        # Carry the full conversation (including the agents' own replies and tool
        # results) into the next turn, and continue from whichever agent finished
        # the last turn; without this, agents would only see past user messages
        input_items = result.to_input_list()
        current_agent = result.last_agent

if __name__ == "__main__":
    asyncio.run(main())</code></pre></div><p>Here&#8217;s a breakdown of the code above:</p><ul><li><p><strong>UserBooking Context:</strong> This keeps track of the user's current reservation state, like the <code>room_id</code> they booked.</p></li><li><p><strong>Primary Agent:</strong> Acts as the triage agent directing queries to FAQ or reservation specialists.</p></li><li><p><strong>Input Items:</strong> All user messages are accumulated to provide full conversation context for the agents.</p></li><li><p><strong>Runner.run:</strong> This asynchronously processes user input with the current agent and returns messages, handoffs, and tool calls.</p></li><li><p><strong>Output Handling:</strong> The loop prints agent messages, notifies about handoffs, and displays tool call outputs, maintaining a smooth dialogue flow.</p></li><li><p><strong>Agent Switching:</strong> When a handoff happens, the <code>current_agent</code> updates to the new agent to continue the conversation.</p></li></ul><p>At this point, our main program is complete and saved as <strong>main.py</strong>. The full project structure looks like this:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;python&quot;,&quot;nodeId&quot;:&quot;1af38d02-f996-44d5-bcdb-abe054dc7ebf&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-python">hotel-reservation-agent/
&#9500;&#9472;&#9472; main.py
&#9500;&#9472;&#9472; context.py
&#9500;&#9472;&#9472; tools.py
&#9500;&#9472;&#9472; room_manager.py
&#9492;&#9472;&#9472; hotel_agents.py</code></pre></div><p>To start the hotel reservation assistant, run the following command:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;bash&quot;,&quot;nodeId&quot;:&quot;5d685c16-5e8b-438e-bb38-d13bf2c0f626&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-bash">python main.py</code></pre></div><p>You can find the complete code <a href="https://github.com/Neurl-LLC/Hotel_Reservation_AI_Agent">here</a>.</p><h2><strong>Conclusion</strong></h2><p>In this article, we built a multi-agent hotel reservation system using the OpenAI Agent SDK. We covered how to structure AI agents with clear roles, manage user context, and connect tools to interact with an external room manager.</p><p>By orchestrating seamless handoffs between specialized agents, we created a natural, conversational experience for hotel booking and FAQs.</p><p>Looking ahead, this system can be enhanced by integrating a real database instead of an in-memory store, adding user authentication, and expanding agents to handle more complex workflows like payment processing or personalized recommendations. The modular design makes it easy to extend and adapt for real-world deployment.</p>]]></content:encoded></item></channel></rss>