I'm writing a post after a while, and I want to start with something that bit me early on.
When I was building and shipping my first Generative AI product, I did what most of us do. I hard-coded the prompts. It worked until it didn’t. Every time I wanted to tweak the tone, improve the wording, or fix a hallucination, it meant pushing code and re-deploying the service.
This made fast iteration nearly impossible and left product folks completely out of the loop. Eventually I realised that prompts should be treated like content, not code.
What Breaks When Prompts Live in the Code
At first it feels like just another string in your backend. But prompts are not static config. They are behaviour, and behaviour needs room to evolve.
The moment your prompts are shipped with your code, every small change becomes a process. You need to create a branch.
Make a commit. Open a pull request. Wait for CI pipelines to run. Merge. Then redeploy. All this friction for what might be a one-word change in how your assistant talks to users.
You lose the ability to iterate quickly. You block product folks or non-engineers from contributing. And worst of all, your prompts end up inheriting all the friction of your backend deployment process.
It also becomes nearly impossible to understand what changed and why. Git might show you the diff, but not the outcome.
- Did that change reduce hallucinations?
- Did it make completions shorter?
- Are users happier?
Without tracking and experimentation, you’re guessing. You wouldn’t hard-code customer support replies or marketing copy in your source code. Prompts deserve the same level of flexibility.
What Prompt Management Actually Looks Like
Prompt management is not some fancy new practice.
It is just applying the same principles we already use for other dynamic parts of the product, like CMS content, feature flags, or translations.
A good prompt management setup gives you a place outside of your codebase where prompts can live, evolve, and be tracked over time.
It does not have to be complex. You just need a simple way to store, version, and update prompts without touching your application code.
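To make that concrete, here is a minimal sketch of what "outside the codebase" can mean. It assumes a hypothetical JSON file as the prompt store; a database table or a managed service would work the same way, and the names here are made up for illustration.

```python
import json
from pathlib import Path

# Hypothetical prompt store: a JSON file (or DB table) edited outside the app,
# e.g. {"support-reply": {"version": 3, "text": "You are a helpful ..."}}
PROMPT_STORE = Path("prompts.json")

# Safe defaults baked into the code, used only if the store is unreachable
FALLBACK_PROMPTS = {
    "support-reply": "You are a helpful support assistant. Answer briefly and politely.",
}

def get_prompt_text(name: str) -> str:
    """Fetch the current prompt text by name, falling back to the baked-in default."""
    try:
        store = json.loads(PROMPT_STORE.read_text())
        return store[name]["text"]
    except (OSError, KeyError, ValueError):
        return FALLBACK_PROMPTS[name]

system_prompt = get_prompt_text("support-reply")
```

Swapping the JSON file for a database row or a dedicated tool changes nothing in the calling code, which is the whole point.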
Once you decouple prompts from the code, everything gets easier. You can update a prompt without redeploying. You can roll back to a previous version if something breaks.
You can let non-engineers make changes safely, and you can start connecting prompt versions to outcomes, so you can actually learn what works and what does not.
Some tools offer built-in versioning and prompt analytics. Others plug into your existing stack. The important thing is not what tool you use, but that you stop treating prompts as static strings buried in code.
Using Langfuse For Prompt Management
One tool I have used and recommend is Langfuse. It is open source, developer-friendly, and built to support teams working with LLM-powered applications in production.
Prompt management is just one of the things it helps with. Langfuse also gives you full visibility into your application’s traces, latency, and cost.
But for me, it’s the approach to managing and iterating on prompts that was a turning point.
Langfuse gives you a clean interface where you can create and update prompts outside your codebase.
You can version them, track changes over time, and roll back if something goes wrong.
You can also A/B test different versions of the same prompt and see how each one performs in production, all without redeploying your app.
This is not a sponsored mention. Just a personal recommendation based on what has worked well in my own projects.
It also makes it easier for non-engineers to contribute.
The Langfuse console lets product teams or writers tweak prompts safely, without touching the codebase or waiting for a release. It fits well into modern Generative AI stacks.
You can use it with LangChain, LlamaIndex, or your own custom setup, and since it is open source, you can self-host it if you want full control.
A Quick Look at How it Works
Just to give you a feel for it, here’s a basic example of how prompt management with Langfuse works in practice.
We can simply create a new prompt with variables through the user interface (you can create or update prompts programmatically, too).
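For completeness, here is roughly what the programmatic route looks like with the Langfuse Python SDK. The prompt name, variables, and config values are placeholders I made up for this example.

```python
from langfuse import Langfuse

langfuse = Langfuse()  # reads LANGFUSE_PUBLIC_KEY / LANGFUSE_SECRET_KEY / LANGFUSE_HOST

# Create (or add a new version of) a text prompt with {{variable}} placeholders
langfuse.create_prompt(
    name="movie-summary",  # placeholder name for this example
    type="text",
    prompt="Summarise the movie {{title}} in a {{tone}} tone, in at most {{max_words}} words.",
    labels=["production"],  # this version is served wherever 'production' is requested
    config={"model": "gemini-2.0-flash", "temperature": 0.3},  # optional metadata
)
```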
Note the "production" and "latest" labels assigned to the specific prompt version. You can use labels to retrieve specific versions of the prompts.
This makes it super easy to test new prompt versions on staging or development environments, as well as to run A/B tests.
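As a rough sketch, label-based retrieval might look like this. The "staging", "variant-a", and "variant-b" labels are custom ones you would assign yourself; only "production" and "latest" are managed by Langfuse out of the box.

```python
import random

from langfuse import Langfuse

langfuse = Langfuse()  # reads the LANGFUSE_* keys from the environment

# Pull whichever version is currently labelled for this environment
staging_prompt = langfuse.get_prompt("movie-summary", label="staging")

# Naive A/B split between two custom-labelled versions
label = random.choice(["variant-a", "variant-b"])
variant_prompt = langfuse.get_prompt("movie-summary", label=label)
print(f"Serving prompt version {variant_prompt.version} ({label})")
```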
We can now pull the latest version of a prompt and use it in a simple generation pipeline with Google’s GenAI SDK.
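Here is a minimal sketch of that pipeline, assuming the prompt created above and the google-genai package; the model name and variable values are placeholders.

```python
from google import genai
from langfuse import Langfuse

langfuse = Langfuse()
client = genai.Client()  # reads the Gemini API key from the environment

# Fetches the version labelled 'production' by default; pass label="latest" for the newest one
prompt = langfuse.get_prompt("movie-summary")

# Fill in the {{variables}} defined when the prompt was created
text = prompt.compile(title="Inception", tone="playful", max_words="80")

response = client.models.generate_content(
    model="gemini-2.0-flash",
    contents=text,
)
print(response.text)
```

Changing the prompt's wording in Langfuse changes what this pipeline sends on the next request, with no redeploy in between.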
What I’d Do Differently Today
If I were starting again, I would never hard-code prompts into my app. It slows you down, hides things from the people who could help, and turns every tiny change into a release.
Prompt management sounds like a nice-to-have until your first iteration bottleneck.
Then it becomes obvious. Decouple your prompts early. You’ll move faster, build better, and keep your team in the loop.