Member of Technical Staff - Agent Engineer
Agents are only as good as the team behind them. We're experienced researchers and engineers building vertical AI agents that run in production inside some of the most demanding environments there are: health insurers, banks, and the public sector, where decisions are audited and mistakes carry real consequences.
We're looking for a strong generalist who's comfortable moving across the stack and isn't fenced into one specialty, the kind of engineer who can follow a problem from the frontend down to the infrastructure and into the AI layer. Agents are a big part of how you cover that ground. The twist that makes this role unusual: you use agents to improve agents. You'll wield agentic coding tools to build, evaluate, and harden the production AI agents we ship to customers, and feed what you learn straight back into making the next ones better. You go wherever the hard problem is, and you understand what's happening underneath, in the systems you build and in the models and agents themselves.
What you'll do
Use agents to improve agents. Wield agentic coding tools (Claude, Codex, ...) to build, evaluate, and harden the production agents we ship, then fold what you learn back into faster, better tooling and stronger agents for the next use case. The better you get at directing agents, the better the agents you ship.
Work across the stack. Frontend, backend, infrastructure, data, and the AI layer. You go where the problem is instead of waiting for it to land in your lane.
Build and ship production AI agents. Take agents from prototype to production inside regulated environments: orchestration, tool use, messy real-world inputs, and the reliability and auditability real operations demand.
Own things end-to-end. From a vague problem to a deployed, reliable system: design it, build it, deploy it, and keep it running.
Go deep when it counts. Understand the systems beneath the surface: distributed systems behavior, infra, and how LLMs and agents actually work, so you can debug what others can't and make good calls under uncertainty.
Stay ahead. The tools and models change every few weeks. You test new releases early and translate what works into how you and the team build.
What we're looking for
Must-haves
Generalist range: you're excellent in at least one area and genuinely comfortable working outside it. You don't need to be an expert in everything; you need to be the kind of person who picks up an unfamiliar part of the stack and gets productive fast, with agents helping you go further.
Fluency with coding agents as your default way of working. You can point to specific workflows, failure modes, and things you've actually built or shipped with agents, not just tried.
A real mental model of how LLMs and agents work: why they fail, how to evaluate them, and how to make non-deterministic systems dependable.
Strong engineering judgment: you read unfamiliar code critically, catch when an agent has gone off track, and know what good architecture looks like.
Enthusiasm about AI and its applications, in software development and beyond.
On-site collaboration 3 days/week in Berlin or Bremen. Travel to our Bremen HQ during onboarding.
Fluency in English (at least B2).
Valid EU work authorization.
Nice-to-haves
Depth in one area that complements the breadth (e.g. distributed systems, infra/DevOps, full-stack, or applied AI/LLMs).
Experience building agent systems: orchestration, tool use, evaluation, or agent frameworks.
Experience taking AI from prototype to production, not just demos.
Experience in regulated industries (insurance, banking, public sector) or other compliance-heavy domains.
German language skills.
Open-source contributions or public writing on agents, applied AI, or agentic workflows.
What matters most
We prioritize demonstrated excellence in your projects and career. If you're motivated to build and optimize AI solutions, we want to hear from you, even if you don't meet every single criterion.
Why us?
Shape the future of AI development: You'll have real influence over our products and technical direction, helping decide how AI agents get built, evaluated, and deployed in the environments where it's hardest to get right.
Always at the frontier: You'll work with the newest models and techniques the moment they land, on the problems that make agents actually function in production: orchestrating multi-step workflows, integrating and switching across LLMs, building robust evaluation and guardrails, handling messy real-world inputs (PDFs, scans, voice), and engineering for auditability and reliability under regulatory constraints. Modern, well-architected systems, no legacy baggage holding you back.
Career-defining opportunity: AI agents are about to reshape how entire regulated industries operate, and getting them out of the demo and into real operations is the hardest, most valuable problem in the field right now. Almost no one has done it inside environments like health insurers, banks, and public institutions. You'll be one of the people who build them first, and you'll walk away with expertise and a track record that very few engineers in the world can claim.
Ownership and impact: Get full end-to-end ownership of the agents and systems you build, direct collaboration with AI researchers and engineers, and immediate feedback on how your work helps customers ship reliable AI. Your engineering decisions directly shape agents that make real, audited decisions in production.
Competitive package with upside: In addition to a competitive salary, we offer a VSOP (Virtual Stock Option Program) to give you a stake in the company's success as we grow.
Best-in-class development experience: Generous, no-friction access to all the AI tools and platforms that make your day-to-day faster, so you spend your time on hard problems, not on overhead.
Work environment: Our Bremen office features stunning waterfront views, complimentary beverages, smoothies, and a boat. We also have an office in Berlin, giving you flexibility across both locations.
Grow with transformative technology: Build deep expertise in AI agents, evaluation, and infrastructure alongside our expanding team, mastering the technologies that are reshaping entire industries.
About ellamind
We build the tools enterprises need to trust, deploy, and scale AI agents. elluminate evaluates LLMs and agents with evidence instead of guesswork; ellarun deploys them securely in hours, not months; and the ellaverse provides realistic, domain-specific, rigorously validated environments to put agents through their paces before they ever reach a customer. We like owning problems end-to-end, shipping pragmatically, and giving back to the open-source community. We're cash-flow positive, with offices in Bremen (HQ) and Berlin.