Our work

Built for us. Built for you next.

What we built for ourselves is the spark for what we could build for you.

Live

Pinecone

Production-grade OSINT capture, paired with an R&D agent platform we push to its limits

Three pieces in one stack. (1) A continuous OSINT capture pipeline, indexed daily, in active use. (2) The same agent technology we deploy for clients, on-rails and on-demand. (3) An R&D layer where we run those agents 24/7 unattended on our own ops, finding failure modes before clients see them. All on a local supercomputer cluster: token cost is just watts, sensitive data never leaves the premises.

Technical stack

  • Local AI supercomputer cluster (cost is watts, data stays on premise)
  • Continuous OSINT capture and corpus indexing
  • Autonomous agent technology (on-rails for clients, 24/7 in R&D)
  • Agent fleet observability and tracing

Why it matters

Most agentic systems demo well and break at month three. The patterns we deploy on client work have already survived months of unattended operation. You get an architecture proven against the failure modes that emerge over time, not just the ones that show up in a demo.

Patterns used: Continuous data ingestion · Autonomous agent technology · Agent fleet observability

Paused

AutoPundit

Daily AI-generated YouTube channel, multi-model pipeline end-to-end

Topics get selected from research briefs, scripts get written, AI-generated audio narrates, image and video models produce visuals, and the final cut composes and uploads automatically. Paused while character and audio generation costs continue to drop.

Technical stack

  • Multi-model creative pipeline (LLM, TTS, image, video)
  • End-to-end automated production
  • Cron-scheduled, runs unattended

Why it matters

Multi-model orchestration is its own discipline. We have shipped end-to-end content pipelines that chain LLM, TTS, image, video, and avatar systems into something that runs unattended.

Patterns used: Autonomous agentic flow · Filter-then-generate cascade · Multi-model orchestration

Archived

Herald

AI social media manager, ~500 followers in 3 months before X banned automation

Two-tier architecture: a fast model triages relevance, a reasoning model generates contextual replies. Brand voice loaded from context. Human-in-the-loop approval before posting. The X policy change in 2025 ended autopilot operation. The pattern lives on as a manual reply runbook.

Technical stack

  • Tiered model cascade (cheap filter, reasoning model generates)
  • Brand voice loaded from context
  • Human-in-the-loop approval before posting
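
As a rough illustration of the two-tier cascade described above, here is a minimal Python sketch. It is not Herald's actual code: `run_cascade`, `Draft`, and the three callables are hypothetical names, and the "models" in the demo are plain lambdas standing in for a fast triage model, a reasoning model, and a human reviewer.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Draft:
    post: str
    reply: str
    approved: bool = False


def run_cascade(
    posts: list[str],
    is_relevant: Callable[[str], bool],          # stand-in for the cheap, fast model
    generate_reply: Callable[[str, str], str],   # stand-in for the reasoning model
    approve: Callable[[Draft], bool],            # stand-in for the human reviewer
    brand_voice: str,
) -> list[Draft]:
    """Filter cheaply, generate only for relevant posts, keep only approved drafts."""
    approved: list[Draft] = []
    for post in posts:
        # Tier 1: the cheap filter decides whether the expensive model runs at all.
        if not is_relevant(post):
            continue
        # Tier 2: the expensive model sees the brand voice as context.
        draft = Draft(post=post, reply=generate_reply(brand_voice, post))
        # Gate: nothing is returned for posting without approval.
        draft.approved = approve(draft)
        if draft.approved:
            approved.append(draft)
    return approved


def demo_cascade() -> list[Draft]:
    posts = ["How do I trace agent runs?", "Lunch was great today"]
    return run_cascade(
        posts,
        is_relevant=lambda p: "agent" in p.lower(),
        generate_reply=lambda voice, p: f"[{voice}] Reply to: {p}",
        approve=lambda d: True,
        brand_voice="friendly",
    )
```

The expensive call sits behind the cheap one, so cost scales with the number of relevant posts rather than the number of posts seen.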

Why it matters

The cost-tier pattern (cheap-model filter, expensive-model generate) is reusable everywhere. So is brand-voice-in-context. We learn from every build, including the ones the platform kills.

Patterns used: Filter-then-generate cascade · Brand voice in context · Human-in-the-loop gate

Live

InkwellAI

Proposal editor with audio-driven review and agentic note integration

An audio-driven editor for long-form documents. Listen to your draft narrated, take notes inline as you go, and an agent integrates those notes into a revised draft. Built for high-stakes documents that need careful, repeated review.

Technical stack

  • Audio-driven document review
  • Agentic note integration produces revised drafts
  • Built for long-form, high-stakes documents

Why it matters

Editing long documents is judgment-heavy work that loses momentum if you have to break flow to mark up a PDF. Listen, mark, integrate, repeat.

Patterns used: Editorial workflow automation · Audio-driven review · Agentic document editing

Live

Lead-Ops

Agentic B2B outreach, multi-source discovery with human-in-the-loop gates

Multi-source prospect discovery feeds a pipeline that enriches, scores fit, and drafts outreach. Each stage gates for human review before progressing. The agent does the slow work; the human handles the quick approvals.

Technical stack

  • Multi-source signal-based prospecting
  • Staged human-in-the-loop gates
  • AI-driven scoring and personalized drafting
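
The staged-gate idea can be sketched in a few lines of Python. This is an assumption-laden illustration, not the Lead-Ops implementation: `Prospect`, `run_staged`, the stage functions, and the scoring rule in the demo are all hypothetical, and the `gate` callable stands in for a human reviewer.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Prospect:
    name: str
    data: dict = field(default_factory=dict)


Stage = Callable[[Prospect], Prospect]
Gate = Callable[[str, Prospect], bool]  # (stage name, prospect) -> approved?


def run_staged(
    prospects: list[Prospect],
    stages: list[tuple[str, Stage]],
    gate: Gate,
) -> list[Prospect]:
    """Run each stage over the survivors, then let the gate decide who moves on."""
    survivors = list(prospects)
    for stage_name, stage in stages:
        processed = [stage(p) for p in survivors]
        # Only prospects approved at this gate reach the next stage.
        survivors = [p for p in processed if gate(stage_name, p)]
    return survivors


def demo_staged() -> list[Prospect]:
    def enrich(p: Prospect) -> Prospect:
        p.data["industry"] = "logistics" if "Freight" in p.name else "other"
        return p

    def score(p: Prospect) -> Prospect:
        p.data["fit"] = 0.9 if p.data["industry"] == "logistics" else 0.2
        return p

    def draft(p: Prospect) -> Prospect:
        p.data["draft"] = f"Hello {p.name} team"
        return p

    def gate(stage_name: str, p: Prospect) -> bool:
        # Stand-in reviewer: rejects low-fit prospects after scoring, approves otherwise.
        return stage_name != "score" or p.data["fit"] >= 0.5

    prospects = [Prospect("Acme Freight"), Prospect("Corner Bakery")]
    stages = [("enrich", enrich), ("score", score), ("draft", draft)]
    return run_staged(prospects, stages, gate)
```

Because rejected prospects drop out before the next stage runs, no drafting effort is spent on leads a reviewer has already turned down.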

Why it matters

Outreach pipelines with judgment built in, not blast tools. The same pattern works for any pipeline that needs human checkpoints: sales, customer onboarding, content moderation.

Patterns used: HITL agentic flow · Signal-based prospecting · Staged pipeline gates

Pre-launch

Wild Companion

Character AI app with multi-user group chat, persistent memory, and image generation

A cross-platform mobile app built on Blazor, with a Supabase backend and integrated billing.

Technical stack

  • Cross-platform mobile (Blazor)
  • Supabase backend with row-level security
  • Multi-tenant character memory and image generation

Why it matters

Consumer apps live or die on three things: AI features that feel alive, auth and billing that don't fail, and polish details that don't slip. We engineer all three to ship-ready. The same bar applies to internal tools and customer portals.

Patterns used: Cross-platform consumer mobile · Multi-tenant secure backend · Integrated AI features

Runbooks

The patterns above, made executable.

Each runbook is a markdown file an agent reads and runs against your stack, pausing where human judgment matters. This is how the patterns ship: as files, not slideware.

Runbook · Live

Wild Companion Testing Runbook

Agentic red-teaming of mobile apps on a real Android emulator

An orchestrator dispatches sub-agents to drive a real Android emulator. Categories include visual regression, adversarial red-teaming, multi-model comparisons, context-retention, stress, network resilience, and code-level security checks. Output: an audit report.

What it does

  • Agentic red-teaming on real device
  • Sub-agent orchestration with token discipline
  • Adversarial safety prompt suite (jailbreak, prompt injection, policy bypass)
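
One way to picture sub-agent orchestration with token discipline is the Python sketch below. It is a simplified assumption of how such an orchestrator could work, not the runbook itself: `orchestrate`, `Result`, the budget numbers, and the fake sub-agent in the demo are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Result:
    category: str
    findings: list[str]
    tokens_used: int


# A sub-agent takes a test category and a token cap, and reports what it spent.
SubAgent = Callable[[str, int], Result]


def orchestrate(
    categories: list[str],
    sub_agent: SubAgent,
    total_budget: int,
    per_agent_cap: int,
) -> dict:
    """Dispatch one sub-agent per category until the shared token budget runs out."""
    remaining = total_budget
    results: list[Result] = []
    skipped: list[str] = []
    for category in categories:
        cap = min(per_agent_cap, remaining)
        if cap <= 0:
            # Budget exhausted: record the gap instead of silently dropping it.
            skipped.append(category)
            continue
        result = sub_agent(category, cap)
        remaining -= result.tokens_used
        results.append(result)
    return {
        "findings": {r.category: r.findings for r in results},
        "skipped": skipped,
        "tokens_remaining": remaining,
    }


def demo_orchestrate() -> dict:
    def fake_agent(category: str, cap: int) -> Result:
        # Hypothetical sub-agent that spends its whole cap and reports one finding.
        return Result(category, [f"checked {category}"], tokens_used=cap)

    categories = ["visual regression", "red-teaming", "stress", "network resilience"]
    return orchestrate(categories, fake_agent, total_budget=250, per_agent_cap=100)
```

Listing skipped categories in the report keeps the audit honest about what was not covered when the budget ran out.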

Why it matters

QA at the polish level customers expect, run by agents instead of manual click-throughs. The pattern works on web apps, internal tools, anything with a UI surface.

Patterns used: Agentic red-teaming · Sub-agent orchestration · Adversarial safety testing · Mobile UI automation

Runbook · Live

Pinecone Agent Audit Runbook

Diagnose and fix degradation in a long-running multi-agent system

When our R&D agent platform shows degradation, this runbook investigates: pipeline health, agent performance, config-runtime alignment, gaming detection, and fix planning. The output is a written fix plan with citations. A separate session executes the plan and verifies the metric actually moved.

What it does

  • Agent fleet observability and drift detection
  • Production agent ops discipline
  • Web-search-grounded fix plans with citations
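
The two checks that bracket a fix, detecting degradation and later confirming the metric moved, could look roughly like this in Python. The function names, the thresholds, and the assumption that the metric is "higher is better" are ours for illustration; the runbook's actual checks are not specified here.

```python
from statistics import mean


def has_drifted(baseline: list[float], recent: list[float], tolerance: float = 0.05) -> bool:
    """True when the recent mean falls more than `tolerance` (relative) below the baseline mean.

    Assumes a higher-is-better quality metric, such as an acceptance rate.
    """
    return mean(recent) < mean(baseline) * (1 - tolerance)


def fix_verified(before: list[float], after: list[float], min_gain: float = 0.02) -> bool:
    """True when the post-fix mean beats the pre-fix mean by at least `min_gain` (absolute)."""
    return mean(after) - mean(before) >= min_gain


def demo_audit() -> dict:
    baseline = [0.90, 0.92, 0.91]   # hypothetical healthy scores
    degraded = [0.80, 0.82, 0.81]   # hypothetical scores that trigger the audit
    after_fix = [0.90, 0.91, 0.92]  # hypothetical scores from the verification session
    return {
        "drift_detected": has_drifted(baseline, degraded),
        "fix_verified": fix_verified(degraded, after_fix),
    }
```

Keeping verification as a separate function mirrors the plan-then-execute split: the session that applies the fix does not get to declare success without a measured gain.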

Why it matters

Multi-agent systems silently degrade if you do not actively look. This is the discipline that keeps the system shipping clean output day after day.

Patterns used: Agent fleet observability · Production agent ops · Web-search-grounded fixes · Plan-then-execute

Runbook · Live

Federal Proposal Runbook

End-to-end runbook for finding, scoring, and writing federal proposals

A runbook that walks the federal proposal workflow from topic discovery through draft assembly to review-ready output. Reusable boilerplate, agency-specific templates, and opportunity scoring keep the routine parts repeatable so you can focus on the technical narrative.

What it does

  • LLM-based opportunity fit scoring
  • Reusable proposal boilerplate and templates
  • Compliance-aware assembly automation

Why it matters

Federal proposals are repetitive on the boilerplate parts and judgment-heavy on the technical content. The runbook handles the repetition so you keep your focus on the narrative.

Patterns used: Reusable spec-driven assembly · Opportunity fit scoring · Compliance-aware automation

Runbook · Portable

Delivery Playbooks

Reusable, language-agnostic runbooks we deploy on our own work and yours

A library of portable playbooks that drop into any codebase. All of them share the same shape: investigation phases produce a written artifact, a human approves a subset, and execution phases follow. Nothing destructive happens without approval. We use them on our own code; they are also part of what client engagements include.

What's in the library

  • AI-driven code audit: read-only assessment of any codebase
  • Behavior-preserving refactor: breaks up oversized modules in atomic commits
  • Idea pressure-testing: product, market, and business-fit audit
  • Editorial workflow automation: session-scoped review-and-integrate for long-form documents

Why it matters

Each is portable in the literal sense. Point an agentic coding tool at one and say "follow it on this codebase," and the work runs end-to-end. Same discipline every time.

Patterns used: 5-phase / 2-gate audit shape · Behavior-preserving refactor · Atomic commits · Backup branch before any change

See a piece of work you've been putting off in here?

The free discovery is mostly figuring out which bucket each piece of your work falls into. Modernization, agentic flow, runbook, or some mix.

Take the Assessment: Not sure what you need? Three minutes, no contact info.
Book a Discovery Call: Have a project in mind? 30-minute scoping conversation.