Two Machines, One Brain

My laptop kept getting OOM-killed. Multiple Claude Code sessions, a browser full of tabs, and AI pipelines all fighting over 32GB of RAM. The OOM killer would nuke whatever was biggest, usually the thing I was actively working in.

Meanwhile, the desktop sits three feet away. 64GB RAM, RTX 4090 with 24GB VRAM. Mostly idle. Bought it for gaming and occasional GPU work, but 90% of the time it’s a very expensive space heater.

So I connected them. An afternoon of work, and now the laptop offloads all GPU tasks to the desktop, the workspace syncs bidirectionally between both machines, and I can switch the desktop back to gaming mode with one command.

GPU Offloading

Four lightweight API services run on the desktop, each wrapping a GPU tool that was already installed there. Talking head video generation, audio transcription, image generation, and a coordinator that tracks VRAM usage and mode state.

These are the three heaviest GPU consumers in our daily pipelines. SadTalker pegs the GPU for 15+ minutes generating avatar videos. Whisper loads a 4GB model just to transcribe YouTube audio. ComfyUI eats VRAM for image generation. All of that now happens on the desktop’s 4090 instead of fighting over laptop resources.

Each service runs as a systemd unit. Start them, stop them, check their health — standard Linux service management.

Workspace Sync

Unison handles bidirectional sync between the machines. It tracks state on both sides, detects conflicts, and the laptop always wins. Ignore list keeps it fast — no .git, no node_modules, no virtual environments, no media files. First sync moved 2GB in under 3 minutes. Incremental syncs take under a second.

This means I can run Claude Code on the desktop when I need the full 64GB of RAM. A wrapper script handles the cycle: sync workspace over, SSH into a tmux session, open Claude Code, sync changes back when done. No more OOM kills.

Gaming Mode

The whole point of the desktop is that it’s also a gaming machine. GPU services and Steam don’t share 24GB of VRAM gracefully.

One command toggles between modes. Gaming mode stops the GPU services, unloads any AI models from VRAM, and updates the state so pipelines gracefully skip GPU work instead of timing out. Switching back restarts everything in about 15 seconds.

What Changed

The laptop runs cooler. No more 4GB Whisper model loaded into local GPU memory. No more 15-minute SadTalker sessions pegging the GPU at 100%. Claude Code sessions on the desktop get 64GB of RAM to work with.

An afternoon of Claude Code and some SSH. Most of that was just wrapping tools that were already installed. If you’ve got a second machine collecting dust, this is a solved problem.