The Machine That Dreams: Inside My Personal AI Stack
A couple of weeks ago, I wrote about how AI eliminates Non-Value-Add Positions — any role, at any seniority, whose primary function is to collate information, compress it, and forward it upward or downward. The piece provoked a question I’ve been getting ever since: what does that actually look like in practice?
So let me open the hood.
What I run today is not a custom science project. It is an opinionated assembly of public, mostly open-source components — each one picked for a specific job — stitched together on hardware I already owned. The interesting part is not that it exists. It’s that the ecosystem around agentic AI has matured enough in the last four months that any serious operator can build a version of it. And the cost structure of doing so is an order of magnitude lower than what most enterprises are still paying for SaaS “productivity suites.”
Here is the setup, why each piece is in it, and why I think this now matters for the corporate world — not just for me on a Sunday afternoon.
The Stack
There is no single “AI” in my workflow. There are multiple models and one memory system, each with a specific role, and the whole thing is orchestrated by OpenClaw.
OpenClaw is the conductor — the runtime that connects messaging channels (Telegram, in my case), schedules recurring tasks, loads “skills,” and invokes the right AI model for the right job. Think of it as the operating system of a Chief of Staff. The actual thinking is done by models that OpenClaw calls on demand.
Gemini 3 Flash and Gemini 3.1 Pro are my default models — not Claude. This is a deliberate cost decision. An always-on agent that runs dozens of tasks a day burns tokens at a completely different scale than a human chatting with a chatbot. At roughly $0.30 per million input tokens and $0.60 per million output, Flash is about ten times cheaper than Claude Sonnet and roughly fifty times cheaper than Claude Opus. Gemini 3.1 Pro, at around $2 per million input, undercuts Sonnet meaningfully too — I started on Sonnet and migrated once the cost arithmetic became obvious.
The split between the two works like this. Flash drives the scheduled cron jobs — the background work that runs without my attention: healthcheck sweeps, variant generation, routine analytics, data fetches. Fast, cheap, good enough. Gemini 3.1 Pro is reserved for the moments where quality is visible. Two in particular: the Monday morning report, because it is the first thing I read each week and the framing has to earn my attention; and live Telegram chat, where I am asking questions in real time and both latency and judgment matter. Use the expensive model where you’ll notice it. Everywhere else, Flash.
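The routing rule and the cost arithmetic can be sketched in a few lines. The model names, task categories, and the Pro output price are illustrative assumptions, not OpenClaw’s actual API:

```python
# Hypothetical sketch of the routing rule described above: cheap model for
# unattended background work, the expensive model only where quality is visible.
BACKGROUND = {"healthcheck", "variant_generation", "analytics", "data_fetch"}
QUALITY_CRITICAL = {"monday_report", "live_chat"}

def pick_model(task: str) -> str:
    """Route a task to the cheapest model that is good enough for it."""
    if task in QUALITY_CRITICAL:
        return "gemini-3.1-pro"   # visible output: pay for judgment
    return "gemini-3-flash"       # default to cheap; escalate deliberately

def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Rough cost in USD from the per-million-token prices quoted above."""
    prices = {  # (input $/M tokens, output $/M tokens)
        "gemini-3-flash": (0.30, 0.60),
        "gemini-3.1-pro": (2.00, 8.00),  # Pro output price assumed for illustration
    }
    pin, pout = prices[model]
    return (input_tokens * pin + output_tokens * pout) / 1_000_000
```

An always-on agent burning tens of millions of Flash tokens a day still costs less than a coffee, which is the whole point of the split.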
Claude still has its place, but outside the OpenClaw pipeline. I keep a Claude Max subscription on the Claude app for personal writing and deep reasoning — the pieces where voice and judgment really carry the argument. Different tools for different jobs. That’s the whole game.
Qwen 14B (local) handles the overnight heavy lifting. Because it runs on my own hardware, inference is effectively free after electricity. It powers the Dream Cycle (more on that below), processes sensitive documents that should never leave the building, and handles anything else that benefits from long, uninterrupted, private processing. Why pay for it when it can run on an idle PC I already own?
Nomic-Embed-Text (local) is the search engine. Nomic is an open-source embedding model with an 8,192-token context window that, notably, outperforms OpenAI’s Ada-002 on standard benchmarks — and you can run it locally under Apache 2.0. Its job is purely mathematical: turn every entry in my memory system into a search vector. Running it locally means my entire searchable memory index is private, portable, and proprietary. No vendor ever sees it.
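The “purely mathematical” job reduces to one operation: cosine similarity between vectors. The tiny vectors below are illustrative stand-ins, not real Nomic embeddings, and the serving details (endpoint, runtime) are left out deliberately since they vary by setup:

```python
import math

def cosine(a, b):
    """Cosine similarity: the entire retrieval primitive in one formula."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# A memory entry and a query each become a vector via the local embedding
# model; retrieval is just ranking entries by this score.
entry_vec = [0.12, 0.98, 0.05]   # illustrative stand-in vectors
query_vec = [0.10, 0.95, 0.07]
score = cosine(entry_vec, query_vec)  # close to 1.0 means "same theme"
```

Because the math runs on vectors rather than words, two notes that never share vocabulary can still land next to each other in the index.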
GBRAIN is the memory layer. This is open-sourced by Garry Tan (the President and CEO of Y Combinator, who built it to run his own agents), and it has become the canonical knowledge brain for the OpenClaw ecosystem. The architecture is elegantly simple: markdown files in a git repo as the source of truth, Postgres with pgvector for hybrid search, and a “compiled truth plus timeline” pattern on every page — your current best understanding above the line, an append-only evidence trail below. Every person, company, conversation, decision, and document flows into it.
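A minimal sketch of the “compiled truth plus timeline” page pattern, assuming a `---` separator between the two halves — the sample page, field layout, and separator are my illustration, not GBRAIN’s actual file convention:

```python
# Sketch of the "compiled truth plus timeline" pattern: current best
# understanding above the line, an append-only evidence trail below.
SAMPLE_PAGE = """# Acme Capital
Current best understanding: mid-market PE fund, supply-chain thesis, Bangkok office.

---

2025-06-03: First meeting; partner argued supply-chain fragility is structural.
2025-09-18: Follow-up call; timeline prediction already looking wrong.
"""

def parse_page(text: str) -> dict:
    """Split a page into compiled truth and its timeline of evidence."""
    truth, _, timeline = text.partition("\n---\n")
    events = [line for line in timeline.strip().splitlines() if line.strip()]
    return {"truth": truth.strip(), "timeline": events}

page = parse_page(SAMPLE_PAGE)
# New evidence never rewrites history -- it only extends the timeline;
# the compiled truth above the line is what gets rewritten.
page["timeline"].append("2026-01-12: Rebrand announced; compiled truth updated above.")
```

The elegance is that git versions the truth while the timeline preserves how you got there, so a page is both a summary and an audit trail.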
The Hardware: Meet Claw Station
Before anyone imagines a cold room full of GPUs — this entire setup runs on a consumer all-in-one that was already sitting on a side table, barely used. I repurposed hardware I already owned. Zero incremental capex.
- Machine: Lenovo Yoga AIO 7 (27ARH7) — a consumer all-in-one, a few years old
- CPU: AMD Ryzen 5 6600H — 6 cores, 12 threads, Zen 3+
- RAM: 16GB LPDDR5 (14GB usable; ~2GB reserved for integrated graphics)
- Storage: 512GB NVMe SSD — currently ~100GB used, 345GB free (about 23% utilization)
- OS: Ubuntu 25.10
- Local model: Qwen 14B, running with roughly 10GB of RAM free at idle
That’s it. No dedicated GPU. No workstation-class tower. No data-centre contract. No new hardware purchase either — this machine was already in the house. A few-year-old consumer all-in-one is quietly running my Dream Cycle every night, indexing my entire memory, embedding thousands of documents, and handling everything I want kept off the cloud — for the cost of electricity. I’m using less than a quarter of the storage and have room to double the model footprint before I’d even think about upgrading.
If a single consumer PC that was sitting idle can host the personal cognitive infrastructure of a CEO running two companies — what exactly is your organization paying enterprise productivity licences for?
The Dream Cycle — Memory That Consolidates Itself
Dreaming is an official OpenClaw feature, not something I invented. It’s a three-phase background process, modeled on how biological sleep consolidates human memory, and it runs on cron every night at 2 AM:
- Light Sleep — ingests the day’s raw signals (conversations, documents, web fetches) and stages candidates.
- REM Sleep — reflects on patterns, builds theme summaries, extracts connections across disparate inputs.
- Deep Sleep — promotes entries that pass a three-gate threshold (minimum score, recall count, query diversity) into long-term memory. Everything else is quietly allowed to fade.
The weighting matters. Only candidates that score above 0.8, have been recalled at least three times, and have surfaced for at least three distinct queries make it into permanent memory. Everything else ages out. This solves the classic knowledge-management failure mode: either everything lands in your notes and they become an unsearchable swamp, or nothing does and you lose the pattern.
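The three-gate rule fits in one predicate. The function name and argument order are mine; the thresholds are the ones quoted above:

```python
# Deep Sleep's promotion rule: an entry earns permanent memory only by
# clearing all three gates at once.
def promote_to_long_term(score: float, recalls: int, distinct_queries: int) -> bool:
    """True only when score, recall count, and query diversity all pass."""
    return score > 0.8 and recalls >= 3 and distinct_queries >= 3

# High-scoring but never recalled: fades rather than cluttering memory.
promote_to_long_term(0.95, recalls=1, distinct_queries=1)   # -> False
# Scored well, recalled often, across varied questions: promoted.
promote_to_long_term(0.85, recalls=5, distinct_queries=4)   # -> True
```

The query-diversity gate is the subtle one: it stops a single obsessive search session from bluffing an entry into permanence.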
Every phase of this cycle now runs on local Qwen 14B. Earlier versions of the setup made occasional cloud calls for the harder reflection steps, and that added pennies per night. Moving the whole thing fully local cut the cost to zero and made the privacy absolute, and the quality held up. Nothing leaves Claw Station.
By the time I wake up, my brain-outside-my-brain has digested the previous day and filed it correctly. When I reference “that guy from the Microsoft piece” or “the PE fund from last Tuesday,” GBRAIN already knows — because it dreamed about it overnight.
This is not a fancier Evernote. This is institutional memory for a single operator, running on hardware I own, with no SaaS vendor in the loop.
What GBRAIN Actually Does
The architecture sounds abstract. The behaviour is not.
A keyword database tells you where the word “consultant” appears. GBRAIN tells you about the consultant who pushed back on your ESG positioning eighteen months ago, even though his name has evaporated from your memory and the meeting notes never used the word “pushback.” The embedding captures the theme of the conversation — external voice, ESG scepticism, structured counter-argument — and surfaces him with follow-up emails and the eventual outcome attached.
Three examples make the difference concrete:
- The “I know we talked about this somewhere” search. I want everyone I have ever discussed contactless retail with in Asia. A keyword search on “contactless” misses the JD.com executive who called it “friction-free checkout,” the Huawei deck about “ambient commerce,” and the mall-tour note that just said “no cashiers, interesting.” GBRAIN links all three because the concept is the same even when the words never line up.
- Cross-language bridging. Working across Asia means a meeting note in Thai, a Slack comment in English, and a legacy document in German can all describe the same thing. Nomic embeds meaning, not vocabulary, so it connects them anyway.
- The person I cannot name. “That PE partner who was sharp on supply-chain fragility but wrong about the timeline.” No name, no firm, no date. GBRAIN surfaces him because the embedding captures the argument, not the label.
And the part that matters for anyone who has tried to build something like this at enterprise scale: I do not manage the vector database. Because Nomic is wired directly into OpenClaw’s write pipeline, every page update triggers embedding, entity extraction, and index maintenance automatically. When a portfolio company rebrands, the new information flows in, the compiled truth is rewritten, the timeline preserves the history, and the vectors refresh themselves — with zero manual intervention, no chunking strategy to tune, no ingestion script to maintain. The database manages itself.
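The self-maintaining index can be sketched as a write hook. The stub embedding and the naive capitalized-word “entity extractor” are placeholders so the sketch is self-contained — they stand in for the real Nomic model and OpenClaw’s actual pipeline:

```python
import hashlib
import re

def stub_embed(text: str) -> list:
    """Deterministic fake vector standing in for a real embedding call."""
    digest = hashlib.sha256(text.encode()).digest()
    return [b / 255 for b in digest[:8]]

def extract_entities(text: str) -> set:
    """Toy entity extraction: capitalized words, nothing more."""
    return set(re.findall(r"\b[A-Z][a-zA-Z]+\b", text))

INDEX = {}  # page path -> {"vector": ..., "entities": ...}

def on_page_write(path: str, text: str) -> None:
    """The write hook: every page update refreshes its own index entry,
    so there is never a separate re-indexing step to run or maintain."""
    INDEX[path] = {"vector": stub_embed(text), "entities": extract_entities(text)}

on_page_write("companies/acme.md", "Acme rebranded to Apex this quarter.")
on_page_write("companies/acme.md", "Apex (formerly Acme) raised funding.")  # vector refreshes in place
```

The design point is that indexing is a side effect of writing, not a job someone owns — which is exactly what kills most enterprise knowledge bases.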
This is the detail most enterprise “AI memory” vendors bury under six layers of managed service and an annual licence. The open-source version is now operator-simple.
Monday 08:00 — The Competitor Report
Every Monday at 8 AM, a Generative Engine Optimization report lands in my inbox before I open my laptop. It analyzes the AI-discoverable footprint of my competitive set. Who’s getting cited in LLM answers this week? Who dropped off? What changed?
Gemini 3 Flash runs the queries across multiple engines. Qwen on Claw Station diffs against last week’s baseline. Gemini 3.1 Pro writes the final narrative — because the framing has to be sharp enough that leadership actually reads it. OpenClaw renders the HTML and sends the email.
No one briefed it. No one assembled the deck. Total cost of running this weekly: a few cents in Gemini tokens.
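The three-stage division of labour can be sketched as plain function composition. The model-call functions here are stand-ins wired as simple callables — competitor names, prompts, and stage boundaries are illustrative, not the actual report code:

```python
# The Monday pipeline as a sketch: each stage runs on the model whose
# cost profile fits the step.
def run_queries(competitors, ask_flash):
    """Flash's job: cheap, high-volume footprint queries across engines."""
    return {c: ask_flash(f"AI-citation footprint for {c}") for c in competitors}

def diff_baseline(current, previous):
    """Local Qwen's job in the real setup: who moved since last week."""
    return {c: (previous.get(c) != result) for c, result in current.items()}

def write_narrative(changes, ask_pro):
    """Pro's job: the one step where framing is visible to the reader."""
    movers = [c for c, changed in changes.items() if changed]
    return ask_pro(f"Narrate this week's movers: {', '.join(movers) or 'none'}")

# Wiring with stub model calls so the flow is visible end to end:
flash = lambda prompt: f"flash:{prompt}"
pro = lambda prompt: f"pro:{prompt}"
current = run_queries(["AcmeCo", "BetaCorp"], flash)
report = write_narrative(diff_baseline(current, {"AcmeCo": current["AcmeCo"]}), pro)
```

Only the stage whose output a human actually reads pays the Pro rate; everything upstream runs at Flash or local prices.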
Sunday Night — The Healthcheck
On Sunday evenings, a different scheduled agent runs a full system security audit: exposed endpoints, credential rotations, dependency updates, skill integrity, API-key drift. If something is wrong, I hear about it immediately. If everything is fine, I do not get an email.
This is how a competent operator actually wants to be informed. Most corporate reporting is the opposite — endless confirmation that nothing is happening, dressed up as accountability.
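“No news is good news” is trivially easy to encode. The check names and the notify hook below are illustrative; the real audit covers endpoints, credential rotation, dependencies, skills, and API keys:

```python
# Alert-on-failure pattern: run every check, message only when something broke.
def run_healthcheck(checks, notify):
    """checks: name -> callable returning True when healthy."""
    failures = [name for name, check in checks.items() if not check()]
    if failures:
        notify(f"HEALTHCHECK FAILED: {', '.join(failures)}")
    return failures  # an empty list means silence, not a confirmation email

sent = []
checks = {
    "endpoints_closed": lambda: True,
    "keys_rotated": lambda: False,   # simulate one drifted credential
}
run_healthcheck(checks, sent.append)
```

Inverting the default — silence when healthy, noise only when broken — is the whole difference between reporting and accountability.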
On Demand — The Rest of the Staff
Beyond the scheduled work, the system is on call for whatever the day throws at it:
- Blog drafting. OpenClaw captures my thinking and scaffolds the structure. Gemini 3.1 Pro handles the first draft. Final polish happens in the Claude app on my Max subscription — where voice matters most. Edit, tighten, publish. Yes — including this one.
- LinkedIn and X posting. Variant generation in three languages, tone calibration, scheduling — Gemini 3 Flash handles the volume, Gemini 3.1 Pro polishes the hero variant.
- Financial report analysis. Drop in an annual report or dataset, get back comparisons, peer benchmarks, red flags.
- Deep-dive strategy sessions. I argue with the machine the way I argue with a smart consultant. It pushes back, surfaces counter-arguments, synthesizes.
Why This Setup Makes Sense Now
The obvious pushback: why not wait for Microsoft or OpenAI to package this?
1. The pieces finally work. Nine months ago, this was duct tape. Today, each component is best-in-class: OpenClaw for orchestration, GBRAIN for memory, Nomic for embeddings, Qwen and Gemini for reasoning. And each is replaceable. I’m not locked into anyone’s ecosystem.
2. Sovereignty. My institutional memory — every person, every conversation, every insight — sits on Claw Station in my office, not in a SaaS database someone else prices, controls, and can revoke. That matters as a CEO. It matters more as a Chief Digital Officer sitting on group strategy.
3. Economics. This is the one most enterprise buyers have not yet internalized. Gemini Flash at thirty cents per million input tokens is not “AI pricing” — it is closer to database-lookup pricing. Local Qwen and local Nomic are free after hardware, and the hardware itself was already in the house. The whole setup costs me less per month than a single seat of most enterprise SaaS tools. For a company paying seven figures a year for productivity software that still requires armies of PAs and analysts to operate it, the arithmetic is brutal.
4. The learning is not optional. If I had waited for the polished enterprise version, I would have learned nothing. The only way to understand where this technology is going — as a leader, not a spectator — is to build with it. The strategic insight is a by-product of the work. You cannot outsource the learning and then expect to lead the transformation.
The Personal Impact
The honest answer on what this gives me: attention.
I no longer spend mornings reconstructing yesterday — the Dream Cycle handled it. I don’t brief myself on Bangkok’s competitive landscape on Monday morning — the report briefed me. I don’t chase who said what three weeks ago — GBRAIN surfaces it instantly.
What’s left is the part only I can do: decide, argue, pitch, build relationships, write in my own voice, and push the organization forward. The same things I was doing before — but now I do them instead of managing the people whose job was to prepare me to do them. My calendar is lighter. My output is heavier. That’s the trade.
And This Is Where It Gets Interesting for Corporates
Everything I just described is a personal deployment. Now extrapolate.
Nvidia’s CEO Jensen Huang said at GTC 2026 that “every company needs an OpenClaw strategy,” and called OpenClaw “the operating system for personal AI.” He’s not wrong, but he’s also not neutral. Nvidia launched NemoClaw at the same GTC: an open-source enterprise wrapper around OpenClaw that adds policy-based privacy guardrails, a privacy router to block unauthorized data egress, and full audit trails. It installs in a single command.
The significance is this: the open-source agentic stack powering my desk is the same stack that, wrapped in NemoClaw, becomes deployable inside a regulated enterprise. A Thai bank, a luxury retail group, a listed conglomerate — any of them can run this architecture with proper guardrails. The question has shifted from “is agentic AI ready for the enterprise?” to “does your organization have a deployment strategy, or are your employees already running shadow agents on their personal laptops?”
Spoiler: your employees are already running shadow agents on their personal laptops. Your security team has not yet noticed.
For every executive reading this, the corporate implications are concrete:
- Your software procurement model is obsolete. The unit economics of Gemini Flash plus open-source orchestration crush traditional SaaS. Stop signing five-year contracts for productivity tools that will be undercut by agents in eighteen months.
- Shadow AI is your biggest near-term risk and your biggest near-term opportunity. Ban it and you lose the learning. Ignore it and you lose control. The answer is NemoClaw-style wrappers, clear policy, and rapid legitimization of what your best people are already doing.
- The delayering accelerates. Hierarchies built to move information up and decisions down are the exact scaffolding that agentic memory replaces. Flatter, expertise-driven organizations — the argument I made in the Executive Reboot — become achievable, not aspirational.
The Choice, Again
I said it in the Executive Reboot: every executive has to decide whether to defend the old model or pick up the tools.
This is what picking up the tools looks like, a few months in. Mostly markdown files, YAML configs, a few-year-old consumer desktop humming quietly at 2 AM, and a handful of models each doing what it does best. Not glamorous. But the compounding effect is real — and it widens the gap every week between those building with this and those still waiting for the vendor roadshow.
Your competitors are already dreaming at night.
Are you?