How I set up a self-improving, always-on personal knowledge base that ingests my blog articles, X activity (posts, articles, likes), news articles from the web, research reports and documents; all without paying $200/mo for X API or $8/mo for ngrok.
What Is GBrain?

GBrain is an open-source personal knowledge brain built by Garry Tan (President & CEO of Y Combinator) to run on top of AI agent platforms like Hermes and OpenClaw. Think of it as a persistent, compounding memory layer for your AI agent: not a chatbot that forgets everything between sessions, but a system that gets smarter every day.
The core idea, also articulated by Andrej Karpathy in his LLM Wiki pattern: instead of re-deriving knowledge from raw documents on every query, the LLM builds and maintains a structured wiki of markdown pages that compounds with every new source you feed it.
What it does in practice:
- Ingests articles, PDFs, tweets, emails, meeting transcripts
- Builds a knowledge graph with typed relationships between entities
- Runs hybrid search (vector + keyword + graph traversal)
- Maintains itself overnight via a dream/maintenance cycle
- Gets smarter on autopilot while you sleep
This guide covers exactly what I did to get it running, including the parts that aren’t in the README.
Prerequisites
- A VPS running Hermes. I use an AWS EC2 t3.medium (any Ubuntu 24 server works) with the Hermes agent already installed and running
- Basic comfort with the terminal
- An X (Twitter) developer account (free tier works for personal use)
Part 1: Installing GBrain
Step 1: Install Bun
GBrain runs on Bun (fast JavaScript runtime). Install it:
curl -fsSL https://bun.sh/install | bash
source ~/.bashrc
which bun # should return /home/ubuntu/.bun/bin/bun
Step 2: Clone and Install GBrain
git clone https://github.com/garrytan/gbrain ~/gbrain
cd ~/gbrain
bun install
bun link
Verify:
gbrain --version
# gbrain 0.22.x
Step 3: Fix the PATH Problem (Critical)
This is the most common issue. Hermes spawns commands in a non-interactive shell that doesn’t inherit your PATH. Fix it by adding Bun to ~/.profile (not .bashrc):
echo 'export PATH="$HOME/.bun/bin:$PATH"' >> ~/.profile
source ~/.profile
Verify Hermes can see it:
bash -c 'which gbrain'
# should return /home/ubuntu/.bun/bin/gbrain
Step 4: Create a Dedicated Brain Repo
Do not use the ~/gbrain directory as your brain repo — that’s the GBrain code itself. Create a separate git repo for your content:
mkdir ~/brain
cd ~/brain
git init
git commit --allow-empty -m "init brain repo"
Step 5: Initialize GBrain
gbrain init
# Uses PGLite by default — no server required, zero config
Step 6: Set Up Auto-Sync Cron
GBrain needs to sync markdown files from your brain repo into its database. Add this to crontab since the --install-cron flag has a known PATH issue:
crontab -e
Add:
*/5 * * * * /home/ubuntu/.bun/bin/gbrain sync --repo /home/ubuntu/brain >> /home/ubuntu/brain/sync.log 2>&1
Step 7: Verify Health
gbrain doctor
GBrain graph coverage warnings (0%) are normal at this stage; they improve as content accumulates.
Part 2: Content Ingestion
How Ingestion Works
It took me some trial and error to figure this out. After reading the README.md, I assumed GBrain expected me to convert my documents into markdown files and ingest them manually. It didn't feel right, and I wasted time on it before realizing it might be better to ask Hermes to ingest files and create the markdown automatically. It worked. I felt a bit silly for not trying that approach first.
GBrain’s CLI handles storage. The intelligence of ingestion lives in Hermes + skills. The flow is:
You tell Hermes: "ingest this article: [URL]"
→ Hermes reads the ingest skill
→ Hermes fetches and processes the content
→ Hermes writes markdown to ~/brain/
→ Cron picks it up every 5 minutes
→ GBrain indexes it
You never need to manually write markdown files.
Ingesting a PDF
Tell your Hermes agent:
“Ingest this PDF: [URL or file path]. Write the brain page to ~/brain/”
Hermes will process it and write structured markdown. The 5-minute cron handles the rest.
Ingesting an Article
“Ingest this article: https://example.com/article. Write to ~/brain/”
Important: Slug Convention
If Hermes writes files to a subdirectory (e.g., ~/brain/notes/article.md), the frontmatter slug: field must match the path. Either:
- Remove the slug: line entirely (GBrain derives it from the path), or
- Set slug: notes/article to match the subdirectory
Mismatched slugs cause sync failures. Clear them with:
> ~/.gbrain/sync-failures.jsonl
gbrain sync --repo ~/brain --skip-failed
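For example, under the second option, a page Hermes saves at ~/brain/notes/article.md would carry frontmatter like this (the title and source fields are illustrative; only the slug line matters for sync):

```markdown
---
slug: notes/article
title: Example Article
source: https://example.com/article
---

Summary and key points extracted from the article go here.
```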
Part 3: Connecting X to Fetch Posts and Reposts
What You Actually Need
- X Developer account (free tier works for your own posts)
- Bearer Token from https://developer.x.com
- Credits on your X developer account ($0.001 per resource = ~$1-2/month for personal use)
Note for non-US developers: X's Basic tier ($200/mo) is not available outside the US. Use the pay-per-use credits model instead: add $25 to start and set auto-recharge.
Step 1: Store Bearer Token in GBrain
gbrain config set secrets.X_BEARER_TOKEN "your_token_here"
Step 2: Get Your Numeric User ID
export X_BEARER_TOKEN=$(gbrain config get secrets.X_BEARER_TOKEN | tr -d '[:space:]')
curl -sf -H "Authorization: Bearer $X_BEARER_TOKEN" \
  "https://api.x.com/2/users/by/username/YOUR_HANDLE" | python3 -m json.tool
# Note your numeric "id", e.g. "15688105"
Step 3: Ask Hermes to Build the Collector
Send this to your Hermes CLI (not a web chat):
"Set up the x-to-brain gbrain integration. My details:
- X handle: @yourhandle
- X user ID: YOUR_NUMERIC_ID
- X_BEARER_TOKEN is already stored in gbrain config
- Brain repo is at ~/brain
- Collect: own posts, likes, reposts only
- Do NOT configure any keyword searches
- No bookmarks for now
- Stagger the cron schedule with my existing crons
- After first collection, ingest the tweets into gbrain as brain pages"
Hermes will write a Python collector script, run the first collection, and attempt to set up the cron.
Step 4: Fix the Cron (Hermes may have missed this)
Hermes may not register the cron due to the PATH issue. Add it manually:
crontab -e
Add (adjust the minutes to avoid clashing with your other crons):
13,43 * * * * X_BEARER_TOKEN=$(/home/ubuntu/.bun/bin/gbrain config get secrets.X_BEARER_TOKEN) /usr/bin/python3 /home/ubuntu/.gbrain/integrations/x-to-brain/x_to_brain.py >> /home/ubuntu/.gbrain/integrations/x-to-brain/collector.log 2>&1
Step 5: Tune the Schedule
Every 30 minutes is overkill for personal use and burns unnecessary API credits. Change to twice daily:
0 8,20 * * * X_BEARER_TOKEN=...
Test It
export X_BEARER_TOKEN=$(gbrain config get secrets.X_BEARER_TOKEN | tr -d '[:space:]')
python3 /home/ubuntu/.gbrain/integrations/x-to-brain/x_to_brain.py
Expected output
{
"own_total": 399,
"likes_total": 0,
"own_new": 7
}
Part 4: OAuth 2.0 for Likes, Without Paying for ngrok
The X developer liked_tweets endpoint requires OAuth 2.0 user context; Bearer tokens alone won’t work. The standard advice is to use ngrok ($8/mo) for the OAuth callback URL. You don’t need to.
Your AWS VPS already has a public IP. Use it directly.
Step 1: Enable OAuth 2.0 in X Developer Portal
Go to https://developer.x.com → your app → “User authentication settings”:
- Enable OAuth 2.0
- Set callback URL to: http://YOUR_VPS_IP:8000/callback
- Note your Client ID and Client Secret
Step 2: Open Port 8000 on AWS
AWS Console → EC2 → Security Groups → Edit inbound rules → Add:
- Type: Custom TCP
- Port: 8000
- Source: 0.0.0.0/0
Important: Close this port after completing the OAuth flow. It’s only needed once.
Step 3: Run the OAuth Flow
Ask your Hermes agent:
"Create a Python script at ~/x-oauth.py that does the X OAuth 2.0 PKCE flow. Details:
- Client ID: YOUR_CLIENT_ID
- Client Secret: YOUR_CLIENT_SECRET (tell Hermes privately)
- Callback URL: http://YOUR_VPS_IP:8000/callback
- Scopes: tweet.read users.read like.read offline.access
- Listens on port 8000 for the callback
- After capturing the token, stores it in gbrain config as secrets.X_USER_ACCESS_TOKEN and secrets.X_USER_REFRESH_TOKEN"
Run it on your VPS:
python3 ~/x-oauth.py
The script will print a URL like:
Open this URL in your browser: https://twitter.com/i/oauth2/authorize?...
Waiting for callback on port 8000...
Open that URL in your laptop browser (not on the VPS). Authorize the app. X redirects to your VPS IP, and the script captures the token and stores it in GBrain config.
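Under the hood, the PKCE handshake the script performs is small. A minimal sketch of the verifier/challenge generation and authorize URL, using the scopes from the prompt above (the S256 method and parameter names follow RFC 7636; the authorize endpoint is the one the script prints):

```python
import base64
import hashlib
import secrets
from urllib.parse import urlencode

def make_verifier() -> str:
    # 32 random bytes -> 43-char base64url string, per RFC 7636
    return base64.urlsafe_b64encode(secrets.token_bytes(32)).rstrip(b"=").decode()

def make_challenge(verifier: str) -> str:
    # S256 method: base64url(sha256(verifier)), no padding
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode()

def authorize_url(client_id: str, redirect_uri: str, challenge: str, state: str) -> str:
    """Build the URL the user opens in their browser."""
    params = {
        "response_type": "code",
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "scope": "tweet.read users.read like.read offline.access",
        "state": state,
        "code_challenge": challenge,
        "code_challenge_method": "S256",
    }
    return "https://twitter.com/i/oauth2/authorize?" + urlencode(params)
```

The verifier stays on the VPS; only the challenge travels in the URL, which is why this flow is safe over a plain-HTTP callback used once.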
Expected token response:
{
"token_type": "bearer",
"scope": "users.read like.read tweet.read offline.access",
"access_token": "...",
"refresh_token": "..."
}
The offline.access scope gives you a refresh token that won't expire.
Step 4: Update the Collector for Likes
Ask Hermes:
"Update the x-to-brain collector script to use the OAuth 2.0 user access token for likes. Load it from gbrain config get secrets.X_USER_ACCESS_TOKEN. If it returns 401, refresh using secrets.X_USER_REFRESH_TOKEN and write the updated tokens back to gbrain config."
Test:
python3 /home/ubuntu/.gbrain/integrations/x-to-brain/x_to_brain.py
Expected:
{
"own_total": 399,
"likes_total": 396,
"likes_new": 396
}
Step 5: Close Port 8000
OAuth is done. Close the port:
AWS Console → EC2 → Security Groups → Edit inbound rules → delete the port 8000 rule.
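For reference, the token refresh that keeps the likes collector working (Step 4) boils down to a single POST to X's OAuth 2.0 token endpoint. A minimal sketch; treating your app as a confidential client with HTTP Basic auth is an assumption about your app settings:

```python
import base64
import json
import urllib.parse
import urllib.request

TOKEN_URL = "https://api.x.com/2/oauth2/token"

def build_refresh_request(client_id: str, client_secret: str, refresh_token: str):
    """Return (headers, body) for a standard OAuth 2.0 refresh_token grant."""
    basic = base64.b64encode(f"{client_id}:{client_secret}".encode()).decode()
    headers = {
        "Authorization": f"Basic {basic}",
        "Content-Type": "application/x-www-form-urlencoded",
    }
    body = urllib.parse.urlencode({
        "grant_type": "refresh_token",
        "refresh_token": refresh_token,
        "client_id": client_id,
    }).encode()
    return headers, body

def refresh(client_id: str, client_secret: str, refresh_token: str) -> dict:
    headers, body = build_refresh_request(client_id, client_secret, refresh_token)
    req = urllib.request.Request(TOKEN_URL, data=body, headers=headers, method="POST")
    with urllib.request.urlopen(req) as resp:
        # Response carries a new access_token and a new refresh_token;
        # write both back to gbrain config so the next refresh works too.
        return json.load(resp)
```
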
Final Architecture
Your X Activity
    ↓  (8 AM and 8 PM daily)
x_to_brain.py collector
    ├── Bearer token → own posts + reposts
    └── OAuth 2.0 token → likes (auto-refreshes)
    ↓
~/brain/ markdown files
    ↓  (every 5 minutes)
gbrain sync cron
    ↓
GBrain database (PGLite)
    ↓
Hermes agent queries brain → full context on every response
Crontab Summary
Here’s what my full crontab looks like after everything is set up:
# GBrain sync — every 5 minutes
*/5 * * * * /home/ubuntu/.bun/bin/gbrain sync --repo /home/ubuntu/brain >> /home/ubuntu/brain/sync.log 2>&1
# X collector — 8 AM and 8 PM daily
0 8,20 * * * X_BEARER_TOKEN=$(/home/ubuntu/.bun/bin/gbrain config get secrets.X_BEARER_TOKEN) /usr/bin/python3 /home/ubuntu/.gbrain/integrations/x-to-brain/x_to_brain.py >> /home/ubuntu/.gbrain/integrations/x-to-brain/collector.log 2>&1