I just open-sourced Habit Sprint - a different take on habit tracking that works great with OpenClaw.
It’s not a checklist app with a chat wrapper on top.
It’s an AI-native engine that understands:
- Weighted habits
- “Don’t break the chain” streak logic
- Sprint scoring
- Category tradeoffs
- And how those things interact
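The streak logic can be sketched in a few lines (a hypothetical helper, not the engine's actual code): a chain is just the run of consecutive logged days ending today.

```python
from datetime import date, timedelta

def current_streak(logged_days: set[date], today: date) -> int:
    """Count consecutive days, ending today, on which the habit was logged."""
    streak = 0
    day = today
    while day in logged_days:
        streak += 1
        day -= timedelta(days=1)
    return streak

days = {date(2024, 5, 1), date(2024, 5, 2), date(2024, 5, 3)}
current_streak(days, date(2024, 5, 3))  # 3 - miss a day and the chain resets
```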
The idea started in 2012 with a simple spreadsheet grid to track daily habits.
In 2020, I borrowed the two-week sprint cycle from software development and applied it to personal growth.
Two weeks feels like the sweet spot:
- Long enough to build momentum
- Short enough to course-correct
- Built-in retrospective at the end
What’s new now is the interface.
You interact in plain language:
- “I meditated and went to the gym today.”
- “Log 90 minutes of deep work.”
- “How consistent have I been this week?”
- “Which category is dragging my score down?”
- “Let’s run a habit retro.”
The model translates that into validated engine actions and returns clean markdown dashboards, sprint summaries, streak tracking, and retrospectives.
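As a sketch of that translation step (action names and fields here are illustrative assumptions, not the real schema): the LLM emits a small JSON action, and the engine validates it before anything touches the database.

```python
# Hypothetical action schema - the engine only accepts known action types
# with the exact fields it expects; anything else is rejected up front.
ALLOWED_ACTIONS = {
    "log_habit":    {"habit", "date"},
    "log_minutes":  {"habit", "date", "minutes"},
    "weekly_report": {"week"},
}

def validate_action(action: dict) -> dict:
    kind = action.get("type")
    if kind not in ALLOWED_ACTIONS:
        raise ValueError(f"unknown action: {kind!r}")
    missing = ALLOWED_ACTIONS[kind] - action.keys()
    if missing:
        raise ValueError(f"{kind} missing fields: {sorted(missing)}")
    return action

validate_action({"type": "log_habit", "habit": "meditation", "date": "2024-05-03"})
```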
Under the hood:
- Habits have weights based on behavioral leverage
- Points accumulate based on weekly targets and consistency
- Streaks are automatic
- Two-week sprints support themes and experiments
- Strict JSON contract between LLM and engine
- Lightweight Python + SQLite backend
- Structured SKILLS.md teaches the LLM the action schema
The user never sees JSON. The assistant becomes the interface.
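A minimal sketch of how weights and weekly targets could combine into a score (field names are assumptions; the real engine may differ): each habit contributes its weight times its completion ratio, capped so overshooting one habit can't mask neglecting another.

```python
def sprint_score(habits: dict[str, dict]) -> float:
    """Weighted consistency score out of 100: weight x completion ratio,
    capped at 1.0 per habit so extra reps don't offset skipped habits."""
    total = sum(h["weight"] for h in habits.values())
    earned = sum(
        h["weight"] * min(h["done"] / h["target"], 1.0)
        for h in habits.values()
    )
    return round(100 * earned / total, 1)

habits = {
    "meditation": {"weight": 3, "target": 7, "done": 5},
    "gym":        {"weight": 2, "target": 3, "done": 3},
}
sprint_score(habits)  # 82.9
```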
It works as an LLM skill for Claude Code, OpenClaw, or any agent that supports structured tool calls.
I’m really interested in what AI-native systems look like when the traditional “app UI” fades away and the assistant becomes the operating layer.
I’ve been working on a lightweight workflow that sits on top of Claude Code and Vibe Kanban to make AI-assisted development more structured and less fragile over time.
It’s not a product or UI. It’s a set of slash commands that run inside Claude Code and use Vibe Kanban (via MCP) as the persistent coordination layer.
The core flow is:
- PRD review with clarifying questions (optional PRD generation)
- Development plan with epics, task dependencies, complexity, and acceptance criteria
- Bidirectional sync with Vibe Kanban (drift detection, dependency violations)
- Task execution with full context (PRD + plan + AC + codebase)
Most of this has been exercised heavily in a single-task, human-in-the-loop model.
Recently I started experimenting with parallel execution, using full agent sessions in isolated git worktrees (and optional delegation to VK workspace sessions). Early results are promising: small batches of independent tasks complete much faster, while still stopping on conflicts and keeping humans in the loop for merges and judgment calls.
The main idea is treating task systems as memory and governance for agents, not just tracking, and making parallelism dependency-aware rather than optimistic.
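Dependency-aware parallelism of this kind can be sketched with Python's standard graphlib: group tasks into batches where each batch depends only on earlier batches, so everything within a batch is safe to hand to parallel sessions (task names here are made up).

```python
from graphlib import TopologicalSorter

def dependency_batches(deps: dict[str, set[str]]) -> list[set[str]]:
    """Group tasks into batches; each batch only depends on earlier
    batches, so a whole batch can run in parallel agent sessions."""
    ts = TopologicalSorter(deps)
    ts.prepare()
    batches = []
    while ts.is_active():
        ready = set(ts.get_ready())  # all tasks whose deps are done
        batches.append(ready)
        for task in ready:
            ts.done(task)
    return batches

deps = {"deploy": {"build", "tests"}, "tests": {"build"}, "build": set()}
dependency_batches(deps)  # [{'build'}, {'tests'}, {'deploy'}]
```

This is the "dependency-aware rather than optimistic" part: a task never enters a batch until every task it depends on has completed.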
Docs include architecture notes, comparisons with other agent frameworks, and a cookbook with concrete workflows.
I’ve been experimenting with running AI agents fully sandboxed (Linux containers / VMs), specifically while configuring and testing OpenClaw (Clawdbot). One issue I kept hitting is that many existing skills for macOS apps like Reminders or Messages assume the agent runs directly on the host and are very permissive in what they allow.
That felt like the wrong security model.
So last weekend I built Mac Agent Gateway, a small open-source project that acts as a local macOS gateway for agents.
The approach is:
- agents stay sandboxed (Linux, VM, container, or remote host)
- a small service runs locally on macOS
- the service exposes a tightly scoped HTTP API that agents access via skills
This allows sandboxed agents to safely interact with Apple apps that are normally restricted to macOS, without giving the agent shell access or broad system permissions.
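The allow-list pattern can be sketched as a loopback-only HTTP service that dispatches only named actions (handlers and action names here are illustrative, not the gateway's real API):

```python
# Illustrative sketch of the allow-list gateway pattern, not the real code:
# only explicitly named actions dispatch; everything else gets a 403.
from http.server import BaseHTTPRequestHandler, HTTPServer
import json

ALLOWED = {
    "reminders.create": lambda p: {"ok": True, "title": p["title"]},
    "messages.recent":  lambda p: {"ok": True, "days": p.get("days", 7)},
}

class Gateway(BaseHTTPRequestHandler):
    def do_POST(self):
        body = json.loads(self.rfile.read(int(self.headers["Content-Length"])))
        handler = ALLOWED.get(body.get("action"))
        if handler is None:  # anything not allow-listed is rejected
            self.send_response(403)
            self.end_headers()
            return
        result = json.dumps(handler(body.get("params", {}))).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(result)

# Bind to loopback only, so nothing off-host can reach it:
# HTTPServer(("127.0.0.1", 8787), Gateway).serve_forever()
```

There is no generic "run this" endpoint: the agent can only ask for actions that were allow-listed ahead of time, and macOS TCC still gates each app behind its own permission prompt.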
Current support includes Reminders and Messages. One concrete example: a sandboxed agent can review the last 1–2 weeks of messages, identify what’s important or unanswered, and create follow-up reminders with full context using a reasoning model.
Security-wise, the design is intentionally conservative:
- local-only HTTP interface
- explicitly allow-listed actions
- no shell access
- no filesystem access
- macOS TCC permissions remain enforced
I’ve tested this so far with OpenClaw and Claude, but the design should work with any agent framework that supports a SKILLS.md-style integration.
Curious what people think. Would love feedback.