It’s been a busy couple of weeks. Three of my open-source tools got meaningful updates, and I stumbled into something that changes how I think about model access. Here’s the roundup.
tether-cli v1.11: Machine Profiles
Tether syncs your dotfiles, packages, and project secrets across machines via an encrypted Git repo. The problem it never solved well: what happens when your work laptop and personal machine need different configurations from the same repo?
v1.11 introduces machine profiles. Each machine gets assigned a named profile (e.g., “work”, “personal”), and dotfiles are stored under profiles/<name>/ in the sync repo. Shared files live in profiles/shared/ and apply everywhere.
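As a sketch, a sync repo using profiles might look like this (file names are illustrative, not prescribed by tether):

```
profiles/
├── shared/          # applies to every machine
│   └── .gitconfig
├── work/
│   ├── .zshrc       # sources corporate VPN scripts
│   └── Brewfile
└── personal/
    ├── .zshrc
    └── Brewfile
```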
- Per-machine dotfile control - your work `.zshrc` sources corporate VPN scripts, your personal one doesn’t
- Per-machine package lists - Homebrew formulas that only make sense on one machine stay there
- Automatic migration - existing repos with the old flat `dotfiles/` layout migrate on first sync
- History and restore - `tether history` shows file changes over time, `tether restore git` rolls back to any previous version
The TUI dashboard also got a history viewer, inline config editing, and better file grouping by personal vs. team dotfiles.
Profiles touched enough edge cases that v1.11.0 through v1.11.7 shipped within 48 hours. New machines joining flat repos, profile bleed-through between machines, legacy cleanup timing. The kind of things you only find when real configs hit real machines.
claude-launcher v0.4: NVIDIA NIM Backend
claude-launcher lets you run Claude Code against alternative model backends. v0.4 adds NVIDIA NIM as the fourth backend, alongside Anthropic, OpenRouter, and Ollama.
The interesting engineering here is the translation proxy. Claude Code speaks the Anthropic message format (content blocks, tool_use, tool_result). NIM speaks OpenAI format. claude-launcher now runs a local HTTP proxy that translates between the two in real-time, handling both streaming SSE and non-streaming responses.
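A minimal sketch of the request-side translation, assuming plain-text content blocks only (field names follow the two public API formats; the real proxy also maps `tool_use`/`tool_result` blocks and re-chunks streaming SSE events):

```python
def anthropic_to_openai(payload: dict) -> dict:
    """Translate an Anthropic Messages API request into OpenAI chat format.

    Simplified sketch: handles text content blocks only.
    """
    messages = []
    # Anthropic carries the system prompt as a top-level field;
    # OpenAI expects it as the first message in the list.
    if payload.get("system"):
        messages.append({"role": "system", "content": payload["system"]})
    for msg in payload["messages"]:
        content = msg["content"]
        if isinstance(content, list):  # flatten content blocks to a string
            content = "".join(
                block["text"] for block in content if block.get("type") == "text"
            )
        messages.append({"role": msg["role"], "content": content})
    return {
        "model": payload["model"],
        "messages": messages,
        "max_tokens": payload.get("max_tokens", 1024),
        "stream": payload.get("stream", False),
    }
```

The reverse direction (OpenAI completion back into Anthropic content blocks) is the mirror image, with the extra wrinkle that streaming deltas have to be re-wrapped as Anthropic event types.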
Other v0.4 changes:
- Ollama tool filtering - auto-detects which local models support tool use via `/api/show` and hides the rest
- Role models across all backends - configure separate models for sonnet/opus/haiku task tiers per backend
- New model alerts - tracks seen OpenRouter models and notifies you when new ones appear
claude-tools: Monday.com, Maestro, and Simplification
claude-tools is a plugin marketplace for Claude Code. Three changes worth noting:
- Monday.com plugin - a new GraphQL-based agent for managing Monday.com boards and tasks directly from Claude Code
- Mobile testing: two paradigms - the plugin now has both Appium and Maestro, each doing what it’s good at. Appium powers the AI-driven agents (`mobile:test` and `mobile:parity`) where Claude takes screenshots, reads UI hierarchy, and acts like a human tester. Maestro powers the declarative dev loop (`mobile:test-runner` and the `/dev` command) with YAML flow files, file watching, and hot reload. I initially replaced Appium with Maestro entirely, then realized they solve different problems and restored both
- WhatsApp search - a new plugin that queries the native macOS WhatsApp client’s SQLite database directly. Search messages by contact, keyword, or date range. Find and display media. No API keys, no third-party services: just `sqlite3` against `ChatStorage.sqlite`
- Headless browser simplification - deleted the custom TypeScript browser library (136 lines) and now relies entirely on the `agent-browser` CLI. Less code, same functionality
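To make the WhatsApp approach concrete, here is a minimal keyword search, assuming the Core Data table and column names (`ZWAMESSAGE`, `ZTEXT`) typically found in `ChatStorage.sqlite` - verify the schema against your own copy before relying on it:

```python
import sqlite3

# Assumed macOS location; confirm on your machine before use.
DB_PATH = "~/Library/Group Containers/group.net.whatsapp.WhatsApp.shared/ChatStorage.sqlite"

def search_messages(conn: sqlite3.Connection, keyword: str, limit: int = 20):
    """Return (rowid, text) pairs for messages containing the keyword."""
    cur = conn.execute(
        "SELECT Z_PK, ZTEXT FROM ZWAMESSAGE "
        "WHERE ZTEXT LIKE ? ORDER BY Z_PK DESC LIMIT ?",
        (f"%{keyword}%", limit),
    )
    return cur.fetchall()
```

Open the database read-only (`sqlite3.connect(f"file:{path}?mode=ro", uri=True)`) so you never write to the live client’s store.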
The pattern across these changes: finding the right tool for each job instead of forcing one abstraction to do everything.
The NVIDIA NIM Surprise
This is the thing that caught my attention most. NVIDIA is hosting GLM-5 on their NIM platform with a free tier of 40 requests per minute.
GLM-5 is Z.ai’s latest: 744B total parameters (40B active via MoE), 205K context window, MIT license. The benchmarks are genuinely impressive for an open model: 92.7% on AIME 2026, 86% on GPQA-Diamond, 77.8% on SWE-bench Verified.
You can point claude-launcher at a frontier-class open model with 205K context and pay nothing. The 40 RPM limit is fine for development work.

The practical takeaway
This is why the NIM backend in claude-launcher matters. You sign up for an NVIDIA developer account, grab an API key, and you’re running Claude Code’s interface against a 744B model for free. No credit card, no usage-based billing anxiety.
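If you want to poke at the endpoint directly before wiring up claude-launcher, NIM speaks the OpenAI chat-completions format. A stdlib-only sketch - the model slug here is a guess, so check NVIDIA’s catalog for the exact id:

```python
import json
import urllib.request

NIM_URL = "https://integrate.api.nvidia.com/v1/chat/completions"  # OpenAI-compatible

def nim_request(api_key: str, prompt: str, model: str = "zai/glm-5"):
    """Build a chat-completion request for NVIDIA's hosted NIM endpoint.

    The default model id is illustrative -- look up the real slug
    in the NIM catalog before sending.
    """
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 512,
    }).encode()
    return urllib.request.Request(
        NIM_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )

# To send: urllib.request.urlopen(nim_request(key, "hello"))
# Remember the free tier caps you at 40 requests per minute.
```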
The weights are also on Hugging Face under MIT, so you can self-host if you have the hardware. But the NIM free tier removes the “I need a beefy GPU” barrier entirely.
What’s Next
tether-cli profiles need more time in the wild before I’m confident in edge cases. claude-launcher’s proxy needs battle-testing against more NIM models as they add them. And claude-tools keeps growing as I find more services I want Claude Code to talk to.
The broader trend: the gap between “I need a specific API key from one provider” and “I can run this toolchain against whatever model makes sense” keeps shrinking. That’s the whole point of building this stuff.