Agent Skills

After seeing an ex-colleague release details of their AI second brain, I was curious. Not because I wanted to outsource the process of note-taking, but because I wanted to know how they had any confidence in this process at all.

A quick look at their Git repository showed very little code beyond some Python helper scripts for requesting page summaries from Gemini. I had questions:

  • Why did they require Google’s Antigravity to run the workflow?
  • Why was there no coordination code?
  • Why couldn’t I use a local model for this?

At the heart of the workflow are Agent Skills.

“A simple, open format for giving agents new capabilities and expertise.”

They are, in short, folders of text files. And so I wondered, how would I build something that uses a skill’s instructions to perform tasks? How would I get a model to chain skills together? And so I built something. It’s crude but it works. I don’t intend to use it, but it was a useful exercise in understanding “multi-agent” systems.
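To make "folders of text files" concrete, here is a minimal sketch of loading skills from disk. It is written in Python for illustration and is not the tool I built. It assumes each skill lives in its own folder containing a SKILL.md file with simple `name` and `description` front matter, which is my understanding of the Agent Skills format; the `Skill` class and the `parse_skill`, `load_skills` and `skill_index` helpers are hypothetical names.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass
class Skill:
    name: str
    description: str
    instructions: str  # body of SKILL.md after the front matter


def parse_skill(text: str, fallback_name: str) -> Skill:
    """Split a SKILL.md file into simple 'key: value' front matter and a body."""
    meta: dict[str, str] = {}
    body = text
    if text.startswith("---"):
        # Assumption: front matter sits between the first two '---' delimiters.
        parts = text.split("---", 2)
        if len(parts) == 3:
            header, body = parts[1], parts[2]
            for line in header.strip().splitlines():
                key, sep, value = line.partition(":")
                if sep:
                    meta[key.strip()] = value.strip()
    return Skill(
        name=meta.get("name", fallback_name),
        description=meta.get("description", ""),
        instructions=body.strip(),
    )


def load_skills(root: Path) -> dict[str, Skill]:
    """Load every <root>/<folder>/SKILL.md into a name -> Skill mapping."""
    skills: dict[str, Skill] = {}
    for path in sorted(root.glob("*/SKILL.md")):
        skill = parse_skill(path.read_text(encoding="utf-8"), path.parent.name)
        skills[skill.name] = skill
    return skills


def skill_index(skills: dict[str, Skill]) -> str:
    """Short listing a model could be shown instead of every skill's full text."""
    return "\n".join(f"- {s.name}: {s.description}" for s in skills.values())
```

Showing the model only the index, and loading a skill's full instructions when it is actually needed, keeps the context small.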

Some lessons from the exercise:

  • maintaining a single context throughout the task results in chaos
    • appending each skill as it is required results in confusion
    • the instructions for a skill can trigger that skill to be used again
  • keeping track of task state in the context is unreliable
  • agents probably shouldn’t share the same context
  • infinite loops are easily triggered
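One way to contain the last few problems is to keep the task state in ordinary code and put a hard limit on the loop. The sketch below is an illustration in Python, not the code I wrote. `next_skill` stands in for whatever asks the model which skill to run next, and the step budget and the "same skill twice in a row" check are assumed heuristics.

```python
from typing import Callable, Optional


class LoopError(RuntimeError):
    """Raised when the agent loop looks stuck."""


def run_guarded(
    next_skill: Callable[[list[str]], Optional[str]],
    max_steps: int = 8,
) -> list[str]:
    """Call next_skill until it returns None, tracking state outside the model.

    The history list lives in code, not in the model's context, so the
    record of what has already run cannot be garbled or forgotten.
    """
    history: list[str] = []
    while True:
        choice = next_skill(history)
        if choice is None:
            return history  # the model reports that the task is finished
        if len(history) >= max_steps:
            raise LoopError(f"step budget of {max_steps} exhausted")
        if history and history[-1] == choice:
            # Heuristic: a skill whose instructions re-trigger itself.
            raise LoopError(f"skill {choice!r} requested twice in a row (repeat)")
        history.append(choice)
```

Failing loudly here is deliberate: a stuck loop is easier to debug from an exception than from a context that has silently grown for fifty turns.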

I particularly liked this from Gemini:

“To achieve that logical, human-like execution you described—where we build a plan once and step through it reliably—we should consider shifting the Control Plane from the LLM to the Go code.”

Getting the LLM to spit out a plan of action in a machine-readable format and then using traditional code to coordinate the execution of that plan was far more reliable than trying to do everything in a single agent loop. Passing each sub-agent only its associated skill and a minimal context also yielded far more predictable results.
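A minimal sketch of that split, again in Python for illustration and not my actual code. The JSON plan shape (`steps`, each with a `skill` and a `goal`) and the `run_sub_agent` callable are assumptions made up for this example. In practice `run_sub_agent` would wrap a model call that starts from an empty context.

```python
import json
from typing import Callable

# Hypothetical signature: (skill_instructions, step_input) -> output text.
SubAgent = Callable[[str, str], str]


def parse_plan(raw: str, known_skills: set[str]) -> list[dict]:
    """Parse the planner's JSON and reject steps that name unknown skills."""
    steps = json.loads(raw)["steps"]
    for i, step in enumerate(steps):
        if step["skill"] not in known_skills:
            raise ValueError(f"step {i} names unknown skill {step['skill']!r}")
    return steps


def execute_plan(
    steps: list[dict],
    instructions: dict[str, str],
    run_sub_agent: SubAgent,
    task: str,
) -> str:
    """Run each step exactly once, in order, under the control of plain code.

    Every sub-agent call receives only its own skill's instructions plus the
    step goal and the previous step's output, never the whole conversation.
    """
    current = task
    for step in steps:
        step_input = f"{step['goal']}\n\n{current}"
        current = run_sub_agent(instructions[step["skill"]], step_input)
    return current
```

Because the loop is a plain `for` over a fixed list, it cannot run forever, and a skill cannot re-trigger itself unless the plan says so.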

It seems obvious with hindsight, but a good rule of thumb is to give each agent its own context.

I don’t intend to release or even use the code for this, but it was an interesting exercise to help me understand what might be going on behind these multi-agent tools.