How Batonship Works
Real challenges. Real AI tools. Quantified skill measurement.
1. Face realistic challenges
Work in a full development environment with modern AI tools. Fix bugs, implement features, adapt to changing requirements—the same work you do every day.
2. Get scored across 5 dimensions
We measure how you work, not just what you produce. Your approach to directing AI matters as much as your outcome.
3. Earn your Batonship Score
Receive a quantified score with percentile ranking. Share your verifiable certification with employers and on LinkedIn.
Process reveals sustainable skill.
Two developers can arrive at the same outcome. One gets there through genuine mastery: clear direction, effective context, thoughtful verification. The other gets lucky.
Outcome-only assessment can't tell the difference. Batonship can.
We measure your process alongside your outcome because sustainable skill matters more than a single successful result.
"The skills you demonstrate are the skills you'll bring to your next job."
Five dimensions of AI-era engineering skill.
Effective AI-augmented development isn't a single skill. We measure the five behaviors that define great engineers in the AI era.
Clarity
How precisely you decompose problems and communicate requirements.
What Good Looks Like
- Clear constraints and requirements stated upfront
- Specific details rather than vague descriptions
- Unambiguous direction that AI can act on effectively
Context
How effectively you provide the information AI needs to be useful.
What Good Looks Like
- Relevant file references when applicable
- Error logs and stack traces included when debugging
- Related context that helps AI understand the full picture
Orchestration
How well you direct tools and coordinate your workflow.
What Good Looks Like
- Exploring before editing (understanding the codebase first)
- Using the right tool for each task
- Clear delegation and direction of AI agents
Verification
How thoroughly you validate AI output before shipping.
What Good Looks Like
- Running tests after significant changes
- Reviewing code before accepting AI suggestions
- Catching potential issues before they propagate
Adaptability
How you respond when requirements change or approaches fail.
What Good Looks Like
- Acknowledging changes and re-orienting your approach
- Preserving working progress when possible
- Communicating updated plans clearly
Transparency that builds skill, not gaming.
We're transparent about what we measure and what great AI collaboration looks like. We're intentionally private about exact scoring mechanics.
Why? Because the only way to improve your score is to actually improve your skills.
What We Share
- The five dimensions we measure
- What good looks like in each dimension
- That process matters alongside outcome
What We Protect
- Exact scoring formulas
- Specific signal weights
Gaming = Improvement
If you learn our dimensions and start providing better context, running tests more consistently, and verifying outputs more carefully, you haven't "gamed" us. You've become genuinely better at AI collaboration.
"The only way to score well is to actually be good at working with AI."
Challenge Types
Our assessments mirror the real work of engineering.
Broken Repo Debugging
Drop into a messy codebase with subtle bugs. Use AI to explore the code, understand the failures, and fix them.
Feature Implementation
Implement new functionality in an existing codebase with constraints and requirements.
Requirement Injection
Mid-challenge, requirements change. Adapt your approach while preserving progress.
AI Code Review
Review AI-generated code. Spot bugs, suggest improvements, ensure quality.
Architecture Planning
Design system architecture and implement key components with AI assistance.
The standard is being set now.
Join the engineers and hiring teams defining AI-era development.