11 Metrics Distributed Engineering Teams Should Stop Tracking in 2026

Engineering metrics have a Goodhart problem: any metric that becomes a target stops being a useful measure. This is especially acute in distributed teams, where managers reach for metrics to compensate for the visibility they used to get through proximity. The result is dashboards full of numbers that nobody actually trusts but everyone optimizes against. The team is busy hitting the metrics, and nobody is willing to say out loud that the metrics no longer correspond to anything that matters.

The eleven metrics below are widely tracked, widely gamed, and rarely useful. Stop tracking them. The space cleared on the dashboard can be filled with metrics that actually correlate with team health and shipping velocity — a few of which we'll name at the end.

1. Lines of code

The most famous bad metric, and somehow still on dashboards in 2026. Lines of code measures keyboard activity, not value created. A senior engineer who deletes ten thousand lines of dead code and replaces them with two hundred lines of cleaner logic produces negative LOC and positive engineering value. Tracking LOC rewards verbose code and punishes refactoring. It has no defenders among engineers and only a few among managers who haven't examined the metric closely.

2. Commits per day

Slightly less obviously bad than LOC, but the same failure mode. Commits per day rewards engineers who structure their work into many small commits and punishes those who work in larger logical units. It encourages performative commits — engineers committing trivial changes to keep the number up. It also penalizes engineers doing exploratory work where a long thinking session might produce one careful commit at the end. The metric trains the team in commit theater.

3. PRs per engineer per week

This metric rewards engineers who break work into smaller PRs and punishes those who ship larger, more coherent units. Some PRs should be small; some should be large; the choice is contextual. A weekly PR count creates pressure to artificially shrink work into ship-sized chunks regardless of whether the chunks make architectural sense. It also punishes engineers whose work is gnarly, like infrastructure migrations or hard bug fixes, where one PR may represent two weeks of investigation.

4. Standup attendance rate

Tracking who attends the daily standup measures presence, not productivity. The engineer who attends every standup and ships nothing is rated highly; the engineer who misses standups because they're deep in focused work is rated poorly. The metric inverts the relationship between observation and value. Worst case: it leaks into performance reviews and the team learns that attendance matters more than output.

5. Slack response time

Response time to Slack messages measures availability, not effectiveness. Engineers who optimize for fast Slack response do so by keeping Slack constantly open and breaking focus to respond, which makes them feel responsive while making them less productive. The metric rewards exactly the behavior most associated with shallow work and burnout. Distributed teams should be deliberately reducing real-time Slack expectations, not measuring engineers against them.

6. Hours worked

The metric that managers use when they don't trust outcomes. In knowledge work, time spent doesn't correlate with value created, and the correlation is sometimes negative, because tired engineers produce buggy code that takes longer to fix than it took to write. Tracking hours teaches engineers to perform working hours rather than actually work. It also produces survivorship bias: engineers who can't or won't perform long hours leave, and the remaining team's appearance of high engagement is really a selection effect.

7. Calendar utilization

"How much of your calendar is in meetings?" is sometimes tracked as a meeting overhead metric, but it gets used backwards — as a sign that engineers with light calendars are underutilized. This rewards engineers who fill their calendars with meetings and punishes those who protect time for deep work. Deep work is where the value is created. The metric is exactly inverted from what produces team output.

8. Story points completed

Story points were supposed to be a planning tool, not a productivity metric. When teams track story points completed per engineer per sprint, the metric promptly inflates: engineers estimate larger, plan smaller, and the relationship between points and effort dissolves. Within a few sprints, "team velocity" is a fiction that everyone references but nobody trusts. The original planning utility is destroyed by the measurement use.

9. Code review turnaround time, measured per individual

Speed of code review matters. Measuring it per individual creates pressure to review fast rather than review well. Engineers learn to rubber-stamp PRs to keep their review-time metrics down. The team's code quality declines invisibly because rigorous review now looks like underperformance on the metric. Measure team-level review latency if you must, but never individual.

10. Number of meetings attended

Meetings attended is a measure of how much the team interrupts itself. Tracking it as a positive metric — "high collaboration!" — rewards exactly the wrong behavior. Tracking it as a negative metric is fine but rarely actionable, because the engineers in the most meetings are usually senior people whose presence is genuinely demanded. The metric provides little signal that a team retrospective wouldn't surface more usefully.

11. Issues closed per engineer

Issues closed measures throughput of ticket flow, which engineers can game by choosing easy tickets, splitting work into multiple tickets, or closing tickets prematurely. The metric rewards engineers who optimize for ticket counts and punishes engineers who take on hard, gnarly work that doesn't fit cleanly into a single ticket. Real engineering complexity routinely fails to map to ticket boundaries.

What to track instead

The metrics that actually correlate with shipping velocity: cycle time from idea to production (team-level, not individual), incident rate over the past 30 days, time to recover from incidents, customer-reported issue rate, and the percentage of work that's planned versus reactive. For team health, track 1:1 cadence consistency and retention. None of these can be gamed the way the eleven metrics above can, because they measure outcomes rather than activity. They also resist individual surveillance because they're properties of teams and systems, not engineers.
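As a rough illustration, two of the metrics named above (team-level cycle time and the planned-versus-reactive split) can be computed from a plain list of work items. This is a minimal sketch, not a prescribed implementation: the record layout, the team names, the dates, and the `team_summary` helper are all hypothetical, and "started" stands in for whatever your tracker treats as the moment an idea became work. The records carry no author field, so the output cannot be sliced per engineer.

```python
from datetime import datetime
from statistics import median

# Hypothetical work items: (team, started, shipped, kind).
# No author is recorded, so results stay at the team level by construction.
ITEMS = [
    ("payments", "2026-01-05", "2026-01-14", "planned"),
    ("payments", "2026-01-07", "2026-01-12", "reactive"),
    ("payments", "2026-01-10", "2026-01-24", "planned"),
    ("platform", "2026-01-03", "2026-01-06", "reactive"),
]


def days_between(start, end):
    """Whole days between two ISO dates (YYYY-MM-DD)."""
    fmt = "%Y-%m-%d"
    return (datetime.strptime(end, fmt) - datetime.strptime(start, fmt)).days


def team_summary(items, team):
    """Median cycle time and share of planned work for one team."""
    rows = [r for r in items if r[0] == team]
    if not rows:
        return None  # nothing shipped by this team in the data
    cycle_times = [days_between(r[1], r[2]) for r in rows]
    planned = sum(1 for r in rows if r[3] == "planned")
    return {
        "median_cycle_days": median(cycle_times),
        "planned_share": planned / len(rows),
    }


summary = team_summary(ITEMS, "payments")
print(summary["median_cycle_days"], round(summary["planned_share"], 2))
```

The median is used rather than the mean so that one unusually long item does not dominate the number. That is a design choice for this sketch, not something the metric requires.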

Frequently asked questions

How do you handle a leadership team that demands activity metrics?

Provide outcome metrics with clear narratives about what they tell you. "Cycle time dropped from 14 days to 9 days last quarter — here's what changed" is more compelling than "engineer X did 47 PRs and engineer Y did 32." Most leadership teams want signal, not surveillance, and accept better metrics when offered. The cases where leadership specifically demands individual activity metrics are usually a signal that broader trust is broken — and the activity dashboard is the symptom, not the cause.

What about DORA metrics?

DORA metrics (deployment frequency, lead time, change failure rate, mean time to recovery) are team-level outcome metrics, and they're generally useful. They become problematic only when applied to individual engineers, at which point they get gamed the same way every other individual metric does. Track them at the team or service level and they tell you something real.
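To make "team or service level" concrete, here is a small sketch that derives three of the four numbers from a single service's deployment log. The log format, the sample values, and the `dora_summary` function are assumptions made up for illustration. Lead time is left out because it needs commit timestamps that this toy log does not include.

```python
# Hypothetical deployment log for ONE service over a 14-day window:
# (deploy_date, caused_failure, minutes_to_restore)
DEPLOYS = [
    ("2026-02-02", False, 0),
    ("2026-02-04", True, 45),
    ("2026-02-09", False, 0),
    ("2026-02-11", False, 0),
    ("2026-02-13", True, 15),
]


def dora_summary(deploys, window_days):
    """Service-level deployment frequency, change failure rate, and recovery time."""
    if not deploys:
        return None  # no deployments in the window
    failures = [d for d in deploys if d[1]]
    restore = (
        sum(d[2] for d in failures) / len(failures) if failures else None
    )
    return {
        "deploys_per_week": len(deploys) / (window_days / 7),
        "change_failure_rate": len(failures) / len(deploys),
        "mean_minutes_to_restore": restore,
    }


print(dora_summary(DEPLOYS, window_days=14))
```

Nothing in the log identifies who deployed, which keeps the summary a property of the service rather than of any one engineer.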

Is it ever appropriate to track individual productivity?

Rarely, and almost never through dashboards. Individual performance shows up in 1:1s, retrospectives, and the qualitative judgment of an engineering manager who has been paying attention. Dashboards that purport to measure individual productivity are usually substituting numbers for the actual managerial work of knowing your team. The numbers are easier to produce; they're also less reliable than the judgment they're supposedly replacing.
