Time: The Foundation for Effective Operational Capacity Planning

When it comes to managing the capacity of operational and infrastructure teams (like DBAs, DBREs, SREs, and PFEs), the story points used in traditional product development often don’t work as well. If you want clarity on consistency, predictability, and capacity, measure the work in time.

Here is a breakdown of why time usually works better than points for infrastructure work.

1. Capacity Management Is More Efficient When Measured in Time

The most important responsibility for operational teams is keeping systems stable. That means they always need a time buffer for when things break, and the only way to plan for that buffer is in hours or minutes.

Incidents Consume Time

When a database slows down, or an outage happens, your team loses hours, not points. A DBA tracking their time can record that an unplanned incident consumed 2 hours of their working day. This interrupted work is tracked even in the middle of a sprint, giving an accurate, immediate cost.

Yes, all work should be tracked. Teams need to show leadership how much time they spend on toil, incidents, and tech debt. If that work is not visible, it can look like people are not getting anything done.

“If anything can go wrong, it will.” — Murphy’s Law

The Math is Simple

If a DBA has a 60-hour sprint and you reserve 20 hours for maintenance and unexpected issues, then you can only plan 40 hours of project work. This gives you an accurate and realistic view of remaining capacity. Infrastructure teams are measured on stability and availability, so the planning must be concrete.

“Until we can manage time, we can manage nothing else.” — Peter Drucker

2. Time Estimates Fit Technical Work Better

Story points were created for creative, uncertain feature development. They work well when teams deal with ambiguous requirements and evolving scope.

DBA and DBRE project work is fundamentally different:

The Work is Repeatable and Predictable

While interrupt work (incidents) is unpredictable, planned project tasks are often well-defined. Tasks such as applying patches, performing failover tests, or running backup drills are predictable. They follow runbooks and have historical estimates. Saying something will take 4 hours is often accurate because it has taken 4 hours many times before.

Tracking Estimates and Actuals

For operational work, it’s important to track both the estimated time for a task and the actual time spent on it. This direct comparison is the key feedback loop for improving accuracy and resource allocation in future sprints.

For example, if patching used to take 4 hours but now takes 8 hours, that points to a bigger issue that needs to be investigated.
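
The estimate-versus-actual comparison can be sketched in a few lines of Python. This is only an illustration: the TaskLog record, the flag_overruns helper, and the 1.5x threshold are assumptions made for the example, not part of any particular tracking tool.

```python
from dataclasses import dataclass


@dataclass
class TaskLog:
    """One completed task with its estimated and actual hours (hypothetical record)."""
    name: str
    estimated_hours: float
    actual_hours: float


def flag_overruns(logs, threshold=1.5):
    """Return (name, ratio) for tasks whose actual time is at least
    `threshold` times the estimate. The threshold is an assumed cutoff."""
    flagged = []
    for log in logs:
        if log.estimated_hours <= 0:
            continue  # no usable estimate, nothing to compare against
        ratio = log.actual_hours / log.estimated_hours
        if ratio >= threshold:
            flagged.append((log.name, round(ratio, 2)))
    return flagged


logs = [
    TaskLog("apply patches", 4, 8),    # the 4-hour task that now takes 8
    TaskLog("failover test", 2, 2.5),  # a small overrun, below the threshold
]
print(flag_overruns(logs))  # [('apply patches', 2.0)]
```

Anything the helper flags is a candidate for discussion in the retrospective, not an automatic verdict.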

No Need for Relative Sizing

A database upgrade and an automation script have nothing in common. Trying to size them relative to each other based on a point system adds noise instead of clarity. A direct estimate in minutes or hours is simpler and far more meaningful.

3. Time Tracks Improvement and Staffing Needs

How do you measure real improvement and efficiency in an operations role? You measure it with time.

Time-based tracking shows how work changes as people gain experience, develop better processes, and remove friction. When you compare estimated time to actual time, you can see where individuals are growing and where systems are improving.

For example, if a new team member spends thirty minutes on a task the first time and ten minutes by the third time, that is visible, measurable progress.

Time also reveals when workloads are increasing, when the team is stretched thin, and when staffing or automation is needed. It tells a complete story about growth, efficiency, and operational demand in a way that is clear and actionable.

The Automation ROI Case Study

A dramatic example of improvement and efficiency shows up when you track time spent on repetitive tasks. If a manual process was taking the team 45 hours in total per sprint, and once automated it only takes 5 minutes, that immediately frees up nearly 45 hours of capacity every two weeks.

This time savings ties directly to Return on Investment (ROI) because the hours can be converted to a dollar amount. Using a fully loaded hourly cost of about $67.30 (based on a $100,000 base salary, to make it easy, with a 1.4 fully loaded multiplier), this single automation saves the company about $78,750 annually.

By measuring time, you prove the immense financial value of automation work that story points would simply obscure.

ROI Math

To get that final annual savings number of about $78,750, the calculation is straightforward:

Calculate sprints per year
- A standard sprint is 2 weeks, and there are 52 weeks in a year.
- 52 / 2 = 26 sprints
Calculate hours saved per year
- 45 hrs per sprint X 26 sprints per year = 1,170 hours
Calculate the fully loaded annual cost
- $100,000 base salary X 1.4 multiplier = $140,000
Convert the annual cost to an hourly rate
- $140,000 / 2,080 working hours a year = ~$67.30
Calculate the annual savings
- ~$67.30 X 1,170 hrs = ~$78,750
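
The same arithmetic can be written as a small Python sketch so the inputs are easy to swap for your own numbers. The function name and parameter names are made up for this example; the default values (1.4 multiplier, 26 sprints, 2,080 working hours) come from the steps above.

```python
def annual_automation_savings(
    hours_saved_per_sprint,
    base_salary,
    load_multiplier=1.4,          # fully loaded cost multiplier from the example
    sprints_per_year=26,          # 52 weeks / 2-week sprints
    working_hours_per_year=2080,  # 40 hours x 52 weeks
):
    """Return (hours saved per year, annual dollar savings)."""
    hourly_cost = base_salary * load_multiplier / working_hours_per_year
    hours_per_year = hours_saved_per_sprint * sprints_per_year
    return hours_per_year, hours_per_year * hourly_cost


hours, savings = annual_automation_savings(45, 100_000)
print(hours, round(savings))  # 1170 78750
```

The sketch keeps the hourly rate unrounded, which is why it lands on $78,750 rather than the slightly lower figure you get by multiplying a rounded $67.30.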

“If you can’t measure it, you can’t improve it.” — Peter Drucker

Highlighting Unrealistic Workloads

This is especially important for new hires. If a new person is assigned 50 story points per sprint, they might work 100 hours to complete them, feeling obligated to work beyond the standard 80-hour sprint. Points suggest the work is getting done with the current team, but time metrics expose the truth. If team members consistently log 100 hours per sprint, that is evidence the team needs an additional person to stay efficient and sustainable.
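
As a rough sketch, logged hours can be checked against sprint capacity to surface this pattern. The overloaded_members helper, the sample names, and the "at least two sprints over capacity" rule are assumptions chosen for illustration.

```python
def overloaded_members(logged_hours_by_person, sprint_capacity=80, min_sprints=2):
    """Return names of people who logged more than `sprint_capacity` hours
    in at least `min_sprints` sprints (both limits are assumed defaults)."""
    flagged = []
    for name, sprint_hours in logged_hours_by_person.items():
        sprints_over = sum(1 for hours in sprint_hours if hours > sprint_capacity)
        if sprints_over >= min_sprints:
            flagged.append(name)
    return flagged


# Hours logged per sprint over the last three sprints (made-up sample data).
logged = {
    "new_hire": [100, 96, 102],
    "senior_dba": [80, 84, 78],
}
print(overloaded_members(logged))  # ['new_hire']
```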

4. Time Exposes Systemic Inefficiencies and Hidden Work

Time tracking allows managers to pinpoint why work was not completed, transforming symptoms into actionable data.

Revealing Scope Creep

Scope creep occurs when a project expands beyond the time originally planned. This can happen when new tasks or tech debt dependencies are added during the project, leading to extra work, longer timelines, and strained resources.

If a simple item like “create a SQL job” was estimated at 10 minutes but took 2 hours, a manager knows something is wrong. In the retrospective, the DBA can explain that they had to fix broken pipelines or the Ansible process to complete the work. That hidden fix is technically a separate work item that needs to be created.

Prioritizing Technical Debt

This data exposes inconsistencies and proves priorities. Suppose a pipeline continually breaks or causes other work items to grow over time. In that case, the accumulated time spent debugging indicates that automating or fixing the pipeline should be escalated to an Epic and prioritized in sprints to make the team more efficient in the long run.
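
One way to make that accumulated time visible is to total the hidden-work hours by root cause. The sketch below assumes each unplanned fix is logged as a (cause, hours) pair; the hours_by_cause helper and the sample entries are hypothetical.

```python
from collections import defaultdict


def hours_by_cause(entries):
    """Sum logged hours per root cause and sort the largest total first."""
    totals = defaultdict(float)
    for cause, hours in entries:
        totals[cause] += hours
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Hidden fixes recorded across several sprints (made-up sample data).
entries = [
    ("broken pipeline", 1.5),
    ("ansible process", 2.0),
    ("broken pipeline", 3.0),
    ("broken pipeline", 1.0),
]
print(hours_by_cause(entries))
# [('broken pipeline', 5.5), ('ansible process', 2.0)]
```

The cause at the top of the list is the strongest candidate for an Epic.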

5. Velocity Is Not the Right Goal

Feature teams use velocity to predict feature delivery. For infrastructure teams, velocity becomes a vanity metric that doesn’t tell you what you need to know.

Track Capacity Consumption Instead

The most important question is how much of the team’s available time was spent on planned work versus unplanned incidents.

Stronger Advocacy

Tracking time provides clear justification for why planned work items overflow to the next sprint. When you report to leadership that 60% of your team’s time was spent on unplanned issues and that this forced you to cut nearly half of your planned work, you provide a clear, data-driven justification for additional staffing or automation. Trying to translate points into hours only creates confusion about work that may or may not be finished.
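
The planned-versus-unplanned split is a simple percentage. A minimal sketch, assuming you already have total planned and unplanned hours for the sprint (the capacity_split name and the sample numbers are illustrative):

```python
def capacity_split(planned_hours, unplanned_hours):
    """Return (planned %, unplanned %) of all logged hours, rounded to one decimal."""
    total = planned_hours + unplanned_hours
    if total == 0:
        return 0.0, 0.0  # nothing logged, so there is nothing to split
    planned_pct = round(100 * planned_hours / total, 1)
    unplanned_pct = round(100 * unplanned_hours / total, 1)
    return planned_pct, unplanned_pct


# An 80-hour sprint where 48 hours went to unplanned issues.
print(capacity_split(32, 48))  # (40.0, 60.0)
```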

How to Calculate Team Capacity

To get an honest view of what your team can deliver, you need to calculate your Net Available Project Hours. This calculation removes the guesswork and makes sure you are not overcommitting your team.

Here is the formula to use for a standard two-week sprint:

Start with Gross Hours
- Calculate the total working hours available. For a single engineer in a two-week sprint, this is usually 80 hours.
Subtract Standard Deductions
- Remove time for recurring meetings (Daily Standups, Refinement, Sprint Planning) and administrative tasks (email, ticketing). This often accounts for about 10 to 15 percent of their time.
Subtract PTO and Holidays
- Deduct any scheduled time off for that specific sprint.
Subtract the Operational Buffer
- This is the most critical step for DBAs and DBREs. You must reserve time for unplanned interruptions (incidents, ad hoc requests). A common starting point is reserving 30 percent of the total time.

The Calculation Example:

Total Sprint Hours: 80 hours
Recurring Meetings: minus 8 hours
Operational buffer (30%): minus 24 hours
Net Available Project Hours: 48 hours

In this example, you should only assign 48 hours of project work to that operations person. If you assign them 60 hours of work, you are setting them up to fail or forcing them to work nights and weekends. The right numbers will differ for each person based on their service ticket volume and on-call duties.
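
The formula above can be sketched as a small Python function. The function and parameter names are invented for this example, and the defaults mirror the sample calculation; the buffer is taken as a percentage of gross hours, which is how the example arrives at 24 hours.

```python
def net_available_project_hours(
    gross_hours=80,     # one engineer, two-week sprint
    meeting_hours=8,    # recurring meetings and admin time
    pto_hours=0,        # scheduled time off in this sprint
    buffer_percent=30,  # operational buffer, as a percent of gross hours
):
    """Return the hours left for planned project work, never below zero."""
    buffer_hours = gross_hours * buffer_percent / 100
    net = gross_hours - meeting_hours - pto_hours - buffer_hours
    return max(net, 0.0)


print(net_available_project_hours())             # 48.0
print(net_available_project_hours(pto_hours=8))  # 40.0
```

Adjust the buffer percentage per person as you learn how much interrupt work they actually absorb.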

If you want your DBA or DBRE team to be predictable, stable, and able to plan realistically, measure their work in hours. It keeps planning honest, highlights capacity strain early, and gives you the clearest view of what your team can deliver.

What Are Your Thoughts?

Is a high performer the person who pushes 80 hours a week just to finish everything? Or is it the person who works their full 40 hours, delivers quality work, and still cannot complete every task in their sprint because the workload is unrealistic?
How would you perceive these two team members if you measured them in points versus time? How have you handled this in the past?

Should everything that consumes operations people’s time, such as recurring meetings, unplanned incidents, and critical training, be formally documented and tracked against available sprint capacity?

Can Story Points truly capture the operational reality of hidden toil and unpredictable interruptions, or do they simply mask the capacity problem?

How do you measure improvement, performance, and stability on your team?

I would love to hear your thoughts and real-world experiences in the comments below.

AAA-DBA.COM