ML Issue Classification

How does Uplevel classify Issues?

Overview

Uplevel’s ML Issue Classification model provides key visibility into engineering effort and time spent on New Value Creation vs. Defects vs. Sustenance, without depending on how teams tag their work.


How it works

The Uplevel Machine Learning (ML) Issue Classifier uses issue metadata from standard fields like Issue Type, Summary, Description, and Assignee, as well as similar Epic metadata if the issue is linked. From there, an issue is classified as one of three categories:
  • Defect - This category contains issues for work related to bugs and defects.
  • New value creation - This category contains issues for work related to new features and enhancements.
  • Sustenance - This category contains issues for work related to operational efficiency and improvements to maintain and improve reliability and safety. Some types of work included are KTLO, maintenance, and tech debt.

Details

  • Uplevel first classifies issues as defect or not defect. Uplevel then classifies the non-defect issues as either new value creation or sustenance.
  • Uplevel uses a default confidence threshold of 60% to categorize issues. If the model’s confidence for a specific issue is below 60%, Uplevel aggregates that work under “Not linked to an ML Issue Classification”. This threshold is configurable.
Example: A task titled "BE: verify issue handling and NPE logic" was flagged as Sustenance, but with low confidence: it is only slightly more likely to be sustenance than a defect. In this case, Uplevel would place it in "Uncertain Classification" by default.
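As a rough sketch, the two-stage decision with the default 60% threshold might look like the following. The function, probabilities, and label strings are hypothetical illustrations, not Uplevel's actual implementation:

```python
CONFIDENCE_THRESHOLD = 0.60  # Uplevel's default; configurable per customer

def classify_issue(defect_prob: float, new_value_prob: float) -> str:
    """Hypothetical two-stage decision.

    defect_prob: model confidence that the issue is a defect.
    new_value_prob: confidence that a non-defect issue is new
    value creation (vs. sustenance).
    """
    # Stage 1: defect vs. not defect.
    if defect_prob >= CONFIDENCE_THRESHOLD:
        return "Defect"
    if 1 - defect_prob < CONFIDENCE_THRESHOLD:
        # Neither outcome clears the threshold.
        return "Uncertain Classification"
    # Stage 2: new value creation vs. sustenance.
    if new_value_prob >= CONFIDENCE_THRESHOLD:
        return "New value creation"
    if 1 - new_value_prob >= CONFIDENCE_THRESHOLD:
        return "Sustenance"
    return "Uncertain Classification"

print(classify_issue(0.90, 0.10))  # Defect
print(classify_issue(0.10, 0.20))  # Sustenance
print(classify_issue(0.45, 0.55))  # Uncertain Classification
```

Raising or lowering the threshold trades coverage for confidence: a higher threshold routes more borderline issues into the uncertain bucket.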

Accuracy / Results

Uplevel’s ML Issue Classification model has proven accurate with customers to date. We have directly compared its output against the custom aggregations and manually tagged categories that Uplevel customers already maintain, with the following results:

  1. The current model for defect vs. not defect:
    • Overall, ~93% (87%–96%) of issues have a predicted classification that matches the manual classification. Weighted by allocation time, agreement is ~93% (86%–99%).
  2. The current model for new value creation vs. sustenance:
    • Overall, ~70% (64%–77%) of issues have a predicted classification that matches the manual classification. Weighted by allocation time, agreement is ~71% (69%–76%).
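The two agreement measures above, by issue count and weighted by allocation time, can be illustrated with made-up data (none of these issues or hours are real):

```python
# Each tuple: (predicted label, manual label, allocation hours).
# Data is fabricated purely to show the two calculations.
issues = [
    ("Defect", "Defect", 10),
    ("Sustenance", "Sustenance", 4),
    ("New value creation", "Sustenance", 2),  # mismatch
    ("Defect", "Defect", 8),
]

# Agreement by issue count: fraction of issues where labels match.
count_match = sum(p == m for p, m, _ in issues) / len(issues)

# Agreement weighted by allocation time: matched hours / total hours.
total_hours = sum(h for _, _, h in issues)
weighted_match = sum(h for p, m, h in issues if p == m) / total_hours

print(f"By issue count: {count_match:.0%}")         # 75%
print(f"By allocation time: {weighted_match:.0%}")  # 92%
```

The two numbers can diverge when mismatches cluster on small, low-effort issues (as in this toy data) or on large ones.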

FAQ

  • Is this an LLM?
    • No, it is not an LLM. It is an ML (machine learning) model.
  • Does my data get sent to a third party?
    • No. The data stays within Uplevel and is processed by our own proprietary ML model; no third party is involved.
  • How was the model trained?
    • The model was trained on labeled data. A random sample of the total volume of issues, with its labels, was used to train the model; the remaining labeled data was held out to test model performance.
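The train/hold-out approach described above can be sketched as follows. The dataset, labels, and 80/20 split ratio are stand-in assumptions, not Uplevel's actual pipeline:

```python
import random

# Hypothetical labeled dataset: (issue key, manual label).
labeled_issues = [(f"ISSUE-{i}", "Defect" if i % 3 == 0 else "Not defect")
                  for i in range(1000)]

random.seed(7)
random.shuffle(labeled_issues)

# Train on a random sample; hold out the rest to measure performance.
split = int(0.8 * len(labeled_issues))
train_set = labeled_issues[:split]
test_set = labeled_issues[split:]

# ... fit a classifier on train_set (omitted) ...
# Accuracy is then the fraction of test_set predictions that
# match the held-out manual labels.
print(len(train_set), len(test_set))  # 800 200
```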
  • Will the results of the model change over time?
    • The model is deterministic: running the same issue through the model yields the same result every time. A change to the issue’s metadata, however, could cause it to be reclassified.

  • What if I have a very specific definition for defect, new value creation, or sustenance that I want to use instead?
    • We can apply a rule based on your definition of that category, visible in an allocation scheme. We also plan to allow overrides of the model’s value.
  • What if I want to add a category (e.g. Compliance)?
    • We can apply a rule based on your definition of that category, visible in an allocation scheme. However, we cannot add a category to the model’s output.
  • What if I think the classification is wrong for an issue/epic?
    • We can re-classify issues manually today, and we plan to let customers re-classify issues themselves in the near term.
Related Articles

    • Issue Velocity

      Learn about your organization's work completion practices and how they've changed over time. What is Issue Velocity? Issue velocity is a measure of the volume of work a group has collectively completed within a given time period, normalized by the ...
    • Issue Velocity Best Practices

      How to best use (and not abuse) Issue Velocity data Velocity has taken on several different meanings in the engineering space. Some teams use it to refer to Sprint Velocity (how many story points they'll complete in a sprint), while others focus on ...
    • How to Understand Your Organization's Top-Level Investments

      Keeping Jira data tagged consistently is difficult at all levels of an organization. Some teams are on company-managed boards with required custom fields, while other teams use team-managed boards where those same rules don't apply. Some teams are on ...
    • Allocation Schemes

      Definitions of Allocation Schemes, and examples of how they can be used to calculate Time Allocation for your organization. Allocation Schemes (also thought of as Allocation Rules) are used to group similar work items into larger categories to get ...
    • Bug Rate

      Learn about your organization's bug practices and how they've changed over time. Trend This metric shows the percentage of issues closed during the time period that were bugs. An issue is considered a bug if it has an issue type that contains the ...