It is a generalized version of the F1-measure. Given precision P and recall R, the F_β score is defined as F_β = (1 + β²) · P · R / (β² · P + R), which lets you weigh the importance of precision versus recall through the parameter β:

  • If β = 1, then you have the regular F1-measure, which treats precision and recall as equally important.
  • If β > 1, it gives more emphasis to recall than to precision.
  • If β < 1, it gives more emphasis to precision than to recall.

The F_β score is useful when you want to fine-tune the balance between precision and recall to suit a particular application. For instance, in certain scenarios it might be more costly to miss positive instances (low recall) than to incorrectly label negative instances as positive (low precision), or vice versa. Adjusting β lets you accommodate these trade-offs in the metric itself.
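As a quick illustration of the bullets above, here is a minimal sketch showing how the same precision and recall yield different scores as β varies; the values P = 0.9 and R = 0.5 and the choice of β values are purely illustrative assumptions. The fuller example afterwards computes P and R from labels before combining them in the same way.

python

def f_beta(P, R, beta):
    # Direct application of the F_beta formula above
    return (1 + beta**2) * P * R / (beta**2 * P + R)

P, R = 0.9, 0.5  # high precision, low recall (illustrative values)
for beta in (0.5, 1.0, 2.0):
    print(f"beta={beta}: F_beta = {f_beta(P, R, beta):.3f}")
# beta=0.5 rewards the high precision (~0.776), beta=1 balances the two
# (~0.643), and beta=2 penalizes the low recall (~0.549).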

python

from sklearn.metrics import precision_score, recall_score
 
def f_beta_score(P, R, beta=1.0):
    """
    Compute the F_beta score.
 
    Parameters:
    - P (float): Precision
    - R (float): Recall
    - beta (float): Weighting factor
 
    Returns:
    - float: F_beta score
    """
    if P + R == 0:
        return 0.0  # Handle edge case to avoid division by zero
    return (1 + beta**2) * P * R / (beta**2 * P + R)
 
def compute_f_beta_from_labels(true_labels, predicted_labels, beta=1.0):
    """
    Compute the F_beta score from true and predicted labels.
 
    Parameters:
    - true_labels (list): Ground truth labels
    - predicted_labels (list): Predicted labels from clustering or classification
    - beta (float): Weighting factor
 
    Returns:
    - float: F_beta score
    """
    P = precision_score(true_labels, predicted_labels, average='macro')
    R = recall_score(true_labels, predicted_labels, average='macro')
    return f_beta_score(P, R, beta)
 
# Example usage:
true_labels = [1, 0, 1, 2, 2, 0]
predicted_labels = [1, 1, 1, 2, 0, 0]
beta = 2
 
print(compute_f_beta_from_labels(true_labels, predicted_labels, beta))
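scikit-learn also ships a built-in fbeta_score, which can serve as a rough cross-check; the sketch below applies it to the same example labels. Note that with average='macro' sklearn averages the per-class F_β values, whereas the function above combines macro-averaged precision and recall, so the two numbers can differ slightly.

python

from sklearn.metrics import fbeta_score

# Built-in F_beta on the same example labels; this macro-averages the
# per-class F_beta scores rather than combining macro P and macro R.
true_labels = [1, 0, 1, 2, 2, 0]
predicted_labels = [1, 1, 1, 2, 0, 0]
print(fbeta_score(true_labels, predicted_labels, beta=2, average='macro'))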
