# IoU vs F1 score

### Background #

Intersection over Union (IoU) and F_{1} score are commonly used evaluation metrics for binary classification tasks such as object detection and image segmentation.

Denote the number of true positives by *TP*, the number of false positives by *FP* and the number of false negatives by *FN*. Note that the number of true negatives *TN* does not appear in the expression for either IoU or F_{1} score, so neither metric is symmetric in the positive and negative classes: swapping the two class labels generally changes the value.
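
In terms of these counts, the standard definitions of the two metrics can be written as:

$$
\mathrm{IoU} = \frac{TP}{TP + FP + FN},
\qquad
F_{1} = \frac{2\,TP}{2\,TP + FP + FN}
$$

Both are ratios built only from *TP*, *FP* and *FN*, and both are undefined when all three counts are zero.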

### Summary #

In the following we show that IoU and F1 score (both of which can be expressed in terms of *TP*, *FP* and *FN*) can be re-expressed in terms of:

- precision (which can be expressed in terms of *TP* and *FP*); and
- recall (which can be expressed in terms of *TP* and *FN*).
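
As a minimal sketch of this re-expression, the snippet below computes both metrics twice: once from the counts and once from precision and recall. It uses the standard definitions; the function names and the example counts are hypothetical and chosen only for illustration.

```python
def precision(tp, fp):
    """Fraction of predicted positives that are correct."""
    return tp / (tp + fp)


def recall(tp, fn):
    """Fraction of actual positives that are found."""
    return tp / (tp + fn)


def iou_from_counts(tp, fp, fn):
    return tp / (tp + fp + fn)


def f1_from_counts(tp, fp, fn):
    return 2 * tp / (2 * tp + fp + fn)


def iou_from_pr(p, r):
    """IoU written in terms of precision p and recall r (both > 0)."""
    return p * r / (p + r - p * r)


def f1_from_pr(p, r):
    """F1 as the harmonic mean of precision p and recall r (both > 0)."""
    return 2 * p * r / (p + r)


# Hypothetical counts, for illustration only.
tp, fp, fn = 60, 20, 40
p, r = precision(tp, fp), recall(tp, fn)

print("precision, recall:", p, r)
print("IoU:", iou_from_counts(tp, fp, fn), iou_from_pr(p, r))
print("F1: ", f1_from_counts(tp, fp, fn), f1_from_pr(p, r))
```

The two values printed on each of the last two lines should agree up to floating-point rounding.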

Contour plots are made to provide visualization for intuitive understanding. Note that as per convention, variable x is on the horizontal axis, variable y is on the vertical axis, and z slices are the contour lines.

For easy comparison, the contour plots for (i) IoU, (ii) F_{1} and (iii) harmonic mean of IoU and F1 are combined to produce the following animated sequence.

The harmonic mean of IoU and F_{1} is not to be confused with F_{β} scores. An animated sequence of plots of F_{β} scores for different values of β is provided for comparison.

### IoU #

**Intersection over Union** is also known as the Jaccard index, which is generalized by the Tversky index. It can be re-expressed in terms of precision and recall.
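
The re-expression can be derived in one line from the standard definitions. Here $P$ and $R$ are shorthand for precision and recall (symbols introduced for this sketch), and $TP > 0$ is assumed:

$$
\frac{1}{\mathrm{IoU}}
= \frac{TP + FP + FN}{TP}
= \frac{TP + FP}{TP} + \frac{TP + FN}{TP} - 1
= \frac{1}{P} + \frac{1}{R} - 1
\quad\Longrightarrow\quad
\mathrm{IoU} = \frac{P R}{P + R - P R}
$$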

We can observe that IoU score measures something closer to the *worst case*, i.e. the *minimum*, of precision and recall.
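
One way to make this observation concrete: because *TP* / (*TP* + *FP* + *FN*) cannot exceed either *TP* / (*TP* + *FP*) or *TP* / (*TP* + *FN*), IoU is bounded above by the minimum of precision and recall. The sketch below checks this on a coarse grid; the grid and the function name are illustrative assumptions.

```python
def iou_from_pr(p, r):
    """IoU in terms of precision p and recall r, both in (0, 1]."""
    return p * r / (p + r - p * r)


# Hypothetical evaluation grid: 0.1, 0.2, ..., 1.0 for both quantities.
grid = [k / 10 for k in range(1, 11)]

# True if IoU never exceeds the smaller of precision and recall on the grid
# (a tiny tolerance absorbs floating-point rounding).
bounded_by_min = all(
    iou_from_pr(p, r) <= min(p, r) + 1e-12 for p in grid for r in grid
)
print("IoU <= min(precision, recall) on the grid:", bounded_by_min)

# When recall is perfect, IoU coincides with precision (the minimum).
for p in grid:
    print(p, iou_from_pr(p, 1.0))
```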

### F_{1} score #

**F_{1} score** is also known as the Dice coefficient. It is by definition the harmonic mean of precision and recall.
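
Written out with $P$ for precision and $R$ for recall (shorthand introduced for this sketch), the harmonic-mean definition and its equivalent form in terms of the counts are:

$$
F_{1}
= \frac{2}{\frac{1}{P} + \frac{1}{R}}
= \frac{2 P R}{P + R}
= \frac{2\,TP}{2\,TP + FP + FN}
$$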

We can observe that F_{1} score measures something closer to the *average* of precision and recall. This is apparent especially for precision values and recall values that are both greater than around 0.5.
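
A rough numerical check of this observation is sketched below. It measures the gap between the arithmetic mean and F_{1}, which works out algebraically to (P − R)² / (2(P + R)), on two hypothetical grids: one restricted to values of at least 0.5 and one covering most of the unit square.

```python
def f1_from_pr(p, r):
    """F1 as the harmonic mean of precision p and recall r (both > 0)."""
    return 2 * p * r / (p + r)


def gap_to_average(p, r):
    """Arithmetic mean minus F1; never negative."""
    return (p + r) / 2 - f1_from_pr(p, r)


# Hypothetical grids with step 0.05.
upper = [k / 20 for k in range(10, 21)]  # 0.50 .. 1.00
full = [k / 20 for k in range(1, 21)]    # 0.05 .. 1.00

max_gap_upper = max(gap_to_average(p, r) for p in upper for r in upper)
max_gap_full = max(gap_to_average(p, r) for p in full for r in full)

print("largest gap with both values >= 0.5:", round(max_gap_upper, 4))
print("largest gap on the full grid:       ", round(max_gap_full, 4))
```

On these grids the gap stays below 0.09 when both values are at least 0.5, but grows to roughly 0.43 when one of them is allowed to approach zero.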

### F_{β} score #

F_{1} score is generalized by **F_{β} score**, which measures something close to the *weighted average* of precision and recall, where the effect of change of recall is β times as much as that of precision. The proof for why β^{2} instead of β is used in the formula for F_{β} can be found here.
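
For reference, the standard formula for F_{β} can be written as a weighted harmonic mean, with $P$ for precision and $R$ for recall (shorthand introduced for this sketch):

$$
F_{\beta}
= \frac{1 + \beta^{2}}{\frac{1}{P} + \frac{\beta^{2}}{R}}
= \frac{(1 + \beta^{2})\, P R}{\beta^{2} P + R}
= \frac{(1 + \beta^{2})\, TP}{(1 + \beta^{2})\, TP + \beta^{2}\, FN + FP}
$$

Setting $\beta = 1$ recovers F_{1}.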

Plots are generated for F_{β} with β = 0.5 and with β = 2, together with linear plots for their arithmetic mean analogues. We can observe that the plots are asymmetric in precision and recall.
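
The asymmetry can also be seen numerically. The sketch below evaluates the standard F_{β} formula at a hypothetical precision/recall pair and at the swapped pair. The weighted arithmetic mean shown alongside is an assumed form of the "arithmetic mean analogue" (weights 1 and β², mirroring the harmonic-mean weights), not necessarily the exact one used for the plots.

```python
def f_beta(p, r, beta):
    """Standard F-beta score from precision p and recall r (both > 0)."""
    b2 = beta ** 2
    return (1 + b2) * p * r / (b2 * p + r)


def weighted_arithmetic_mean(p, r, beta):
    """Assumed arithmetic analogue: weight 1 on precision, beta^2 on recall."""
    b2 = beta ** 2
    return (p + b2 * r) / (1 + b2)


# Hypothetical operating point: high precision, low recall.
p, r = 0.9, 0.5

for beta in (0.5, 1.0, 2.0):
    print(
        "beta =", beta,
        "| F_beta(p, r) =", round(f_beta(p, r, beta), 4),
        "| F_beta(r, p) =", round(f_beta(r, p, beta), 4),
        "| arithmetic analogue =", round(weighted_arithmetic_mean(p, r, beta), 4),
    )
```

For β = 1 swapping precision and recall leaves the score unchanged. For β = 2 the low recall is penalized more heavily, and for β = 0.5 the high precision is rewarded more heavily.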

### New metric #

In the following we shift our focus back to the metrics that are symmetric in precision and recall. By expressing IoU and F_{1} in terms of *TP*, *FP* and *FN*, we observe that we can take the **harmonic mean of IoU and F_{1}** to devise a new metric that can be intuitively understood as something in between IoU and F_{1}.
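
Working this out from the standard definitions, with $H$ as a placeholder symbol for the new metric (the symbol is an assumption of this sketch):

$$
H
= \frac{2}{\frac{1}{\mathrm{IoU}} + \frac{1}{F_{1}}}
= \frac{2}{\frac{TP + FP + FN}{TP} + \frac{2\,TP + FP + FN}{2\,TP}}
= \frac{4\,TP}{4\,TP + 3\,(FP + FN)}
= \frac{TP}{TP + \tfrac{3}{4}(FP + FN)}
$$

For comparison, IoU can be written as $TP / (TP + 1 \cdot (FP + FN))$ and F_{1} as $TP / (TP + \tfrac{1}{2}(FP + FN))$, so the error weight of $\tfrac{3}{4}$ sits exactly halfway between the two.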

The new metric can also be re-expressed in terms of precision and recall for generation of the following contour plot. We can observe that the new metric, which as mentioned is defined as the harmonic mean of IoU and F1, measures something close to the average of (i) the *worst case* of precision and recall and (ii) the *average* of precision and recall.
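
A minimal sketch of this re-expression follows. The closed form 4PR / (3P + 3R − 2PR) is derived here from the standard formulas for IoU and F_{1} in terms of precision P and recall R; the function names and the grid are hypothetical.

```python
def iou_from_pr(p, r):
    return p * r / (p + r - p * r)


def f1_from_pr(p, r):
    return 2 * p * r / (p + r)


def harmonic_mean(a, b):
    return 2 * a * b / (a + b)


def new_metric_from_pr(p, r):
    """Harmonic mean of IoU and F1, simplified to a single expression."""
    return 4 * p * r / (3 * p + 3 * r - 2 * p * r)


# Hypothetical grid: 0.1, 0.2, ..., 1.0 for both precision and recall.
grid = [k / 10 for k in range(1, 11)]

# The closed form agrees with the direct harmonic mean on the grid ...
closed_form_matches = all(
    abs(new_metric_from_pr(p, r)
        - harmonic_mean(iou_from_pr(p, r), f1_from_pr(p, r))) < 1e-12
    for p in grid for r in grid
)

# ... and the new metric always lies between IoU and F1.
lies_between = all(
    iou_from_pr(p, r) - 1e-12
    <= new_metric_from_pr(p, r)
    <= f1_from_pr(p, r) + 1e-12
    for p in grid for r in grid
)

print(closed_form_matches, lies_between)
```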

### Inspiration #

This post is inspired by an answer in a StackExchange post:

"For any fixed ground truth, the two metrics are always positively correlated." ... "F score tends to measure something closer to average performance, while the IoU score measures something closer to the worst case performance" ... "over a set of inferences."

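For a single inference, the positive relationship between the two metrics can be made explicit. Writing both in terms of *TP* and the total error *FP* + *FN* and eliminating the counts gives the following identities, which follow from the standard definitions:

$$
\mathrm{IoU} = \frac{F_{1}}{2 - F_{1}},
\qquad
F_{1} = \frac{2\,\mathrm{IoU}}{1 + \mathrm{IoU}}
$$

Each is a strictly increasing function of the other on $[0, 1]$, so for one inference the two metrics always rank results identically. Differences can only arise when scores are aggregated over a set of inferences.
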
### Further work #

A new thesis pre-proposal in PDF format has been prepared, covering the gradients of the F_{β} loss and of the proposed G_{α} loss.

Changelog

- *Feb 2021* Replaced plots with new versions generated with my new custom rainbow color map and re-exported in SVG format using my new tool svgasm.
- *Feb 2021* Added a sequence of plots for F_{β} scores in animated format.
- *Jan 2021* Added a sequence of plots in animated format in summary section.
- *Jan 2021* Updated all LaTeX graphics to use sans serif typeface instead of the default serif typeface with `sansmath`. This improves readability on screen and reduces total file size of the graphics by 17%.
- *Dec 2020* Added link to thesis pre-proposal PDF.
- *Dec 2020* Added F_{β} derivation.
- *Dec 2020* Re-organized some paragraphs and plots.
- *Oct 2020* Added plots for harmonic mean of IoU and F_{1}.
- *Oct 2020* Improved phrasing in some paragraphs.