Which loss function to use for multi-class vs multi-label classification?

Got a two fold question here:

  1. How can one tell if they’re working with a multi-label or a multi-class classification?

  2. Which loss function should one use in either case?

Thanks!

I think @sGx_tweets might have some good insight to share here

Hi Harpreet,
I think this might help:

  1. Multi-class Multi-label problem:
    When there are more than two (multiple) classes and a data point can belong to more than one class at a time.

BCEWithLogitsLoss (with no sigmoid() or softmax() applied to the model outputs) is the right loss function for such tasks.

torch.nn.BCEWithLogitsLoss combines a Sigmoid layer and the BCELoss, and is more stable than using a sigmoid layer with BCELoss.
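A minimal sketch of the multi-label case (the shapes and targets here are made up for illustration): the model outputs raw logits, the targets are multi-hot vectors, and sigmoid is only applied afterwards to get predictions.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# 4 samples, 3 classes; each sample may belong to several classes at once.
logits = torch.randn(4, 3)  # raw model outputs -- no sigmoid applied
targets = torch.tensor([[1., 0., 1.],
                        [0., 1., 0.],
                        [1., 1., 0.],
                        [0., 0., 1.]])  # multi-hot target vectors

criterion = nn.BCEWithLogitsLoss()
loss = criterion(logits, targets)  # scalar loss, numerically stable

# At inference time, apply sigmoid and threshold per class.
probs = torch.sigmoid(logits)
preds = (probs > 0.5).float()  # each row can have 0, 1, or several 1s
```

Note that the targets are floats of the same shape as the logits, not class indices.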

  2. Single-label Multi-class problem:
    When there are more than two (multiple) classes and a data point can belong to only one class.

CrossEntropyLoss (without applying softmax to the model outputs) is the right loss function for such tasks.

It allows for both class indices and probabilistic soft labels as the target tensor.
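A minimal sketch of both target styles (shapes and labels are illustrative; soft-label targets for CrossEntropyLoss require PyTorch 1.10 or later):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

logits = torch.randn(4, 3)  # raw scores for 3 classes -- no softmax applied
criterion = nn.CrossEntropyLoss()

# Option 1: targets as class indices (one integer label per sample).
hard_targets = torch.tensor([0, 2, 1, 2])
loss_hard = criterion(logits, hard_targets)

# Option 2: targets as class probabilities (same shape as logits,
# each row sums to 1) -- e.g. for label smoothing or distillation.
soft_targets = torch.softmax(torch.randn(4, 3), dim=1)
loss_soft = criterion(logits, soft_targets)
```

Internally CrossEntropyLoss applies log-softmax itself, which is why the model should emit raw logits.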
