Beyond black & white: leveraging annotator disagreement via soft-label multi-task learning