Using latent semantic analysis to estimate similarity.
ESTES, ZACHARY
2006
Abstract
In three studies we investigated whether latent semantic analysis (LSA) cosine values estimate human similarity ratings of word pairs. In Study 1 we found that LSA can distinguish between highly similar and dissimilar matches to a target word, but that it does not reliably distinguish between highly similar and less similar matches. In Study 2 we showed that, using an expanded item set, the correlation between LSA cosines and human similarity ratings is both quite low and inconsistent. Study 3 demonstrated that, while people distinguish between taxonomic and thematic word pairs, LSA cosines do not: although people rate taxonomically related items as more similar than thematically related items, LSA cosine values are equivalent across stimulus types. Our results indicate that LSA cosines provide inadequate estimates of similarity ratings.