Unsupervised mining of lexical variants from noisy text