Ldagibbs: a command for topic modeling in Stata using Latent Dirichlet Allocation
Schwarz, Carlo
2018
Abstract
This paper introduces the ldagibbs command, which implements Latent Dirichlet Allocation in Stata. Latent Dirichlet Allocation is the most popular machine learning topic model. Topic models automatically cluster text documents into a user-chosen number of topics. Latent Dirichlet Allocation represents each document as a probability distribution over topics and each topic as a probability distribution over words. It thereby provides a way to analyze the content of large unclassified text data and an alternative to predefined document classifications.
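The representation described in the abstract can be illustrated with a minimal sketch of a collapsed Gibbs sampler for Latent Dirichlet Allocation. This is not the ldagibbs implementation: the function name `lda_gibbs`, the hyperparameter defaults `alpha` and `beta`, the iteration count, and the toy documents are all assumptions chosen for illustration. It only shows how each document ends up with a probability distribution over topics (`theta`) and each topic with a probability distribution over words (`phi`).

```python
import random


def lda_gibbs(docs, n_topics, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Illustrative collapsed Gibbs sampler for LDA (hypothetical helper).

    docs is a list of tokenized documents (lists of words).
    alpha and beta are assumed Dirichlet smoothing values.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    n_words = len(vocab)

    # Count tables: document-topic, topic-word, and per-topic totals.
    doc_topic = [[0] * n_topics for _ in docs]
    topic_word = [[0] * n_words for _ in range(n_topics)]
    topic_total = [0] * n_topics

    # Assign every token to a random topic to start.
    assign = []
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            z = rng.randrange(n_topics)
            z_doc.append(z)
            doc_topic[d][z] += 1
            topic_word[z][word_id[w]] += 1
            topic_total[z] += 1
        assign.append(z_doc)

    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v = word_id[w]
                z = assign[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][z] -= 1
                topic_word[z][v] -= 1
                topic_total[z] -= 1
                # Unnormalized probability of each topic for this token,
                # conditional on all other assignments.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][v] + beta)
                    / (topic_total[k] + n_words * beta)
                    for k in range(n_topics)
                ]
                z = rng.choices(range(n_topics), weights=weights)[0]
                assign[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][v] += 1
                topic_total[z] += 1

    # Smoothed estimates: every row is a probability distribution.
    theta = [
        [(c + alpha) / (len(doc) + n_topics * alpha) for c in doc_topic[d]]
        for d, doc in enumerate(docs)
    ]
    phi = [
        [(c + beta) / (topic_total[k] + n_words * beta) for c in topic_word[k]]
        for k in range(n_topics)
    ]
    return theta, phi, vocab


# Toy corpus (made up for illustration only).
docs = [
    "stata regression panel data regression".split(),
    "panel data stata estimation".split(),
    "topic model text documents words".split(),
    "text documents topic words model".split(),
]
theta, phi, vocab = lda_gibbs(docs, n_topics=2)
```

Each row of `theta` gives one document's topic shares and each row of `phi` gives one topic's word probabilities over `vocab`. The number of topics is supplied by the user, as the abstract describes.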