In this article, I introduce new commands to estimate text regressions for continuous, binary, and categorical variables based on text strings. The command txtreg_train automatically handles text cleaning, tokenization, model training, and cross-validation for lasso, ridge, elastic-net, and regularized logistic regressions. The txtreg_predict command obtains the predictions from the trained text regression model. Furthermore, the txtreg_analyze command facilitates the analysis of the coefficients of the text regression model. Together, these commands provide a convenient toolbox for researchers to train text regressions. They also allow sharing of pretrained text regression models with other researchers.
Estimating text regressions using txtreg_train
Schwarz, Carlo
2023
Abstract
In this article, I introduce new commands to estimate text regressions for continuous, binary, and categorical variables based on text strings. The command txtreg_train automatically handles text cleaning, tokenization, model training, and cross-validation for lasso, ridge, elastic-net, and regularized logistic regressions. The txtreg_predict command obtains the predictions from the trained text regression model. Furthermore, the txtreg_analyze command facilitates the analysis of the coefficients of the text regression model. Together, these commands provide a convenient toolbox for researchers to train text regressions. They also allow sharing of pretrained text regression models with other researchers.File | Dimensione | Formato | |
---|---|---|---|
schwarz-2023-estimating-text-regressions-using-txtreg-train.pdf
non disponibili
Descrizione: article
Tipologia:
Pdf editoriale (Publisher's layout)
Licenza:
NON PUBBLICO - Accesso privato/ristretto
Dimensione
398.03 kB
Formato
Adobe PDF
|
398.03 kB | Adobe PDF | Visualizza/Apri |
I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.