Translation from narrative text to standard codes variables with Stata

Abstract

In this article, we describe screening, a new Stata command for data management that can be used to examine the content of complex narrative-text variables to identify one or more user-defined keywords. The command is useful when dealing with string data contaminated with abbreviations, typos, or mistakes. A rich set of options allows a direct translation from the original narrative string to a user-defined standard coding scheme. Moreover, screening is flexible enough to facilitate the merging of information from different sources and to extract or reorganize the content of string variables.

Publication
Stata Journal