Mapping tuberculosis in Vitória, Brazil

A new article was just published in CID about TB in Vitória, Brazil. I’ve never been to Brazil but have been interested in the country since I started working in Mozambique (the two countries share the Portuguese language). Vitória is a medium-sized city located about 500 km north of Rio de Janeiro. The TB incidence in Vitória is slightly higher than Brazil’s overall burden (51 cases/100,000 population).

In this article, the authors performed genotyping of TB isolates and geospatial analysis (i.e. mapmaking). They used “old-fashioned” genotyping techniques (IS6110 RFLP and spoligotyping, as opposed to whole genome sequencing, a more modern technique) on 503 tuberculosis samples collected from 2003 to 2007. They found that 242 of those isolates were clustered into 12 RFLP families, and they created some maps. Isolates that shared identical RFLP patterns (clusters) were thought to represent recent TB infection. They concluded that ongoing transmission of TB was caused by a small subset of strains in specific neighborhoods of Vitória. They recommended a new case-finding strategy based on screening populations in neighborhoods with a high density of recent TB transmission, combined with social network analysis. Specifically, they talk about a “territory based surveillance system,” which tries to reduce TB transmission based on locations rather than personal contacts.
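
As a rough illustration of what “clustering” means here, the sketch below groups isolates by genotype pattern and treats any pattern shared by two or more isolates as a cluster. The isolate IDs and pattern labels are made up, and the function names are hypothetical; the authors’ actual pipeline (band matching, spoligotype confirmation, family assignment) is certainly more involved than this.

```python
from collections import Counter

def find_clusters(patterns):
    """Return {pattern: isolate count} for patterns shared by >= 2 isolates.

    `patterns` maps an isolate ID to a genotype pattern label.
    """
    counts = Counter(patterns.values())
    return {pattern: n for pattern, n in counts.items() if n >= 2}

def clustered_fraction(patterns):
    """Fraction of all isolates that belong to some cluster."""
    if not patterns:
        return 0.0
    clusters = find_clusters(patterns)
    return sum(clusters.values()) / len(patterns)

# Hypothetical data, not taken from the study.
isolates = {
    "iso1": "A", "iso2": "A",
    "iso3": "B",
    "iso4": "C", "iso5": "C", "iso6": "C",
    "iso7": "D",
}

print(find_clusters(isolates))
print(round(clustered_fraction(isolates), 2))
```

By the same arithmetic, the 242 clustered isolates out of 503 reported in the article correspond to a clustered fraction of about 48%.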

A few thoughts:

  • I liked the maps, but they were made using Esri’s ArcGIS software, which is expensive and requires a significant amount of training to learn. Are there any simpler, smartphone-based mapping programs that can be used by those with less computer expertise (e.g. nurses who work in TB control programs doing contact investigations)?
  • The spatial analysis was restricted to residential addresses at diagnosis. This is a major limitation of the article and their proposal for a “territory based surveillance system.” TB was not necessarily acquired at home, as transmission can occur at work, on public transportation, in congregate settings, etc. Their maps represent where the TB patients lived, not where they acquired infection.
  • The study was conducted on isolates from 2003 to 2007 but was published in 2015. How can future studies be sped up dramatically? An 8-12 year delay between TB cases and publication isn’t unusual in global health research but is not ideal.
  • What if they had carried out their study using whole genome sequencing rather than RFLP/Spoligotyping? How would their results have differed?
  • Privacy remains a concern any time researchers are making maps using patient information, but the authors of this study clearly protected the rights of their human subjects.
  • I don’t think it’s time to abandon the traditional methods of TB outbreak/contact investigations based on this study. Those methods are to describe the epidemiology of the outbreak; determine chain(s) of transmission; identify and prioritize contacts; identify transmission sites and estimate the scope of transmission; and make recommendations to stop transmission. I might feel differently if the Vitória study/mapmaking had been carried out in “real time,” i.e. 2003-2007, and the maps had been used by public health authorities to identify and interrupt TB transmission chains. But that was not the case. This article was published in 2015.
  • This Brazil paper goes along with another recent article from Peru which identified MDR-TB hotspots using a mapping approach and strain genotyping. The concept of TB transmission hot spots is “hot,” but I think the jury is still out regarding its practical utility in the programmatic setting. National TB programs are generally underfunded and understaffed, and these approaches aren’t simple enough. But once “Poisson regression using Gaussian process spatial smoothing” becomes more user friendly (i.e. TB data can be rapidly built into Google Maps), this could really take off.
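
On the point about getting TB data into a web map quickly, here is a minimal sketch of one possible route: counting cases per neighborhood and writing the counts out as GeoJSON, a format that many web mapping tools can read. Plotting counts at a neighborhood centroid rather than at each patient’s address is also one simple way to respect the privacy concern raised above. The neighborhood names and coordinates are invented, and `neighborhood_geojson` is a hypothetical helper, not anything used in the article.

```python
import json
from collections import Counter

def neighborhood_geojson(case_neighborhoods, centroids):
    """Build a GeoJSON FeatureCollection with one point per neighborhood.

    case_neighborhoods: list with one neighborhood name per TB case.
    centroids: {name: (lat, lon)} for each neighborhood (assumed known).
    """
    counts = Counter(case_neighborhoods)
    features = []
    for name in sorted(counts):
        if name not in centroids:
            continue  # skip cases we cannot place on the map
        lat, lon = centroids[name]
        features.append({
            "type": "Feature",
            # GeoJSON orders coordinates as [longitude, latitude].
            "geometry": {"type": "Point", "coordinates": [lon, lat]},
            "properties": {"neighborhood": name, "cases": counts[name]},
        })
    return {"type": "FeatureCollection", "features": features}

# Hypothetical neighborhoods and coordinates, for illustration only.
centroids = {"Bairro A": (-20.30, -40.30), "Bairro B": (-20.28, -40.31)}
cases = ["Bairro A", "Bairro A", "Bairro B"]

collection = neighborhood_geojson(cases, centroids)
print(json.dumps(collection, indent=2))
```
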
Kernel density estimation (KDE) map of clustered strains (A) and empirical Bayesian odds ratio estimates (cases:controls) (B). A, The white color represents areas with the most intense concentration of clustered strains (dots). Bayesian analysis was made on a census tract scale whereas the kernel densities used points.
Estimated predicted probabilities of tuberculosis recent transmission per neighborhood. In most of the neighborhoods where the predicted probability of clustering is >0.55, the largest restriction fragment length polymorphism families (ES19, ES14, ES1, and ES8) represent the majority of cases.
RFLP band pattern of the 70 identified cluster strains
Map from the Peru study (link above), showing concentrated MDR-TB risk measured using household GPS location and MDR phenotype
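
For readers wondering what sits behind a kernel density map like panel A above, the sketch below computes a Gaussian kernel density estimate on a grid from a handful of points. It is only meant to show the idea: the points are made up, coordinates are assumed to be in a projected unit such as kilometers, and the bandwidth is an arbitrary choice. The kernel and bandwidth used in the article are not stated in this post.

```python
import math

def gaussian_kde_2d(points, x, y, bandwidth):
    """Gaussian kernel density estimate at (x, y) from 2-D case locations."""
    if not points:
        return 0.0
    # Normalizing constant of a 2-D isotropic Gaussian, averaged over points.
    norm = 1.0 / (2.0 * math.pi * bandwidth ** 2 * len(points))
    total = 0.0
    for px, py in points:
        dist_sq = (x - px) ** 2 + (y - py) ** 2
        total += math.exp(-dist_sq / (2.0 * bandwidth ** 2))
    return norm * total

def density_grid(points, xs, ys, bandwidth):
    """Evaluate the density on a grid; rows follow ys, columns follow xs."""
    return [[gaussian_kde_2d(points, x, y, bandwidth) for x in xs] for y in ys]

# Hypothetical case locations (km): three close together, one far away.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0)]
grid = density_grid(points, xs=[0.0, 2.5, 5.0], ys=[0.0, 2.5, 5.0], bandwidth=1.0)

for row in grid:
    print(["%.4f" % value for value in row])
```

The grid cell nearest the three tightly grouped points gets the highest density, which is what the brightest areas of a KDE map represent. A smaller bandwidth gives sharper, more localized hot spots; a larger one smooths them out.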

2 thoughts on “Mapping tuberculosis in Vitória, Brazil”

  1. Hi Philip, a quick thought or two on your mapping question. Really it is two questions: one about data analysis, the other about data collection.

    These analyses can almost certainly be done in R, which is free, open-source and allows you to generate replicable code: all things ArcGIS does not do (unless you want to get into the Python that runs under the hood). However, there is more of a learning curve on R than on ArcGIS.

    Data collection can be done through smartphone-based apps: the one I know most about right now is OpenDataKit (ODK). Again, it is open-source, and a survey can be built by someone with basic programming skills in Excel. It also allows you to do surveys and take GPS locations very easily, in multiple languages. All of which avoids the need to geocode addresses, which is very labour-intensive, if you can visit respondents’ homes/workplaces/etc.

    Hope the question wasn’t too rhetorical…
