Assisting forensic identification through unsupervised information extraction of free text autopsy reports: the disappearances cases during the Brazilian military dictatorship

2019. English

Anthropological, archaeological and forensic studies identify enforced disappearance as a strategy of the Brazilian military dictatorship (1964-1985), which left hundreds of people without an established identity or cause of death. Their forensic reports are the only remaining clue for identifying them and for detecting possible crimes associated with them. Analysing these reports calls for unsupervised techniques, since contextual annotation is extremely time-consuming, difficult to obtain and highly subjective. Such techniques can support identification and analysis in four directions: common causes of death, relevant locations of bodies, personal belongings terminology and correlations between actors (e.g. doctors or police officers involved in the disappearances).

This paper analyses almost 3000 textual reports of missing persons in Sao Paulo city during the Brazilian dictatorship using unsupervised information extraction algorithms for Portuguese, identifying named entities and relevant terminology associated with these four criteria. The analysis revealed terminological patterns relevant for identifying people (e.g. presence of rings or similar personal belongings) and automated the study of correlations between actors. The proposed system acts as a first classification and indexing middleware for the reports based on these criteria, and offers a feasible way to assist researchers in searching for patterns among autopsy reports.
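
As an illustration of what "correlations between actors" could mean in practice, the following minimal sketch counts how often two actors are named in the same report. It is not the authors' method: the function name `actor_cooccurrences`, the input format (one set of actor names per report, assumed to come from a prior named entity recognition step) and the sample names are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def actor_cooccurrences(reports):
    """Count how often each pair of actors is named in the same report.

    `reports` is an iterable of collections of actor names, one per report.
    The names are assumed to have been extracted beforehand (e.g. by NER).
    """
    pair_counts = Counter()
    for actors in reports:
        # Deduplicate and sort so each pair has a single canonical ordering.
        for pair in combinations(sorted(set(actors)), 2):
            pair_counts[pair] += 1
    return pair_counts


# Hypothetical, made-up data: three reports with the actors named in each.
reports = [
    {"Doctor A", "Officer X"},
    {"Doctor A", "Officer X", "Officer Y"},
    {"Doctor B", "Officer Y"},
]
counts = actor_cooccurrences(reports)
for pair, n in counts.most_common():
    print(pair, n)
```

Pairs with high counts would be candidates for closer manual review; a real pipeline would also need to normalise name variants before counting.
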
Keywords
Information Extraction. Named entity recognition. Terminology extraction. Autopsy reports.
Journal or series
2019 MDPI
Volume 10, 7