Natural Language Processing in Information Metric Studies: an analysis of the articles indexed by the Web of Science (2000-2019)




Natural Language Processing, Information Metric Studies, Social Network Analysis, Scientific Research, Mapping of Science


Objective: To identify the international scientific structure of the research on the use of natural language processing in the information metric studies area.

Methods: The study combines qualitative and quantitative approaches from information metric studies and the knowledge organization domain. The data were retrieved on 02/02/2020 from the Web of Science Core Collection using the expression "natural language processing", limited to the document types article and review, the category Information Science & Library Science, and the last 20 complete years (2000 to 2019). Social network analysis was used to visualize the scientific collaboration, co-citation, and keyword co-occurrence networks.
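As an illustration of the keyword co-occurrence step, the following is a minimal sketch of how weighted edges for such a network could be counted from exported records. It is not the authors' procedure: the function name `cooccurrence_edges` and the sample records are hypothetical, and the normalization (lowercasing, trimming) is an assumption.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_edges(records):
    """Count how often each unordered keyword pair appears in the same record."""
    edges = Counter()
    for keywords in records:
        # Assumed normalization: trim, lowercase, drop duplicates within a record.
        unique = sorted({k.strip().lower() for k in keywords if k.strip()})
        for a, b in combinations(unique, 2):
            edges[(a, b)] += 1
    return edges


# Hypothetical records for illustration only; not data from the study.
sample_records = [
    ["Natural Language Processing", "Citation Analysis", "Bibliometrics"],
    ["natural language processing", "Bibliometrics"],
    ["Citation Analysis", "Text Mining"],
]
edges = cooccurrence_edges(sample_records)
```

Each key in `edges` is an alphabetically ordered keyword pair and each value is the edge weight, which can then be loaded into a network visualization tool.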

Results: Of the 552 documents retrieved, 31 papers were identified as belonging to the information metric studies area. Bibliometric indicators of production, relationship, and impact were considered; they showed an increase in publications in the last three years, with 2018 being the most productive year.

Conclusions: The international scientific literature on the application of NLP in information metric studies is emerging. Scientometrics was identified as the source with the greatest impact. The k-core of the co-citation analysis shows the existence of an important theoretical core that is often cited in the international academic community. The set of NLP techniques (e.g., bag of words, tokenization, word stemming, part-of-speech tagging, and SVM) allows the researcher to go beyond traditional citation analysis and focus on the content and context of the citations.
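For readers unfamiliar with the k-core concept, the following is a minimal sketch of how a k-core can be extracted from an undirected network by repeatedly removing nodes with fewer than k neighbors. The function `k_core` and the toy graph are hypothetical and are not taken from the study; the adjacency mapping is assumed to be symmetric.

```python
def k_core(adjacency, k):
    """Return the nodes of the k-core: the largest subgraph in which
    every remaining node has at least k remaining neighbors."""
    # Work on a copy so the caller's graph is not modified.
    adj = {node: set(neighbors) for node, neighbors in adjacency.items()}
    changed = True
    while changed:
        changed = False
        low_degree = [node for node, neighbors in adj.items() if len(neighbors) < k]
        for node in low_degree:
            # Remove the node and detach it from neighbors that are still present.
            for neighbor in adj.pop(node):
                if neighbor in adj:
                    adj[neighbor].discard(node)
            changed = True
    return set(adj)


# Hypothetical toy co-citation graph: a triangle A-B-C plus a pendant node D.
toy_graph = {
    "A": {"B", "C", "D"},
    "B": {"A", "C"},
    "C": {"A", "B"},
    "D": {"A"},
}
core_2 = k_core(toy_graph, 2)
core_3 = k_core(toy_graph, 3)
```

In this toy graph the 2-core keeps the triangle and drops the pendant node, while the 3-core is empty because no group of nodes is that densely connected.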


Author Biographies

Mirelys Puerta-Díaz, Universidade Estadual Paulista (Unesp)

- PhD candidate in the Information Science Program, Universidade Estadual Paulista (UNESP)

- Assistant Professor at the Faculty of Communication, University of Havana, Cuba

Bianca Savegnago de Mira, Universidade Estadual Paulista (Unesp)

Master's student (PPGCI-UNESP), Department of Information Science, Marília

Daniel Martínez-Ávila, Carlos III University of Madrid

PhD in Information Science, Professor in the Departamento de Biblioteconomía y Documentación.

María-Antonia Ovalle-Perandones, Carlos III University of Madrid

Associate Professor with a PhD (Profesora Contratada Doctora) in the Departamento de Biblioteconomía y Documentación.

Maria Cláudia Cabrini Grácio, Carlos III University of Madrid

Professor with a PhD in the Department of Information Science.


PUERTA-DÍAZ, Mirelys; DE MIRA, Bianca Savegnago; MARTÍNEZ-ÁVILA, Daniel; OVALLE-PERANDONES, María-Antonia; GRÁCIO, Maria Cláudia Cabrini. Natural Language Processing in Information Metric Studies: an analysis of the articles indexed by the Web of Science (2000-2019). Encontros Bibli: revista eletrônica de biblioteconomia e ciência da informação, [S. l.], v. 26, p. 01–24, 2021.

