Method for direct evaluation of automatic indexing through assessment by indexers
DOI:
https://doi.org/10.5007/1518-2924.2024.e96485
Keywords:
Automatic indexing, Automatic indexing system, Evaluation of automatic indexing, Direct evaluation of automatic indexing, Methodological pathway
Abstract
Objective: To evaluate an automatic indexing system applied to e-books, this paper proposes and applies a method for the direct evaluation of automatic indexing, in which indexers assess the quality of the index terms automatically assigned to the documents.
Method: The study is methodological research of a descriptive and applied nature, combining bibliographic and empirical research procedures. First, a literature review was used to define the steps of the proposed method of direct evaluation of automatic indexing through assessment by indexers. A data collection instrument was then built, and the method was applied to evaluate the automatic indexing performed by the SISTRA system on technical-scientific e-books.
Result: The proposed evaluation method is presented through a diagram and a systematic description. It consists of two stages: first, indexers judge the quality of the terms that the automatic indexing system assigned to a sample of digital documents; then, the values calculated for automatic indexing quality metrics are analyzed. Applying the method proved useful for a first evaluation of an automatic indexing system.
Conclusions: The proposed method of direct evaluation of automatic indexing through assessment by indexers standardizes the evaluation and makes it practicable for information professionals. Direct evaluation is a necessary activity for applying and adopting automatic indexing in the subject indexing of digital documents within information units.
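As an illustration only, the two stages described in the abstract (indexer judgments, then metric calculation) can be sketched in a few lines of Python. The abstract does not name the quality metrics or the judgment scale, so everything here is an assumption: a binary acceptable/unacceptable judgment per term, a precision-like ratio per document, and the hypothetical names `Judgment`, `precision_per_document` and `mean_precision`. The sample data are invented and are not results from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Judgment:
    """One indexer's verdict on one automatically assigned term (hypothetical schema)."""
    document_id: str
    term: str
    indexer: str
    acceptable: bool  # assumed binary scale; the paper's actual scale may differ


def precision_per_document(judgments):
    """Share of judgments marked acceptable, computed separately for each document."""
    totals = {}
    accepted = {}
    for j in judgments:
        totals[j.document_id] = totals.get(j.document_id, 0) + 1
        accepted[j.document_id] = accepted.get(j.document_id, 0) + int(j.acceptable)
    return {doc: accepted[doc] / totals[doc] for doc in totals}


def mean_precision(judgments):
    """Unweighted mean of the per-document ratios (0.0 when there are no judgments)."""
    per_doc = precision_per_document(judgments)
    return sum(per_doc.values()) / len(per_doc) if per_doc else 0.0


# Invented sample: two e-books, two assigned terms each, judged by two indexers.
sample = [
    Judgment("ebook-1", "term A", "indexer-1", True),
    Judgment("ebook-1", "term B", "indexer-1", False),
    Judgment("ebook-1", "term A", "indexer-2", True),
    Judgment("ebook-1", "term B", "indexer-2", True),
    Judgment("ebook-2", "term C", "indexer-1", False),
    Judgment("ebook-2", "term D", "indexer-1", True),
    Judgment("ebook-2", "term C", "indexer-2", False),
    Judgment("ebook-2", "term D", "indexer-2", True),
]

per_document = precision_per_document(sample)
overall = mean_precision(sample)
print(per_document)
print(overall)
```

Pooling all indexers' judgments per document is one simple design choice; averaging per indexer first, or requiring agreement between indexers, would be equally plausible readings of the method.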
License
Copyright (c) 2024 Renato Fernandes Correa, Mariângela Spotti Lopes Fujita
This work is licensed under a Creative Commons Attribution 4.0 International License.
The author must guarantee that:
- there is full consensus among all the coauthors in approving the final version of the document and its submission for publication.
- the work is original, and when the work and/or words from other people were used, they were properly acknowledged.
Plagiarism in all of its forms constitutes unethical publication behavior and is unacceptable. Encontros Bibli has the right to use software or any other method of plagiarism detection.
All manuscripts submitted to Encontros Bibli are screened for plagiarism and self-plagiarism. Plagiarism identified during the evaluation process will result in the submission being archived. If plagiarism is identified in a manuscript already published in the journal, the Editor-in-Chief will conduct a preliminary investigation and, if necessary, issue a retraction.
This journal, following the recommendations of the Open Source movement, provides full open access to its content. Authors thereby keep all of their rights while allowing Encontros Bibli to publish their articles and make them available to the whole community.
Encontros Bibli content is licensed under a Creative Commons Attribution 4.0 International License.
Any user has the right to:
- Share - copy, download, print or redistribute the material in any medium or format.
- Adapt - remix, transform and build upon the material for any purpose, even commercially.
According to the following terms:
- Attribution - You must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use.
- No additional restrictions - You may not apply legal terms or technological measures that legally restrict others from doing anything that the license permits.