Bridging the gap between machine translation output and images in multimodal documents

Authors

DOI:

https://doi.org/10.5007/2175-7968.2021.e75483

Abstract

The aim of this article is to report on recent findings concerning the use of Google Translate output in multimodal contexts. The development and evaluation of machine translation often focus on the verbal mode alone, and accounts in the field exploring text-image relations in automatically translated multimodal documents are rare. This work therefore seeks to identify such relations and to describe how they can be analyzed. It is organized in two parts: first, it explores the problem through an interdisciplinary interface between Machine Translation and Multimodality, analyzing examples from the Wikihow website; second, it reports on a recent investigation of suitable tools and methods for properly annotating these issues, with the long-term goal of assembling a corpus. Finally, the article discusses the findings, including some limitations and perspectives for future research.

Author Biographies

Thiago Blanch Pires, Universidade de Brasília

Adjunct professor in the bachelor's program in Applied Foreign Languages for Multilingualism and the Information Society (LEA-MSI) at the Department of Foreign Languages and Translation (LET) of the Institute of Letters (IL), Universidade de Brasília (UnB). He coordinated the undergraduate LEA-MSI program from March 2018 to March 2020. He teaches English, corpus studies, and automated natural language processing. He holds a PhD in Information Science from the Universidade de Brasília and a bachelor's and master's degree in Letters (English Language and Literatures) from the Universidade Federal de Santa Catarina. His research interests include Computational Linguistics, Multimodality, and Corpus Studies.

Augusto Velloso dos Santos Espindola, Universidade de Brasília

Bachelor's degree in Applied Foreign Languages for Multilingualism and the Information Society (LEA-MSI) from the Universidade de Brasília.

Published

2021-05-25

How to Cite

Pires, T. B., & Espindola, A. V. dos S. (2021). Bridging the gap between machine translation output and images in multimodal documents. Cadernos De Tradução, 41(2), 85–108. https://doi.org/10.5007/2175-7968.2021.e75483