Digital collections: practices of preservation and treatment of works

Given the current context, an age of technological innovation, thinking about old printed books without relating them to this new moment is perhaps not the best alternative, especially if we consider paper’s fragility. A question is therefore necessary: what, after all, are the advantages of using digital support in the process of preserving a literary collection? There are many benefits, among them the ease and speed of access to data and information and the democratization of knowledge. With these advantages in mind, the present study aims to present the possibilities of research in digital collections, as well as the preservation practices carried out with literary collections in the city of Caxias-MA. In this article, we detail quantitative and qualitative research consisting of data organization and systematization processes, available on the website Digital Library of Literature in Maranhão (https://www.literaturamaranhense.ufsc.br). This research is based on the ideas of Boeres and Faria (2012), Braga and Diemer (2010), Cúrcio (2006), Di Giorgi (1980), Freitas (2007), Greenhalgh (2011), Lopes (2017), Mota (2014), Monteiro (2009), Reifschneider (2008), Rocha (2012), Santarém Segundo (2014), Sousa and Assis (2018), Sousa, Correia and Assis (2018), Scheibe (2008) and Valle and Araújo (2005).


Introduction
Many literary collections are today vulnerable to the corrosive and destructive processes caused by agents of deterioration. This leads us to question which preservation strategies and solutions are appropriate to keep archives available and, at the same time, to safeguard the historical and cultural heritage they help to keep. Thinking about the economic and social development of a given community without establishing actions that emphasize the material, cultural and historiographic preservation of its memories is, in short, an attempt to erase identity and a form of intellectual imprisonment. One of the possible actions to safeguard literary collections is digitization, which is an important tool for the preservation and dissemination of the literary memory of a community. In addition to facilitating access to information, the systematization and digitization of data benefit the longevity of the documents, allowing access to the content without the need to handle the original and making it available for remote consultation and within the reach of other researchers. Digitization also reinforces the recognition and appreciation of authors who, given the barriers and weaknesses that surround them and their literary creations, are most of the time destined to oblivion and abandonment, bringing to light writers and works that are generally outside the scope of existing research.
With the objective of safeguarding, preserving and making available the data of literary production in Maranhão, the Núcleo de Pesquisa em Literatura, Arte e Mídia - LAMID - Therefore, thinking about the advantages gained with this initiative, in which the book is understood through a new support, the digital one, we highlight in this study the possibilities of research in digital collections, as well as the practices of preservation and dissemination of the literary memory of Maranhão implemented with the aforementioned collections. This process of organizing and systematizing data has been made available on a website, the Portal Maranhão (https://www.literaturamaranhense.ufsc.br).

Methods
The methodological procedures are based on two basic strategies. The first focuses on organizing and sharing the information gathered in the collection and is divided into four succinct stages. The first stage consists of cataloging and dividing the works, which are initially separated and classified according to genre and conservation condition.
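The first stage can be pictured with a small sketch. The record fields, the genre and condition labels, and the sample entries below are illustrative assumptions, not data from the collection; the snippet only shows one possible way of grouping catalogued works by genre and conservation condition.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Work:
    # Hypothetical catalog record; the field names are assumptions.
    title: str
    author: str
    genre: str       # e.g. "poetry", "chronicle"
    condition: str   # e.g. "good", "fragile"


def group_catalog(works):
    """Group work titles by the pair (genre, conservation condition)."""
    groups = defaultdict(list)
    for work in works:
        groups[(work.genre, work.condition)].append(work.title)
    return dict(groups)


# Invented sample entries, for illustration only.
sample = [
    Work("Title A", "Author X", "poetry", "fragile"),
    Work("Title B", "Author Y", "chronicle", "good"),
    Work("Title C", "Author X", "poetry", "fragile"),
]

catalog = group_catalog(sample)
for (genre, condition), titles in sorted(catalog.items()):
    print(genre, condition, titles)
```

Keying the groups on both attributes at once makes it straightforward to list, for instance, all fragile works of a given genre before deciding what to digitize first.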

Results
Founded on August 15, 1997, the Academia Caxiense de Letras, or Coelho Neto's House, brings together a diverse group of poets and prose writers and is today one of the most important memory institutions in the State of Maranhão, with a collection of more than three thousand works. The collection is organized into works by national and international authors in the areas of Literature, History, Education, Law, Sociology, Philosophy and Psychology, and also includes the newspapers Folha de Caxias (1965-1973), O Pioneiro (1990-1994) and Jornal da Cidade (1997-2001), among others.
The work of preserving and disseminating the data present in this collection, as stated above, follows two basic strategies. The first is based on the selection, cataloging and systematization of the collected and organized information, and as a preliminary result, Therefore, given the efforts presented in the initial stages of the preservation work of the (2000), scholars who draw a fine line in the emergence of the stylistic studies focused on the study of language, whose object of study is style.
In Monteiro's (2009) perspective, stylistics acquired the status of a discipline and entered an effervescent phase, in an effort to assert itself before other areas that disputed the same object: style. Two directions divide its trajectory. The first, concentrated on the components of speech, is defined as descriptive stylistics and was conceived by Charles Bally. The second, more inclined towards intuition, is labeled genetic or idealistic stylistics and is linked to the linguist Leo Spitzer.
Regarding its possible developments, Lopes (2017) states that stylistics has acquired, over time, some particularities, subdividing into structural, generative, rhetorical, poetic, semiotic and statistical stylistics, the last of which gave rise to the ideas of stylometry. For him, literary stylometry, also seen as an offshoot of Bally's stylistics, constitutes a resource, a methodology that helps the researcher search for stylistic traits in a more precise way.
Freitas (2007)  According to Lopes (2017), computerized tools assist researchers by accelerating investigations in the field of literature. That is, in the statistical study of literary texts, the researcher's task is to analyze and interpret the data that computer programs make available. The scholar states that the data interpretation solutions such programs offer, while not beyond criticism, make the work of the stylistics researcher faster and more viable.

Authorship attribution: initial results
To illustrate our approach, we now turn to the initial results of our research and present, for a better understanding, the corresponding image of our corpus. Each text is identified in the corpus by a code formed from the first letter of the name of the work and the name of the author; in the case of the analyzed text, Crônicas de Além-Túmulo, we work with the initial hypothesis that it was authored by Chico Xavier.
It is also worth noting that all the selected texts belong to the prose category, the majority of them chronicles, and are labeled as follows: UltC; Seara; Além; Mens; Lição; Crôn; Balas; Semana; Paixão.
The corpus, as a whole, has 452,759 occurrences. The image below shows the layout of the texts, the number of occurrences and words, and their extension. The texts of the analyzed authors, arranged in the same quadrant, present a similar grammatical distribution in terms of the use of bicodes. The author, or "authors", tend to use the following grammatical combinations in the composition of sentences: preposition + noun; noun + conjunction; noun + preposition; determiner + preposition; conjunction + adjective; determiner + adjective; noun + numeral. For the study in question, this is an important piece of information.
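As a rough illustration of what a bicode distribution is, the sketch below counts adjacent pairs of grammatical tags and turns the counts into relative frequencies. It is a minimal sketch under stated assumptions: the tag names and the hand-tagged toy sequence are invented for the example, and the function is not the procedure or software used in the study.

```python
from collections import Counter


def bicode_frequencies(tags):
    """Return the relative frequency of each adjacent tag pair (bicode)."""
    pairs = list(zip(tags, tags[1:]))
    total = len(pairs)
    if total == 0:
        # Fewer than two tags: no pairs can be formed.
        return {}
    counts = Counter(pairs)
    return {pair: n / total for pair, n in counts.items()}


# Hand-tagged toy sequence (an assumption; real texts would need a tagger).
toy_tags = ["PREP", "NOUN", "CONJ", "DET", "ADJ", "NOUN", "PREP", "NOUN"]

freqs = bicode_frequencies(toy_tags)
# Print the pairs from most to least frequent.
for pair, share in sorted(freqs.items(), key=lambda item: (-item[1], item[0])):
    print(pair, round(share, 3))
```

Using relative rather than absolute frequencies keeps texts of different lengths comparable, which matters when the works in a corpus vary in extension.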
Another significant piece of data for the study of authorship attribution is the grammatical distribution of the texts; in the case of Campos, literary criticism highlights the high rate of nouns as a mark of his writing style (ROCHA, 2012).
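The noun rate mentioned above can be expressed as a simple proportion: the number of tokens tagged as nouns divided by the total number of tokens. The sketch below assumes the texts are already tagged; the tag names and the two toy sequences are invented for illustration and do not come from the corpus.

```python
def category_rate(tags, category="NOUN"):
    """Share of tokens whose tag equals `category` (0.0 for empty input)."""
    if not tags:
        return 0.0
    return sum(1 for tag in tags if tag == category) / len(tags)


# Hand-tagged toy sequences (assumptions, not data from the study).
text_a = ["NOUN", "PREP", "NOUN", "DET"]
text_b = ["DET", "ADJ", "NOUN", "VERB", "ADV"]

rate_a = category_rate(text_a)
rate_b = category_rate(text_b)
print(f"text_a noun rate: {rate_a:.2f}")
print(f"text_b noun rate: {rate_b:.2f}")
```

The same function can be called with another category, for example `category_rate(text_a, "PREP")`, to compare the use of other grammatical classes across texts.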
The image below shows once again the proximity of Campos' texts to the work Crônicas de Além-Túmulo, as well as the grammatical forms that appear most often in Da Seara de Booz (1918) and Últimas Crônicas (1962). According to the data, the text by Xavier exhibits more occurrences of preposition + determiner, preposition + pronoun, and nouns, which is in line with the critic's stylistic study (ROCHA, 2012) of Campos' style, as previously highlighted. From the data presented, we can therefore say that this ongoing research has already identified stylistic approximations and similarities between the analyzed text Crônicas de Além-Túmulo, psychographed by Chico Xavier, and the texts by Campos.
Besides authorship attribution, several other kinds of literary research can be done using digital collections. Assis, Sousa and Silva (2020), Assis and Lopes (2019), Assis (2017) and Assis and Sousa (2017) are just some examples of how literary studies face a field with huge opportunities.

Discussion
Returning to the question raised at the beginning of our article, namely what the advantages are of using digital support in the process of preserving a literary collection, we can say that the changes in the perception of the printed book are notable, as are the significant advantages of preserving a literary collection from a digital perspective.
Santarém Segundo (2014), in the work O uso de elementos semânticos no processo de recuperação da informação em ambientes digitais 9, says that it is currently possible to verify that several institutions and organizations have made decisions regarding processes that involve the use of information in digital format. The researcher also points out that most of these institutions have been concerned with creating documents that, while persisting in paper format, can also be converted to digital format through digitization processes, considering that the book, in printed format, is materialized utilizing digital tools. Likewise, we see that:
Technology has been opening space for the process of digitizing documents, enabling consultation of the digital library. It is a new way to access content; it is a new format for the document, digital format, which greatly facilitates the search and the remote access to users (BRAGA and DIEMER, 2010, p. 10, our translation).
Therefore, given the current context of technological developments, the process of digitizing bibliographic collections appears as an alternative for preserving and facilitating access to information, whose main line of defense is the argument that it will benefit the longevity of the collections. Digital data also allows access to content without the need to handle the original document, in addition to facilitating contact with the books, making them available for remote consultation and within the reach of other researchers, as highlighted by Greenhalgh (2011).
Valle and Araújo (2005) point out that in the universe of conventional conservation techniques, preservation and access are dimensions that are not only distinguishable but also often opposed. Frequently, the only way to guarantee the preservation of an item is to reduce its circulation. However, with the application of digital technology, this scenario is radically transformed, as these dimensions become related and cooperative.
According to the researchers, digitization brings countless possibilities to the universe of collections preservation. However, its application to artifacts of permanent value must be conducted with care and accompanied by a long-term strategy, at the risk of placing the collection at the mercy of the fragility of digital technology. According to them, the issue of digital longevity and access must be addressed in any reformatting plan for the digital media.
9 The use of semantic elements in the information retrieval process in digital environments.
Along different lines, Reifschneider (2008), also addressing the problem of "Rare Works" objects, raises the following question: how can access to rare works be promoted while preserving them so that access can continue? Santarém Segundo (2014) presents a possible solution for the promotion and preservation of documents and rare works. The researcher recommends digital repositories, information environments constituted from and by technological tools.
Digital repositories are information environments created from technology tools/platforms, which can also be called information systems, capable of receiving deposits of digital objects, in various formats, whether made by the authors themselves or by teams trained for this purpose, in order to store and organize these objects so that they can be recovered and mainly preserved for a long time (SANTARÉM SEGUNDO, 2014, p. 112-113, our translation).
Regarding the basic concepts of storage and preservation, Boeres and Farias (2012) say that these remain applicable in the new professional culture, with computational tools replacing pencil and paper and generating new routines that allow other processes of storage and recovery. This concept becomes even more valid as the importance and fragility of printed literary collections are perceived. This cultural heritage can be corroded and destroyed in many ways, exposed as it is to agents of deterioration such as humidity, temperature and human action, mainly through inadequate handling and lack of technical knowledge in dealing with the document.
Therefore, understanding all the changes and resignifications of the process of safeguarding and preserving printed literary collections, Sousa, Correia and Assis (2018)

Conclusion
As initially stated, the goal of this article was to highlight the possibilities of research in digital collections. We also emphasized the preservation practices implemented with collections, a work that is divided and strategically organized, understood as a process of selection and systematization of data, in which information about the life and work of authors from Maranhão is made available on the Portal Maranhão website.
Regarding the work of safeguarding literary collections, Mota (2014) says that public archival institutions, considered instruments of democratization of information, hold rich documents of social interest in their collections that often are not accessed due to poor conservation conditions, reduced access and society's lack of knowledge of their existence.
Therefore, considering this information and the fragility of printed literary collections, as well as the possibilities of corrosion and destruction of this cultural heritage, which is often exposed and subjected to agents of deterioration, we emphasize that the need for and importance of the digitization process become evident. In addition to dissemination and availability in digital support, there is also the material preservation of the original printed document, since there is no need for direct contact with it, which will prolong its longevity. With the availability of these works on the internet comes what we call the democratization of knowledge, as the researcher can have quick and remote access to these documents, many of them considered rare and out of reach for most users.