Lexical-semantic integration by good and poor reading comprehenders

In this paper, we investigate the level of vocabulary knowledge and the lexical-integration ability of good and poor comprehenders in the 8th grade of Elementary School. The participants were assessed in the following tasks: reading comprehension, listening comprehension, decoding, vocabulary, lexical-semantic integration and incongruence detection. The performance comparison revealed that good comprehenders performed significantly better than poor comprehenders on the measures of vocabulary and integration. The difference in accuracy on the integration tasks remained significant after controlling for word knowledge. The results suggest that good and poor comprehenders differ not only in lexical-semantic knowledge but also in lexical-semantic processing.


Introduction
Lexical-semantic integration is one of the sub-processes involved in reading comprehension; it consists of constructing an adequate meaning for every word according to its discursive context (Perfetti & Stafura, 2013). In order to integrate words with the context, readers need to be able to apply their lexical-semantic knowledge to select the proper meaning of the word. This process is like "tuning", finding the precise match between word and context, which sometimes demands not only word selection, but also meaning (re)construction. The reader elaborates word meaning through lexical integration in order to create local and global coherence. By integrating the word into the text, the reader increments and updates the textual model (Kintsch, 1998). The context functions not only as a constraint on the semantic scope of the word but also as an arrow pointing to the meaning to be selected among the others available in memory (Elman, 2009). In fact, it may also add to the word semantic-pragmatic aspects that do not exist in its representation, which may result in learning and, consequently, improve the quality of the reader's lexical-semantic representation. Stafura, Rickles and Perfetti (2015) have shown that multiple processing domains are involved throughout the process of word-to-text integration, including message comprehension, lexical association and textual memory. This process has been underestimated by researchers who believe that reading difficulties lie at a more global level. As a consequence, studies on integration ability with good and poor comprehenders in adulthood are still scarce. Yang, Perfetti and Schmalhofer (2005) used ERP to investigate differences between good and poor comprehenders in lexical-semantic integration performance.
In their study, participants read short passages composed of two sentences that were related through one of four types of connection: explicit (lexical), by paraphrasing (semantic), by inferencing (situational model), and pattern/standard (with no implicit or explicit reference, hence difficult integration). The passages were presented word by word on a computer screen, so as to allow the recording of the electric potentials generated during the reading of each word. After reading some passages, the participants answered true-or-false questions related to their content. The researchers observed a different pattern of activation in the poor comprehenders group. They interpreted it as either slower activation or slower selection of information for integration, compared with the good comprehenders' performance.
This result may be explained by the lexical quality hypothesis (Perfetti & Hart, 2001), which postulates that reading efficiency is affected by the quality of the readers' lexical representations. While reading words, readers with weak lexical-semantic representations take longer to activate and select the appropriate meaning to be integrated into the situational model, which may affect the comprehension process. Readers with comprehension difficulties are slower to establish semantic relationships between the words in the text. Henderson, Snowling and Clarke (2013) observed that this difficulty impairs these readers' ability to inhibit irrelevant information and to keep important information active in memory. These studies not only corroborate the hypothesis that poor comprehenders present a semantic deficit (reduced knowledge of words' meanings) when compared to good comprehenders (Nation & Snowling, 1998), but also bring evidence on how difficulties at the word level may generate comprehension problems at the global level, thus indicating the importance of research on lexical-semantic integration processing for the study of reading comprehension difficulties. Studies with homophones in lexical decision tasks, investigating both isolated words and words in sentences, also portray contrasts between good and poor comprehenders' processing and lexical representation (Perfetti & Hart, 2002). Results showed that good comprehenders have more integrated lexical representations; that is, in poor comprehenders orthographical knowledge is not as strongly integrated with phonological and/or semantic knowledge as it is in good comprehenders. There are associated deficits between lexical knowledge and the decoding ability. Perfetti, Yang and Schmalhofer (2008) implemented an ERP study to investigate the process of word-to-text integration and, similarly to the study previously mentioned, found differences between the two groups.
Poor comprehenders presented a slower and less effective integration process. Both studies mentioned above corroborated the hypothesis of differences in the quality of lexical representations between the groups of readers. However, further research should extend the investigation of the relationship between reading comprehension difficulties and lexical-semantic knowledge, mainly concerning depth of vocabulary knowledge and the ability to use it in the process of integrating words into the textual model. This study aims to investigate lexical-semantic knowledge and integration ability by comparing accuracy and response time between readers with different levels of textual comprehension in two integration tasks.

Participants
Forty-nine (49) good comprehenders and thirty-seven (37) poor comprehenders participated in the study. They were selected from an initial group of 336 eighth graders, a convenience sample drawn from state public schools in the city of Santa Cruz do Sul, in a state in the South of Brazil. They were all native speakers of Brazilian Portuguese and presented neither neurological problems nor special educational needs. The good comprehenders group was composed of 16 (32.70%) boys and 33 (67.30%) girls, while the poor comprehenders group was composed of 15 (40.50%) boys and 22 (59.50%) girls. The mean age of the participants was 14 years. The good comprehenders' mean age was 14.20 (SD = 0.76) and the poor comprehenders' mean age was 14.51 (SD = 1.01) years.
Participants' selection occurred in two steps. Initially, the students answered a reading comprehension task designed for the study. The task included three texts of comparable length and readability, each followed by five multiple-choice questions. Based on individual performance, two groups of students were formed: one scoring 1 standard deviation (SD) below the average and the other scoring 1 SD above the average. After the selection of the groups, the students completed the isolated word reading task (Salles & Parente, 2002). Students who achieved performance adequate to their age, according to the norms described by Salles, Piccolo, Zamo and Toazz (2013), remained in the study. The selection excluded readers with decoding difficulties from the lower performance group, since the study aimed to analyze the performance of readers with difficulties exclusively in comprehension.
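The first selection step can be illustrated with a minimal Python sketch. It assumes inclusive cutoffs at exactly 1 SD from the mean and the sample (n - 1) standard deviation; the text does not specify either detail. The function name, participant identifiers and scores are hypothetical and are not data from the study.

```python
from statistics import mean, stdev


def select_extreme_groups(scores, k=1.0):
    """Return (low, high) participant ids scoring at least k SDs from the mean.

    Assumption: cutoffs are inclusive and the SD is the sample SD (n - 1).
    """
    values = list(scores.values())
    m = mean(values)
    sd = stdev(values)
    low = sorted(pid for pid, s in scores.items() if s <= m - k * sd)
    high = sorted(pid for pid, s in scores.items() if s >= m + k * sd)
    return low, high


# Hypothetical comprehension scores (number of correct answers out of 15).
scores = {"s01": 3, "s02": 8, "s03": 9, "s04": 10, "s05": 14, "s06": 8, "s07": 9}
low, high = select_extreme_groups(scores)
print(low, high)
```

In this toy sample only the two most extreme students fall outside the 1 SD band; the students in between would not be assigned to either group.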
Still in the selection phase, we observed listening comprehension to check whether the difficulties presented by the poor comprehenders were restricted to the written text, or whether they also extended to the auditory modality. The listening task followed the same format as the written one: three texts, comparable in length and readability, each followed by five multiple-choice questions. In Table 1, we present the groups' characterization.

Instruments and procedures
The study included a vocabulary task and two lexical-integration tasks. Data were collected at school, individually in a quiet room. The integration tasks were administered separately in two meetings of approximately 25 minutes. The task of word integration followed the semantic judgment task, while the incongruence task followed the lexical decision task. The vocabulary task was carried out in a separate meeting. Before data collection started, a pilot study was conducted with 15 students of the 8th grade who participated in the first selection stage but had not been included in the study. They received printed tasks with these instructions: "a) Integration: Read the text excerpts and mark the alternative containing the meaning of the word which is in bold in the context; b) Incongruence: Read the text excerpts. There is a word misused in each one of them. Mark the alternative that presents the word that does not make sense in the text." Response analysis showed the necessity of substituting and shortening some excerpts, as well as modifying some alternatives. After these modifications, three specialist judges, members of the GENP (Group of Studies in Neurolinguistics and Psycholinguistics) at PUCRS, evaluated and approved the tasks. Then a pilot with three participants was carried out using E-Prime (Professional 2.0.10.242) to ensure the instructions and time were adequate. The tasks were administered according to the following description:
• Word definition: it assessed the semantic knowledge of words using the WISC III (Wechsler, 2002) vocabulary subtest. The task requested the oral definition of thirty words in increasing order of difficulty; the definitions were recorded on a sheet to be analyzed afterwards. Before starting the task, each participant was given a practice session with three words. Data were collected by undergraduate students from the Laboratory of Psychology. Definitions were graded 0 to 2 based on the responses provided by the standardization sample presented in the administration manual.
• Word integration: it assessed accuracy and RT in the lexical-semantic integration task. Twenty extracts of texts of varying genres taken from newspapers, magazines and blogs were used. Their length varied from 24 to 54 words, with a minimum of 50 and a maximum of 91 syllables. The alternatives were built by consulting a dictionary of synonyms (http://www.sinonimos.com.br/). The texts were displayed individually in black Courier New font, in size 18 on the first screen and in size 15 on the second screen, centered on the white screen of a laptop. The first screen presented only the text with one of the words in CAPITAL LETTERS. After reading the text, the participant had to press the space bar for the second screen to appear, in which the same text was presented again, now with five possible meanings of the word in capital letters, as shown in the example below. The participants had to choose the alternative with the most adequate meaning according to the text, by pressing the number on the keyboard that corresponded to the correct alternative. We illustrate below in italics a sample of the task administered in Brazilian Portuguese. Its translation into English follows.
• Word incongruence: this task also assessed lexical-semantic integration by demanding the identification of words used erroneously. Twenty extracts of texts of varying genres taken from newspapers, magazines and blogs were used. Their length varied from 25 to 50 words, with a minimum of 65 and a maximum of 99 syllables. The texts were displayed individually in black Courier New font, in size 18 on the first screen and in size 15 on the second screen, centered on the white screen of a laptop. On the first screen only the text was shown. After reading the whole text, the participant had to press the space bar for the second screen to appear, in which the same text was presented again, now with five words among which the student had to choose the one possibly used incorrectly in the text in terms of meaning, by pressing the corresponding number on the numeric keypad. Below, we present in italics one of the excerpts used in the task as administered in Brazilian Portuguese, followed by its translated version into English.
The E-Prime software was used for stimulus presentation in both tasks and for recording RT (for both the text reading and the word choice) and accuracy. A fixation point (+) was displayed for 1000 ms after the response was given, and then the next stimulus appeared on the screen. The position of the correct response was balanced among positions 1 to 5. Two versions of the experiment were designed to alter the order of text presentation. The participants were requested to keep their hands on the keyboard next to the numeric keys. At the end of the experiment, a screen thanking the participants was displayed.
Results showed that only one student in the good comprehenders group scored below the expected average for his age (9 points) according to the norms of the test. In the poor comprehenders group, 10 students (27%) scored below average for their age (9, 8 and 7 points). Unlike receptive vocabulary tasks, in which the participant only recognizes the word, this task can be considered a measure of depth of vocabulary knowledge, since the examinees produce word definitions, allowing the examiner to assess the depth of semantic knowledge and not only vocabulary size. The results of the word definition task are in accordance with previous studies that demonstrated that good comprehenders present larger vocabulary knowledge than poor comprehenders do (Catts, Adlof & Weismer, 2006; Pimperton & Nation, 2010; Ricketts, Sperring & Nation, 2014; Spencer, Wagner & Petscher, 2019).
As reported in Table 3, good comprehenders scored better than poor comprehenders in the lexical-semantic integration task, even when word reading ability was controlled in an ANCOVA (p = .001). The result indicates that the groups differed in the ability to integrate words into the context, which requires readers to apply their lexical-semantic knowledge when selecting the adequate meaning of the word. Response time included the time taken by the participants to read the texts when presented on the first screen and the time taken to choose the correct alternative among the five options. In order to interpret RT, the authors tested with E-Prime several timings that could possibly occur: fast reading of the text, slow reading, reading only the alternatives, indicating an alternative without reading the remaining options, and re-reading the text together with reading the alternatives and responding. In less than 1000 ms it would not be possible to distinguish between the options and indicate a response. Thus, RTs below this value could indicate mere guessing. None of the participants presented RTs lower than 1000 ms, so no exclusions were necessary. It was not possible to establish a maximum RT because, besides reading the alternatives, some students needed to re-read parts of the text or the whole text to make their decisions. Therefore, long RTs were not excluded. The group mean RTs shown in Table 3 were calculated considering only correct responses. In addition, RTs of participants whose number of errors was at or above the 97th percentile were excluded, that is, RTs of students who made more than 13 errors in the task, which corresponded to six participants of the poor comprehenders group. A minimum time of 4000 ms was established for text reading, taking as a parameter the reading time for the text with the lowest number of syllables. Applying this criterion, 21 (1.23%) reading times lower than the established minimum were manually excluded. One participant of the good comprehenders group was excluded for showing reading times below 4000 ms in 10 out of the 20 texts presented.
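The exclusion criteria described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the data layout (`Trial`), the function name and the example values are hypothetical, and the sketch assumes that the 4000 ms reading-time floor applies to all trials while the correct-response restriction applies to choice RTs only, which the text does not state explicitly. The original filtering was done with E-Prime and manual exclusion, not with this code.

```python
from dataclasses import dataclass
from statistics import mean

MIN_RESPONSE_MS = 1000  # shorter choice RTs are treated as possible guesses
MIN_READING_MS = 4000   # shorter text-reading times are discarded
MAX_ERRORS = 13         # more errors than this excludes a participant's RTs


@dataclass
class Trial:
    participant: str
    correct: bool
    reading_ms: int   # time spent on the first screen (text only)
    response_ms: int  # time spent choosing among the five alternatives


def mean_times(trials):
    """Return {participant: (mean reading ms, mean response ms)} after filtering."""
    by_participant = {}
    for t in trials:
        by_participant.setdefault(t.participant, []).append(t)

    result = {}
    for pid, ts in by_participant.items():
        # Drop participants with too many errors from the time analysis.
        if sum(not t.correct for t in ts) > MAX_ERRORS:
            continue
        reading = [t.reading_ms for t in ts if t.reading_ms >= MIN_READING_MS]
        response = [t.response_ms for t in ts
                    if t.correct and t.response_ms >= MIN_RESPONSE_MS]
        if reading and response:
            result[pid] = (mean(reading), mean(response))
    return result


# Hypothetical trials: p1 has one reading time under 4000 ms and one error;
# p2 makes 14 errors in 20 trials and is therefore left out.
trials = [
    Trial("p1", True, 5000, 3000),
    Trial("p1", True, 3500, 2000),
    Trial("p1", False, 6000, 4000),
]
trials += [Trial("p2", False, 5000, 3000) for _ in range(14)]
trials += [Trial("p2", True, 5000, 3000) for _ in range(6)]
times = mean_times(trials)
print(times)
```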
Although good comprehenders read faster than poor comprehenders, the difference between groups was not significant. Yet the reading time per syllable reached marginal significance, with an advantage for the good comprehenders group. Two hypotheses can be drawn to explain this result. Firstly, the reduced number of stimuli (only twenty) may have lowered the statistical power to show a difference between groups. Secondly, good comprehenders' response times may indicate a more careful reading to achieve a better performance. In the same way, Finger-Kratochvil (2010), by analyzing undergraduate students' reading, also stressed that higher reading times do not necessarily correspond to lower comprehension levels, since a higher reading time may as well be the result of more attentive reading.
In the word incongruence task, the reader was asked to judge which word was being used incorrectly in the text, interfering with the integration process. To do the task, the readers had to analyze word meaning at its local and global levels, verifying whether or not it was possible to integrate it. Table 4 shows that good comprehenders achieved much higher mean accuracy than poor comprehenders. This difference remained significant when the reading ability of isolated words was controlled in an ANCOVA (p = .001). The task also measured the reading time of the texts presented separately on the first screen, followed by the RT taken to choose, among five possible options, the word that had been incorrectly used in the text. We adopted the same criteria used to analyze the integration task. E-Prime helped to filter and eliminate RTs below 1000 ms and those of incorrect responses. To calculate the median RT, the data of readers with errors at or above the 97th percentile were filtered out, eliminating 17 errors produced by three poor comprehenders. To measure the text reading times, a minimum of 4000 ms was established, which resulted in 23 reading times (1.33%) being filtered out for falling below the established minimum.
Differently from what happened regarding accuracy data, statistical differences were not found between groups in time measures concerning the incongruence task. The time for deciding and choosing the inadequate word was equivalent in both groups. Despite their difficulties, poor comprehenders did not devote more time to read the texts and to solve the task. The marginally significant difference found in reading time per syllable in the integration task was not replicated in the incongruence task. Therefore, the data do not confirm the difference in reading speed between groups in the integration task.
Significant correlations were found between the two integration tasks for both accuracy (r_s = 0.548; p = .001) and RT (r_s = 0.451; p = .001). However, in the integration task the mean reading time of the texts and the mean time per syllable were higher, especially in the poor comprehenders group, despite the similar length of the texts used in the two tasks. In the integration task, the text was presented on the first screen with the word to be questioned appearing in capital letters. This could have led poor comprehenders to pay more attention to the word. The good comprehenders, in contrast, possibly because they had less difficulty or adopted some type of strategy, did not alter their reading rhythm and presented a similar time in the two tasks.
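For readers unfamiliar with the coefficient, the sketch below shows one common way to compute a rank correlation, assuming that the r s values reported above denote Spearman's coefficient: the values are converted to ranks (ties receive the average rank) and Pearson's formula is applied to the ranks. The function names are hypothetical, and this is not the statistical software used in the study.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def spearman(x, y):
    """Spearman's rank correlation: Pearson's r computed on average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


print(average_ranks([10, 20, 20, 30]))
print(spearman([1, 2, 3, 4], [10, 20, 30, 45]))
```

Because only ranks enter the computation, any monotonic relation between two measures yields a coefficient of 1 or -1, which makes the statistic suitable for accuracy counts and skewed time data.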
The results showed that good and poor comprehenders differed in lexical-semantic integration: good comprehenders were better able to integrate words into the local and global contexts. The difference in integration ability between the groups had already been identified by Yang et al. (2005) and Perfetti et al. (2008), as detailed in the introduction. The latter study developed an online investigation of the integration process by means of ERP with adult readers with distinct levels of comprehension. The time measures obtained with ERP were taken during reading, word by word, which fosters precision and reliability. The authors found evidence of different processing times, suggesting differences in lexical-semantic processing between groups. Readers with lower comprehension ability presented slower and less efficient lexical-semantic identification and integration when compared to good readers.
Another study (Henderson, Snowling & Clarke, 2013) analyzed the priming effect in spoken sentences, divided into subordinate and control sentences, whose ending was completed by a picture presenting a homonymic subordinate relation (appropriate) or a homonymic dominant relation (inappropriate), with SOAs of 250 and 1000 ms. Results showed that the poor comprehenders, with a mean age of ten years, were significantly slower than the good comprehenders in picture naming; at the 250 ms SOA, they exhibited a priming effect in both the appropriate and inappropriate sentence conditions, while at the 1000 ms SOA they did not exhibit priming in the appropriate sentence condition, although they did in the inappropriate condition. On the other hand, the good comprehenders exhibited priming only in the appropriate sentence condition. According to the authors, the priming effect in the inappropriate sentence condition may indicate problems in the inhibition of irrelevant information during the integration process. Despite the differences in the type of stimuli adopted (Henderson and colleagues presented sentences, whereas in our study we presented text excerpts), in both studies poor comprehenders demonstrated lower lexical-semantic integration ability than good comprehenders.
There were no differences in time between groups in the integration tasks possibly due to the methodological choices adopted. More specifically, time registration was not taken online, but at the end of text reading and at the moment when one of the five alternative words had to be chosen. The time taken to read the alternatives and decide about word meaning, or decide which word had been incorrectly used, may not have been precise, since the reader could have re-read some parts of the text, or the whole text in order to answer.
The incongruence task, also called error judgment, is commonly used in research on monitoring ability during reading, because perceiving an incongruence requires attention to the meaning being constructed in the text. Ehrlich, Remond and Tardieu (1999) studied metacognitive monitoring in an error judgment task through the resolution of two types of anaphors: pronominal and lexical. Anaphor processing is a typical lexical-semantic integration task, since for the new word to be integrated into the context it is necessary to recover the antecedent linguistic element to which the word refers. The groups with higher and lower levels of reading comprehension did not exhibit a difference in reading time in the error judgment task. However, the results showed that poor comprehenders tend to overestimate their comprehension ability, re-read the text fewer times and are less able to notice textual inconsistencies; when they do perceive them, they often fail to identify the incongruent word. Conversely, good comprehenders manage to adapt their reading strategies to the task, exhibiting a higher number of re-readings and a higher reading time in the error judgment task than in the reading comprehension self-assessment task. The researchers concluded that poor comprehenders show a deficiency in monitoring their reading; however, they could have concluded as well that the group presents problems in anaphora resolution and, consequently, in word-to-text integration. The greater use of re-reading strategies by good comprehenders corroborates the hypothesis that higher RTs in lexical-semantic integration tasks may indicate a more careful reading and greater use of strategies instead of difficulty in solving the task.
The semantic incongruence task proposed in our study was possibly influenced by the use of strategies and reading monitoring processes. When analyzing time measures under the same perspective taken by Ehrlich et al. (1999), it is possible to interpret them as evidence of monitoring processes including self-assessment and revision. Thus, it seems that the reading and answering times in the lexical-semantic integration tasks may reflect the fact that good comprehenders used strategies, including monitoring, in order to achieve good performance, while poor comprehenders, possibly less aware of their difficulties, may not have used good strategies, or may have used inadequate ones. The use of verbal protocols in this type of task, as in Ericsson and Simon (1993), could elucidate these issues, which cannot be explained via RT analyses. We also analyzed the relationship between lexical-semantic integration ability and the other measures of reading, comprehension and vocabulary of this study. As shown in Table 5, the two tasks examining lexical-semantic integration ability correlated moderately with the other measures. The correlation value between lexical-semantic integration and isolated word reading is very similar. Also, the correlation with listening comprehension and with reading comprehension is similar, suggesting that vocabulary knowledge may influence linguistic comprehension regardless of the modality (listening or reading). We also tested whether the correlation between lexical-semantic integration and reading comprehension would remain when vocabulary was controlled.
The Pearson test of partial correlation showed that the relation between reading comprehension and the lexical-semantic integration task presented little reduction and remained significant (r = 0.46; p = .001) when the WISC vocabulary measure was controlled, and the same occurred regarding the semantic incongruence task (r = 0.53; p = .001). The correlation also remained when isolated word reading was controlled, presenting similar values to those obtained when vocabulary was controlled. Thus, the ability of word-to-text integration seems to be crucial to reading comprehension, which can be a major point underlying research on the bases of reading comprehension difficulties.
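The logic of controlling for a third variable can be made concrete with the standard first-order partial correlation formula, sketched below in Python. The function name is hypothetical and the input values are illustrative only; they are not the zero-order correlations of this study, which are reported in Table 5.

```python
from math import sqrt


def partial_corr(r_xy, r_xz, r_yz):
    """Correlation between x and y with z held constant.

    r_xy.z = (r_xy - r_xz * r_yz) / sqrt((1 - r_xz^2) * (1 - r_yz^2))
    """
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Illustrative values only: x = reading comprehension, y = integration task,
# z = vocabulary. These numbers are hypothetical.
print(round(partial_corr(0.6, 0.5, 0.4), 3))
```

When the controlled variable correlates with both measures, the partial coefficient is smaller than the zero-order one; a coefficient that shows little reduction, as reported above, indicates that the controlled variable accounts for only a small part of the shared variance.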
In order to further explore the data in this study, an ANCOVA was also carried out to verify whether the difference between groups would remain while controlling for lexical-semantic knowledge (WISC III). The test showed a statistical difference between groups both in the lexical-semantic integration task (p = .001) and in the incongruence task (p = .001) when vocabulary knowledge was controlled. This evidence is relevant, since it suggests that good comprehenders and poor comprehenders differ in lexical-semantic integration processing regardless of their vocabulary knowledge, indicating that reading comprehension difficulties may originate both from vocabulary knowledge and from lexical-semantic processing deficiencies.
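The idea behind comparing groups while controlling for a covariate can be sketched with the classical covariate-adjusted means: each group mean on the outcome is shifted along the pooled within-group regression slope to the value it would have at the grand mean of the covariate. The Python sketch below computes only these adjusted means, not the F test or the p values reported above; the function name and the data are hypothetical.

```python
from statistics import mean


def adjusted_means(groups):
    """groups: {name: [(covariate, outcome), ...]} -> {name: adjusted mean}.

    Uses the pooled within-group slope of outcome on covariate.
    """
    grand_x = mean([x for pairs in groups.values() for x, _ in pairs])
    sxy = sxx = 0.0
    centers = {}
    for name, pairs in groups.items():
        mx = mean([x for x, _ in pairs])
        my = mean([y for _, y in pairs])
        centers[name] = (mx, my)
        sxy += sum((x - mx) * (y - my) for x, y in pairs)
        sxx += sum((x - mx) ** 2 for x, _ in pairs)
    slope = sxy / sxx
    return {name: my - slope * (mx - grand_x)
            for name, (mx, my) in centers.items()}


# Hypothetical (vocabulary score, integration accuracy) pairs per group.
data = {
    "good": [(10, 14), (12, 16), (14, 18)],
    "poor": [(8, 8), (10, 10), (12, 12)],
}
adjusted = adjusted_means(data)
print(adjusted)
```

In this toy example the raw difference between group means is 6 points and the adjusted difference is 4 points: part of the gap is attributable to the covariate, but a difference remains after adjustment, which is the pattern the analysis above describes.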

Final considerations
Besides confirming results reported in previous literature about the difference in lexical-semantic ability between good comprehenders and poor comprehenders, this study showed that this difference occurs independently of the decoding ability and vocabulary knowledge of the groups. Moreover, the study suggested that, despite being independent, lexical-semantic ability relates to decoding ability and vocabulary knowledge, as well as to reading and listening comprehension ability. Results were not conclusive regarding reading and response time, indicating the need for more precise instruments to develop this type of research.
As demonstrated by this study, lexical-semantic integration is an important ability for reading comprehension, despite still being scarcely investigated. Analyzing how and to what extent this ability relates to reading is a promising way to elucidate reading comprehension processes and the difficulties related to them. Further studies should seek explanations of how lexical-semantic integration and global integration interact and cooperate throughout the reading comprehension process. Diversifying methods may contribute to a thorough examination of lexical-semantic integration from several perspectives and detail the stages of this important process.