The Effect of Altered Lexile Levels of Informational Text on Reading Comprehension

Introduction
We examined how using five different simplified texts on the same subject would affect reading comprehension. A total of 335 students in grades four through eight read one of five texts retrieved from Newsela.com and then completed a comprehension test. Results from a three-way ANOVA showed no significant interaction among grade, reading level, and text condition. Pairwise comparisons showed that below-level readers’ scores improved only with much lower levels of text, and on-level and above-level readers’ scores did not significantly change regardless of text level. Regression analysis showed no statistically significant contribution of text level to overall comprehension scores. The findings of this study have implications for choosing leveled texts for reading instruction.

in the United States, are simplified versions of original news articles, available at different reading levels. Because there has been some controversy over whether simplified texts such as these help with reading comprehension (Hiebert, 2018; Mesmer et al., 2012; Tortorelli, 2020), we carried out this study to determine whether different levels of a simplified text from Newsela made it easier for elementary students to comprehend the text as intended.

Background
The Simple View of Reading proposes that reading comprehension is a product of both decoding and language comprehension (Gough & Tunmer, 1986). If a reader has difficulty in either process, they will be unable to fully comprehend the text. Even if students can decode the words in the text, they may have difficulty with the many aspects of language comprehension, such as background knowledge, vocabulary, language structures, verbal reasoning, and understanding of genre, especially when the texts are more complex (Scarborough, 2001). Scholars have developed models of reading comprehension that use subcomponents of comprehension, such as the direct and inferential mediation model (Elleman & Oslund, 2019). This model shows that vocabulary is the strongest predictor of reading comprehension, with inference-making and background knowledge also having strong effects on comprehension. Other research shows that both vocabulary and syntactic structure affect reading comprehension (Mokhtari & Niederhauser, 2017).
In order to scaffold students' comprehension of text, publishers have adapted texts to change the linguistic features that affect comprehension (e.g., Fountas & Pinnell, 2014). In theory, this would allow differentiation by reading ability in a classroom without having to use different content for different levels of readers. In 1989, Stenner and colleagues received NIH funds to research a way to allow educators and publishers to categorically evaluate and predict the complexity of a text. They developed The Lexile Framework for Reading, a quantitative system that accounts for such factors as average sentence length, average word length, numbers of words per passage, the rareness of vocabulary, and the average number of clauses per sentence (Smith et al., 1989). According to Metametrix (2021)

Use of Digital Leveled Texts
Recent technology has allowed companies to produce multiple digital versions of the same text at different Lexile levels by manipulating some of the variables that contribute to text complexity. Newsela.com, for example, offers news articles and other informational texts that have been adapted to produce three to five different Lexile levels of the same article. These texts range from about 300L (about the 1st grade level) to the level of the original article, which typically measures around 1200L (about the 11th or 12th grade level) (Newsela, 2022). The articles cover many social studies and science topics and are intended to be used to teach both literacy skills and subject content knowledge.
There has been little research on the efficacy of these simplified texts for aiding comprehension. It is unclear how using these formulas to alter authentic texts to score lower on a readability scale will affect students' comprehension. From the perspective of the Simple View of Reading (Gough & Tunmer, 1986), simplifying the text could improve both decoding, by using words that are phonetically simple or regular, and language comprehension, by simplifying the vocabulary and sentence structure of the text.
There has been some argument, however, that artificially changing the components of a text to decrease the complexity may inadvertently make a text more challenging to read and comprehend (Hiebert, 2018; Mesmer et al., 2012; Tortorelli, 2020). It is possible that by shortening sentences, certain signal words are eliminated, and common syntactic structures are lost, making it more difficult to determine the relationship between ideas or recognize common grammatical patterns, especially for second-language learners (Hiebert, 2018; Mesmer et al., 2012). By lowering the vocabulary demands of a text, the semantic complexity is changed in terms of concreteness and subtlety of meaning (Tortorelli, 2020). These simplifications of the text become especially important in upper elementary grades, when students are expected to evaluate an author's perspective or voice from a text (National Governors Association Center for Best Practices & Council of Chief State School Officers, 2010). Another concern is that simplifying texts for linguistic features as measured by Lexile does not address the effect of background knowledge or inference on comprehension (McNamara, Ozuru, & Floyd, 2011).
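The concern about shortened sentences can be illustrated with a minimal sketch that computes two of the surface features readability formulas rely on. This is not the Lexile formula, which is not given in this paper; the function name `surface_features` and both sample sentences are hypothetical and were written only for this illustration.

```python
import re

def surface_features(text):
    """Return mean sentence length (in words) and mean word length (in letters)."""
    # Crude sentence split on end punctuation; abbreviations are not handled.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    return {
        "mean_sentence_length": len(words) / len(sentences),
        "mean_word_length": sum(len(w) for w in words) / len(words),
    }

# Hypothetical sentences, not taken from the Newsela articles.
original = "Because the robot swims silently, it can approach fish without frightening them."
simplified = "The robot swims quietly. It can get close to fish. The fish are not scared."

print(surface_features(original))
print(surface_features(simplified))
```

In this made-up pair, the simplified version scores lower on both features, yet the signal word "Because" is gone, so the reader must infer the causal link between the sentences.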

Current Study
The purpose of the current study was to examine whether changing the linguistic components of a text to reduce the complexity at different levels made the text easier to read and comprehend (Hiebert, 2018; Mesmer et al., 2012; Tortorelli, 2020). We chose to examine articles simplified by Newsela because these texts have become so widely used since the company launched its service in 2013. Newsela claims that its website and articles are used by more than 90% of schools, including 2 million teachers and 25 million students (Newsela, 2022).
We focused on informational texts because The Common Core State Standards call for an equal amount of literary and informational texts in grades three to five and concern has been raised about past imbalances in the amount of informational text reading in elementary classrooms (Duke, 2000; Li, Beecher, & Cho, 2018). Informational texts may be more difficult to comprehend because of their structural complexity, abstract and logical relations, and domain-specific knowledge (McNamara, Ozuru, & Floyd, 2011), although some research shows no difference in narrative or informational text comprehension (Uysal & Bilge, 2019).
We chose to examine the effect of the text level on students in grades four through eight because in these grades, there is a higher demand for reading for understanding rather than decoding (Chall, 1996; National Reading Panel (U.S.) & National Institute of Child Health and Human Development (U.S.), 2000). Children in grades four through eight are likely to need support for unknown vocabulary and complex language structures when they read. Our goal was to gain information that would help determine the degree to which providing varying levels of the same text might improve students' reading comprehension of those texts.

Research Questions
To examine the effect of Newsela's text alterations on student comprehension, we set out to answer the following research questions:
1. Is there a difference in reading comprehension scores of students who read different Lexile adaptations of the article?
2. Is there an interaction between reading ability, grade level and different Lexile versions on reading comprehension scores?
3. How much variation in reading comprehension scores is explained by the Lexile level of the text?

Subjects
We recruited all the students in grades 4, 5, 6, 7, and 8 from two urban northeastern schools in the United States. Our subjects were the 335 students, out of a possible 496, whose parents or guardians gave consent for them to participate (67.5% of the total students). We grouped students into three categories using their scores from the previous year's state literacy assessment, which is based on the Common Core State Standards. The assessment provided scores at the following levels: Level 5: Exceeded Expectations; Level 4: Met Expectations; Level 3: Approached Expectations; Level 2: Partially Met Expectations; Level 1: Did Not Yet Meet Expectations. We grouped students whose scores fell into Level 5 as the "above-level readers," those whose scores were Level 4 as "on-level readers," and those whose scores fell into Level 1, 2, or 3 as "below-level readers." The numbers in each category were: above-level readers (89 students), on-level readers (173 students), and below-level readers (73 students).
Other demographics of the subjects can be found in Table 1. The demographics of the subjects were similar to those of the school populations.
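The grouping rule and the sample counts reported above can be checked with a short sketch. The function name `reader_group` is hypothetical; the counts and the total of 496 eligible students come from the text.

```python
def reader_group(level):
    """Map a state assessment level (1-5) to the study's three reader groups."""
    if level == 5:
        return "above-level"
    if level == 4:
        return "on-level"
    if level in (1, 2, 3):
        return "below-level"
    raise ValueError(f"unexpected assessment level: {level}")

# Group sizes as reported in the Subjects section.
group_sizes = {"above-level": 89, "on-level": 173, "below-level": 73}

total = sum(group_sizes.values())
participation = total / 496  # 496 eligible students, per the text

print("participants:", total)
print("participation rate (%):", round(100 * participation, 1))
for name, n in group_sizes.items():
    print(name, "share of sample (%):", round(100 * n / total, 1))
```

The three group sizes sum to the 335 participants, and the participation rate rounds to the 67.5% stated above.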

Materials
Newsela.com provides a variety of previously published non-fiction texts that have been adapted for educational use. For this study, we chose five articles on the same topic at different Lexile levels that were based on one original science text. The original text had the following characteristics:
1. Topic of interest to the students in grades four through eight.
2. Content not directly related to the science topics covered during the school year, to minimize the role of background knowledge (Smith et al., 2021).
3. Not used by the teachers in their classrooms that year.
4. Vocabulary that included words that students of this age would be expected to understand.
We downloaded the five Newsela articles, based on an article about a robotic fish that spies on ocean life (Netburn, 2018), in the following five different levels from Newsela.com:
See Table 2 for the alignment between the comprehension questions and specific standards that were used in the test. Additionally, we asked a panel of five literacy experts (educators with at least a master's degree and 10 years of experience teaching literacy in this age range) to evaluate how essential each question was for measuring reading comprehension of students in grades four through eight. We then calculated the content validity ratio (Lawshe, 1975). All items reached a level higher than .5 and were retained. We averaged the Content Validity Ratio of all items to arrive at an acceptable Content Validity Index value for the assessment instrument of .84 (Tilden, Nelson, & May, 1990; Wilson, Pan, & Schumsky, 2012).
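The content validity calculation can be sketched as follows, using Lawshe's formula CVR = (n_e - N/2) / (N/2), where n_e is the number of panelists who rate an item essential and N is the panel size. The vote counts below are hypothetical, since the panel's actual ratings are not reported here; they were chosen only to show one pattern that is consistent with a five-member panel, ten items, and an index of .84.

```python
def content_validity_ratio(n_essential, n_panelists):
    """Lawshe's CVR: (n_e - N/2) / (N/2)."""
    half = n_panelists / 2
    return (n_essential - half) / half

N_PANELISTS = 5  # panel size reported in the text

# Hypothetical counts of "essential" votes for the ten questions.
essential_votes = [5, 5, 5, 5, 5, 5, 4, 4, 4, 4]

cvrs = [content_validity_ratio(v, N_PANELISTS) for v in essential_votes]
cvi = sum(cvrs) / len(cvrs)  # Content Validity Index = mean CVR across items

print("item CVRs:", [round(c, 2) for c in cvrs])
print("CVI:", round(cvi, 2))
```

With five panelists, an item can only exceed the .5 threshold if at least four of them rate it essential, because three votes give a CVR of .2.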
In addition, we examined the construct validity by comparing the pattern of scores on our comprehension test of the subjects in the study to the expected pattern of scores by students performing above level, on level, and below level in reading. We hypothesized that if our test were valid in terms of difficulty, the above-level readers would have the highest score, followed by the on-level readers and the below-level readers. We found that the means of our subjects in different reading level groups were significantly different from one another, F(2, 330) = 21.497, p < .001. Out of 10 total possible points, the mean for the above-level group was 8.02, the mean for the on-level group was 7.29, and the mean for the below-level group was 6.19. This demonstrates that our comprehension test functioned to differentiate student ability in the way we expected.
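The comparison of group means described above is a one-way ANOVA. The sketch below computes the F statistic by hand so that the steps are visible; the function name and the three small score lists are hypothetical and are not the study's data.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of score lists."""
    all_scores = [x for g in groups for x in g]
    grand_mean = sum(all_scores) / len(all_scores)
    means = [sum(g) / len(g) for g in groups]
    # Variation of group means around the grand mean, weighted by group size.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of individual scores around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical scores out of 10 for three reader groups.
above = [9, 8, 8, 7]
on_level = [8, 7, 7, 7]
below = [7, 6, 6, 5]

f_stat, df_b, df_w = one_way_anova_f([above, on_level, below])
print(f"F({df_b}, {df_w}) = {f_stat:.3f}")
```

The p value would then come from the F distribution with these degrees of freedom, which a statistics package supplies.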

Procedures
First, we worked with each teacher to determine the reading level of each student in the study as described above. Each teacher had a list of the students in their class with these levels designated. We reviewed the protocol in person with each teacher and then provided a written copy of the protocol for distributing the articles and a script for giving directions to students. Each teacher received a stack of articles with equal numbers of each level of article mixed into the pile so that they were distributed randomly. The texts were coded in a way that neither the teacher nor the students could tell the level of any article. Teachers passed out the articles, with the assessment attached, to the students in order of where they were sitting, drawing from the top of the pile and working their way around the room. Because the piles of articles were randomly mixed, the distribution of articles to students was random.
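The distribution procedure can be sketched in code as a shuffled stack dealt out in seating order. This is only an illustration: the blind codes "A" through "E", the class size, and the function names are hypothetical, and the study's stacks were mixed by hand rather than by software.

```python
import random

def build_stack(versions, copies_per_version, seed=None):
    """Build a stack with equal numbers of each coded version, then shuffle it."""
    stack = [v for v in versions for _ in range(copies_per_version)]
    random.Random(seed).shuffle(stack)
    return stack

def hand_out(stack, seating_order):
    """Deal from the top of the stack to students in seating order."""
    return dict(zip(seating_order, stack))

versions = ["A", "B", "C", "D", "E"]  # hypothetical blind codes for the five levels
stack = build_stack(versions, copies_per_version=5, seed=1)
students = [f"seat_{i}" for i in range(1, 24)]  # hypothetical class of 23

assignment = hand_out(stack, students)
for seat, code in list(assignment.items())[:5]:
    print(seat, code)
```

Because the order of the stack is random, dealing by seat position does not tie any text level to where a student sits.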
Students were given 40 minutes to read the article and answer the ten comprehension questions.They were told the questions would count as a quiz grade.
All the students in the study had previously read Newsela articles and answered comprehension questions as classroom assignments, so they were familiar with this type of assignment. The students put their name on the cover sheet. When they handed in their assignment, the teacher removed the cover sheet to protect confidentiality, and then the teacher put a code on the back of the test for the students' performance level in English Language Arts (above grade level, on grade level, below grade level). Then the teachers put the completed packets into an envelope, sealed it, and then one of the researchers picked up the envelope from the teacher.

Results
The data sources used to answer our research questions included comprehension test scores that ranged from 0-10 as the dependent variable, text level (five levels of Lexile versions), reading ability level (above level, on level, below level) and grade level (4, 5, 6, 7, or 8) as independent variables.
To answer our first research question as to whether there was a difference in reading comprehension scores of students who read different Lexile versions of the article, we conducted a three-way ANOVA to determine the effects of Reading Ability Level, Grade Level, and Text Level on comprehension score. There was no statistically significant three-way interaction between Reading Ability Level, Grade Level, and Text Level, F(24, 180) = .971, p = .507. Group means were not significantly different and, therefore, there was no evidence in this sample to show that, overall, students who read different Lexile versions of the article had significantly different comprehension scores.
To answer our second research question as to whether there was an interaction between reading ability, grade level and different Lexile versions of the article on reading comprehension, we performed pairwise comparisons using a Bonferroni correction. A multiple linear regression was run to determine how much of the variation in comprehension scores was determined by the independent variables.
All simple pairwise comparisons were run for Comprehension Score with a Bonferroni adjustment applied. There was a statistically significant simple two-way interaction between Reading Ability Level and Text Level Condition for below-level readers, F(4, 264) = 3.649, p < .007, but not for on-level readers, F(4, 264) = .876, p = .479, or above-level readers, F(4, 264) = .109, p = .979. See Table 3 for statistically significant pairwise comparisons of Reading Level and Test Condition on Comprehension Score. See Figure 1 for a chart of estimated marginal means by reading level for each text condition.
When we looked at only the difference between reading ability and text level, we found a significant difference in comprehension scores of below-level readers between those who read articles at the 560 Lexile level and those who read articles at the 1130 Lexile level (p < .008) and between those who read the 820 and those who read the 1130 Lexile levels (p < .024). There was no difference in scores of below-level readers who read articles at closer Lexile levels, for example 560 compared to 820 or 1060. The difference was only statistically significant when there was a larger variation in Lexile range.
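A Bonferroni adjustment of the kind mentioned above can be sketched as follows, assuming that every pair of the five text levels is compared within one reader group. The raw p values and the level labels are hypothetical; the paper does not say whether the p values it reports are raw or adjusted.

```python
from itertools import combinations

def bonferroni_adjust(p_values, n_comparisons):
    """Multiply each raw p value by the number of comparisons, capped at 1."""
    return [min(1.0, p * n_comparisons) for p in p_values]

levels = ["v1", "v2", "v3", "v4", "v5"]  # placeholders for the five text levels
pairs = list(combinations(levels, 2))
n_comparisons = len(pairs)  # 5 choose 2

alpha = 0.05
per_test_alpha = alpha / n_comparisons  # equivalent threshold for raw p values

raw_p = [0.001, 0.004, 0.03]  # hypothetical raw p values for three of the pairs
adjusted = bonferroni_adjust(raw_p, n_comparisons)

print("comparisons:", n_comparisons)
print("per-test alpha:", per_test_alpha)
for raw, adj in zip(raw_p, adjusted):
    print(f"raw p = {raw}, adjusted p = {adj:.3f}, significant: {adj < alpha}")
```

Comparing an adjusted p value to .05 gives the same decision as comparing the raw p value to .05 divided by the number of comparisons.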
To answer our third research question as to how much variation in reading comprehension scores was explained by the Lexile level of the text, we conducted a multiple linear regression to see how much difference in the comprehension scores was explained by the independent variables. The R² for the overall model was .211, with an adjusted R² of .204. This is a small effect size according to Cohen (1988).
The three independent variables combined, Reading Ability Level, Grade Level, and Text Level, statistically significantly predicted reading comprehension scores, F(3, 330) = 29.459, p < .001. However, Text Level condition was not a statistically significant predictor of Comprehension Test Score. The slope coefficient was -.109, showing that for every 1 level decrease in the Text Level (which ranged from 70-250 Lexile points), the comprehension test score increased by .109 points on a 10-point scale, not enough to be statistically significant.
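The reported regression statistics can be checked against one another with the standard formulas for adjusted R² and for the overall F test. In this sketch the sample size of 334 is inferred from the residual degrees of freedom rather than stated in the text, and the last line assumes the five text levels were coded as consecutive integers, which is also an assumption.

```python
def adjusted_r2(r2, n, k):
    """Adjusted R-squared for n observations and k predictors."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

def f_from_r2(r2, n, k):
    """Overall F statistic implied by R-squared, with (k, n - k - 1) df."""
    return (r2 / k) / ((1 - r2) / (n - k - 1))

k = 3                     # Reading Ability Level, Grade Level, Text Level
df_residual = 330         # from the reported F(3, 330)
n = df_residual + k + 1   # implied number of observations
r2 = 0.211                # reported R-squared

print("implied n:", n)
print("adjusted R-squared:", round(adjusted_r2(r2, n, k), 3))
print("implied F:", round(f_from_r2(r2, n, k), 2))

# Predicted score difference between the highest and lowest of five levels,
# assuming levels are coded 1 through 5 (four one-level steps).
slope = -0.109
print("predicted difference across four steps:", round(abs(slope) * 4, 3))
```

The implied F differs slightly from the reported 29.459 because R² is rounded to three decimals here.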

Discussion
This study examined the effect of Newsela's text alterations on student comprehension in grades four through eight. Our first finding was that there was no overall difference in reading comprehension scores of students who read simplified versions of articles at different Lexile levels on the same topic. Our second finding was that the effect of the Lexile version depended on reading ability: the simple two-way interaction between reading ability and text level was significant for below-level readers only. For on-level and above-level readers at all grade levels from four through eight, there was no significant difference in comprehension test scores across the five different texts. However, using a lower Lexile article improved the scores of below-level readers. Finally, we found that as the articles became more simplified, comprehension scores rose by an average of about .1 points per text level on a scale of 0-10, which was not statistically significant.
There was some evidence in the current study that for below-level readers, having a lower-level text improved their comprehension scores. This also aligns with the research that shows that generally lower-leveled texts positively affect the reading comprehension of less-skilled readers (Amendum et al., 2018; Crossley & McNamara, 2016). However, in this study, the positive effect of increased comprehension was very small and only present when the text complexity was very low compared to the original.
These results are supported by the research findings that the process of simplifying texts to lower the readability might reduce or eliminate some of the syntactic and semantic information that helps certain aspects of comprehension, especially inference and evaluation, and does not attend to the background or vocabulary knowledge of the reader (Crossley & McNamara, 2016; Reed & Kershaw-Herrera, 2015; Xu, Callison-Burch, & Naples, 2015). For example, Reed & Kershaw-Herrera (2015) found a significant increase in comprehension when the simplified texts had high cohesion rather than low cohesion. In texts with high cohesion, the text explicitly provides background information and cues to help readers understand without needing to make as many inferences (McNamara, Ozuru, & Floyd, 2011). A text with low cohesion places more demands for background knowledge on the reader. Quantitatively simplifying the linguistic elements such as those measured by Lexile does not take into consideration the effects of text cohesion or the reader's background knowledge (Arya, Hiebert, & Pearson, 2011).
These results for on-level and above-level readers align with Crossley, Yang, and McNamara's (2014) findings, in which second-language learners with high background knowledge comprehended authentic They suggested that this was because the ability to make inferences and connections between ideas was easier in original texts than simplified ones.
Another variable to consider was the length of the articles. The results of this study showed that students scored highest on reading comprehension questions when the article was the shortest, at Lexile level 560 with 452 words. The second-longest version of the article, at Lexile level 1130 with 929 words, showed the lowest comprehension scores. Hiebert (2014) suggests that stamina becomes a factor in readability only when texts are more than about 500 words. The only version of the article that fell under that limit was the 560L version, on which students scored the highest. While length is a factor of simplification, research should examine whether the length of a text is more relevant to student comprehension than simplified language features alone.

Implications
Although this study examined a limited number of texts, it shows the importance of continued research in this area. Examining the effects of simplifying authentic texts is important because it adds to our decision-making ability in choosing texts for reading instruction and practice. Teachers are currently using texts for literacy and content learning, including Newsela, that have been simplified to achieve different readability levels (Amendum, Conradi, & Hiebert, 2018). Yet, there have been criticisms of these altered texts because they short-change struggling students in terms of access to high-quality and sophisticated texts, they do not account for background knowledge or vocabulary knowledge differences between students, and they may make texts more difficult, not less difficult, to read (Hiebert, 2018; Lupo et al., 2019; Tortorelli, 2020).
Beyond the issue of diminished comprehension when reading a lower-level text, there is also an issue of the inequity caused if these texts only increase comprehension a small amount. The downside of leveling is that lower-leveled texts are shorter than their original counterparts. There was a difference of 500 words between the longest and shortest article.
If lower-level students are interacting with that much less text each day of the school year, it would amount to reading 90,000 fewer words than their on-level peers in a single year. Lower-leveled texts also contain fewer academic and disciplinary vocabulary words (Beck & McKeown, 1985) and reduce students' exposure to sophisticated syntax. Determining the efficacy of these simplified texts is crucial because if they are not helping students in the way they were designed to, then students are simply being shortchanged on the amount of text they are given to read, the number of vocabulary words they have access to, and the level of linguistically sophisticated texts they see.
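The 90,000-word estimate can be reproduced with simple arithmetic. The text says only "each day of the school year"; the 180-day school year used below is an assumption, chosen because it yields the stated figure together with the 500-word gap.

```python
WORD_GAP_PER_ARTICLE = 500   # difference between longest and shortest version, per the text
SCHOOL_DAYS_PER_YEAR = 180   # assumption: a typical U.S. school year
ARTICLES_PER_DAY = 1         # assumption: one leveled article per school day

annual_word_gap = WORD_GAP_PER_ARTICLE * ARTICLES_PER_DAY * SCHOOL_DAYS_PER_YEAR
print(f"words not read per year: {annual_word_gap:,}")
```

The estimate scales linearly, so a student who reads leveled articles on only half of the school days would, under the same assumptions, fall behind by half as many words.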
This study has limitations and therefore calls for more research to be done to add to our understanding of the effects of altering Lexile levels on comprehension. We tested only five articles that could have had anomalies, such as readability features or background knowledge not familiar to this sample of students. Different groups of students, such as second-language learners, students who read several grade levels higher or lower than expected for their grade, or students from different cultural backgrounds, may comprehend these leveled texts differently.
The Common Core State Standards acknowledge that one of the aspects of text complexity is the knowledge and experience of the individual reader, yet there is little information about how the individual reader differs in their experience with simplified texts.
In addition, other websites may use different simplification methods or different content and achieve different rates of comprehension.
Educators need a better understanding of how reader characteristics affect readability and more research on the differences that may exist in complexity between authentic texts of a given reading level and altered texts at a parallel reading level. Moving forward, researchers should examine other types and topics of simplified versions of authentic texts with more student populations to be able to generalize any findings. More research is also needed on how factors other than text complexity affect reading comprehension, such as the reader's background knowledge, the cultural relevance of the text, the motivation of the reader (e.g., Kasper et al., 2018), or the reader's embodiment of the text (e.g., Glenberg, 2017).
Every day, students are being given simplified leveled texts with good intentions but little research support.
Educators need enough research to ensure that simplified texts are effective for improving students' comprehension.

Figure 1. Estimated Marginal Means of Test Score by Reading Level for Each Text Condition

According to Metametrix (2021), the Lexile formula is used by 20 states for reporting student reading scores on state tests. In addition to being used in the Common Core State Standards (National Governors Association Center for Best Practices & Council of Chief State School Officers, 2010), Lexile levels are also used to choose texts in major published reading programs such as Amplify CKLA (Amplify Education, Inc., 2020), Into Reading (Houghton Mifflin Harcourt Publishing Company, 2020), and ReadyGen Literacy Program (Savvas Learning Company, 2016).

Table 1
Demographics of subjects.

Table 2
Alignment of Comprehension Questions and Corresponding Common Core State Standard
Question 5: How do scientists want to improve SoFi?
Question 10: What is most likely the reason the author wrote this article?
RI.7 Media Literacy
Question 9: What point in the article is most demonstrated by the illustrations?
RI.8 Reason and Evidence
Question 8: What evidence shows why the scientists released SoFi in Fiji in the South Pacific Ocean?

Table 3
Statistically Significant Pairwise Comparisons of Reading Level and Test Condition on Comprehension Score