Testing Reformed

Lance Knowles explains how advances in technology and cognitive neuroscience allow us to assess and modify the quality of learning and practice

A major benefit of the technology revolution in education is the ability to monitor student learning activities in much greater detail than previously possible. Not only can we measure progress, but we can, for the first time, measure the quality and efficiency of the learning process.

This article presents some of the innovations we have developed for our English language learning programs, which are now used in more than 50 countries.

Assessing the learning of any subject or skill requires a learning theory and a learning sequence to measure against. These define the standards necessary to make informed judgments. In other words, there must first be a theory about how learning takes place in the brain and how to optimize the order of learning steps and activities.

For language learning, it isn’t enough to memorize vocabulary and rules of grammar. Language fluency requires the skill to process language at a subconscious level, or language automaticity. Learning or acquiring that skill becomes the primary learning task, and that requires practice.
But what is effective practice?

In our blended learning programs, we see practice as having four dimensions: (1) amount of practice, (2) frequency of practice, (3) quality of learning activities in the practice, and (4) level and sequence of learning activities relative to student proficiency level.

By monitoring and analyzing these dimensions, we can measure the effectiveness of practice and make predictions about learning outcomes. We can also use this information to coach learners in how to modify and improve their practice.

Dimension 1: Practice Amount
In many learning programs, the amount of practice is determined by time on task or by the number of lessons completed. There is little or no distinction made about how the time is spent or how lessons are completed. Learners may practice with varying degrees of attention, sometimes just observing or watching passively, at other times performing some kind of learning task. Sometimes they are simply bored and unengaged.

We assume that effective language practice requires active engagement: learners interact with the language by answering questions, repeating or recording a sentence, or comparing recorded language with a model. We call each such interaction a learning step, and it is the number of learning steps that determines the amount of practice. Passive learners will therefore take much longer to accumulate the number of learning steps needed to complete a lesson. At any given time, learners can see their completion percentage for a lesson, which indicates their level of active engagement and progress within the lesson. The completion percentage also depends on how well the learner performs on various comprehension tasks. Learners who comprehend at a higher level accumulate learning steps at a faster rate.

To summarize, we measure practice amount by determining how often the brain actively interacts with the target language.
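
As a rough illustration of counting learning steps rather than time on task, the following Python sketch tallies only active interactions from an activity log. The event names, the log format, and the steps_required figure are assumptions made for this example, not the actual DynEd implementation, and the sketch leaves out the effect of comprehension performance on the completion percentage.

```python
# Hypothetical event names: the actions treated here as active "learning steps".
ACTIVE_EVENTS = {
    "answer_question",
    "repeat_sentence",
    "record_sentence",
    "compare_recording",
}


def count_learning_steps(events):
    """Count only the events that represent active interaction with the language."""
    return sum(1 for event in events if event in ACTIVE_EVENTS)


def completion_percentage(events, steps_required):
    """Share of a lesson's required learning steps completed, capped at 100."""
    steps = count_learning_steps(events)
    return min(100.0, 100.0 * steps / steps_required)


# Passive events ("listen", "view_text") add time on task but no learning steps.
log = ["listen", "listen", "repeat_sentence", "answer_question",
       "view_text", "record_sentence"]
print(count_learning_steps(log))
print(completion_percentage(log, steps_required=12))
```

In this sketch, a learner who only listens or reads accumulates time but no steps, so the completion percentage does not move.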

Dimension 2: Practice Frequency
From cognitive neuroscience, it is well known that practice frequency is important. Distributed practice is more effective than massed practice, even when the total amount of practice time is equal. Creating new neural connections requires practice over time, with enough frequency to strengthen new connections and prevent other connections from disappearing altogether. The goal is long-term learning and automaticity, not short-term memorization.

So it isn’t enough to measure only practice time. Three hours of practice in one session is less effective than three hours of practice distributed over several sessions. For language learning, we recommend that learners practice at least three times per week, and preferably more — with additional classroom or tutor support to extend and personalize the content. For learners who need to make faster progress, daily practice will reduce the total amount of time necessary to reach a specific proficiency goal.
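
A frequency check of this kind could be sketched in Python as follows. The function names and the use of ISO calendar weeks are assumptions for illustration. The default of three practice days per week comes from the recommendation above. Note that the sketch only examines weeks that contain at least one session, so a week with no practice at all would need separate handling.

```python
from datetime import date


def sessions_per_week(session_dates):
    """Map each ISO (year, week) to the number of distinct days with practice."""
    weeks = {}
    for day in set(session_dates):
        iso = day.isocalendar()
        key = (iso[0], iso[1])  # ISO year and ISO week number
        weeks[key] = weeks.get(key, 0) + 1
    return weeks


def meets_frequency_target(session_dates, minimum=3):
    """True if every week that has any practice has at least `minimum` practice days."""
    weeks = sessions_per_week(session_dates)
    return bool(weeks) and all(count >= minimum for count in weeks.values())


# Three days in one week versus a single long session in another week.
distributed = [date(2024, 1, 1), date(2024, 1, 3), date(2024, 1, 5)]
massed = [date(2024, 1, 8)]
print(meets_frequency_target(distributed))
print(meets_frequency_target(massed))
```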

The challenge, therefore, is to get learners to build practice into their schedules. They need to understand how important practice frequency is, and they need to be motivated to make this happen. This motivation in large part comes from classroom activities that support and extend their practice. In our blended learning model, teachers play an important part, as do parents or the learner’s working environment. Having a goal is essential for learners to continue the learning process, especially since much of the practice has little intrinsic interest. Research shows that having an unconscious goal can be very effective in keeping learners engaged (Bargh & Morsella, 2008). It is important that teachers and others help to instill and reinforce the tremendous value of English fluency in terms of life goals rather than as an academic subject.

Dimension 3: Practice Quality
Assessing the quality of practice requires a learning theory. Without a learning theory, it isn’t possible to decide whether a particular learning activity facilitates or impedes learning. While developing listening comprehension, for example, the inappropriate use of text, which is spatial, can interfere with the development of language chunking skills. In some tasks, the presence of text is a distraction which desensitizes the neural pathways and subconscious processors necessary to search out, recognize, and employ language patterns to process spoken language, which is temporal, not spatial. With spoken language, there is no time to reflect or analyze the language. It must be processed subconsciously.

In our learning theory, Recursive Hierarchical Recognition (RHR) (Knowles, 2013), language chunking, which is a subconscious process, is essential for fluency, so anything that impedes its development has a negative value. We therefore suppress the initial use of text and encourage students to engage with the spoken form of English first.

This view of text is nothing new. In the classic work from almost a century ago, The Oral Method of Teaching Languages, linguist Harold Palmer wrote:

“A considerable number, probably the majority, of those who have successfully mastered the spoken form of one or more foreign languages maintain that their success is due to the fact that, when they began their study, they plunged straight into the spoken language without doing any preliminary book-work. They advise others to do the same thing.” (Palmer, 1922, p. 1)

In line with this thesis, learning activities where learners listen to and then repeat a phrase or sentence without reference to text support are more effective than listening and repeating with text support.

In the RHR learning theory, a key neural switch is temporal tension, which automatically and unintentionally activates when processing a stream of language in working memory, which has limited capacity. For processing spoken language in working memory, the language must be chunked. One can feel temporal tension when listening to and then repeating sentences without text support. It is this temporal tension that activates neural pathways to recognize and use language patterns to chunk the language (see figure). These language chunks are constructed automatically. When text is present, the urgency of language chunking is reduced or absent, and the temporal tension switch is not activated. Text is spatial, with time for reflection. Spoken language is temporal.

Other factors in determining the quality of language input and practice include cognitive load. Extraneous information, such as distracting or unclear visuals, results in cognitive overload. Pictures may be entertaining, but may not be effective if they have extraneous information that interferes with the learning process. In many cases, simple, iconic images are more effective.

Therefore, the design of the learning materials and how they interface with learners are factors that affect the quality of learning activities and must be guided by the learning theory.

Also to be considered are the choices learners make when interacting with all modalities of the language input: visuals, audio, and text. When learner activity shows a sequence of actions that reduces temporal tension, such as overreliance on text, the activity is scored lower than activities where temporal tension is optimal.

It is well known that the brain seeks to fill in incomplete patterns. Whether patterns are visual or auditory, the unintentional action of trying to complete patterns is a learning force that we systematically use when designing our courses. Gaps create tension, and the brain responds automatically by trying to fill them, provided that they are appropriate for the learner’s proficiency level. Visual cues, life experience, and other long-term memories help the brain guess and fill in meaning and thereby bootstrap the learning process. We determine and maintain optimal temporal tension through proper placement and frequent testing in the learning sequence.

To measure the quality of practice, our system monitors and tracks every learning action or series of actions. We know when learners listen, repeat, record, or see text, and in what combination. Each learning activity is monitored and scored. If a learner is overusing text support, for example, the system catches it, adjusts the study score downward, and, at critical points, alerts the learner and teacher so that the learning pattern can be modified. The metrics used to do this are adjusted for each type of lesson and learner. The role of text, for example, is much different for young learners than for adults.
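
The text-overuse rule described above might look something like the following Python sketch. The ratio of text views to repeats, both thresholds, and the one-point deduction are hypothetical values chosen for illustration. The article does not publish the actual metrics, which it says are adjusted for each type of lesson and learner.

```python
def review_text_use(text_views, repeats, study_score,
                    max_ratio=0.5, alert_ratio=1.0):
    """Return (adjusted_score, alert) for one learner's activity counts.

    max_ratio and alert_ratio are hypothetical thresholds: above max_ratio
    the study score is reduced by one point, and above alert_ratio the
    learner and teacher would also be alerted.
    """
    ratio = text_views / max(repeats, 1)  # guard against division by zero
    if ratio > max_ratio:
        study_score -= 1
    alert = ratio > alert_ratio
    return study_score, alert


print(review_text_use(text_views=2, repeats=10, study_score=5))
print(review_text_use(text_views=8, repeats=10, study_score=5))
print(review_text_use(text_views=15, repeats=10, study_score=5))
```

Different lesson types or age groups could be handled by passing different threshold values, in line with the point that the role of text differs for young learners and adults.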

Dimension 4: Practice Level and Sequence
For language practice to be effective, activities must be at the right level and sequenced properly, which means that there is sufficient linkage between lessons and the temporal-tension level is optimal. In other words, the learning theory guides both assessment-level testing and lesson sequencing.

While some syllabi are situational or grammar based, we employ a hierarchical learning sequence, from concrete concepts to abstract concepts. The ability to chunk language around concepts is enhanced by systematically using temporal tension, which activates pattern recognition and subconscious language processors to extract meaning for insertion into working memory and further subconscious analysis. Without the ability to chunk information, much of it is missed. Even when vocabulary items are all familiar, if learners lack the ability to chunk, comprehension is partial at best. Meanings of words depend on how they are used and on the words around them. In the real world, words have multiple meanings or at least a continuum of meanings that are only decided in context. This explains why so many students with large passive vocabularies are unable to converse in real time.

In RHR, language chunks are built around concepts and language functions. They are not built around grammar rules, though an awareness of grammar is developed implicitly as a means to accurately express concepts. The focus is on meaning first and form second. Concepts are the building blocks of meaning and reflect how our brains structure our perceptions. Simple concepts include “object,” “location in time or space,” “frequency,” and “manner.” More complex concepts include how events are sequenced or connected in a conditional relation. Higher level concepts are abstract and include counterfactual suppositions and fine distinctions in logic.

So our proficiency testing must determine what level of conceptual complexity learners can comprehend and express. Once they are placed into the learning sequence, learners interact with multimodal language inputs that reinforce and then expand their ability to comprehend and express longer and more-complex phrases and sentences.

To score this dimension, we must determine whether learners are working at the right level and in the proper sequence within lessons. Learners move from a general understanding, with gaps, to a detailed understanding with fewer and fewer gaps, and then to full comprehension and the ability to express the information with little or no conscious analysis, which requires automaticity.

If learners practice outside their optimal levels, we reduce their score. If they focus on lessons out of sequence or spend too much time on one lesson, the result is boredom and a lack of meaningful engagement, for which they are marked down. Practice must be distributed over a range of activities and lessons, minimizing boredom and cognitive overload.

To guide learners, our smart system automatically opens new lessons when certain conditions are met, including the passing of Mastery Tests, which become available only when a target-completion percentage is reached. Ideally, learners should also demonstrate their mastery in classroom activities, which helps keep them motivated. This is a major reason why we support a blended model over a self-study model: teachers have an important role to play.
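
The two-stage gating described here (a Mastery Test that opens at a target completion percentage, and a new lesson that opens once the test is passed) can be sketched in Python. The class name, the 80 percent completion target, and the 85 percent pass mark are assumptions for this example. The article does not state the real values or the other conditions the system checks.

```python
from dataclasses import dataclass
from typing import Optional

TARGET_COMPLETION = 80.0  # hypothetical target-completion percentage
PASS_MARK = 85.0          # hypothetical Mastery Test pass mark


@dataclass
class LessonProgress:
    completion_pct: float
    mastery_score: Optional[float] = None  # None until the test is taken


def mastery_test_available(progress):
    """The Mastery Test opens only once the completion target is reached."""
    return progress.completion_pct >= TARGET_COMPLETION


def next_lesson_open(progress):
    """The next lesson opens only after an available Mastery Test is passed."""
    return (
        mastery_test_available(progress)
        and progress.mastery_score is not None
        and progress.mastery_score >= PASS_MARK
    )


print(mastery_test_available(LessonProgress(completion_pct=60.0)))
print(next_lesson_open(LessonProgress(completion_pct=90.0)))
print(next_lesson_open(LessonProgress(completion_pct=90.0, mastery_score=92.0)))
```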

Study Scores
In our experience, learners are extremely interested in their study scores. They notice when their scores go up or down and can view their scores at any time by accessing the Intelligent Tutor. Tutor messages might be very positive, such as the following:
Total Time: 56:16 hours

1. Not monitoring recorded voice enough in Speech Recognition
2. Good use of repeat button
3. Good use of voice record compared to the number of sentences heard
4. Good Mastery Test score(s)
5. Good study frequency in the last two weeks
6. Good success with comprehension questions
7. Good study time in the last two weeks
Total Study Score = 11

This student is practicing well. Study scores above six are good. In this case, twelve is the maximum score possible. Learners who don’t use their time well will have a low or negative score, such as:
Total Time: 98:07 hours

1. Too much text button compared to repeat button
2. Not using voice record enough compared to the number of sentences heard
3. Not repeating sentences enough compared to the number of sentences heard
4. Not monitoring recorded voice enough in Speech Recognition lessons
5. Too much use of translation
6. Good study frequency in the last two weeks
7. Good study time in the last two weeks
Total Study Score = -4

Though this student is practicing enough, the quality is poor. The data shows that the student is passive and avoiding temporal tension, so we expect progress to be slow. As a result, the teacher should coach the student on how to improve practice. Overuse of the text button, for example, needs to be addressed. A further examination of the data shows that the student tends to study the same lesson too many times in succession. This means that the level of attention is probably very low and the level of boredom is probably high, which means very little learning is taking place.

By now it should be clear that time on task, though important, doesn’t go far enough. Advances in technology and cognitive neuroscience allow us to assess and modify the quality of learning and practice. Learners who study well are more likely to study more frequently and feel their progress. Learners who don’t study with enough frequency may study well, but because of slow or no progress they are more likely to become demotivated and give up. All four dimensions of practice are important and interrelated.

High rates of attrition are very common in language-learning programs (Nielson, 2011), but our data shows that learners with high study scores remain active much longer and reach their goals faster than those in traditional or self-study approaches. For large users of our system, such as ministries of education, our analytics program mines the data and makes it available through quick and easy summaries. This data shows which schools, districts, and cities are doing well or are in need of additional training or support. It shows learner trends, such as changes in study scores, and helps to identify problems in time to address them. Having access to this kind of data is revolutionary.
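
A summary of this kind could, in its simplest form, average study scores per school and flag the schools whose average is not above six, the level this article describes as good. The Python sketch below assumes a flat list of (school, study score) records. The school names and the record format are invented for illustration and say nothing about how the actual analytics program is built.

```python
from statistics import mean

GOOD_SCORE = 6  # the article describes study scores above six as good


def average_scores(records):
    """Average study score per school from (school, score) pairs."""
    by_school = {}
    for school, score in records:
        by_school.setdefault(school, []).append(score)
    return {school: mean(scores) for school, scores in by_school.items()}


def schools_needing_support(records):
    """Schools whose average study score is not above the 'good' level."""
    averages = average_scores(records)
    return sorted(s for s, avg in averages.items() if avg <= GOOD_SCORE)


# Hypothetical records: school names are placeholders.
records = [("North", 11), ("North", 8), ("South", -4), ("South", 3)]
print(average_scores(records))
print(schools_needing_support(records))
```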

References
Bargh, J. A. & Morsella, E. (2008) “The Unconscious Mind,” Perspectives on Psychological Science, 3(1), 74-79
Knowles, L. (2008) “Recursive Hierarchical Recognition: A Brain-based Theory of Language Learning,” FEELTA/NATE Conference Proceedings (pp. 28-34), Far Eastern National University, Vladivostok, Russia
Knowles, L. (2013) “Redefining Roles: Language Learners, Teachers and Technology,” Proceedings of 11th IEEE International Conference on Emerging eLearning Technologies and Applications, Slovakia, DOI: 10.1109/ICETA.2013.6674431
Nielson, K. B. (2011) “Self-Study with Language Learning Software in the Workplace: What Happens?” Language Learning & Technology, 15(3), 110-129
Palmer, H. E. (1922) The Oral Method of Teaching Languages, W. Heffer & Sons, Ltd., Cambridge

Lance Knowles is president and head of Courseware Development at DynEd International (www.dyned.com). Knowles has pioneered the development and use of CALL for more than 25 years. His innovative learning theory, Recursive Hierarchical Recognition (RHR), is based on cognitive neuroscience. DynEd’s award-winning programs are used by millions of students around the world.