During the two years that I used the AAPPL, ACTFL did not provide data to correlate average scores to years of study. There was a national study (CASLS, 2010) based on STAMP test scores, however, and after talking with representatives from STAMP, I decided there should be a close enough correlation between the two tests to use the CASLS data as a reference point for expected proficiency of high school language students. IF the AAPPL scores are reliable indicators of proficiency, my students are doing amazingly well! However, we must bear in mind that while both tests reference the ACTFL scale, they are two different tests and scores may not correlate perfectly.
I had some uncertainty about the validity of AAPPL scores in 2016, but my 2017 experience was even more confusing. Although there were clear gains in listening, about a third of my Spanish II students rated LOWER on the reading test than they had a year earlier, and another quarter received the same rating they had earned in Spanish I. Obviously, all these students had made large gains in their reading proficiency: they had read at least two novels in Spanish II, in addition to lots of other material. It was not possible that their proficiency had decreased, or had even stayed the same, in one year.
I first contacted the testing company with my concern and was eventually referred to Margaret Malone, Director of Assessment, Research and Development at ACTFL. Both LTI and Ms. Malone seemed to think I was unhappy that my students did not score as high as I wanted them to. It was hard to help them understand that I saw validity and reliability issues inherent in the test itself. This was not an issue of sour grapes. How could a student lose proficiency after a year of intense input? (And I had a few low-proficiency students who scored higher than reality, which also concerned me.) My question was: is the AAPPL a valid assessment of proficiency?
My conversation with Ms. Malone left me with greater ambivalence about the AAPPL, as she was unable to point me to research that validated test scores, such as cross-testing with the OPI. She insisted that the way scores were used to motivate students was more important than the precision of the scores. I was also surprised and disappointed that she could not offer statistics on average scores by years of study, as the CASLS study had detailed.
After I shared my concerns online, a few other teachers who had also experienced discrepancies between test scores and actual proficiency contacted me. Sometime after our conversation, Ms. Malone offered me an opportunity to participate in a focus group in my city. I would have loved to have done that, but it wasn't held on a day I could easily be absent. In hindsight, I regret missing a chance to understand the test better and to perhaps influence change.
Spanish teacher Jim Tripp blogged about his experience with the revised test in 2019 and concluded that the Form A test scores, specifically, were unreliable. Retesting his students with Form B resulted in different (in his opinion, more representative) scores. All my students except Spanish III used the Form A version of the test, so perhaps I would have had more valid results with Form B. Due to my questions about validity and reliability, I did not use the AAPPL assessment in subsequent years. However, overall, I do think it was helpful to have an outside opinion to validate the fact that, even for those whose scores were lower than expected, my students' proficiency was soaring with communicative language teaching. (2017 test scores are shown in the slide show "What do your AAPPL scores mean 2017.")
What do your AAPPL scores mean 2017.pdf