I have read a few Tweets recently poking fun at research in education. While I will admit there is no perfect study, I caution the masses against taking such a stand on quality educational research.
Why?
There is no question that we have learned a lot through the years about how to optimize student learning and teacher effectiveness. This has been possible because of quality educational research. Unfortunately, unless a teacher takes research design courses at the graduate level, she/he may not be taught how to interpret educational research design and findings. Cutbacks and tough economic times have resulted in a decrease in professional development (PD) opportunities for educators, and I highly doubt that PD was ever plentiful for the practitioner.
I took four statistics courses from the incredible Dr. Tim Konold while a student of The Curry School of Education at The University of Virginia. Dr. Konold, my doctoral dissertation co-chair, was exceptional at helping me to understand not just how a study should be conducted, but how to analyze its findings. I want to share a few key considerations in educational research and why they matter.
Before I get this party started, please be aware that I am purposefully only discussing a few elements (Note: very few...) of educational research. Below is an image of one section of one of the three shelves in my office dedicated to research.
There is no possible way I can get into a lot of detail here. Nonetheless, if one walks away from this post interested in research design and realizing how much there is to learn about it - I will be a happy camper. The books are readily accessible to me in my office because I refer to them weekly, if not more often. Dr. Konold is the quantitative research expert - I don’t claim to be one. But I’ve been humbled enough to know that there is a lot for most of us to learn on the topic. So, here goes!
Unit of Analysis
The unit of analysis is usually shown as “n”. Wait, isn’t sample size shown as n? They can be the same - it just depends on how the study was designed and administered.
In education, n (sample size) often represents students, teachers, or schools. Want to determine an appropriate sample size? Check out this post. The problem with using students as n in a study arises when the students all have the same teacher. In that case, they have all experienced the same treatment (teaching), so the unit of analysis should really be the class. Example: the study may include 5 classes, each of which may contain 35 students. This would result in 5 units of analysis - not 35 x 5 = 175. Hence, your n = 5. I am making an educated guess that - in this instance - an n of 5 is way too small to yield reliable results. In this case, the researcher would have to invite a lot more teachers to be involved in the study (although the number needed would actually be determined through statistics, too). It also means that if a study is shared with an n of 5 such as this, the reader should be wary of drawing too many conclusions from the work.
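To make the arithmetic above concrete, here is a minimal Python sketch. The scores are made up and the function names (count_units, class_means) are my own illustrations, not part of any study. It simply shows that 175 student records collapse to 5 class-level values once the class is treated as the unit of analysis.

```python
from statistics import mean

# Hypothetical data: 5 classes of 35 students each (all scores are invented).
scores_by_class = {
    f"class_{c}": [60 + c * 2 + (s % 7) for s in range(35)]
    for c in range(1, 6)
}


def count_units(scores_by_class):
    """Return (number of students, number of classes)."""
    n_students = sum(len(scores) for scores in scores_by_class.values())
    n_classes = len(scores_by_class)
    return n_students, n_classes


def class_means(scores_by_class):
    """Collapse each class to a single value: one teacher, one treatment."""
    return {name: mean(scores) for name, scores in scores_by_class.items()}


n_students, n_classes = count_units(scores_by_class)
means = class_means(scores_by_class)

print(f"student records: {n_students}")          # 35 x 5
print(f"units of analysis (classes): {n_classes}")
print(f"values left to analyze: {len(means)}")
```

Any comparison of teaching approaches would then run on the 5 class means rather than on the 175 individual scores. Averaging by class is only one simple way to respect the unit of analysis; it is shown here as an assumption for illustration.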
Why can ignoring the unit of analysis and/or sample size be dangerous?
Example: When I returned to teaching K-12, research was presented at a faculty meeting. This research showed students’ scores on standardized math tests. In this instance, if teachers had been the unit of analysis, it would have been n = 3. As it was, students were the sample in this particular instance.
As the data was broken down and shared, students’ scores (not names of students) were presented based on how they identified themselves ethnically. We (the faculty) were shown a slide that denoted n = 3 (n = Hispanic students) in the elementary school. 0% had passed the math assessment. We (entire school faculty) were shown how this number decreased from 100% the previous time results were interpreted and shared. (Note: the previous time scores were shared, the n denoted 2).
What is wrong with this?
Rather than ask questions about why this might be - I realized that these students were all siblings and the children of migrant workers and they just moved to the country/barely spoke English - we were asked why our teaching effectiveness of Hispanic children in math decreased so much. Ouch.
Can you see why standardized testing and high stakes assessment can be problematic?
The reality is, if we took the results presented to determine our next moves as a faculty, that could be scary. The former result would motivate any Hispanic parent to move to the school district - for it was being concluded that in previous years, 100% of Hispanic youth passed the math standardized test. The latter would motivate any Hispanic parent to stay far away from the school district - for it was being concluded that 0% of Hispanic youth passed the math standardized test. While I think it was imperative that support was given to these children, I did not like how the material was presented.
The unit of analysis matters greatly. I spoke up to point this out - and loved the reaction from others that it was the PE teacher questioning the statistics. I simply didn’t want the teachers being judged - or wrongfully summarized as being poor teachers to Hispanic youth. As I stated, the children did not speak English at home and recently moved to the US.
Here was an example where research was presented in a way that was not helpful - it was potentially harmful if high stakes school assessment or teacher evaluation was in place (thankfully, neither was). Of course those children should receive differentiated instruction and perhaps aids in order to increase their math competence. But, our teachers should not have been led to feel as though they were poor teachers of these youth currently, just as they may or may not have been rockstar teachers of the two Hispanic youth in the several years prior. There were too few students in the sample to make any sweeping generalizations about teacher effectiveness and Hispanic youth. It was not statistically sound data.
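One way to see why 0% of 3 students and 100% of 2 students tell us so little is to put a confidence interval around each pass rate. The sketch below is my own illustration, not what was presented at the meeting. It uses the Wilson score interval with an assumed 95% confidence level (z = 1.96), and the function name wilson_interval is hypothetical.

```python
from math import sqrt


def wilson_interval(passed, n, z=1.96):
    """Wilson score interval for a proportion (z = 1.96 assumes 95% confidence)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = passed / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half_width = z * sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    # Clamp to [0, 1] to guard against tiny floating point overshoot.
    return max(0.0, center - half_width), min(1.0, center + half_width)


# The two results from the faculty meeting.
earlier = wilson_interval(passed=2, n=2)  # 100% of 2 students passed
later = wilson_interval(passed=0, n=3)    # 0% of 3 students passed

print(f"earlier year: {earlier[0]:.2f} to {earlier[1]:.2f}")
print(f"later year:   {later[0]:.2f} to {later[1]:.2f}")
```

Under these assumptions the two intervals are extremely wide and overlap each other, so the drop from 100% to 0% cannot be separated from chance with so few students, let alone attributed to teaching.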
External Validity
I have conducted qualitative (interviews, words, stories) research along with quantitative (statistics, numbers, surveys) research in a mixed methods design, and each independently, too. One major concern that I have observed is when people read a qualitative study and feel the findings can be applied to all other instances. This is scary wrong. The truth is, qualitative researchers make no bones about it. Qualitative design does not attempt to make generalizations about the population in its findings. The degree to which a study's findings can be generalized beyond the people and setting studied is known as external validity in research design.
Why does external validity matter?
Have you ever heard someone say, “When I played sports…” or “When I was studying math…” or “My child’s grade 3 teacher…”? Whenever I hear arguments begin with this, I think to myself “uh oh.” You see, your experience is - well, your experience. It does not mean it was the most optimal way to go about it. It does not mean it would be the best way to go about it today. As I said, it was simply your experience. In other words, thank you for sharing. lol There is a chance that within a concrete research design study, your experiences might have been in line with many others - but that can’t be determined from your anecdotal recollection. The reality is - I hate to break it to you - without research one does not really know what is the absolute best for our students. So, I’m proud of you for walking up hill to school both ways in the snow as a youth - but, I am not sure it is what completely molded you into who you are today.
Validity
I have served as an external consultant for school district (US) and school board (Canada) reviews. One major beef I have with these evaluations is that they often skip the first steps which would set the stage to actually receive data that is helpful. They are so well intentioned in evaluating programs. They put so much time and effort into the process, but unfortunately it can be a huge waste of time if these first important steps are ignored. Sitting down and discussing potential questions is not only grueling, it can be a waste of time if you truly want valid data.
Validity means that you test what you set out to test. In other words, if you want to assess students’ perceptions about their experiences in physical education, you develop a survey that actually does that. For this to happen, pilot tests (small research projects) have to occur in order to develop a measure that is valid. One example of a pilot study phase is checking test-retest reliability. This pilot work includes the use of statistical software. I like SPSS and I especially like this manual to sit very close to me when I am interpreting the data. I might actually (truly) love this book…
Test-Retest Reliability
When developing a survey, it’s best to give the survey to a group of people and then to give the same survey to them again 5-7 days later. Analyzing the results from both surveys will help one determine the test-retest reliability of the survey items. While one does risk the chance of bias (a person chooses the same answer he/she chose last time simply because he/she recalls choosing that item previously) it helps make a survey or questionnaire more sound.
Why? If few students chose the same item both times, the survey developer would omit that item from the question bank on the survey.
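The item check described above can be sketched in a few lines of Python. This is a simplified illustration using plain percent agreement; the responses, the 0.7 cutoff, and the function names are all assumptions of mine. Published test-retest work more often reports a correlation or similar coefficient computed in statistical software.

```python
def item_agreement(first, second):
    """Share of respondents who gave the same answer on both administrations."""
    if len(first) != len(second) or not first:
        raise ValueError("need two equal-length, non-empty response lists")
    same = sum(a == b for a, b in zip(first, second))
    return same / len(first)


def flag_unstable_items(time1, time2, threshold=0.7):
    """Return the items whose agreement falls below the (assumed) threshold."""
    return [
        item
        for item in time1
        if item_agreement(time1[item], time2[item]) < threshold
    ]


# Hypothetical answers from 5 students on a 1-5 scale, about a week apart.
time1 = {"q1": [4, 5, 3, 4, 2], "q2": [1, 5, 2, 4, 3]}
time2 = {"q1": [4, 5, 3, 4, 3], "q2": [3, 2, 2, 1, 5]}

for item in time1:
    print(item, item_agreement(time1[item], time2[item]))

print("candidates to drop:", flag_unstable_items(time1, time2))
```

In this made-up data, four of five students repeat their answer on q1, while only one of five does on q2, so q2 is the item a survey developer would look at removing or rewording.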
Where to get quality research?
A good personal growth target as an educator is to better understand educational research. If you are, like me, in the physical education, physical activity, sport, and coaching sectors - you might want to check out these peer-reviewed journals. Note: this list is far from exhaustive, but it includes my favorites (would love to hear your favorites - particularly if you live outside of North America!)
Research Quarterly for Exercise and Sport
Journal of Teaching in Physical Education
Journal of Sport & Exercise Psychology
Journal of Applied Sport Psychology
I best stop here. I don’t want you to fall asleep - or look or feel like I used to feel coming out of some of my research classes as a doctoral student - or, like this...
This being said, it is critical that when we share information, we are sharing good quality information. The beautiful thing about social media is the wide reaching audience and the potential to disseminate good quality work. The scary thing about social media is that propaganda and poor quality work are just as readily available and easy to disseminate. Scholarly work has led to amazing changes in physical education, sport, and coaching in the past several decades - in programming, resource development, and curriculum design. I am not a mathematician, but I am indeed smart enough to know what I don’t know.
An understanding of the complexity of research design, at all points of the education continuum (NOT ladder) and among all stakeholders, will truly serve our children and youth - well.
Thanks for reading!
How about you?
Have you taken research design courses?
If you are a professor and offer a research design course online, please share information below in the comment section so others might learn from you!
Are there professional development funds for you to dig deeper into understanding educational research?