Promoting Scholarly Evaluation of Teaching: Addressing the Third Rail of Academia

Education is perhaps the fundamental form of investment societies or individuals can make in their own future welfare. It is a core enterprise of our research universities, and national discussions around improving the quality of higher education have been growing in volume and prominence. Yet evaluations of this key form of professional practice continue to lack scholarly rigor. In recent years, there has been increasing attention to, and interest in, designing and implementing more scholarly approaches to teaching evaluation.

Why Reconsider Teaching Evaluation?
Many current evaluation practices are flawed. The dominant form of teaching evaluation is the student end-of-term (SET) rating, and institutions typically rely on the “overall” rating of instructor and/or course. A variety of studies suggest these do not measure teaching effectiveness—no matter how “effective” is defined. The samples are small and not necessarily representative; the instruments are often biased; we apply statistics and comparisons inappropriately; and the omnibus questions do not correlate with validated measures of learning or success.

We have the opportunity to improve practices. If our measures of teaching effectiveness were more scholarly and aligned with our goals, individuals and institutions could use them to continuously improve. Rather than relying solely on reductionist, summative approaches (which are too often punitive or simply ignored), we could use these measures to document improvement over time and align the vast resources our institutions direct at improving teaching with our evaluation (value) systems.

There is a growing national movement within the academy (and, indeed, outside) to use teaching evaluations as a lever for change. National organizations—including the Association of American Universities, Cottrell Scholars, the National Academies, disciplinary societies, funders, and accreditation organizations—are attending to the need for and potential impact of improved teaching evaluations.

We know how. There are decades of scholarship on better models and processes for evaluation, much of it drawing on the foundational work of the Carnegie Foundation for the Advancement of Teaching. In parallel, we know more than ever before about the nature of institutional change and how to implement successful and sustainable reforms in higher education.

What Does a Scholarly Approach Look Like?

While there are multiple successful models of teaching evaluation, they share common principles that many of our institutions are well placed to enact. In fact, many of our existing practices are fruitful and can be adapted to scholarly approaches. 

Collecting appropriate data: Three voices for teaching effectiveness. While SETs themselves are often problematic, engaging students is essential. We simply need to collect data that students are better suited to provide. Similarly, faculty peer observation is common but highly varied in practice. Finally, instructors themselves are key in providing insight into their own approach. Thus, some successful models of assessment involve providing better tools and guidance for the three key voices: students, peer review, and self-reflection.

For example, instead of asking students to rate the professor or state whether they’ve learned a lot (questions they are not well prepared to answer), students can be asked to reflect on their instructor’s practices and approaches and the opportunities provided. An assessment of student work may be used in the evaluation. Peer observers, using research-based rubrics, can report on the effective use of practices that are known (or not) to impact student outcomes. Instructors reflecting on their own work provide essential insight into the design, outcomes, externalization, and revisions of their work. Notably, the very act of collecting each of these data sources can contribute to the professional development of those providing the data!

A scholarly structure for teaching evaluations. A number of models for better teaching evaluation structures exist. The model for our work at the University of Colorado, Boulder, in collaboration with teams at the University of Kansas and the University of Massachusetts, Amherst, is based on the more expansive understanding of academic scholarship first proposed by Ernest Boyer in 1990.1 Building on the early Boyer work and that of subsequent researchers in higher education,2 we identify seven categories that may be examined in the evaluation of teaching. How the evaluation of these categories is realized in practice, and the relative weights across them, will depend upon the discipline. Nonetheless, we see these as spanning the space of scholarly teaching evaluation.

  • Goals, content, and alignment: What are students expected to learn from the courses taught? Are course goals appropriately challenging? Is content aligned with the curriculum? These may be measured by peer observation of practice, review of syllabi, and a self-assessment of the faculty member.
  • Preparation for teaching: Does the instructor have the requisite content/background knowledge and understanding of classroom preparation? Again, these may be assessed through self-reflection, review of materials, and peer review of the instructor.
  • Methods and teaching practices: Is class time used effectively? Are evidence-based practices used? Are these aligned with the course, department, and campus goals and appropriately designed for the whole student population? Here one may assess through student survey of practice, peer review, and faculty self-assessment.
  • Presentation and student interaction: How are the methods enacted? What are the students’ views of their learning experience? How has student feedback informed the faculty member’s teaching? One may use surveys of students, observation of practices, and reflection to assess these ends.
  • Student outcomes: What impact do these courses have on learners? What evidence shows the level of student understanding? Does this class have long-term impacts on student persistence, inclusion, etc.? Student voices, peer observation, campus data analytics, and self-reflection will inform these outcomes.
  • Mentorship and advising: How effectively has the faculty member worked individually with undergraduate or graduate students? Reports, letters, and surveys of students, peer observations, and evidence from the faculty member under review are used in measuring these outcomes.
  • Reflection and teaching service/scholarship: How has the faculty member’s teaching changed over time? How has this been informed by evidence of student learning? In what ways has the instructor contributed to the broader teaching community, both on and off campus? Reflective analysis by the instructor and material artifacts (e.g., publications and presentations) will demonstrate the level of proficiency here.

As efforts currently confined to individual campuses strengthen and this movement evolves, we are in a position to engage collectively, share resources, enact locally, and demonstrate how these practices work. While significant work is going on at institutions across the country, there are opportunities for individuals to make the case for scholarly approaches to teaching evaluation and to showcase better assessment practices at their institutions and within professional societies and organizations. 

1Boyer E (1990). Scholarship Reconsidered: Priorities of the Professoriate. Jossey-Bass. (An expanded version was published in 2015.)

2Glassick CE, Huber MT, Maeroff GI (1997). Scholarship Assessed: Evaluation of the Professoriate. Jossey-Bass.

To Learn More 

Dennin M, Schultz ZD, Feig A, Finkelstein N, Greenhoot AF, Hildreth M, Leibovich AK, Martin JD, Moldwin MB, O’Dowd DK, Posey LA, Smith TL, Miller ER (2018). Aligning practice to policies: Changing the culture to recognize and reward teaching at research universities. CBE—Life Sciences Education 17(4), es5.

Flaherty C (May 22, 2018). Teaching eval shake-up, Inside Higher Ed.

National Academies of Sciences, Engineering, and Medicine (2018). Indicators for Monitoring Undergraduate STEM Education. National Academies Press.
