A Clear Path to Success: Making Grading Motivational
When students can see their progress, they believe they can improve.
I’m a Feldman Fanboy. It’s no mystery. I’m not trying to hide it. If I heard about a Grading For Equity book signing happening at exactly the same time as a Model Train-a-Palooza in my neighborhood, I’d choose the book signing. And that’s saying something from this reformed Thomas the Tank Engine acolyte who has very seriously considered making Thomas my first and only tattoo.
The most helpful and memorable part of Grading for Equity, Joe Feldman’s 2019 call to action, is the three pillars of equitable grading practices. If your goal is to assess students in a way that doesn’t inherently disadvantage certain groups, Feldman argues, you must make sure that your grading practices are:
Bias-Resistant
Accurate, and
Motivational
Feldman goes into all sorts of detail on these, and I can’t recommend his work highly enough (this is the part where you go get yourself a copy, Do-Not-Pass-Go-Do-Not-Collect-$200 style).
My beloved team of brilliant computer science teachers committed to following the Book of Joe, and we took some major steps in that direction: we made CS1, our intro class, pass-fail. We developed a set of core competencies that cut across all levels of CS in our course catalog. And we especially leaned into community-building practices (as I previously wrote about here—click that link if you want to make me feel good about myself).
But here’s what I really want to discuss: in CS2—our AP-level class—we assign a grade. This inevitably makes it complicated to assess students in a way that is bias-resistant, accurate, and motivational. Once grades are introduced into the assessment and feedback process, accuracy drops and bias creeps in1. We have all heard a student say something like, “You’re so lucky you’re in Mr. So-and-So’s class, he’s a wayyy easier grader than Ms. Blah-di-Blah.” As is often the case, the kids have a point. This is a classic example of biased grading: students with identical understandings often receive unequal grades simply because of who is grading them.
Now, this specific problem can be mitigated by rubrics and collaborative grading, but there are still issues of accuracy and motivation: what do we do to encourage students who fall short of their (and our) expectations? What do we build into the grading to give students honest feedback while encouraging them to keep trying?2
The answer is resolutely not a traditional grading scheme. In these systems, teachers allocate points for specific assignments, and a student’s grade comes from the points earned divided by the points possible. The problem with this approach is that both students and teachers live under the tyranny of points: regardless of the story of a student’s learning and progress and the very human life they live concurrently with our classes, the points dictate the grade they receive.3
This led me and my team to a standards-driven approach.4 In a standards-driven curriculum, instead of allocating points for each event in the class (like a quiz or a project), teachers look for evidence of student learning relative to the course’s goals, called standards. In these environments, students tend to be more motivated to continue learning because the goals are crystal-clear.5 Furthermore, this allows teachers to ditch the points-driven approach and, instead, follow a data-gathering approach to grades.
In a data-gathering approach, we begin with a standard, then look for evidence of the student’s level of understanding. This evidence can come from wherever a teacher deems reliable—a quiz, a group project, a paper, a conversation—and that evidence is added to the student’s data. And then that data can be interpreted. This approach allows us to listen to the data to draw conclusions about what a student knows. As math ed researcher Peter Liljedahl puts it, “We need to let the data talk to us, rather than allowing points to rule us.”6
This approach can be empowering for students and liberating for teachers. But to make it work in CS2, we knew we needed to get our students on board. Any time you break a well-known precedent in how school works, you need to clearly communicate with your students. Furthermore, a critical component of a successful standards-based curriculum is transparency. The grading cannot be a mystery to students: they need to be able to access and understand their grades and feedback as they progress through the course.
With this goal in mind—to clearly report feedback and grades in a way that embodied transparency, accuracy, and motivation—my team and I created our system of reporting: the CS2 Dashboard.
🟥🟨 Dashboards 🟦🟩
Here’s the key idea of our approach: a student in CS2 can always access a clear visualization of their progress in the course. They can see the standards, and they can see the evidence that has been gathered. This allows them to know how they’re doing and how they can improve.
To make this a reality, we created a Google Sheet for each student. As evidence was collected throughout the year, we added it to the sheet. By the end of the year, they looked like this:
You can see an example dashboard here and a template dashboard here if you’d like to make your own.
Let's take a tour of the dashboard. On the left we have content standards: very specific skills and knowledge that are introduced throughout the year and re-assessed many times. Here’s an example of what a student’s content standards could look like after gathering some evidence over time:
In this student’s case, there is a clear story: while most of the skills were evaluated from one piece of evidence (the unit 05 quiz), the score for the last skill was increased in light of new evidence (the final project). Clearly, the student knows exactly where they can improve (just look for yellow or blue), and they can have faith that those skills will be re-evaluated when there is new evidence of their understanding (perhaps on a future project or the next quiz).7
On the right, we have competencies. These are larger, cross-cutting skills that we evaluate on major coding projects throughout the year:
To determine a student’s overall score for a content standard or competency, we looked at the available evidence and came to an appropriate conclusion. For the most part, we used the student’s best score to determine their overall score. This score could come at any point in the year: on a quiz, on a project, on the final exam—the timing of a student’s learning did not matter as much as how well they learned. We never penalized students for early not-knowing in a course: as long as they kept at it, they could only improve.
Finally, to calculate a grade, we weighted the content standard scores and the competency scores 50-50, and used a GPA scale to translate our 1-4 point system to a letter grade.
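For the spreadsheet-averse, here’s a minimal Python sketch of that arithmetic. The letter-grade cutoffs below are illustrative (a generic GPA-style table), not our school’s exact conversion:

```python
# A minimal sketch of our grade arithmetic: best score per standard, then a
# 50-50 weighting of content standards and competencies on the 1-4 scale.
# The letter-grade cutoffs are illustrative, not our school's exact scale.

def overall_score(evidence: list[float]) -> float:
    """Overall score for one standard: the best score across all evidence."""
    return max(evidence)

def course_grade(standard_scores: list[float], competency_scores: list[float]) -> str:
    """Weight content standards and competencies 50-50, then map 1-4 to a letter."""
    avg = 0.5 * (sum(standard_scores) / len(standard_scores)) + \
          0.5 * (sum(competency_scores) / len(competency_scores))
    for cutoff, letter in [(3.7, "A"), (3.3, "A-"), (3.0, "B+"), (2.7, "B"),
                           (2.3, "B-"), (2.0, "C+"), (1.7, "C"), (1.0, "C-")]:
        if avg >= cutoff:
            return letter
    return "F"

# A hypothetical student: four content standards and two competencies.
print(course_grade([4, 3, 4, 2], [3, 4]))  # 0.5 * 3.25 + 0.5 * 3.5 = 3.375 -> "A-"
```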
Students had access to their dashboards throughout the year, and we would update them whenever we had new evidence. We took time to explain this system to the students at the beginning of the year, and we continued to answer questions about it as the year progressed.
A few nuts & bolts:
We created one template sheet with the headers, the formulas, the formatting, and the standards, but with no scores or feedback.
Then, we duplicated this sheet for each student (we used a Chrome extension to automate this for us; you could also script it, as in the sketch after this list). Each student sheet was linked to the template, so if we added a new standard to the template, it would automatically appear on every student sheet (cool, right?).
Finally, we shared each sheet with the appropriate student. The student has comment access to the sheet, but they do not have editing access (for a variety of reasons, from technical to ethical).
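If you’d rather script the copy-and-share step, here’s a rough sketch using the gspread Python library and a Google service account. The template ID and student roster are placeholders, and this only covers copying and sharing (keeping each sheet linked to the template is a separate step):

```python
# Rough sketch: duplicate a template spreadsheet for each student and share
# it with comment-only access. Requires the gspread library and a Google
# service account. TEMPLATE_ID and the roster are placeholders.
import gspread

gc = gspread.service_account()  # loads service-account credentials

TEMPLATE_ID = "your-template-spreadsheet-id"
roster = {"Alma": "alma@school.example", "Bryan": "bryan@school.example"}

for name, email in roster.items():
    # Copy the template as a brand-new spreadsheet for this student.
    dashboard = gc.copy(TEMPLATE_ID, title=f"CS2 Dashboard: {name}")
    # Comment access lets students ask questions without editing their scores.
    dashboard.share(email, perm_type="user", role="commenter", notify=True)
```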
🔎 Major Findings:
So, how did it go? Did this system actually create a more accurate, transparent, motivational environment? As usual, I went to the students to find out. I collected survey data (n = 23) and conducted student interviews (n = 5), and here’s what I learned:
Students found the dashboards intuitive and easy to understand. 23 out of 23 students noted that they understood the information on the dashboards. Students found the dashboards easy to access, and they especially appreciated the color-coding.
That said, there was some confusion about how the scores translated to grades. Additionally, students noted that they weren’t exactly sure how many times they needed to show improvement for their scores to increase. To honor our students’ suggestions, we made some tweaks to the dashboards for next year. You can see version 2.0 here.
Students looked to their dashboards to know how to improve—and they believed that they could. 21 of 23 survey respondents noted that the dashboards helped them identify specific areas where they could improve their understanding (and, in doing so, raise their grade). Even more importantly, these same respondents noted that they believed that they could improve. For one student, this belief came from the data-collection, standards-based approach:
I know where my grade stands and I know how to improve it. Other classes don't make this effort, but in CS, it was super easy, because when starting a new project, I could look at my dashboard and know what I wanted to improve. For most classes I don’t know what’s being assessed.
For another student, it was about transparency:
Your grades are right there for you and not behind closed doors makes me feel more involved — it tells you that it’s all here, and nothing is outside of that. There’s security and transparency in that.
Or, as another student put it:
I do like the system when you know your grade at all times. I find there to be stress to be in the unknown.
Students, on the whole, thought the dashboard accurately reflected their strengths and weaknesses — and they were motivated to improve. Again, 21 of 23 survey respondents agreed that the dashboards were accurate: they felt that their strengths were clearly reflected, and the standards with lower scores were definitely the ones they didn’t fully understand. This accuracy felt more fair to the students than the grading systems in their other classes, as one student noted:
By breaking up the grade, it feels much more connected to your actual experience in the class. In math, if I have a bad day, that has a significant impact on my grade, and that doesn’t represent my capabilities—on some days you do good, on others you do bad, but your grade doesn’t necessarily represent that—but if you take lots of data points and many points in the year, it’s hard to deviate from who the student really is.
More importantly, the knowledge that they would be re-assessed as the year went on motivated students to continue learning. One interviewee, for example, noted:
I like how the dashboard assesses growth, and what really matters is that you’re making progress. I prefer that for grading because it better represents the effort you put in over the year to better understand the concepts.
And even when students received lower scores, they were less stressed and more motivated than in their other classes:
If I feel like I did well and I got a 2, it’s kind of disappointing, but I use it as knowing what to focus on the next time. I don’t think it was ever a major stress thing for me.
For me it was less stressful than other classes because I know that I can improve whatever score I get. I focused less on my grade because I knew I could improve it.
I think I saw improvement, especially in the content standards, I went from 2s and I don’t think I have any left. It gives you target areas to focus on, and it’s easier to know what you’re good at and what you’re not.
I realize how much more motivated I was to get better in CS because it seemed easy — in other classes, if you get a bad score, you feel defeated, but in CS you feel like you know where you can improve. It’s a very clear path to success.
🧐 What’s Next?
Next year, to calculate a student’s overall score on a standard, I’d like to try using a student’s second-highest score from the assessments. So, for example, if a student received a 2, a 3, and a 4 when assessed on a standard, their overall score would be a 3. But if they score 4 one more time, then their overall score would rise to a 4. The idea here is that a student needs to show mastery twice to convince me that they truly understand that standard. This seems to be a more accurate way to determine a student’s level of understanding of a standard from the data.8 And, of course, we can always gather more data from a student if we feel that we have insufficient data to draw conclusions.
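In code, the rule is tiny. Here’s a quick Python sketch (my own illustration, with a judgment call for standards that only have one data point so far):

```python
def overall_score(evidence: list[int]) -> int:
    """Overall score for a standard under the second-highest rule.

    A student has to hit a score at least twice before it becomes their
    overall score, so a single data point can't define them.
    """
    if not evidence:
        return 0  # no data gathered yet
    if len(evidence) == 1:
        return evidence[0]  # one data point: nothing to corroborate yet
    return sorted(evidence, reverse=True)[1]

print(overall_score([2, 3, 4]))     # 3: mastery shown only once so far
print(overall_score([2, 3, 4, 4]))  # 4: mastery confirmed a second time
```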
One last note: my students made it clear to me in their interviews that this grading and feedback stuff did not matter as much as knowing that their teacher cared about them. This reminded me that a genuine sense of trust will always matter more to students than the intricacies and nuances of our grading policies. But, at the same time, I feel even better about this approach to feedback and grading, because it sends the message to students that we care about them—that we want them to trust us. Because we want them to succeed.
1. Feldman, J. (2019). Grading for Equity. Corwin.
2. It’s tempting to implement a system like Asao Inoue’s labor-based grading, which assigns a grade simply based on the amount of time students put into an assignment. Others swear by it as a motivational practice to teach writing — check out “Labor-Based Grading: A New Ethic of Writing Feedback” by Seth Czarnecki in English Journal, July 2023 (Vol. 112, #6, pp. 56-62).
While this approach can work for writing, I’m not convinced it would work in a CS classroom. If all we do is grade students’ effort, yes, they will definitely learn. But my concern is that we lose an opportunity to clearly outline for students what we hope they learn. I think it leaves a little too much of the learning to chance, as it’s unclear to me how labor-based grading can push students toward areas of the greatest discomfort and room for improvement.
3. Consider, for example, three students taking six exams over a semester: over time, Alma scores 50, 60, 75, 95, 100, 100. Bryan scores 100, 100, 75, 75, 75, 55. Chloe scores 100, 60, 100, 60, 100, 60. Who would you say best mastered the material? The obvious answer is Alma, while clearly something tragic happened to Bryan at some point in the semester. And Chloe must have something going on, too — perhaps she splits her time between her divorced parents’ homes, and she performs much better when living with one parent. Yet, if we use a traditional points-driven grading system, all three of them would receive the same grade — a B-. Does that really make sense?
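You can verify the arithmetic with a few lines of Python:

```python
# Each list sums to 480 over six exams, so every student averages 80,
# which lands at a B- on a typical points-based scale.
for scores in ([50, 60, 75, 95, 100, 100],    # Alma
               [100, 100, 75, 75, 75, 55],    # Bryan
               [100, 60, 100, 60, 100, 60]):  # Chloe
    print(sum(scores) / len(scores))  # 80.0
```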
4. There are no Common Core State Standards for computer science, so we wrote our own. We worked together as a department to identify cross-cutting competencies, and we modeled the content standards on the AP Computer Science A standards, as well as our own experience and scholarly expertise.
5. Muñoz, M. A., & Guskey, T. R. (2015). Standards-based grading and reporting will improve education. Phi Delta Kappan, 96(7), 64–68. https://doi.org/10.1177/0031721715579043
6. Liljedahl, P. (2021). Building Thinking Classrooms in Mathematics, Grades K-12. Corwin.
7. Rinkema, E., & Williams, S. (2019). The Standards-Based Classroom: Make Learning the Goal. Corwin.
8. Liljedahl, P. (2021).