In software and I.T., as in life, things go wrong. If the only use of data is to find out who is responsible, don't expect people to be enthusiastic about capturing and using that data. There is always a context when issues arise. Charts that just show data usually don't convey that context, leaving people feeling angry, singled out, and judged without a hearing. If success to anyone means staying OFF a naughty list, data chart, or email, then don't expect accurate or un-gamed data to emerge. The first rule of new-age data is to avoid making any visualization personal without a clear purpose.
One place I was consulting with had a great idea to focus on quality. They decided that defect counts would be an indicator that teams were moving too fast and that their software was becoming unreliable and unreleasable. They created a dashboard, displayed proudly in public spaces, showing the number of defects currently logged against each developer by name, sorted highest to lowest. Think about this for a second: your name, listed in red because you have ten or more defects assigned to you. And that evening, you get an email from the V.P. saying it's unacceptable that you have so many open "mistakes" against your name.
How can putting the focus on quality like this be bad? For many, many reasons:
We want defects reported. Are you more or less likely to enter bug reports into the work-tracking tool if doing so gets you placed on the Naughty List? Expect bug reports assigned to you or your friends to end up on sticky notes, and other teams' bug reports to be entered multiple times using different wording.
Does it matter how many defects there are if that code isn't shipping or going live in the next release? If code has to be perfect before it can even be deployed to a test environment or put behind a feature flag, should you just test longer (say, six more weeks)? If no one can give early feedback, who finds the other ten defects or design flaws in a timely fashion?
Are all defects equal? Would you hold back a fix for a significant feature because of some small layout defects in one specific language-and-browser combination? Quality is relative, not absolute. Withholding a release over a minor defect reduces quality as a whole for everyone.
Does it matter to whom the defects are assigned? Would just knowing there is a spike in defects be enough? The names of people are a secondary concern and a less useful one (as long as those people know, who cares if you know).
The mistake this organization made was to show data without context. No priority context. No customer-impact context. No "for feedback" context. Just "you are a failure" for writing code and committing it to a test environment. The result was devastating (although the executive team never realized it): defect counts appeared to go down, yet the quality of the delivered product was unknowable at any point. People simply adapted by hiding defect data and by not integrating or committing early, so feedback arrived too late to act on. Yuk.
Another form of judging that causes data gaming or misrepresentation is the "need" to categorize people or teams as high, medium, or low performers: stack-ranking them. When we apply these sorts of summary labels, we are just asking for data manipulation. Most often, it isn't even necessary. If we are looking at the next step for improving, why does it matter where we are on someone else's journey? What matters is the decision that applies to our context: our "next steps" or "needed action." Worse still, those labeled "High Performers" have tough times too, but as a high performer you feel less urgency to react early to adverse trends; you are, after all, "high performing" (for now). Nothing is gained by these labels, and we should purge our industry of such simplistic (and flawed) ideas.
A common place I see this happen is in team performance comparisons. Comparing a newly formed team against a long-term team doesn't give any useful information other than the gap. The newly formed team could be improving miraculously, yet still sit below a long-term team whose performance has been declining rapidly. The correct assessment is "good job, new team," but if we stack-rank teams into high, medium, or low performers, the new team is still "lower." Being judged against others out of context leads to all manner of data gaming and, most importantly, stupid decisions.
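To make the point concrete, here is a minimal sketch, using hypothetical weekly throughput numbers and plain Python, of how a stack rank on the latest snapshot labels the improving new team "lower," while the trend says the opposite:

```python
# Hypothetical weekly throughput (completed work items) for two teams:
# the long-term team is declining, the newly formed team is improving.
long_term_team = [40, 37, 34, 31, 28, 25]
new_team = [5, 8, 11, 14, 17, 20]

def slope(series):
    """Least-squares slope: average change per week."""
    n = len(series)
    mean_x = (n - 1) / 2
    mean_y = sum(series) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(series))
    den = sum((x - mean_x) ** 2 for x in range(n))
    return num / den

# A snapshot "stack rank" sees only the latest numbers (25 vs 20),
# so the new team gets labeled "lower performing"...
print("Latest week:", long_term_team[-1], "vs", new_team[-1])

# ...while the trend tells the real story (-3.0 vs +3.0 items per week),
# which is the "good job, new team" assessment.
print("Trend per week:", slope(long_term_team), "vs", slope(new_team))
```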
In closing, there is often very little upside to showing data publicly at an individual level. It has many significant downsides that mean data becomes needlessly incomplete, gamed, or both.