Testing the Limits With Cem Kaner – Part I
After almost a year of being told that we “have to interview Kaner” by previous Testing the Limits guests and readers, we exercised our listening skills and sought him out. With us this month to share his unique brand of wit and wisdom is Dr. Cem Kaner – author, lawyer, speaker, professor and one of the most respected minds in the testing world.
In part I of our interview, we ask Cem to share his thoughts on the multi-disciplinary nature of software testing. His response includes thoughts on experimental psychology; law; testing metrics; arrogance in the field of testing and more. Check back for part II and part III in the next two days.
*******
uTest: In your online bio, you say the theme of your career has been to “enhance the overall safety and satisfaction of software”. To do so, you’ve studied (and worked in) areas like psychology, law, programming, testing, technical writing and sales. Explain how working in other disciplines has helped you better understand software. And on that note, what can testers learn from lawyers, writers and salespeople?
Kaner: Let me start this by saying that almost all of the best people I know in testing have significant experience in other fields. It’s common for people to move from testing to programming or writing or marketing and then back, bringing what they’ve learned with them, to test with a richer perspective and with a much more productive vision of where testing can fit within development/marketing/support cycles.
We write software to solve problems that people need solved, to do things that people want done, or to entertain people. To understand a piece of software, I need to understand why it was written, who it was written for, why they should want to use it, and what alternatives might serve them as well or better.
So I don’t see the “other disciplines” as “other.” They all contribute to this understanding in important ways.
You asked specifically about how multiple approaches fit into my own work and understanding. That’s a more personal story…
I came into software development with a doctorate in experimental psychology. I did a lot of programming (and writing about code) as a student and was deeply interested in what made products learnable and usable. A specific interest was what made a person more or less likely to make a user error. I wrote several data entry programs for large sets of scientific data. It was remarkable how much my design choices influenced the types of errors people made. What people call “user errors” are at least as much a feature of the program’s design as they are of the people who make the errors.
When people make a repeated error using my code, instead of asking why these people are idiots, I learned to ask what’s wrong with my software that causes the nice people to look like idiots.
Let me generalize this—the quality of a program extends far beyond its functionality. There is a huge gap between “works right” and “works well.” From my viewpoint, design choices that needlessly reduce the value of the program are defects, every bit as much as coding errors.
My dissertation focused on measurement of perceptual experiences (a field called psychophysics). We asked questions like “How loud is this sound?” Hearing is an experience. The physics of the sound can help us predict some aspects of that experience, but loudness is about what it sounds like to you, not the physics.
It’s pretty easy to think about how to measure the length of a table. Use a ruler. But what ruler do we use for sweetness? In order to think clearly about this, we need to clarify for ourselves the construct (the idea of) “sweetness.” What does someone mean when they say that X is sweeter (to them) than Y? Then we have to imagine a measuring instrument (like a ruler) that would give us low numbers for not-sweet, high numbers for very-sweet, and reasonable distances between different amounts of sweetness.
“How do we know that this (ruler) measures that (construct)?” is one of the fundamental questions of measurement. It even has a name, “construct validity.” In every field that I’ve studied other than computing (I’ve looked at measurement theory in a lot of fields), when people develop measurements or “metrics” and when they teach metrics, they talk seriously about construct validity. In contrast, construct validity is almost never mentioned in computing. (Do a search for “construct validity” in ACM’s Guide to the Digital Literature! Filter out the papers in business-oriented or medicine-oriented journals. What do you have left?). We have textbooks with dozens (I taught from one text with hundreds) of “metrics” and no practically-applicable system for assessing what these “metrics” measure.
Capers Jones sometimes talks disparagingly about the (claimed) fact that 95% of American software companies have no metrics program. On the surface, this sounds terrible. But what I saw as a consultant was that most software companies have tried a measurement program, or have executives with lots of experience with metrics programs in other companies. The problem is that their experiences were bad. The measurement programs failed. Robert Austin wrote a terrific book, Measuring & Managing Performance in Organizations. When you start measuring something that people do, people will change their behavior to make their scores better. People will change what they do to get better scores on the measurements; that’s what they’re supposed to do. But they don’t necessarily change in ways that improve what you want to improve. Often, the changes make things worse instead of better (a problem commonly called “measurement dysfunction”). This problem happens more often, and worse, if you use weak, unvalidated metrics. I keep meeting software consultants, especially software process consultants, who say that it’s better to use bad measurements than no measurement at all. I think that’s a prescription for disaster, and that it’s no wonder that so many software executives refuse to harm their businesses in this way.
By the way, what if one person repeatedly says X is sweeter than Y and another person says Y is sweeter than X? Are these people wrong about their experiences? Probably not. So is X sweeter than Y? Yes. Is Y sweeter than X? Yes. Have fun putting THAT on your measuring stick.
Jerry Weinberg defines quality as “value to some person.” It’s subjective, like sweetness. Different for different people. Quality is not quantity. But that doesn’t make it any less real.
In terms of the contributions of multiple fields to software, it’s no surprise that Weinberg’s doctoral work was in psychology or that he is so popular a thinker in the testing community. When we evaluate the impact of a new piece of technology on humans, we are doing applied social science research. To say that testing is fundamentally different from the social sciences is to miss the point of much of testing AND to turn a blind eye to tools developed over centuries that can help us do our work.
Let me switch disciplines–I started working in Silicon Valley in 1983. I watched some excellent companies fail and weak competitors succeed. A striking commonality was that companies would release products with missing features or features that didn’t work, often lying about what they offered. They would gain market share so quickly that they could drive out more honest companies. Most bad companies eventually failed but they cost the marketplace its best suppliers time and again. As a tester, and sometimes as a testing consultant, I was encouraging topnotch staff to hammer products very hard, to expose bugs and argue effectively to get them fixed – to do the right thing by their customers. But the effect of this was that companies were coming to market late with a better product but a worse competitive position. Doing the “right” thing was being punished in the market.
This REALLY troubled me.
I wrote Testing Computer Software over a period of 4 years (1983 to 1986). In many ways, it was an excuse for me to ask people probing questions and think with them about answers.
In terms of markets, I learned that in mature industries, North American society places expectations on companies. We expect companies to advertise honestly, to make products that do what they are supposed to do (maybe not perfectly, but pretty well), to fix their defects for free or give customers their money back, and to allow their products to be analyzed by journalists who help customers distinguish good from bad products. As a society, we write those expectations into law. When we enforce the laws, we weed out the worst companies and create a safer market for honest businesses and their customers. These rules are less rigid for companies that sell services (e.g. custom programming), but in mature industries, service expectations (and rights) are pretty demanding too.
I came to believe that the core quality problems in software were (and still are) to a very large degree the result of loose enforcement of laws governing fraud, breach of contract, breach of warranty, and service negligence. I learn by doing. To learn more, I became an Investigator (part time, volunteer) in the County of Santa Clara’s Consumer Affairs Department. (Most of Silicon Valley is in Santa Clara County.) After a couple of years, I decided to go to law school and to focus on the law of software quality.
From 1984 to 1987, I worked in a telephone company in a user interface group. A telephone is a user interface to a richly featured, complex network of computers. Our particular system was a really early provider of integrated voice and data. You plugged your computer into the phone to be on the network, rather than into a modem or an Ethernet cable. We also delivered the first phone system that was actually for sale (rather than demonstrated in research laboratories) that had an LCD display. The menus on our 2 line x 40 character display gave access to 108 voice features and 110 data features.
In our group, we could all program and we all knew a lot about usability. We evolved our system’s user interface over years, in an evolutionary manner (add one or a few features and get them working before adding the next). We moved fluidly between customer focus and code focus, and we modified each other’s code all the time, trying out new ideas for design, which we (and the rest of our company and sometimes our external customers) assessed at the systems level in weekly deliveries of new system builds. It was a great experience.
I don’t want to downgrade the technological challenges of this work. We had to pack a ton of capability into small spaces and have it run very quickly and extremely reliably. The programming side of my work was very difficult. But the programming would have been meaningless without a focus on who the programming was for and it would have been ineffective without an ongoing stream of information about what was working (design and code) and what was not.
At another company (Power Up Software) I was a Software Development Manager. When I started, my task was to design products for small businesses, help plan the marketing of these products, and manage programmers (typically remotely located programmers) who were to write these products. I wanted to turn my products into best sellers but I didn’t know how, so I took a job at Egghead Software, then a successful software retailer, and sold my best competitors’ products to Egghead’s customers. These people didn’t just buy once and disappear. They came back to the store every week or two, buying new things or complaining about old ones. You learn a lot about what makes a product great, and not-so-great, by selling it to people and then dealing with them, face-to-face, over and over. What is quality? Quality is what made my customers happy. What’s a defect? A defect is what made my customers shout at me.
Some people would tell me that quality is conformance to a specification and a defect is a nonconformity. But really, what’s a specification? A specification is a piece of paper, often written by someone who knows nothing about quality.
About those specifications. I did a lot of school learning about software process. I came to Silicon Valley expecting to see lots of well written specifications. What I actually saw were a lot of scribbles on napkins (from the local pub or restaurant), scribbles on flipcharts, and scribbles on whiteboards. They were informal and they were updated as we came to understand our customers, our competitors, and our products better. Many of the most tedious-to-use, most poorly-designed products that I worked on or consulted about were coded to meet a detailed specification. Famous consultants would come visit our companies or teach courses that I sent staff to, and they would tell us that The Right Way to build software was to write everything down at the start and then to build to that specification. But the start is when you know the least about whatever you are creating. As you create it, you learn. Why make your decisions when you don’t know as much as you’ll know tomorrow? Why set up change control processes to make it expensive to fix all the mistakes you made when you were more ignorant than you are today?
Of course, there are reasons to set up change control processes. And there are reasons to develop a clear idea of the product’s scope early on (try designing an appropriate architecture without this).
But as the Extreme Programming advocates came to say much more eloquently than I knew how, knowing the big picture is one thing, but there is rarely a good reason to lock down details until you need to code them. And you cannot call something a “quality” process if it has you refusing to improve a product because it makes the cost of testing too high or the paperwork too expensive.
While I was dev manager (and later, director) at Power Up, I went to law school (you don’t learn much humility there, but it’s good for other things), went to University of California Extension for several courses on technical publications management (to help me work with a tech pubs group that adopted me as their manager), and wrote the second edition of Testing Computer Software.
Hung Nguyen joined me in writing TCS 2.0, but on the condition that I take courses in traditional quality control. He had a B.Sc. in Quality and wouldn’t write with me until I got ASQ-certified in quality engineering. His point was that I had become an expert in informal processes that worked on (relatively) small projects, but in his view, I needed to understand what effective control processes were and why they were important on larger projects and formally-negotiated projects (outsourced development, especially when your customer is the government). That blending helped us make TCS 2 a much better book than the original.
The diversity of my situations taught me powerful lessons about how little I know, how much the other people on the project know that I don’t know, and how complex are the tradeoffs faced by the average project manager. When I work as a tester, I understand that I am playing a supporting role. It is an important role. It calls for skilled and thoughtful work. But as James Bach puts it, I am “the headlights of the project”, not the driver, not the brakes, not the engine, and not the door to the customer. I can change roles. Maybe I can work two or three roles at the same time. But IN MY ROLE AS A TESTER, I don’t understand as much about the rest of the project as many other testers (and test consultants/trainers) seem to think they understand. There is a lot of arrogance in our field. We need to learn more humility.
I graduated law school in 1994, passed the bar exam and was sworn in as an attorney. I went back to the Santa Clara County government, this time to work as a full-time volunteer in the District Attorney’s office. My ideas about fraudulent and unfair trade practices had deepened into a passion. As a DA, I prosecuted about 130 cases, including 5 trials. None of this was white collar crime (e.g., economic crime by corporations) but I learned how prosecutions were run, how crimes were investigated, how cases were organized, negotiated, and occasionally, brought all the way to trial. I didn’t become an expert, but I gained more courtroom experience than most American lawyers gain in their professional lifetime. It was foundational knowledge.
Over the next six years, I spent a huge portion of my time as an advocate for a higher legal standard for software quality. I helped write the Uniform Electronic Transactions Act (UETA), which was adopted by almost all American state governments. The federal government renamed it the ESIGN Act (electronic signatures) and applied it to all the states so that we had nationwide uniformity in some critical aspects of electronic commerce. I also tried to help write, and then led the opposition to, what became the Uniform Computer Information Transactions Act (UCITA), a bill that, in essence, rewrote contract and copyright rules to benefit large software companies and large companies that embedded software in their traditional products (e.g., cars). UCITA ultimately failed despite the millions spent on it. And one of the most important legal organizations in the United States, the American Law Institute, elected me as a member. The ALI has up to (a maximum of) 3000 members. Most of them are judges or tenured law professors. Some are senior partners in large law firms, a few are famous consumer advocates. I think I was the least experienced lawyer ever elected to ALI and every time I go to one of their meetings, I know (or remember quickly) that every person in that room knows more about the law than I do. But I know a little more about computers-and-law than some of them, so sometimes, my background is useful.
ALI writes books that judges pay careful attention to. In the United States, and in all other countries whose systems evolved from British Common Law, laws passed by legislatures are more like detailed statements of principles. They do not cover every situation. They can’t. So judges have to figure out how a body of laws applies to each particular dispute before them. Across different states, judges might apply the same laws differently. ALI helps organize and make sense of the diversity of decisions. They publish the Restatements of the Law (e.g. the Restatement of Contracts, the Restatement of Torts, etc.) and Principles (e.g. Principles of the Law of Software Contracts). After UCITA failed (only two states adopted it, dozens of states rejected it), ALI let the dust settle for a couple of years and then started writing the Principles of the Law of Software Contracts. One of its most important rules is one that I advocated: a seller of software who knows about a defect of the software but does not disclose the defect to the customer will be held liable for damages caused to that customer by that defect. Note that this does not apply to free software (not sold). And if the seller discloses the defect, it becomes part of the product’s specification (it’s a feature). And if the seller doesn’t know about the defect there is no liability (once customers tell you about a defect, you have knowledge, so you cannot avoid knowledge for long by not testing). ALI adopted it unanimously last year. This is not law, but until the legislatures pass statutes, the Principles will be an important guide for judges. Even though I am a minor contributor to this work, I think the defect-disclosure requirement might be my career’s most important contribution to software quality.
The Association for Computing Machinery recently honored this aspect of my work with its “Making a Difference Award” which is “presented to an individual who is widely recognized for work related to the interaction of computers and society. The recipient is a leader in promoting awareness of ethical and social issues in computing.”
Editor’s note: Read part II of the interview.







