Testing the Limits with Cem Kaner, Author of The Domain Testing Workbook

This month, we revisit Cem Kaner. Cem recently published The Domain Testing Workbook and is working on a collection of other workbooks and projects in addition to teaching several courses at Florida Tech.

In today’s interview, Cem explains his new workbook, discusses why it’s important for experienced testers to keep studying and improving, tells us what’s wrong with the testing culture today, and hints that he may have a solution to the QA credentials battle.

*****

uTest: You’ve written quite a few books on software testing and you had a new book – The Domain Testing Workbook – come out in the past few months. Why do you enjoy writing these books and who are you trying to help?

Cem Kaner: My overall goal is to improve the state of the practice in software testing. How can we improve what working testers actually DO so that they are more effective and happier in their work?

The Domain Testing Workbook is the first of a series that focuses on individual test-design techniques. Our intent is to help a tester who has some experience develop their skills so that they can apply the technique competently.

What’s wrong with the way domain testing is currently taught?

CK: There’s nothing wrong with the way domain testing is taught. Teachers introduce students to the two basic ideas: (a) subdivide a large set of possible values of a variable into a small number of equivalence classes and sample only one or a few values from each class. This reduces the number of tests to run. (b) When possible, select boundary values as your samples from each class because programs tend to fail more often at boundaries.

In general, students understand these introductions and can explain them to others.

This level of analysis works perfectly when you test Integer-valued variables one at a time. There are lots of Integer-valued variables, and it makes a lot of sense to test every variable on its own, if you can, before you design tests that vary several variables at the same time. So, I think many courses do a fine job of introducing a useful idea to students in a way that helps them use it.
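The two basic ideas can be sketched in a few lines of Python. This is only an illustration, not an example from the workbook: the field, its valid range of 1 to 100, and the `accepts` function are hypothetical stand-ins for whatever input filter is under test.

```python
def boundary_tests(low, high):
    """Return (value, expected_valid) pairs for an integer field valid in [low, high].

    The values fall into three equivalence classes: too small, valid, too large.
    Instead of sampling arbitrary members, pick the values at each boundary.
    """
    return [
        (low - 1, False),   # largest value in the "too small" class
        (low, True),        # smallest valid value
        (high, True),       # largest valid value
        (high + 1, False),  # smallest value in the "too large" class
    ]


def accepts(value, low=1, high=100):
    """Hypothetical input filter under test (assumed valid range: 1 to 100)."""
    return low <= value <= high


# Four tests stand in for every possible integer input.
for value, expected in boundary_tests(1, 100):
    assert accepts(value) == expected, f"unexpected result for {value}"
```

Note that this checks only whether the filter accepts or rejects each value, which is exactly the introductory level of analysis described above.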

However, there is much more depth to the technique than that. Here are four examples:

  • It is common to look only at the input values and decide pass/fail based on whether the program accepts “good” inputs and rejects “bad” ones. A stronger approach goes past the input filter. For example, enter the largest valid value. The program should accept this. Suppose it does. Now continue testing by considering how the program uses this value. What calculations is it used in? Where is it displayed, stored, or compared to something else? Is this largest-acceptable value actually too large for some of these later uses?
  • There are other types of variables, not just Integers. Different risks apply when you are testing floating-point numbers or strings. Dividing them into equivalence classes is a little trickier.
  • We usually test variables together. Any test of a real use of the program will involve many variables. Even if you leave most of them at their default value, the program considers the values of lots of variables when you ask it to do something meaningful. We can manage the number of combinations to test using techniques like all-pairs. In all-pairs testing, the tester chooses a set of maybe 10 variables to test, then chooses a few values of each variable to test, then uses a tool like ACTS or PICT to create a relatively small set of tests (maybe as few as 30) that will combine these values of these 10 variables in an optimized way. (ACTS and PICT are free tools from the National Institute of Standards and Technology and from Microsoft, respectively.) One of the challenges of this type of testing is picking the best values for the individual variables, and that brings us back to domain testing.
  • We often test variables together that are related to each other. How do you choose boundary values when the boundary (such as largest valid value) of one variable depends on the value of the other? This particular issue appears often in university textbooks but there hasn’t been enough practical advice for working testers.
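To make the all-pairs idea concrete, here is a minimal greedy sketch in Python. It is not how ACTS or PICT work internally, and it only suits small models because it enumerates every combination first. The variable names and values are made up for illustration.

```python
from itertools import combinations, product


def pairs_in(test, names):
    """All (variable, value) pairs that one complete test covers."""
    return {((a, test[a]), (b, test[b])) for a, b in combinations(names, 2)}


def all_pairs(parameters):
    """Greedy pairwise generator.

    Repeatedly pick the full combination that covers the most
    not-yet-covered value pairs, until every pair is covered.
    """
    names = sorted(parameters)
    candidates = [dict(zip(names, values))
                  for values in product(*(parameters[n] for n in names))]
    uncovered = set().union(*(pairs_in(c, names) for c in candidates))
    tests = []
    while uncovered:
        best = max(candidates, key=lambda c: len(pairs_in(c, names) & uncovered))
        tests.append(best)
        uncovered -= pairs_in(best, names)
    return tests


# Hypothetical model: three variables with three chosen values each.
parameters = {
    "browser": ["Chrome", "Firefox", "Safari"],
    "os": ["Windows", "macOS", "Linux"],
    "locale": ["en", "de", "ja"],
}
tests = all_pairs(parameters)
print(len(tests), "tests instead of", 3 * 3 * 3)
```

The generator only combines the values it is given. Choosing which few values of each variable are worth combining is still a domain testing decision.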

The Domain Testing Workbook goes beyond the perfectly-good introductory presentations that appear in many books and courses. We want to help testers apply the technique to situations that are a little more complex but still commonplace in day-to-day testing.
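One of those commonplace situations, a boundary that depends on another variable, can be sketched with a date field. This is an illustrative assumption rather than an example taken from the workbook: the largest valid day depends on the month and the year, so the boundary tests have to be computed per combination.

```python
import calendar


def day_boundary_tests(year, month):
    """Boundary tests for 'day' when its upper limit depends on year and month.

    Returns (day, expected_valid) pairs.
    """
    last_day = calendar.monthrange(year, month)[1]  # number of days in that month
    return [
        (0, False),             # just below the lower boundary
        (1, True),              # lower boundary
        (last_day, True),       # upper boundary, varies with month and year
        (last_day + 1, False),  # just above the upper boundary
    ]


# The upper boundary for February moves with the leap-year rule.
print(day_boundary_tests(2023, 2))
print(day_boundary_tests(2024, 2))
```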

Why did you decide to make this newest book a workbook?

CK: People learn what they do.

When I started designing technique workbooks, back in 2001, I thought I could meet the need with a big collection of worked examples. Publish one book of examples of domain testing. Publish another book of examples of scenario testing, etc. Each example would illustrate something interesting or difficult in this technique and then show how the technique deals with it. I figured that once I had a good set of examples, I could wrap a bunch of exercises around them so that readers could try the problem themselves, then see the solution (the worked example) in the book.

I think the best way for people to develop skills is to do something, get feedback on how to do it better, improve it (or do something similar), get feedback, and keep doing this with problems that are increasingly difficult or that apply the technique in new ways.

The surprise for us was that people don’t learn as much this way as we expected. Sowmya’s M.Sc. thesis research demonstrated this and caused us to reconsider how we were going to teach techniques. We concluded that a really good collection of annotated examples (like the set in Sowmya’s research) is essential but it is not enough. People need to combine guided practice with an underlying mental model of what they are doing, so that they can adapt what they do to a new situation. We then spent years developing a “schema” to support this.

The schema itself is a reasonably straightforward list of ideas (you could treat them as a step-by-step sequence of questions to answer) but we had to write into the book a bunch of theoretical support for it, explaining the concepts and showing how each one is (a) sound and (b) applicable.

The schema has become prominent in the book but the point of it is still to help people to learn skills, not just good ideas. For that, they need guided practice.

A reviewer said that your book covers “many practical aspects and considerations for testing … that are usually skipped over in broad testing surveys or short articles.” Why did you decide to cover these particular topics and what is the gap that you felt needed to be filled here?

CK: We’re trying to help people get really good at this one technique. If there’s something you need to know in order to apply this technique well, we wanted to teach it.

We’ve also been working on books for other techniques. (One technique per book.) Imagine a book on scenario testing. It will have to look at the theory and practicalities of designing scenarios, turning them into tests, applying relevant coverage measures (answers to the questions: what tests are left and when can we stop?), evaluating and communicating the results, deciding which ones should be made reusable for regression testing, which should be automated and how, etc. There are lots of practical issues here, but they are specific to scenarios just like the stuff in The Domain Testing Workbook was significant to domain tests.

How did you decide what to cover in general? What was required to collect the information in the detailed examples?

CK: In general, we wanted “a bunch” of examples of domain testing that came from realistic situations or classic presentations (well-known books or courses). We wanted examples that would be different enough from each other to be instructive and we wanted a sequence from relatively easy to harder.

Choosing the examples took work but a lot of that work was tedious (digging through testing books and presentations).

The more interesting task for us was actually working the examples. Both Doug and I have been seen (and have seen ourselves and each other) as pretty skilled at domain testing for the last 25 years. We were confident, with each example, that we could analyze it. We were able to quickly outline the structure of our analysis and lay out key tests right away. What we discovered, though, was that we naturally applied different pieces of the schema to different examples. That is, we did some parts of the analysis explicitly, we skipped some tasks altogether (seeing them as irrelevant or unhelpful for that example) and we did several implicitly. We made the judgments without even noticing that we were doing it.

Working the examples for the book made us bring those parts of the analysis out to where we could see them. The results were better than our first drafts. Our analyses were more thorough and I think we generated better sets of tests. But it was more difficult than we had expected and it took a lot more time for many of the examples than we had imagined. Imagine looking at a problem, knowing that you can solve it, knowing the broad-brush solution to it, and then needing a week (a week!) to actually do it.

This slapped us in the self-image. There’s more depth to the technique than we realized and there are probably more people than we had initially anticipated who naturally practice it at a deeper level, and more quickly, than we do.

In terms of the research we had to do, we often discovered that we sort-of knew something but either couldn’t explain it well or couldn’t be sure we were correct and complete. We did a lot of digging through testing books, testing conference talks (we thank PNSQC and Stickyminds for their free archives) and lots of other material that was written for publication on the web.

What are the differences between teaching new testers and teaching people who have been in the field for a while?

CK: New testers don’t know anything about the field. They need a survey: they need to learn working vocabulary, they need a map of the techniques, they need a sense of the norms of the culture they’re working in.

When I used to manage testers, I preferred to give everyone professional-level tasks. For example, everyone would design tests (not just follow other people’s designs or scripts), including new/junior testers. Everyone would explore the program, everyone would make status notes and reports, etc. The core distinctions that I made between new and experienced were that the new kids needed to work in simpler areas and they needed much more frequent reviews and feedback.

In university teaching, I can do this to some degree. In theory, I have to treat every student the same way. But in practice, I can make individual coaching sessions available and emphasize to some students that they really, really need to take advantage of them. Thus, even though the same opportunity is available to everyone, some students might get five times as much personalized (or small-group) guidance from me as others. That can include reviewing work they’ve already done or working problems with them on a whiteboard.

I find it harder to achieve this kind of multiple-level teaching in commercial courses. We designed the online BBST series in a way that makes this possible, but it’s a lot more work to do it than in traditional face-to-face courses.

In general, I think that commercial courses focused on developing test design skills (rather than surveying the field) are more likely to be effective with experienced testers, who have enough practical experience that they can think of their own examples where a given technique would have helped them with something that they found difficult.

What’s your favorite topic to teach?

CK: That’s a very hard question. Looking back, I think probably contract negotiation. After that, probably mathematical statistics. After that, it seems like a 4-way tie between software metrics, human perception/cognition with applications to software design, quantitative finance, and software test design.

The last time we talked to you was in 2010. Other than writing this book, what have you been up to since then?

CK: The National Science Foundation supported a lot of my research into the teaching of software testing and the creation of high-standards online courses. We met our goals in that work and everything I’m doing now is applying what we learned rather than trying to break fresh new theoretical ground. As a result, I haven’t been applying for more research funding from NSF. Instead, Rebecca Fiedler and I formed Kaner Fiedler Associates to provide corporate support for the next generation of my research/course development.

KFA teaches the BBST courses to corporate clients (soon, we’ll add the Domain Testing course). We created the BBST-Test Design course (open source like the other BBST courses but mainly funded by us rather than by NSF) and are working on the next generation of BBST courses. We formed a publishing company, Context-Driven Press, which would allow us to offer the books we want to see in print faster, better, and at a much lower price than we can do with traditional publishers. We’re just finishing the Foundations of Software Testing workbook (with a significant revision of BBST-Foundations).

At Florida Tech, there have been several developments. First, I had what felt like a breakthrough in my teaching of software metrics. Two of us (Professor Pat Bond and I) struggled for over 12 years trying to figure out how to design a good metrics course. A couple of years ago, I think we finally created a successful structure. Last year, one of our senior faculty retired and I was able to get his applied statistics course. I’ve wanted to teach a course that applies statistical analysis to computing problems for years and years and years. It’s a work in progress: it will probably take another 3 teachings before the course design stabilizes, but I like where it’s going. As I write this (January 1, 2014), I am just about to start teaching my first Requirements Engineering course. Doing this is making me revisit and integrate my work as a human factors analyst with my work on scenario-based testing with academic work on scenario-based design with the use of qualitative data analysis software with my experience writing service contracts. This is going to be a very challenging course for me (and for my students) but I’ll learn a lot from it. I’ve also been working with Carol Oliver (a doctoral student in my lab) on high-volume test automation. I think over the next few years we’ll find ways to make some of the high-volume techniques much more usable by mid-level testers.

Are there any emerging trends in testing you find interesting or of concern?

CK: On the technical side, I still think that high-volume automated testing is the wave of the future. I’ve been saying that for so many years that I feel like a broken record, but it feels as though I’m seeing progress (not just in my lab).

On the social side, my place in the software testing community has been changing. Scientific communities are sometimes characterized as moving between two dysfunctional extremes, with stagnation at one end and fragmentation at the other. In the stagnant extreme, everyone pretends to believe in the same things (or the dissenters are unable to gain attention) and the field makes slow, incremental progress. Dissenters break out of this by forming schools that gain influence and add a lot of creative tension.

In the 1980s, my sense of the field was that it was pretty stagnant. There were apparently some big conflicts between academics and what I see as heavyweight-process testing practitioners/consultants but I never quite understood them (reading books / papers from that time gives me little insight). For someone focused on what we did to make mass-market software valuable for retail customers, the testing establishment showed no apparent interest or respect for our approach. It was hard to talk about things like exploratory testing or how testers could constructively support incremental development and it was almost impossible to publish about them in traditional places. I ran into some remarkably harsh criticism for speaking about things like this as if they were worth serious note and my experiences with standards-writers were so unpleasant for so long that I now have a permanent, counterproductive chip on my shoulder when people try to talk with me about creating a new standard for software testing.

Over time, dissenters did what dissenters do: “Agile Development” and “Context-Driven Testing” appeared and became popular. Not surprisingly, this led to a lot of discussion, a lot of creative thinking, a lot of evolution and learning on all sides and to the development of other groups. But over time, a few things naturally happen. First, many of the ideas that were controversial long ago have become mainstream. Almost everyone calls themselves “agile” these days. And many of the context-driven ideas are consistent enough with how people see their own practices that they wonder why there’s this group that sees itself as distinct and separate.

Now I think we’re drifting toward that other dysfunctional extreme, fragmentation based on polarization. I think that as the context-driven approach has gained respectability and popularity, some of the more vocal members have adopted a harsher tone. I think there’s more ethical posturing; more posturing that the other people are not just doing something different, they are incompetent; more assertion of personal authority; more rudeness and intimidation. Twitter, of course, is the echo chamber for all things vicious and has amplified this trend enormously.

I felt increasingly as though some of my ideas were out of the mainstream of my community. For example, I see more of a place for certification than some of my colleagues. And in software metrics, I’ve developed a greater respect for the impossible tradeoff between the enormous, legitimate need for metrics and our lack of progress in creating valid metrics that can meet that need. Back 25-to-15 years ago, I used to take harsh shots at the metrics advocates because their ideas were so bad. They are bad, but underlying the shots was the belief that my friends and I could do better with a reasonable amount of work. We’ve had a lot of years to deliver on that and we didn’t. At some point, it’s time to recognize that a problem is very hard, to stop badmouthing people whose record of non-success is no worse than your own record of non-success, and to mine their work as well as your own for ideas to make incremental improvements because that Magical Breakthrough might not come for a long time.

Talking about things like this, even talking about the goodwill of people in “competing” areas of the community, has become as politically incorrect with “my” crowd as exploratory testing was with the harsher voices in the crowd that was dominant back in 1983. My joy lies in the creative energy that grows out of diversity, but for that we need more tolerance, not more polarization. Extreme polarization benefits some consultants who promote it, but I don’t think it benefits the field as a whole. I think that ship will keep sailing in its direction until it finally hits its glacier, but I decided to jump off a couple years ago.

How does this apply to your readers?

There are times when it makes sense to look for mentors and ignore their political/religious/potentially-controversial views. I think we enjoyed those times, in software testing, from maybe 1990-2010. As a field polarizes, it becomes more necessary to consider the reputational impact of aligning with a group, or a leader of a group. There are potential benefits and there are potential risks. If someone thinks you are an endorser of views that are extreme, harsh and divisive, they might actively recruit you to become an advocate in their company. Or they might want to keep you and your perceived divisiveness out. If you have strong views, I think you should express them. But if you don’t, you might not want to be perceived as if you do. At this point, the guidance I give my students is that there are many paths to high skill in our field that don’t require participating in the field’s politics, and they should consider them.

Now that the new book is published, what are your plans in the coming year?

CK: We’re working on an innovative design for the Domain Testing course based on the Domain Testing Workbook. We’ll be beta testing it through the spring and summer and will hopefully start teaching a fully finished course in the summer or fall.

My university courses (Stats, Metrics, Requirements, Testing) are all challenging and are all going through new development or significant revision.

We have a few more book projects. The Foundations of Software Testing workbook is in publisher’s proofs now. The Scenario Testing Workbook is in progress, though I doubt we’ll finish it this year.

I’ve been wrestling with certification for a long time. I believe that certifications can be very valuable. I think it is appropriate to license lawyers, for example. I also think that our community has a legitimate need for credentialing and that we don’t have strong credentials in the field today. I think I finally have an outline of some ideas that might lead to a better software testing credential, and that might transcend some of the field’s political divisions. I haven’t talked with any of the field’s powerful people about these yet—I’m still polishing the proposal. But I think I’m getting close to something presentable. If it goes well, that might take a lot of my time this year and it might carry the year’s (or the decade’s) most significant impact of my work. Probably not—I try out a lot of ideas that seem good at the time and most of them don’t get very far. But some of them do. We’ll see what happens.
