LogiGear Resource Center


Testing Computer Software Second Edition

Common Software Errors - Error Handling


This is the appendix from the best-selling book
Testing Computer Software, 2nd ed.

Copyright © 1988 by Cem Kaner
Copyright © 1993 by Cem Kaner, Jack Falk, Hung Quoc Nguyen

This is part 3 of 13.


ERROR HANDLING

Errors in dealing with errors are among the most common bugs. Error handling errors include failure to anticipate the possibility of errors and protect against them, failure to notice error conditions, and failure to deal with detected errors in a reasonable way. Note that error messages were discussed above.

ERROR PREVENTION

Yourdon's (1975) chapter on Antibugging is a good introduction to defensive programming. The program should defend itself against bad input and bad treatment by other parts of the system. If the program might be working with bad data, it should check them before it does something terrible.

Inadequate initial state validation

If a region of memory must start with all zeros in it, maybe the program should run a spot check rather than assuming that zeros are there.
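
A spot check of this kind might look like the following minimal sketch. The function name `spot_check_zeroed` and the sampling strategy (both ends plus a random sample) are illustrative assumptions, not something the text prescribes.

```python
import random


def spot_check_zeroed(buffer, samples=64, seed=None):
    """Return True if every sampled byte of `buffer` is zero."""
    if len(buffer) == 0:
        return True
    rng = random.Random(seed)
    # Always inspect both ends, then a random sample of positions in between.
    positions = {0, len(buffer) - 1}
    positions.update(rng.randrange(len(buffer)) for _ in range(samples))
    return all(buffer[i] == 0 for i in positions)
```

A spot check is cheap but not exhaustive: it can miss a stray nonzero byte that falls between the sampled positions.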

Inadequate tests of user input

It is not enough to tell people only to enter one- to three-digit numbers. Some will enter letters or ten-digit numbers, and others will press a key five times to see what happens. If you can enter it, the program must be able to cope with it.
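
As a rough sketch of coping with whatever is typed, the hypothetical function below accepts only one- to three-digit numbers and rejects everything else with a message the caller can show. The name `parse_small_number` and the wording of the messages are assumptions made for illustration.

```python
def parse_small_number(text):
    """Parse a one- to three-digit number, or raise ValueError."""
    cleaned = text.strip()
    if not cleaned:
        raise ValueError("nothing entered; please type a number from 0 to 999")
    # isascii() and isdigit() together reject letters, signs,
    # decimal points, and digits from other scripts.
    if not (cleaned.isascii() and cleaned.isdigit()):
        raise ValueError("%r is not a number; please use digits only" % text)
    if len(cleaned) > 3:
        raise ValueError("%r is too long; at most three digits" % text)
    return int(cleaned)
```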

Inadequate protection against corrupted data

There's no guarantee that data stored on disk are any good. Maybe someone edited the file, or there was a hardware failure. Even if the programmer is sure that the file was validated before it was saved, he should include checks (like a checksum) that this is the same file coming back.
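
One way to include such a check is sketched below, assuming a SHA-256 digest is stored in front of the data. The function names `pack` and `unpack` and the storage layout are hypothetical. Note that a plain checksum guards against accidental corruption and casual edits; someone who deliberately rewrites the file can recompute it.

```python
import hashlib

DIGEST_SIZE = hashlib.sha256().digest_size  # 32 bytes


def pack(payload):
    """Prefix the payload with its SHA-256 digest before saving."""
    return hashlib.sha256(payload).digest() + payload


def unpack(blob):
    """Return the payload, or raise ValueError if it does not match."""
    if len(blob) < DIGEST_SIZE:
        raise ValueError("file is too short to contain a checksum")
    stored, payload = blob[:DIGEST_SIZE], blob[DIGEST_SIZE:]
    if hashlib.sha256(payload).digest() != stored:
        raise ValueError("checksum mismatch: the file has changed since it was saved")
    return payload
```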

Inadequate tests of passed parameters

A subroutine should not assume that it was called correctly. It should make sure that data passed to it are within its operating range.
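
A minimal sketch of a routine that checks its own arguments follows. The routine `set_volume` and its operating range of 0 to 100 are invented for the example.

```python
import math


def set_volume(level):
    """Return the volume as a whole number, refusing out-of-range input."""
    # bool is a subclass of int in Python, so it is excluded explicitly.
    if isinstance(level, bool) or not isinstance(level, (int, float)):
        raise TypeError("level must be a number, got %r" % (level,))
    if math.isnan(level) or not 0 <= level <= 100:
        raise ValueError("level must be between 0 and 100, got %r" % (level,))
    return round(level)
```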

Inadequate protection against operating system bugs

The operating system has bugs. Application programs can trigger some of them. If the application programmer knows, for example, that the system will crash if he sends data to the printer too soon after sending it to the disk drive, he should make sure that his program can't do that under any circumstances.

Inadequate version control

If the executable code is in more than one file, someone will try to use a new version of one file with an old version of another. Customers upgrading their software make this mistake often, and they won't understand what's wrong unless the program tells them. The new version should include code that checks that all code files are up to date.
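
The start-up check could be as simple as the sketch below. It assumes each code file can report a version string; how that string is obtained, and the names `find_mismatched_files` and `check_versions`, are hypothetical.

```python
from typing import Dict, List


def find_mismatched_files(expected: str, found: Dict[str, str]) -> List[str]:
    """Return the names of files whose version differs from `expected`."""
    return sorted(name for name, version in found.items() if version != expected)


def check_versions(expected: str, found: Dict[str, str]) -> None:
    """Raise RuntimeError with a readable message if any file is out of date."""
    stale = find_mismatched_files(expected, found)
    if stale:
        raise RuntimeError(
            "These files do not match version %s: %s. Please reinstall."
            % (expected, ", ".join(stale))
        )
```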

Inadequate protection against malicious use

People will deliberately feed a program bad input or try to trigger error conditions. Some will do it out of anger, others because they think it's fun. Saying that "no reasonable person would do this" provides no defense against the unreasonable person.

ERROR DETECTION

Programs often have ample information available to detect an error in the data or in their operation. For the information to be useful, they have to read and act on it. A few commonly ignored symptoms or pieces of diagnostic information are described below. There are many others.

Ignores overflow

An overflow condition occurs when the result of a numerical calculation is too big for the program to handle. Overflows arise from adding and multiplying large numbers and from dividing by zero or by tiny fractions. Overflows are easy to detect, but the program does have to check for them, and some don't.
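
The checks themselves are short, as this sketch suggests. Python integers do not overflow, so the first function tests against an assumed 32-bit target range; Python floats silently become infinity on an oversized division, so the second function looks for that. Both function names are illustrative.

```python
import math

INT32_MIN, INT32_MAX = -2**31, 2**31 - 1


def checked_add_int32(a, b):
    """Add two integers, refusing results that will not fit in 32 bits."""
    result = a + b
    if not INT32_MIN <= result <= INT32_MAX:
        raise OverflowError("%d + %d does not fit in 32 bits" % (a, b))
    return result


def checked_divide(a, b):
    """Divide two floats, refusing results too large to represent."""
    result = a / b  # Python itself raises ZeroDivisionError when b == 0
    if math.isinf(result):
        raise OverflowError("%r / %r is too large for a float" % (a, b))
    return result
```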

Ignores impossible values

The program should check its variables to make sure that they are within reasonable limits. It should catch and reject a date like February 31. If the program does one thing when a variable is 0, something else when it is 1, and expects that all other values are "impossible," it must make sure that the variable's value is 0 or 1. Old assumptions are unsafe after a few years of maintenance programming.
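
Both examples from the paragraph above can be sketched briefly. Python's standard `datetime.date` already rejects impossible dates; the two-state variable is handled by a hypothetical `describe_switch` function that refuses anything other than 0 or 1 instead of falling through.

```python
from datetime import date


def make_date(year, month, day):
    """Build a date; datetime.date raises ValueError for impossible ones."""
    return date(year, month, day)


def describe_switch(state):
    """Handle 0 and 1 explicitly and reject every other value."""
    if state == 0:
        return "off"
    if state == 1:
        return "on"
    raise ValueError("switch state must be 0 or 1, got %r" % (state,))
```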

Ignores implausible values

Someone might withdraw $10,000,000 from their savings account, but the program should probably ask a few different humans for confirmation before letting the transaction go through.
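
A sketch of such a plausibility check follows. The threshold of 10,000, the requirement of at least two approvers, and the `withdraw` function itself are all assumptions chosen for illustration; each approver is modeled as a function that returns True or False.

```python
def withdraw(balance, amount, approvers, review_threshold=10_000):
    """Return the new balance, asking for sign-off on implausibly large sums."""
    if amount <= 0:
        raise ValueError("amount must be positive")
    if amount > balance:
        raise ValueError("insufficient funds")
    if amount >= review_threshold:
        # Possible but implausible: require at least two humans to agree.
        if len(approvers) < 2 or not all(approve(amount) for approve in approvers):
            raise PermissionError("withdrawal of %s was not confirmed" % amount)
    return balance - amount
```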

Ignores error flag

The program calls a subroutine, which fails. The subroutine reports its failure in a special variable called an error flag. The program can either check the flag or, as often happens, ignore it and treat the garbage data coming back from the routine as if it were a real result.
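
Python's `str.find` offers a small, real instance of this pattern: it returns -1 as its "not found" flag. Used unchecked as a slice position, -1 silently drops the last character instead of failing. The helper name `text_before` is hypothetical.

```python
def text_before(text, marker):
    """Return the part of `text` before `marker`, checking find()'s flag."""
    index = text.find(marker)  # -1 means "marker not found"
    if index == -1:
        raise ValueError("marker %r not found in %r" % (marker, text))
    return text[:index]


# Ignoring the flag produces plausible-looking garbage:
# "hello".find("=") is -1, so the slice below quietly loses the final "o".
garbage = "hello"[:"hello".find("=")]
```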

Ignores hardware fault or error conditions

The program should assume that devices it can connect to will fail. Many devices can send back messages (set bits) that warn that something is wrong. If one does, the program should stop trying to interact with it and should report the problem to a human or to a higher level control program.
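
The sketch below shows the general shape of checking status bits before talking to a device. The bit assignments, the `DeviceError` class, and the `send` function are invented for the example and do not describe any real device.

```python
# Hypothetical status bits; a real device documents its own layout.
PAPER_OUT = 0x01
OFFLINE = 0x02
HARDWARE_FAULT = 0x04

_FAULT_NAMES = {
    PAPER_OUT: "paper out",
    OFFLINE: "off line",
    HARDWARE_FAULT: "hardware fault",
}


class DeviceError(Exception):
    """Raised when the device reports that something is wrong."""


def decode_faults(status):
    """Return the names of all fault bits set in `status`."""
    return [name for bit, name in sorted(_FAULT_NAMES.items()) if status & bit]


def send(status, data, write):
    """Write `data` only if the device reports no faults."""
    faults = decode_faults(status)
    if faults:
        # Stop interacting and hand the problem to the caller.
        raise DeviceError("device reports: " + ", ".join(faults))
    write(data)
```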

Data comparisons

When you try to balance your checkbook, you have the number you think is your balance and the number the bank tells you is your balance. If they don't agree after you allow for service charges, recent checks, and so forth, there is an error in your records, the bank's, or both. Similar opportunities frequently arise to check two sets of data or two sets of calculations against each other. The program should take advantage of them.
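
The checkbook comparison can be written down directly. This sketch assumes your own record already includes the checks the bank has not yet processed, while the bank's figure already includes service charges you have not yet recorded; the function name `reconcile` is illustrative.

```python
from decimal import Decimal


def reconcile(my_balance, bank_balance, outstanding_checks, service_charges):
    """Return the unexplained difference between the two sets of records.

    A result of zero means the records agree once known timing
    differences are allowed for; anything else signals an error somewhere.
    """
    mine_adjusted = my_balance - sum(service_charges, Decimal("0"))
    bank_adjusted = bank_balance - sum(outstanding_checks, Decimal("0"))
    return mine_adjusted - bank_adjusted
```

Using `Decimal` rather than floating point keeps amounts such as 0.10 exact, so a nonzero result reflects a real discrepancy and not rounding noise.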

ERROR RECOVERY

There is an error, the program has detected it, and is now trying to deal with it. Much error recovery code is lightly tested, or not tested at all. Bugs in error recovery routines may be much more serious than the original problems.

Automatic error correction

Sometimes the program can not only detect an error but correct it, without having to bother anyone about it, by checking other data or a set of rules. This is desirable, but only if the "correction" is correct.

Failure to report an error

The program should report any detected internal error even if it can automatically correct the error's consequences. It might not detect the same error under slightly different circumstances. The program might report the error to the user, to the operator of a multi-user system, to an error log file on disk, or any combination of these, but it must be reported.
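
A small sketch of correcting and reporting together, using Python's standard `logging` module. The logger name and the function `normalize_percentage` are assumptions; the point is only that the correction never happens silently.

```python
import logging

logger = logging.getLogger("error_handling_demo")  # hypothetical logger name


def normalize_percentage(value):
    """Clamp `value` into 0..100, logging every correction that is made."""
    if 0 <= value <= 100:
        return value
    corrected = min(max(value, 0), 100)
    # The consequence is fixed automatically, but the error is still reported.
    logger.warning("percentage %r out of range; corrected to %r", value, corrected)
    return corrected
```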

Failure to set an error flag

A subroutine is called and fails. It is supposed to set an error flag when it does fail. It returns control to the calling routine without setting the flag. The caller will treat the garbage data passed back as if they were valid.

Where does the program go back to?

A section of code fails. It logs the problem, sets an error flag, then what? Especially if the failing code can be reached from several GOTO statements, how does it know where in the program to return control to?

Aborting errors

You stop the program, or it stops itself when it detects an error. Does it close any open output files? Does it log the cause of the exit on its way down? In the most general terms, does it tidy up before dying or does it just die and maybe leave a big mess?
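
In a language with structured cleanup, tidying up on the way down can be sketched as follows. The `with` statement closes the output file however the block exits, and the cause is logged before the error is passed on. The function `write_report` and the logger name are hypothetical.

```python
import logging

logger = logging.getLogger("report_writer")  # hypothetical logger name


def write_report(path, lines):
    """Write `lines` to `path`, closing the file even if writing fails."""
    with open(path, "w", encoding="utf-8") as out:
        try:
            for line in lines:
                out.write(line + "\n")
        except Exception:
            # Record the cause of the exit, then let the error propagate.
            logger.exception("aborting while writing %s", path)
            raise
```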

Recovery from hardware problems

The program should deal with hardware failures gracefully. If the disk or its directory is full, you should be able to put in a new one, not just lose all your data. If a device is unready for input for a long time, the program should assume that it's off line or disconnected. It shouldn't sit waiting forever.
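
Waiting with a deadline instead of waiting forever might look like this sketch. The function `wait_until_ready`, its default timeout, and the `is_ready` callback are assumptions made for illustration.

```python
import time


def wait_until_ready(is_ready, timeout_s=5.0, poll_s=0.05):
    """Poll `is_ready()` until it returns True or the deadline passes."""
    deadline = time.monotonic() + timeout_s
    while not is_ready():
        if time.monotonic() >= deadline:
            raise TimeoutError(
                "device not ready after %.1f s; it may be off line or disconnected"
                % timeout_s
            )
        time.sleep(poll_s)
```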

No escape from missing disk

Suppose your program asks you to insert a disk that has files it needs. If the inserted disk is not the correct one, it will prompt you again until the correct disk is inserted. However, if the correct disk is not available, there is no way you can escape unless you reboot your system.
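
The fix is to give the prompt loop a way out. In this sketch, `request_disk`, the `read_label` callback that reports which disk is inserted, and the choice of "C" as the cancel key are all hypothetical.

```python
def request_disk(wanted_label, read_label, prompt=input):
    """Ask for a disk until it is inserted or the user cancels.

    Returns True when the right disk is present, False on cancel.
    """
    while True:
        answer = prompt(
            "Insert disk %r and press Enter (or type C to cancel): " % wanted_label
        )
        if answer.strip().lower() == "c":
            return False  # the escape route the paragraph above asks for
        if read_label() == wanted_label:
            return True
        # Wrong disk: loop around and ask again.
```

The caller then decides what to do on cancel, for example returning to the previous menu instead of forcing a reboot.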



1 (800) 322-0333   © 2009 LogiGear Corporation. All rights reserved.   Legal Notice.   Privacy Policy.