Monday, October 1, 2007

12.WebSpiderReview

Author of package under review: Edward Kim

Introduction
When I first began reviewing this code, I was quite surprised by the lack of support classes and methods. The entire program is contained within a single class and has only 3 methods. However, when looking closer, it is quite concise and seems to accomplish the tasks. There are not many tests, but they provide good coverage. I think this is a case where more testing is needed to produce a stable program.


Installation
It was quite simple to run download and run the package from Edward Kim. The zip file contained everything I needed to run all of the tools. I simply opened my command line and ran all of the necessary files.

I used ant to run all of the provided xml build files. The output data from the junit tests were quite extensive, but I was able to run all of the tools without any errors. For simplicity of testing, it would be nice to display a simple pass or fail for each of the junit tests. I was also able to create the jar file using dist.build.xml without any problems.

I then ran the command
java -jar webspider-{classmate}.jar -totallinks http://www.hackystat.org 100

The results were that 3211 links were found. It seemed to run this task correctly.


Code Format and Conventions
Upon opening the source file, I immediately noticed that the example file was used and the additional features were added. While it is nice to use this as a foundation, it is important to remember to change the header comments and add detailed comments about added features.



FileLinesViolationComments
WebSpiderExample.java24,26,83EJS-13All letters of URL are capitalized.
WebSpiderExample.java24,25,26EJS-39Global variables not documented.
WebSpiderExample.java36,138,213EJS-56No pre/most conditions.
WebSpiderExample.java63,66,163EJS-9Variables names wc and resp not clear.
WebSpiderExample.java, TestWebSpider.javaN/AEJS-40,41No Summary of files/package



Test Case Review
There were few test cases provided for this application. The file TestWebSpider.java contained a few tests that ran the application with several different settings. Some of the tests took a long time to execute and produced excessive output because the program was run over a website with a link depth of 100. Cases could be tested much faster and cleaner with more, smaller tests.

Black Box Perspective
There was no testing of abnormal I/O combinations. Only one test was run that verified correct output. Test of each method should be done from this perspective to ensure proper execution for a wide range of input. For black box testing, the addition of tests done over a controlled set of web links would be very useful. I would add tests something similar to Randy Cox's test website, available at

http://www2.hawaii.edu/~randycox/11.WebSpider/test/main.htm

In addition, it would be helpful to test each method separately and compare the output. Tests should be done over the method mostPopular() with different lists of Strings being passed (including an empty list and a very long list). Likewise, tests could be done over the crawl() and getNumLinks() methods with different combinations of input. The resulting tests should have their results compared to the "expected" results.


White Box Perspective
Coverage was very well tested. Emma reported that coverage was well over 90%. While all the code was tested in some way, exceptions were not tested for, and abnormal cases were not considered. At the very least some tests where exceptions should be thrown are needed. Boundary conditions are also not tested, and in some cases unhandled exceptions will result. For example, if a null is given as input to one of the methods.

Some tests that should be added include testing passing null values to the methods. An example of this would be sending the mostPopular() method a null value instead of a list.

Break da buggah
To break this program, I simply began trying strange input. I first used the control website provided above to ensure it gave correct output. It passed these tests. I then tried simply providing not enough input. The command was

java -jar webspider-eykim.jar -totallinks

The result was

Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 2
at edu.hawaii.webspider.WebSpiderExample.main(WebSpiderExample.java:112)

There are problems with catching erroneous I/O that could be easily fixed. Additional testing could have revealed these errors.


Lessons Learned
Doing this type of critical review, I see many errors that I made myself. For example, I had quite inadequate testing in my version of this same program. In breaking this program, I see that I should write much more extensive tests for my own future program. Black box testing, specifically, tends to be under emphasized. Catching cases where strange input can break your program is very important. You never know what the user will do with your program.

Reading this code also showed me how simply and concisely this program could be done. My version had several additional classes and many complex algorithms and data structures. This version, however, was done quite cleanly with only a few short methods. It shows me that I should spend more time on program design rather then jumping strait into coding.

No comments: