A first impression of Google's Skipfish scanner for web applications
According to a Google security blog post
by developer Michal Zalewski, Google's new, free Skipfish scanner is
designed to be fast and easy to use while incorporating the latest in
cutting-edge security logic. Felix 'FX' Lindner examines Skipfish to
see how well it compares to other tools used to check web site
integrity.
When checking the security of web applications, developers use
automated scanners to gain an overview which can then be refined by
further manual testing. Depending on the user's requirements and
expertise, this may involve minimalist basic tools such as Nikto, or
comprehensive commercial software such as Rational AppScan. The main
aim of such a scan is to reveal configuration and implementation flaws
in the interaction between web servers, application servers,
application logic and other components of modern web applications.
Typical vulnerabilities detected this way include SQL injection, cross
site scripting, cross site request forgery and code injection.
The curtain goes up
Skipfish,
released in March, supports
all the features one could wish for when doing a generic web page scan:
It can handle cookies, it can process authentication details and the
values of HTML form variables, and it can even use a single session
token to navigate the target page as an authenticated user. One of
Skipfish's specialities is to run through all the possible file and
directory names in order to detect items like script back-ups the admin
has forgotten about, compressed archives of entire web applications, or
SSH/subversion configuration files that may have accidentally been left
behind. As such items can only be tracked down by trial and error, the
scanner combines various known file extensions with all the file names
it detects by checking the normal links on the web page.
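For example, a scan of a password-protected area might be started as follows; this is a sketch using Skipfish's -A switch for HTTP authentication credentials and -C for passing a session cookie, with the host, credentials and cookie value as placeholders:
./skipfish -A admin:secret -C "session=0a1b2c3d4e" -o auth-log http://www.example.com/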
It also tries out a few hand-picked keywords (probably extracted
from Google's search index) as directory and file names. Skipfish is
especially noteworthy in this respect because it actually establishes
and checks all possible combinations. This means that every keyword
will be combined with every file extension and every actual file found
on the web server, and that the result will be tested as a file name, a
directory name, and as an argument for an HTTP POST request. This
approach generates a very large number of combinations which could
prove overwhelming. Thankfully, Skipfish provides predefined
dictionaries of varying sizes, allowing users to determine the extent
of the request flood generated. However, even the minimal starter
dictionary recommended by Zalewski includes 2,007 entries and produces
about 42,000 requests for each directory tested.
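The arithmetic behind this figure is roughly as follows (an illustrative approximation, not an official breakdown): each of the 2,007 entries is tried as a directory name, as a bare file name, and in combination with each of the roughly twenty known file extensions, so 2,007 entries × ~21 probes per entry works out at about 42,000 requests per directory.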
To gain an impression of Skipfish, we briefly investigated a few
typical scenarios: We used Microsoft Internet Information Services
(IIS) 7.5 under Windows 7 with ScrewTurn Wiki 3.0.2 for ASP.NET as our typical
interactive web application. A Linux-based Apache 2.2.3 server with
traditional CGI scripts written in Perl represented the more dated
approach. An old HP LaserJet printer with ChaiServer 3.0 was used as a
typical cause of false alarms and scanning problems.
Practical use
Skipfish is only available as source code and must be compiled for
the target platform. This requires the libidn library for encoding and
decoding Internationalised Domain Names (IDN) to be installed – for
example, under Ubuntu 9.10 this is easily done by running sudo apt-get install libidn11-dev
in a terminal window. After successfully compiling the source code,
users select a dictionary from the dictionary subdirectory and copy it
to the directory which contains the Skipfish binary as skipfish.wl.
Alternatively, the path can also be added as an option. It should be
noted that Skipfish will automatically extend this dictionary file, so
it is always advisable to use a copy.
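A minimal sketch of this preparation, assuming the minimal dictionary ships as dictionaries/minimal.wl in the source tree and that -W is the option for naming the wordlist path explicitly:
cp dictionaries/minimal.wl skipfish.wl
./skipfish -W skipfish.wl -o wiki-log http://www.example.com/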
Running ./skipfish -o example-log http://www.example.com/ starts
the server scan and places a scan report in the example-log directory
after the scan is completed. To check only the "blog" subdirectory on
the server, enter ./skipfish -I /blog -o blog-report http://www.example.com/blog.
Once started, Skipfish provides network statistics, but these are of no specific benefit to users.
Skipfish produces an enormous number of requests which it processes
simultaneously at an impressive speed. During the first run in a local
Gigabit Ethernet segment against a sufficiently scaled system, it
achieved 2,796 HTTP requests per second. This immediately pushed the
load of one of the scanning system's CPU cores to 100 per cent.
Although the Linux TCP/IP stack can handle this load, it needs to
utilise all available means to do so, as soon becomes evident when
looking at the netstat output.
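One way to look at this data while a scan is running is to count the sockets in each TCP state:
netstat -ant | awk '{print $6}' | sort | uniq -c | sort -rn
During a Skipfish run, this typically reveals thousands of connections in the TIME_WAIT state.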
Once a scan has been started there is no indication of how long
Skipfish will actually take to complete it. In our test, we had to wait
for more than four hours for the scanner to complete its check of the
neighbouring IIS 7.5 with the ScrewTurn wiki. In the process, the
scanner transferred 68GB of data to send 40 million HTTP requests. One
of the scanning system's CPU cores was working to full capacity almost
all of the time, and Skipfish used 140MB of working memory when combing
through the wiki's moderate 1,195 files in 43 directories. At the same
time, IIS used the full capacity of two Intel Xeon 2.5GHz cores as
well as a further 100MB to 500MB of working memory, producing a log
file of 1.5GB in its default configuration.
Interpretation
By way of an experiment, we disabled the dictionary function for
checking the Apache web server. However, Skipfish was still allowed to
learn new words during the scan, which produced 304 new entries. This
reduced the scanning time to 20 minutes, during which 300,000 HTTP
requests were made, generating 232MB of network traffic.
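A plausible way to reproduce this configuration, assuming that the -W switch accepts an empty wordlist while keyword learning remains enabled, is:
touch empty.wl
./skipfish -W empty.wl -o apache-log http://www.example.com/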
Skipfish generally produces a relatively large number of results and
saves them in the defined directory as HTML, JavaScript with JSON and
raw data files. Users can then view the report in a JavaScript-enabled
browser or evaluate the raw data themselves. Unfortunately, the number
of false alarms was considerably higher than that produced by tools
such as Nikto or the Burp Suite, which we used for comparison. For
instance, some regular ASCII text files were interpreted as JSON
responses without XSSI (Cross Site Script Inclusion) protection.
Skipfish attaches particular importance to well-formed MIME type and
character set responses from the server. Every deviation will cause an
"increased risk" rating, but only some of them actually have any
substance.
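For reference, a response header that declares both pieces of information explicitly, and therefore passes these checks, looks like this:
Content-Type: text/html; charset=utf-8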
Skipfish did not highlight the possibility of listing directory
contents for any of the targets we tested. The content of the
robots.txt file, which can be of particular importance for identifying
interesting server areas, is also left uncommented and presented to
the user as an "interesting file" result for individual interpretation.
The curtain falls
Four hours of scanning IIS 7.5 with the ScrewTurn wiki yielded 8
high risk results, 264 medium risk results, 55 low risk results, 123
warnings and 254 informational entries. The total data volume stored on
disk was 851MB, and the web browser should have 1GB of working
memory available to allow the results page to be viewed.
Unfortunately, the 8 most important results, all presumed to be
integer overflows in HTTP GET parameters, turned out to be false
alarms. In the next group of results, "Interesting Server responses",
HTTP 404 errors (resource not found) and HTTP 500 errors (internal
server error) are jumbled together, making it necessary to investigate
the 130 displayed results manually, one by one. Since IIS displays a
generic error message for HTTP 500 to avoid providing potential
attackers with further information, the remaining requests would have
to be individually correlated with the web server log data or retested
manually. However, a look at the server logs reveals that this effort
would be wasted, because the flaw is always the same and has no
security relevance. That false alarms are not the exception also became
apparent during a subsequent analysis of a Typo3 system, where Skipfish
warned of critical SQL injection holes; all three of these alerts
proved to be false alarms when examined manually.
It is worth mentioning that Skipfish detects the presence of an
intrusion prevention system (IPS) and lists this under "Internal
Warnings". During our tests, it detected the
HttpRequestValidationException, which is issued by ASP.NET for very
obvious SQL injection and cross-site scripting attacks. This triggered
an HTTP 500 error.
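This behaviour is easy to reproduce by hand. A request such as the following, with a deliberately suspicious query string (the URL itself is hypothetical), makes ASP.NET's request validation throw the exception and answer with HTTP 500:
curl -i "http://www.example.com/wiki/Search.aspx?q=<script>"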
Evaluation
On the whole, Skipfish's results when scanning the IIS 7.5 and
ScrewTurn combination weren't too impressive. However, it is worth
pointing out that the scanner's brute-force dictionary use detected the
/wiki/ URI by itself, which allowed the URI to be examined without any
manual intervention.
When examining the Apache server and CGI scripts, and with the
dictionary function restricted to learning only, Skipfish scored
slightly better. There were only 47 medium risk results, 6 low risk
results, 1 warning and 65 informational entries, which only required
83MB. Interestingly, Skipfish isn't capable of handling classic
Perl-generated HTML forms. As a result, it doesn't enter any learned
keywords into form fields for testing. However, the scanner correctly
detects that the forms have no protection against Cross Site Request
Forgery (CSRF) attacks.
Both the "Interesting File" category and the "Incorrect or missing
MIME type" category list a URL at which flaws in the server
configuration actually display the source code of a CGI script rather
than launch the script itself. This result was achieved by combining
file names and extensions (in this case an empty extension).
Incidentally, the printer scan didn't produce any results because we
had to abort scanning after 18 hours with no end in sight. In such
situations, it's an inconvenience that Skipfish only writes the report
about its findings to the directory once a scan has been completed, or
been aborted via Control-C. There is no way of finding out what the
tool has already detected while the scan is in progress.
Assessment
Skipfish excels at discovering forgotten data and provoking
unexpected server behaviour. The permutation of keywords and file
extensions as well as learned elements is very similar to the fuzzing
process, producing similar results in terms of data volume.
In large computer centres, such as Google's, which harbour a very
heterogeneous landscape of web applications and have access to almost
unlimited processing resources and network bandwidth, using Skipfish in
its present form is already bound to help discover unexpected functions
and content.
However, using Skipfish makes less sense if there is direct access
to a web server's file system, especially when scanning via an
internet connection. The amount of data transmitted by the scanner is
almost equivalent to a complete file system image, which makes it
preferable to analyse an application's scripts and files on site,
with well-established tools, in one's own time.
While Skipfish's report is very clear, the problems listed require considerable interpretation.
Skipfish in its current form has only very limited professional uses.
Investigating the test results, which is always required, soon becomes
a monumental task due to the flood of reports. Furthermore, results such
as "Incorrect or missing charset", "Incorrect caching directives" or
"Incorrect or missing MIME type" are very abstract and far from
unambiguous. Even experienced pen testers will often need to do further
research to find the actual cause of a problem. Consequently, Skipfish is
completely unsuitable for non-professional users or "quick check-ups"
and will probably prompt unjustified panic rather than provide
reassurance.
In many cases, using the dictionary function is prohibitively
expensive simply because of the amount of data produced and the
resulting load on the target system. A fully mature Java-based business
web application would probably not stand up to a Skipfish scan even
with the minimal dictionary, because the CPU and main memory load
generated by the scan would paralyse the system before the scan
completes. Firewalls with complex inspection modules and IDS/IPS
systems would also drown the recipients of their log messages in
extreme amounts of data, or simply stop functioning.
Source: http://www.h-online.com/security/features/Testing-Google-s-Skipfish-1001315.html