XML, the extensible markup language, has
its share of critics as well as plenty of zealous proponents. I was
long in the former group, and only grudgingly incorporated XML into Nmap after
volunteers performed most of the work. Since then, I have learned to
appreciate the power and flexibility that XML offers, and even wrote
this book in the DocBook XML format. I strongly recommend that programmers
interact with Nmap through the XML interface rather than trying to
parse the normal, interactive, or grepable output. The XML format
includes more information than the others and is extensible enough
that new features can be added without breaking existing programs that
use it. It can be parsed by standard XML parsers, which are available
for all popular programming languages, usually for free. Editors,
validators, transformation systems, and many other applications
already know how to handle the format. Normal and interactive output,
on the other hand, are custom to Nmap and subject to regular changes
as I strive for a clearer presentation to end users. Grepable output
is also Nmap-specific and tougher to extend than XML. It is
considered deprecated, and many Nmap features such as MAC address
detection are not presented in this output format.
An example of Nmap XML output is shown in Example 13.9, “An example of Nmap XML output”. Whitespace has been adjusted for
readability. In this case, XML was sent to stdout thanks to the -oX - construct. Some programs executing
Nmap opt to read the output that way, while others specify that output
be sent to a file and then read that file after Nmap completes.
Example 13.9. An example of Nmap XML output
# nmap -T4 -A -oX - -p1-1024 scanme.nmap.org
<?xml version="1.0" ?>
<!-- nmap 3.78 scan initiated Fri Dec 10 21:40:13 2004 as:
nmap -T4 -A -oX - -p1-1024 scanme.nmap.org -->
<nmaprun scanner="nmap" args="nmap -T4 -A -oX - -p1-1024 scanme.nmap.org"
start="1102743613" startstr="Fri Dec 10 21:40:13 2004"
version="3.78" xmloutputversion="1.01">
<scaninfo type="syn" protocol="tcp" numservices="1024" services="1-1024" />
<verbose level="0" /> <debugging level="0" /> <host><status state="up" />
<address addr="205.217.153.62" addrtype="ipv4" />
<hostnames><hostname name="scanme.nmap.org" type="PTR" /></hostnames>
<ports><extraports state="filtered" count="1019" />
<port protocol="tcp" portid="22"><state state="open" />
<service name="ssh" product="OpenSSH" version="3.1p1"
extrainfo="protocol 1.99" method="probed" conf="10" /> </port>
<port protocol="tcp" portid="25"><state state="open" />
<service name="smtp" product="qmail smtpd" method="probed" conf="10" />
</port>
<port protocol="tcp" portid="53"><state state="open" />
<service name="domain" product="ISC BIND" version="9.2.1" method="probed"
conf="10" /> </port>
<port protocol="tcp" portid="80"><state state="open" />
<service name="http" product="Apache httpd" version="2.0.39" conf="10"
extrainfo="(Unix) mod_perl/1.99_07-dev" method="probed" /> </port>
<port protocol="tcp" portid="113"><state state="closed" />
<service name="auth" method="table" conf="3" /> </port>
</ports>
<os>
<portused state="open" proto="tcp" portid="22" />
<portused state="closed" proto="tcp" portid="113" />
<osclass type="general purpose" vendor="Linux" osfamily="Linux"
osgen="2.4.X" accuracy="100" />
<osclass type="general purpose" vendor="Linux" osfamily="Linux"
osgen="2.5.X" accuracy="100" />
<osmatch name="Linux 2.4.0 - 2.5.20" accuracy="100" />
<osmatch name="Linux 2.4.18 - 2.4.20" accuracy="100" />
</os>
<uptime seconds="813079" lastboot="Fri Nov 12 15:35:00 2004" />
<tcpsequence index="1972182" difficulty="Good luck!"
values="E2E6D835,E32B1CB7,E3203691,E3740715,E36B40C8,E33B1621"/>
<ipidsequence class="All zeros" values="0,0,0,0,0,0" />
<tcptssequence class="100HZ"
values="4D8A8C7,4D8A8D3,4D8A8DF,4D8A8EB,4D8A8F7,4D8A903" />
</host>
<runstats>
<finished time="1102743614" timestr="Fri Dec 10 21:40:14 2004" />
<hosts up="1" down="0" total="1" />
<!-- Nmap run completed at Fri Dec 10 21:40:14 2004;
1 IP address (1 host up) scanned in 21.142 seconds -->
</runstats>
</nmaprun>
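Whether a program reads the XML from a pipe or from a saved file, the parsing code can be the same. The following is a minimal Python sketch, assuming the standard library xml.etree.ElementTree module; the function name scanner_version is hypothetical, and an in-memory stream stands in for the pipe so that the example runs without invoking Nmap.

```python
import io
import xml.etree.ElementTree as ET

def scanner_version(source):
    """Parse Nmap XML from a filename or an open file-like object and
    return the version attribute of the root nmaprun element.
    (Hypothetical helper, not part of Nmap.)"""
    root = ET.parse(source).getroot()
    return root.get("version")

# Stand-in for a pipe carrying the output of "nmap -oX - ...".
# A real program would pass the pipe object or the name of a saved file.
stream = io.StringIO(
    '<nmaprun scanner="nmap" version="3.78" xmloutputversion="1.01">'
    '</nmaprun>'
)
version = scanner_version(stream)
print(version)
```

Because ET.parse accepts either a path or a file object, the same helper serves both approaches described before the example.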
Another advantage of XML is that its verbose nature makes it
easier to read and understand than other formats. Readers familiar
with Nmap in general can likely understand most of the XML output in Example 13.9, “An example of Nmap XML output” without further documentation. The
grepable output format, on the other hand, is tough to decipher
without its own reference guide.
There are a few aspects of the example XML output which may not
be self-explanatory. For example, look at the two port elements in Example 13.10, “Nmap XML port elements”.
Example 13.10. Nmap XML port elements
<port protocol="tcp" portid="22"><state state="open" />
<service name="ssh" product="OpenSSH" version="3.1p1"
extrainfo="protocol 1.99" method="probed" conf="10" />
</port>
<port protocol="tcp" portid="113"><state state="closed" />
<service name="auth" method="table" conf="3" />
</port>
The port protocol, ID (port number), state, and service name are the
same as would be shown in the interactive output port table. The
service product, version, and extrainfo come from version detection
and are combined together into one field of the interactive output
port table. The method and conf attributes aren't present in any other output types. The method can
be table, meaning the service name was simply looked up in nmap-services based on the port
number and protocol, or it can be probed, meaning
that it was determined through the version detection system. The
conf attribute measures the confidence Nmap has
that the service name is correct. The values range from one (least
confident) to ten. Nmap only has a confidence level of three for
ports determined by table lookup, while it is highly confident (level
10) that port 22 of Example 13.10, “Nmap XML port elements” is SSH, because Nmap connected to the port and found a server
exhibiting the SSH protocol.
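To illustrate how a program might read these attributes, here is a minimal Python sketch using the standard library xml.etree.ElementTree module. The two port elements are copied from Example 13.10 and wrapped in a ports element so that the fragment has a single root. The helper name summarize_ports and the dictionary keys are hypothetical choices for this illustration.

```python
import xml.etree.ElementTree as ET

# The two port elements from Example 13.10, wrapped in <ports> so the
# fragment is a well-formed document with one root element.
PORTS_XML = """
<ports>
<port protocol="tcp" portid="22"><state state="open" />
<service name="ssh" product="OpenSSH" version="3.1p1"
extrainfo="protocol 1.99" method="probed" conf="10" />
</port>
<port protocol="tcp" portid="113"><state state="closed" />
<service name="auth" method="table" conf="3" />
</port>
</ports>
"""

def summarize_ports(xml_text):
    """Return one dict per port element (hypothetical helper)."""
    rows = []
    for port in ET.fromstring(xml_text).iter("port"):
        state = port.find("state")
        service = port.find("service")
        rows.append({
            "port": int(port.get("portid")),
            "protocol": port.get("protocol"),
            "state": state.get("state") if state is not None else None,
            "name": service.get("name") if service is not None else None,
            "method": service.get("method") if service is not None else None,
            "conf": int(service.get("conf")) if service is not None else None,
        })
    return rows

rows = summarize_ports(PORTS_XML)
for row in rows:
    print(row)
```

A program could, for instance, treat names with method "table" as guesses and trust only those with method "probed" and a high conf value.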
One other aspect that some users find confusing is that the
attributes nmaprun/start and finished/time hold timestamps given in
Unix time, the number of seconds since January 1, 1970. This is often
easier for programs to handle. For the convenience of human readers,
versions 3.78 and newer include the equivalent calendar time written
out in the attributes nmaprun/startstr and finished/timestr.
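As a small illustration of why Unix time is convenient for programs, the Python sketch below converts the start value from Example 13.9 into a calendar time. The helper name unix_to_utc is hypothetical. The result is expressed in UTC, so its clock time differs from the startstr value in the example, which I assume was written in the local time zone of the scanning machine.

```python
from datetime import datetime, timezone

def unix_to_utc(seconds):
    """Convert a Unix timestamp (string or int) to an aware UTC datetime.
    (Hypothetical helper for this illustration.)"""
    return datetime.fromtimestamp(int(seconds), tz=timezone.utc)

# The start attribute of the nmaprun element in Example 13.9.
start = unix_to_utc("1102743613")
print(start.isoformat())  # prints 2004-12-11T05:40:13+00:00
```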
Nmap includes a document type definition (DTD) which allows XML
parsers to validate Nmap XML output. While it is primarily intended
for programmatic use, it can also help humans interpret Nmap XML
output. The DTD defines the legal elements of the format, and often
enumerates the attributes and values they can take on. It is
reproduced in Appendix A, Nmap XML Output DTD.
The Nmap XML format can be used in many powerful ways, though
few users actually take any advantage of it. I believe this is due to
inexperience of many users with XML, combined with a lack of
practical, solution-oriented documentation on using the Nmap XML
format. This chapter provides several practical examples, including
the section called “Manipulating XML Output with Perl”,
the section called “Output to a Database”,
and the section called “Creating HTML Reports”.
A key advantage of XML is that you do not need to write your
own parser as you do for specialized Nmap output types such as
grepable and interactive output. Any general XML parser should do.
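As a rough sketch of what "any general XML parser" means in practice, the Python code below uses the standard library xml.etree.ElementTree module to list the open ports for each host. The sample document is an abbreviated version of Example 13.9, and the function name open_ports is hypothetical.

```python
import xml.etree.ElementTree as ET

# Abbreviated from Example 13.9; only the elements used below are kept.
SAMPLE = """<?xml version="1.0" ?>
<nmaprun scanner="nmap" version="3.78" xmloutputversion="1.01">
<host><status state="up" />
<address addr="205.217.153.62" addrtype="ipv4" />
<hostnames><hostname name="scanme.nmap.org" type="PTR" /></hostnames>
<ports><extraports state="filtered" count="1019" />
<port protocol="tcp" portid="22"><state state="open" />
<service name="ssh" method="probed" conf="10" /> </port>
<port protocol="tcp" portid="113"><state state="closed" />
<service name="auth" method="table" conf="3" /> </port>
</ports>
</host>
</nmaprun>
"""

def open_ports(xml_text):
    """Map each host address to a list of its open port numbers.
    (Hypothetical helper for this illustration.)"""
    results = {}
    for host in ET.fromstring(xml_text).iter("host"):
        address = host.find("address")
        if address is None:
            continue  # skip hosts without an address element
        ports = []
        for port in host.iter("port"):
            state = port.find("state")
            if state is not None and state.get("state") == "open":
                ports.append(int(port.get("portid")))
        results[address.get("addr")] = ports
    return results

result = open_ports(SAMPLE)
print(result)
```

No Nmap-specific parsing logic is needed here; the code only names the elements and attributes it cares about and ignores the rest.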
The XML parser that people are most familiar with is the one in
your web browser. Both IE and Mozilla/Firefox include capable parsers
that can be used to view Nmap XML data. Using them is as simple as
typing the XML filename or URL into the address bar. A document
tree-view is shown, allowing you to expand and reduce elements as
desired. It also provides syntax highlighting to quickly recognize
key elements. If you know you'll be reading Nmap output yourself, saving
normal or interactive output as well as XML is advisable, as most people find those formats the
easiest to read and interpret. But if you only have XML output
because you lost or didn't create the other forms, or because you are
debugging a program that uses the XML, then reading the XML in a web
browser is often preferable to using a text editor. Figure 13.1, “Reading XML in a web browser” shows Firefox rendering a
tree view of Nmap XML.
Similarly, spreadsheet programs, including Microsoft Excel, are
often able to import Nmap XML data directly for viewing.
A major problem with browsing Nmap XML logs directly through a
web browser or spreadsheet is that the logs are treated in a generic
way, just like any other XML file. The browser doesn't understand the
relative importance of elements, nor how to organize the data for a
more useful presentation. With the help of a stylesheet specific to
Nmap, the logs can be rendered in a much more useful fashion. This is
demonstrated in the section called “Creating HTML Reports”.