Spider

The Spider is a tool that automatically discovers new resources (URLs) on a particular site. It begins with a list of URLs to visit, called the seeds, which depends on how the Spider is started. The Spider visits each of these URLs, identifies all the hyperlinks in the page, and adds them to the list of URLs to visit. The process continues recursively as long as new resources are found.
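The crawl loop described above can be sketched in a few lines of Python. This is a minimal illustration, not ZAP's implementation: the names crawl, fetch and in_site are hypothetical, only 'href' attributes of A tags are followed, and the pages are served from an in-memory dictionary instead of the network.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag


class _HrefCollector(HTMLParser):
    """Collects the href attribute of every <a> tag."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def crawl(seeds, fetch, in_site):
    """Visit URLs breadth-first, starting from the seeds.

    fetch(url) returns the page body as a string, or None on failure.
    in_site(url) says whether a URL belongs to the target site.
    Both are hypothetical callbacks supplied by the caller.
    """
    queue = deque(seeds)
    visited = set()
    while queue:
        url = queue.popleft()
        if url in visited or not in_site(url):
            continue
        visited.add(url)
        body = fetch(url)
        if body is None:
            continue
        collector = _HrefCollector()
        collector.feed(body)
        for href in collector.hrefs:
            # Resolve relative links and drop the #fragment part.
            absolute, _fragment = urldefrag(urljoin(url, href))
            if absolute not in visited:
                queue.append(absolute)
    return visited


# In-memory stand-in for a web site (made-up example data).
PAGES = {
    "http://example.com/": '<a href="/a">A</a> <a href="http://other.test/">x</a>',
    "http://example.com/a": '<a href="/b#top">B</a> <a href="/">home</a>',
    "http://example.com/b": "no links here",
}

found = crawl(
    ["http://example.com/"],
    PAGES.get,
    lambda url: url.startswith("http://example.com/"),
)
```

The loop stops on its own once no unvisited URL remains in the queue, which corresponds to "as long as new resources are found" in the description.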

There are five methods of starting the Spider, differentiated by the seed list with which it starts:

Spider Site - The seed list contains all the existing URIs already found for the selected site.
Spider Subtree - The seed list contains all the existing URIs already found and present in the subtree of the selected node.
Spider URL - The seed list contains only the URI corresponding to the selected node (in the Site Tree).
Spider all in Scope - The seed list contains all the URIs the user has selected as being 'In Scope'.
Spider all in Context... - The seed list contains all the URIs the user has selected as being in the selected context.

More details can be found below, in the "Accessed via" section.
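As an illustration of how the start methods differ only in their seed list, the following sketch builds the seeds from a flat list of already-known URLs. The function name seeds_for, the mode labels and the member predicate are all hypothetical, and "subtree" is approximated here by a simple URL prefix match, which is an assumption about how the Site Tree maps to URLs.

```python
from urllib.parse import urlsplit


def seeds_for(mode, known_urls, selected=None, member=lambda url: False):
    """Return the seed list for a start mode.

    mode       -- "site", "subtree", "url", "scope" or "context" (hypothetical labels)
    known_urls -- URLs already discovered
    selected   -- URL of the selected node (site, subtree and url modes)
    member     -- predicate for scope or context membership (scope and context modes)
    """
    if mode == "url":
        return [selected]
    if mode == "site":
        host = urlsplit(selected).netloc
        return [u for u in known_urls if urlsplit(u).netloc == host]
    if mode == "subtree":
        prefix = selected.rstrip("/") + "/"
        return [u for u in known_urls if u == selected or u.startswith(prefix)]
    if mode in ("scope", "context"):
        return [u for u in known_urls if member(u)]
    raise ValueError(f"unknown mode: {mode}")


# Made-up example data.
KNOWN = [
    "http://example.com/",
    "http://example.com/shop/",
    "http://example.com/shop/cart",
    "http://example.com/shopping",
    "http://other.test/",
]
subtree_seeds = seeds_for("subtree", KNOWN, "http://example.com/shop/")
```

With this data, subtree_seeds holds the selected node and the cart page beneath it, while "http://example.com/shopping" is left out because it only shares a textual prefix with the node.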

While processing a URL, the Spider makes a request to fetch the resource and then parses the response, identifying hyperlinks. It currently behaves as follows for each type of response:

HTML

The Spider processes the following tags to identify links to new resources:

Base - the 'href' attribute is used as the base when resolving relative links
A, Link, Area - 'href' attribute
Frame, IFrame, Script, Img - 'src' attribute
Meta - 'http-equiv' for 'location' and 'refresh'
Form - Forms with both GET and POST methods are handled. Field values are generated so that they are valid, including for HTML5 input types.
Comments - Valid tags found in comments are also analyzed, if specified in the Spider Options screen
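The tag and attribute pairs listed above can be illustrated with Python's standard html.parser module. This is a sketch under stated assumptions, not the Spider's parser: the class name LinkExtractor is hypothetical, Form and Comments handling are left out, and the 'refresh' value is assumed to have the usual "seconds; url=target" shape.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urljoin

# Tag -> attribute that holds the link, following the list above.
LINK_ATTRS = {
    "a": "href", "link": "href", "area": "href",
    "frame": "src", "iframe": "src", "script": "src", "img": "src",
}

# Pulls the target out of a refresh value such as "5; url=/next".
META_URL = re.compile(r"url\s*=\s*['\"]?([^'\";\s]+)", re.IGNORECASE)


class LinkExtractor(HTMLParser):
    """Collects absolute link targets from one HTML page."""

    def __init__(self, page_url):
        super().__init__()
        self.base = page_url  # replaced if a <base href> is seen
        self.links = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "base":
            if attributes.get("href"):
                self.base = urljoin(self.base, attributes["href"])
        elif tag in LINK_ATTRS:
            value = attributes.get(LINK_ATTRS[tag])
            if value:
                self.links.append(urljoin(self.base, value))
        elif tag == "meta":
            kind = (attributes.get("http-equiv") or "").lower()
            content = (attributes.get("content") or "").strip()
            if kind == "refresh":
                match = META_URL.search(content)
                if match:
                    self.links.append(urljoin(self.base, match.group(1)))
            elif kind == "location" and content:
                self.links.append(urljoin(self.base, content))


# Made-up example page.
SAMPLE = (
    '<html><head><base href="http://example.com/app/">'
    '<meta http-equiv="Refresh" content="0; url=next.html">'
    '<link href="style.css"></head>'
    '<body><a href="page?id=1">x</a><img src="/logo.png">'
    '<script src="js/app.js"></script></body></html>'
)
extractor = LinkExtractor("http://example.com/index.html")
extractor.feed(SAMPLE)
```

After feeding SAMPLE, extractor.links contains the five targets in document order, all resolved against the Base tag rather than against the page URL.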

Robots.txt file

If set in the Spider Options screen, the Spider also analyzes the 'Robots.txt' file and tries to identify new resources using the rules it contains. Note that the Spider only uses the file to find URLs: it does not obey the rules specified in the 'Robots.txt' file.
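The idea of using the rules as a source of URLs instead of as restrictions can be sketched as follows. The function name urls_from_robots is hypothetical, and the wildcard handling (cutting the path at the first '*') is an assumption made for the example, since this page does not say how such rules are treated.

```python
from urllib.parse import urljoin


def urls_from_robots(robots_txt, site_root):
    """Turn Allow and Disallow paths into candidate URLs, in file order."""
    found = []
    for raw_line in robots_txt.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments
        field, separator, value = line.partition(":")
        if not separator:
            continue
        if field.strip().lower() not in ("allow", "disallow"):
            continue
        # Assumption: keep only the part before the first wildcard.
        path = value.strip().split("*", 1)[0].rstrip("$")
        if not path:
            continue  # an empty "Disallow:" names no resource
        url = urljoin(site_root, path)
        if url not in found:
            found.append(url)
    return found


# Made-up example file.
ROBOTS = """User-agent: *
Disallow: /admin/ # private area
Allow: /public/index.html
Disallow:
Disallow: /tmp/*.bak
Sitemap: http://example.com/sitemap.xml
"""
robots_urls = urls_from_robots(ROBOTS, "http://example.com/")
```

Both Allow and Disallow entries end up as seeds here, which mirrors the point that the rules are read but not followed.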

OData Atom format

OData content using the Atom format is currently supported. All included links (relative or absolute) are processed.
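As a rough illustration of resolving relative and absolute links in an Atom document, the sketch below uses Python's xml.etree.ElementTree. The function name atom_links is hypothetical, and honouring only an xml:base attribute on the root element is a simplification chosen for the example, not a statement about how the Spider parses OData.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

ATOM_NS = "{http://www.w3.org/2005/Atom}"
XML_BASE = "{http://www.w3.org/XML/1998/namespace}base"


def atom_links(feed_xml, document_url):
    """Return the absolute targets of all Atom <link> elements."""
    root = ET.fromstring(feed_xml)
    # Simplification: only an xml:base on the root element is honoured.
    base = urljoin(document_url, root.get(XML_BASE, ""))
    return [
        urljoin(base, link.get("href"))
        for link in root.iter(ATOM_NS + "link")
        if link.get("href")
    ]


# Made-up example feed.
FEED = (
    '<feed xmlns="http://www.w3.org/2005/Atom" '
    'xml:base="http://example.com/odata/">'
    '<link rel="self" href="Products"/>'
    '<entry><link rel="edit" href="Products(1)"/>'
    '<link rel="related" href="http://example.com/odata/Categories(2)"/>'
    '</entry></feed>'
)
odata_urls = atom_links(FEED, "http://example.com/odata/Products")
```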

Non-HTML Text Response

Text responses are parsed by scanning for the URL pattern.
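Scanning plain text for URLs can be sketched with a regular expression. The pattern below is an assumption made for the example (absolute http and https URLs only, with trailing punctuation trimmed); the exact pattern the Spider uses is not given on this page.

```python
import re

# Assumed pattern: an absolute http(s) URL up to whitespace, a quote or an angle bracket.
URL_PATTERN = re.compile(r"https?://[^\s\"'<>]+", re.IGNORECASE)


def urls_in_text(text):
    """Return the distinct URLs found in text, in order of first appearance."""
    seen = []
    for match in URL_PATTERN.findall(text):
        url = match.rstrip(".,;:!?)")  # trim sentence punctuation
        if url not in seen:
            seen.append(url)
    return seen


# Made-up example text.
TEXT = "See http://example.com/a, then https://example.com/b?x=1. Again: http://example.com/a"
text_urls = urls_in_text(TEXT)
```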

Non-Text response

Currently, the Spider does not process this type of resource.

Other aspects

When checking whether a URL has already been visited, the way parameters are handled can be configured on the Spider Options screen.
When checking whether a URL has already been visited, a few common parameters are ignored: jsessionid, phpsessid, aspsessionid, utm_*
The Spider's handling of cookies is controlled by the Edit -> Enable Session Tracking option. If that option is enabled, the Spider handles any cookies received from the server and sends them back accordingly. If the option is disabled, the Spider does not send any cookies in its requests.
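The "already visited" check with ignored parameters can be illustrated by reducing each URL to a comparison key. This is a sketch, not the Spider's own canonicalization: the function name visit_key is hypothetical, and lower-casing the host, sorting the remaining parameters and stripping a ';jsessionid=' path parameter are assumptions made for the example.

```python
import re
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

IGNORED_NAMES = {"jsessionid", "phpsessid", "aspsessionid"}


def _is_ignored(name):
    name = name.lower()
    return name in IGNORED_NAMES or name.startswith("utm_")


def visit_key(url):
    """Reduce a URL to the key used to decide whether it was visited."""
    parts = urlsplit(url)
    # Assumption: jsessionid may also appear as a ";jsessionid=..." path parameter.
    path = re.sub(r";jsessionid=[^/?#;]*", "", parts.path, flags=re.IGNORECASE)
    query = [
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if not _is_ignored(name)
    ]
    query.sort()  # assumption: parameter order does not matter
    return urlunsplit((parts.scheme, parts.netloc.lower(), path, urlencode(query), ""))


key_a = visit_key("http://Example.com/shop;jsessionid=ABC123?item=5&utm_source=mail&PHPSESSID=xyz")
key_b = visit_key("http://example.com/shop?item=5")
key_c = visit_key("http://example.com/shop?item=6")
```

Here key_a and key_b are equal, so the second URL would count as already visited, while key_c differs because 'item' is not an ignored parameter.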

The Spider is configured using the Spider Options screen.

See also

UI Overview - for an overview of the user interface
Features - provided by ZAP
Spider Options screen - for an overview of the Spider Options
