Black and white lists, defined for a web application, provide a way to ensure that only selected parts of the web application will be scanned. You may define a black list, a white list, or both together.
When the web application has a black list only (no white list), any link that matches a black list entry will not be crawled.
When the web application has both a black list and a white list, the white list entries are treated as exceptions to the black list. A link that matches a black list entry will not be crawled unless it also matches a white list entry. All other links, including those that match only white list entries, will be crawled.
When the web application has a white list only (no black list), no links will be crawled unless they match a white list entry.
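Taken together, these rules mean the white list always wins when a link matches both lists. The following minimal sketch in Python shows the resulting crawl decision. It is an illustration only, not the scanner's implementation; it assumes simple substring matching, and the names should_crawl, black_list and white_list are hypothetical.

    def should_crawl(url, black_list=None, white_list=None):
        # Hypothetical illustration of the crawl rules above.
        # URL entries are substring-matched against each link.
        def matches(entries):
            return any(entry in url for entry in (entries or []))

        in_black = matches(black_list)
        in_white = matches(white_list)

        if black_list and white_list:
            # Both lists: white list entries are exceptions to the black list.
            return in_white or not in_black
        if black_list:
            # Black list only: skip any link that matches a black list entry.
            return not in_black
        if white_list:
            # White list only: crawl nothing unless it matches a white list entry.
            return in_white
        return True  # neither list is defined

    # Black list entry "corp" with white list exception "corp/reports":
    should_crawl("https://example.com/corp/delete", ["corp"], ["corp/reports"])      # False
    should_crawl("https://example.com/corp/reports/q1", ["corp"], ["corp/reports"])  # True
    should_crawl("https://example.com/home", ["corp"], ["corp/reports"])             # True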
Important! Automated web application scanning has the potential to cause data loss. Use the black list feature to avoid data loss. This feature prevents the web crawler from making requests for certain links in your web application.
For a production web application, it's best practice to black list pages with functionality that would have undesirable results if executed, such as sending out large numbers of emails, triggering a "delete all" action, or disabling or deleting accounts. See Web Crawling and Black List.
The black list identifies the links (URLs) in the web application that you do not want to be scanned. For each string specified, the crawler performs a string match against each link it encounters. When a match is found, the crawler does not submit a request for the link unless it also matches a white list entry.
The black list can consist of URLs and/or regular expressions.
URLs. Select the check box to enter URLs for the black list. Enter each URL on a new line. Each URL can have a maximum of 2048 characters. For example, enter corp to match all URLs containing the string "corp".
Regular Expressions. Select the check box to enter regular expressions for the black list. Enter each regular expression on a new line. Each regular expression can have a maximum of 2048 characters. For example, enter /my/path/.* to match all URLs under the /my/path/ directory.
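To illustrate how the two entry types combine, here is a minimal sketch assuming substring matching for URL entries and Python-style regular expressions for regex entries (the regex dialect the crawler actually supports may differ); the names below are hypothetical.

    import re

    url_entries = ["corp"]            # black list URL entry: any link containing "corp"
    regex_entries = [r"/my/path/.*"]  # black list regex entry: any link under /my/path/

    def is_black_listed(link):
        # A link is black listed if it matches any URL entry (substring)
        # or any regular expression entry (pattern search).
        if any(entry in link for entry in url_entries):
            return True
        return any(re.search(pattern, link) for pattern in regex_entries)

    print(is_black_listed("https://example.com/corp/admin"))  # True  (URL match)
    print(is_black_listed("https://example.com/my/path/a"))   # True  (regex match)
    print(is_black_listed("https://example.com/public"))      # False (no match)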
The white list identifies the links (URLs) in the web application that you want to be scanned. For each string specified, the crawler performs a string match against each link it encounters. When a match is found, the crawler submits a request for the link. When there is a white list only (no black list), no links will be crawled unless they match a white list entry.
The white list can consist of URLs and/or regular expressions.
URLs. Select the check box to enter the URLs for the white list. Enter each URL on a new line. Each URL can have a maximum of 2048 characters. For example, enter servers to match all URLs containing the string "servers".
Regular Expressions. Select the check box to enter regular expressions for the white list. Enter each regular expression on a new line. Each regular expression can have a maximum of 2048 characters. For example, enter /my/path/.* to match all URLs under the /my/path/ directory.
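As a final illustration, the sketch below shows how a white list regular expression can carve an exception out of a broader black list regular expression. Again, this is only an example of the rule described above, not the scanner's implementation; the /admin/ paths are hypothetical.

    import re

    black_regex = [r"/admin/.*"]          # black list: everything under /admin/
    white_regex = [r"/admin/reports/.*"]  # white list: exception for the reports area

    def crawl_allowed(link):
        in_black = any(re.search(p, link) for p in black_regex)
        in_white = any(re.search(p, link) for p in white_regex)
        # A white list match overrides a black list match.
        return in_white or not in_black

    print(crawl_allowed("https://example.com/admin/users"))       # False
    print(crawl_allowed("https://example.com/admin/reports/q1"))  # True
    print(crawl_allowed("https://example.com/home"))              # True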