Presentation Material
AI-Generated Summary (may contain errors)
Here is a summarized version of the content:
The speaker discusses a PHP-based solution for preventing web crawlers from scraping a website. The system sets a session variable to indicate that session management is active, then checks for the session's existence on each subsequent page. If the session does not exist, or an error threshold has been exceeded, the user is redirected to a random URL.
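The check described above might look something like the following sketch. This is not the speaker's actual code; the names (`crawl_ok`, `error_count`) and the threshold value are illustrative assumptions, and the function takes the session as an array so it can be exercised outside a web server.

```php
<?php
// Hypothetical sketch of the session-based crawler check described above.
// The key names and the threshold are assumptions, not the speaker's code.

const MAX_ERRORS = 3; // assumed error threshold

// Returns a redirect URL when the request looks like a crawler, null otherwise.
function check_request(array &$session): ?string
{
    // A legitimate visitor passed through the entry page, which would have
    // set this flag (e.g. $_SESSION['crawl_ok'] = true;).
    if (!isset($session['crawl_ok'])) {
        $session['error_count'] = ($session['error_count'] ?? 0) + 1;
    }

    // Missing session flag or too many failures: redirect to a random URL
    // so the crawler wanders off into nonexistent pages.
    if (!isset($session['crawl_ok']) || ($session['error_count'] ?? 0) > MAX_ERRORS) {
        return '/page-' . bin2hex(random_bytes(4)) . '.html';
    }
    return null;
}
```

In a real deployment the entry page would set the flag in `$_SUBMIT`-free `$_SESSION` and every other page would call the check and issue a `Location:` header when it returns a URL.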
The speaker also shows reports from testing the solution against various crawlers, including WG and Paros, and claims that it stops crawlers in their tracks.
In the Q&A session, the speaker answers questions about how the solution works, including:
- How it prevents direct linking into the site
- How it limits the number of links that can be crawled before the session expires
- The importance of not crawling authenticated areas of a website to avoid data tampering problems
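The second Q&A point, capping how many links a session may fetch, could be sketched as a per-session counter like the one below. The budget value and the key name `links_seen` are assumptions for illustration only.

```php
<?php
// Hypothetical sketch of limiting the number of links per session,
// as mentioned in the Q&A. Names and limit are illustrative assumptions.

const MAX_LINKS = 100; // assumed per-session crawl budget

// Counts a page hit; returns false once the budget is exhausted,
// in which case the session is wiped (the CLI-friendly equivalent
// of calling session_destroy()).
function register_hit(array &$session): bool
{
    $session['links_seen'] = ($session['links_seen'] ?? 0) + 1;
    if ($session['links_seen'] > MAX_LINKS) {
        $session = [];
        return false; // caller should deny or redirect
    }
    return true;
}
```

A browsing human rarely exceeds such a budget in one session, while an exhaustive crawler does so quickly, which is what makes a simple counter effective here.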
The speaker also mentions having written crawlers in the past but disliking them, and encourages others to research and develop new defenses against web scraping.