Add a README.

sangaline · sangaline · commit 930a12ab7dbb · 2017-03-16T04:56:13.000-07:00
diff --git a/README.md b/README.md
@@ -0,0 +1,15 @@
+# Advanced Web Scraping Tutorial Project
+
+*This repository is a companion to the article [Advanced Web Scraping: Bypassing captcha, "403 Forbidden," and more](http://sangaline.com/post/advanced-web-scraping).
+Please refer to the article for further details.*
+
+This is a [Scrapy](https://scrapy.org/) web scraper for the fictional Zipru torrent site.
+It is designed to bypass four distinct anti-scraping mechanisms:
+
+1. User-agent filtering.
+2. Obfuscated JavaScript redirects.
+3. CAPTCHAs.
+4. Header consistency checks.
+
+The scraper is not actually functional because Zipru is not a real site.
+The code, however, is otherwise complete and can easily be adapted to work on other sites.