Back in 2014, I launched a site called estipaper.com and was trying to figure out how to get some early traffic. I noticed that someone had posted it to a website where people rank pointless sites (yes, I wanted to reach the top of that list, and I understand the irony). The higher the rank, the more traffic. So I started a long game of cat and mouse with the site's operator, vote-spamming my own site to climb the rankings. Eventually they removed Estipaper from the site entirely, but it was a lot of fun, and it brought in tens of thousands of visits in the meantime. The cat-and-mouse game lasted about 2-3 months.


#!/usr/bin/python
#estimation: estipaper is 450 above cookie clicker on 12/20/14. Clicker caught up on 1/2/15. 
#estimation: estipaper is 495 above cookie clicker on 1/2/15. Clicker caught up on 1/10/15. 
#estimation: estipaper is 1229 above cookie clicker on 1/10/15. Went down to position 11 on 1/12/15. It seems as though they have frozen the vote count for estipaper. 

#appears as though they've finally implemented a check for the actual source address. 
#looking into proxies now: scraping from hidemyass.com's free ones. 
#Low: X-Forwarded-For is my IP; HTTP_VIA exists, and sometimes a ProxyID. 
#Medium: saw several variations:
#  - IP was sometimes not even the proxy's, but seemed to route through another proxy, with X-Forwarded-For set to the first proxy's IP.
#  - Unknown X-Forwarded-For, but HTTP_VIA was shown.
#  - Routed through two proxies? The proxy's IP wasn't visible at all.
#  - Proxy's IP, but X-Forwarded-For was a random IP.
#  - Another proxy whose X-Forwarded-For was similar to the proxy's own IP.
#  - Proxy's IP, but X-Forwarded-For was loopback (127.0.0.1).
#High: saw:
#  - Proxy's IP and just HTTP_VIA (once).
#  - Just HTTP_VIA (five times).
#  - No headers at all (three times).
#https: these were all high-anonymity proxies, and they can still carry plain http. They tested as either Elite or Anonymous. 
#Found one proxy listed as high/https that was actually transparent. There probably aren't many of those, though. 
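
#Side note: a minimal sketch (not used by the script below) of how to classify a
#proxy's anonymity level along the lines above: request a header-echo service
#through the proxy and see which proxy headers survive. httpbin.org/headers is
#an assumption here; any endpoint that echoes your request headers back would do.
import json
import urllib2

def classifyProxyAnonymity(proxyIP, proxyPort):
	opener = urllib2.build_opener(urllib2.ProxyHandler({'http': 'http://{0}:{1}'.format(proxyIP, proxyPort)}))
	echoedHeaders = json.load(opener.open('http://httpbin.org/headers', timeout=30))['headers']
	if 'X-Forwarded-For' in echoedHeaders:
		# low anonymity if this leaks my real IP, medium if it only shows proxy IPs
		return 'low/medium: X-Forwarded-For = ' + echoedHeaders['X-Forwarded-For']
	if 'Via' in echoedHeaders:
		return 'medium: Via present, but no X-Forwarded-For'
	return 'high: no proxy headers at all'
#example: print classifyProxyAnonymity("58.246.199.122", "3128")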

#using hide-my-ass scraper: ./hide_my_python.py -o test -pr {http,https}

#test uniqueness of browser (should work with curl): https://panopticlick.eff.org

#If DNS is a concern, can specify DNS server with --dns-servers
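#  e.g. (syntax sketch): curl --dns-servers 8.8.8.8,8.8.4.4 http://example.com
#  (note: --dns-servers only works if curl was built with c-ares)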

#Looks like I can send http over https proxies. 

import os
import sys
import time
import random
import traceback

arrayOfProxies = [("58.246.199.122","3128"),("92.255.242.163","8800")] # remaining proxies redacted
arrayOfUserAgents = [
	'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.99 Safari/537.36',
	'Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2228.0 Safari/537.36',
	'Mozilla/5.0 (Windows NT 6.3; rv:36.0) Gecko/20100101 Firefox/36.0',
	'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10; rv:33.0) Gecko/20100101 Firefox/33.0',
	'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_3) AppleWebKit/537.75.14 (KHTML, like Gecko) Version/7.0.3 Safari/7046A194A',
	'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2227.0 Safari/537.36'
]
numberOfProxies = len(arrayOfProxies)
numberOfUserAgents = len(arrayOfUserAgents)
runID = sys.argv[1]            # suffix for this instance's cookie/output files
targetSiteSearch = sys.argv[2] # search term that pulls up the target site's listing
targetSiteID = sys.argv[3]     # the site's ID as it appears in the vbut(...) handler
timeToWait = sys.argv[4]       # seconds to sleep between votes
runNumber = 0

while True:
	try: 
		runNumber = runNumber + 1
		os.system("rm cookies{0}".format(runID))
		currentProxy = arrayOfProxies[random.randint(0,numberOfProxies-1)]
		currentUserAgent = arrayOfUserAgents[random.randint(0,numberOfUserAgents-1)]
		print "Using user agent: "+currentUserAgent
		print "Using the following proxy: "+currentProxy[0]+":"+currentProxy[1]+" out of "+str(numberOfProxies)
		os.system("curl --max-time 90 --connect-timeout 150 --proxy http://{0}:{1} -b cookies{2} -c cookies{2} http://www.pointlesssites.com/site-search.asp?t={3} -H 'Accept-Encoding: gzip, deflate, sdch' -H 'Accept-Language: en-US,en;q=0.8' -H 'User-Agent: {4}' -H 'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8' -H 'Cache-Control: max-age=0' -H 'Connection: keep-alive' --compressed > homepage{2}".format(currentProxy[0],currentProxy[1],runID,targetSiteSearch,currentUserAgent))
		with open("homepage{0}".format(runID)) as homepage:
			homepageAsString = homepage.read()
		# locate our site's vote button; its onclick handler looks like
		# vbut("<siteID>", n1, n2) and carries the values the vote request needs
		secondHalfOfHomepage = homepageAsString.split('vbut("{0}"'.format(targetSiteID))[1]
		nValuesAsString = secondHalfOfHomepage.split(')\'>')[0]
		n1 = nValuesAsString.split(',')[1]
		n2 = nValuesAsString.split(',')[2]
		# NOTE: the original vote request was eaten by the blog's HTML rendering
		# (everything between '<' and '>' vanished), so the endpoint and parameter
		# names below are reconstructed guesses; only the argument order from the
		# surviving .format(...) call is original.
		os.system("curl --max-time 90 --connect-timeout 150 --proxy http://{0}:{1} -b cookies{2} -c cookies{2} 'http://www.pointlesssites.com/vote.asp?id={3}&n1={4}&n2={5}' -H 'User-Agent: {6}' --compressed > response{2}".format(currentProxy[0], currentProxy[1], runID, targetSiteID, n1, n2, currentUserAgent))
		with open("response{0}".format(runID)) as response:
			responseAsString = response.read()
		print "Run: #{0}".format(runNumber) + ". Response is: " + responseAsString
		print "Avoiding rate limit and suspicion, waiting {0} seconds...".format(timeToWait)
		time.sleep(int(timeToWait))
	except Exception as e:
		traceback.print_exc()
		errorTimeToWait = random.randint(350,400)
		print "Hit exception, waiting a random {0} seconds...".format(errorTimeToWait)
		time.sleep(errorTimeToWait)
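
For reference, the script takes four positional arguments: a run ID that suffixes the cookie and output files (so multiple instances can run side by side), the search term that pulls up the target site's listing, the site's ID from the vbut() handler, and the number of seconds to wait between votes. A hypothetical invocation (the script name, site ID, and delay here are made up, not the originals):

python vote_spam.py 1 estipaper 1234 120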