BYTE.com > Tangled in the Threads > 2001 > February
Snooping On Website APIs
By Jon Udell
February 8, 2001
(Website API Discovery: Page 3 of 3)
Let's return to the question of website reverse engineering.
In principle, as I've said, it's easy because the Web is pretty much an open book. But in practice, it's tedious to work out the sequences of requests and responses that define a website's API. GET requests that involve no header manipulation are a no-brainer, but POST requests that send complex form data, along with an HTTP authentication header (name/password), and maybe also a cookie header (with session state information), take a bit more doing.
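To make the "bit more doing" concrete, here is a minimal sketch of such a POST, written in Python's standard library rather than the Perl discussed below. The URL, form fields, credentials, and cookie value are all hypothetical. The request is only built, not sent, so you can inspect exactly which headers a script would have to reproduce.

```python
import base64
import urllib.parse
import urllib.request


def build_login_post(url, form_fields, username, password, session_cookie):
    """Build (but do not send) a POST with form data, Basic auth, and a cookie."""
    # Form fields travel URL-encoded in the request body.
    body = urllib.parse.urlencode(form_fields).encode("ascii")
    # HTTP Basic authentication is base64("name:password") in an Authorization header.
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    request = urllib.request.Request(url, data=body, method="POST")
    request.add_header("Authorization", "Basic " + token)
    # Session state rides along in a Cookie header.
    request.add_header("Cookie", session_cookie)
    request.add_header("Content-Type", "application/x-www-form-urlencoded")
    return request


# Hypothetical values, for illustration only.
req = build_login_post(
    "http://example.com/orders",
    {"item": "widget", "qty": "2"},
    "alice",
    "secret",
    "SESSIONID=abc123",
)
print(req.get_method(), req.full_url)
print(req.header_items())
```

Passing `req` to `urllib.request.urlopen` would actually send it. The tedious part in practice is discovering which fields, credentials, and cookies a given site expects at each step.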
You can, of course, issue requests from an HTTP-aware scripting language, for example Perl with its LWP module, and use the splendid facilities of Perl to analyze and programmatically respond to pages that you fetch from a site. Doing this on a secure site is straightforward too, thanks to Perl's Crypt::SSLeay module, which lets LWP work with encrypted https-style pages as well as normal http pages. (Alternatively, if you can't or don't want to add SSL capability to your installation of Perl, you can put stunnel between Perl and the encrypted website.) But even for Perl hackers, it's a bit tedious to use Perl to both explore and automate website APIs.
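As an illustration of what "analyze and programmatically respond" can mean, the sketch below pulls the forms out of a page so a script knows which fields to send back. It uses Python's standard `html.parser` as a stand-in for the Perl tooling described above. The `FormFinder` class and the sample page are hypothetical, and the parser is deliberately naive: it attributes every named input to the most recently opened form.

```python
from html.parser import HTMLParser


class FormFinder(HTMLParser):
    """Collect each form's method, action, and named input fields."""

    def __init__(self):
        super().__init__()
        self.forms = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form":
            self.forms.append({
                "method": (attrs.get("method") or "get").upper(),
                "action": attrs.get("action") or "",
                "fields": {},
            })
        elif tag == "input" and self.forms and attrs.get("name"):
            # Remember the default value, if any, for each named input.
            self.forms[-1]["fields"][attrs["name"]] = attrs.get("value") or ""


# A hypothetical page, standing in for one fetched from a real site.
SAMPLE_PAGE = """
<html><body>
<form method="post" action="/login">
  <input type="text" name="user">
  <input type="password" name="pass">
  <input type="hidden" name="token" value="xyz">
</form>
</body></html>
"""

finder = FormFinder()
finder.feed(SAMPLE_PAGE)
for form in finder.forms:
    print(form["method"], form["action"], sorted(form["fields"]))
```

Hidden fields such as the `token` input here are exactly the kind of detail that is easy to miss when working out a site's API by hand.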
And suppose you'd like an ordinary civilian to do it. Why? One of the reasons you play this game is to develop software that drives a site through a complex series of interactions, in order to stress-test the site, or to develop a baseline profile that can then be used for regression testing -- that is, to ensure that the site continues to behave in the expected ways over time. It would be great if you didn't need a Perl hacker to gather this information, but could instead hand the job to an ordinary civilian with a browser, who may well know more about the application domain.
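One way to picture a baseline profile, as a rough sketch only: record each step of a session as method, URL, status, and a digest of the body, then compare a later run against it. The function names and the sample session below are hypothetical. Hashing whole bodies is a simplification, since pages with timestamps or other dynamic content would need a looser comparison.

```python
import hashlib


def profile(interactions):
    """Reduce (method, url, status, body) tuples to a comparable baseline."""
    return [
        (method, url, status, hashlib.sha256(body).hexdigest())
        for method, url, status, body in interactions
    ]


def regressions(baseline, current):
    """Return the steps whose method, URL, status, or body digest changed."""
    diffs = []
    for step, (old, new) in enumerate(zip(baseline, current)):
        if old != new:
            diffs.append((step, old[:3], new[:3]))
    if len(baseline) != len(current):
        diffs.append(("length", len(baseline), len(current)))
    return diffs


# Hypothetical recorded sessions, for illustration only.
baseline = profile([
    ("GET", "/login", 200, b"<form>...</form>"),
    ("POST", "/login", 302, b""),
])
later_run = profile([
    ("GET", "/login", 200, b"<form>...</form>"),
    ("POST", "/login", 500, b"Internal error"),
])
print(regressions(baseline, later_run))
```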
In this case, a personal Web proxy can be a great tool. I first touched on this subject in a column last year. A personal Web proxy is a lightweight proxy server that sees and can act on traffic between your browser and the websites it connects to.
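To show the idea rather than any particular product, here is a minimal sketch of such a proxy built on Python's standard library. The names `SnoopingProxy`, `NoRedirect`, `run`, and the default port are all hypothetical. It handles only plain http GET and POST requests: https traffic, which browsers tunnel through a proxy with CONNECT, is out of scope here. It relays redirects to the browser instead of following them, so each hop shows up in the log.

```python
import http.server
import urllib.error
import urllib.request


class NoRedirect(urllib.request.HTTPRedirectHandler):
    """Surface 3xx replies instead of following them, so the browser sees them."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None


# An empty ProxyHandler ignores system proxy settings, so the proxy never calls itself.
OPENER = urllib.request.build_opener(urllib.request.ProxyHandler({}), NoRedirect())

# Headers that are recomputed or connection-specific rather than relayed verbatim.
SKIP = {"connection", "proxy-connection", "keep-alive", "transfer-encoding",
        "content-length", "host"}


class SnoopingProxy(http.server.BaseHTTPRequestHandler):
    log = []  # shared list of (method, url, status) tuples

    def _forward(self):
        # A browser configured to use a proxy sends the absolute URL in the request line.
        if not self.path.startswith("http://"):
            self.send_error(400, "Expected an absolute http:// URL")
            return
        length = int(self.headers.get("Content-Length") or 0)
        body = self.rfile.read(length) if length else None
        headers = {k: v for k, v in self.headers.items() if k.lower() not in SKIP}
        request = urllib.request.Request(
            self.path, data=body, headers=headers, method=self.command)
        try:
            response = OPENER.open(request, timeout=30)
            status = response.status
        except urllib.error.HTTPError as err:
            # 3xx, 4xx, and 5xx replies are still replies; relay them too.
            response, status = err, err.code
        except OSError:
            self.send_error(502, "Upstream request failed")
            return
        payload = response.read()
        self.log.append((self.command, self.path, status))
        # send_response_only avoids adding this proxy's own Server and Date headers.
        self.send_response_only(status)
        for key, value in response.headers.items():
            if key.lower() not in SKIP:
                self.send_header(key, value)
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)

    do_GET = do_POST = _forward


def run(port=8080):
    """Serve on localhost; point the browser's HTTP proxy setting at this port."""
    http.server.HTTPServer(("127.0.0.1", port), SnoopingProxy).serve_forever()
```

Calling `run()` and setting the browser's HTTP proxy to 127.0.0.1 on the chosen port would fill `SnoopingProxy.log` as you click through a site. Extending `_forward` to record request bodies and cookies would capture the form data and session state as well.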