A list of things to try in order to scrape a webpage. I believe this dates from the very early stages of Coursicle, when I was trying to figure out how to make web requests in Java.
PASSING THE CREDENTIALS WITH THIS SIMULATOR
WebClient webClient = new WebClient();
DefaultCredentialsProvider creds = new DefaultCredentialsProvider();
// Set some example credentials
creds.addCredentials("usr", "pwd");
// And now add the provider to the webClient instance
webClient.setCredentialsProvider(creds);
webClient.getPage("");
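If the site happens to use HTTP Basic authentication (an assumption; these notes do not say which scheme the server expects), the credentials end up in an `Authorization` request header. A minimal JDK-only sketch of how that header value is built, using the same example credentials; `BasicAuthSketch` and `basicAuthHeader` are hypothetical names and not part of HtmlUnit:

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper class for illustration; not part of HtmlUnit.
class BasicAuthSketch {

    // Builds the value of an HTTP "Authorization" header for Basic auth:
    // the literal "Basic " followed by base64("username:password").
    static String basicAuthHeader(String username, String password) {
        String pair = username + ":" + password;
        String encoded = Base64.getEncoder()
                .encodeToString(pair.getBytes(StandardCharsets.UTF_8));
        return "Basic " + encoded;
    }

    public static void main(String[] args) {
        // Same example credentials as in the snippet above.
        System.out.println(basicAuthHeader("usr", "pwd"));
    }
}
```

Base64 is an encoding, not encryption, so a header like this is only reasonably safe over HTTPS.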
WHAT DOES THIS MEAN?
The ability to customize the request headers being sent to the server (from HtmlUnit).
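To make "customizing request headers" concrete, here is a rough sketch using the JDK's own `java.net.http` API (Java 11+) rather than HtmlUnit. The class name, the URL, and the header values are placeholders chosen for illustration. Nothing is sent over the network; the sketch only builds the request and inspects it.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical class for illustration; uses the JDK HTTP client, not HtmlUnit.
class CustomHeaderSketch {

    // Builds a GET request that carries two custom headers.
    static HttpRequest buildRequest(String url) {
        return HttpRequest.newBuilder()
                .uri(URI.create(url))
                // Placeholder values: a browser-like User-Agent and a language preference.
                .header("User-Agent", "Mozilla/5.0 (compatible; scraper-sketch)")
                .header("Accept-Language", "en-US")
                .GET()
                .build();
    }

    public static void main(String[] args) {
        HttpRequest request = buildRequest("http://example.com/");
        // Print every header that would be sent with the request.
        request.headers().map()
                .forEach((name, values) -> System.out.println(name + ": " + values));
    }
}
```

This matters for scraping because some servers return different content, or refuse the request, depending on headers such as `User-Agent`.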
ANY WAY THIS CAN HELP?
HtmlPage.executeJavaScript(String yourJsCode).
FORM SUBMISSION EXAMPLE
@Test
public void submittingForm() throws Exception {
    final WebClient webClient = new WebClient();

    // Get the first page
    final HtmlPage page1 = webClient.getPage("http://some_url");

    // Get the form that we are dealing with and within that form,
    // find the submit button and the field that we want to change.
    final HtmlForm form = page1.getFormByName("myform");
    final HtmlSubmitInput button = form.getInputByName("submitbutton");
    final HtmlTextInput textField = form.getInputByName("userid");

    // Change the value of the text field
    textField.setValueAttribute("root");

    // Now submit the form by clicking the button and get back the second page.
    final HtmlPage page2 = button.click();

    webClient.closeAllWindows();
}
PERL CAN'T EXECUTE JAVASCRIPT, RIGHT?
IF SO, THIS RETURNS FULL HTML:
GET 'http://yahoo.com/'
IF THERE IS A LIBRARY MAYBE IT WOULD BE EASIER WITH PERL.
LOOK AT THE BEAUTIFUL SOUP DOCS, UNLIKELY THOUGH
CAN WE CIRCUMVENT LOGIN ENTIRELY BY STORING THE COOKIES?
Session Detection
The SDK automatically detects if there is a cookie present in your registered site. If there is, it will automatically log in the current user without the need to click the Facebook login button.
IS THERE A COOKIE GENERATOR THAT TAKES THE USERNAME/PASSWORD AS INPUTS?
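On the idea of storing cookies: a session cookie is issued by the server after a real login, so it has to be captured from a login response. Once captured, it could in principle be replayed. Below is a minimal sketch with the JDK's `java.net.CookieManager`, assuming a cookie named `SESSIONID` with the value `abc123` was saved earlier. The cookie name and value, the site URL, and the class and method names are all placeholders.

```java
import java.net.CookieManager;
import java.net.HttpCookie;
import java.net.URI;
import java.util.List;

// Hypothetical class for illustration; shows reuse of a previously saved cookie.
class CookieReuseSketch {

    // Creates a CookieManager pre-loaded with one saved session cookie for the site.
    static CookieManager managerWithSavedCookie(URI site, String name, String value) {
        CookieManager manager = new CookieManager();
        HttpCookie cookie = new HttpCookie(name, value);
        cookie.setPath("/"); // make the cookie apply to every path on the site
        manager.getCookieStore().add(site, cookie);
        return manager;
    }

    public static void main(String[] args) {
        URI site = URI.create("http://example.com/");
        CookieManager manager = managerWithSavedCookie(site, "SESSIONID", "abc123");

        // List the cookies the store would offer for a request to this site.
        List<HttpCookie> cookies = manager.getCookieStore().get(site);
        for (HttpCookie c : cookies) {
            System.out.println(c.getName() + "=" + c.getValue());
        }
    }
}
```

A manager like this can be handed to the JDK HTTP client via `HttpClient.newBuilder().cookieHandler(manager)`. The approach only works while the server still considers the session valid; after it expires, a fresh login is needed.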