Screen Scraping Questions

on 27.01.2008 04:33:01 by Nick

I need to write a VB.NET application for the network team at work that will
back up a firewall configuration each evening. From what I have read, I may
need screen scraping to do this, but I am not sure how to go about it as I
have never done it before.

The web login page is at https://firewall/auth.html and contains a form that
looks like:

action="auth.cgi" method="POST" target="authTgtFrm">

The certificate on the login page isn't trusted because it is self-signed. I
have used the following to get around it.

I call this from my console app:

ServicePointManager.ServerCertificateValidationCallback = _
    New RemoteCertificateValidationCallback( _
        AddressOf CertificateHandler.ValidateServerCertificate)

Which uses this class:

Imports System
Imports System.Net.Security
Imports System.Security.Cryptography.X509Certificates

Public Class CertificateHandler

    ' Accepts every certificate unconditionally, including the self-signed one.
    Public Shared Function ValidateServerCertificate(ByVal sender As Object, _
            ByVal certificate As X509Certificate, ByVal chain As X509Chain, _
            ByVal sslPolicyErrors As SslPolicyErrors) As Boolean
        Return True
    End Function

End Class
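
Returning True unconditionally accepts any certificate from any host, not just the firewall's. A narrower variant might accept only one known certificate by comparing its thumbprint. This is only a sketch: the class name PinnedCertificateHandler and the ExpectedThumbprint constant are hypothetical, and the placeholder value would have to be replaced with the actual SHA-1 thumbprint of the firewall's certificate.

```vbnet
Imports System
Imports System.Net.Security
Imports System.Security.Cryptography.X509Certificates

Public Class PinnedCertificateHandler

    ' Hypothetical placeholder: replace with the SHA-1 thumbprint (hex, no
    ' spaces) of the firewall's self-signed certificate.
    Private Const ExpectedThumbprint As String = "REPLACE-WITH-FIREWALL-THUMBPRINT"

    Public Shared Function ValidateServerCertificate(ByVal sender As Object, _
            ByVal certificate As X509Certificate, ByVal chain As X509Chain, _
            ByVal policyErrors As SslPolicyErrors) As Boolean
        ' Certificates that pass normal validation are accepted as usual.
        If policyErrors = SslPolicyErrors.None Then Return True

        If certificate Is Nothing Then Return False

        ' Otherwise accept only the one certificate we expect to see.
        Return String.Equals(certificate.GetCertHashString(), _
                             ExpectedThumbprint, _
                             StringComparison.OrdinalIgnoreCase)
    End Function

End Class
```

It would be registered the same way as above, with PinnedCertificateHandler.ValidateServerCertificate passed to AddressOf.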

The two input boxes in question on the auth page are named userName and pwd.
If the login is successful I get redirected to http://firewall/main.html.
From there I need to download the configuration from
http://firewall/config.bin.

Does anyone have an idea how I can go about this?

Re: Screen Scraping Questions

on 29.01.2008 23:38:49 by alex_f_il

You can also try SWExplorerAutomation from http://webius.net/ to
record and generate VB.NET automation/scraping code.
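
If you would rather hand-code it, the steps from the original post (post userName and pwd to auth.cgi, then fetch config.bin) can be sketched with HttpWebRequest and a shared CookieContainer. This is a rough sketch, not a tested solution for this firewall. It assumes the login is tracked by a cookie, that auth.cgi needs no hidden fields or JavaScript-computed values, and that the certificate callback from the original post is already registered. The credentials "admin" and "secret", the module name, and the output file name are placeholders.

```vbnet
Imports System
Imports System.IO
Imports System.Net
Imports System.Text

Module FirewallBackup

    Sub Main()
        ' One CookieContainer shared by both requests, so any session
        ' cookie set at login is sent again with the download request.
        Dim cookies As New CookieContainer()

        ' Step 1: POST the two form fields to the form's action URL.
        ' "admin" and "secret" are placeholder credentials.
        Dim body As Byte() = Encoding.ASCII.GetBytes( _
            "userName=" & Uri.EscapeDataString("admin") & _
            "&pwd=" & Uri.EscapeDataString("secret"))

        Dim login As HttpWebRequest = _
            CType(WebRequest.Create("https://firewall/auth.cgi"), HttpWebRequest)
        login.Method = "POST"
        login.ContentType = "application/x-www-form-urlencoded"
        login.ContentLength = body.Length
        login.CookieContainer = cookies

        Using requestStream As Stream = login.GetRequestStream()
            requestStream.Write(body, 0, body.Length)
        End Using

        ' Complete the login request; the response body is not needed here.
        Using loginResponse As WebResponse = login.GetResponse()
        End Using

        ' Step 2: request the configuration with the same cookies.
        Dim download As HttpWebRequest = _
            CType(WebRequest.Create("http://firewall/config.bin"), HttpWebRequest)
        download.CookieContainer = cookies

        Using configResponse As WebResponse = download.GetResponse()
            Using source As Stream = configResponse.GetResponseStream()
                ' Placeholder output path; written to the working directory.
                Using target As FileStream = File.Create("config.bin")
                    Dim chunk(4095) As Byte
                    Dim count As Integer = source.Read(chunk, 0, chunk.Length)
                    While count > 0
                        target.Write(chunk, 0, count)
                        count = source.Read(chunk, 0, chunk.Length)
                    End While
                End Using
            End Using
        End Using
    End Sub

End Module
```

If the download comes back as the login page instead of the configuration, the firewall probably expects something this sketch does not send (a hidden form field, a Referer header, or a cookie marked secure that will not travel over plain http), so it is worth comparing against what a browser actually posts.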