URLScan

on 09.01.2008 15:50:57 by Kenny

Hello,

URLScan breaks the formatting of the IIS 5.0 logs by including a
single space character in its entry in the IIS log, for example, the
following entry:

/<Rejected-By-UrlScan> ~/

As each column in the IIS 5.0 log is delimited by the space character,
I can find no way to load the IIS log into SQL Server.

My IIS log file is rather too big to load into a text editor and
perform a find / replace, and I don't have access to sed or awk.


Is it possible to configure URLScan so that it leaves a different
message (with no whitespace) in the IIS log, such that the structure
of the log file is kept intact?

Or is it possible to configure IIS 5.0 to use Tabs to delimit the
columns in the log file?

Many thanks

Kenny
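
For readers who hit the same import problem without sed or awk: a minimal Python sketch, assuming Python is available on the machine, that streams the log line by line (so file size does not matter) and writes a tab-delimited copy. The function name, the sample field list, and the sample log lines are illustrative assumptions, not taken from a real log. It only works if no field value itself contains a space.

```python
import io


def w3c_log_to_tsv(src, dst):
    """Stream a W3C extended log from src to dst as tab-separated text.

    The first '#Fields:' directive becomes a header row; every other
    directive line (starting with '#') is dropped. Returns the number
    of data rows written.
    """
    header_written = False
    rows = 0
    for line in src:
        line = line.rstrip("\r\n")
        if not line:
            continue
        if line.startswith("#"):
            # IIS repeats the directives after a restart; keep one header only.
            if line.startswith("#Fields:") and not header_written:
                names = line[len("#Fields:"):].split()
                dst.write("\t".join(names) + "\n")
                header_written = True
            continue
        # Data lines are delimited by single spaces.
        dst.write("\t".join(line.split(" ")) + "\n")
        rows += 1
    return rows


# Hypothetical sample; the field list and values are assumptions.
sample = io.StringIO(
    "#Software: Microsoft Internet Information Services 5.0\n"
    "#Fields: date time cs-method cs-uri-stem cs-uri-query sc-status\n"
    "2008-01-09 15:50:57 GET /default.htm - 200\n"
    "2008-01-09 15:51:02 GET /<Rejected-By-UrlScan> ~/scripts/cmd.exe 404\n"
)
out = io.StringIO()
count = w3c_log_to_tsv(sample, out)
```

For a real log, the two StringIO objects would be replaced by files opened with open() for reading and writing, and the resulting file could then be bulk-loaded with a tab as the field terminator.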

Re: URLScan

on 09.01.2008 21:49:13 by David Wang



The delimiter of the log file is defined by W3C specification, so
there is no way that IIS can be configured to use Tabs to delimit
columns.

Are you sure that URLScan is inserting the white space? Where is the
~/ coming from -- it seems like you have something else modifying the
log entry.

URLScan does not insert white spaces anywhere. It does the fast path
rejection by rewriting the URL to: /<Rejected-By-UrlScan> (no spaces
nor ~/). Thus, if you see any other characters in the log for that log
field, it is not coming from URLScan.

Now, you can configure URLScan to rewrite the URL to a different value
(look inside URLSCAN.INI for the property -- it is visible and
documented), and if that still has " ~/" trailing it, then the problem
is definitely not with URLScan because it does not append what you
claim.
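
To check which rejection URL a given URLSCAN.INI sets, a small Python sketch along these lines may help. It assumes the property in question is RejectResponseUrl in the [options] section, which is my understanding of URLScan's configuration rather than something stated in this thread. The sample INI text and the helper name are hypothetical, and a real file may need more careful parsing.

```python
import configparser

# Hypothetical excerpt; the values are illustrative only.
SAMPLE_INI = """\
[options]
UseAllowVerbs=1              ; inline comment, as seen in many INI files
RejectResponseUrl=/Rejected-By-UrlScan

[AllowVerbs]
GET
HEAD
POST
"""


def rejection_url(ini_text):
    """Return the configured RejectResponseUrl, or None if unset or empty."""
    parser = configparser.ConfigParser(
        allow_no_value=True,            # sections such as [AllowVerbs] list bare keys
        delimiters=("=",),              # do not treat ':' as a key/value separator
        inline_comment_prefixes=(";",),
        interpolation=None,             # '%' has no special meaning here
        strict=False,                   # tolerate duplicate keys
    )
    parser.read_string(ini_text)
    value = parser.get("options", "RejectResponseUrl", fallback="")
    return value or None


configured = rejection_url(SAMPLE_INI)
```

An empty or missing setting returns None here, which I take to mean that URLScan falls back to its built-in default.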

Honestly, I do not see URLScan do what you claim, so I think you have
some other ISAPI Filter causing this issue.


//David
http://w3-4u.blogspot.com
http://blogs.msdn.com/David.Wang
//

Re: URLScan

on 10.01.2008 11:15:11 by Kenny

David,

Thanks for your time.

You state that "The delimiter of the log file is defined by W3C
specification". In fact, the W3C specification for field delimiters is
whitespace, not space characters: "Fields are separated by whitespace,
the use of tab characters for this purpose is encouraged" -
http://www.w3.org/TR/WD-logfile.html

So the W3C in fact encourage anyone implementing the standard to do so
with tab characters delimiting fields.

You are quite correct - URLScan was not adding the trailing "~/" -
this just appears in another data field that I was not expecting.

Many thanks for your help.

Kind regards,

Kenny


Re: URLScan

on 10.01.2008 13:04:35 by David Wang

Ok... since UrlScan is not inserting space characters when rejecting
requests (which would corrupt log field lines otherwise delimited by
space characters), what is your actual question?

I'm assuming your problem was with the space between
/<Rejected-By-UrlScan>
and
~/

And since you said that ~/ is another field, it seems that the space
between the two is perfectly justified.
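
The point that the two values are separate fields can be illustrated with a short Python sketch that pairs each value on a log line with its name from the #Fields directive. The field list, the sample line, and the function name are assumptions for illustration. The stem value shown is what I understand URLScan's default rejection URL to be.

```python
def parse_w3c_line(fields_directive, line):
    """Map each whitespace-separated value on a log line to its field name."""
    prefix = "#Fields:"
    if not fields_directive.startswith(prefix):
        raise ValueError("expected a '#Fields:' directive")
    names = fields_directive[len(prefix):].split()
    values = line.split()
    if len(names) != len(values):
        # A mismatch would indicate a stray delimiter inside some value.
        raise ValueError("expected %d fields, found %d" % (len(names), len(values)))
    return dict(zip(names, values))


# Hypothetical directive and log line.
FIELDS = "#Fields: date time cs-method cs-uri-stem cs-uri-query sc-status"
LINE = "2008-01-09 15:50:57 GET /<Rejected-By-UrlScan> ~/scripts/cmd.exe 404"

record = parse_w3c_line(FIELDS, LINE)
```

With this sample, the rejection URL lands in cs-uri-stem and the "~/..." value in cs-uri-query, so the single space between them is an ordinary field delimiter and the column count still matches the directive.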


//David
http://w3-4u.blogspot.com


Ok... then where is UrlScan inserting space characters into the log
file to confuse the log delimiter?


Re: URLScan

on 10.01.2008 13:11:51 by David Wang

And as you noted, it is not required to delimit by tabs, so there is
no option in IIS to change the log delimiter for W3C format.

And URLScan allows configuration of the logged URL, so that answers
your other question.

And I don't see where URLScan is breaking the formatting of the log
file, now that you have withdrawn your example.
If /<Rejected-By-UrlScan> is one field and ~/ is another field and
there is a single space in between them, that sounds like proper
delimiting to me.

Please clarify your question because I really don't see one.


//David
http://w3-4u.blogspot.com
http://blogs.msdn.com/David.Wang
//





