regExp Experts.....
on 17.01.2007 23:07:06 by Russell
Hey,
I'm struggling to get the concepts of the RegExp function down.
What I'm trying to achieve is to remove all whitespace from HTML-formatted
source code.
I have the following RegExp search pattern, which removes all the HTML markup,
but that is not what I require:
RegExp.Pattern = "\<.*?\>"
I want to store all the HTML source code and its text/image contents in a DB.
For that reason I want to remove all line breaks and spaces/tabs within the
source code, so that I end up with one extremely long single line while
leaving the formatting of the contents of the HTML code alone, so that I can
stuff it into a varchar(MAX) field.
Thanks!
R
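As a rough illustration of the difference between stripping tags and
collapsing whitespace, here is a minimal sketch in Python rather than the
VBScript RegExp object used in the post. The sample markup and variable names
are made up for illustration. The pattern strings should carry over to
RegExp.Pattern, assuming RegExp.Global is set to True so that every match is
replaced.

```python
import re

# Hypothetical sample input: indented, multi-line HTML.
html = "<html>\n  <body>\n\t<p>Hello   world</p>\n  </body>\n</html>\n"

# The pattern from the post matches each tag, so replacing the matches
# with nothing strips the markup and leaves only text and whitespace.
tag_pattern = re.compile(r"<.*?>")
text_only = tag_pattern.sub("", html)

# Matching runs of whitespace instead keeps the markup. Each run of
# spaces, tabs, and line breaks is collapsed into a single space.
ws_pattern = re.compile(r"\s+")
one_line = ws_pattern.sub(" ", html).strip()

print(repr(text_only))
print(one_line)
```

Collapsing each run to a single space, rather than deleting it outright, keeps
words that were separated only by a line break from running together.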
Re: regExp Experts.....
on 17.01.2007 23:30:36 by Anthony Jones
Why is removing all this whitespace important?
Are you sure that all the whitespace is insignificant? There are times when
certain markup fails to render quite right once whitespace that is typically
present is removed.
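Two cases where whitespace in HTML does matter can be sketched as follows.
This is an illustrative Python example with made-up markup, not something
from the thread: browsers preserve whitespace inside <pre>, and whitespace
between inline elements such as <span> renders as a space.

```python
import re

# Hypothetical markup: a <pre> block with aligned lines, and two inline
# <span> elements separated only by a line break.
html = "<p>Total:</p>\n<pre>x  = 1\ny  = 22</pre>\n<span>a</span>\n<span>b</span>"

# Collapsing whitespace runs flattens the two lines inside <pre> into one.
collapsed = re.sub(r"\s+", " ", html)

# Deleting whitespace entirely also glues the spans together, so the page
# would show "ab" where it used to show "a b".
removed = re.sub(r"\s+", "", html)

print(collapsed)
print(removed)
```

A whitespace pattern applied to the whole document cannot tell these cases
apart from insignificant indentation.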
Re: regExp Experts.....
on 17.01.2007 23:38:33 by Russell
I guess removing it is not that important, and I am aware that whitespace
matters in XML, but standard HTML is nothing more than a formatting language.
Would you happen to know the RegExp.Pattern string I should use?
Re: regExp Experts.....
on 18.01.2007 10:05:19 by Anthony Jones
"Russell" wrote in message
news:eFlj6goOHHA.4604@TK2MSFTNGP06.phx.gbl...
> I guess removing it is not that important, and I am aware that whitespace
> matters in XML, but standard HTML is nothing more than a formatting language.
>
> Would you happen to know the RegExp.Pattern string I should use?
>
None that you can be 100% sure won't cause a problem. However, you can be
sure that storing the HTML with its existing whitespace is 100% problem
free. What benefit do you get from this procedure that is worth the risk it
introduces into the system?
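To make the point concrete, here is a minimal sketch of storing markup
unmodified. It assumes Python's built-in sqlite3 module as a stand-in for the
poster's database, and the table name pages and column name body are
hypothetical. A parameterized insert passes the string through as-is, so line
breaks and tabs need no special treatment.

```python
import sqlite3

# Hypothetical sample: multi-line HTML with tabs and a <pre> block.
html = "<html>\n  <body>\n\t<pre>line 1\nline 2</pre>\n  </body>\n</html>\n"

# In-memory SQLite database used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pages (id INTEGER PRIMARY KEY, body TEXT)")

# The ? placeholder binds the value directly; no escaping or
# whitespace stripping is required.
conn.execute("INSERT INTO pages (body) VALUES (?)", (html,))

# Read the row back to confirm the whitespace survived the round trip.
stored = conn.execute("SELECT body FROM pages WHERE id = 1").fetchone()[0]
conn.close()

print(stored == html)
```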