How does this regular expression work

on 13.07.2011 14:42:43 by win.acc

While reading the code of a web spider, I came across this expression:
preg_match("@<head[^>]*>(.*?)<\/head>@si", $file, $regs);
It is meant to extract the head content of a web page. I would like to
understand how the match works in detail. Could someone give me some tips?
Thanks in advance.

All the best
------------------------
What we are struggling for ?
The life or the life ?

RE: How does this regular expression work

on 13.07.2011 15:04:39 by shehi

That regex has an unnecessary part to it. It would be better to write it
like this:

/\<head.*?\>(?P<head_tag_innerHTML>.*?)\<\/head\>/sim

I added the /m modifier, which switches the search to multiline mode (^ and $
then also match at line breaks). I also added a named group,
"head_tag_innerHTML", for you, so writing this regex in PHP like this:

preg_match( '/\<head.*?\>(?P<head_tag_innerHTML>.*?)\<\/head\>/sim' ,
$subject , $matches )

will place the search results in $matches, and you can access your captured
content, i.e. the HEAD's value, in $matches['head_tag_innerHTML'].

I also replaced your delimiters for better clarity.

It will fetch EVERYTHING that comes inside the HEAD tag, i.e. its
value/innerHTML. Putting ? after * makes the scan *ungreedy*. And escaping all
possible regex characters [including <, >, ? etc.] is good practice.
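
To make the named group and the ungreedy scan concrete, here is a minimal sketch. The sample $html string and the variable names $lazy and $greedy are made up for illustration and are not from the spider's code; the pattern is written without the extra escapes.

```php
<?php
// Hypothetical sample input (not from the original thread). It deliberately
// contains two closing </head> tags to show where each quantifier stops.
$html = "<html>\n<head>\n<title>Demo</title>\n</head>\n<body><head>second</head></body>\n</html>";

// Lazy .*? stops at the FIRST </head>. The named group is stored in
// $matches under its name and also under index 1.
preg_match('/<head[^>]*>(?P<head_tag_innerHTML>.*?)<\/head>/si', $html, $matches);
$lazy = $matches['head_tag_innerHTML'];

// Greedy .* runs on to the LAST </head> in the input.
preg_match('/<head[^>]*>(?P<head_tag_innerHTML>.*)<\/head>/si', $html, $greedyMatches);
$greedy = $greedyMatches['head_tag_innerHTML'];

var_dump($lazy);
var_dump($greedy);
```

With this input the lazy version should capture only the title line with its surrounding newlines, while the greedy one also swallows the first closing tag and everything up to the second one.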


Shehi







Re: How does this regular expression work

on 14.07.2011 12:27:54 by Richard Quadling

On 13 July 2011 14:04, Shahriyar Imanov wrote:
> \<head.*?\>(?P<head_tag_innerHTML>.*?)\<\/head\>

<head.*?>(?P<head_tag_innerHTML>.*?)</head>

Options: case insensitive; ^ and $ match at line breaks

Match the characters “<head” literally «<head»
Match any single character that is not a line break character «.*?»
   Between zero and unlimited times, as few times as possible,
   expanding as needed (lazy) «*?»
Match the character “>” literally «>»
Match the regular expression below and capture its match into
backreference with name “head_tag_innerHTML”
«(?P<head_tag_innerHTML>.*?)»
   Match any single character that is not a line break character «.*?»
      Between zero and unlimited times, as few times as possible,
      expanding as needed (lazy) «*?»
Match the characters “</head>” literally «</head>»


Created with RegexBuddy


Escaping < > ? at the wrong time is simply a redundancy. Learning when
to escape is just like learning a new language.

You only need to escape anything if you need it to be a literal character.

e.g.

something.html vs something\.html
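
A small sketch of that difference, using made-up test strings; the variable names ($unescaped, $escaped, $results, $sameMatch) are only for illustration:

```php
<?php
// Unescaped dot: matches any single character, so "somethingXhtml" matches too.
$unescaped = '/^something.html$/';
// Escaped dot: matches only a literal ".".
$escaped = '/^something\.html$/';

$results = [];
foreach (['something.html', 'somethingXhtml'] as $name) {
    $results[$name] = [
        'unescaped' => preg_match($unescaped, $name),
        'escaped'   => preg_match($escaped, $name),
    ];
}

// < and > have no special meaning here, so escaping them changes nothing.
$sameMatch = preg_match('/\<head\>/', '<head>') === preg_match('/<head>/', '<head>');

print_r($results);
var_dump($sameMatch);
```

Only the escaped pattern rejects "somethingXhtml", whereas the escaped and unescaped angle brackets behave the same.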


--
Richard Quadling
Twitter : EE : Zend : PHPDoc
@RQuadling : e-e.com/M_248814.html : bit.ly/9O8vFY : bit.ly/lFnVea

--
PHP Database Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php

Re: How does this regular expression work

on 14.07.2011 12:29:01 by Richard Quadling

On 13 July 2011 13:42, who.cat wrote:
> <head[^>]*>(.*?)</head>

<head[^>]*>(.*?)</head>

Options: case insensitive; ^ and $ match at line breaks

Match the characters “<head” literally «<head»
Match any character that is NOT a “>” «[^>]*»
   Between zero and unlimited times, as many times as possible, giving
   back as needed (greedy) «*»
Match the character “>” literally «>»
Match the regular expression below and capture its match into
backreference number 1 «(.*?)»
   Match any single character that is not a line break character «.*?»
      Between zero and unlimited times, as few times as possible,
      expanding as needed (lazy) «*?»
Match the characters “</head>” literally «</head>»


Created with RegexBuddy
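
Note that the breakdown above was generated without the dot-matches-newline option, while the original call uses the s modifier. A rough sketch of what that modifier changes, assuming a made-up $file string (the real spider input is not shown in the thread):

```php
<?php
// Hypothetical input: an upper-case head tag with an attribute and a
// body that spans several lines.
$file = "<HEAD profile=\"x\">\n<title>Demo</title>\n</HEAD>";

// With /s the dot also matches line breaks, so the multi-line head is
// captured; /i lets "<head" match "<HEAD"; [^>]* skips the attribute.
$withS = preg_match('@<head[^>]*>(.*?)<\/head>@si', $file, $regs);

// Without /s the dot cannot cross a line break, so nothing matches.
$withoutS = preg_match('@<head[^>]*>(.*?)<\/head>@i', $file, $noRegs);

var_dump($withS, $regs[1], $withoutS);
```

So for real pages, where the head almost always spans several lines, the s modifier is what makes the capture work at all.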
--
Richard Quadling
Twitter : EE : Zend : PHPDoc
@RQuadling : e-e.com/M_248814.html : bit.ly/9O8vFY : bit.ly/lFnVea
