1- PHP Domain Parser
2- =================
1+ # PHP Domain Parser
32
43**PHP Domain Parser** is a [Public Suffix List](http://publicsuffix.org/) based
54domain parser implemented in PHP.
65
76[![Build Status](https://travis-ci.org/jeremykendall/php-domain-parser.png?branch=master)](https://travis-ci.org/jeremykendall/php-domain-parser)
8- [![SensioLabsInsight](https://insight.sensiolabs.com/projects/13310245-48b5-43a2-ac30-269e059305e1/mini.png)](https://insight.sensiolabs.com/projects/13310245-48b5-43a2-ac30-269e059305e1)
97[![Total Downloads](https://poser.pugx.org/jeremykendall/php-domain-parser/downloads.png)](https://packagist.org/packages/jeremykendall/php-domain-parser)
108[![Latest Stable Version](https://poser.pugx.org/jeremykendall/php-domain-parser/v/stable.png)](https://packagist.org/packages/jeremykendall/php-domain-parser)
119
12- Motivation
13- ----------
10+ ## Motivation
1411
1512While there are plenty of excellent URL parsers and builders available, there
1613are very few projects that can accurately parse a url into its component
@@ -20,13 +17,12 @@ Consider the domain www.pref.okinawa.jp. In this domain, the
2017*public suffix* portion is **okinawa.jp**, the *registerable domain* is
2118**pref.okinawa.jp**, and the *subdomain* is **www**. You can't regex that.
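
The split described above can be sketched as a longest-suffix match against a rule list. This is only an illustration, not the library's implementation: the `splitHost()` function and the tiny `$rules` array are hypothetical, and the real Public Suffix List also contains wildcard and exception rules that this sketch ignores.

```php
<?php
// Hypothetical, hand-picked rules for illustration only. The real Public
// Suffix List is far larger and includes wildcard and exception rules.
$rules = ['jp', 'okinawa.jp', 'com', 'co.uk', 'uk'];

// Hypothetical helper: finds the longest rule that matches the end of $host.
function splitHost(string $host, array $rules): array
{
    $labels = explode('.', strtolower($host));
    $count = count($labels);

    // Start with the whole host and drop one leading label per iteration,
    // so the first match is the longest matching suffix.
    for ($i = 0; $i < $count; $i++) {
        $candidate = implode('.', array_slice($labels, $i));
        if (in_array($candidate, $rules, true)) {
            return [
                'subdomain' => $i > 1 ? implode('.', array_slice($labels, 0, $i - 1)) : null,
                'registerableDomain' => $i > 0 ? implode('.', array_slice($labels, $i - 1)) : null,
                'publicSuffix' => $candidate,
            ];
        }
    }

    // No rule matched at all.
    return ['subdomain' => null, 'registerableDomain' => null, 'publicSuffix' => null];
}

$parts = splitHost('www.pref.okinawa.jp', $rules);
var_dump($parts);
```

The point of the sketch is that the answer depends on data (the rule list), not on the shape of the string, which is why a regular expression alone cannot get it right.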
2219
23- Other similar libraries focus primarily on URL building, parsing, and manipulation and
24- additionally include public suffix domain parsing. PHP Domain Parser was built around
25- accurate Public Suffix List based parsing from the very beginning, adding a URL
26- object simply for the sake of completeness.
20+ Other similar libraries focus primarily on URL building, parsing, and
21+ manipulation and additionally include public suffix domain parsing. PHP Domain
22+ Parser was built around accurate Public Suffix List based parsing from the very
23+ beginning, adding a URL object simply for the sake of completeness.
2724
28- Installation
29- ------------
25+ ## Installation
3026
3127The only (currently) supported method of installation is via
3228[Composer](http://getcomposer.org).
@@ -53,10 +49,9 @@ require_once 'vendor/autoload.php'
5349
5450You're now ready to begin using the PHP Domain Parser.
5551
56- Usage
57- -----
52+ ## Usage
5853
59- ### Parsing URLs ###
54+ ### Parsing URLs
6055
6156Parsing URLs into their component parts is as simple as the example you see below.
6257
@@ -104,8 +99,9 @@ class Pdp\Uri\Url#6 (8) {
10499
105100### Convenience Methods
106101
107- A magic __get() method is provided to access the above object properties. Obtaining the
108- public suffix for a parsed domain is as simple as:
102+ A magic [`__get()`](http://php.net/manual/en/language.oop5.overloading.php#object.get)
103+ method is provided to access the above object properties. Obtaining the public
104+ suffix for a parsed domain is as simple as:
109105
110106``` php
111107<?php
@@ -120,8 +116,8 @@ $publicSuffix = $url->host->publicSuffix;
120116### IDNA Support
121117
122118[IDN (Internationalized Domain Name)](http://en.wikipedia.org/wiki/Internationalized_domain_name)
123- support was added in version `1.4.0`. Both unicode domains and their ASCII equivalents
124- are supported.
119+ support was added in version `1.4.0`. Both unicode domains and their ASCII
120+ equivalents are supported.
125121
126122**IMPORTANT**:
127123
@@ -132,13 +128,13 @@ required for [mb_strtolower](http://php.net/manual/en/function.mb-strtolower.php
132128
133129#### Unicode
134130
135- Parsing IDNA hosts is no different than parsing standard hosts. Setting `$host = 'Яндекс.РФ';` (Russian-Cyrillic)
136- in the *Parsing URLs* example would return:
131+ Parsing IDNA hosts is no different than parsing standard hosts. Setting `$host
132+ = 'Яндекс.РФ';` (Russian-Cyrillic) in the *Parsing URLs* example would return:
137133
138134```
139135class Pdp\Uri\Url#6 (8) {
140136 private $scheme =>
141- string(4) "http"
137+ string(0) ""
142138 private $host =>
143139 class Pdp\Uri\Url\Host#5 (4) {
144140 private $subdomain =>
@@ -203,7 +199,8 @@ class Pdp\Uri\Url#6 (8) {
203199
204200### IPv6 Support
205201
206- Parsing IPv6 hosts is no different than parsing standard hosts. Setting `$host = 'http://[2001:db8:85a3:8d3:1319:8a2e:370:7348]:8080/';`
202+ Parsing IPv6 hosts is no different than parsing standard hosts. Setting `$host
203+ = 'http://[2001:db8:85a3:8d3:1319:8a2e:370:7348]:8080/';`
207204in the *Parsing URLs* example would return:
208205
209206```
@@ -242,7 +239,7 @@ will not be parsed properly otherwise.
242239> Hat tip to [@geekwright](https://github.com/geekwright) for adding IPv6 support in a
243240> [bugfix pull request](https://github.com/jeremykendall/php-domain-parser/pull/35).
244241
245- ### Parsing Domains ###
242+ ### Parsing Domains
246243
247244If you'd like to parse the domain (or host) portion only, you can use
248245`Parser::parseHost()`.
@@ -279,7 +276,8 @@ var_dump($parser->isSuffixValid('www.example.com.au');
279276// true
280277```
281278
282- A suffix is considered invalid if it is not contained in the [Public Suffix List](http://publicsuffix.org/).
279+ A suffix is considered invalid if it is not contained in the
280+ [Public Suffix List](http://publicsuffix.org/).
283281
284282> Huge thanks to [@SmellyFish](https://github.com/SmellyFish) for submitting
285283> [Add a way to validate TLDs](https://github.com/jeremykendall/php-domain-parser/pull/36)
@@ -306,7 +304,7 @@ string(16) "scottwills.co.uk"
306304string(5) "co.uk"
307305```
308306
309- ### Sanity Check ###
307+ ### Sanity Check
310308
311309You can quickly parse a url from the command line with the provided `parse`
312310vendor binary. From the root of your project, simply call:
@@ -318,7 +316,8 @@ $ ./vendor/bin/parse <url>
318316If you pass a url to `parse`, that url will be parsed and the output printed
319317to screen.
320318
321- If you do not pass a url, `http://user:pass@www.pref.okinawa.jp:8080/path/to/page.html?query=string#fragment` will be parsed and the output printed to screen.
319+ If you do not pass a url, `http://user:pass@www.pref.okinawa.jp:8080/path/to/page.html?query=string#fragment`
320+ will be parsed and the output printed to screen.
322321
323322Example:
324323
@@ -342,12 +341,12 @@ Array
342341Host: http://www.waxaudio.com.au/
343342```
344343
345- ### Example Script ###
344+ ### Example Script
346345
347346For more information on using the PHP Domain Parser, please see the provided
348347[example script](https://github.com/jeremykendall/php-domain-parser/blob/master/example.php).
349348
350- ### Refreshing the Public Suffix List ###
349+ ### Refreshing the Public Suffix List
351350
352351While a cached PHP copy of the Public Suffix List is provided for you in the
353352`data` directory, that copy may or may not be up to date (Mozilla provides an
@@ -367,13 +366,23 @@ You may verify the update by checking the timestamp on the files located in the
367366**Important**: The vendor binary `update-psl` depends on an internet connection to
368367update the cached Public Suffix List.
369368
370- Contributing
371- ------------
369+ ## Possible Unexpected Behavior
370+
371+ PHP Domain Parser is built around PHP's
372+ [`parse_url()`](http://php.net/parse_url) function and, as such, exhibits most
373+ of the same behaviors as that function. Just like `parse_url()`, this library
374+ is not meant to validate URLs, but rather to break a URL into its component
375+ parts.
376+
377+ One specific, counterintuitive behavior is that PHP Domain Parser will happily
378+ parse a URL with [spaces in the host part](https://github.com/jeremykendall/php-domain-parser/issues/45).
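
The underlying behavior can be seen with PHP's `parse_url()` on its own. The snippet below is a minimal sketch: the URL is made up for illustration, and the whitespace check at the end is an assumption about what a caller might add, not something PHP Domain Parser does for you.

```php
<?php
// Illustrative input only: the host part deliberately contains a space.
$url = 'http://www.exa mple.com/path';

// parse_url() splits the string into components without validating them.
$parts = parse_url($url);

var_dump($parts['host']);

// A caller that needs validation has to add it. This check is an assumption
// of this sketch, not part of the library.
$hostLooksValid = preg_match('/\s/', $parts['host']) === 0;
var_dump($hostLooksValid);
```

If your application needs strict validation, run a check like this (or a dedicated validator) before handing the URL to the parser.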
379+
380+ ## Contributing
372381
373382Pull requests are *always* welcome! Please review the CONTRIBUTING.md document before
374383submitting pull requests.
375384
376- #### Heads up: BC Break In All 1.4 Versions
385+ ## Heads up: BC Break In All 1.4 Versions
377386
378387The 1.4 series introduced a backwards incompatible change by adding PHP's `ext-mbstring`
379388and `ext-intl` as dependencies. This should have resulted in a major version
@@ -383,11 +392,10 @@ I highly recommend reverting to 1.3.1 if you're running into extension issues an
383392do not want to or cannot install `ext-mbstring` and `ext-intl`. You will lose
384393IDNA and IPv6 support, however. Those are only available in versions >= 1.4.
385394
386- Version 2 is currently in the works. Please keep an eye out. I apologize for any
387- issues you may have encountered due to my [semver](http://semver.org/) error.
395+ I apologize for any issues you may have encountered due to my
396+ [semver](http://semver.org/) error.
388397
389- Attribution
390- -----------
398+ ## Attribution
391399
392400The HTTP adapter interface and the cURL HTTP adapter were inspired by (er,
393401lifted from) Will Durand's excellent