Wednesday 15 June 2011

PHP file_get_contents() behaves differently to browser


I am trying to download the contents of a web page using PHP. When I run the following:

  $f = file_get_contents("http://mobile.mybustracker.co.uk/mobile.php?searchMode=2");

it returns a page reporting that the server is down, yet when I paste the same URL into my browser I get the page I expect.

Does anyone know the reason for this? Does file_get_contents send any headers that distinguish its request from a browser request?

Yes, there are differences: the browser tends to send plenty of additional HTTP headers, I'd say, and the headers that both send probably do not carry the same values.

Here, after a couple of tests, it seems that sending the HTTP header called Accept is necessary.

This can be done with the third parameter of file_get_contents, which specifies additional context information:

  $opts = array(
      'http' => array(
          'method' => 'GET',
          // Sending a browser User-Agent turned out not to be necessary:
          //'user_agent' => "Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.2) Gecko/20100301 Ubuntu/9.10 (karmic) Firefox/3.6",
          'header' => array(
              'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
          ),
      ),
  );
  $context = stream_context_create($opts);
  $f = file_get_contents("http://mobile.mybustracker.co.uk/mobile.php?searchMode=2", false, $context);
  echo $f;

With this, I'm able to get the HTML code of the page.


Notes:

  • My first test passed a user_agent, but it does not seem to be necessary, which is why the corresponding line is commented out here.
  • The value used for the Accept header is the one Firefox sent when I requested that page with it, before trying file_get_contents.
    • Some other values may be fine, but I have not tested which value is actually required.

For more information, see the PHP manual page on HTTP context options; that is the interesting page here ;-)
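
When experimenting with which Accept value a server requires, it can help to look at both the headers a context will send and the headers the server returns. The following is only a minimal sketch under stated assumptions: the helper names make_accept_context and fetch_with_accept are hypothetical (they are not part of PHP or of the answer above), and the ignore_errors option is added here so that the body of an error page is still returned for inspection.

```php
<?php
// Hypothetical helper: build a stream context that sends the given Accept header.
function make_accept_context(string $accept)
{
    return stream_context_create(array(
        'http' => array(
            'method' => 'GET',
            'header' => array('Accept: ' . $accept),
            // Assumption for debugging: still return the body on 4xx/5xx responses.
            'ignore_errors' => true,
        ),
    ));
}

// Hypothetical helper: fetch a URL and return the body with the response headers.
function fetch_with_accept(string $url, string $accept): array
{
    $body = @file_get_contents($url, false, make_accept_context($accept));
    // PHP fills $http_response_header in the local scope after an HTTP request;
    // it stays unset if the connection itself failed.
    $headers = $http_response_header ?? array();
    return array('body' => $body, 'headers' => $headers);
}

// Inspect the options that will be sent, without making any request.
$context = make_accept_context('text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8');
$options = stream_context_get_options($context);
print_r($options['http']);
```

Calling fetch_with_accept() with the URL from the question and a few different Accept values, then comparing the first entry of 'headers' (the status line), would be one way to narrow down which value the server actually needs.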
