Networking with PHP - PHP Advanced and Object-Oriented Programming (2013) - Visual Quickpro Guide

10. Networking with PHP


In This Chapter

Accessing Other Web Sites

Working with Sockets

Performing IP Geolocation

Using cURL

Creating Web Services

Review and Pursue


The vast bulk of what PHP is used to do is based on taking information from the server (like a database or a text file) and sending it to the client (the end-user’s Web browser), or vice versa. But PHP also supports a slew of features for the purpose of interacting with other Web sites, communicating with other servers, and even FTP’ing files. In this chapter, I’ll discuss and demonstrate some of PHP’s network-related functions and capabilities.

In the first example, you’ll see how to read data from another Web site as if it were any old text file. In the second, a Web site verifier will be created (a tool for checking whether a link is good). In the third section of the chapter, you’ll learn how to identify from what country a user is connecting to your server. The fourth example introduces cURL, a powerful networking utility. And finally, you’ll learn how to start creating your own Web services using PHP.

Accessing Other Web Sites

Even though PHP itself is normally used to create Web sites, it can also access and interact with Web pages on its own. This can be useful for retrieving information, writing spiders (applications that scour the Internet for particular data), and more. Surprisingly, you can access other Web sites in much the same way you would access a text file on your hard drive: by using fopen():

fopen ('http://www.example.com/', 'r');

The same fopen() function used for opening files can also open Web pages because they are, after all, just files on a server. The parameters for using fopen() are the same (r, w, and a), although you will be limited to opening a file only for reading.

One caveat, though, is that you must use a trailing slash after a directory because fopen() will not support redirects. The preceding example and this one are fine:

fopen ('http://www.example.com/index.php', 'r');

But this will fail:

fopen ('http://www.example.com/dir', 'r');

(Many people are unaware that the URL www.example.com/dir is redirected to www.example.com/dir/.)

Another caveat is that PHP must be configured to allow for fopen() calls over a network (the allow_url_fopen setting must be enabled), which not all PHP installations are.

Figure: If your PHP installation does not allow fopen() to be used over a network, you’ll see errors like these.

Once you have opened a file, you can treat it as you otherwise would, using file(), fgets(), etc., to retrieve the data.
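As a minimal sketch of that idea, the following helper reads every line from a source using fopen() and fgets(), whether the source is a local path or a remote URL. The read_lines() function name is hypothetical (it is not one of this chapter’s scripts), and the up-front check of allow_url_fopen is just one reasonable way to fail early.

```php
<?php
// Hypothetical helper, not part of the chapter's scripts: read all lines
// from a local path or a remote URL with the ordinary file functions.
function read_lines($source) {

    // Remote URLs only work if allow_url_fopen is enabled:
    $is_remote = (bool) preg_match('#^(https?|ftp)://#i', $source);
    $allowed = filter_var(ini_get('allow_url_fopen'), FILTER_VALIDATE_BOOLEAN);
    if ($is_remote && !$allowed) {
        return false;
    }

    // Open the "file" (warnings are suppressed; failure is reported via false):
    $fp = @fopen($source, 'r');
    if ($fp === false) {
        return false;
    }

    // Read one line at a time, dropping the line endings:
    $lines = array();
    while (($line = fgets($fp)) !== false) {
        $lines[] = rtrim($line, "\r\n");
    }

    fclose($fp);
    return $lines;
}
```

Calling read_lines('http://www.example.com/') would then return the page’s source as an array of lines, or false if the connection could not be made.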

I’ll demonstrate this concept by making use of Yahoo!’s financial pages that return New York Stock Exchange quotes for different stocks. Before proceeding, I should state that the legality of retrieving information from another Web site is an issue you would want to investigate before permanently implementing something like this. Most sites contain copyrighted information, and using it without permission would be a violation. This demonstration with Yahoo! is just a demonstration, not a suggestion that you make a habit of this!

To read a Web site with PHP

1. Create a new PHP document in your text editor or IDE, to be named get_quote.php, beginning with the HTML (Script 10.1):

<!doctype html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>Get Stock Quotes</title>
<link rel="stylesheet" href="style.css">
</head>
<body>
<?php # Script 10.1 - get_quote.php

The CSS script, available in the downloads at the book’s corresponding Web site (www.LarryUllman.com), defines two classes that will be used to format the results.

2. Check if the form has been submitted:

if (isset($_GET['symbol']) && !empty($_GET['symbol'])) {

This page will both display and handle a form. The form itself takes just one input: the symbol for a stock. As the form uses the GET method, the handling PHP code checks for the presence of a $_GET['symbol'].

Script 10.1. The code in this example will retrieve stock quotes by opening up Yahoo!’s quote page and parsing the data therein.


<!doctype html>
<html lang="en">
<head>
    <meta charset="utf-8">
    <title>Get Stock Quotes</title>
    <link rel="stylesheet" href="style.css">
</head>
<body>
<?php # Script 10.1 - get_quote.php
// This page retrieves a stock price from Yahoo!.

if (isset($_GET['symbol']) && !empty($_GET['symbol'])) { // Handle the form.

    // Identify the URL:
    $url = sprintf('http://quote.yahoo.com/d/quotes.csv?s=%s&f=nl1', $_GET['symbol']);

    // Open the "file".
    $fp = fopen($url, 'r');

    // Get the data:
    $read = fgetcsv($fp);

    // Close the "file":
    fclose($fp);

    // Check the results for improper symbols:
    if (strcasecmp($read[0], $_GET['symbol']) !== 0) {

        // Print the results:
        echo '<div>The latest value for <span class="quote">' . $read[0] . '</span> (<span class="quote">' . $_GET['symbol'] . '</span>) is $<span class="quote">' . $read[1] . '</span>.</div>';

    } else {
        echo '<div class="error">Invalid symbol!</div>';
    }

} // End of form submission IF.

// Show the form:
?><form action="get_quote.php" method="get">
    <fieldset>
        <legend>Enter a NYSE stock symbol to get the latest price:</legend>
        <p><label for="symbol">Symbol</label>: <input type="text" name="symbol" size="5" maxlength="5"></p>
        <p><input type="submit" name="submit" value="Fetch the Quote!" /></p>
    </fieldset>
</form>
</body>
</html>


3. Define the URL to be opened:

$url = sprintf('http://quote.yahoo.com/d/quotes.csv?s=%s&f=nl1', $_GET['symbol']);

The most important consideration when accessing and reading other Web pages is to know exactly what data will be there and in what form. In other words, unless you are merely copying the entire contents of a file, you’ll need to develop some system for gleaning the parts of the page you want according to how the data is structured.

In this example, a URL such as http://quote.yahoo.com/d/quotes.csv takes two arguments: the stock (or stocks) to check and the formatting parameters.

It will then return CSV (comma-separated value) data.

For this example, I want to know the stock’s name and the latest price, so the formatting would be nl1 (see www.gummy-stuff.org/Yahoo-data.htm for the options and what they mean). That gets added to the URL, along with the ticker symbol.

If you were to run this URL directly in a Web browser—a good debugging step—you’d see that the result will be in the format (where XX.XX is the price):

"STOCK NAME",XX.XX

4. Open the Web page and read in the data:

$fp = fopen($url, 'r');
$read = fgetcsv($fp);
fclose($fp);

Now that the URL is defined, I can open the “file” for reading. Since I know that the returned data is in CSV form, I can use fgetcsv() to read it. This function will automatically turn the line it reads into an array, using commas as the delimiter. Then I close the file pointer. Note that if the URL were a proper HTML document (this one is not), the first line read would be something like

<!doctype html...

5. Validate that a legitimate stock symbol was used:

if (strcasecmp($read[0], $_GET['symbol']) !== 0) {

If an invalid stock symbol is used, then the Yahoo! page will return that symbol as the stock name and $0.00 as the price. To weed out these instances, check if the returned name is the same as the symbol. I use the strcasecmp() function to perform a case-insensitive equality check between them. If they are the same, the function will return 0. If they are not the same, a nonzero value is returned, meaning it’s safe to print the result.
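The URL-building, parsing, and validation logic from Steps 3 through 5 can also be sketched as two small functions that are testable without contacting Yahoo!. The names build_quote_url() and parse_quote() are hypothetical, and the urlencode() call is an added precaution that the chapter’s script does not take. The str_getcsv() function parses a string the same way fgetcsv() parses a line read from a file pointer.

```php
<?php
// Hypothetical helpers, not part of Script 10.1.

// Build the request URL, URL-encoding the user-supplied symbol:
function build_quote_url($symbol) {
    return sprintf('http://quote.yahoo.com/d/quotes.csv?s=%s&f=nl1', urlencode($symbol));
}

// Parse one line of returned CSV data and weed out invalid symbols.
// Returns array(name, price) on success or false otherwise.
function parse_quote($csv_line, $symbol) {

    // Turn the line into an array, using commas as the delimiter:
    $read = str_getcsv(trim($csv_line), ',', '"', '\\');
    if (count($read) < 2) {
        return false;
    }

    // An invalid symbol comes back as the stock's "name":
    if (strcasecmp($read[0], $symbol) === 0) {
        return false;
    }

    return array($read[0], $read[1]);
}
```

Note that the price comes back as a string, just as it does from fgetcsv().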

6. Print the stock’s value:

echo '<div>The latest value for <span class="quote">' . $read[0] . '</span> (<span class="quote">' . $_GET['symbol'] . '</span>) is $<span class="quote">' . $read[1] . '</span>.</div>';

The code in Step 4 takes the information retrieved (e.g., “STOCK NAME”,24.34) and turns it into an array. The first element in the array is the stock’s name, and the second is the current stock value. Both are printed, along with the stock’s symbol, within some CSS formatting. Note that the fgetcsv() function will strip the quotes from around the stock’s name.

Figure: The script has determined, by accessing the Yahoo! page, that Apple Computer is currently at $582.10.

7. Complete the strcasecmp() conditional:

} else {
echo '<div class="error">Invalid symbol!</div>';
}

8. Complete the $_GET['symbol'] conditional and the PHP section.

} // End of form submission IF.
?>

9. Create the HTML form:

<form action="get_quote.php" method="get">
<fieldset>
<legend>Enter a NYSE stock symbol to get the latest price:</legend>
<p><label for="symbol">Symbol</label>: <input type="text" name="symbol" size="5" maxlength="5"></p>
<p><input type="submit" name="submit" value="Fetch the Quote!" /></p>
</fieldset>
</form>

The form takes just one input: a text box for the stock’s symbol.

Figure: The form takes just a stock symbol from the user.

10. Complete the page:

</body>
</html>

11. Save the file as get_quote.php, place it in your Web directory, and test in your Web browser.

Figure: The result if an invalid ticker symbol is entered.


Tip

PEAR (PHP Extension and Application Repository) contains dozens of networking-related classes. See http://pear.php.net for more.



Tip

The Zend Framework (http://framework.zend.com) has some network-related classes as well. As of this writing, classes are available specifically for connecting to Amazon, Flickr, and Yahoo!.



Tip

More complex Web pages might require use of regular expressions to retrieve the particular pieces you want from the returned data.
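As a small illustration of that last tip, the sketch below pulls the contents of the title element out of a string of HTML. The extract_title() name is hypothetical, and a single regular expression like this is only adequate for simple, predictable markup.

```php
<?php
// Hypothetical helper: extract the <title> text from returned HTML.
function extract_title($html) {

    // The "i" flag ignores case; the "s" flag lets . match newlines:
    if (preg_match('#<title[^>]*>(.*?)</title>#is', $html, $match)) {
        return trim($match[1]);
    }

    return false; // No title found.
}
```

The $html value could come from any of the reading techniques in this section, such as joining the lines returned by file().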


Working with Sockets

The fopen() function is one way to access Web pages, but a more sophisticated method for interacting with another server is to use sockets. A socket, in case you are not familiar, is a channel through which two computers can communicate with each other. To open a socket in PHP, use fsockopen():

$fp = fsockopen ($url, $port, $error_number, $error_string, $timeout);

You use fsockopen() to establish a file pointer, just as you would use fopen(). The parameters the function takes are the host name (just the host, such as www.example.com, not a full URL), the port, an error number variable, an error string variable, and the timeout (only the first argument is required).

In layman’s terms, a port is the door through which different protocols (methods of communication) go. For Web pages, the port is normally 80 (see Table 10.1, which lists the most commonly used of the more than 60,000 ports in existence). The error number and string variables are interesting in that they are not really sent to the function (as they have no value initially) so much as they are a way for the function to return error information should one occur. Finally, the timeout simply states for how many seconds the function should try to connect.
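To make the role of the error variables more concrete, here is a minimal sketch of a connection test. The try_connect() function and the shape of its return value are assumptions for illustration only. The error number and error string are filled in by fsockopen() itself when the connection fails.

```php
<?php
// Hypothetical helper: attempt a socket connection and report the outcome.
function try_connect($host, $port = 80, $timeout = 5) {

    // These are populated by fsockopen() only if something goes wrong:
    $errno = 0;
    $errstr = '';

    // Suppress the warning; the return value indicates failure:
    $fp = @fsockopen($host, $port, $errno, $errstr, $timeout);

    if ($fp === false) {
        return array('connected' => false, 'errno' => $errno, 'errstr' => $errstr);
    }

    fclose($fp);
    return array('connected' => true, 'errno' => 0, 'errstr' => '');
}
```

A call such as try_connect('www.example.com', 80, 10) would wait up to 10 seconds for the connection before giving up.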

Table 10.1. Some Common Ports

Once the file has been successfully opened, you can again use fwrite(), fgets(), and so forth to manipulate the data.

Another function I’ll explain before writing the fsockopen() example is parse_url(). This function takes a URL and turns it into an associative array by breaking the structure into its parts:

$url_pieces = parse_url($url);

The primary pieces of the URL will be scheme, host, port, path, and query. Table 10.2 shows how the URL

http://www.example.com/view.php?week=1#demo

would be broken down by parse_url(). The user and pass indexes would have values if the URL were of the format http://username:password@www.example.com.

Table 10.2. parse_url() Example

Index       Value
scheme      http
host        www.example.com
path        /view.php
query       week=1
fragment    demo
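The same breakdown can be confirmed with a few lines of code. This sketch only uses parse_url() on the two URL formats just mentioned; the variable names are arbitrary. Indexes that are not present in the URL (such as port here) are simply not set in the returned array, which is why the isset() check is needed.

```php
<?php
// Break the example URL into its parts:
$url_pieces = parse_url('http://www.example.com/view.php?week=1#demo');

// The port is not part of this URL, so fall back to a default:
$port = isset($url_pieces['port']) ? $url_pieces['port'] : 80;

// Print each index and value that was found:
foreach ($url_pieces as $index => $value) {
    echo "$index: $value\n";
}

// A URL containing credentials populates the user and pass indexes:
$auth_pieces = parse_url('http://username:password@www.example.com');
```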

The parse_url() function can be handy in all sorts of instances. I’ll demonstrate one example in the following script. The code developed there will run through a list of URLs and check each to make sure they are still active. To do so, a user-defined function will take a provided URL, parse it, and then use fsockopen() to connect to it. The server’s HTTP response code will indicate the validity of that link. (Table 10.3 lists some common HTTP status codes, which you can also find by searching the Web.)

Table 10.3. Common HTTP Status Codes

To use fsockopen()

1. Create a new PHP document in your text editor or IDE, to be named check_urls.php, beginning with the HTML (Script 10.2):

<!doctype html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>Validate URLs</title>
<link rel="stylesheet" href="style.css">
</head>
<body>
<?php # Script 10.2 - check_urls.php

Again, download the CSS script from www.LarryUllman.com to properly format the results.

2. Begin defining the check_url() function:

function check_url($url) {

The function takes one argument: the URL to be validated.

Script 10.2. By making a socket connection, this script can quickly check if a given URL is still valid.


<!doctype html>
<html lang="en">
<head>
    <meta charset="utf-8">
    <title>Validate URLs</title>
    <link rel="stylesheet" href="style.css">
</head>
<body>
<?php # Script 10.2 - check_urls.php
// This page validates a list of URLs. It uses fsockopen() and parse_url() to do so.

// This function will try to connect to a URL:
function check_url($url) {

    // Break the URL down into its parts:
    $url_pieces = parse_url($url);

    // Set the $path and $port:
    $path = (isset($url_pieces['path'])) ? $url_pieces['path'] : '/';
    $port = (isset($url_pieces['port'])) ? $url_pieces['port'] : 80;

    // Connect using fsockopen():
    if ($fp = fsockopen ($url_pieces['host'], $port, $errno, $errstr, 30)) {

        // Send some data:
        $send = "HEAD $path HTTP/1.1\r\n";
        $send .= "HOST: {$url_pieces['host']}\r\n";
        $send .= "CONNECTION: Close\r\n\r\n";
        fwrite($fp, $send);

        // Read the response:
        $data = fgets($fp, 128);

        // Close the connection:
        fclose($fp);

        // Return the response code:
        list($response, $code) = explode(' ', $data);
        if ($code == 200) {
            return array($code, 'good');
        } else {
            return array($code, 'bad');
        }

    } else { // No connection, return the error message:
        return array($errstr, 'bad');
    }

} // End of check_url() function.

// Create the list of URLs:
$urls = array(
    'http://www.larryullman.com/',
    'http://www.larryullman.com/wp-admin/',
    'http://www.yiiframework.com/tutorials/',
    'http://video.google.com/videoplay?docid=-5137581991288263801&q=loose+change'
);

// Print a header:
echo '<h2>Validating URLs</h2>';

// Kill the PHP time limit:
set_time_limit(0);

// Validate each URL:
foreach ($urls as $url) {
    list($code, $class) = check_url($url);
    echo "<p><a href=\"$url\" target=\"_new\">$url</a> (<span class=\"$class\">$code</span>)</p>\n";
}
?>
</body>
</html>


3. Parse the URL:

$url_pieces = parse_url($url);

4. Set the proper path and port values:

$path = (isset($url_pieces['path'])) ? $url_pieces['path'] : '/';
$port = (isset($url_pieces['port'])) ? $url_pieces['port'] : 80;

I want to make sure that I’ve got the right path and port when testing the connection later on, so I set the $path variable to be either the existing path, if any, or a slash, as the default. For the URL www.example.com/dir, the path would be /dir. For www.example.com, the path would be /.

The same treatment is given to the $port, with the default as 80.

5. Attempt to connect using fsockopen():

if ($fp = fsockopen($url_pieces['host'], $port, $errno, $errstr, 30)) {

6. If a connection is established, write some data to the server:

$send = "HEAD $path HTTP/1.1\r\n";
$send .= "HOST: {$url_pieces['host']}\r\n";
$send .= "CONNECTION: Close\r\n\r\n";
fwrite($fp, $send);

These lines may seem confusing, but what they are essentially doing is sending a series of HTTP headers to the server to initiate communication. The type of request being made is HEAD. Such a request is like GET, except that the server will only return the response headers and not the entire page. The fsockopen() line connects to the server; the HEAD $path line here requests a specific page. This could be just / or /somefolder/somepage.php.

Figure: A HEAD request returns only the basic headers for a page.

Figure: A normal (GET) request returns the entire page (this figure just shows the first few lines of the HTML source code returned).

The \r\n code is required for properly formatting the request.
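To see exactly what gets written to the socket, the request can be assembled by a small function and inspected as a string. The build_head_request() name is hypothetical, and the optional $query argument is an addition of this sketch: the script in this section sends only the path, so any query string in the URL is left out of the request.

```php
<?php
// Hypothetical helper: assemble the text of a HEAD request.
function build_head_request($host, $path = '/', $query = '') {

    // Optionally append the query string to the requested path:
    $target = ($query !== '') ? "$path?$query" : $path;

    // Each header line ends with \r\n; a blank line ends the request:
    $send = "HEAD $target HTTP/1.1\r\n";
    $send .= "Host: $host\r\n";
    $send .= "Connection: Close\r\n\r\n";

    return $send;
}
```

The returned string could be passed to fwrite() in place of $send. Header names are not case-sensitive, so Host and HOST are equivalent.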

7. Retrieve the response code:

$data = fgets($fp, 128);
fclose($fp);
list($response, $code) = explode(' ', $data);

Once the URL has been hit with a header, it will respond with its own HTTP headers. This code will read in the first line of the response (fgets() stops at the end of the line or after 127 bytes, whichever comes first) and then break that string down into an array. The second element returned will be the HTTP code. Table 10.3 lists some of the possible response codes.
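The first line of an HTTP response has the form "HTTP/1.1 200 OK": the protocol, the status code, and a short reason phrase. As a sketch of the parsing step on its own, the hypothetical parse_status_line() function below returns the status code as an integer, or false if the line does not look like a status line. That extra validation is an assumption of this sketch, not something the script does.

```php
<?php
// Hypothetical helper: pull the status code out of the first response line.
function parse_status_line($line) {

    // Split into at most three parts: protocol, code, reason phrase.
    $parts = explode(' ', trim($line), 3);

    // The second part must exist and be purely numeric:
    if (count($parts) < 2 || !ctype_digit($parts[1])) {
        return false;
    }

    return (int) $parts[1];
}
```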

8. Return the code and a class message:

if ($code == 200) {
return array($code, 'good');
} else {
return array($code, 'bad');
}

This function should indicate, via its return values, what code was received and whether that code is good or bad (these strings match up to the CSS classes). An HTTP status code of 200 is considered normal (OK, technically); this function treats anything else as a problem.

You could reasonably treat other status codes as acceptable, too, including other numbers in the 200s and 300s.
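A more lenient check along those lines might look like the following sketch. The classify_status() name is hypothetical, and treating every code from 200 through 399 as good is one possible policy, not the one used by the script in this section.

```php
<?php
// Hypothetical helper: map a status code to a CSS class name.
function classify_status($code) {

    // Non-numeric values (such as an error message) become 0:
    $code = (int) $code;

    // Accept the 200s (success) and the 300s (redirection):
    return ($code >= 200 && $code < 400) ? 'good' : 'bad';
}
```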

9. Finish the conditional begun in Step 5 and the function:

} else {
return array($errstr, 'bad');
}
} // End of check_url() function.

If a socket connection was not made, the returned error message will be sent back from the check_url() function.

10. Create a list of URLs:

$urls = array(
'http://www.larryullman.com/',
'http://www.larryullman.com/wp-admin/',
'http://www.yiiframework.com/tutorials/',
'http://video.google.com/videoplay?docid=-5137581991288263801&q=loose+change'
);

For sake of simplicity, I’m creating an array of hard-coded URLs. You might retrieve your own URLs from a database or from a file instead.

11. Print a header and adjust the PHP script’s time limit:

echo '<h2>Validating URLs</h2>';
set_time_limit(0);

Making these socket connections can take some time, especially if you have a lot of URLs to validate. By calling the set_time_limit() function with a value of 0, the PHP script is given limitless time to do its thing.

12. Validate each URL:

foreach ($urls as $url) {
list($code, $class) = check_url($url);
echo "<p><a href=\"$url\" target=\"_new\">$url</a> (<span class=\"$class\">$code</span>)</p>\n";
}

The foreach loop goes through each URL in the array. Then the check_url() function is called. It returns two values: the code (or an error message) and the CSS class name to use (either good or bad). Then the URL is printed, as a link, followed by the code or error message.

13. Finish the PHP and the HTML:

?>
</body>
</html>

14. Save the file as check_urls.php, place it in your Web directory, and test in your Web browser.

Figure: How the validation panned out for the provided four URLs.


Tip

Another benefit that fsockopen() has over the fopen() method used in the first section of the chapter is that the fopen() technique will fail unless PHP’s allow_url_fopen setting is true.



Tip

This is just one example of using sockets in PHP. You can create your own socket server using PHP and the socket functions. If you don’t already know why you might want to do this, you’ll likely never need to touch these functions. But for more information, see www.php.net/sockets.


Performing IP Geolocation

One of the questions that I am commonly asked is how to identify in which country a user resides. Although the server where your PHP script is housed could be anywhere in the world and the user could be located anywhere in the world, it is still possible to make a geographic match.

The premise is this: Every computer must have an IP address to have Internet access (or to connect to any network). An Internet service provider (ISP) assigns a computer an IP address from a pool of valid addresses only they have access to. By knowing a computer’s IP address, which PHP stores in $_SERVER['REMOTE_ADDR'], you can determine the ISP and, therefore, the country—hence, the name IP geolocation. New GeoIP databases can even predict the city and state (or territory or whatnot), although with less accuracy.

To perform IP geolocation, you must have access to a GeoIP database (see the sidebar). In this section, I’ll use a simple online service, provided by freegeoip.net (www.freegeoip.net). Besides being free and not requiring any installation on your server, the service is easy to use: just perform a GET request of a specific URL. As you’ve now seen in the chapter, such a request can be made using either fopen() or fsockopen().

This next example will make use of another network-related PHP function. The gethostbyname() function returns the IP address for a given domain name.
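One detail worth knowing is that gethostbyname() does not return false on failure; it returns the host name unchanged. The sketch below wraps the function to make failures easier to detect. The resolve_host() name is hypothetical, and passing values that are already IP addresses straight through is a design choice of this sketch.

```php
<?php
// Hypothetical helper: return the IPv4 address for a host name, or false.
function resolve_host($host) {

    // Already an IP address? Nothing to look up.
    if (filter_var($host, FILTER_VALIDATE_IP)) {
        return $host;
    }

    // On failure, gethostbyname() returns the unmodified host name:
    $ip = gethostbyname($host);

    return ($ip === $host) ? false : $ip;
}
```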

To find a user’s location

1. Create a new PHP script in your text editor or IDE, to be named ip_geo.php, beginning with the HTML (Script 10.3):

<!doctype html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>IP Geolocation</title>
<link rel="stylesheet" href="style.css">
</head>
<body>
<?php # Script 10.3 - ip_geo.php

2. Begin defining a function:

function show_ip_info($ip) {
$url = 'http://freegeoip.net/csv/' . $ip;

This function will perform the request of the service and output the results. It takes an IP address as its lone argument. That IP address is added to the URL to be requested.

As you can see at the freegeoip.net Web site, the URL should be in the format http://freegeoip.net/{data_format}/{IP_address}. The data formats returned by the service can be CSV, XML (Extensible Markup Language), or JSON (JavaScript Object Notation). Here, the request is for the data to be returned in CSV format.

Script 10.3. This script fetches geographic location information about the current user based on an IP address.


<!doctype html>
<html lang="en">
<head>
    <meta charset="utf-8">
    <title>IP Geolocation</title>
    <link rel="stylesheet" href="style.css">
</head>
<body>
<?php # Script 10.3 - ip_geo.php
// This page uses a Web service to retrieve a user's geographic location.

// This function will perform the IP Geolocation request:
function show_ip_info($ip) {

    // Identify the URL to connect to:
    $url = 'http://freegeoip.net/csv/' . $ip;

    // Open the connection:
    $fp = fopen($url, 'r');

    // Get the data:
    $read = fgetcsv($fp);

    // Close the "file":
    fclose($fp);

    // Print whatever about the IP:
    echo "<p>IP Address: $ip<br>
    Country: $read[2]<br>
    City, State: $read[5], $read[3]<br>
    Latitude: $read[7]<br>
    Longitude: $read[8]</p>";

} // End of show_ip_info() function.

// Get the client's IP address:
echo '<h2>Our spies tell us the following information about you</h2>';
show_ip_info($_SERVER['REMOTE_ADDR']);

// Print something about a site:
$url = 'www.entropy.ch';
echo "<h2>Our spies tell us the following information about the URL $url</h2>";
show_ip_info(gethostbyname($url));

?>
</body>
</html>


3. Make the request and read in the result:

$fp = fopen($url, 'r');
$read = fgetcsv($fp);
fclose($fp);

Since the service returns CSV data, the same code as in get_quote.php can be used to read and parse it.

4. Print the results:

echo "<p>IP Address: $ip<br>
Country: $read[2]<br>
City, State: $read[5], $read[3]<br>
Latitude: $read[7]<br>
Longitude: $read[8]</p>";

The Web service returns CSV data, which the code in Step 3 turns into an array. Once you know that result, outputting the information is just a matter of referencing the correct indexes.

Figure: The array of data returned by the Web service.
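Because bare numeric indexes are easy to mix up, one option is to map them to named keys before printing. This sketch uses only the index positions referenced in Step 4 (2, 3, 5, 7, and 8); the label_geo_fields() name and the key names are hypothetical.

```php
<?php
// Hypothetical helper: give names to the array indexes used in Step 4.
function label_geo_fields(array $read) {

    // Index positions as used by the echo statement in Step 4:
    $map = array(
        'country' => 2,
        'state' => 3,
        'city' => 5,
        'latitude' => 7,
        'longitude' => 8
    );

    // Copy each value, using null if the service returned too few fields:
    $info = array();
    foreach ($map as $label => $index) {
        $info[$label] = isset($read[$index]) ? $read[$index] : null;
    }

    return $info;
}
```

The function would be called with the array returned by fgetcsv(), after which $info['city'] reads more clearly than $read[5].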

5. Complete the function:

}

6. Get the user’s IP address:

echo '<h2>Our spies tell us the following information about you</h2>';
show_ip_info($_SERVER['REMOTE_ADDR']);

Again, PHP will store the user’s IP address in $_SERVER['REMOTE_ADDR']. This value just needs to be passed to the function that will use it for the IP geolocation call.

7. Identify a URL to report on and get its IP address:

$url = 'www.entropy.ch';
echo "<h2>Our spies tell us the following information about the URL $url</h2>";
show_ip_info(gethostbyname($url));


Choosing an IP Geolocation Option

In this chapter, I chose to use the freegeoip.net Web service as the IP geolocation source for two reasons: it’s free and it’s easy to use. But free comes at a cost: this service is unlikely to be as accurate or fast as some other options. If you’re using IP geolocation on a live site, particularly an active and/or commercial one, you’ll want to consider other sources.

If you search the Internet, you’ll find plenty of alternatives, but the one I most commonly use is MaxMind (www.maxmind.com). MaxMind provides both free and commercial versions of an IP database that can be downloaded and installed on your computer. By using a local database, you’ll get better performance and not be susceptible to network issues.

MaxMind’s databases are not that hard to install and use, and there are instructions on their site. The free database is perfectly fine for most people, but you can pay a modest amount to use the more accurate commercial version.


Besides reporting on the current user, the script will also fetch the information for a Web site. That is to say, the script will try to identify the physical location of the server on which that particular site is running. In this case, I’m choosing Marc Liyanage’s invaluable site, www.entropy.ch.

8. Complete the page:

?>
</body>
</html>

9. Save the file as ip_geo.php, place it in your Web directory, and test in your Web browser.

Figure: The IP geolocation results for my IP address and the URL www.entropy.ch.

10. Hop into a plane, train, or automobile; travel to another country; get online; and retest the script in your Web browser.

Figure: Running the script again, after flying to Saudi Arabia. I also changed the URL to www.LarryUllman.com to see those results.

Alternatively, you could insert another IP address in place of $_SERVER['REMOTE_ADDR'].


Tip

The trick to using any Web service is understanding what URL to use and what the result will be. For debugging purposes, try to load the service in your Web browser to confirm the results, or have the PHP script output them.



Tip

IP addresses aren’t always reliable because, for example, multiple users on the same network could potentially be presented as having the same IP address. In this particular case, however, that particular problem wouldn’t be a hindrance.



Tip

One resource I found suggested that IP geolocation is very accurate on the country level, probably close to 95 percent. On the city and state level, that accuracy may dip down to 50–80 percent, depending on the database being used. In my case, it did not accurately pick the city but suggested one about 20 miles away. As I suggest in the sidebar, using a commercial database would garner more accurate results.



Tip

If you need to find out the hostname associated with an IP address, use the corresponding gethostbyaddr() function.



Tip

If a URL might be on multiple servers, the gethostbynamel() function returns all the possible IP addresses. You can then check one or every IP.


Using cURL

The next network-related topic to be discussed in this chapter is a technology called cURL. This utility, which stands for client URLs (and is also written as just curl or Curl), is a command-line tool for working with URLs. With cURL you can access Web sites and FTP files, and do much, much more. cURL provides an excellent way to interact with payment gateways for an e-commerce site. You can even use cURL to update your Facebook status or post to your blog!

PHP can invoke cURL via the shell_exec() and other system functions. But PHP also supports libcurl, a cURL library, which I’ll talk about here.

The process starts by calling curl_init(), providing to this function the name of the URL to be accessed:

$curl = curl_init('www.example.com');

The value returned by the function should be assigned to a variable, which will act as a pointer or a handle to the transaction.

Next, the curl_setopt() function is used (a lot) to set any options for the request. The syntax is

curl_setopt($curl, CONSTANT, value);

Unfortunately, there are way too many options to even provide a subset here. In the following example I’ll highlight a handful of them. If you like cURL, check out the PHP manual for the full list of settings.

After setting all the options (and note that you can set them in any order), use curl_exec() to execute the transaction:

$result = curl_exec($curl);

You should assign the result of the curl_exec() command to a variable, in case you need to print the result.

Finally, close the connection:

curl_close($curl);
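Put together, the four steps can be sketched as a single function. The fetch_url() name and its return format are hypothetical. The two options used here, CURLOPT_RETURNTRANSFER and CURLOPT_TIMEOUT, are real cURL settings, and curl_error() returns a description of the last error for the handle. The sketch assumes PHP’s cURL extension is installed.

```php
<?php
// Hypothetical helper: perform a simple request with cURL.
// Returns array(body, error), where body is false on failure.
function fetch_url($url, $timeout = 5) {

    // Start the process:
    $curl = curl_init($url);

    // Return the data instead of outputting it, and limit the wait:
    curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($curl, CURLOPT_TIMEOUT, $timeout);

    // Execute the transaction:
    $result = curl_exec($curl);

    // Fetch any error message before closing the handle:
    $error = ($result === false) ? curl_error($curl) : '';

    // Close the connection:
    curl_close($curl);

    return array($result, $error);
}
```

A call would look like list($body, $error) = fetch_url('http://www.example.com/');.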

The great thing about cURL is that it can be used to do everything that the other examples in the chapter also accomplish. But for this next example, let’s use it for something that takes considerably more work with fopen(), fsockopen(), and the rest: open a Web page and post data to it (as if the script submitted a form via the POST method).

This first script will post arbitrary data to another page. That receiving page will be written in the subsequent example.

To use cURL

1. Create a new PHP script in your text editor or IDE, to be named curl.php, beginning with the HTML (Script 10.4):

<!doctype html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>Using cURL</title>
<link rel="stylesheet" href="style.css">
</head>
<body>
<h2>cURL Results:</h2>
<?php # Script 10.4 - curl.php

Script 10.4. The cURL library is used by PHP to post data to a page.


<!doctype html>
<html lang="en">
<head>
    <meta charset="utf-8">
    <title>Using cURL</title>
    <link rel="stylesheet" href="style.css">
</head>
<body>
<h2>cURL Results:</h2>
<?php # Script 10.4 - curl.php
// This page uses cURL to post data to a Web service.

// Identify the URL:
$url = 'http://localhost/service.php';

// Start the process:
$curl = curl_init($url);

// Tell cURL to fail if an error occurs:
curl_setopt($curl, CURLOPT_FAILONERROR, 1);

// Allow for redirects:
curl_setopt($curl, CURLOPT_FOLLOWLOCATION, 1);

// Assign the returned data to a variable:
curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);

// Set the timeout:
curl_setopt($curl, CURLOPT_TIMEOUT, 5);

// Use POST:
curl_setopt($curl, CURLOPT_POST, 1);

// Set the POST data:
curl_setopt($curl, CURLOPT_POSTFIELDS, 'name=foo&pass=bar&format=csv');

// Execute the transaction:
$r = curl_exec($curl);

// Close the connection:
curl_close($curl);

// Print the results:
print_r($r);

?>
</body>
</html>


2. Begin the cURL transaction:

$url = 'http://localhost/service.php';
$curl = curl_init($url);

You don’t have to assign the URL to a variable prior to the curl_init() line, of course. But this URL does need to be a valid value. The destination itself, service.php, will be written in the next sequence of steps. If you’re not using localhost, with the default port, and the Web root directory, you’ll need to change the URL accordingly.

3. Tell cURL to fail if an error occurs:

curl_setopt($curl, CURLOPT_FAILONERROR, 1);

The first of the options is CURLOPT_FAILONERROR. By setting this to true (or 1), you tell cURL to treat an HTTP error response (a status code of 400 or higher) as a failure and stop the process (rather than continuing on blindly).

4. Tell cURL to allow for redirects:

curl_setopt($curl, CURLOPT_FOLLOWLOCATION, 1);

This second option sets whether or not server redirections—think of a PHP header('Location: somepage.php') call—should stop the transaction or redirections should be followed. Here, I’m saying to follow redirections.

5. Opt to assign the returned data to a variable:

curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);

If you won’t use the data returned by a cURL request, you don’t need to enable this option; without it, curl_exec() sends the response straight to the output. In this script, the data will be assigned to a variable and then printed for verification, so this value is set to 1.

6. Set the timeout:

curl_setopt($curl, CURLOPT_TIMEOUT, 5);

This is the maximum amount of time to attempt the transaction, in seconds. Five seconds may not seem like much in the real world, but in Internet time, it’s an eon!

7. Tell cURL to use the POST method:

curl_setopt($curl, CURLOPT_POST, 1);

In this example, data will be posted to the page (http://localhost/service.php) as if a form were submitted. Alternatively, cURL can perform a GET request, as some of the other examples in this chapter do.

8. Set the POST data:

curl_setopt($curl, CURLOPT_POSTFIELDS, 'name=foo&pass=bar&format=csv');

The CURLOPT_POSTFIELDS option is where you set the POST data. The syntax is a series of name=value pairs, separated by ampersands. For the service.php script (to be written next), you can pass almost anything and the service will report back the results.

Note that if you wanted to send more complex values through cURL, you’d want to URL-encode the data.

9. Execute the transaction:

$r = curl_exec($curl);

10. Close the connection:

curl_close($curl);

11. Print the results:

print_r($r);

Everything returned by the request is assigned to the $r variable. I’ll just dump that out for debugging purposes.

12. Complete the page:

?>
</body>
</html>

13. Save the file as curl.php and place it in your Web directory.

Note that you can’t run it yet, because you have to write service.php first.
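
The URL-encoding mentioned in Step 8 can be handled by PHP’s http_build_query() function, which turns an associative array into an encoded string of name=value pairs. The following is only a sketch: the field values are made up for illustration and are not part of curl.php.

```php
<?php
// Hypothetical POST values. The space, ampersand, and equals sign
// would break a hand-written name=value string.
$fields = array(
    'name' => 'Foo & Bar',
    'pass' => 'p=ss word',
    'format' => 'json'
);

// Build the URL-encoded string, explicitly using & as the separator:
$post = http_build_query($fields, '', '&');

echo $post; // name=Foo+%26+Bar&pass=p%3Dss+word&format=json

// In curl.php, the string would then be used like so:
// curl_setopt($curl, CURLOPT_POSTFIELDS, $post);
```

Building the string from an array also makes it easier to change the POST data later, as only the array values need to be edited.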


Tip

If a page is protected by HTTP authentication, use this option (obviously replacing the username and password values with proper ones):

curl_setopt($curl, CURLOPT_USERPWD, 'username:password');



Tip

The curl_getinfo() function returns an array of information about the transaction. If you want to use it, you must call it before closing the connection.

Figure: Some of the information about the most recent cURL request.



Tip

The cURL utility can also be used to send and receive cookies, handle file uploads, work over SSL connections, and even transfer files over FTP.



Tip

Use the curl_errno() and curl_error() functions to retrieve the error number and message, should one occur.
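
As a rough sketch of how these tips fit together, the following hypothetical helper function (post_with_curl() is my own name, not part of curl.php) sets the same options in one step using curl_setopt_array() and gathers the error number, error message, and transaction information before the connection is closed. The sample request assumes that nothing is listening on port 1 of the local machine, so that it fails on purpose.

```php
<?php
// A hypothetical helper (not part of curl.php) that posts data to a URL
// and reports any cURL error along with the transaction details.
function post_with_curl($url, $post) {

    $curl = curl_init($url);

    // Set the same options as curl.php, all at once:
    curl_setopt_array($curl, array(
        CURLOPT_FAILONERROR => true,
        CURLOPT_FOLLOWLOCATION => true,
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_TIMEOUT => 5,
        CURLOPT_POST => true,
        CURLOPT_POSTFIELDS => $post
    ));

    $body = curl_exec($curl); // false on failure

    // Gather the error and transaction details before closing:
    $result = array(
        'body' => $body,
        'errno' => curl_errno($curl),
        'error' => curl_error($curl),
        'info' => curl_getinfo($curl)
    );

    // curl_close() has no effect as of PHP 8.0, so only call it on older versions:
    if (PHP_VERSION_ID < 80000) {
        curl_close($curl);
    }

    return $result;
}

// Assumption: nothing listens on port 1 locally, so this request fails.
$result = post_with_curl('http://127.0.0.1:1/service.php', 'name=foo&format=csv');

if ($result['body'] === false) {
    echo "cURL error {$result['errno']}: {$result['error']}\n";
} else {
    echo "HTTP status: {$result['info']['http_code']}\n";
    echo $result['body'];
}
```

Checking for a return value of exactly false matters here, because a successful request could legitimately return an empty string.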


Creating Web Services

The previous four examples in this chapter all demonstrate various ways you can use PHP to interact with another server. The IP geolocation example specifically makes use of a Web service. Web service is a generic term for a server resource that provides a function that, unlike a normal Web page, is meant to be accessed directly by another computer, not a user. For example, a PHP script on server A (such as ip_geo.php) would make a request of a script on server B (in that case, the freegeoip.net server).

The principle behind a service is simple, but services can be implemented in many different ways. In fact, entire books have been written on the subject. But in this chapter, I’ll present an overview of the Web services field and then show you how to implement a simple type of Web service.

Introduction to Web services

Web services can vary greatly in their complexity. In fact, that complexity is well represented by the multiple acronyms involved. Loosely speaking, you can group all Web services into two broad categories: complex and simple.

Complex services are what have been historically known as true “Web services” (as opposed to the generic term). In a complex service, the client is able to dynamically discover and use the service. For example, the server may use Web Services Description Language (WSDL) to describe the service provided. That WSDL document is then readable by clients tapping into the service.

Complex Web services often transmit data using custom, non-scalar types. This might require a protocol such as Simple Object Access Protocol (SOAP). This means that instead of just transmitting, say, plain text or XML between the client and the server, the server may send back data in an agreed-upon object format.

As you can tell, complex Web services tend to have tighter integration between the client and the server. And there may be several iterations of communication over the course of a transaction.

Conversely, simple Web services are stateless: just a basic request-response dynamic. The client makes a standard request in the hope that the server understands it and is able to reply. These are also known as application programming interface (API)-based services, where the developer has to find the documentation that explains how to use the service. All of the examples in this chapter have been of this type.

A popular type of simple service is called RESTful, a name derived from Representational State Transfer (REST). These are normally HTTP requests, with data often being passed to the service, which will in turn impact the data returned by the service. For example, an IP address and a desired data format are sent to the IP geolocation service, which then returns information in the given format.

Now that you’ve seen different ways of having a PHP script interact with a simple Web service (i.e., having PHP act as the client), let’s create a PHP script to act as a barebones service. First, though, you need to know how to return different types of data using a PHP script.

Returning types of data

For the most part, Web developers use PHP to create HTML content. But a service does not normally create HTML; it outputs data. In both cases, PHP is still just printing the output, but when using a script as a service, the PHP script has to take the extra step of indicating its alternative usage. In other words, the PHP script has to communicate the type of content being output. Normally a server associates the content type of a PHP script with HTML. To change that, send a content-type header, indicating the type to be expected.

If the data being returned by the service is in plain text format, you would use

header('Content-Type: text/plain');

Note that, as with any time you use header(), this line must be called prior to anything being sent out.

If the PHP script is outputting data in CSV format, you’d use

header('Content-Type: text/csv');

More complex data is normally transmitted using either XML or JSON. Those content types are

header('Content-Type: text/xml');

and

header('Content-Type: application/json');

XML, which has long been the standard for representing complex data, is discussed in great detail in Chapter 13, “XML and PHP.” In that chapter, you’ll see how to have a PHP script return XML data, so I won’t explain how to do so any further here.

The JSON format is most commonly used in services that expect to be accessed via JavaScript, although that is not required. To have PHP output data in JSON format, you just need to invoke the json_encode() function:

echo json_encode($data);

The json_encode() function is part of the JSON extension, built into PHP as of version 5.2.
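
As a brief illustration, here is a sketch of both sides of a JSON exchange: the encoding a service would perform and the decoding a requesting script could do with the response. The sample array is an assumption on my part, loosely modeled on what this chapter’s service will return.

```php
<?php
// Sample data (assumed for illustration):
$data = array(
    'timestamp' => 1356998400, // a fixed value instead of time()
    'name' => 'foo',
    'format' => 'json'
);

// What the service would print:
$json = json_encode($data);
echo $json; // {"timestamp":1356998400,"name":"foo","format":"json"}

// What the requesting script could do with the response:
$decoded = json_decode($json, true); // true = return an associative array
```

Passing true as the second argument to json_decode() returns an associative array; without it, the function returns an object.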

Creating a simple service

Now that you’ve seen how to output different, standard data types from a PHP script, let’s create an example. To demonstrate as much as possible in one fell swoop, this next script will be quite flexible, able to output data in one of four possible formats:

• Plain text

• CSV

• JSON

• XML (after some additional work)

For the data itself, the script will merely return the data provided to the service during the request. This is somewhat trivial, although it does make for a good debugging tool. Still, with a basic understanding of PHP and MySQL, it would be easy to have this script return information from a database or other source.

Script 10.5. This simple Web service returns data in different formats, based on how it’s accessed.


<?php # Script 10.5 - service.php
// This script acts as a simple Web service.
// The script only reports back the data received, along with a bit of extra information.

// Check for proper usage:
if (isset($_POST['format'])) {

    // Switch the content type based upon the format:
    switch ($_POST['format']) {
        case 'csv':
            $type = 'text/csv';
            break;
        case 'json':
            $type = 'application/json';
            break;
        case 'xml':
            $type = 'text/xml';
            break;
        default:
            $type = 'text/plain';
            break;
    }

    // Create the response:
    $data = array();
    $data['timestamp'] = time();

    // Add back in the received data:
    foreach ($_POST as $k => $v) {
        $data[$k] = $v;
    }

    // Format the data accordingly:
    if ($type == 'application/json') {
        $output = json_encode($data);

    } elseif ($type == 'text/csv') {

        // Convert to a string:
        $output = '';
        foreach ($data as $v) {
            $output .= '"' . $v . '",';
        }

        // Chop off the final comma:
        $output = substr($output, 0, -1);

    } elseif ($type == 'text/plain') {
        $output = print_r($data, 1);
    }

} else { // Incorrectly used!
    $type = 'text/plain';
    $output = 'This service has been incorrectly used.';
}

// Set the content-type header:
header("Content-Type: $type");
echo $output;


To create a service

1. Create a new PHP script in your text editor or IDE, to be named service.php (Script 10.5):

<?php # Script 10.5 - service.php

This script will not output any HTML!

2. Check that a format was passed to the service in POST:

if (isset($_POST['format'])) {

The only requirement of this script is that a desired data format is sent via POST.

3. Identify the content type based on the desired format:

switch ($_POST['format']) {
case 'csv':
$type = 'text/csv';
break;
case 'json':
$type = 'application/json';
break;
case 'xml':
$type = 'text/xml';
break;
default:
$type = 'text/plain';
break;
}

These values match those already explained.

4. Start building up the response:

$data = array();
$data['timestamp'] = time();

The response will be an array of data, starting with the current timestamp.

5. Add the received data to the array:

foreach ($_POST as $k => $v) {
$data[$k] = $v;
}

To give this service something to do, it will report back the data received.

6. Create the output in the proper format:

if ($type == 'application/json') {
$output = json_encode($data);

The next step is to turn the data into the proper format based on what the requester submitted. For JSON data, this just means running the data through the json_encode() function.

7. Create the output in CSV format:

} elseif ($type == 'text/csv') {
$output = '';
foreach ($data as $v) {
$output .= '"' . $v . '",';
}
$output = substr($output, 0, -1);

The CSV data should obviously have each piece of information separated by commas. To make the data more reliable, however, each value will be wrapped in quotes (e.g., in case there are commas within the values).

8. Create the output in plain text format:

} elseif ($type == 'text/plain') {
$output = print_r($data, 1);
}

If the requested data should be in plain text format, it’ll just be a variable dump.

9. Complete the conditional begun in Step 2:

} else {
$type = 'text/plain';
$output = 'This service has been incorrectly used.';
}

If no format was provided via POST, a plain text message will indicate that the service was incorrectly used.

10. Set the content-type header:

header("Content-Type: $type");

11. Send the data:

echo $output;

Remember that “sending” is really just printing, as the output of the script is what the requesting script will receive.

12. Save the file as service.php and place it in your Web directory.

13. Test the service by running curl.php in your Web browser.

Figure: The script displays the service results, which are in the requested CSV format.

14. Change the POST data in curl.php and rerun the script.

Figure: Different data from another service request, now in JSON format.
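
One limitation of the CSV code in Step 7 is that a value containing a double quotation mark would end its quoted field early. A minimal sketch of one way around this follows, assuming the common CSV convention of doubling any embedded quotation marks. The csv_line() function name and the sample values are my own, not part of service.php.

```php
<?php
// Hypothetical helper: turn an array of scalar values into one line of CSV.
function csv_line(array $values) {
    $quoted = array();
    foreach ($values as $v) {
        // Double any embedded quotes, then wrap the value in quotes:
        $quoted[] = '"' . str_replace('"', '""', (string) $v) . '"';
    }
    // Joining with implode() means there is no final comma to chop off:
    return implode(',', $quoted);
}

// Sample data (assumed for illustration):
$line = csv_line(array(1356998400, 'foo', 'say "hi", ok'));

echo $line; // "1356998400","foo","say ""hi"", ok"
```

PHP’s built-in fputcsv() function performs similar quoting when writing to a file or stream, should you prefer not to build the string yourself.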


Tip

The curl.php and service.php scripts would not normally be on the same server, but it’s fine for them to be together for testing purposes.
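
The script as written identifies the XML content type but never creates any XML output. As a rough sketch of the “additional work” involved, the following hypothetical function builds a small XML document from an array. The xml_response() name and the response and item element names are assumptions of mine, and the function expects every value to be a scalar.

```php
<?php
// Hypothetical helper: represent an array of scalar values as XML.
function xml_response(array $data) {
    $output = '<?xml version="1.0" encoding="utf-8"?>' . "\n<response>\n";
    foreach ($data as $k => $v) {
        // Escape special characters so the markup stays well formed:
        $name = htmlspecialchars((string) $k, ENT_QUOTES | ENT_XML1, 'UTF-8');
        $value = htmlspecialchars((string) $v, ENT_QUOTES | ENT_XML1, 'UTF-8');
        $output .= '  <item name="' . $name . '">' . $value . "</item>\n";
    }
    $output .= '</response>';
    return $output;
}

// Sample data (assumed for illustration):
$xml = xml_response(array('timestamp' => 1356998400, 'name' => 'a < b & c'));

echo $xml;
```

In service.php, such a function could be called from an additional elseif ($type == 'text/xml') clause, so that $output is always defined before it is printed. Storing the keys in a name attribute, rather than using them as element names, avoids problems with keys that are not valid XML names.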


Review and Pursue

If you have any problems with these sections, either in answering the questions or pursuing your own endeavors, turn to the book’s supporting forum (www.LarryUllman.com/forums/).

Review

• How do you use fopen() to access another Web site? What restrictions exist when you do so? (See “Accessing Other Web Sites.”)

• What are sockets? What is a port? What PHP function do you use to communicate over sockets? (See “Working with Sockets.”)

• What are HTTP status codes? (See “Working with Sockets.”)

• What does the parse_url() function do? (See “Working with Sockets.”)

• What does the set_time_limit() function do? Why is it necessary in the check_urls.php script? (See “Working with Sockets.”)

• In what variable will PHP store a user’s IP address? (See “Performing IP Geolocation.”)

• What PHP function can be used to find the IP address associated with a domain name? (See “Performing IP Geolocation.”)

• What is cURL? What kinds of things can it be used for? (See “Using cURL.”)

• What are Web services? How does a PHP script used as a Web service differ from one that creates an HTML page? (See “Creating Web Services.”)

Pursue

• If you want to practice your OOP skills, rewrite any of this chapter’s examples using classes.

• Look into the Zend Framework’s tools for interacting with other Web sites.

• Learn more about sockets and ports.

• Update check_urls.php to qualify a wider range of status codes as “good.”

• If you want, update the ip_geo.php script to use fsockopen() instead of fopen(). Note that you’ll want to perform a GET request, not a HEAD request as in check_urls.php.

• Learn more about using cURL.

• Rewrite any of the other chapter examples using cURL.

• Create a new, more useful service script.