Tuesday, January 16, 2007

Getting started with objects with PHP V5

Why you need to know objects and classes, and how to use them

This article describes the fundamentals of objects and classes in PHP V5, from the very basics through to inheritance, for experienced object-oriented programmers and those who have not yet been introduced to objects.

As a PHP programmer, you know variables and functions inside-out. Classes and objects, though, might be a different matter. It's possible to create perfectly good systems without defining a single class. Even if you decide to stay away from object-oriented programming in your own code, you will likely need to know object-oriented programming. For example, if you use a third-party library, such as those made available by PHP Extension and Application Repository (PEAR), you will find yourself instantiating objects and calling methods.

What are classes and objects?

Put simply, a class is a discrete block or bundle of variables and methods. These components usually coalesce around a single responsibility or set of responsibilities. In this article, you will create a class that collects methods for querying and populating a dictionary of terms and values.

A class can be used directly as a neat way of organizing data and functionality, very much like a bunch of functions and variables. This is to ignore a powerful dimension, however. Classes can be used to generate multiple instances in memory. Such instances are called objects. Each object has access to the same set of functions (called methods in an object-oriented context), and variables (called properties or instance variables), but the actual value of each variable may differ from object to object.

Think about a unit in a role-playing game -- a tank, perhaps. A class may lay down a set of variables for tanks: defensive and offensive capability, range, health, and so on. The class may also define a set of functions, including move() and attack(). While the system contains one tank class, this class may be used to generate tens or hundreds of tank objects, each potentially with its own characteristics of health or range. A class, then, is a blueprint or template for use in generating objects.
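
As a rough sketch of the blueprint idea, the tank might look something like this in PHP. The Tank class, its property names, and its numbers are all invented here for illustration, and the syntax is explained in the sections that follow.

```php
<?php
// Hypothetical Tank class: one blueprint, many objects.
class Tank {
    public $health   = 100;
    public $range    = 5;
    public $position = 0;

    // Advance the tank by a number of squares.
    function move( $distance ) {
        $this->position += $distance;
    }

    // Reduce the target's health by a fixed, made-up amount.
    function attack( Tank $target ) {
        $target->health -= 10;
    }
}

$tank1 = new Tank();
$tank2 = new Tank();
$tank2->range = 8;   // each object holds its own values

$tank1->move( 3 );
$tank1->attack( $tank2 );

print "tank1: position {$tank1->position}, health {$tank1->health}\n";
print "tank2: position {$tank2->position}, health {$tank2->health}\n";
```

Both objects come from the same class, yet after a few method calls they no longer share the same state.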

Perhaps the easiest way to understand classes and objects is to create some.

A first class

You can create a class with the class keyword. At its simplest, a class consists of the keyword class, a name, and a code block:


class Dictionary {

}


The class name can contain any combination of letters and numbers, as well as the underscore character, but cannot begin with a number.

The Dictionary class in the previous example is perfectly legal, even if it is of limited use. So how do you use this class to create some objects?


$obj1 = new Dictionary();
$obj2 = new Dictionary();
$obj3 = new Dictionary();


In form at least, instantiating an object is similar to calling a function. As with a function call, you must supply parentheses. Like functions, some classes require that you pass them arguments. You must also use the new keyword. This tells the PHP engine that you wish to instantiate a new object. The returned object can then be stored in a variable for later use.
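
A small sketch can confirm that each use of new yields a separate object of the same type. It assumes nothing beyond the empty Dictionary class shown above.

```php
<?php
class Dictionary {
}

$obj1 = new Dictionary();
$obj2 = new Dictionary();

// Both variables hold Dictionary objects...
var_dump( $obj1 instanceof Dictionary );
var_dump( $obj2 instanceof Dictionary );

// ...but they are two distinct instances, so the identity check fails.
var_dump( $obj1 === $obj2 );
```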

Properties

Within the body of a class, you can declare special variables called properties. In PHP V4, properties had to be declared with the keyword var. This is still legal syntax, but mainly for the sake of backward compatibility. In PHP V5, properties should be declared public, private, or protected. You can read about these qualifiers in Keywords: Can we have a little privacy in here? But for now, declare all properties public in the examples. Listing 1 shows a class that declares two properties.

Listing 1. A class that declares two properties

class Dictionary {
    public $translations = array();
    public $type = "En";
}


As you can see, you can declare a property and assign its value at the same time. You can get a quick peek at the state of an object with the print_r() function. Listing 2 shows that a Dictionary object now has more to it.

Listing 2. A look at the Dictionary object


$en = new Dictionary();
print_r( $en );

If we run this script, we'll see output of:

Dictionary Object
(
    [translations] => Array
        (
        )

    [type] => En
)



You can access public object properties using the object operator '->'. So $en->type means the $type property of the Dictionary object referenced by $en. If you can access a property, it means that you can set and get its value. The code in Listing 3 creates two instances of the Dictionary class -- in other words, it instantiates two Dictionary objects. It changes the $type property of one object and adds translations to both:

Listing 3. Creating two instances of the Dictionary class



$en = new Dictionary();
$en->translations['TREE'] = "tree";

$fr = new Dictionary();
$fr->type = "Fr";
$fr->translations['TREE'] = "arbre";

foreach ( array( $en, $fr ) as $dict ) {
    print "type: {$dict->type} ";
    print "TREE: {$dict->translations['TREE']}\n";
}


The script outputs the following:


type: En TREE: tree
type: Fr TREE: arbre



So the Dictionary class is now a little more useful. Individual objects can store distinct sets of keys and values, as well as a flag that tells a client more about the kind of Dictionary this is.

Even though the Dictionary class is currently little more than a wrapper around an associative array, there is some clue to the power of objects here. At this stage, we could represent our sample data pretty well with plain associative arrays, as shown in Listing 4.

Listing 4. Sample data



$en = array(
    'translations' => array( 'TREE' => 'tree' ),
    'type' => 'En'
);

$fr = array(
    'translations' => array( 'TREE' => 'arbre' ),
    'type' => 'Fr'
);


Although this data structure fulfills the same purpose as the Dictionary class, it provides no guarantee of the structure. If you are passed a Dictionary object, you know it is designed to have a $translations property. Given an associative array, you have no such guarantee. This fact makes a query like $fr['translations']['TREE']; somewhat hit and miss, unless the code making the query is sure of the provenance of the array. This is a key point about objects: The type of an object is a guarantee of its characteristics.
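
One way to lean on that guarantee is a class type hint in a function signature. The sketch below is illustrative only: the translate() function is a hypothetical helper, not part of the article's Dictionary class.

```php
<?php
class Dictionary {
    public $translations = array();
    public $type = "En";
}

// The class type hint makes PHP reject any argument that is not a
// Dictionary, so the body can rely on $translations being declared.
function translate( Dictionary $dict, $term ) {
    if ( isset( $dict->translations[$term] ) ) {
        return $dict->translations[$term];
    }
    return null;   // unknown term
}

$fr = new Dictionary();
$fr->type = "Fr";
$fr->translations['TREE'] = "arbre";

print translate( $fr, 'TREE' ) . "\n";
```

Passing one of the arrays from Listing 4 to translate() would be refused by the PHP engine rather than failing somewhere inside the function.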

Although there are benefits to storing data with objects, you are missing an entire dimension. Objects can be things, but crucially they can also do things.

Methods

Put simply, methods are functions declared within a class. They are usually -- but not always -- called via an object instance using the object operator. Listing 5 adds a method to the Dictionary class and invokes it.

Listing 5. Adding a method to the Dictionary class


class Dictionary {
    public $translations = array();
    public $type = "En";

    function summarize() {
        $ret = "Dictionary type: {$this->type}\n";
        $ret .= "Terms: ".count( $this->translations )."\n";
        return $ret;
    }
}

$en = new Dictionary();
$en->translations['TREE'] = "tree";
print $en->summarize();


This produces the following output:


Dictionary type: En
Terms: 1


As you can see, the summarize() method is declared just as any function would be declared, except that it is done within a class. The summarize() method is invoked via a Dictionary instance using the object operator. It accesses properties to provide a short overview of the state of the object.

Notice the use of a feature new to this article. The $this pseudo-variable provides a mechanism for objects to refer to their own properties and methods. Outside of an object, there is a handle you can use to access its elements ($en, in this case). Inside an object, there is no such handle, so you must fall back on $this. If you find $this confusing, try replacing it in your mind with "the current instance" when you encounter it in code.
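
The same pseudo-variable works for calling methods as well as reading properties. In this sketch, countTerms() is a hypothetical helper added only to show one method calling another through $this.

```php
<?php
class Dictionary {
    public $translations = array();
    public $type = "En";

    // Hypothetical helper, added here for illustration.
    function countTerms() {
        return count( $this->translations );
    }

    function summarize() {
        // $this->type reads a property of the current instance;
        // $this->countTerms() calls a method on the current instance.
        $ret  = "Dictionary type: {$this->type}\n";
        $ret .= "Terms: " . $this->countTerms() . "\n";
        return $ret;
    }
}

$en = new Dictionary();
$en->translations['TREE'] = "tree";

$fr = new Dictionary();
$fr->type = "Fr";

print $en->summarize();
print $fr->summarize();
```

Within each call, $this stands for whichever object the method was invoked on, so the two summaries differ.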

Classes are often represented in diagrams using the Unified Modeling Language (UML). The details of the UML are beyond the scope of this article, but such diagrams are nonetheless an excellent way of visualizing class relationships. Figure 1 shows the Dictionary class as it stands. The class name lives in the top layer, properties in the middle, and methods at the bottom.


Figure 1. Dictionary class using the UML

The constructor

The PHP engine recognizes a number of "magic" methods. If they are defined, it invokes these methods automatically when the correct circumstances arise. The most commonly implemented of these methods is the constructor method. The PHP engine calls a constructor when the object is instantiated. It is the place to put any essential setup code for your object. In PHP V4, you create a constructor by declaring a method with the same name as that of the class. In V5, you should declare a method called __construct(). Listing 6 shows a constructor that requires a DictionaryIO object.

Listing 6. A constructor that requires a DictionaryIO object



class Dictionary {
    public $translations = array();
    public $type;
    public $dictio;

    function __construct( $type, DictionaryIO $dictio ) {
        $this->type = $type;
        $this->dictio = $dictio;
    }

    //...


To instantiate a Dictionary object, you need to pass a type string and a DictionaryIO object to its constructor. The constructor uses these arguments to set the new object's properties. Here is how you might now instantiate a Dictionary object:


$en = new Dictionary( "En", new DictionaryIO() );


The Dictionary class is now much safer than before. You know that any Dictionary object will have been initialized with the required arguments.
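
To try the constructor out, something has to stand in for DictionaryIO, which this article has not defined at this point. The sketch below uses an empty placeholder class for that purpose. It is an assumption made for the sake of a runnable example, not the real DictionaryIO.

```php
<?php
// Placeholder only: an empty class stands in for DictionaryIO.
class DictionaryIO {
}

class Dictionary {
    public $translations = array();
    public $type;
    public $dictio;

    // Called automatically by the PHP engine when new Dictionary(...) runs.
    function __construct( $type, DictionaryIO $dictio ) {
        $this->type = $type;
        $this->dictio = $dictio;
    }
}

$en = new Dictionary( "En", new DictionaryIO() );

print $en->type . "\n";
var_dump( $en->dictio instanceof DictionaryIO );
```

Omitting either argument, or passing something other than a DictionaryIO object as the second one, stops the script with an error instead of leaving a half-initialized object behind.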

Of course, there's no way yet to stop someone coming along later and changing the $type property or setting $dictio to null. Luckily, PHP V5 can help you there, too.

Keywords: Can we have a little privacy in here?

You have already seen the public keyword in relation to property declarations. This keyword denotes a property's visibility. In fact, the visibility of a property can be set to public, private, or protected. Properties that are public can be written to and read from outside the class. Properties that are private can only be accessed from within the class that declares them. Properties that are protected can only be accessed from within the current class or its children. (You will see this in action in the Inheritance section.) You can use private properties to really lock down your classes. If you declare your properties private and attempt to access them from outside the class' scope (as shown in Listing 7), the PHP engine will throw a fatal error.

Listing 7. Attempting to access your properties from outside the class' scope



class Dictionary {
    private $translations = array();
    private $dictio;
    private $type;

    function __construct( $type, DictionaryIO $dictio ) {
        $this->type = $type;
        $this->dictio = $dictio;
    }

    // ...
}

$en = new Dictionary( "En", new DictionaryIO() );
$en->dictio = null;



This outputs the following:


Fatal error: Cannot access private property Dictionary::$dictio in...


As a rule of thumb, you should make most properties private, then provide methods for getting and setting them if necessary. In this way, you can control a class' interface, making some data read-only, cleaning up or filtering arguments before assigning them to properties, and providing a clear set of rules for interacting with objects.
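
A minimal sketch of that rule of thumb follows. It assumes a simplified Dictionary with a one-argument constructor, and the getType() and setType() methods and the clean-up rule inside setType() are hypothetical.

```php
<?php
class Dictionary {
    private $type;

    function __construct( $type ) {
        $this->setType( $type );
    }

    // Read access to the private property.
    function getType() {
        return $this->type;
    }

    // Hypothetical rule: trim whitespace and normalize case
    // before the value reaches the property.
    function setType( $type ) {
        $this->type = ucfirst( strtolower( trim( $type ) ) );
    }
}

$fr = new Dictionary( "  FR " );
print $fr->getType() . "\n";
```

Dropping setType() from the class, or declaring it private, would make $type read-only for client code.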

You can modify the visibility of methods in the same way as properties, adding public, private, or protected to the method declaration. If a class needs to use some housekeeping methods that the outside world need not know about, for example, you can declare them private. In Listing 8, a get() method provides the interface for users of the Dictionary class to extract a translation. The class also needs to keep track of all queries and provides a private method, logQuery(), for this purpose.

Listing 8. A get() method provides the interface for users of the Dictionary class



function get( $term ) {
    $value = $this->translations[$term];
    $this->logQuery( $term, $value, "get" );
    return $value;
}

private function logQuery( $term, $value, $kind ) {
    // write log information
}


Declaring logQuery() as private simplifies the public interface and protects the class from having logQuery() called inappropriately. As with properties, any attempt to call a private method from outside the containing class causes a fatal error.
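
The following sketch fills in enough of the class to run. The in-memory $log array and the getLogSize() method are assumptions standing in for real logging, and the isset() guard for unknown terms is an addition. Because a direct call to the private method would halt the script, the example uses is_callable() to show the difference in visibility instead.

```php
<?php
class Dictionary {
    private $translations = array( 'TREE' => 'tree' );
    private $log = array();   // hypothetical in-memory log

    function get( $term ) {
        // Guard against unknown terms instead of reading a missing key.
        $value = isset( $this->translations[$term] )
            ? $this->translations[$term]
            : null;
        $this->logQuery( $term, $value, "get" );
        return $value;
    }

    // Hypothetical accessor so the effect of logging can be observed.
    function getLogSize() {
        return count( $this->log );
    }

    private function logQuery( $term, $value, $kind ) {
        $this->log[] = "$kind: $term => $value";
    }
}

$en = new Dictionary();
print $en->get( 'TREE' ) . "\n";

// Checked from outside the class: get() is callable, logQuery() is not.
var_dump( is_callable( array( $en, 'get' ) ) );
var_dump( is_callable( array( $en, 'logQuery' ) ) );
```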

Working in class context

The methods and properties you have seen so far all operate in object context. That is, you must access them using an object instance, via the $this pseudo-variable or an object reference stored in a standard variable. In some cases, you may find that it's more useful to access properties and methods via a class rather than an object instance. Class members of this kind are known as static.

To declare a static property, place the keyword static after the visibility modifier, directly in front of the property variable.

Listing 9 shows a single static property: $iodir, which holds the path to the default directory for saving and reading Dictionary data. Because this data is the same for all objects, it makes sense to make it available across all instances.

Listing 9. A single static $iodir property



class Dictionary {
    public static $iodir = ".";
    // ...
}


You can access a static property using the scope resolution operator, which consists of a double colon (::). The scope resolution operator should sit between the class name and the static property you wish to access.


print Dictionary::$iodir . "\n";
Dictionary::$iodir = "/tmp";


As you can see, there is no need to instantiate a Dictionary object to access this property.
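
Because the property belongs to the class rather than to any one object, a single assignment is visible from every instance. The sketch below illustrates this. The describeLocation() method is hypothetical and exists only to read the static property from inside an object.

```php
<?php
class Dictionary {
    public static $iodir = ".";
    public $type = "En";

    // Hypothetical method that reads the static property from an instance.
    function describeLocation() {
        return "{$this->type} dictionary saved in " . Dictionary::$iodir;
    }
}

$en = new Dictionary();
$fr = new Dictionary();
$fr->type = "Fr";

Dictionary::$iodir = "/tmp";   // one change, made on the class

print $en->describeLocation() . "\n";
print $fr->describeLocation() . "\n";
```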

The syntax for declaring and accessing static methods is similar. Once again, you should place the static keyword after the visibility modifier. Listing 10 shows two static methods that access the $iodir property, which is now declared private.

Listing 10. Two static methods that access the $iodir property



class Dictionary {
    private static $iodir = ".";
    // ...
    public static function setSaveDirectory( $dir ) {
        // Reject anything that is not a writable directory.
        if ( ! is_dir( $dir ) ||
             ! is_writable( $dir ) ) {
            return false;
        }
        self::$iodir = $dir;
        return true;
    }

    public static function getSaveDirectory() {
        return self::$iodir;
    }
    // ...
}


Users can no longer access the $iodir property directly. By creating special methods for accessing a property, you can ensure that any provided value is sane. In this case, the method checks that the given string points to a writable directory before making the assignment.

Notice that both methods refer to the $iodir property using the keyword self and the scope resolution operator. You cannot use $this in a static method because $this is a reference to the current object instance, but a static method is called via the class and not an object. If the PHP engine sees $this in a static method, it will throw a fatal error together with an informative message.

To call a static method from outside of its class, use the class name together with the scope resolution operator and the name of the method.


Dictionary::setSaveDirectory("/tmp");
print Dictionary::getSaveDirectory();
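
Client code may also want to know whether the new directory was accepted. The sketch below assumes that setSaveDirectory() returns true on success as well as false on failure, that the path /no/such/directory does not exist, and that the system temporary directory reported by sys_get_temp_dir() is writable.

```php
<?php
class Dictionary {
    private static $iodir = ".";

    public static function setSaveDirectory( $dir ) {
        if ( ! is_dir( $dir ) || ! is_writable( $dir ) ) {
            return false;
        }
        self::$iodir = $dir;
        return true;   // assumption: report success explicitly
    }

    public static function getSaveDirectory() {
        return self::$iodir;
    }
}

// A path that is assumed not to exist is rejected...
if ( ! Dictionary::setSaveDirectory( "/no/such/directory" ) ) {
    print "rejected; still using " . Dictionary::getSaveDirectory() . "\n";
}

// ...while the system temporary directory should normally be accepted.
if ( Dictionary::setSaveDirectory( sys_get_temp_dir() ) ) {
    print "now using " . Dictionary::getSaveDirectory() . "\n";
}
```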


There are two good reasons why you might want to use a static method. First of all, a utility operation may not require an object instance to do its job. By declaring it static, you save client code the overhead of creating an object. Second, a static method is globally available. This means that you can set a value that all object instances can access, and it makes static methods a great way of sharing key data across a system.

While static properties are often declared private to prevent meddling, there is one way of creating a read-only statically scoped property: You can declare a constant. Like its global cousin, a class constant is immutable once defined. It is useful for status flags and for other things that don't change during the life of a process, like pi, for example, or all the countries in Africa.

You declare a class constant with the const keyword. For example, since a real-world implementation of a Dictionary object would almost certainly have a database sitting behind it, you can also assume that there will be a maximum length for terms and translations. Listing 11 sets this as a class constant.

Listing 11. Setting MAXLENGTH as a class constant



class Dictionary {
    const MAXLENGTH = 250;
    // ...
}

print Dictionary::MAXLENGTH;


In PHP V5, class constants are always public, so you can't use the visibility keywords (PHP 7.1 and later allow them). This is not a problem because any attempt to change the value will result in a parse error. Also notice that unlike regular properties, a class constant does not begin with the dollar sign.
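
Inside the class, a constant is reached with the self keyword and the scope resolution operator. As a sketch of how the limit might be enforced, the hypothetical set() method below refuses anything longer than MAXLENGTH. The method name and its true/false return values are assumptions made for illustration.

```php
<?php
class Dictionary {
    const MAXLENGTH = 250;
    private $translations = array();

    // Hypothetical set() method: refuses terms or translations
    // longer than the class constant allows.
    function set( $term, $translation ) {
        if ( strlen( $term ) > self::MAXLENGTH ||
             strlen( $translation ) > self::MAXLENGTH ) {
            return false;
        }
        $this->translations[$term] = $translation;
        return true;
    }
}

$en = new Dictionary();
var_dump( $en->set( 'TREE', 'tree' ) );
var_dump( $en->set( 'LONG', str_repeat( 'x', Dictionary::MAXLENGTH + 1 ) ) );
```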