Object handlers
In the previous sections you already had some contact with object handlers. In particular you should know how to create
the structure used to specify the handlers and how to implement cloning behavior using clone_obj. But this is just
the beginning: Nearly all operations on objects in PHP go through object handlers and every magic method or magic
interface is implemented with an object or class handler internally. Furthermore there are quite a few handlers which
are not exposed to userland PHP. For example internal classes can have custom comparison and cast behavior.
As the number of different object handlers is rather large we can only discuss examples (using the typed array implementation from the last section) for a few of them. For all the others only a short description is provided.
An Overview
As of this writing there are 26 object handlers, which are listed in the following with their signature and a small description.
- zval *read_property(zval *object, zval *member, int type, const struct _zend_literal *key TSRMLS_DC)
- void write_property(zval *object, zval *member, zval *value, const struct _zend_literal *key TSRMLS_DC)
- int has_property(zval *object, zval *member, int has_set_exists, const struct _zend_literal *key TSRMLS_DC)
- void unset_property(zval *object, zval *member, const struct _zend_literal *key TSRMLS_DC)
- zval **get_property_ptr_ptr(zval *object, zval *member, const struct _zend_literal *key TSRMLS_DC)
These handlers correspond to the __get, __set, __isset and __unset methods. get_property_ptr_ptr is the internal equivalent of __get returning by reference. The zend_literal *key passed to these functions exists as an optimization; for example, it contains a precomputed hash of the property name.
- zval *read_dimension(zval *object, zval *offset, int type TSRMLS_DC)
- void write_dimension(zval *object, zval *offset, zval *value TSRMLS_DC)
- int has_dimension(zval *object, zval *member, int check_empty TSRMLS_DC)
- void unset_dimension(zval *object, zval *offset TSRMLS_DC)
This set of handlers is the internal representation of the ArrayAccess interface.
- void set(zval **object, zval *value TSRMLS_DC)
- zval *get(zval *object TSRMLS_DC)
These handlers get/set the “object value”. They can be used to override (to a certain degree) the compound assignment operators (like += or ++) and exist mainly for the purpose of proxy objects. In practice they are rarely used.
- HashTable *get_properties(zval *object TSRMLS_DC)
- HashTable *get_debug_info(zval *object, int *is_temp TSRMLS_DC)
Used to get the object properties as a hashtable. The former is more general purpose, for example it is also used for the get_object_vars function. The latter on the other hand is used exclusively to display properties in debugging functions like var_dump. So even if your object does not provide any formal properties you can still have a meaningful debug output.
- union _zend_function *get_method(zval **object_ptr, char *method, int method_len, const struct _zend_literal *key TSRMLS_DC)
- int call_method(const char *method, INTERNAL_FUNCTION_PARAMETERS)
The get_method handler fetches the zend_function used to call a certain method. If there is no particular zend_function that you want to invoke, but you rather want a __call-like catch-all behavior, then get_method can signal that it is a ZEND_OVERLOADED_FUNCTION, in which case the call_method handler will be used instead.
- union _zend_function *get_constructor(zval *object TSRMLS_DC)
Like get_method, but getting the constructor function. The most common reason to override this handler is to disallow manual construction by throwing an error in the handler.
- int count_elements(zval *object, long *count TSRMLS_DC)
This is just the internal way of implementing the Countable::count method.
- int compare_objects(zval *object1, zval *object2 TSRMLS_DC)
- int cast_object(zval *readobj, zval *retval, int type TSRMLS_DC)
Internal classes have the ability to implement a custom compare behavior and override casting behavior for all types. Userland classes on the other hand only have the ability to override object to string casting through __toString.
- int get_closure(zval *obj, zend_class_entry **ce_ptr, union _zend_function **fptr_ptr, zval **zobj_ptr TSRMLS_DC)
This handler is invoked when the object is used as a function, i.e. it is the internal version of __invoke. The name derives from the fact that its main use is for the implementation of closures (the Closure class).
- zend_class_entry *get_class_entry(const zval *object TSRMLS_DC)
- int get_class_name(const zval *object, const char **class_name, zend_uint *class_name_len, int parent TSRMLS_DC)
These two handlers are used to get the class entry and class name from an object. There should be little reason to overwrite them. The only occasion that I can think of where this would be necessary is if you choose to create a custom object structure that does not contain the standard zend_object as a substructure. (This is entirely possible, but not usually done.)
- void add_ref(zval *object TSRMLS_DC)
- void del_ref(zval *object TSRMLS_DC)
- zend_object_value clone_obj(zval *object TSRMLS_DC)
- HashTable *get_gc(zval *object, zval ***table, int *n TSRMLS_DC)
This set of handlers is used for various object maintenance tasks. add_ref is called when a new zval starts referencing the object, del_ref is called when a reference is removed. By default these handlers will change the refcount in the object store. Once again there should be virtually no reason to overwrite them. The only application I can think of is when you choose not to use the Zend object store, but rather use some custom storage facility.
You already know the clone_obj handler, so I’ll jump right to get_gc: This handler should return all variables that are held by the object, so cyclic dependencies can be properly collected.
Implementing array access using object handlers
In the previous section the ArrayAccess interface was used to provide array-like behavior for the buffer views. Now
we want to improve the implementation by using the respective *_dimension object handlers. These same handlers are
also used to implement ArrayAccess, but providing a custom implementation will be faster as the overhead of calling
methods is avoided.
The object handlers for dimensions are read_dimension, write_dimension, has_dimension and unset_dimension. They all
take the object zval as first argument and the offset zval as second. For our purposes the offset has to be an integer,
so let’s first introduce a helper function for getting the long value from a zval (in order to avoid repeating the
cast code):
static long get_long_from_zval(zval *zv)
{
    if (Z_TYPE_P(zv) == IS_LONG) {
        return Z_LVAL_P(zv);
    } else {
        /* Convert a copy, so the zval passed in by the caller is left untouched */
        zval tmp = *zv;
        zval_copy_ctor(&tmp);
        convert_to_long(&tmp);
        return Z_LVAL(tmp);
    }
}
Now writing the respective handlers is rather straightforward. For example, this is what the read_dimension handler
looks like:
static zval *array_buffer_view_read_dimension(zval *object, zval *zv_offset, int type TSRMLS_DC)
{
    buffer_view_object *intern = zend_object_store_get_object(object TSRMLS_CC);
    zval *retval;
    long offset;

    if (!zv_offset) {
        zend_throw_exception(NULL, "Cannot append to a typed array", 0 TSRMLS_CC);
        return NULL;
    }

    offset = get_long_from_zval(zv_offset);
    if (offset < 0 || offset >= intern->length) {
        zend_throw_exception(NULL, "Offset is outside the buffer range", 0 TSRMLS_CC);
        return NULL;
    }

    retval = buffer_view_offset_get(intern, offset);
    Z_DELREF_P(retval); /* Refcount should be 0 if not referenced from ext / engine */
    return retval;
}
Something that is slightly odd about this handler is the Z_DELREF_P(retval) at the end: read_dimension is
expected to return a zval with refcount 0 if the returned zval isn’t used anywhere else (as is the case for us). The
engine will increment the refcount itself. The refcount 0 also tells the engine that reference operations on the return
value don’t make sense (as nothing would actually be modified).
Another thing that might seem strange is that we have to check for array appends (which are signaled by
zv_offset = NULL) in a read handler. This is related to the type parameter that was left unused in the above
code. This parameter specifies the context in which the read occurred. For “normal” $foo[0] style reads the type
will be BP_VAR_R, but it can also be one of BP_VAR_W, BP_VAR_RW, BP_VAR_IS or BP_VAR_UNSET. To
understand when “non-read” types like this can happen consider the following examples:
$foo[0][1]; // [0] is a read_dimension(..., BP_VAR_R),
// [1] is a read_dimension(..., BP_VAR_R)
$foo[0][1] = $bar; // [0] is a read_dimension(..., BP_VAR_W), [1] is a write_dimension
$foo[][1] = $bar; // [] is a read_dimension(..., BP_VAR_W), [1] is a write_dimension
isset($foo[0][1]); // [0] is a read_dimension(..., BP_VAR_IS), [1] is a has_dimension
unset($foo[0][1]); // [0] is a read_dimension(..., BP_VAR_UNSET), [1] is a unset_dimension
As you can see the other BP_VAR types occur with nested dimension access. In this case only the outermost access
calls the actual handler for the operation; the inner dimension accesses go through the read handler with the respective
type. So if the [] append operator is used in a nested access the read_dimension handler can be called with the
offset being NULL.
The type parameter can be used to change the behavior depending on the context. For example isset is usually
expected not to throw any warnings, errors or exceptions. We could honor this by explicitly checking for the
BP_VAR_IS type:
if (type == BP_VAR_IS) {
    return EG(uninitialized_zval_ptr);
}
But as in our particular case nested dimension access doesn’t really make sense we don’t need to worry much about any such behaviors.
The remaining handlers are similar to read_dimension (but less tricky):
static void array_buffer_view_write_dimension(
    zval *object, zval *zv_offset, zval *value TSRMLS_DC
) {
    buffer_view_object *intern = zend_object_store_get_object(object TSRMLS_CC);
    long offset;

    if (!zv_offset) {
        zend_throw_exception(NULL, "Cannot append to a typed array", 0 TSRMLS_CC);
        return;
    }

    offset = get_long_from_zval(zv_offset);
    if (offset < 0 || offset >= intern->length) {
        zend_throw_exception(NULL, "Offset is outside the buffer range", 0 TSRMLS_CC);
        return;
    }

    buffer_view_offset_set(intern, offset, value);
}

static int array_buffer_view_has_dimension(
    zval *object, zval *zv_offset, int check_empty TSRMLS_DC
) {
    buffer_view_object *intern = zend_object_store_get_object(object TSRMLS_CC);
    long offset = get_long_from_zval(zv_offset);

    if (offset < 0 || offset >= intern->length) {
        return 0;
    }

    if (check_empty) {
        int retval;
        zval *value = buffer_view_offset_get(intern, offset);
        retval = zend_is_true(value);
        zval_ptr_dtor(&value);
        return retval;
    }

    return 1;
}

static void array_buffer_view_unset_dimension(zval *object, zval *zv_offset TSRMLS_DC)
{
    zend_throw_exception(NULL, "Cannot unset offsets in a typed array", 0 TSRMLS_CC);
}
There is little to say about these handlers. The only thing worth noting is the check_empty parameter of the
has_dimension handler. If this parameter is 0 then it’s an isset call, if it is 1 then it’s an empty call. For isset
the mere existence is checked, for empty the truthiness.
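To make the isset/empty distinction concrete, here is a small standalone model of the has_dimension logic. It is only a sketch: int8_view_has_dimension and its data/length parameters are hypothetical stand-ins for the view object and for buffer_view_offset_get, chosen so that the code compiles without the PHP headers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of has_dimension for an int8 view.
 * data/length stand in for the internal view object. */
static int int8_view_has_dimension(
    const int8_t *data, long length, long offset, int check_empty
) {
    assert(data != NULL || length == 0);

    /* Out of range: the offset does not exist, for isset and empty alike */
    if (offset < 0 || offset >= length) {
        return 0;
    }

    /* empty(): the offset exists, so report whether its value is truthy */
    if (check_empty) {
        return data[offset] != 0;
    }

    /* isset(): mere existence is enough */
    return 1;
}
```

For a two-element view holding 10 and 0, the isset form reports both offsets as present, while the check_empty form only reports the first one as truthy. The engine negates that result to obtain the value of empty().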
Lastly the new handlers need to be assigned in MINIT:
memcpy(&array_buffer_view_handlers, zend_get_std_object_handlers(), sizeof(zend_object_handlers));
array_buffer_view_handlers.clone_obj = array_buffer_view_clone; /* from previous section */
array_buffer_view_handlers.read_dimension = array_buffer_view_read_dimension;
array_buffer_view_handlers.write_dimension = array_buffer_view_write_dimension;
array_buffer_view_handlers.has_dimension = array_buffer_view_has_dimension;
array_buffer_view_handlers.unset_dimension = array_buffer_view_unset_dimension;
And now all array operations should work just as previously, only faster (for me using the handlers directly was about
four times faster than ArrayAccess).
Honoring inheritance
One key issue that has to be considered whenever you implement object handlers is that they apply all the way down the
inheritance chain. If the user extends one of the view classes it will still use the same handlers. So if the dimension
access handlers are overridden the user will no longer be able to use ArrayAccess in an inheriting class.
A very simple way to solve this issue is to check whether the class was extended in the dimension handlers and fall back to the standard handlers in this case:
if (intern->std.ce->parent) {
    return zend_get_std_object_handlers()->read_dimension(object, zv_offset, type TSRMLS_CC);
}
Comparison of view objects
Right now view objects will always be considered equal if they are of the same type (and have no properties). That’s
not really what we want. Instead we should implement our own comparison behavior: Two buffer views should be considered
equal if they use the same buffer, with the same offset, same length and same type. Furthermore their class entry should
match (so inheriting classes aren’t considered equal). Additionally the properties should be equal, or to simplify our
implementation just shouldn’t exist. In other words: Two buffer views are equal if their internal objects are the same
byte for byte. We can easily check this with memcmp:
static int array_buffer_view_compare_objects(zval *obj1, zval *obj2 TSRMLS_DC)
{
    buffer_view_object *intern1 = zend_object_store_get_object(obj1 TSRMLS_CC);
    buffer_view_object *intern2 = zend_object_store_get_object(obj2 TSRMLS_CC);

    if (memcmp(intern1, intern2, sizeof(buffer_view_object)) == 0) {
        return 0; /* equal */
    } else {
        return 1; /* not orderable */
    }
}
As you can see the compare_objects handler takes two objects and returns how those two objects relate. The return
value is one of -1 (smaller), 0 (equal) and 1 (greater).
In our case the smaller/greater relationship doesn’t really make sense, so we want $view1 < $view2 and
$view1 > $view2 to always be false. This can be done by returning 1 from the handler if the objects are not equal.
You might wonder why this works. After all, 1 means “greater”, so one could expect $view1 > $view2 to return true.
The reason why this trick works is that PHP automatically translates $a > $b to $b < $a (and $a >= $b to
$b <= $a). Thus the “less than” relationship is always used, and as we’re returning 1 (regardless of order) any
comparison will be false.
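The effect of this trick can be checked with a small standalone model. The following is only a sketch and not engine code: view_model is a hypothetical, simplified stand-in for buffer_view_object, and the op_* functions are hypothetical helpers that mimic how the comparison operators consume the handler's result, including the rewrite of > and >= described above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for buffer_view_object */
typedef struct {
    const void *buffer;
    size_t offset;
    size_t length;
    int type;
} view_model;

/* Same contract as the handler above: 0 if equal, otherwise always 1 */
static int view_compare(const view_model *a, const view_model *b)
{
    assert(a != NULL && b != NULL);
    if (a->buffer == b->buffer && a->offset == b->offset
            && a->length == b->length && a->type == b->type) {
        return 0;
    }
    return 1;
}

/* $a == $b */
static int op_equal(const view_model *a, const view_model *b)
{
    return view_compare(a, b) == 0;
}

/* $a < $b */
static int op_smaller(const view_model *a, const view_model *b)
{
    return view_compare(a, b) < 0;
}

/* $a <= $b */
static int op_smaller_or_equal(const view_model *a, const view_model *b)
{
    return view_compare(a, b) <= 0;
}

/* $a > $b is rewritten to $b < $a */
static int op_greater(const view_model *a, const view_model *b)
{
    return op_smaller(b, a);
}

/* $a >= $b is rewritten to $b <= $a */
static int op_greater_or_equal(const view_model *a, const view_model *b)
{
    return op_smaller_or_equal(b, a);
}
```

For two views that differ in any field, all four ordering operators as well as the equality check evaluate to false in this model, whichever operand comes first. For two identical views, ==, <= and >= are true.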
A similar comparison handler can be written for the ArrayBuffer class too.
Debug information
If you dumped a buffer view object with var_dump or print_r right now, you wouldn’t get any useful information:
object(Int8Array)#2 (0) {
}
It would be much more helpful if instead the contents of the array were printed. Such a behavior can be implemented
using the get_debug_info handler:
static HashTable *array_buffer_view_get_debug_info(zval *obj, int *is_temp TSRMLS_DC)
{
    buffer_view_object *intern = zend_object_store_get_object(obj TSRMLS_CC);
    HashTable *props = Z_OBJPROP_P(obj);
    HashTable *ht;
    int i;

    ALLOC_HASHTABLE(ht);
    ZEND_INIT_SYMTABLE_EX(ht, intern->length + zend_hash_num_elements(props), 0);
    zend_hash_copy(ht, props, (copy_ctor_func_t) zval_add_ref, NULL, sizeof(zval *));

    *is_temp = 1;

    for (i = 0; i < intern->length; ++i) {
        zval *value = buffer_view_offset_get(intern, i);
        zend_hash_index_update(ht, i, (void *) &value, sizeof(zval *), NULL);
    }

    return ht;
}
The handler creates a hashtable using ZEND_INIT_SYMTABLE_EX to provide a size-hint, copies the properties (in case
the user added custom properties) and then loops through the view and inserts all its elements into the hash.
The value 1 is written into the additional is_temp parameter, signifying that we are using a temporary
hashtable that has to be freed later. Alternatively we could write 0 into the pointer, in which case we would have
to store the hashtable somewhere else and manually free it (you’ll find that many objects have some kind of
debug_info field in their internal structure that is used for this purpose).
A small example of the kind of output this produces:
$buffer = new ArrayBuffer(4);
$view = new Int8Array($buffer);
$view->foo = 'bar';
$view[0] = 10; $view[1] = 20; $view[2] = -10; $view[3] = -20;
var_dump($view);
// outputs
object(Int8Array)#2 (5) {
  ["foo"]=>
  string(3) "bar"
  [0]=>
  int(10)
  [1]=>
  int(20)
  [2]=>
  int(-10)
  [3]=>
  int(-20)
}
One more handler that could be implemented for typed arrays is count_elements, i.e. the internal equivalent of
Countable::count(). There is nothing special about that handler though, so I’m leaving this as an exercise for the
reader (just don’t forget the inheritance check!).
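As a starting point for that exercise, here is a standalone model of what such a handler could look like. It is a sketch under stated assumptions: class_entry_model and counted_view are hypothetical stand-ins for zend_class_entry and buffer_view_object, and SUCCESS/FAILURE are redefined locally so that the code compiles without the PHP headers. One assumption is worth checking against the PHP source: as far as I can tell the standard handlers provide no count_elements handler to fall back to, so the inheritance check returns FAILURE here, which should let count() fall back to a userland Countable::count() implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins so that the sketch compiles on its own (hypothetical names) */
enum { SUCCESS = 0, FAILURE = -1 };

typedef struct class_entry_model {
    struct class_entry_model *parent; /* NULL unless the class was extended */
} class_entry_model;

typedef struct {
    class_entry_model *ce;
    long length;
} counted_view;

/* Model of a count_elements handler: writes the element count to *count */
static int counted_view_count_elements(const counted_view *intern, long *count)
{
    assert(intern != NULL && count != NULL);

    if (intern->ce->parent) {
        /* Extended in userland: decline, so that Countable::count() can be used */
        return FAILURE;
    }

    *count = intern->length;
    return SUCCESS;
}
```

In the extension itself the function would take (zval *object, long *count TSRMLS_DC), fetch intern with zend_object_store_get_object and be assigned to array_buffer_view_handlers.count_elements in MINIT.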