Casts and operations¶
Basic operations¶
As zvals are complex structures you can’t directly perform basic operations like zv1 + zv2
on them. Doing something
like this will either give you an error or end up adding together two pointers rather than their values.
The “basic” operations like +
are rather complicated when working with zvals, because they have to work across
many types. For example PHP allows you to add together a double with a string containing an integer (3.14 + "17"
)
or even adding two arrays ([1, 2, 3] + [4, 5, 6]
).
For this reason PHP provides special functions for performing operations on zvals. Addition for example is handled by
add_function()
:
zval *a, *b, *result;
MAKE_STD_ZVAL(a);
MAKE_STD_ZVAL(b);
MAKE_STD_ZVAL(result);
ZVAL_DOUBLE(a, 3.14);
ZVAL_STRING(b, "17", 1);
/* result = a + b */
add_function(result, a, b TSRMLS_CC);
php_printf("%Z\n", result); /* 20.14 */
/* zvals a, b, result need to be dtored */
Apart from add_function()
there are several other functions implementing binary (two-operand) operations, all with
the same signature:
int add_function(zval *result, zval *op1, zval *op2 TSRMLS_DC); /* + */
int sub_function(zval *result, zval *op1, zval *op2 TSRMLS_DC); /* - */
int mul_function(zval *result, zval *op1, zval *op2 TSRMLS_DC); /* * */
int div_function(zval *result, zval *op1, zval *op2 TSRMLS_DC); /* / */
int mod_function(zval *result, zval *op1, zval *op2 TSRMLS_DC); /* % */
int concat_function(zval *result, zval *op1, zval *op2 TSRMLS_DC); /* . */
int bitwise_or_function(zval *result, zval *op1, zval *op2 TSRMLS_DC); /* | */
int bitwise_and_function(zval *result, zval *op1, zval *op2 TSRMLS_DC); /* & */
int bitwise_xor_function(zval *result, zval *op1, zval *op2 TSRMLS_DC); /* ^ */
int shift_left_function(zval *result, zval *op1, zval *op2 TSRMLS_DC); /* << */
int shift_right_function(zval *result, zval *op1, zval *op2 TSRMLS_DC); /* >> */
int boolean_xor_function(zval *result, zval *op1, zval *op2 TSRMLS_DC); /* xor */
int is_equal_function(zval *result, zval *op1, zval *op2 TSRMLS_DC); /* == */
int is_not_equal_function(zval *result, zval *op1, zval *op2 TSRMLS_DC); /* != */
int is_identical_function(zval *result, zval *op1, zval *op2 TSRMLS_DC); /* === */
int is_not_identical_function(zval *result, zval *op1, zval *op2 TSRMLS_DC); /* !== */
int is_smaller_function(zval *result, zval *op1, zval *op2 TSRMLS_DC); /* < */
int is_smaller_or_equal_function(zval *result, zval *op1, zval *op2 TSRMLS_DC); /* <= */
All functions take a result
zval into which the result of the operation on op1
and op2
is stored. The
int
return value is either SUCCESS
or FAILURE
and indicates whether the operation was successful. Note that
result
will always be set to some value (like false
) even if the operations was not successful.
The result
zval needs to be allocated and initialized prior to calling one of the functions. Alternatively
result
and op1
can be the same, in which case effectively a compound assignment operation is performed:
zval *a, *b;
MAKE_STD_ZVAL(a);
MAKE_STD_ZVAL(b);
ZVAL_LONG(a, 42);
ZVAL_STRING(b, "3");
/* a += b */
add_function(a, a, b TSRMLS_CC);
php_printf("%Z\n", a); /* 45 */
/* zvals a, b need to be dtored */
Some binary operators are missing from the above list. For example there are no functions for >
and >=
. The
reason behind this is that you can implement them using is_smaller_function()
and is_smaller_or_equal_function()
simply by swapping the operands.
Also missing from the list are functions for performing &&
and ||
. The reasoning here is that the main feature
those operators provide is short-circuiting, which you can’t implement with a simple function. If you take
short-circuiting away, both operators are just boolean casts followed by a &&
or ||
C-operation.
Apart from the binary operators there are also two unary (single operand) functions:
int boolean_not_function(zval *result, zval *op1 TSRMLS_DC); /* ! */
int bitwise_not_function(zval *result, zval *op1 TSRMLS_DC); /* ~ */
They work in the same way the other functions, but accept only one operand. The unary +
and -
operations are
missing, because they can be implemented as 0 + $value
and 0 - $value
respectively, by making use of
add_function()
and sub_function()
.
The last two functions implement the ++
and --
operators:
int increment_function(zval *op1); /* ++ */
int decrement_function(zval *op1); /* -- */
These functions don’t take a result zval and instead directly modify the passed operand. Note that using these is
different from performing a + 1
or - 1
with add_function()
/sub_function()
. For example incrementing
"a"
will result in "b"
, but adding "a" + 1
will result in 1
.
Comparisons¶
The comparison functions introduced above all perform some specific operation, e.g. is_equal_function()
corresponds
to ==
and is_smaller_function()
performs a <
. An alternative to these is compare_function()
which
computes a more generic result:
zval *a, *b, *result;
MAKE_STD_ZVAL(a);
MAKE_STD_ZVAL(b);
MAKE_STD_ZVAL(result);
ZVAL_LONG(a, 42);
ZVAL_STRING(b, "24");
compare_function(result, a, b TSRMLS_CC);
if (Z_LVAL_P(result) < 0) {
php_printf("a is smaller than b\n");
} else if (Z_LVAL_P(result) > 0) {
php_printf("a is greater than b\n");
} else /*if (Z_LVAL_P(result) == 0)*/ {
php_printf("a is equal to b\n");
}
/* zvals a, b, result need to be dtored */
compare_function()
will set the result
zval to one of -1, 1 or 0 corresponding to the relations “smaller than”,
“greater than” or “equal” between the passed values. It is also part of a larger family of comparison functions:
int compare_function(zval *result, zval *op1, zval *op2 TSRMLS_DC);
int numeric_compare_function(zval *result, zval *op1, zval *op2 TSRMLS_DC);
int string_compare_function_ex(zval *result, zval *op1, zval *op2, zend_bool case_insensitive TSRMLS_DC);
int string_compare_function(zval *result, zval *op1, zval *op2 TSRMLS_DC);
int string_case_compare_function(zval *result, zval *op1, zval *op2 TSRMLS_DC);
#ifdef HAVE_STRCOLL
int string_locale_compare_function(zval *result, zval *op1, zval *op2 TSRMLS_DC);
#endif
Once again all functions accept two operands and a result zval and return SUCCESS
/FAILURE
.
compare_function()
performs a “normal” PHP comparison (i.e. it behaves the same way as the <
, >
and ==
operators). numeric_compare_function()
compares the operands as numbers by casting them to doubles first.
string_compare_function_ex()
compares the operands as strings and has a flag that indicates whether the comparison
should be case_insensitive
. Instead of manually specifying that flag you can also use
string_compare_function()
(case sensitive) or string_case_compare_function()
(case insensitive). The string
comparison done by these functions is a normal lexicographical string comparison without additional magic for numeric
strings.
string_locale_compare_function()
performs a string comparison according to the current locale and is only available
if HAVE_STRCOLL
is defined. As such you must use #ifdef HAVE_STRCOLL
guards whenever you employ the function.
As with anything related to locales, it’s best to avoid its use.
Casts¶
When implementing your own code you will very often deal with only one particular type of zval. E.g. if you are implementing some string handling code, you’ll want to deal only with string zvals and not bother with everything else. On the other hand you likely also want to support PHPs dynamic type system: PHP allows you to work with numbers as strings and extension code should honor this as well.
The solution is to cast a zval of arbitrary type to the specific type you’ll be working with. In order to support this
PHP provides a convert_to_*
function for every type (apart from resources, as there is no (resource)
cast):
void convert_to_null(zval *op);
void convert_to_boolean(zval *op);
void convert_to_long(zval *op);
void convert_to_double(zval *op);
void convert_to_string(zval *op);
void convert_to_array(zval *op);
void convert_to_object(zval *op);
void convert_to_long_base(zval *op, int base);
void convert_to_cstring(zval *op);
The last two functions implement non-standard casts: convert_to_long_base()
is the same as convert_to_long()
,
but it will make use of a particular base for string to long conversions (e.g. 16
for hexadecimals).
convert_to_cstring()
behaves like convert_to_string()
but uses a locale-independent double to string conversion.
This means that the result will always use .
as the decimal separator rather than creating locale-specific strings
like "3,14"
(Germany).
The convert_to_*
functions will directly modify the passed zval:
zval *zv_ptr;
MAKE_STD_ZVAL(zv_ptr);
ZVAL_STRING(zv_ptr, "123 foobar", 1);
convert_to_long(zv_ptr);
php_printf("%ld\n", Z_LVAL_P(zv_ptr));
zval_dtor(&zv_ptr);
If the zval is used in more than one place (refcount > 1) chances are that directly modifying it would result in
incorrect behavior. E.g. if you receive a zval by-value and directly apply a convert_to_*
function to it, you will
modify not only the reference to the zval inside the function but also the reference outside of it.
To solve this issue PHP provides an additional set of convert_to_*_ex
macros:
void convert_to_null_ex(zval **ppzv);
void convert_to_boolean_ex(zval **ppzv);
void convert_to_long_ex(zval **ppzv);
void convert_to_double_ex(zval **ppzv);
void convert_to_string_ex(zval **ppzv);
void convert_to_array_ex(zval **ppzv);
void convert_to_object_ex(zval **ppzv);
These macros take a zval**
and are implemented by performing a SEPARATE_ZVAL_IF_NOT_REF()
before the type
conversion:
#define convert_to_ex_master(ppzv, lower_type, upper_type) \
if (Z_TYPE_PP(ppzv)!=IS_##upper_type) { \
SEPARATE_ZVAL_IF_NOT_REF(ppzv); \
convert_to_##lower_type(*ppzv); \
}
Apart from this the usage is similar to the normal convert_to_*
functions:
zval **zv_ptr_ptr = /* get function argument */;
convert_to_long_ex(zv_ptr_ptr);
php_printf("%ld\n", Z_LVAL_PP(zv_ptr_ptr));
/* No need to dtor as function arguments are dtored automatically */
But even this will not always be enough. Lets consider a very similar case where a value is fetched from an array:
zval *array_zv = /* get array from somewhere */;
/* Fetch array index 42 into zv_dest (how this works is not relevant here) */
zval **zv_dest;
if (zend_hash_index_find(Z_ARRVAL_P(array_zv), 42, (void **) &zv_dest) == FAILURE) {
/* Error: Index not found */
return;
}
convert_to_long_ex(zv_dest);
php_printf("%ld\n", Z_LVAL_PP(zv_dest));
/* No need to dtor because array values are dtored automatically */
The use of convert_to_long_ex()
in the above code will prevent modification of references to the value outside the
array, but it will still change the value inside the array itself. In some cases this is the correct behavior, but
typically you want to avoid modifying the array when fetching values from it.
In cases like these there is no way around copying the zval before converting it:
zval **zv_dest = /* get array value */;
zval tmp_zv;
ZVAL_COPY_VALUE(&tmp_zv, *zv_dest);
zval_copy_ctor(&tmp_zv);
convert_to_long(&tmp_zv);
php_printf("%ld\n", Z_LVAL(tmp_zv));
zval_dtor(&tmp_zv);
The last zval_dtor()
call in the above code is not strictly necessary, because we know that tmp_zv
will be
of type IS_LONG
, which is a type that does not require destruction. For conversions to other types like strings or
arrays the dtor call is necessary though.
If the use of to-long or to-double conversions is common in your code, it can make sense to create helper functions which perform casts without modifying any zval. A sample implementation for long casts:
long zval_get_long(zval *zv) {
switch (Z_TYPE_P(zv)) {
case IS_NULL:
return 0;
case IS_BOOL:
case IS_LONG:
case IS_RESOURCE:
return Z_LVAL_P(zv);
case IS_DOUBLE:
return zend_dval_to_lval(Z_DVAL_P(zv));
case IS_STRING:
return strtol(Z_STRVAL_P(zv), NULL, 10);
case IS_ARRAY:
return zend_hash_num_elements(Z_ARRVAL_P(zv)) ? 1 : 0;
case IS_OBJECT: {
zval tmp_zv;
ZVAL_COPY_VALUE(&tmp_zv, zv);
zval_copy_ctor(&tmp);
convert_to_long_base(&tmp, 10);
return Z_LVAL_P(tmp_zv);
}
}
}
The above code will directly return the result of the cast without performing any zval copies (apart from the
IS_OBJECT
case where the copy is unavoidable). By making use of the function the array value cast example becomes
much simpler:
zval **zv_dest = /* get array value */;
long lval = zval_get_long(*zv_dest);
php_printf("%ld\n", lval);
PHPs standard library already contains one function of this type, namely zend_is_true()
. This function is
functionally equivalent to a bool cast from which the value is returned directly:
zval *zv_ptr;
MAKE_STD_ZVAL(zv_ptr);
ZVAL_STRING(zv, "", 1);
php_printf("%d\n", zend_is_true(zv)); // 0
zval_dtor(zv);
ZVAL_STRING(zv, "foobar", 1);
php_printf("%d\n", zend_is_true(zv)); // 1
zval_ptr_dtor(&zv);
Another function which avoids unnecessary copies during casting is zend_make_printable_zval()
. This function
performs the same string cast as convert_to_string()
but makes use of a different API. The typical usage is as
follows:
zval *zv_ptr = /* get zval from somewhere */;
zval tmp_zval;
int tmp_zval_used;
zend_make_printable_zval(zv_ptr, &tmp_zval, &tmp_zval_used);
if (tmp_zval_used) {
zv_ptr = &tmp_zval;
}
PHPWRITE(Z_STRVAL_P(zv_ptr), Z_STRLEN_P(zv_ptr));
if (tmp_zval_used) {
zval_dtor(&tmp_zval);
}
The second parameter to this function is a pointer to a temporary zval and the third parameter is a pointer to an integer. If the function makes use of the temporary zval, the integer will be set to one, zero otherwise.
Based on tmp_zval_used
you can then decide whether to use the original zval or the temporary copy. Very commonly
the temporary zval is simply assigned to the original zval using zv_ptr = &tmp_zval
. This allows you to always work
with zv_ptr
rather than having conditionals everywhere to choose between the two.
Finally you need to dtor the temporary zval using zval_dtor(&tmp_zval)
, but only if it was actually used.
Another function that is related to casting is is_numeric_string()
. This function checks whether a string is
“numeric” and extracts the value into either a long or a double:
long lval;
double dval;
switch (is_numeric_string(Z_STRVAL_P(zv_ptr), Z_STRLEN_P(zv_ptr), &lval, &dval, 0)) {
case IS_LONG:
/* String is an integer those value was put into `lval` */
break;
case IS_DOUBLE:
/* String is a double those value was put into `dval` */
break;
default:
/* String is not numeric */
}
The last argument to this function is called allow_errors
. Setting it to 0
will reject strings like
"123abc"
, whereas setting it to 1
will silently allow them (with value 123
). A third value -1
provides
an intermediate solution, which accepts the string, but throws a notice.
It is helpful to know that this function also accepts hexadecimal numbers in the 0xabc
format. In this it differs
from convert_to_long()
and convert_to_double()
which would cast "0xabc"
to zero.
is_numeric_string()
is particularly useful in cases where you can work with both integer and floating point numbers,
but don’t want to incur the precision loss associated with using doubles for both cases. To help this use case, there
is an additional convert_scalar_to_number()
function, which accepts a zval and converts non-array values to either
a long or a double (using is_numeric_string()
for strings). This means that the converted zval will have type
IS_LONG
, IS_DOUBLE
or IS_ARRAY
. The usage is the same as for the convert_to_*()
functions:
zval *zv_ptr;
MAKE_STD_ZVAL(zv_ptr);
ZVAL_STRING(zv_ptr, "3.141", 1);
convert_scalar_to_number(zv_ptr);
switch (Z_TYPE_P(zv_ptr)) {
case IS_LONG:
php_printf("Long: %ld\n", Z_LVAL_P(zv_ptr));
break;
case IS_DOUBLE:
php_printf("Double: %G\n", Z_DVAL_P(zv_ptr));
break;
case IS_ARRAY:
/* Likely throw an error here */
break;
}
zval_ptr_dtor(&zv_ptr);
/* Double: 3.141 */
Once again there also is a convert_scalar_to_number_ex()
variant of this function, which accepts a zval**
and
will separate it before the conversion.