core/utf8 – Kohana API 2.3 Documentation

system/core/utf8.php

Class: utf8

final class utf8

Methods

utf8 :: clean

public static function clean

Recursively cleans arrays, objects, and strings. Removes ASCII control codes and converts to UTF-8 while silently discarding incompatible UTF-8 characters.

Parameters:

string str
string to clean

Return: string


utf8 :: is_ascii

public static function is_ascii

Tests whether a string contains only 7-bit ASCII bytes. This is used to decide whether to use the native string functions or the UTF-8 functions.

Parameters:

string str
string to check

Return: bool
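
The check described above can be sketched with a single byte-level regular expression. This is a minimal illustration, not Kohana's source; the function names is_ascii_sketch and length_sketch are hypothetical, and the dispatch helper only shows how such a test might select the cheaper native function.

```php
<?php
// Hypothetical helper: TRUE when every byte is in the range 0x00-0x7F.
function is_ascii_sketch($str)
{
    return ! preg_match('/[^\x00-\x7F]/', $str);
}

// Assumed usage: take the fast native path for pure ASCII input.
function length_sketch($str)
{
    if (is_ascii_sketch($str)) {
        return strlen($str);              // one byte per character
    }
    return preg_match_all('/./su', $str); // count whole UTF-8 characters
}

var_dump(is_ascii_sketch('hello'));        // bool(true)
var_dump(is_ascii_sketch("h\xC3\xA9llo")); // bool(false)
```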


utf8 :: strip_ascii_ctrl

public static function strip_ascii_ctrl

Strips out device control codes in the ASCII range.

Parameters:

string str
string to clean

Return: string
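
A sketch of the idea, assuming that "device control codes" means the C0 controls and DEL, and that tab, line feed, and carriage return are kept. The exact byte ranges and the name strip_ascii_ctrl_sketch are assumptions for illustration, not taken from the Kohana source.

```php
<?php
// Hypothetical helper: drop ASCII control bytes, keeping \t, \n and \r.
function strip_ascii_ctrl_sketch($str)
{
    return preg_replace('/[\x00-\x08\x0B\x0C\x0E-\x1F\x7F]+/', '', $str);
}

// NUL and BEL are removed, the trailing newline is kept.
echo strip_ascii_ctrl_sketch("a\x00b\x07c\n");
```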


utf8 :: strip_non_ascii

public static function strip_non_ascii

Strips out all bytes outside the 7-bit ASCII range.

Parameters:

string str
string to clean

Return: string


utf8 :: transliterate_to_ascii

public static function transliterate_to_ascii

Replaces special/accented UTF-8 characters by ASCII-7 'equivalents'.

Parameters:

string str
string to transliterate
integer case
(int 0) -1 lowercase only, +1 uppercase only, 0 both cases

Author: Andreas Gohr

Return: string
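
The meaning of the case parameter can be illustrated with a tiny lookup table. The sketch below is an assumption-laden example: the function name transliterate_sketch and the five table entries are hypothetical, and a real table would be far larger.

```php
<?php
// Hypothetical helper with a deliberately tiny mapping table.
function transliterate_sketch($str, $case = 0)
{
    // Lowercase accented characters (UTF-8 byte sequences) and ASCII stand-ins.
    $lower = array("\xC3\xA9" => 'e', "\xC3\xBC" => 'u', "\xC3\x9F" => 'ss');
    // Uppercase accented characters.
    $upper = array("\xC3\x89" => 'E', "\xC3\x9C" => 'U');

    if ($case <= 0) {
        $str = strtr($str, $lower); // -1 or 0: replace lowercase characters
    }
    if ($case >= 0) {
        $str = strtr($str, $upper); // +1 or 0: replace uppercase characters
    }
    return $str;
}

echo transliterate_sketch("\xC3\x89t\xC3\xA9"), "\n"; // Ete
```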


utf8 :: strlen

public static function strlen

Returns the length of the given UTF-8 string, counted in characters rather than bytes.

Parameters:

string str
string being measured for length

See: http://php.net/strlen

Return: integer
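
To show why a UTF-8 aware length differs from the native one, the following sketch counts characters with a PCRE pattern in UTF-8 mode. The name utf8_strlen_sketch is hypothetical and this is not necessarily how Kohana computes the value.

```php
<?php
// Hypothetical helper: with /u "." matches one whole UTF-8 character,
// and /s lets it match newlines as well.
function utf8_strlen_sketch($str)
{
    return preg_match_all('/./su', $str);
}

$s = "na\xC3\xAFve"; // "naive" with a diaeresis on the i (two bytes)
echo strlen($s), "\n";             // 6 (bytes)
echo utf8_strlen_sketch($s), "\n"; // 5 (characters)
```

Note that preg_match_all returns FALSE rather than a count when the input is not valid UTF-8.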


utf8 :: strpos

public static function strpos

Finds position of first occurrence of a UTF-8 string.

Parameters:

string str
haystack
string search
needle
integer offset
(int 0) offset from which character in haystack to start searching

See: http://php.net/strpos

Author: Harry Fuecks

Return:

  • integer position of needle
  • boolean FALSE if the needle is not found
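
One way to obtain character positions is to search on bytes and then count the characters that precede the match. The sketch below assumes valid UTF-8 input; utf8_strpos_sketch is a hypothetical name and the approach is an illustration rather than the library's confirmed implementation.

```php
<?php
// Hypothetical helper returning a character position, or FALSE if not found.
function utf8_strpos_sketch($str, $search, $offset = 0)
{
    // Translate the character offset into a byte offset.
    $chars = preg_split('//u', $str, -1, PREG_SPLIT_NO_EMPTY);
    $byte_offset = strlen(implode('', array_slice($chars, 0, $offset)));

    // Byte-level search; a valid UTF-8 needle can only match on a
    // character boundary of a valid UTF-8 haystack.
    $byte_pos = strpos($str, $search, $byte_offset);
    if ($byte_pos === false) {
        return false;
    }

    // Count the characters in front of the match.
    return preg_match_all('/./su', substr($str, 0, $byte_pos));
}

$s = "\xC3\xA9t\xC3\xA9";                        // e-acute, t, e-acute
var_dump(utf8_strpos_sketch($s, 't'));           // int(1), native strpos gives 2
var_dump(utf8_strpos_sketch($s, "\xC3\xA9", 1)); // int(2)
```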

utf8 :: strrpos

public static function strrpos

Finds position of last occurrence of a char in a UTF-8 string.

Parameters:

string str
haystack
string search
needle
integer offset
(int 0) offset from which character in haystack to start searching

See: http://php.net/strrpos

Author: Harry Fuecks

Return:

  • integer position of needle
  • boolean FALSE if the needle is not found

utf8 :: substr

public static function substr

Returns part of a UTF-8 string.

Parameters:

string str
input string
integer offset
offset
integer length
(NULL) length limit

See: http://php.net/substr

Author: Chris Smith

Return: string
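
Character-based slicing can be sketched by splitting the string into characters first. This is a simple, memory-hungry illustration under the assumption of valid UTF-8 input; utf8_substr_sketch is a hypothetical name, and unlike native substr the sketch returns an empty string when the offset lies past the end.

```php
<?php
// Hypothetical helper: offset and length are counted in characters.
function utf8_substr_sketch($str, $offset, $length = null)
{
    // Split into an array of single UTF-8 characters.
    $chars = preg_split('//u', $str, -1, PREG_SPLIT_NO_EMPTY);
    // array_slice accepts negative offsets and a NULL length, like substr.
    return implode('', array_slice($chars, $offset, $length));
}

$s = "\xC3\xA9cole";                       // e-acute followed by "cole"
echo utf8_substr_sketch($s, 1, 3), "\n";   // col
echo utf8_substr_sketch($s, -2), "\n";     // le
```

Native substr($s, 0, 1) would return only the first byte of the two-byte character, which is not valid UTF-8 on its own.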


utf8 :: substr_replace

public static function substr_replace

Replaces text within a portion of a UTF-8 string.

Parameters:

string str
input string
string replacement
replacement string
integer offset
offset
integer length
(NULL) length of the portion to replace

See: http://php.net/substr_replace

Author: Harry Fuecks

Return: string


utf8 :: strtolower

public static function strtolower

Makes a UTF-8 string lowercase.

Parameters:

string str
mixed case string

See: http://php.net/strtolower

Author: Andreas Gohr

Return: string


utf8 :: strtoupper

public static function strtoupper

Makes a UTF-8 string uppercase.

Parameters:

string str
mixed case string

See: http://php.net/strtoupper

Author: Andreas Gohr

Return: string


utf8 :: ucfirst

public static function ucfirst

Makes a UTF-8 string's first character uppercase.

Parameters:

string str
mixed case string

See: http://php.net/ucfirst

Author: Harry Fuecks

Return: string


utf8 :: ucwords

public static function ucwords

Makes the first character of every word in a UTF-8 string uppercase.

Parameters:

string str
mixed case string

See: http://php.net/ucwords

Author: Harry Fuecks

Return: string


utf8 :: strcasecmp

public static function strcasecmp

Case-insensitive UTF-8 string comparison.

Parameters:

string str1
string to compare
string str2
string to compare

See: http://php.net/strcasecmp

Author: Harry Fuecks

Return:

  • integer less than 0 if str1 is less than str2
  • integer greater than 0 if str1 is greater than str2
  • integer 0 if they are equal

utf8 :: str_ireplace

public static function str_ireplace

Returns a string or an array with all occurrences of search in subject (ignoring case) replaced with the given replace value.

Parameters:

string or array search
text to replace
string or array replace
replacement text
string or array str
subject text
integer count
(NULL) number of matched and replaced needles will be returned via this parameter which is passed by reference

See: http://php.net/str_ireplace

Note: It's not fast and gets slower if $search and/or $replace are arrays.

Author: Harry Fuecks

Return:

  • string if the input was a string
  • array if the input was an array

utf8 :: stristr

public static function stristr

Case-insensitive UTF-8 version of strstr. Returns all of the input string from the first occurrence of needle to the end.

Parameters:

string str
input string
string search
needle

See: http://php.net/stristr

Author: Harry Fuecks

Return:

  • string matched substring if found
  • boolean FALSE if the substring was not found

utf8 :: strspn

public static function strspn

Finds the length of the initial segment matching mask.

Parameters:

string str
input string
string mask
mask for search
integer offset
(NULL) start position of the string to examine
integer length
(NULL) length of the string to examine

See: http://php.net/strspn

Author: Harry Fuecks

Return: integer length of the initial segment that contains characters in the mask


utf8 :: strcspn

public static function strcspn

Finds the length of the initial segment not matching mask.

Parameters:

string str
input string
string mask
mask for search
integer offset
(NULL) start position of the string to examine
integer length
(NULL) length of the string to examine

See: http://php.net/strcspn

Author: Harry Fuecks

Return: integer length of the initial segment that contains characters not in the mask


utf8 :: str_pad

public static function str_pad

Pads a UTF-8 string to a certain length with another string.

Parameters:

string str
input string
integer final_str_length
desired string length after padding
string pad_str
(string ' ') string to use as padding
integer pad_type
(int 1) padding type: STR_PAD_RIGHT, STR_PAD_LEFT, or STR_PAD_BOTH

See: http://php.net/str_pad

Author: Harry Fuecks

Return: string
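
Native str_pad measures the target length in bytes, so a multi-byte input comes out too short. The sketch below compensates for the difference between byte count and character count. It assumes the padding string itself is plain ASCII, and utf8_str_pad_sketch is a hypothetical name, not the library's implementation.

```php
<?php
// Hypothetical helper; only correct when $pad_str is ASCII (assumption).
function utf8_str_pad_sketch($str, $final_len, $pad_str = ' ', $pad_type = STR_PAD_RIGHT)
{
    $char_len = preg_match_all('/./su', $str);
    // Add the extra bytes taken by multi-byte characters to the byte target.
    $byte_target = $final_len + strlen($str) - $char_len;
    return str_pad($str, $byte_target, $pad_str, $pad_type);
}

$s = "\xC3\xA9";                            // one character, two bytes
echo str_pad($s, 3, '*'), "\n";             // one asterisk: only 2 characters long
echo utf8_str_pad_sketch($s, 3, '*'), "\n"; // two asterisks: 3 characters long
```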


utf8 :: str_split

public static function str_split

Converts a UTF-8 string to an array.

Parameters:

string str
input string
integer split_length
(int 1) maximum length of each chunk

See: http://php.net/str_split

Author: Harry Fuecks

Return: array
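
Splitting on character boundaries instead of bytes can be sketched with preg_split in UTF-8 mode. The name utf8_str_split_sketch is hypothetical and the code assumes valid UTF-8 input.

```php
<?php
// Hypothetical helper: chunks of at most $split_length characters.
function utf8_str_split_sketch($str, $split_length = 1)
{
    // An empty pattern with /u splits between every UTF-8 character.
    $chars = preg_split('//u', $str, -1, PREG_SPLIT_NO_EMPTY);
    if ($split_length <= 1) {
        return $chars;
    }
    // Regroup the single characters into chunks and join each chunk.
    return array_map(
        function ($chunk) { return implode('', $chunk); },
        array_chunk($chars, $split_length)
    );
}

// Two chunks of two characters each.
print_r(utf8_str_split_sketch("\xC3\xA9t\xC3\xA9s", 2));
```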


utf8 :: strrev

public static function strrev

Reverses a UTF-8 string.

Parameters:

string str
string to be reversed

See: http://php.net/strrev

Author: Harry Fuecks

Return: string
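
Native strrev reverses bytes, which scrambles multi-byte characters. A character-aware reversal can be sketched as follows; utf8_strrev_sketch is a hypothetical name and valid UTF-8 input is assumed.

```php
<?php
// Hypothetical helper: reverse the order of characters, not bytes.
function utf8_strrev_sketch($str)
{
    $chars = preg_split('//u', $str, -1, PREG_SPLIT_NO_EMPTY);
    return implode('', array_reverse($chars));
}

$s = "a\xC3\xA9b";                               // a, e-acute, b
var_dump(utf8_strrev_sketch($s) === "b\xC3\xA9a"); // bool(true)
var_dump(strrev($s) === "b\xA9\xC3a");             // bool(true): bytes swapped, invalid UTF-8
```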


utf8 :: trim

public static function trim

Strips whitespace (or other UTF-8 characters) from the beginning and end of a string.

Parameters:

string str
input string
string charlist
(NULL) string of characters to remove

See: http://php.net/trim

Author: Andreas Gohr

Return: string
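
Native trim treats the character list as a set of bytes, so it can cut a multi-byte character in half. The sketch below builds a UTF-8 aware character class instead. The name utf8_trim_sketch is hypothetical, and falling back to native trim when no list is given is an assumption.

```php
<?php
// Hypothetical helper: strip any of the characters in $charlist from both ends.
function utf8_trim_sketch($str, $charlist = null)
{
    if ($charlist === null) {
        return trim($str); // default whitespace is ASCII, native trim is safe
    }
    if ($charlist === '') {
        return $str;       // nothing to strip
    }
    // Escape regex metacharacters so the list is safe inside [...].
    $class = '[' . preg_quote($charlist, '/') . ']+';
    // \A and \z anchor to the absolute start and end of the string.
    return preg_replace('/\A' . $class . '|' . $class . '\z/u', '', $str);
}

$e = "\xC3\xA9"; // e-acute
echo utf8_trim_sketch($e . $e . 'abc' . $e, $e), "\n"; // abc
```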


utf8 :: ltrim

public static function ltrim

Strips whitespace (or other UTF-8 characters) from the beginning of a string.

Parameters:

string str
input string
string charlist
(NULL) string of characters to remove

See: http://php.net/ltrim

Author: Andreas Gohr

Return: string


utf8 :: rtrim

public static function rtrim

Strips whitespace (or other UTF-8 characters) from the end of a string.

Parameters:

string str
input string
string charlist
(NULL) string of characters to remove

See: http://php.net/rtrim

Author: Andreas Gohr

Return: string


utf8 :: ord

public static function ord

Returns the unicode ordinal for a character.

Parameters:

string chr
UTF-8 encoded character

See: http://php.net/ord

Author: Harry Fuecks

Return: integer
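
Decoding a single character follows directly from the UTF-8 bit layout. The sketch below is an illustration with a hypothetical name, utf8_ord_sketch; it assumes the input is one well-formed UTF-8 character and does not validate continuation bytes.

```php
<?php
// Hypothetical helper: code point of one UTF-8 encoded character.
function utf8_ord_sketch($chr)
{
    $b0 = ord($chr[0]);
    if ($b0 < 0x80) {                    // 0xxxxxxx: plain ASCII
        return $b0;
    }
    if (($b0 & 0xE0) === 0xC0) {         // 110xxxxx 10xxxxxx
        return (($b0 & 0x1F) << 6) | (ord($chr[1]) & 0x3F);
    }
    if (($b0 & 0xF0) === 0xE0) {         // 1110xxxx 10xxxxxx 10xxxxxx
        return (($b0 & 0x0F) << 12)
             | ((ord($chr[1]) & 0x3F) << 6)
             | (ord($chr[2]) & 0x3F);
    }
    if (($b0 & 0xF8) === 0xF0) {         // 11110xxx and three continuation bytes
        return (($b0 & 0x07) << 18)
             | ((ord($chr[1]) & 0x3F) << 12)
             | ((ord($chr[2]) & 0x3F) << 6)
             | (ord($chr[3]) & 0x3F);
    }
    return false;                        // not a valid lead byte
}

printf("%X\n", utf8_ord_sketch("\xE2\x82\xAC")); // 20AC (euro sign)
```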


utf8 :: to_unicode

public static function to_unicode

Takes a UTF-8 string and returns an array of ints representing the Unicode characters. Astral planes are supported, i.e. the ints in the output can be > 0xFFFF. Occurrences of the BOM are ignored. Surrogates are not allowed.

The Original Code is Mozilla Communicator client code. The Initial Developer of the Original Code is Netscape Communications Corporation. Portions created by the Initial Developer are Copyright (C) 1998 the Initial Developer. Ported to PHP by Henri Sivonen, see http://hsivonen.iki.fi/php-utf8/. Slight modifications to fit with phputf8 library by Harry Fuecks.

Parameters:

string str
UTF-8 encoded string

Return:

  • array unicode code points
  • boolean FALSE if the string is invalid

utf8 :: from_unicode

public static function from_unicode

Takes an array of ints representing the Unicode characters and returns a UTF-8 string. Astral planes are supported, i.e. the ints in the input can be > 0xFFFF. Occurrences of the BOM are ignored. Surrogates are not allowed.

The Original Code is Mozilla Communicator client code. The Initial Developer of the Original Code is Netscape Communications Corporation. Portions created by the Initial Developer are Copyright (C) 1998 the Initial Developer. Ported to PHP by Henri Sivonen, see http://hsivonen.iki.fi/php-utf8/. Slight modifications to fit with phputf8 library by Harry Fuecks.

Parameters:

array arr
unicode code points representing a string

Return:

  • string utf8 string of characters
  • boolean FALSE if a code point cannot be found
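
The rules stated above (BOM skipped, surrogates rejected, astral planes allowed) can be sketched as a small encoder. This is an independent illustration, not the ported Mozilla code; from_unicode_sketch is a hypothetical name, and treating any value above 0x10FFFF as an error is an assumption.

```php
<?php
// Hypothetical helper: encode an array of code points as a UTF-8 string.
function from_unicode_sketch(array $arr)
{
    $out = '';
    foreach ($arr as $cp) {
        if ($cp === 0xFEFF) {
            continue;                    // byte order mark: ignored
        }
        if ($cp < 0 || $cp > 0x10FFFF || ($cp >= 0xD800 && $cp <= 0xDFFF)) {
            return false;                // out of range or a surrogate
        }
        if ($cp < 0x80) {                // 1 byte
            $out .= chr($cp);
        } elseif ($cp < 0x800) {         // 2 bytes
            $out .= chr(0xC0 | ($cp >> 6))
                  . chr(0x80 | ($cp & 0x3F));
        } elseif ($cp < 0x10000) {       // 3 bytes
            $out .= chr(0xE0 | ($cp >> 12))
                  . chr(0x80 | (($cp >> 6) & 0x3F))
                  . chr(0x80 | ($cp & 0x3F));
        } else {                         // 4 bytes (astral planes)
            $out .= chr(0xF0 | ($cp >> 18))
                  . chr(0x80 | (($cp >> 12) & 0x3F))
                  . chr(0x80 | (($cp >> 6) & 0x3F))
                  . chr(0x80 | ($cp & 0x3F));
        }
    }
    return $out;
}

// "H", e-acute, euro sign
var_dump(from_unicode_sketch(array(0x48, 0xE9, 0x20AC)) === "H\xC3\xA9\xE2\x82\xAC"); // bool(true)
```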