Multibyte String Functions
PHP Manual

mb_strtolower

(PHP 4 >= 4.3.0, PHP 5)

mb_strtolower - Make a string lowercase

Description

string mb_strtolower ( string $str [, string $encoding = mb_internal_encoding() ] )

Returns str with all alphabetic characters converted to lowercase.

Parameters

str

The string being lowercased.

encoding

The encoding parameter is the character encoding. If it is omitted, the internal character encoding is used.

Return Values

str with all alphabetic characters converted to lowercase.

Unicode

For more information about the Unicode properties, please see http://www.unicode.org/unicode/reports/tr21/.

By contrast to strtolower(), 'alphabetic' is determined by the Unicode character properties. Thus the behaviour of this function is not affected by locale settings and it can convert any characters that have 'alphabetic' property, such as A-umlaut (Ä).
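
As a small illustration of the paragraph above, the following sketch lowercases a UTF-8 string that contains an A-umlaut. It assumes the file is saved as UTF-8 and that the mbstring extension is loaded; the variable names are illustrative only.

```php
<?php
// Assumption: this source file is saved as UTF-8 and mbstring is available.
$word = "ÄPFEL";

// Passing the encoding explicitly avoids relying on mb_internal_encoding().
$lower = mb_strtolower($word, 'UTF-8');

echo $lower, "\n";                     // äpfel
echo mb_strlen($lower, 'UTF-8'), "\n"; // 5
```

Naming the encoding on every call is a deliberate choice here: the result then does not depend on how the server's internal encoding happens to be configured.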

Examples

Example #1 mb_strtolower() example

<?php
$str = "Mary Had A Little Lamb and She LOVED It So";
$str = mb_strtolower($str);
echo $str; // Prints mary had a little lamb and she loved it so
?>

Example #2 mb_strtolower() example with non-Latin UTF-8 text

<?php
$str = "Τάχιστη αλώπηξ βαφής ψημένη γη, δρασκελίζει υπέρ νωθρού κυνός";
$str = mb_strtolower($str, 'UTF-8');
echo $str; // Prints τάχιστη αλώπηξ βαφής ψημένη γη, δρασκελίζει υπέρ νωθρού κυνός
?>


User Contributed Notes:

akniep at linklift dot net (12-Sep-2011 04:42)

Please note that, when used with UTF-8, mb_strtolower will only convert upper case characters that are marked with the Unicode property "Upper case letter" ("Lu"). However, there are also characters such as "Letter numbers" (Unicode property "Nl") that have lower case and upper case variants. These characters will not be converted by mb_strtolower!

Example:
The Roman numerals Ⅰ, Ⅱ, Ⅲ, ..., Ⅿ (Unicode code points 8544 through 8559) also exist in their respective lower case variants ⅰ, ⅱ, ⅲ, ..., ⅿ (Unicode code points 8560 through 8575) and should, in my opinion, also be converted by mb_strtolower, but they are not!

Large internet companies (like Google) treat both variants as semantically equal (since the representations differ only in case).

Since I could not find a proper solution on the internet for mapping all UTF-8 strings to their lowercase counterparts in PHP, I offer the following hard-coded, extended mb_strtolower function for UTF-8 strings:

The function wraps the existing mb_strtolower() and additionally replaces uppercase UTF-8 characters for which a lowercase representation exists. Since I was unable to find a proper Unicode uppercase/lowercase character table on the internet, I checked the first million UTF-8 characters against Google Search and the Google Keyword Tool and identified the following 78 uppercase characters that are not replaced by mb_strtolower but do have a UTF-8 lowercase counterpart.

<?php

// The numbers in the inline comments are the characters' Unicode code points (CP).
function strtolower_utf8_extended( $utf8_string )
{
    $additional_replacements = array
        ( "ǅ" => "ǆ"        //   453 ->   454
        , "ǈ" => "ǉ"        //   456 ->   457
        , "ǋ" => "ǌ"        //   459 ->   460
        , "ǲ" => "ǳ"        //   498 ->   499
        , "Ϸ" => "ϸ"        //  1015 ->  1016
        , "Ϲ" => "ϲ"        //  1017 ->  1010
        , "Ϻ" => "ϻ"        //  1018 ->  1019
        , "ᾈ" => "ᾀ"        //  8072 ->  8064
        , "ᾉ" => "ᾁ"        //  8073 ->  8065
        , "ᾊ" => "ᾂ"        //  8074 ->  8066
        , "ᾋ" => "ᾃ"        //  8075 ->  8067
        , "ᾌ" => "ᾄ"        //  8076 ->  8068
        , "ᾍ" => "ᾅ"        //  8077 ->  8069
        , "ᾎ" => "ᾆ"        //  8078 ->  8070
        , "ᾏ" => "ᾇ"        //  8079 ->  8071
        , "ᾘ" => "ᾐ"        //  8088 ->  8080
        , "ᾙ" => "ᾑ"        //  8089 ->  8081
        , "ᾚ" => "ᾒ"        //  8090 ->  8082
        , "ᾛ" => "ᾓ"        //  8091 ->  8083
        , "ᾜ" => "ᾔ"        //  8092 ->  8084
        , "ᾝ" => "ᾕ"        //  8093 ->  8085
        , "ᾞ" => "ᾖ"        //  8094 ->  8086
        , "ᾟ" => "ᾗ"        //  8095 ->  8087
        , "ᾨ" => "ᾠ"        //  8104 ->  8096
        , "ᾩ" => "ᾡ"        //  8105 ->  8097
        , "ᾪ" => "ᾢ"        //  8106 ->  8098
        , "ᾫ" => "ᾣ"        //  8107 ->  8099
        , "ᾬ" => "ᾤ"        //  8108 ->  8100
        , "ᾭ" => "ᾥ"        //  8109 ->  8101
        , "ᾮ" => "ᾦ"        //  8110 ->  8102
        , "ᾯ" => "ᾧ"        //  8111 ->  8103
        , "ᾼ" => "ᾳ"        //  8124 ->  8115
        , "ῌ" => "ῃ"        //  8140 ->  8131
        , "ῼ" => "ῳ"        //  8188 ->  8179
        , "Ⅰ" => "ⅰ"        //  8544 ->  8560
        , "Ⅱ" => "ⅱ"        //  8545 ->  8561
        , "Ⅲ" => "ⅲ"        //  8546 ->  8562
        , "Ⅳ" => "ⅳ"        //  8547 ->  8563
        , "Ⅴ" => "ⅴ"        //  8548 ->  8564
        , "Ⅵ" => "ⅵ"        //  8549 ->  8565
        , "Ⅶ" => "ⅶ"        //  8550 ->  8566
        , "Ⅷ" => "ⅷ"        //  8551 ->  8567
        , "Ⅸ" => "ⅸ"        //  8552 ->  8568
        , "Ⅹ" => "ⅹ"        //  8553 ->  8569
        , "Ⅺ" => "ⅺ"        //  8554 ->  8570
        , "Ⅻ" => "ⅻ"        //  8555 ->  8571
        , "Ⅼ" => "ⅼ"        //  8556 ->  8572
        , "Ⅽ" => "ⅽ"        //  8557 ->  8573
        , "Ⅾ" => "ⅾ"        //  8558 ->  8574
        , "Ⅿ" => "ⅿ"        //  8559 ->  8575
        , "Ⓐ" => "ⓐ"        //  9398 ->  9424
        , "Ⓑ" => "ⓑ"        //  9399 ->  9425
        , "Ⓒ" => "ⓒ"        //  9400 ->  9426
        , "Ⓓ" => "ⓓ"        //  9401 ->  9427
        , "Ⓔ" => "ⓔ"        //  9402 ->  9428
        , "Ⓕ" => "ⓕ"        //  9403 ->  9429
        , "Ⓖ" => "ⓖ"        //  9404 ->  9430
        , "Ⓗ" => "ⓗ"        //  9405 ->  9431
        , "Ⓘ" => "ⓘ"        //  9406 ->  9432
        , "Ⓙ" => "ⓙ"        //  9407 ->  9433
        , "Ⓚ" => "ⓚ"        //  9408 ->  9434
        , "Ⓛ" => "ⓛ"        //  9409 ->  9435
        , "Ⓜ" => "ⓜ"        //  9410 ->  9436
        , "Ⓝ" => "ⓝ"        //  9411 ->  9437
        , "Ⓞ" => "ⓞ"        //  9412 ->  9438
        , "Ⓟ" => "ⓟ"        //  9413 ->  9439
        , "Ⓠ" => "ⓠ"        //  9414 ->  9440
        , "Ⓡ" => "ⓡ"        //  9415 ->  9441
        , "Ⓢ" => "ⓢ"        //  9416 ->  9442
        , "Ⓣ" => "ⓣ"        //  9417 ->  9443
        , "Ⓤ" => "ⓤ"        //  9418 ->  9444
        , "Ⓥ" => "ⓥ"        //  9419 ->  9445
        , "Ⓦ" => "ⓦ"        //  9420 ->  9446
        , "Ⓧ" => "ⓧ"        //  9421 ->  9447
        , "Ⓨ" => "ⓨ"        //  9422 ->  9448
        , "Ⓩ" => "ⓩ"        //  9423 ->  9449
        , "𐐦" => "𐑎"        // 66598 -> 66638
        , "𐐧" => "𐑏"        // 66599 -> 66639
        );

    $utf8_string = mb_strtolower( $utf8_string, "UTF-8" );

    // The listing breaks off after "$utf8_string =" in this copy. The lines below are an
    // assumed completion based on the description above: apply the extra replacements.
    $utf8_string = strtr( $utf8_string, $additional_replacements );

    return $utf8_string;
}

?>

Ken Shiro (31-Jul-2010 10:07)

[If you get this error:]
Fatal error: Call to undefined function: mb_strtolower() in ????.php on line ??

The PHP mbstring extension, which is required to handle international character sets, is not available on your server. Check your PHP configuration and make sure that PHP has been compiled with --enable-mbstring.

The same applies to:
Call to undefined function mb_eregi() / mb_strtolower()
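
One way to avoid the fatal error at runtime is to check for the function before calling it. The sketch below is only an illustration: the helper name lowercase_if_possible() is hypothetical, and the strtolower() fallback is only reliable for ASCII letters.

```php
<?php
// Hypothetical helper (not part of PHP): uses mbstring when it is available,
// otherwise falls back to strtolower(), which is only reliable for ASCII.
function lowercase_if_possible(string $text, string $encoding = 'UTF-8'): string
{
    if (function_exists('mb_strtolower')) {
        return mb_strtolower($text, $encoding);
    }
    return strtolower($text);
}

echo lowercase_if_possible("Mary Had A Little Lamb"), "\n"; // mary had a little lamb
```

If correct handling of non-ASCII text matters, failing early with a clear message when mbstring is missing may be preferable to a silent fallback.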

Philipp H (01-Nov-2007 06:11)

Note that mb_strtolower() is very SLOW. If you have a database connection, you may want to use it to convert your strings to lower case instead. This works even for latin1/latin9 (ISO-8859-1/15) and other encodings.

Have a look at my simple benchmark:

<?php

$text = "LÖrem ipßüm dÖlÖr ßit Ämet, cÖnßectetüer Ädipißcing elit. Sed ligülÄ. PrÄeßent jüßtÖ tellüß, grÄvidÄ eü, tempüß Ä, mÄttiß nÖn, Örci. NÄm qüiß lÖrem. NÄm Äliqüet elit ßed elit. PhÄßellüß venenÄtiß jüßtÖ eget enim. DÖnec nißl. PrÖin mÄttiß venenÄtiß jüßtÖ. Sed ÄliqüÄm pÖrtÄ Örci. CrÄß elit nißl, cÖnvÄlliß qüiß, tincidünt Ät, vehicülÄ ÄccümßÄn, ÖdiÖ. Sed mÖleßtie. EtiÄm mÖlliß feügiÄt elit. Veßtibülüm Änte ipßüm primiß in fÄücibüß Örci lüctüß et ültriceß pÖßüere cübiliÄ CürÄe; MÄecenÄß nÖn nüllÄ.";

// mb_strtolower()
$timeMB = microtime(true);

for ($i = 0; $i < 30000; $i++) {
    $lower = mb_strtolower("$text/no-cache-$i");
}

$timeMB = microtime(true) - $timeMB;

// MySQL lower()
$timeSQL = microtime(true);

mysql_query("set names latin1");
for ($i = 0; $i < 30000; $i++) {
    $r = mysql_fetch_row(mysql_query("select lower('$text/no-cache-$i')"));
    $lower = $r[0];
}

$timeSQL = microtime(true) - $timeSQL;

echo "mb: " . sprintf("%.5f", $timeMB) . " sek.<br />";
echo "sql: " . sprintf("%.5f", $timeSQL) . " sek.<br />";

// Result on my notebook:
// mb: 11.50642 sek.
// sql: 5.44143 sek.

?>

btherl at yahoo dot com dot au (16-Nov-2005 05:12)

If you use this function on a Unicode string without telling PHP that it is Unicode, you will corrupt your string. In particular, the byte 0xC3, which is uppercase 'A' with tilde (Ã) in ISO-8859-1 and also the lead byte of many 2-byte UTF-8 characters, is converted to lowercase 'a' with tilde (0xE3).

This can be handled correctly by:
$str = mb_strtolower($str, mb_detect_encoding($str));

Or if you know your data is UTF-8, just use the string "UTF-8" as the second argument.

You should also check that mb_detect_encoding() is considering the encodings you want it to consider, and that it is detecting them correctly.
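
The corruption described in this note can be reproduced on a single character. The sketch below is a hedged illustration: it assumes the data is expected to be UTF-8 and validates it with mb_check_encoding() instead of relying on detection; the helper name lowercase_utf8_or_null() is hypothetical.

```php
<?php
// Hypothetical helper (not part of PHP): lowercases $text only if it is valid UTF-8.
function lowercase_utf8_or_null(string $text): ?string
{
    if (!mb_check_encoding($text, 'UTF-8')) {
        return null; // not valid UTF-8: let the caller decide what to do
    }
    return mb_strtolower($text, 'UTF-8');
}

// "\xC3\x89" is the UTF-8 byte sequence for "É".
$e_acute = "\xC3\x89";

// Treated as ISO-8859-1, the lead byte 0xC3 is "Ã" and becomes 0xE3 ("ã").
echo bin2hex(mb_strtolower($e_acute, 'ISO-8859-1')), "\n"; // e389 (no longer valid UTF-8)

// Treated as UTF-8, the character is lowercased as a whole.
echo bin2hex(lowercase_utf8_or_null($e_acute)), "\n";      // c3a9 ("é")
```

Returning null for invalid input is just one possible design; throwing an exception or converting the input first may suit other applications better.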