mirror of
https://github.com/boostorg/locale.git
synced 2026-02-27 17:12:20 +00:00
- cleanup of inspect warnings - various requiremets like libraries.htm and maintainers.txt [SVN r73786]
120 lines
8.3 KiB
HTML
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/xhtml;charset=UTF-8"/>
<title>Boost.Locale: Text Conversions</title>
<link href="tabs.css" rel="stylesheet" type="text/css"/>
<link href="doxygen.css" rel="stylesheet" type="text/css"/>
<link rel="stylesheet" type="text/css" href="../style/section-basic.css"/>
</head>
<body>
<div id="boost-common-heading-doc">
<div class="heading-inner">
<div class="heading-placard"></div>

<h1 class="heading-title">
<a href="http://www.boost.org/">
<img src="../style/space.png" alt="Boost C++ Libraries" class="heading-logo" />
<span class="heading-boost">Boost</span>
<span class="heading-cpplibraries">C++ Libraries</span>
</a>
</h1>

<p class="heading-quote">

<q>...one of the most highly
regarded and expertly designed C++ library projects in the
world.</q>

<span class="heading-attribution">&mdash; <a href=
"http://www.gotw.ca/" class="external">Herb Sutter</a> and <a href=
"http://en.wikipedia.org/wiki/Andrei_Alexandrescu" class="external">Andrei
Alexandrescu</a>, <a href=
"http://safari.awprofessional.com/?XmlId=0321113586" class="external">C++
Coding Standards</a></span>
</p>
</div>
</div>

<div id="boost-common-heading-doc-spacer"></div>
<!-- Generated by Doxygen 1.7.1 -->
<div class="navigation" id="top">
<div class="tabs">
<ul class="tablist">
<li><a href="main.html"><span>Main Page</span></a></li>
<li><a href="modules.html"><span>Modules</span></a></li>
<li><a href="namespaces.html"><span>Namespaces</span></a></li>
<li><a href="annotated.html"><span>Classes</span></a></li>
<li><a href="files.html"><span>Files</span></a></li>
<li><a href="examples.html"><span>Examples</span></a></li>
</ul>
</div>
<div class="navpath">
<ul>
<li><a class="el" href="main.html">Boost.Locale</a> </li>
<li><a class="el" href="using_boost_locale.html">Using Boost.Locale</a> </li>
</ul>
</div>
</div>
<div class="header">
<div class="headertitle">
<h1>Text Conversions </h1> </div>
</div>
<div class="contents">
<p>Boost.Locale provides a set of functions that perform basic string conversion operations: upper, lower and <a class="el" href="glossary.html#term_title_case">title case</a> conversions, <a class="el" href="glossary.html#term_case_folding">case folding</a> and Unicode <a class="el" href="glossary.html#term_normalization">normalization</a>. These are <a class="el" href="group__convert.html#ga7889a57e1bc1059fbb107db0781d0b6d">to_upper</a>, <a class="el" href="group__convert.html#ga33de83f16ff2c09cac780977c6f67099">to_lower</a>, <a class="el" href="group__convert.html#ga646f42adb01baf395d632c32f556a5b9">to_title</a>, <a class="el" href="group__convert.html#gabd1bc157122a5b9392487126fd0fffe5">fold_case</a> and <a class="el" href="group__convert.html#ga11672e3cc3ed7eecf1dd07060265aab2">normalize</a>.</p>
<p>All these functions receive a <code>std::locale</code> object as a parameter or use the global locale by default.</p>
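<p>As a minimal sketch of the two calling styles, the program below first passes a locale explicitly and then relies on the global locale. The locale name <code>en_US.UTF-8</code> and the use of <code>boost::locale::generator</code> to create it are assumptions made for illustration; any locale your application already uses should work the same way.</p>
<div class="fragment"><pre class="fragment">
#include &lt;boost/locale.hpp&gt;
#include &lt;iostream&gt;
#include &lt;locale&gt;
#include &lt;string&gt;

int main()
{
    // Assumed setup: build a UTF-8 locale with the Boost.Locale generator.
    boost::locale::generator gen;
    std::locale loc = gen("en_US.UTF-8");

    // Style 1: pass the locale explicitly as the last argument.
    std::string explicit_result = boost::locale::to_upper("hello", loc);

    // Style 2: install the locale globally and omit the argument.
    std::locale::global(loc);
    std::string global_result = boost::locale::to_upper("hello");

    // Both calls are expected to produce the same text.
    std::cout &lt;&lt; explicit_result &lt;&lt; " " &lt;&lt; global_result &lt;&lt; std::endl;
    return 0;
}
</pre></div>
<p>Passing the locale explicitly avoids a dependency on global state, which can be useful when different parts of a program work with different locales.</p>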
<p>The global locale is used in all examples below.</p>
<h2><a class="anchor" id="conversions_case"></a>
Case Handling</h2>
<p>For example: </p>
<div class="fragment"><pre class="fragment">    std::string grussen = <span class="stringliteral">"grüßEN"</span>;
    std::cout &lt;&lt; <span class="stringliteral">"Upper "</span> &lt;&lt; <a class="code" href="group__convert.html#ga7889a57e1bc1059fbb107db0781d0b6d">boost::locale::to_upper</a>(grussen) &lt;&lt; std::endl
              &lt;&lt; <span class="stringliteral">"Lower "</span> &lt;&lt; <a class="code" href="group__convert.html#ga33de83f16ff2c09cac780977c6f67099">boost::locale::to_lower</a>(grussen) &lt;&lt; std::endl
              &lt;&lt; <span class="stringliteral">"Title "</span> &lt;&lt; <a class="code" href="group__convert.html#ga646f42adb01baf395d632c32f556a5b9">boost::locale::to_title</a>(grussen) &lt;&lt; std::endl
              &lt;&lt; <span class="stringliteral">"Fold  "</span> &lt;&lt; <a class="code" href="group__convert.html#gabd1bc157122a5b9392487126fd0fffe5">boost::locale::fold_case</a>(grussen) &lt;&lt; std::endl;
</pre></div><p>This would print:</p>
<div class="fragment"><pre class="fragment">
Upper GRÜSSEN
Lower grüßen
Title Grüßen
Fold  grüssen
</pre></div><p>You may notice that the Boost.StringAlgo library already provides <code>to_upper</code> and <code>to_lower</code> functions. The difference is that the Boost.Locale functions operate on the entire string instead of converting it character by character, which gives incorrect results when one character maps to several.</p>
<p>For example:</p>
<div class="fragment"><pre class="fragment">    std::wstring grussen = L<span class="stringliteral">"grüßen"</span>;
    std::wcout &lt;&lt; boost::algorithm::to_upper_copy(grussen) &lt;&lt; <span class="stringliteral">" "</span> &lt;&lt; <a class="code" href="group__convert.html#ga7889a57e1bc1059fbb107db0781d0b6d">boost::locale::to_upper</a>(grussen) &lt;&lt; std::endl;
</pre></div><p>This would output:</p>
<div class="fragment"><pre class="fragment">
GRÜßEN GRÜSSEN
</pre></div><p>Here the letter "ß" was not converted to "SS" in the first case because of a limitation of the <code>std::ctype</code> facet, which converts one character at a time.</p>
<p>This is even more problematic with UTF-8 encoding, where non-US-ASCII characters are not converted at all. For example, this code</p>
<div class="fragment"><pre class="fragment">    std::string grussen = <span class="stringliteral">"grüßen"</span>;
    std::cout &lt;&lt; boost::algorithm::to_upper_copy(grussen) &lt;&lt; <span class="stringliteral">" "</span> &lt;&lt; <a class="code" href="group__convert.html#ga7889a57e1bc1059fbb107db0781d0b6d">boost::locale::to_upper</a>(grussen) &lt;&lt; std::endl;
</pre></div><p>would modify only the ASCII characters in the first case:</p>
<div class="fragment"><pre class="fragment">
GRüßEN GRÜSSEN
</pre></div><h2><a class="anchor" id="conversions_normalization"></a>
Unicode Normalization</h2>
<p>Unicode normalization is the process of converting strings to a standard form, suitable for text processing and comparison. For example, the character "ü" can be represented by a single code point or by a combination of the character "u" and the diaeresis "¨". Normalization is an important part of Unicode text processing.</p>
<p>Unicode defines four normalization forms. Each form is selected by a flag passed to the <a class="el" href="group__convert.html#ga11672e3cc3ed7eecf1dd07060265aab2">normalize</a> function:</p>
<ul>
<li>NFD - Canonical decomposition - boost::locale::norm_nfd</li>
<li>NFC - Canonical decomposition followed by canonical composition - boost::locale::norm_nfc or boost::locale::norm_default</li>
<li>NFKD - Compatibility decomposition - boost::locale::norm_nfkd</li>
<li>NFKC - Compatibility decomposition followed by canonical composition - boost::locale::norm_nfkc</li>
</ul>
<p>For more details on normalization forms, see <a href="http://unicode.org/reports/tr15/#Norm_Forms">Unicode Standard Annex #15</a>.</p>
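<p>The following sketch illustrates why normalization matters for comparison. It stores "ü" in both representations as explicit UTF-8 bytes, shows that the raw strings differ, and then normalizes both to NFC. The locale name <code>en_US.UTF-8</code> is an assumption, and the sketch also assumes that Boost.Locale was built with a backend that supports normalization (for example ICU); with other backends the strings may be returned unchanged.</p>
<div class="fragment"><pre class="fragment">
#include &lt;boost/locale.hpp&gt;
#include &lt;iostream&gt;
#include &lt;locale&gt;
#include &lt;string&gt;

int main()
{
    // Assumed setup: a UTF-8 locale created with the Boost.Locale generator.
    boost::locale::generator gen;
    std::locale::global(gen("en_US.UTF-8"));

    // "ü" as a single code point (U+00FC), encoded in UTF-8.
    std::string composed = "\xC3\xBC";
    // "u" followed by the combining diaeresis (U+0308), encoded in UTF-8.
    std::string decomposed = "u\xCC\x88";

    // The byte sequences differ although both spell the same text.
    std::cout &lt;&lt; std::boolalpha &lt;&lt; (composed == decomposed) &lt;&lt; std::endl;

    // Normalize both strings to the same form (NFC) before comparing.
    std::string a = boost::locale::normalize(composed, boost::locale::norm_nfc);
    std::string b = boost::locale::normalize(decomposed, boost::locale::norm_nfc);
    std::cout &lt;&lt; (a == b) &lt;&lt; std::endl;
    return 0;
}
</pre></div>
<p>Normalizing both operands to the same form before comparing or storing them is the usual way to make such equivalent strings interchangeable.</p>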
<h2><a class="anchor" id="conversions_notes"></a>
Notes</h2>
<ul>
<li><a class="el" href="group__convert.html#ga11672e3cc3ed7eecf1dd07060265aab2">normalize</a> operates only on Unicode-encoded strings, that is, UTF-8, UTF-16 or UTF-32 depending on the character width. Be careful when using non-UTF encodings, as such strings may be processed incorrectly.</li>
<li><a class="el" href="group__convert.html#gabd1bc157122a5b9392487126fd0fffe5">fold_case</a> is generally a locale-independent operation, but it receives a locale as a parameter to determine the 8-bit encoding.</li>
<li>All of these functions can work with an STL string, a NUL-terminated string, or a range defined by two pointers. They always return a newly created STL string.</li>
<li>The length of the string may change, as in the example above where "ß" becomes "SS". </li>
</ul>
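<p>The three input forms mentioned in the notes can be sketched as follows. The locale name <code>en_US.UTF-8</code> and the sample text are assumptions chosen for illustration; plain ASCII text is used so that the result does not depend on the source file encoding.</p>
<div class="fragment"><pre class="fragment">
#include &lt;boost/locale.hpp&gt;
#include &lt;iostream&gt;
#include &lt;locale&gt;
#include &lt;string&gt;

int main()
{
    // Assumed setup: a UTF-8 locale created with the Boost.Locale generator.
    boost::locale::generator gen;
    std::locale::global(gen("en_US.UTF-8"));

    const char* text = "hello world";

    // 1. An STL string.
    std::string from_string = boost::locale::to_upper(std::string(text));

    // 2. A NUL-terminated string.
    std::string from_c_string = boost::locale::to_upper(text);

    // 3. A range defined by two pointers: only the first five characters.
    std::string from_range = boost::locale::to_upper(text, text + 5);

    // Each call returns a newly created std::string; the input is untouched.
    std::cout &lt;&lt; from_string &lt;&lt; '\n'
              &lt;&lt; from_c_string &lt;&lt; '\n'
              &lt;&lt; from_range &lt;&lt; std::endl;
    return 0;
}
</pre></div>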
</div>
<hr class="footer"/><address class="footer"><small>
© Copyright 2009-2011 Artyom Beilis, Distributed under the <a href="http://www.boost.org/LICENSE_1_0.txt">Boost Software License</a>, Version 1.0.
</small></address>
</body>
</html>