mirror of https://github.com/boostorg/nowide.git synced 2026-02-23 15:52:19 +00:00

Updated docs

This commit is contained in:
Artyom Beilis
2012-06-02 18:02:35 +03:00
parent a5ea3d2736
commit 4044d633b7


@@ -14,18 +14,18 @@
Table of Contents:
- \ref main
- \ref main_rationale
- \ref main_the_problem
- \ref main_the_solution
- \ref main_wide
- \ref main_reading
- \ref main_rationale
- \ref main_the_problem
- \ref main_the_solution
- \ref main_wide
- \ref main_reading
- \ref using
- \ref using_standard
- \ref using_custom
- \ref technical
- \ref technical_imple
- \ref technical_cio
- \ref qna
\section main What is Boost.Nowide
@@ -39,8 +39,8 @@ requiring to use Wide API.
\subsection main_rationale Rationale
\subsubsection main_the_problem The Problem
\section main_rationale Rationale
\subsection main_the_problem The Problem
Consider a simple application that splits a big file into chunks, such that
they can be sent by e-mail. It requires doing a few very simple tasks:
@@ -58,9 +58,9 @@ internally -- the vast majority of Unix-like operating systems: Linux, Mac OS X,
Solaris, BSD. But it would fail on files like <code>War and Peace - Война и мир - מלחמה ושלום.zip</code>
under Microsoft Windows because the native Windows Unicode aware API is Wide-API - UTF-16.
This, even a trivial task is very hard to implement in cross platform manner.
Thus, even such a trivial task is very hard to implement in a cross-platform manner.
\subsubsection main_the_solution The Solution
\subsection main_the_solution The Solution
Boost.Nowide provides a set of standard library functions that are UTF-8 aware and
make Unicode-aware programming easier.
@@ -91,7 +91,7 @@ The library provides:
- \c cin
\subsubsection main_wide Why Not Narrow and Wide?
\subsection main_wide Why Not Narrow and Wide?
Why not provide both Wide and Narrow implementations so the
developer can choose to use Wide characters on Unix-Like platforms
@@ -104,7 +104,7 @@ Several reasons:
to stick to the standards rather than re-implement Wide API in "Microsoft Windows Style"
\subsubsection main_reading Further Reading
\subsection main_reading Further Reading
- <a href="http://www.utf8everywhere.org/">www.utf8everywhere.org</a>
- <a href="http://alfps.wordpress.com/2011/11/22/unicode-part-1-windows-console-io-approaches/">Windows console i/o approaches</a>
@@ -134,7 +134,7 @@ int main(int argc,char **argv)
std::cerr << "Can't open a file " << argv[1] << std::endl;
return 1;
}
int total_lines = 0
int total_lines = 0;
while(f) {
if(f.get() == '\n')
total_lines++;
@@ -167,7 +167,7 @@ int main(int argc,char **argv)
boost::nowide::cerr << "Can't open a file " << argv[1] << std::endl;
return 1;
}
int total_lines = 0
int total_lines = 0;
while(f) {
if(f.get() == '\n')
total_lines++;
@@ -218,7 +218,7 @@ CopyFileW(wexisting_file.c_str(),wnew_file.c_str(),TRUE);
that use buffers of size 256 or 16 characters; if the string is longer, they fall back to
dynamic memory allocation.
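The fixed-buffer-with-fallback idea can be sketched roughly as follows. This is only an illustration: \c small_wbuffer and its members are hypothetical names, not the library's own class, and the sketch only reserves storage without performing any encoding conversion.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper, NOT Boost.Nowide's actual class: a wide character
// buffer that lives on the stack when the text fits in N characters
// (including the terminating NUL) and falls back to the heap otherwise.
template<std::size_t N = 256>
class small_wbuffer {
public:
    explicit small_wbuffer(std::size_t len) : size_(len)
    {
        if(len + 1 > N)
            heap_.resize(len + 1); // too long for the fixed buffer: allocate
        data()[len] = L'\0';       // always keep the buffer NUL terminated
    }
    wchar_t *data() { return heap_.empty() ? stack_ : &heap_[0]; }
    const wchar_t *c_str() const { return heap_.empty() ? stack_ : &heap_[0]; }
    std::size_t size() const { return size_; }
    bool on_heap() const { return !heap_.empty(); }

private:
    wchar_t stack_[N];
    std::vector<wchar_t> heap_;
    std::size_t size_;
};
```

With <code>small_wbuffer<16></code>, a 5 character string stays in the fixed buffer, while a 100 character string triggers one heap allocation. Short file names, which are the common case, therefore avoid allocation entirely.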
\subsection using_windows_h \c windows.h header
\subsection using_windows_h windows.h header
The library does not include \c windows.h, in order to avoid polluting the namespace with numerous
defines and types. Instead, the library declares the prototypes of the Win32 API functions itself.
@@ -257,6 +257,43 @@ the stream is not a tty (for example, a pipe), then ReadFile/WriteFile is used.
This approach eliminates the need for manual code page handling. If TrueType
fonts are used, Unicode-aware input and output work.
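A minimal sketch of this kind of dispatch for output is shown below. It is an assumption-laden illustration rather than the library's implementation: \c write_utf8_stdout is a hypothetical name, the sketch includes \c windows.h directly (which the library itself avoids), and it bypasses stdio buffering on Windows.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>
#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical helper (not part of Boost.Nowide): writes UTF-8 text to
// standard output, using the wide console API only for a real console.
bool write_utf8_stdout(const std::string &text)
{
    if(text.empty())
        return true;
#ifdef _WIN32
    HANDLE h = GetStdHandle(STD_OUTPUT_HANDLE);
    DWORD mode = 0;
    if(GetConsoleMode(h, &mode)) {
        // Real console: convert UTF-8 to UTF-16 and call WriteConsoleW.
        int n = MultiByteToWideChar(CP_UTF8, 0, text.data(),
                                    static_cast<int>(text.size()), NULL, 0);
        if(n <= 0)
            return false;
        std::wstring wide(static_cast<std::size_t>(n), L'\0');
        MultiByteToWideChar(CP_UTF8, 0, text.data(),
                            static_cast<int>(text.size()), &wide[0], n);
        DWORD written = 0;
        return WriteConsoleW(h, wide.data(), static_cast<DWORD>(wide.size()),
                             &written, NULL) != 0;
    }
    // Pipe or file: pass the UTF-8 bytes through unchanged.
    DWORD written = 0;
    return WriteFile(h, text.data(), static_cast<DWORD>(text.size()),
                     &written, NULL) != 0
           && written == text.size();
#else
    // POSIX: the byte stream is written as-is.
    return std::fwrite(text.data(), 1, text.size(), stdout) == text.size();
#endif
}
```

Because the console branch never goes through a code page, the caller does not have to set or query one; redirected output receives the original UTF-8 bytes.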
\section qna Q & A
<b>Q: Why doesn't the library convert strings between the locale's encoding and UTF-8 (and vice versa) on POSIX systems?</b>
A: It is inherently incorrect
to convert strings to/from locale encodings on POSIX platforms.
You can create a file named "\xFF\xFF.txt" (invalid UTF-8), remove it, or pass its name as a parameter to a program,
and it would work whether the current locale is a UTF-8 locale or not.
Also, changing the locale from, say, \c en_US.UTF-8 to \c en_US.ISO-8859-1 would not magically change all
files in the OS or the strings a user may pass to the program (which is different on Windows).
POSIX OSs treat strings as \c NUL terminated cookies.
So altering their content according to the locale would
actually lead to incorrect behavior.
For example, this is a naive implementation of the standard program "rm":
\code
#include <cstdio>

int main(int argc,char **argv)
{
    for(int i=1;i<argc;i++)
        std::remove(argv[i]);
    return 0;
}
\endcode
It would work with ANY locale and changing the strings would
lead to incorrect behavior.
The meaning of a locale under POSIX and Windows platforms
is different and has very different effects.
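The claim about the "\xFF\xFF.txt" file name can be checked with a small sketch like the one below. It assumes a POSIX system whose file system stores names as raw bytes (typical on Linux; some file systems may reject such names) and a writable current directory. \c roundtrip_raw_name is a hypothetical helper name.

```cpp
#include <cstdio>

// Hypothetical helper: creates and then removes a file using the name
// exactly as given, with no locale-dependent conversion of any kind.
// Returns true when both steps succeed.
bool roundtrip_raw_name(const char *name)
{
    std::FILE *f = std::fopen(name, "w");
    if(!f)
        return false;
    std::fclose(f);
    return std::remove(name) == 0;
}
```

Calling <code>roundtrip_raw_name("\xFF\xFF.txt")</code> is expected to succeed on such systems regardless of the current locale, because the name is only ever handled as a sequence of bytes.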
*/