std::mblen
From Cppreference
Defined in header <cstdlib>
int mblen( const char* s, std::size_t n );
Determines the size, in bytes, of the multibyte character whose first byte is pointed to by s.
If s is a null pointer, determines if the current locale's multibyte character encoding is state-dependent.
This function is equivalent to the call std::mbtowc((wchar_t*)0, s, n), except that conversion state of std::mbtowc is unaffected.
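The equivalence described above can be sketched as follows. The helper names length_via_mblen and length_via_mbtowc are hypothetical and exist only for this illustration; the sketch assumes the program is still in the default "C" locale, where an ASCII letter is a one-byte character.

```cpp
#include <cstddef>
#include <cstdlib>

// Hypothetical helper: length of the next multibyte character via std::mblen.
int length_via_mblen(const char* s, std::size_t n)
{
    return std::mblen(s, n);
}

// Hypothetical helper: the same query expressed through std::mbtowc with a
// null destination, so no wide character is stored anywhere.
int length_via_mbtowc(const char* s, std::size_t n)
{
    return std::mbtowc(nullptr, s, n);
}
```

For the same s and n both helpers should report the same length. The difference is that the std::mblen call leaves the conversion state of std::mbtowc untouched.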
Parameters
s - pointer to the multibyte character
n - limit on the number of bytes in s that can be examined
Return value
If s is not a null pointer, returns the number of bytes contained in the multibyte character, -1 if the first bytes pointed to by s do not form a valid multibyte character, or 0 if s points at the null character '\0'.
If s is a null pointer, returns 0 if multibyte encoding is not state-dependent or a non-zero value if multibyte encoding is state-dependent.
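The return values listed above could be interpreted along these lines. This is a minimal sketch: the MbResult enumeration and both function names are assumptions made for the illustration, not part of the library.

```cpp
#include <cstddef>
#include <cstdlib>

// Hypothetical classification of std::mblen's result for a non-null s.
enum class MbResult { NullCharacter, Invalid, Character };

MbResult classify_mblen(const char* s, std::size_t n)
{
    const int len = std::mblen(s, n);
    if (len == 0)
        return MbResult::NullCharacter; // s points at '\0'
    if (len == -1)
        return MbResult::Invalid;       // no valid character within n bytes
    return MbResult::Character;         // len bytes form one character
}

// With a null pointer, std::mblen reports whether the current locale's
// multibyte encoding is state-dependent (non-zero) or not (zero).
bool encoding_is_state_dependent()
{
    return std::mblen(nullptr, 0) != 0;
}
```

In the default "C" locale, classify_mblen("a", 1) should yield MbResult::Character, and encoding_is_state_dependent() should return false, since that encoding carries no shift state.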
Example
#include <clocale>
#include <string>
#include <iostream>
#include <cstdlib>
#include <stdexcept>

// the number of characters in a multibyte string is the sum of mblen()'s
std::size_t strlen_mb(const std::string& s)
{
    std::size_t result = 0;
    const char* ptr = &s[0];
    const char* end = ptr + s.size();
    while(ptr < end) {
        int next = std::mblen(ptr, end-ptr);
        if(next == -1)
            throw std::runtime_error("strlen_mb(): conversion error");
        ptr += next;
        ++result;
    }
    return result;
}

int main()
{
    // allow mblen() to work with UTF-8 multibyte encoding
    std::setlocale(LC_ALL, "en_US.utf8");
    // UTF-8 narrow multibyte encoding
    std::string str = u8"z\u00df\u6c34\U0001d10b"; // or u8"zß水𝄋"
                      // or "\x7a\xc3\x9f\xe6\xb0\xb4\xf0\x9d\x84\x8b";
    std::cout << str << " is " << str.size() << " bytes, but only "
              << strlen_mb(str) << " characters\n";
}
Output:
zß水𝄋 is 10 bytes, but only 4 characters
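Because std::mblen relies on hidden conversion state, a variant of the example's counting loop that keeps its state in a local std::mbstate_t may be preferable in some programs. The following is a sketch under that assumption, using std::mbrlen; the name strlen_mb_r is hypothetical, and the choice to count an embedded '\0' as one character is a design decision of this sketch.

```cpp
#include <cstddef>
#include <cwchar>
#include <stdexcept>
#include <string>

// Hypothetical variant of strlen_mb that keeps its conversion state in a
// local std::mbstate_t instead of the hidden state used by std::mblen.
std::size_t strlen_mb_r(const std::string& s)
{
    std::mbstate_t state{}; // zero-initialized: initial conversion state
    std::size_t result = 0;
    const char* ptr = s.data();
    const char* end = ptr + s.size();
    while (ptr < end) {
        std::size_t next =
            std::mbrlen(ptr, static_cast<std::size_t>(end - ptr), &state);
        // (size_t)-1 signals an encoding error, (size_t)-2 an incomplete character
        if (next == static_cast<std::size_t>(-1) ||
            next == static_cast<std::size_t>(-2))
            throw std::runtime_error("strlen_mb_r(): invalid or incomplete character");
        if (next == 0)
            next = 1; // embedded '\0': count it and step over one byte
        ptr += next;
        ++result;
    }
    return result;
}
```

As with strlen_mb, the result depends on the current locale: call std::setlocale first if the input is UTF-8.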
See also
mbtowc - converts the next multibyte character to wide character (function)
mbrlen - returns the number of bytes in the next multibyte character, given state (function)
do_length [virtual] - calculates the length of the externT string that would be consumed by conversion into given internT buffer (virtual protected member function of std::codecvt<wchar_t, char, std::mbstate_t>)