How to load data blob in C++ and then use segmenter? #3298
Hi @sffc, I've written the code below to load the data file, but it does not run:
UnicodeString str ("ພາສາລາວ");
printUnicodeString(str);
cout<<endl;
std::ifstream file("checking.txt");
if (!file.is_open()) {
// handle error
cout<<"here1"<<endl;
}
if (!file.good()) {
cout<<"here2"<<endl;
// handle error
}
file.seekg(0, std::ios::end);
std::streamsize blob_size = file.tellg();
file.seekg(0, std::ios::beg);
cout<<"size buffer "<<blob_size<<endl;
std::vector<uint8_t> buffer(blob_size);
const diplomat::span<const uint8_t> blob(buffer.data(), buffer.size());
const uint16_t* u16Ptr = reinterpret_cast<const uint16_t*>(str.getBuffer());
diplomat::result<ICU4XDataProvider, ICU4XError> provider_result = ICU4XDataProvider::create_from_byte_slice(blob);
const auto provider_result = ICU4XDataProvider::create_from_byte_slice(blob);
const auto segmenter_auto = ICU4XWordSegmenter::create_auto(provider_result.).ok().value();
const ICU4XWordSegmenter* segmenters[] = {&segmenter_auto};
size_t sizeInBytes = str.length() * sizeof(UChar);
const size_t size = sizeInBytes / sizeof(uint16_t); // Compute the number of elements in the array
diplomat::span<const uint16_t> span(u16Ptr,size);
auto iterator = segmenters[0]->segment_utf16(span);
int32_t breakpoint = iterator.next();
int32_t breakpoint2 = iterator.next();
cout<<breakpoint2;
blob_size comes out as -1, which is why the following error occurs:
libc++abi: terminating with uncaught exception of type std::length_error: vector
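A note on the -1: `tellg()` returns -1 when the stream is in a failed state, which is what happens if the file could not be opened and the code carries on after the `is_open()` check. Converting -1 to the vector's size type then asks for an enormous allocation, hence `std::length_error`. The code above also never copies the file contents into `buffer`. Below is a minimal sketch, using only the standard library, of reading a whole file into a byte vector with those cases handled. `read_blob` is a hypothetical helper name, not part of ICU4X.

```cpp
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper (not an ICU4X API): reads the entire file at `path`
// into `out`. Returns false if the file cannot be opened or read.
bool read_blob(const std::string& path, std::vector<uint8_t>& out) {
    // Open in binary mode so no newline translation happens, and start
    // at the end of the file so tellg() reports the file size.
    std::ifstream file(path, std::ios::binary | std::ios::ate);
    if (!file.is_open()) {
        return false;  // stop here instead of continuing with a bad stream
    }

    const std::streamoff size = file.tellg();
    if (size < 0) {
        return false;  // tellg() returns -1 when the stream has failed
    }

    out.resize(static_cast<std::size_t>(size));
    file.seekg(0, std::ios::beg);

    // Copy the bytes into the buffer; sizing the vector alone does not
    // load any data.
    if (size > 0 && !file.read(reinterpret_cast<char*>(out.data()), size)) {
        return false;
    }
    return true;
}
```

If `read_blob` returns true, the filled vector can be wrapped in a `diplomat::span<const uint8_t>` and passed to `ICU4XDataProvider::create_from_byte_slice`, as the posted code already attempts. The vector needs to outlive the span.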
I am using the ICU4X library in C++. I initially tried the segmenter code with the test data provider, but now I want to use a data blob containing all locales and the segmenter data. How do I load this data blob in C++?