Unicode support for word boundary `\b` #228

Is it possible to extend Unicode support to the word boundary anchor `\b`?

For example, a Russian sentence cannot be split on `\b`, while the English one can:
```
"hello there this is a test".split(XRegExp('\\b', 'A'))
(11) ["hello", " ", "there", " ", "this", " ", "is", " ", "a", " ", "test"]

"Сняли не первый раз изначальную и конечную сумму и начальную не вернули !!!".split(XRegExp('\\b', 'A'))
["Сняли не первый раз изначальную и конечную сумму и начальную не вернули !!!"]
```
_^ Note that the split has no effect on the Russian sentence._
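
A likely explanation, assuming XRegExp hands `\b` to the native engine unchanged: JavaScript defines `\b` in terms of `\w`, which only covers `[A-Za-z0-9_]`, so Cyrillic letters never count as word characters. A quick check in plain JavaScript (no XRegExp involved) seems to confirm this:

```javascript
// Native JavaScript: \w is ASCII-only, so \b never matches next to Cyrillic letters.
const asciiWordChar = /\w/.test("h");          // Latin letter is a word character
const cyrillicWordChar = /\w/.test("С");       // Cyrillic letter is not
const cyrillicLetter = /\p{L}/u.test("С");     // but \p{L} does recognise it as a letter

// With no \b match anywhere in the string, split returns it as a single element.
const parts = "Сняли не первый".split(/\b/);

console.log(asciiWordChar, cyrillicWordChar, cyrillicLetter, parts.length);
```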

The equivalent, and desired, behaviour in Ruby:

```
irb(main):001:0> "hello there this is a test".split(/\b/)
[
  "hello",
  " ",
  "there",
  " ",
  "this",
  " ",
  "is",
  " ",
  "a",
  " ",
  "test"
]
irb(main):002:0> "Сняли не первый раз изначальную и конечную сумму и начальную не вернули !!!".split(/\b/)
[
  "Сняли",
  " ",
  "не",
  " ",
  "первый",
  " ",
  "раз",
  " ",
  "изначальную",
  " ",
  "и",
  " ",
  "конечную",
  " ",
  "сумму",
  " ",
  "и",
  " ",
  "начальную",
  " ",
  "не",
  " ",
  "вернули",
  " !!!"
]
```
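
Until `\b` itself is Unicode-aware, one possible workaround is to spell out the boundary with lookarounds over Unicode property escapes. The following is only a sketch in native JavaScript, assuming an engine with ES2018 lookbehind and `\p{…}` support. The names `WORD`, `UNICODE_BOUNDARY`, and `splitOnWordBoundaries` are made up for illustration and are not part of XRegExp. It treats letters, combining marks, digits, and underscore as word characters, which may not match Ruby's definition exactly.

```javascript
// Hypothetical helper, not an XRegExp API.
// Assumption: "word character" = Unicode letter, mark, number, or underscore.
const WORD = "[\\p{L}\\p{M}\\p{N}_]";

// A boundary is a position with a word character on exactly one side.
const UNICODE_BOUNDARY = new RegExp(
  `(?<=${WORD})(?!${WORD})|(?<!${WORD})(?=${WORD})`,
  "u"
);

function splitOnWordBoundaries(text) {
  return text.split(UNICODE_BOUNDARY);
}

console.log(splitOnWordBoundaries("hello there"));
console.log(splitOnWordBoundaries("Сняли не вернули !!!"));
```

The same pattern could probably be passed to `XRegExp` as a string, but that is untested here.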

