Commit b620b30

blog post for 0.18 release
1 parent f629161 commit b620b30

File tree

2 files changed

+148
-1
lines changed

blog/_posts/2022-11-03-0.15-release.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1,6 +1,6 @@
11
---
22
layout: post
3-
title: 0.15 -- GATs, CapabilityServerSet, and async packing
3+
title: 0.15 GATs, CapabilityServerSet, and async packing
44
author: dwrensha
55
---
66

Lines changed: 147 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,147 @@
1+
---
2+
layout: post
3+
title: 0.18 — lazy UTF-8 and no-alloc
4+
author: dwrensha
5+
---
6+
7+
New release alert!
8+
Version 0.18 of [capnproto-rust](https://github.com/capnproto/capnproto-rust)
9+
is now [available on crates.io](https://crates.io/crates/capnp).
10+
11+
If you use capnproto-rust on data with
12+
the [`Text` built-in type](https://capnproto.org/language.html#built-in-types),
13+
then it's likely that this release will require some
14+
updates to your code.
15+
But don't worry — the changes are straightforward and they bring some
16+
important benefits.
17+
18+
## lazy UTF-8 validation
19+
20+
Suppose we have the following struct defined in a Cap'n Proto schema:
21+
22+
```
23+
struct Foo {
24+
  oneText @0 :Text;
25+
  anotherText @1 :Text;
26+
}
27+
```
28+
29+
Then, in Rust, these `Text` fields can be accessed through the `text::Reader` type:
30+
31+
```rust
32+
let my_foo: foo::Reader = ...;
33+
let one_text: capnp::text::Reader<'_> = my_foo.get_one_text()?;
34+
let another_text: capnp::text::Reader<'_> = my_foo.get_another_text()?;
35+
```
36+
37+
But what exactly is a `text::Reader`?
38+
39+
40+
### the old definition
41+
42+
In previous versions of capnproto-rust, the `text::Reader` type
43+
was an alias to Rust's `&str` type:
44+
45+
46+
```rust
47+
pub mod text {
48+
    pub type Reader<'a> = &'a str;
49+
}
50+
```
51+
52+
At first glance, this seems like a perfect fit.
53+
A Cap'n Proto `Text` value is required to
54+
contain valid UTF-8 data, just like a Rust `&str`,
55+
and a `text::Reader` is meant to represent
56+
a reference to that data.
57+
58+
However, in practice, there are some ways in which this representation
59+
falls short.
60+
61+
* **performance**: Validating UTF-8 data has a cost,
62+
and ideally we would like to avoid paying it multiple
63+
times on the same data. If `text::Reader` is just
64+
`&str`, then we need to validate every time that we:
65+
    - copy a text field from one message to another,
66+
    - write a text field to a file,
67+
    - load a file's contents into a text field, or
68+
    - access some sub-range of a text field.
69+
70+
This goes against the general Cap'n Proto philosophy
71+
of doing validation as lazily as possible.
72+
73+
* **robustness**: If a text field holds corrupted data, then
74+
you still might want to be able to access that data, even
75+
if it is not valid UTF-8. For example, imagine that a text
76+
field holds log messages from a web server. We should
77+
still be able to read the messages, even if they are garbled.
78+
Indeed, garbled messages are probably the most interesting ones,
79+
as they indicate unexpected behavior.
80+
See [this issue](https://github.com/capnproto/capnproto-rust/issues/314)
81+
for more discussion.
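
The robustness point can be illustrated with the standard library alone. The following is a minimal sketch, not capnproto-rust code: `read_eagerly` and `read_lazily` are hypothetical helpers, and `garbled_log_line` is a made-up byte string standing in for the contents of a corrupted `Text` field.

```rust
/// Eager model: the field is only usable if it validates as UTF-8.
/// (Hypothetical helper, for illustration only.)
fn read_eagerly(raw: &[u8]) -> Result<&str, std::str::Utf8Error> {
    std::str::from_utf8(raw)
}

/// Lazy model: hand back the raw bytes and let the caller decide
/// whether, and how, to validate them.
/// (Hypothetical helper, for illustration only.)
fn read_lazily(raw: &[u8]) -> &[u8] {
    raw
}

fn main() {
    // Made-up log line; the byte 0xFF can never appear in valid UTF-8.
    let garbled_log_line: &[u8] = b"GET /index.html \xff 500";

    // Eager validation rejects the whole field.
    assert!(read_eagerly(garbled_log_line).is_err());

    // Lazy access still exposes the data, and the caller can pick a
    // lossy conversion for display.
    let bytes = read_lazily(garbled_log_line);
    let shown = String::from_utf8_lossy(bytes);
    assert!(shown.starts_with("GET /index.html"));
    println!("{shown}");
}
```

With the eager model the garbled line is simply unavailable; with the lazy model the caller can still inspect it and decide what to do.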
82+
83+
84+
### the new definition
85+
86+
To address the above-noted shortcomings,
87+
version 0.18 of capnproto-rust defines `text::Reader`
88+
like this:
89+
90+
```rust
91+
pub mod text {
92+
    /// Wrapper around utf-8 encoded text.
93+
    /// This is defined as a tuple struct to allow pattern matching
94+
    /// on it via byte literals (for example `text::Reader(b"hello")`).
95+
    #[derive(Copy, Clone, PartialEq)]
96+
    pub struct Reader<'a>(pub &'a [u8]);
97+
98+
    impl<'a> Reader<'a> {
99+
        pub fn as_bytes(self) -> &'a [u8] { ... }
100+
        pub fn to_str(self) -> Result<&'a str, Utf8Error> { ... }
101+
        pub fn to_string(self) -> Result<String, Utf8Error> { ... }
102+
    }
103+
104+
    impl<'a> From<&'a str> for Reader<'a> { ... }
105+
    impl<'a> From<&'a [u8]> for Reader<'a> { ... }
106+
}
107+
```
108+
Now consumers can easily access the underlying data, via `as_bytes()`,
109+
and getting it as a `&str` or `String` just requires an extra `to_str()`
110+
or `to_string()` call.
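
To make the reading side concrete, here is a self-contained sketch. It does not use the `capnp` crate: the `Reader` below is a local stand-in that mirrors the shape of the definition shown above, with method bodies filled in as an assumption about the obvious implementation. The `classify` helper is hypothetical and only demonstrates pattern matching on byte literals.

```rust
use std::str::Utf8Error;

/// Local stand-in mirroring the shape of `text::Reader`.
/// This is NOT the type from the `capnp` crate.
#[derive(Copy, Clone, PartialEq, Debug)]
pub struct Reader<'a>(pub &'a [u8]);

impl<'a> Reader<'a> {
    /// Raw access: never fails and never validates.
    pub fn as_bytes(self) -> &'a [u8] {
        self.0
    }

    /// Validation happens here, only when the caller asks for a `&str`.
    pub fn to_str(self) -> Result<&'a str, Utf8Error> {
        std::str::from_utf8(self.0)
    }
}

/// Hypothetical helper: classify a field by matching on byte literals.
fn classify(text: Reader<'_>) -> &'static str {
    match text {
        Reader(b"hello") => "greeting",
        Reader(b"") => "empty",
        _ => "other",
    }
}

fn main() {
    let ok = Reader(b"hello");
    assert_eq!(ok.to_str(), Ok("hello"));
    assert_eq!(classify(ok), "greeting");

    // Invalid UTF-8 is still readable as bytes; only `to_str` reports it.
    let garbled = Reader(b"caf\xff");
    assert_eq!(garbled.as_bytes().len(), 4);
    assert!(garbled.to_str().is_err());
}
```

Because the stand-in is `Copy`, the same value can be passed to `to_str` and `classify` without cloning.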
111+
112+
When setting text fields in a message, you will now need to
113+
insert some `.into()` calls to convert from a `str` or `String`
114+
into a `text::Reader`, like this:
115+
116+
```rust
117+
let name: &str = "alice";
118+
let mut my_foo: foo::Builder = ...;
119+
my_foo.set_one_text("hello world".into())?;
120+
my_foo.set_another_text(format!("hello {name}")[..].into())?;
121+
```
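
The conversions involved can be tried without generated code. The sketch below again uses a local stand-in for `text::Reader` rather than the `capnp` type, and `set_text` is a hypothetical function that only imitates the argument type of a generated setter. It shows that a string literal converts directly, while an owned `String` has to be borrowed as a `&str` first.

```rust
/// Local stand-in for `text::Reader`; not the `capnp` type.
#[derive(Copy, Clone, PartialEq, Debug)]
pub struct Reader<'a>(pub &'a [u8]);

impl<'a> From<&'a str> for Reader<'a> {
    fn from(value: &'a str) -> Self {
        Reader(value.as_bytes())
    }
}

/// Hypothetical setter that takes the same argument type as a
/// generated `set_*` method and copies the bytes into `slot`.
fn set_text(slot: &mut Vec<u8>, value: Reader<'_>) {
    slot.clear();
    slot.extend_from_slice(value.0);
}

fn main() {
    let name = "alice";
    let mut slot = Vec::new();

    // A string literal is already a `&str`, so `.into()` is enough.
    set_text(&mut slot, "hello world".into());
    assert_eq!(slot, b"hello world");

    // An owned `String` must be borrowed first; `[..]` and `.as_str()`
    // are two equivalent ways to get the `&str`.
    let greeting = format!("hello {name}");
    set_text(&mut slot, greeting[..].into());
    set_text(&mut slot, greeting.as_str().into());
    assert_eq!(slot, b"hello alice");
}
```

Whether to write `[..]` or `.as_str()` is a matter of taste; both produce the same `&str`.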
122+
123+
All this is admittedly more verbose than it was before,
124+
but it's in keeping with the general spirit of capnproto-rust:
125+
we are willing to introduce some verbosity
126+
if that's what it takes to model Cap'n Proto data
127+
in a satisfactory way.
128+
129+
130+
## no-alloc mode
131+
132+
Another new feature is no-alloc mode.
133+
134+
In version 0.13, capnproto-rust
135+
[gained support for no_std environments]({{site.baseurl}}/2020/06/06/no-std-support.html).
136+
However, it still depended on the [`alloc`](https://doc.rust-lang.org/alloc/) crate,
137+
which can sometimes be a problem for microcontroller targets and kernel programming.
138+
(See [this issue](https://github.com/capnproto/capnproto-rust/issues/221)
139+
for some discussion.)
140+
141+
Starting with version 0.18, the `capnp` crate has an `alloc` Cargo feature,
142+
which can be disabled to remove the `alloc` dependency.
143+
144+
A side benefit of this change is that now error handling in capnproto-rust
145+
is much less dependent on heap allocation, and so should have better
146+
performance and be more reliable.
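
As a rough illustration of why allocation-free errors help, consider the sketch below. It is not the error type from the `capnp` crate: `ErrorKind`, `Error`, and `check_text` are hypothetical names chosen for this example. The point is only that an error carrying an enum instead of a heap-allocated message can be created and returned without touching the allocator.

```rust
use core::fmt;

/// Hypothetical error kinds; the names are illustrative only.
#[derive(Copy, Clone, Debug, PartialEq, Eq)]
pub enum ErrorKind {
    TextContainsInvalidUtf8,
    MessageTooLarge,
}

/// A plain `Copy` value: constructing or returning it never allocates.
#[derive(Copy, Clone, Debug, PartialEq, Eq)]
pub struct Error {
    pub kind: ErrorKind,
}

impl fmt::Display for Error {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Messages are static strings, so formatting needs no heap either.
        let msg = match self.kind {
            ErrorKind::TextContainsInvalidUtf8 => "text contains invalid UTF-8",
            ErrorKind::MessageTooLarge => "message too large",
        };
        f.write_str(msg)
    }
}

/// Hypothetical validation helper that reports failure via `Error`.
fn check_text(raw: &[u8]) -> Result<&str, Error> {
    core::str::from_utf8(raw).map_err(|_| Error {
        kind: ErrorKind::TextContainsInvalidUtf8,
    })
}

fn main() {
    assert_eq!(check_text(b"ok"), Ok("ok"));
    let err = check_text(b"\xff").unwrap_err();
    assert_eq!(err.kind, ErrorKind::TextContainsInvalidUtf8);
    println!("{err}");
}
```

Everything used by the error path here comes from `core`, which is what makes this style workable when the `alloc` crate is unavailable.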
147+
