Commit 24f1496

Enforce maximum string length
BIP-173 states that a bech32 string must not exceed 90 characters, however BOLT-11 states that the string limit may be exceeded. This puts us in a conundrum: we want to support lightning, but this crate documents itself pretty heavily as an implementation of BIP-173 and BIP-350.

The solution we choose is to enforce the 90 character limit in the segwit modules and types (`SegwitHrpstring`), and to enforce a limit of 1023 in `lib.rs` and the non-segwit types (e.g. `UncheckedHrpstring`).

Enforce string length limits by doing:

- Enforce and document a 1023 character string limit when encoding and decoding non-segwit strings.
- Enforce and document a 90 character string limit when encoding and decoding segwit strings (addresses).
- Document and make explicit that the 1023 limit is a rust-bech32 thing, based on the BCH code design in BIP-173, but is not part of any explicit spec.

FTR, in `bech32 v0.9.0` no lengths were enforced.
1 parent f4a3616 commit 24f1496
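
The length arithmetic behind the 1023 character check can be sketched as follows. This is a minimal illustration, not the crate's implementation: `encoded_length` and `check_length` here are hypothetical free functions taking plain lengths, and the 6 character checksum length is assumed for Bech32 and Bech32m. The numbers mirror the tests added in this commit (632 data bytes with a 4 or 5 character HRP).

```rust
/// Limit named in the commit message (non-segwit strings).
const MAX_STRING_LENGTH: usize = 1023;

/// Assumption: Bech32 and Bech32m both append a 6 character checksum.
const CHECKSUM_LENGTH: usize = 6;

/// Hypothetical helper: length in characters of `hrp + '1' + data + checksum`.
fn encoded_length(hrp_len: usize, data_len: usize) -> usize {
    // Each character carries 5 bits, so round the bit count up to whole characters.
    let data_chars = (data_len * 8 + 4) / 5;
    hrp_len + 1 + data_chars + CHECKSUM_LENGTH
}

/// Hypothetical helper: `Ok(len)` if the string fits, `Err(len)` if it is too long.
fn check_length(hrp_len: usize, data_len: usize) -> Result<usize, usize> {
    let len = encoded_length(hrp_len, data_len);
    if len > MAX_STRING_LENGTH {
        Err(len)
    } else {
        Ok(len)
    }
}
```

With 632 data bytes the data part is 1012 characters, so a 4 character HRP lands exactly on the limit and a 5 character HRP exceeds it by one.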

5 files changed: +333 -38 lines changed

src/lib.rs

Lines changed: 172 additions & 10 deletions
@@ -9,6 +9,12 @@
 //! a data part. A checksum at the end of the string provides error detection to prevent mistakes
 //! when the string is written off or read out loud.
 //!
+//! Please note, in order to support lightning ([BOLT-11]) we do not enforce the 90 character limit
+//! specified by [BIP-173], instead we use 1023 because that is a property of the `Bech32` and
+//! `Bech32m` checksum algorithms (specifically error detection, see the [`checksum`] module
+//! documentation for more information). We do however enforce the 90 character limit within the
+//! `segwit` modules.
+//!
 //! # Usage
 //!
 //! - If you are doing segwit stuff you likely want to use the [`segwit`] API.
@@ -89,6 +95,10 @@
 //!
 //! ## Custom Checksum
 //!
+//! Please note, if your checksum algorithm can detect errors in data greater than 1023 characters,
+//! and you intend on leveraging this fact, then this crate will not currently serve your needs.
+//! Patches welcome.
+//!
 //! ```
 //! # #[cfg(feature = "alloc")] {
 //! use bech32::Checksum;
@@ -113,6 +123,9 @@
 //!
 //! # }
 //! ```
+//!
+//! [BOLT-11]: <https://github.com/lightning/bolts/blob/master/11-payment-encoding.md>
+//! [`checksum`]: crate::primitives::checksum

 #![cfg_attr(all(not(feature = "std"), not(test)), no_std)]
 // Experimental features we need.
@@ -142,8 +155,8 @@ pub mod segwit;
 use alloc::{string::String, vec::Vec};
 use core::fmt;

-#[cfg(feature = "alloc")]
 use crate::error::write_err;
+use crate::primitives::checksum::MAX_STRING_LENGTH;
 #[cfg(doc)]
 use crate::primitives::decode::CheckedHrpstring;
 #[cfg(feature = "alloc")]
@@ -214,19 +227,32 @@ pub fn decode(s: &str) -> Result<(Hrp, Vec<u8>), DecodeError> {
 ///
 /// Encoded string will be prefixed with the `hrp` and have a checksum appended as specified by the
 /// `Ck` algorithm (`NoChecksum` to exclude checksum all together).
+///
+/// ## Deviation from spec (BIP-173)
+///
+/// We only restrict the total length of the encoded string to 1023 characters (not 90).
 #[cfg(feature = "alloc")]
 #[inline]
-pub fn encode<Ck: Checksum>(hrp: Hrp, data: &[u8]) -> Result<String, fmt::Error> {
+pub fn encode<Ck: Checksum>(hrp: Hrp, data: &[u8]) -> Result<String, EncodeError> {
+    let encoded_length = encoded_length::<Ck>(hrp, data);
+    if encoded_length > MAX_STRING_LENGTH {
+        return Err(EncodeError::TooLong(encoded_length));
+    }
+
     encode_lower::<Ck>(hrp, data)
 }

 /// Encodes `data` as a lowercase bech32 encoded string.
 ///
 /// Encoded string will be prefixed with the `hrp` and have a checksum appended as specified by the
 /// `Ck` algorithm (`NoChecksum` to exclude checksum all together).
+///
+/// ## Deviation from spec (BIP-173)
+///
+/// We only restrict the total length of the encoded string to 1023 characters (not 90).
 #[cfg(feature = "alloc")]
 #[inline]
-pub fn encode_lower<Ck: Checksum>(hrp: Hrp, data: &[u8]) -> Result<String, fmt::Error> {
+pub fn encode_lower<Ck: Checksum>(hrp: Hrp, data: &[u8]) -> Result<String, EncodeError> {
     let mut buf = String::new();
     encode_lower_to_fmt::<Ck, String>(&mut buf, hrp, data)?;
     Ok(buf)
@@ -236,9 +262,13 @@ pub fn encode_lower<Ck: Checksum>(hrp: Hrp, data: &[u8]) -> Result<String, fmt::
 ///
 /// Encoded string will be prefixed with the `hrp` and have a checksum appended as specified by the
 /// `Ck` algorithm (`NoChecksum` to exclude checksum all together).
+///
+/// ## Deviation from spec (BIP-173)
+///
+/// We only restrict the total length of the encoded string to 1023 characters (not 90).
 #[cfg(feature = "alloc")]
 #[inline]
-pub fn encode_upper<Ck: Checksum>(hrp: Hrp, data: &[u8]) -> Result<String, fmt::Error> {
+pub fn encode_upper<Ck: Checksum>(hrp: Hrp, data: &[u8]) -> Result<String, EncodeError> {
     let mut buf = String::new();
     encode_upper_to_fmt::<Ck, String>(&mut buf, hrp, data)?;
     Ok(buf)
@@ -248,25 +278,33 @@ pub fn encode_upper<Ck: Checksum>(hrp: Hrp, data: &[u8]) -> Result<String, fmt::
 ///
 /// Encoded string will be prefixed with the `hrp` and have a checksum appended as specified by the
 /// `Ck` algorithm (`NoChecksum` to exclude checksum all together).
+///
+/// ## Deviation from spec (BIP-173)
+///
+/// We only restrict the total length of the encoded string to 1023 characters (not 90).
 #[inline]
 pub fn encode_to_fmt<Ck: Checksum, W: fmt::Write>(
     fmt: &mut W,
     hrp: Hrp,
     data: &[u8],
-) -> Result<(), fmt::Error> {
+) -> Result<(), EncodeError> {
     encode_lower_to_fmt::<Ck, W>(fmt, hrp, data)
 }

 /// Encodes `data` to a writer ([`fmt::Write`]) as a lowercase bech32 encoded string.
 ///
 /// Encoded string will be prefixed with the `hrp` and have a checksum appended as specified by the
 /// `Ck` algorithm (`NoChecksum` to exclude checksum all together).
+///
+/// ## Deviation from spec (BIP-173)
+///
+/// We only restrict the total length of the encoded string to 1023 characters (not 90).
 #[inline]
 pub fn encode_lower_to_fmt<Ck: Checksum, W: fmt::Write>(
     fmt: &mut W,
     hrp: Hrp,
     data: &[u8],
-) -> Result<(), fmt::Error> {
+) -> Result<(), EncodeError> {
     let iter = data.iter().copied().bytes_to_fes();
     let chars = iter.with_checksum::<Ck>(&hrp).chars();
     for c in chars {
@@ -279,12 +317,16 @@ pub fn encode_lower_to_fmt<Ck: Checksum, W: fmt::Write>(
 ///
 /// Encoded string will be prefixed with the `hrp` and have a checksum appended as specified by the
 /// `Ck` algorithm (`NoChecksum` to exclude checksum all together).
+///
+/// ## Deviation from spec (BIP-173)
+///
+/// We only restrict the total length of the encoded string to 1023 characters (not 90).
 #[inline]
 pub fn encode_upper_to_fmt<Ck: Checksum, W: fmt::Write>(
     fmt: &mut W,
     hrp: Hrp,
     data: &[u8],
-) -> Result<(), fmt::Error> {
+) -> Result<(), EncodeError> {
     let iter = data.iter().copied().bytes_to_fes();
     let chars = iter.with_checksum::<Ck>(&hrp).chars();
     for c in chars {
@@ -297,27 +339,35 @@ pub fn encode_upper_to_fmt<Ck: Checksum, W: fmt::Write>(
 ///
 /// Encoded string will be prefixed with the `hrp` and have a checksum appended as specified by the
 /// `Ck` algorithm (`NoChecksum` to exclude checksum all together).
+///
+/// ## Deviation from spec (BIP-173)
+///
+/// We only restrict the total length of the encoded string to 1023 characters (not 90).
 #[cfg(feature = "std")]
 #[inline]
 pub fn encode_to_writer<Ck: Checksum, W: std::io::Write>(
     w: &mut W,
     hrp: Hrp,
     data: &[u8],
-) -> Result<(), std::io::Error> {
+) -> Result<(), EncodeIoError> {
     encode_lower_to_writer::<Ck, W>(w, hrp, data)
 }

 /// Encodes `data` to a writer ([`std::io::Write`]) as a lowercase bech32 encoded string.
 ///
 /// Encoded string will be prefixed with the `hrp` and have a checksum appended as specified by the
 /// `Ck` algorithm (`NoChecksum` to exclude checksum all together).
+///
+/// ## Deviation from spec (BIP-173)
+///
+/// We only restrict the total length of the encoded string to 1023 characters (not 90).
 #[cfg(feature = "std")]
 #[inline]
 pub fn encode_lower_to_writer<Ck: Checksum, W: std::io::Write>(
     w: &mut W,
     hrp: Hrp,
     data: &[u8],
-) -> Result<(), std::io::Error> {
+) -> Result<(), EncodeIoError> {
     let iter = data.iter().copied().bytes_to_fes();
     let chars = iter.with_checksum::<Ck>(&hrp).chars();
     for c in chars {
@@ -330,13 +380,17 @@ pub fn encode_lower_to_writer<Ck: Checksum, W: std::io::Write>(
 ///
 /// Encoded string will be prefixed with the `hrp` and have a checksum appended as specified by the
 /// `Ck` algorithm (`NoChecksum` to exclude checksum all together).
+///
+/// ## Deviation from spec (BIP-173)
+///
+/// We only restrict the total length of the encoded string to 1023 characters (not 90).
 #[cfg(feature = "std")]
 #[inline]
 pub fn encode_upper_to_writer<Ck: Checksum, W: std::io::Write>(
     w: &mut W,
     hrp: Hrp,
     data: &[u8],
-) -> Result<(), std::io::Error> {
+) -> Result<(), EncodeIoError> {
     let iter = data.iter().copied().bytes_to_fes();
     let chars = iter.with_checksum::<Ck>(&hrp).chars();
     for c in chars {
@@ -392,6 +446,87 @@ impl From<UncheckedHrpstringError> for DecodeError {
     fn from(e: UncheckedHrpstringError) -> Self { Self::Parse(e) }
 }

+/// An error while encoding a bech32 string.
+#[derive(Debug, Clone, PartialEq, Eq)]
+#[non_exhaustive]
+pub enum EncodeError {
+    /// Encoding HRP and data into a bech32 string exceeds maximum allowed.
+    TooLong(usize),
+    /// Encode to formatter failed.
+    Fmt(fmt::Error),
+}
+
+impl fmt::Display for EncodeError {
+    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
+        use EncodeError::*;
+
+        match *self {
+            TooLong(len) =>
+                write!(f, "encoded length {} exceeds spec limit {} chars", len, MAX_STRING_LENGTH),
+            Fmt(ref e) => write_err!(f, "encode to formatter failed"; e),
+        }
+    }
+}
+
+#[cfg(feature = "std")]
+impl std::error::Error for EncodeError {
+    fn source(&self) -> Option<&(dyn std::error::Error + 'static)> {
+        use EncodeError::*;
+
+        match *self {
+            TooLong(_) => None,
+            Fmt(ref e) => Some(e),
+        }
+    }
+}
+
+impl From<fmt::Error> for EncodeError {
+    #[inline]
+    fn from(e: fmt::Error) -> Self { Self::Fmt(e) }
+}
+
+/// An error while encoding a bech32 string.
+#[cfg(feature = "std")]
+#[derive(Debug)]
+#[non_exhaustive]
+pub enum EncodeIoError {
+    /// Encoding HRP and data into a bech32 string exceeds maximum allowed.
+    TooLong(usize),
+    /// Encode to writer failed.
+    Write(std::io::Error),
+}
+
+#[cfg(feature = "std")]
+impl fmt::Display for EncodeIoError {
+    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
+        use EncodeIoError::*;
+
+        match *self {
+            TooLong(len) =>
+                write!(f, "encoded length {} exceeds spec limit {} chars", len, MAX_STRING_LENGTH),
+            Write(ref e) => write_err!(f, "encode to writer failed"; e),
+        }
+    }
+}
+
+#[cfg(feature = "std")]
+impl std::error::Error for EncodeIoError {
+    fn source(&self) -> Option<&(dyn std::error::Error + 'static)> {
+        use EncodeIoError::*;
+
+        match *self {
+            TooLong(_) => None,
+            Write(ref e) => Some(e),
+        }
+    }
+}
+
+#[cfg(feature = "std")]
+impl From<std::io::Error> for EncodeIoError {
+    #[inline]
+    fn from(e: std::io::Error) -> Self { Self::Write(e) }
+}
+
 #[cfg(test)]
 #[cfg(feature = "alloc")]
 mod tests {
@@ -493,4 +628,31 @@ mod tests {

         assert_eq!(got, want);
     }
+
+    #[test]
+    fn can_encode_maximum_length_string() {
+        let data = [0_u8; 632];
+        let hrp = Hrp::parse_unchecked("abcd");
+        let s = encode::<Bech32m>(hrp, &data).expect("failed to encode string");
+        assert_eq!(s.len(), 1023);
+    }
+
+    #[test]
+    fn can_not_encode_string_too_long() {
+        let data = [0_u8; 632];
+        let hrp = Hrp::parse_unchecked("abcde");
+
+        match encode::<Bech32m>(hrp, &data) {
+            Ok(_) => panic!("false positive"),
+            Err(EncodeError::TooLong(len)) => assert_eq!(len, 1024),
+            _ => panic!("false negative"),
+        }
+    }
+
+    #[test]
+    fn can_decode_segwit_too_long_string() {
+        // A 91 character long string, greater than the segwit enforced maximum of 90.
+        let s = "abcd1qqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqrw9z3s";
+        assert!(decode(s).is_ok());
+    }
 }
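
The way the new `EncodeError` lets `?` convert a formatter failure can be sketched in isolation. This is a stand-alone illustration under stated assumptions: the enum below is a simplified stand-in for the one in the diff (no `Display` or `std::error::Error` impls), and `write_checked` and `FailingWriter` are hypothetical names that do not exist in the crate.

```rust
use std::fmt::{self, Write};

const MAX_STRING_LENGTH: usize = 1023;

/// Simplified stand-in for the `EncodeError` added in this commit.
#[derive(Debug, Clone, PartialEq, Eq)]
enum EncodeError {
    TooLong(usize),
    Fmt(fmt::Error),
}

impl From<fmt::Error> for EncodeError {
    fn from(e: fmt::Error) -> Self { Self::Fmt(e) }
}

/// Hypothetical helper: checks the length first, then writes `s` to `w` one char at a time.
fn write_checked<W: Write>(w: &mut W, s: &str) -> Result<(), EncodeError> {
    let len = s.chars().count();
    if len > MAX_STRING_LENGTH {
        return Err(EncodeError::TooLong(len));
    }
    for c in s.chars() {
        // `?` turns a `fmt::Error` into `EncodeError::Fmt` through the `From` impl.
        w.write_char(c)?;
    }
    Ok(())
}

/// Hypothetical writer that always fails, used to exercise the `Fmt` variant.
struct FailingWriter;

impl Write for FailingWriter {
    fn write_str(&mut self, _s: &str) -> fmt::Result { Err(fmt::Error) }
}
```

Callers that previously matched on a bare `fmt::Error` would now distinguish the two variants, for example treating `TooLong` as invalid input and `Fmt` as a sink failure.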

src/primitives/checksum.rs

Lines changed: 17 additions & 0 deletions
@@ -2,7 +2,24 @@

 //! Degree-2 [BCH] code checksum.
 //!
+//! How this BCH code was chosen to be used by the bech32 address format is outlined in BIP-173 in
+//! the ["Checksum design"] section, of particular importance is:
+//!
+//! > Even though the chosen code performs reasonably well up to 1023 characters, other designs are
+//! > preferable for lengths above 89 characters (excluding the separator).
+//!
+//! The segwit address format uses, for this reason, a 90 character limit. Lightning's [BOLT-11]
+//! does not use such a limit, we would like to support lightning addresses but we choose to enforce
+//! a hard limit of 1023 characters, this is purely a `rust-bech32` decision.
+//!
 //! [BCH]: <https://en.wikipedia.org/wiki/BCH_code>
+//! ["Checksum design"]: <https://github.com/bitcoin/bips/blob/master/bip-0173.mediawiki#user-content-Checksum_design>
+
+/// The maximum enforced string length of a bech32 string.
+pub const MAX_STRING_LENGTH: usize = 1023;
+
+/// The maximum enforced string length of a segwit address.
+pub const MAX_SEGWIT_STRING_LENGTH: usize = 90;

 use core::{mem, ops};

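
How the two constants split the decode-side policy can be sketched as below. This is a simplified illustration, not the crate's decoding code: `Kind`, `max_len`, and `length_ok` are hypothetical names, and only the length is checked here (no character set, HRP, or checksum validation).

```rust
const MAX_STRING_LENGTH: usize = 1023;
const MAX_SEGWIT_STRING_LENGTH: usize = 90;

/// Hypothetical tag for which API a string is being parsed through.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Kind {
    /// Non-segwit strings, e.g. lightning invoices.
    Generic,
    /// Segwit addresses, limited as specified by BIP-173.
    Segwit,
}

/// Hypothetical helper: the limit that applies to each kind of string.
fn max_len(kind: Kind) -> usize {
    match kind {
        Kind::Generic => MAX_STRING_LENGTH,
        Kind::Segwit => MAX_SEGWIT_STRING_LENGTH,
    }
}

/// Hypothetical helper: true if `s` is within the limit for `kind`.
/// Bech32 strings are ASCII, so the byte length equals the character count.
fn length_ok(s: &str, kind: Kind) -> bool {
    s.len() <= max_len(kind)
}
```

A 91 character string therefore passes the generic check and fails the segwit one, which is the situation the `can_decode_segwit_too_long_string` test exercises through the generic `decode` function.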
