Commit d8bdf0f

fix: Support reading from files that have a UTF-8 Byte Order Mark (#670)
Adds support for reading files that begin with a UTF-8 BOM. Windows text editors commonly write this marker, and it should be skipped because serde deserialization will not handle those bytes.

We have encountered this issue with our Windows customers, who may create UTF-8 BOM files without knowing it. Although we fixed it with a custom FileSource implementation, it would be nice to have this upstream to help others who may run into the same issue.

This PR came from the discussion in #565. Unlike that PR, this one handles only UTF-8 BOMs, not other encodings, and does not pull in any new dependencies.

- Adds a test with a UTF-8 BOM text file.
- Updates FileSourceFile to skip the 3 BOM bytes if they are detected.
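
For readers who want the BOM check in isolation, here is a minimal standalone sketch using only the standard library. The helper name `strip_utf8_bom` and the constant `UTF8_BOM` are hypothetical and are not part of this crate's API; the change itself inlines an equivalent length-and-prefix check.

```rust
/// The three bytes that encode U+FEFF (the byte order mark) in UTF-8.
const UTF8_BOM: &[u8] = b"\xef\xbb\xbf";

/// Returns `bytes` without a leading UTF-8 BOM, or `bytes` unchanged if
/// there is none. Hypothetical helper, for illustration only.
fn strip_utf8_bom(bytes: &[u8]) -> &[u8] {
    bytes.strip_prefix(UTF8_BOM).unwrap_or(bytes)
}

fn main() {
    let with_bom: &[u8] = b"\xef\xbb\xbf{\"debug\": true}";
    let without_bom: &[u8] = b"{\"debug\": true}";

    // A leading BOM is removed; input without a BOM passes through untouched.
    assert_eq!(strip_utf8_bom(with_bom), without_bom);
    assert_eq!(strip_utf8_bom(without_bom), without_bom);

    // Inputs shorter than the BOM are returned as-is.
    assert_eq!(strip_utf8_bom(b"\xef\xbb"), b"\xef\xbb");
}
```

Only a BOM at the very start of the input is affected; the same byte sequence later in the data is left alone.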
2 parents 4b58d4b + 588bd66 commit d8bdf0f

3 files changed, +27 -1 lines changed

src/file/source/file.rs

Lines changed: 11 additions & 1 deletion
@@ -115,7 +115,17 @@ where
             .unwrap_or_else(|| filename.clone());
 
         // Read contents from file
-        let text = fs::read_to_string(filename)?;
+        let buf = fs::read(filename)?;
+
+        // If it exists, skip the UTF-8 BOM byte sequence: EF BB BF
+        let buf = if buf.len() >= 3 && &buf[0..3] == b"\xef\xbb\xbf" {
+            &buf[3..]
+        } else {
+            &buf
+        };
+
+        let c = String::from_utf8_lossy(buf);
+        let text = c.into_owned();
 
         Ok(FileSourceResult {
             uri: Some(uri.to_string_lossy().into_owned()),

Lines changed: 4 additions & 0 deletions
@@ -0,0 +1,4 @@
+{
+  "debug": true,
+  "production": false
+}
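
A BOM is invisible in most renderings, so this fixture looks like ordinary JSON. As a rough sketch of how a file with these contents plus a leading BOM could be produced, the snippet below prepends the three BOM bytes to the JSON body. The names `with_utf8_bom`, `UTF8_BOM`, `JSON_BODY`, and the temporary file name are assumptions made for illustration; the commit does not say how the fixture was actually created.

```rust
use std::fs;
use std::io;

/// UTF-8 encoding of U+FEFF.
const UTF8_BOM: &[u8] = b"\xef\xbb\xbf";

/// JSON body mirroring the fixture's visible contents (whitespace is assumed).
const JSON_BODY: &str = "{\n  \"debug\": true,\n  \"production\": false\n}\n";

/// Returns `body` as bytes with a UTF-8 BOM prepended. Hypothetical helper.
fn with_utf8_bom(body: &str) -> Vec<u8> {
    let mut bytes = UTF8_BOM.to_vec();
    bytes.extend_from_slice(body.as_bytes());
    bytes
}

fn main() -> io::Result<()> {
    // Write to a scratch path rather than into the repository's test directory.
    let path = std::env::temp_dir().join("file-ext-with-bom-sketch.json");
    fs::write(&path, with_utf8_bom(JSON_BODY))?;

    // Reading the raw bytes back shows the BOM followed by the JSON text.
    let bytes = fs::read(&path)?;
    assert!(bytes.starts_with(UTF8_BOM));
    assert_eq!(&bytes[UTF8_BOM.len()..], JSON_BODY.as_bytes());

    fs::remove_file(&path)?;
    Ok(())
}
```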

tests/testsuite/file.rs

Lines changed: 12 additions & 0 deletions
@@ -67,6 +67,18 @@ fn test_file_ext() {
     assert_eq!(c.get("production").ok(), Some(false));
 }
 
+#[test]
+#[cfg(feature = "json")]
+fn test_file_ext_with_utf8_bom() {
+    let c = Config::builder()
+        .add_source(File::with_name("tests/testsuite/file-ext-with-bom.json"))
+        .build()
+        .unwrap();
+
+    assert_eq!(c.get("debug").ok(), Some(true));
+    assert_eq!(c.get("production").ok(), Some(false));
+}
+
 #[test]
 #[cfg(feature = "json")]
 fn test_file_second_ext() {
