Introduce `read_utf8_lines` as a wrapper around `File.readlines` that ensures that it:
1. Takes into account the encoding the file was saved in (by looking at its BOM, if present)
2. Yields the lines as UTF-8 strings (after converting them to that encoding), in particular so that any subsequent matching using `Regexp#match?` doesn't raise an `Encoding::CompatibilityError` (as our `Regexp`s are UTF-8)
This commit addresses the suggestion from #418 (comment)
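The diff below shows only the tail of the new helper, so here is a minimal sketch of what a wrapper with those two properties could look like. The `'rb:BOM|UTF-8'` mode string and the method signature are assumptions made for illustration, not necessarily what the commit uses; only the `line.encode(Encoding::UTF_8)` call is visible in the diff.

```ruby
# Hypothetical sketch of the wrapper described in the commit message.
# Assumption: the file is opened with the 'BOM|UTF-8' external encoding, which
# tells Ruby to pick the encoding from the Byte-Order-Mark if there is one
# (and to strip it), falling back to UTF-8 otherwise.
def read_utf8_lines(file)
  File.readlines(file, mode: 'rb:BOM|UTF-8').map do |line|
    # Re-encode each line to UTF-8, whatever encoding the input file used
    line.encode(Encoding::UTF_8)
  end
end

# Usage (the path is illustrative):
#   read_utf8_lines('Localizable.strings').each do |line|
#     puts line if line.match?(/^\s*"/u)
#   end
```

Because every line comes back as UTF-8, callers can match it against `/u` regular expressions without transcoding it themselves.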
+     # Ensure the line is re-encoded to UTF-8 regardless of the encoding that was used in the input file
+     line.encode(Encoding::UTF_8)
+   end
+ end
+
  # Merge the content of multiple `.strings` files into a new `.strings` text file.
  #
  # @param [Hash<String, String>] paths The paths of the `.strings` files to merge together, associated with the prefix to prepend to each of their respective keys
  # Read line-by-line to reduce memory footprint during content copy
+ read_utf8_lines(input_file).each do |line|
    unless prefix.nil? || prefix.empty?
-     # We need to ensure the line and RegExp are using the same encoding, so we transcode everything to UTF-8.
-     line.encode!(Encoding::UTF_8)
      # The `/u` modifier on the RegExps is to make them UTF-8
      line.gsub!(/^(\s*")/u, "\\1#{prefix}") # Lines starting with a quote are considered to be start of a key; add prefix right after the quote
      line.gsub!(/^(\s*)([A-Z0-9_]+)(\s*=\s*")/ui, "\\1\"#{prefix}\\2\"\\3") # Lines starting with an identifier followed by a '=' are considered to be an unquoted key (typical in InfoPlist.strings files for example)
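To make the effect of the two substitutions concrete, the following sketch applies them to one quoted and one unquoted key. The `add_prefix` helper name and the sample lines are hypothetical and exist only for illustration; the regular expressions and replacement strings are copied from the diff.

```ruby
# Hypothetical helper wrapping the two substitutions from the diff.
# Uses the non-mutating `gsub` so the input string is left untouched.
def add_prefix(line, prefix)
  return line if prefix.nil? || prefix.empty?

  # Quoted key: insert the prefix right after the opening quote
  prefixed = line.gsub(/^(\s*")/u, "\\1#{prefix}")
  # Unquoted key (e.g. in InfoPlist.strings): quote it and add the prefix
  prefixed.gsub(/^(\s*)([A-Z0-9_]+)(\s*=\s*")/ui, "\\1\"#{prefix}\\2\"\\3")
end

puts add_prefix(%("hello" = "Hello";), 'pfx.')      # "pfx.hello" = "Hello";
puts add_prefix(%(CFBundleName = "App";), 'pfx.')   # "pfx.CFBundleName" = "App";
```

Only one of the two patterns can match a given line: the second requires the line to start with an identifier character, and a line that had a leading quote still starts with that quote after the first substitution.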