Commit 553e624

Reproduce unhashable type error in cyclonedx #3016
Signed-off-by: Ayan Sinha Mahapatra <ayansmahapatra@gmail.com>
1 parent 3021c74 commit 553e624

12 files changed: 13212 additions & 1 deletion

tests/formattedcode/data/cyclonedx/simple-icu-expected.json

Whitespace-only changes.

tests/formattedcode/data/cyclonedx/simple-icu/LICENSE

Lines changed: 519 additions & 0 deletions
Large diffs are not rendered by default.
Lines changed: 304 additions & 0 deletions
@@ -0,0 +1,304 @@
Name: icu
URL: https://github.com/unicode-org/icu
Version: 71-1
CPEPrefix: cpe:/a:icu-project:international_components_for_unicode:71.1
License: MIT
Security Critical: yes

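The header above is a run of "Key: value" lines that ends at the first blank line. As a rough illustration of how such a header could be read, here is a minimal sketch; parse_header is a hypothetical helper written for this note, not the parser used by the project under test.

```python
from typing import Dict


def parse_header(text: str) -> Dict[str, str]:
    """Collect leading `Key: value` lines, stopping at the first blank line."""
    fields: Dict[str, str] = {}
    for line in text.splitlines():
        if not line.strip():
            break  # the blank line ends the header block
        # Split on the first colon only, so URLs and CPE strings stay intact.
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


HEADER = """\
Name: icu
URL: https://github.com/unicode-org/icu
Version: 71-1
CPEPrefix: cpe:/a:icu-project:international_components_for_unicode:71.1
License: MIT
Security Critical: yes

Description:
"""

fields = parse_header(HEADER)
print(fields["Version"], fields["License"])  # prints "71-1 MIT"
```

Splitting on the first colon only is the one design choice that matters here: the URL and CPEPrefix values contain further colons.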
Description:
This directory contains the source code of ICU 71.1 for C/C++.

A. How to update ICU

1. Run "scripts/update.sh <version>" (e.g. 71-1).
   This will download ICU from the upstream git repository.
   It does preserve Chrome-specific build files and
   converter files. (see section C)

   source.gni and icu.gyp* files are automatically updated, too.

2. Review and apply patches/changes in "D. Local Modifications" if
   necessary/applicable. Update patch files in patches/.

3. Follow the instructions in section B on building ICU data files

B. How to build ICU data files

Pre-built data files are generated and checked in with the following steps

1. icu data files for Chrome OS, Linux, Mac and Windows

   a. Make an ICU data build directory outside the Chromium source tree
      and cd to that directory (say, $ICUBUILDIR).

   b. Run
        ${CHROME_ICU_TREE_TOP}/scripts/make_data_all.sh

      This script takes the following steps:

      i) Run
           ${CHROME_ICU_TREE_TOP}/source/runConfigureICU Linux --disable-layout --disable-tests

      ii) Run make

      iii) (cd data && make clean)

      iv) scripts/config_data.sh common
          This configures the build with the filter for common.

      v) Run make

      vi) scripts/copy_data.sh common
          This copies the ICU data files for non-Android platforms
          (both Little and Big Endian) to the following locations:

          common/icudtl.dat
          common/icudtb.dat

      vii) Repeat steps iii) - vi) for chromeos to produce chromeos/icudtl.dat

      viii) cast/patch_locale.sh
            Modify the file for cast, android, ios and flutter.

      ix) Repeat steps iii) - vi) for cast, android and ios to produce
          cast/icudtl.dat
          android/icudtl.dat
          ios/icudtl.dat

      x) flutter/patch_brkitr.sh
         On top of cast/patch_locale.sh (step viii)), further patch
         the code for flutter.

      xi) Repeat steps iii) - vi) for flutter to produce
          flutter/icudtl.dat

      xii) scripts/clean_up_data_source.sh

           This reverts the result of cast/patch_locale.sh and flutter/patch_brkitr.sh
           and makes the tree ready for committing updated ICU data files for
           non-Android and Android platforms.

   c. Whenever data is updated (e.g. timezone update), repeat step b as long
      as the ICU build directory used in a. is kept.

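The order of steps i) to xii) is easier to check when written out as a flat command list. The following is a minimal sketch that only builds that list and runs nothing. It assumes the relative script paths in the steps resolve against ${CHROME_ICU_TREE_TOP}; the function names are hypothetical and are not part of make_data_all.sh.

```python
from typing import List

# Assumption: relative paths such as scripts/config_data.sh live under this root.
TREE = "${CHROME_ICU_TREE_TOP}"


def per_target_steps(target: str) -> List[str]:
    """Steps iii) to vi) for one data target (common, chromeos, cast, ...)."""
    return [
        "(cd data && make clean)",
        f"{TREE}/scripts/config_data.sh {target}",
        "make",
        f"{TREE}/scripts/copy_data.sh {target}",
    ]


def plan_data_build() -> List[str]:
    """Return the commands in the order the steps above describe."""
    plan = [
        f"{TREE}/source/runConfigureICU Linux --disable-layout --disable-tests",
        "make",
    ]
    for target in ("common", "chromeos"):       # steps iii) to vii)
        plan += per_target_steps(target)
    plan.append(f"{TREE}/cast/patch_locale.sh")  # step viii)
    for target in ("cast", "android", "ios"):   # step ix)
        plan += per_target_steps(target)
    plan.append(f"{TREE}/flutter/patch_brkitr.sh")  # step x)
    plan += per_target_steps("flutter")          # step xi)
    plan.append(f"{TREE}/scripts/clean_up_data_source.sh")  # step xii)
    return plan


if __name__ == "__main__":
    for command in plan_data_build():
        print(command)
```

Note that both patch scripts run before the targets that depend on them, and the clean-up script runs last so the tree is left unpatched.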
2. Note on the locale data customization

   - filter/chromeos.json
     a. Filter the locale data for ChromeOS's UI languages:
        locales, lang, region, currency, zone
     b. Filter the locale data for non-UI languages to the bare minimum:
        ExemplarCharacters, LocaleScript, layout, and the name of the
        language for a locale in its native language.
     c. Filter the legacy Chinese character set-based collation
        (big5han/gb2312han) that doesn't make any sense and nobody uses.

   - filter/common.json
     Same as above in filter/chromeos.json, AND
     e. Filter exemplar cities in timezone data (data/zone).

   - filter/android.json and filter/ios.json
     a. Filter the locale data for Android / iOS UI languages:
        locales, lang, region, currency, zone
     b. Filter the locale data for non-UI languages to the bare minimum:
        ExemplarCharacters, LocaleScript, layout, and the name of the
        language for a locale in its native language.
     c. Filter the legacy Chinese character set-based collation
     d. Filter source/data/{region,lang} to exclude these data
        except the language and script names of zh_Hans and zh_Hant.
     e. Keep only the minimal calendar data in data/locales.
     f. Include currency display names for a smaller subset of currencies.
     g. Minimize the locale data for 9 locales to which Chrome on Android
        is not localized.

C. Chromium-specific data build files and converters

They're preserved in step A.1 above. In general, there's no need to touch
them when updating ICU.

1. source/data/mappings
   - convrtrs.txt : Lists encodings and aliases required by the WHATWG
     Encoding spec plus a few extra (see the file as to why).

   - ucmlocal.txt : to list only converters we need.

   - *html.ucm: Mapping files per WHATWG encoding standards for EUC-JP,
     Shift_JIS, Big5 (Big5+Big5HKSCS), EUC-KR and all the single byte encodings.
     They're generated with scripts/{eucjp,sjis,big5,euckr,single_byte}_gen.sh.

   - gb18030.ucm and windows-936.ucm
     gb_table.patch was applied for the following changes. No need
     to apply it again. The patch is kept for the record.
     a. Map \xA3\xA0 to U+3000 instead of U+E5E5 in gb18030 and windows-936 per
        the encoding spec (one-way mapping in toUnicode direction).
     b. Map \xA8\xBF to U+01F9 instead of U+E7C8. Add one-way map
        from U+1E3F to \xA8\xBC (windows-936/GBK).
        See https://www.w3.org/Bugs/Public/show_bug.cgi?id=28740#c3

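The gb_table.patch changes are one-way overrides layered on a base mapping table. The sketch below illustrates that idea only. Python's built-in gb18030 codec stands in for the base table, which is an assumption for illustration; ICU's .ucm tables are a different implementation, and the function names are hypothetical.

```python
# Overrides in the toUnicode direction (bytes -> code point), as listed above.
TO_UNICODE_OVERRIDES = {
    b"\xa3\xa0": "\u3000",  # instead of U+E5E5
    b"\xa8\xbf": "\u01f9",  # instead of U+E7C8
}

# Extra one-way mapping in the fromUnicode direction (code point -> bytes).
FROM_UNICODE_EXTRA = {
    "\u1e3f": b"\xa8\xbc",
}


def decode_sequence(seq: bytes) -> str:
    """Decode one byte sequence, consulting the overrides first."""
    if seq in TO_UNICODE_OVERRIDES:
        return TO_UNICODE_OVERRIDES[seq]
    return seq.decode("gb18030")  # stand-in base table (assumption)


def encode_char(ch: str) -> bytes:
    """Encode one character, consulting the extra one-way map first."""
    if ch in FROM_UNICODE_EXTRA:
        return FROM_UNICODE_EXTRA[ch]
    return ch.encode("gb18030")  # stand-in base table (assumption)


print(hex(ord(decode_sequence(b"\xa3\xa0"))))  # prints 0x3000
print(encode_char("\u1e3f"))                   # prints b'\xa8\xbc'
```

Because each override applies in one direction only, decoding \xA3\xA0 yields U+3000, but encoding U+3000 still goes through the base table rather than back to \xA3\xA0.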
2. source/data/brkitr
   - dictionaries/khmerdict.txt: Abridged Khmer dictionary. See
     https://unicode-org.atlassian.net/browse/ICU-9451
   - dictionaries/laodict.txt: Abridged Lao dictionary. We keep using the smaller
     old version from ICU69-1.
   - rules/word_ja.txt (used only on Android)
     Added for Japanese-specific word-breaking without the C+J dictionary.
   - rules/{root,zh,zh_Hant}.txt
     a. Use line_normal by default.
     b. Drop local patches we used to have for the following issues. They'll
        be dealt with in the upstream (Unicode/CLDR).
        http://unicode.org/cldr/trac/ticket/6557
        http://unicode.org/cldr/trac/ticket/4200 (http://crbug.com/39779)

3. Add {an,ku,tg,wa}.txt to source/data/{locale,lang}
   with the minimal locale data necessary for spellchecker and
   language menus.

D. Local Modifications

1. Applied locale data patches from Google obtained by diff'ing
   the upstream copy and Google's internal copy for source/data

   - patches/locale_google.patch:
     * Google's internal ICU locale changes
     * Simpler region names for Hong Kong and Macau in all locales
     * Currency signs in ru and uk locales (do not include 'tr' locale changes)
     * AM/PM, midnight, noon formatting for a few Indian locales
     * Timezone name changes in Korean and Chinese locales
     * Default digit for Arabic locale is European digits.

   - patches/locale1.patch: Minor fixes for Korean

2. Breakiterator patches
   - patches/wordbrk.patch for word.txt
     a. Move full stops (U+002E, U+FF0E) from MidNumLet to MidNum so that
        FQDN labels can be split at '.'
     b. Move fullwidth digits (U+FF10 - U+FF19) from Ideographic to Numeric.
        See http://unicode.org/cldr/trac/ticket/6555

   - patches/khmer-dictbe.patch
     Adjust parameters to use a smaller Khmer dictionary (khmerdict.txt).
     https://unicode-org.atlassian.net/browse/ICU-9451

   - Add several common Chinese words that were dropped previously to
     source/data/cjdict/brkitr/cjdict.txt
     patch: patches/cjdict.patch
     upstream bug: https://unicode-org.atlassian.net/browse/ICU-10888

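The effect of moving the full stop from MidNumLet to MidNum (wordbrk.patch, item a) can be shown with a toy segmenter. This is a heavily simplified sketch of the UAX #29 idea that a MidNumLet character joins letters on both sides while a MidNum character only joins digits. It handles only letters, digits and '.', it is not ICU's rule engine, and split_words is a hypothetical name.

```python
from typing import List


def split_words(text: str, full_stop_class: str = "MidNumLet") -> List[str]:
    """Toy word splitter; `full_stop_class` is 'MidNumLet' or 'MidNum'."""
    tokens: List[str] = []
    current = ""
    for i, ch in enumerate(text):
        if ch == "." and current and i + 1 < len(text):
            prev, nxt = text[i - 1], text[i + 1]
            # Letter . letter joins only while '.' is in MidNumLet.
            joins_letters = (
                full_stop_class == "MidNumLet" and prev.isalpha() and nxt.isalpha()
            )
            # Digit . digit joins under both MidNumLet and MidNum.
            joins_digits = prev.isdigit() and nxt.isdigit()
            if joins_letters or joins_digits:
                current += ch
                continue
        if ch.isalnum():
            current += ch
        else:
            if current:
                tokens.append(current)
                current = ""
            tokens.append(ch)
    if current:
        tokens.append(current)
    return tokens


print(split_words("www.example.com"))                            # one token
print(split_words("www.example.com", full_stop_class="MidNum"))  # labels split at '.'
print(split_words("3.14", full_stop_class="MidNum"))             # number stays whole
```

In this model the patched class keeps decimal numbers intact while letting each FQDN label become its own word.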
3. Timezone data update
   Run scripts/update_tz.sh to grab the latest version of the
   following timezone data files and put them in source/data/misc

     metaZones.txt
     timezoneTypes.txt
     windowsZones.txt
     zoneinfo64.txt

   As of Oct 13, 2022, the latest version is 2022e
   and the above files are available at the ICU github repos.

4. Build-related changes

   - patches/configure.patch:
     * Remove a section of configure that will cause breakage while
       running runConfigureICU.

   - patches/wpo.patch (only needed when icudata dll is used).
     upstream bugs : https://unicode-org.atlassian.net/browse/ICU-8043
                     https://unicode-org.atlassian.net/browse/ICU-5701

   - patches/data_symb.patch :
     Put ICU_DATA_ENTRY_POINT(icudtXX_dat) in common when we use
     the icu data file or icudt.dll

   - patches/unused-var-unary-operators.patch:
     upstream bug: https://unicode-org.atlassian.net/browse/ICU-21966
     upstream PR: https://github.com/unicode-org/icu/pull/2055

5. ISO-2022-JP encoding (fromUnicode) change per WHATWG encoding spec.
   - patches/iso2022jp.patch
   - upstream bug:
     https://unicode-org.atlassian.net/browse/ICU-20251

6. Enable tracing of file but not resource, only for Chromium
   to reduce performance impact/risk.
   - patches/restrace.patch

7. Patch Arabic date time pattern back to 67 value to avoid test
   breakage in
   third_party/blink/web_tests/fast/forms/datetimelocal/datetimelocal-appearance-l10n.html
   - patches/ardatepattern.patch
   - https://bugs.chromium.org/p/chromium/issues/detail?id=1139186

8. Remove explicit std::atomic<NumberRangeFormatterImpl*> template
   instantiation
   patches/atomic_template_instantiation.patch
   - The explicit instantiation was added to silence MSVC C4251 warnings:
     https://unicode-org.atlassian.net/browse/ICU-20157
     Small test cases show that it is generally an error to instantiate
     std::atomic<T*> with an incomplete type T with MSVC, clang, and GCC, so this
     instantiation never should have worked:
     https://gcc.godbolt.org/z/34xx8h
     At this time, it's not clear if this particular instantiation with
     NumberRangeFormatterImpl* was ever necessary for MSVC. Further testing with
     MSVC is required to upstream this patch.
   - https://unicode-org.atlassian.net/browse/ICU-21482

9. Patch source/common/uposixdefs.h so it compiles on Fuchsia on Macs.
   patches/fuchsia.patch
   - context bug: https://bugs.chromium.org/p/chromium/issues/detail?id=1184527

10. Patch i18n/dtitvfmt.cpp to fix DateIntervalFormat regression
    patches/DateIntervalFormatnormalizeHourMetacharacters.patch
    - https://github.com/unicode-org/icu/pull/2060
    - https://unicode-org.atlassian.net/browse/ICU-21984

11. Patch common/locid.cpp to fix heap-buffer-overflow
    patches/AliasDataBuilder-readAlias.patch
    - https://patch-diff.githubusercontent.com/raw/unicode-org/icu/pull/2067
    - https://unicode-org.atlassian.net/browse/ICU-21994

12. Patch source/i18n/collationdatabuilder.*
    patches/collationdatabuilder.patch
    - https://github.com/unicode-org/icu/pull/2052
    - https://unicode-org.atlassian.net/browse/ICU-20715

13. Patch i18n/formatted_string_builder to fix int32_t overflow bug
    patches/formatted_string_builder.patch
    - https://github.com/unicode-org/icu/pull/2070
    - https://unicode-org.atlassian.net/browse/ICU-22005

14. Patch to fix C++20 enum issues
    patches/cxx20enum.patch
    - https://unicode-org.atlassian.net/browse/ICU-22014
    - https://github.com/unicode-org/icu/pull/2084

15. Patch to remove ATOMIC_VAR_INIT for C++20
    patches/rmATOMIC_VAR_INIT.patch
    - https://github.com/unicode-org/icu/pull/2090

16. Patch Calendar and TimeZone code to fix out-of-bound result in get()
    patches/calendar-get-out-of-bound.patch
    - https://unicode-org.atlassian.net/browse/ICU-22023
    - https://github.com/unicode-org/icu/pull/2086
    AND
    patches/calendar-get-out-of-bound2.patch
    - https://unicode-org.atlassian.net/browse/ICU-22043
    - https://github.com/unicode-org/icu/pull/2100

17. Patch TimeZone to fix incorrect name for "Africa/Casablanca"
    patches/timezone-rawoffset.patch
    - https://github.com/unicode-org/icu/pull/2096
    - https://unicode-org.atlassian.net/browse/ICU-22041

18. Patch NumberRangeFormatter to fix numbering system resolution.
    patches/number_range_format.patch
    - https://github.com/unicode-org/icu/pull/2085
    - https://unicode-org.atlassian.net/browse/ICU-22017

19. Patch Calendar to return error ASAP to avoid incorrect assert
    patches/calendar_return_error_early.patch
    - https://github.com/unicode-org/icu/pull/2177
    - https://unicode-org.atlassian.net/browse/ICU-22070
