Commit 91bb7d2

Merge branch 'master' into lkchen-ray_data_llm
Signed-off-by: Linkun Chen <github@lkchen.net>
2 parents 5a9f079 + fde2b2d commit 91bb7d2

64 files changed, +3963 -1505 lines changed


.buildkite/cicd.rayci.yml

Lines changed: 2 additions & 0 deletions
@@ -13,6 +13,7 @@ steps:
  depends_on:
  - oss-ci-base_test
  - forge
+ tags: tools
  - label: ":coral: reef: privileged container tests"
  commands:
  - bazel run //ci/ray_ci:test_in_docker --
@@ -25,3 +26,4 @@ steps:
  depends_on:
  - oss-ci-base_test
  - forge
+ tags: tools

.buildkite/core.rayci.yml

Lines changed: 1 addition & 0 deletions
@@ -338,6 +338,7 @@ steps:
  key: core_flaky_tests
  tags:
  - python
+ - flaky
  - skip-on-premerge
  instance_type: large
  soft_fail: true

.buildkite/data.rayci.yml

Lines changed: 1 addition & 0 deletions
@@ -221,6 +221,7 @@ steps:
  tags:
  - python
  - data
+ - flaky
  - skip-on-premerge
  instance_type: medium
  soft_fail: true

.buildkite/lint.rayci.yml

Lines changed: 4 additions & 0 deletions
@@ -2,6 +2,9 @@ group: lint
  steps:
  - label: ":lint-roller: lint: {{matrix}}"
  key: lint-small
+ tags:
+ - lint
+ - always
  depends_on:
  - forge
  commands:
@@ -25,6 +28,7 @@ steps:
  tags:
  - oss
  - lint
+ - always
  key: lint-medium
  instance_type: medium
  depends_on: docbuild

.buildkite/macos/macos.rayci.yml

Lines changed: 1 addition & 0 deletions
@@ -111,6 +111,7 @@ steps:
  - python
  - macos_wheels
  - oss
+ - flaky
  - skip_on_premerge
  job_env: MACOS
  instance_type: macos

.buildkite/ml.rayci.yml

Lines changed: 1 addition & 0 deletions
@@ -274,6 +274,7 @@ steps:
  key: ml_flaky_tests
  tags:
  - train
+ - flaky
  - skip-on-premerge
  instance_type: large
  commands:

.buildkite/others.rayci.yml

Lines changed: 2 additions & 0 deletions
@@ -18,6 +18,8 @@ steps:
  soft_fail: true
  job_env: oss-ci-base_test-py3.11
  depends_on: oss-ci-base_test-multipy
+ tags:
+ - always

  - label: ":tapioca: build: uv pip compile LLM dependencies"
  key: uv_pip_compile_llm_dependencies

.buildkite/rllib.rayci.yml

Lines changed: 3 additions & 0 deletions
@@ -194,6 +194,7 @@ steps:
  tags:
  - rllib_gpu
  - gpu
+ - flaky
  - skip-on-premerge
  instance_type: gpu
  commands:
@@ -209,6 +210,7 @@ steps:
  key: rllib_flaky_tests_01
  tags:
  - rllib
+ - flaky
  - skip-on-premerge
  instance_type: large
  commands:
@@ -238,6 +240,7 @@ steps:
  key: rllib_flaky_tests_02
  tags:
  - rllib
+ - flaky
  - skip-on-premerge
  instance_type: large
  commands:

.buildkite/serve.rayci.yml

Lines changed: 2 additions & 1 deletion
@@ -183,8 +183,9 @@ steps:
  key: serve_flaky_tests
  tags:
  - serve
- - skip-on-premerge
  - python
+ - flaky
+ - skip-on-premerge
  instance_type: medium
  soft_fail: true
  commands:

.buildkite/windows.rayci.yml

Lines changed: 9 additions & 3 deletions
@@ -128,7 +128,9 @@ steps:

  - label: "flaky :windows: core tests"
  key: windows_core_flaky_tests
- tags: skip-on-premerge
+ tags:
+ - flaky
+ - skip-on-premerge
  job_env: WINDOWS
  instance_type: windows
  commands:
@@ -146,7 +148,9 @@ steps:

  - label: "flaky :windows: serverless tests"
  key: windows_serverless_flaky_tests
- tags: skip-on-premerge
+ tags:
+ - flaky
+ - skip-on-premerge
  job_env: WINDOWS
  instance_type: windows
  commands:
@@ -164,7 +168,9 @@ steps:

  - label: "flaky :windows: serve tests"
  key: windows_serve_flaky_tests
- tags: skip-on-premerge
+ tags:
+ - flaky
+ - skip-on-premerge
  job_env: WINDOWS
  instance_type: windows
  commands:

.pre-commit-config.yaml

Lines changed: 2 additions & 2 deletions
@@ -58,9 +58,9 @@ repos:
  rev: 8.0.1
  hooks:
  - id: buildifier
- files: ^(src|cpp|python)(/[^/]+)*/BUILD$
+ files: ^(src|cpp|python|rllib)(/[^/]+)*/BUILD$
  - id: buildifier-lint
- files: ^(src|cpp|python)(/[^/]+)*/BUILD$
+ files: ^(src|cpp|python|rllib)(/[^/]+)*/BUILD$

  - repo: https://github.com/psf/black
  rev: 22.10.0
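
Reviewer note: the two `files` changes above widen the buildifier hooks to cover `rllib`. The sketch below compares the old and new patterns against a handful of paths. The sample paths are hypothetical and chosen only for illustration, and the use of `re.search` reflects my understanding of how pre-commit applies `files` patterns rather than anything stated in this commit.

```python
import re

# Patterns copied from the old and new `files:` values in the hunk above.
OLD_PATTERN = re.compile(r"^(src|cpp|python)(/[^/]+)*/BUILD$")
NEW_PATTERN = re.compile(r"^(src|cpp|python|rllib)(/[^/]+)*/BUILD$")

# Hypothetical repository paths, used only to exercise the patterns.
SAMPLE_PATHS = [
    "src/ray/common/BUILD",
    "python/ray/BUILD",
    "rllib/BUILD",
    "rllib/algorithms/BUILD",
    "doc/BUILD",
    "BUILD.bazel",
]


def matched(pattern, paths):
    """Return the paths the pattern selects, in input order."""
    return [p for p in paths if pattern.search(p)]


# Paths picked up by the new pattern but not by the old one.
newly_covered = sorted(
    set(matched(NEW_PATTERN, SAMPLE_PATHS)) - set(matched(OLD_PATTERN, SAMPLE_PATHS))
)
print(newly_covered)
```

Under these assumptions only the `rllib/.../BUILD` paths are newly selected. Files named `BUILD.bazel`, including the top-level one changed in this commit, match neither pattern because of the `/BUILD$` anchor.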

.rayciversion

Lines changed: 1 addition & 1 deletion
@@ -1 +1 @@
- 0.10.0
+ 0.11.0

BUILD.bazel

Lines changed: 2 additions & 14 deletions
@@ -712,17 +712,6 @@ ray_cc_library(
  ],
  )

- ray_cc_library(
- name = "gcs_autoscaler_state_manager",
- srcs = ["src/ray/gcs/gcs_server/gcs_autoscaler_state_manager.cc"],
- hdrs = ["src/ray/gcs/gcs_server/gcs_autoscaler_state_manager.h"],
- deps = [
- ":gcs_init_data",
- ":gcs_kv_manager",
- ":gcs_service_rpc",
- ],
- )
-
  ray_cc_library(
  name = "gcs_worker_manager",
  srcs = ["src/ray/gcs/gcs_server/gcs_worker_manager.cc"],
@@ -890,9 +879,7 @@ ray_cc_library(
  "@com_google_absl//absl/container:flat_hash_map",
  "@com_google_absl//absl/memory",
  "@com_google_absl//absl/strings",
- "@com_google_googletest//:gtest",
- "@io_opencensus_cpp//opencensus/exporters/stats/prometheus:prometheus_exporter",
- "@io_opencensus_cpp//opencensus/exporters/stats/stdout:stdout_exporter",
+ "@com_google_googletest//:gtest_prod",
  "@io_opencensus_cpp//opencensus/stats",
  "@io_opencensus_cpp//opencensus/tags",
  ],
@@ -2010,6 +1997,7 @@ ray_cc_test(
  deps = [
  ":grpc_common_lib",
  ":test_service_cc_grpc",
+ "@com_google_googletest//:gtest_main",
  ],
  )

ci/k8s/run-kuberay-doc-tests.sh

Lines changed: 56 additions & 1 deletion
@@ -10,6 +10,61 @@ pip install -c python/requirements_compiled.txt pytest nbval bash_kernel
  python -m bash_kernel.install
  pip install "ray[default]==2.41.0"

+ echo "--- Run a deliberate failure test to ensure the test script fails on error"
+ # The following Jupyter notebook only contains a single cell that runs the `date` command.
+ # The test script should fail because the output of the `date` command is different everytime.
+ cat <<EOF > test.ipynb
+ {
+ "cells": [
+ {
+ "cell_type": "code",
+ "execution_count": 1,
+ "id": "43a8bb95-f6f2-45a8-ba48-b16856b2106d",
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "Wed Mar 26 06:28:51 PM CST 2025\n"
+ ]
+ }
+ ],
+ "source": [
+ "date"
+ ]
+ }
+ ],
+ "metadata": {
+ "kernelspec": {
+ "display_name": "Bash",
+ "language": "bash",
+ "name": "bash"
+ },
+ "language_info": {
+ "codemirror_mode": "shell",
+ "file_extension": ".sh",
+ "mimetype": "text/x-sh",
+ "name": "bash"
+ }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+ }
+ EOF
+ set +e
+ if pytest --nbval test.ipynb --nbval-kernel-name bash; then
+ echo "The test script should have failed but it didn't."
+ exit 1
+ fi
+ set -e
+
  echo "--- Run doc tests"
  cd doc/source/cluster/kubernetes
- py.test --nbval getting-started/raycluster-quick-start.ipynb --nbval-kernel-name bash --sanitize-with doc_sanitize.cfg
+ TESTS=(
+ "getting-started/raycluster-quick-start.ipynb"
+ )
+ for test in "${TESTS[@]}"; do
+ echo "Running test: ${test}"
+ pytest --nbval "${test}" --nbval-kernel-name bash --sanitize-with doc_sanitize.cfg
+ done
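
Reviewer note: the added block is a negative control. It runs a notebook that is expected to fail and aborts the script if it unexpectedly passes, so that a silently broken test harness gets noticed. A minimal Python sketch of the same pattern follows. The `expect_failure` helper and the two stand-in commands are hypothetical and replace the real `pytest --nbval` invocation only so that the example is self-contained.

```python
import subprocess
import sys


def expect_failure(cmd):
    """Return True if cmd exits non-zero, i.e. the control failed as intended.

    Mirrors the shell idiom `if <cmd>; then exit 1; fi` from the script above.
    """
    result = subprocess.run(cmd, capture_output=True)
    return result.returncode != 0


# Hypothetical stand-ins for the real test command.
always_fails = [sys.executable, "-c", "import sys; sys.exit(1)"]
always_passes = [sys.executable, "-c", "pass"]

control_ok = expect_failure(always_fails)       # harness reports the failure
control_broken = expect_failure(always_passes)  # harness swallowed the failure

print(control_ok, control_broken)
```

In a real script, a `False` result from the control would be the signal to exit with a non-zero status before running the actual doc tests.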

doc/source/data/examples/data_juicer_distributed_data_processing.md

Lines changed: 2 additions & 2 deletions
@@ -23,7 +23,7 @@ See the [Data-Juicer 2.0: Cloud-Scale Adaptive Data Processing for Foundation Mo

  ### Ray mode in Data-Juicer

- - For most implementations of Data-Juicer [operators](https://github.com/modelscope/data-juicer/blob/main/docs/Operators.md), the core processing functions are engine-agnostic. Operators manage interoperability is primarily in [RayDataset](https://github.com/modelscope/data-juicer/blob/main//data_juicer/core/ray_data.py) and [RayExecutor](https://github.com/modelscope/data-juicer/blob/main//data_juicer/core/ray_executor.py), which are subclasses of the base `DJDataset` and `BaseExecutor`, respectively, and support both Ray [Tasks](ray-remote-functions) and [Actors](actor-guide).
+ - For most implementations of Data-Juicer [operators](https://github.com/modelscope/data-juicer/blob/main/docs/Operators.md), the core processing functions are engine-agnostic. Operators manage interoperability is primarily in [RayDataset](https://github.com/modelscope/data-juicer/blob/main/data_juicer/core/data/ray_dataset.py) and [RayExecutor](https://github.com/modelscope/data-juicer/blob/main/data_juicer/core/executor/ray_executor.py), which are subclasses of the base `DJDataset` and `BaseExecutor`, respectively, and support both Ray [Tasks](ray-remote-functions) and [Actors](actor-guide).
  - The exception is the deduplication operators, which are challenging to scale in standalone mode. The names of these operators follow the pattern of [`ray_xx_deduplicator`](https://github.com/modelscope/data-juicer/blob/main//data_juicer/ops/deduplicator/).

  ### Subset splitting
@@ -152,4 +152,4 @@ python tools/process_data.py --config demos/process_on_ray/configs/dedup.yaml
  dj-process --config demos/process_on_ray/configs/dedup.yaml
  ```

- Data-Juicer deduplicates the demo dataset with the demo config file and export the result datasets to the directory specified by the `export_path` argument in the config file.
+ Data-Juicer deduplicates the demo dataset with the demo config file and export the result datasets to the directory specified by the `export_path` argument in the config file.
