
Commit 7a0673b

update of dr::shp::sort() (#1614)
1 parent f7c82cc commit 7a0673b

15 files changed: +9003 -8417 lines changed

.clang-format

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 BasedOnStyle: LLVM
 
-Standard: c++17
+Standard: c++20
 
 IndentWidth: 4
 ColumnLimit: 120

CODE_OF_CONDUCT.md

Lines changed: 128 additions & 0 deletions
@@ -0,0 +1,128 @@
+# Contributor Covenant Code of Conduct
+
+## Our Pledge
+
+We as members, contributors, and leaders pledge to make participation in our
+community a harassment-free experience for everyone, regardless of age, body
+size, visible or invisible disability, ethnicity, sex characteristics, gender
+identity and expression, level of experience, education, socio-economic status,
+nationality, personal appearance, race, religion, or sexual identity
+and orientation.
+
+We pledge to act and interact in ways that contribute to an open, welcoming,
+diverse, inclusive, and healthy community.
+
+## Our Standards
+
+Examples of behavior that contributes to a positive environment for our
+community include:
+
+* Demonstrating empathy and kindness toward other people
+* Being respectful of differing opinions, viewpoints, and experiences
+* Giving and gracefully accepting constructive feedback
+* Accepting responsibility and apologizing to those affected by our mistakes,
+  and learning from the experience
+* Focusing on what is best not just for us as individuals, but for the
+  overall community
+
+Examples of unacceptable behavior include:
+
+* The use of sexualized language or imagery, and sexual attention or
+  advances of any kind
+* Trolling, insulting or derogatory comments, and personal or political attacks
+* Public or private harassment
+* Publishing others' private information, such as a physical or email
+  address, without their explicit permission
+* Other conduct which could reasonably be considered inappropriate in a
+  professional setting
+
+## Enforcement Responsibilities
+
+Community leaders are responsible for clarifying and enforcing our standards of
+acceptable behavior and will take appropriate and fair corrective action in
+response to any behavior that they deem inappropriate, threatening, offensive,
+or harmful.
+
+Community leaders have the right and responsibility to remove, edit, or reject
+comments, commits, code, wiki edits, issues, and other contributions that are
+not aligned to this Code of Conduct, and will communicate reasons for moderation
+decisions when appropriate.
+
+## Scope
+
+This Code of Conduct applies within all community spaces, and also applies when
+an individual is officially representing the community in public spaces.
+Examples of representing our community include using an official e-mail address,
+posting via an official social media account, or acting as an appointed
+representative at an online or offline event.
+
+## Enforcement
+
+Instances of abusive, harassing, or otherwise unacceptable behavior may be
+reported to the community leaders responsible for enforcement at
+oneDPLCodeOfConduct@intel.com.
+All complaints will be reviewed and investigated promptly and fairly.
+
+All community leaders are obligated to respect the privacy and security of the
+reporter of any incident.
+
+## Enforcement Guidelines
+
+Community leaders will follow these Community Impact Guidelines in determining
+the consequences for any action they deem in violation of this Code of Conduct:
+
+### 1. Correction
+
+**Community Impact**: Use of inappropriate language or other behavior deemed
+unprofessional or unwelcome in the community.
+
+**Consequence**: A private, written warning from community leaders, providing
+clarity around the nature of the violation and an explanation of why the
+behavior was inappropriate. A public apology may be requested.
+
+### 2. Warning
+
+**Community Impact**: A violation through a single incident or series
+of actions.
+
+**Consequence**: A warning with consequences for continued behavior. No
+interaction with the people involved, including unsolicited interaction with
+those enforcing the Code of Conduct, for a specified period of time. This
+includes avoiding interactions in community spaces as well as external channels
+like social media. Violating these terms may lead to a temporary or
+permanent ban.
+
+### 3. Temporary Ban
+
+**Community Impact**: A serious violation of community standards, including
+sustained inappropriate behavior.
+
+**Consequence**: A temporary ban from any sort of interaction or public
+communication with the community for a specified period of time. No public or
+private interaction with the people involved, including unsolicited interaction
+with those enforcing the Code of Conduct, is allowed during this period.
+Violating these terms may lead to a permanent ban.
+
+### 4. Permanent Ban
+
+**Community Impact**: Demonstrating a pattern of violation of community
+standards, including sustained inappropriate behavior, harassment of an
+individual, or aggression toward or disparagement of classes of individuals.
+
+**Consequence**: A permanent ban from any sort of public interaction within
+the community.
+
+## Attribution
+
+This Code of Conduct is adapted from the [Contributor Covenant][homepage],
+version 2.0, available at
+https://www.contributor-covenant.org/version/2/0/code_of_conduct.html.
+
+Community Impact Guidelines were inspired by [Mozilla's code of conduct
+enforcement ladder](https://github.com/mozilla/diversity).
+
+[homepage]: https://www.contributor-covenant.org
+
+For answers to common questions about this code of conduct, see the FAQ at
+https://www.contributor-covenant.org/faq. Translations are available at
+https://www.contributor-covenant.org/translations.

documentation/library_guide/parallel_api/iterators.rst

Lines changed: 4 additions & 2 deletions
@@ -78,9 +78,11 @@ header. All iterators are implemented in the ``oneapi::dpl`` namespace.
 The ``transform_iterator`` class provides the following constructors:
 
 * ``transform_iterator()``: instantiates the iterator using a default constructed base iterator and unary functor.
-This constructor participates in overload resolution only if the base iterator and unary functor are both default constructible.
+This constructor participates in overload resolution only if the base iterator and unary functor are both default constructible.
+
 * ``transform_iterator(iter)``: instantiates the iterator using the base iterator provided and a default constructed
-unary functor. This constructor participates in overload resolution only if the unary functor is default constructible.
+unary functor. This constructor participates in overload resolution only if the unary functor is default constructible.
+
 * ``transform_iterator(iter, func)``: instantiates the iterator using the base iterator and unary functor provided.
 
 To simplify the construction of the iterator, ``oneapi::dpl::make_transform_iterator`` is provided. The

documentation/release_notes.rst

Lines changed: 87 additions & 2 deletions
@@ -8,6 +8,90 @@ The Intel® oneAPI DPC++ Library (oneDPL) accompanies the Intel® oneAPI DPC++/C
 and provides high-productivity APIs aimed to minimize programming efforts of C++ developers
 creating efficient heterogeneous applications.
 
+New in 2022.6.0
+===============
+News
+------------
+- `oneAPI DPC++ Library Manual Migration Guide`_ to simplify the migration of Thrust* and CUB* APIs from CUDA*.
+- ``radix_sort`` and ``radix_sort_by_key`` kernel templates were moved into
+  ``oneapi::dpl::experimental::kt::gpu::esimd`` namespace. The former ``oneapi::dpl::experimental::kt::esimd``
+  namespace is deprecated and will be removed in a future release.
+- The ``for_loop``, ``for_loop_strided``, ``for_loop_n``, ``for_loop_n_strided`` algorithms
+  in `namespace oneapi::dpl::experimental` are enforced to fail with device execution policies.
+
+New Features
+------------
+- Added experimental ``inclusive_scan`` kernel template algorithm residing in
+  the ``oneapi::dpl::experimental::kt::gpu`` namespace.
+- ``radix_sort`` and ``radix_sort_by_key`` kernel templates are extended with overloads for out-of-place sorting.
+  These overloads preserve the input sequence and sort data into the user provided output sequence.
+- Improved performance of the ``reduce``, ``min_element``, ``max_element``, ``minmax_element``, ``is_partitioned``,
+  ``lexicographical_compare``, ``binary_search``, ``lower_bound``, and ``upper_bound`` algorithms with device policies.
+- ``sort``, ``stable_sort``, ``sort_by_key`` algorithms now use Radix sort [#fnote1]_
+  for sorting ``sycl::half`` elements compared with ``std::less`` or ``std::greater``.
+
+Fixed Issues
+------------
+- Fixed compilation errors when using ``reduce``, ``min_element``, ``max_element``, ``minmax_element``,
+  ``is_partitioned``, and ``lexicographical_compare`` with Intel oneAPI DPC++/C++ compiler 2023.0 and earlier.
+- Fixed possible data races in the following algorithms used with device execution policies:
+  ``remove_if``, ``unique``, ``inplace_merge``, ``stable_partition``, ``partial_sort_copy``, ``rotate``.
+- Fixed excessive copying of data in ``std::vector`` allocated with a USM allocator for standard library
+  implementations which have allocator information in the ``std::vector::iterator`` type.
+- Fixed an issue where checking ``std::is_default_constructible`` for ``transform_iterator`` with a functor
+  that is not default-constructible could cause a build error or an incorrect result.
+- Fixed handling of `sycl device copyable`_ for internal and public oneDPL types.
+- Fixed handling of ``std::reverse_iterator`` as input to oneDPL algorithms using a device policy.
+- Fixed ``set_intersection`` to always copy from the first input sequence to the output,
+  where previously some calls would copy from the second input sequence.
+- Fixed compilation errors when using ``oneapi::dpl::zip_iterator`` with the oneTBB backend and C++20.
+
+Known Issues and Limitations
+----------------------------
+New in This Release
+^^^^^^^^^^^^^^^^^^^
+- ``histogram`` algorithm requires the output value type to be an integral type no larger than 4 bytes
+  when used with an FPGA policy.
+
+Existing Issues
+^^^^^^^^^^^^^^^
+See oneDPL Guide for other `restrictions and known limitations`_.
+
+- When compiled with ``-fsycl-pstl-offload`` option of Intel oneAPI DPC++/C++ compiler and with
+  ``libstdc++`` version 8 or ``libc++``, ``oneapi::dpl::execution::par_unseq`` offloads
+  standard parallel algorithms to the SYCL device similarly to ``std::execution::par_unseq``
+  in accordance with the ``-fsycl-pstl-offload`` option value.
+- When using the dpl modulefile to initialize the user's environment and compiling with ``-fsycl-pstl-offload``
+  option of Intel® oneAPI DPC++/C++ compiler, a linking issue or program crash may be encountered due to the directory
+  containing libpstloffload.so not being included in the search path. Use the env/vars.sh to configure the working
+  environment to avoid the issue.
+- Compilation issues may be encountered when passing zip iterators to ``exclusive_scan_by_segment`` on Windows.
+- For ``transform_exclusive_scan`` and ``exclusive_scan`` to run in-place (that is, with the same data
+  used for both input and destination) and with an execution policy of ``unseq`` or ``par_unseq``,
+  it is required that the provided input and destination iterators are equality comparable.
+  Furthermore, the equality comparison of the input and destination iterator must evaluate to true.
+  If these conditions are not met, the result of these algorithm calls is undefined.
+- ``sort``, ``stable_sort``, ``sort_by_key``, ``partial_sort_copy`` algorithms may work incorrectly or cause
+  a segmentation fault when used a DPC++ execution policy for CPU device, and built
+  on Linux with Intel® oneAPI DPC++/C++ Compiler and -O0 -g compiler options.
+  To avoid the issue, pass ``-fsycl-device-code-split=per_kernel`` option to the compiler.
+- Incorrect results may be produced by ``exclusive_scan``, ``inclusive_scan``, ``transform_exclusive_scan``,
+  ``transform_inclusive_scan``, ``exclusive_scan_by_segment``, ``inclusive_scan_by_segment``, ``reduce_by_segment``
+  with ``unseq`` or ``par_unseq`` policy when compiled by Intel® oneAPI DPC++/C++ Compiler
+  with ``-fiopenmp``, ``-fiopenmp-simd``, ``-qopenmp``, ``-qopenmp-simd`` options on Linux.
+  To avoid the issue, pass ``-fopenmp`` or ``-fopenmp-simd`` option instead.
+- Incorrect results may be produced by ``reduce``, ``reduce_by_segment``, and ``transform_reduce``
+  with 64-bit data types when compiled by Intel® oneAPI DPC++/C++ Compiler versions 2021.3 and newer
+  and executed on GPU devices.
+  For a workaround, define the ``ONEDPL_WORKAROUND_FOR_IGPU_64BIT_REDUCTION`` macro to ``1`` before
+  including oneDPL header files.
+- ``std::tuple``, ``std::pair`` cannot be used with SYCL buffers to transfer data between host and device.
+- ``std::array`` cannot be swapped in DPC++ kernels with ``std::swap`` function or ``swap`` member function
+  in the Microsoft* Visual C++ standard library.
+- The ``oneapi::dpl::experimental::ranges::reverse`` algorithm is not available with ``-fno-sycl-unnamed-lambda`` option.
+- STL algorithm functions (such as ``std::for_each``) used in DPC++ kernels do not compile with the debug version of
+  the Microsoft* Visual C++ standard library.
+
 New in 2022.5.0
 ===============
 
@@ -661,8 +745,8 @@ Known Issues and Limitations
   (including ``std::ldexp``, ``std::frexp``, ``std::sqrt(std::complex<float>)``) require device support
   for double precision.
 
-.. [#fnote1] The sorting algorithms in oneDPL use Radix sort for arithmetic data types compared with
-   ``std::less`` or ``std::greater``, otherwise Merge sort.
+.. [#fnote1] The sorting algorithms in oneDPL use Radix sort for arithmetic data types and
+   ``sycl::half`` (since oneDPL 2022.6) compared with ``std::less`` or ``std::greater``, otherwise Merge sort.
 .. _`the oneDPL Specification`: https://spec.oneapi.com/versions/latest/elements/oneDPL/source/index.html
 .. _`oneDPL Guide`: https://oneapi-src.github.io/oneDPL/index.html
 .. _`Intel® oneAPI Threading Building Blocks (oneTBB) Release Notes`: https://www.intel.com/content/www/us/en/developer/articles/release-notes/intel-oneapi-threading-building-blocks-release-notes.html
@@ -671,3 +755,4 @@ Known Issues and Limitations
 .. _`Macros`: https://oneapi-src.github.io/oneDPL/macros.html
 .. _`2022.0 Changes`: https://oneapi-src.github.io/oneDPL/oneDPL_2022.0_changes.html
 .. _`sycl device copyable`: https://registry.khronos.org/SYCL/specs/sycl-2020/html/sycl-2020.html#sec::device.copyable
+.. _`oneAPI DPC++ Library Manual Migration Guide`: https://www.intel.com/content/www/us/en/developer/articles/guide/oneapi-dpcpp-library-manual-migration.html
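
The rule in the updated `[#fnote1]` footnote (Radix sort for arithmetic types and `sycl::half` compared with `std::less` or `std::greater`, otherwise Merge sort) can be read as a compile-time selection. The sketch below shows one way such a rule could be written with standard type traits. It does not reproduce oneDPL internals: `toy::half_like` is a hypothetical stand-in for `sycl::half`, and `is_radix_key`, `is_less_or_greater`, `select_sort`, and `sort_kind` are illustrative names only.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <type_traits>

namespace toy
{

// Hypothetical stand-in for sycl::half so the sketch builds without SYCL.
struct half_like
{
    unsigned short bits;
};

// True for std::less<T> and std::greater<T> (including the transparent <void> forms).
template <typename Compare>
struct is_less_or_greater : std::false_type
{
};
template <typename T>
struct is_less_or_greater<std::less<T>> : std::true_type
{
};
template <typename T>
struct is_less_or_greater<std::greater<T>> : std::true_type
{
};

// Key types eligible for radix sort: arithmetic types plus the half stand-in.
template <typename T>
struct is_radix_key : std::is_arithmetic<T>
{
};
template <>
struct is_radix_key<half_like> : std::true_type
{
};

enum class sort_kind
{
    radix,
    merge
};

// Illustrative selection rule mirroring the footnote's wording.
template <typename T, typename Compare>
constexpr sort_kind select_sort()
{
    return (is_radix_key<T>::value && is_less_or_greater<Compare>::value) ? sort_kind::radix : sort_kind::merge;
}

static_assert(select_sort<int, std::less<int>>() == sort_kind::radix);
static_assert(select_sort<half_like, std::greater<>>() == sort_kind::radix);
static_assert(select_sort<std::string, std::less<std::string>>() == sort_kind::merge);

} // namespace toy
```

Under this reading, a custom comparator (for example a lambda) always falls back to the merge path, even for arithmetic keys, because the selection cannot see how the comparator orders elements.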

include/oneapi/dpl/internal/distributed_ranges_impl/concepts/concepts.hpp

Lines changed: 11 additions & 20 deletions
@@ -10,29 +10,25 @@ namespace oneapi::dpl::experimental::dr
 {
 
 template <typename I>
-concept remote_iterator = std::forward_iterator<I>&&
-requires(I& iter)
+concept remote_iterator = std::forward_iterator<I> && requires(I& iter)
 {
     ranges::rank(iter);
 };
 
 template <typename R>
-concept remote_range = rng::forward_range<R>&&
-requires(R& r)
+concept remote_range = rng::forward_range<R> && requires(R& r)
 {
     ranges::rank(r);
 };
 
 template <typename R>
-concept distributed_range = rng::forward_range<R>&&
-requires(R& r)
+concept distributed_range = rng::forward_range<R> && requires(R& r)
 {
     ranges::segments(r);
 };
 
 template <typename I>
-concept remote_contiguous_iterator = std::random_access_iterator<I>&&
-requires(I& iter)
+concept remote_contiguous_iterator = std::random_access_iterator<I> && requires(I& iter)
 {
     ranges::rank(iter);
     {
@@ -41,39 +37,34 @@ requires(I& iter)
 };
 
 template <typename I>
-concept distributed_iterator = std::forward_iterator<I>&&
-requires(I& iter)
+concept distributed_iterator = std::forward_iterator<I> && requires(I& iter)
 {
     ranges::segments(iter);
 };
 
 template <typename R>
-concept remote_contiguous_range = remote_range<R>&& rng::random_access_range<R>&&
-requires(R& r)
+concept remote_contiguous_range = remote_range<R> && rng::random_access_range<R> && requires(R& r)
 {
     {
         ranges::local(r)
     } -> rng::contiguous_range;
 };
 
 template <typename R>
-concept distributed_contiguous_range = distributed_range<R>&& rng::random_access_range<R>&&
-requires(R& r)
+concept distributed_contiguous_range = distributed_range<R> && rng::random_access_range<R> && requires(R& r)
 {
     {
         ranges::segments(r)
     } -> rng::random_access_range;
-}
-&&remote_contiguous_range<rng::range_value_t<decltype(ranges::segments(std::declval<R>()))>>;
+} && remote_contiguous_range<rng::range_value_t<decltype(ranges::segments(std::declval<R>()))>>;
 
 template <typename Iter>
-concept distributed_contiguous_iterator = distributed_iterator<Iter>&& std::random_access_iterator<Iter>&&
-requires(Iter& iter)
+concept distributed_contiguous_iterator = distributed_iterator<Iter> && std::random_access_iterator<Iter> &&
+    requires(Iter& iter)
 {
     {
         ranges::segments(iter)
     } -> rng::random_access_range;
-}
-&&remote_contiguous_range<rng::range_value_t<decltype(ranges::segments(std::declval<Iter>()))>>;
+} && remote_contiguous_range<rng::range_value_t<decltype(ranges::segments(std::declval<Iter>()))>>;
 
 } // namespace oneapi::dpl::experimental::dr
