|
6 | 6 |
|
7 | 7 | ## Current
|
8 | 8 |
|
| 9 | +## New Features and Enhancements |
| 10 | + |
| 11 | +### CL/HIER |
| 12 | +- Disable onesided alltoallv {PR #875} |
| 13 | + |
| 14 | +### TL/CUDA |
| 15 | +- Initialize remote CUDA scratch to NULL {PR #911} |
| 16 | + |
| 17 | + |
| 18 | +### TL/UCP |
| 19 | +- Enable hybrid alltoallv {PR #781} |
| 20 | +- Avoid copy in knomial scatter {PR #771} |
| 21 | +- Enable rank reordering for reduce_scatter, Knomial Allreduce, and Ring Allgather/v {PR #819} |
| 22 | +- Remove memcpy in last SRA step {PR #743} |
| 23 | +- Fix sparse pack in hybrid a2av {PR #825} |
| 24 | +- Fix recycle in hybrid a2av {PR #827} |
| 25 | +- Reorder ranks for SRA {PR #834} |
| 26 | +- Use ring allgather when reordering needed {PR #879} |
| 27 | +- Use pipelining in SRA allreduce for CUDA {PR #873} |
| 28 | +- Poll for onesided alltoall completion {PR #876} |
| 29 | +- Add support for non-host buffers in bruck alltoall {PR #852} |
| 30 | +- Added Neighbor Exchange Allgather {PR #822} |
| 31 | + |
| 32 | +### TL/SHARP |
| 33 | +- Enable bcast for any predefined dt {PR #774} |
| 34 | +- Don't print team create error {PR #777} |
| 35 | +- Check datasize supported {PR #776} |
| 36 | +- Fix sharp context cleanup {PR #843} |
| 37 | + |
| 38 | +### API |
| 39 | +- Remove duplicate get_version_string {PR #933} |
| 40 | + |
| 41 | +### TL/NCCL |
| 42 | +- Make team init non-blocking {PR #772} |
| 43 | +- Add CUDA managed to score {PR #793} |
| 44 | +- Make ncclGroupEnd nb {PR #798} |
| 45 | +- Lazy init nccl comm {PR #851} |
| 46 | + |
| 47 | +### TL/MLX5 |
| 48 | +- Share ib_ctx and pd {PR #749} |
| 49 | +- Rcache {PR #753} |
| 50 | +- Device memory and topo init {PR #780} |
| 51 | +- Adding mcast interface {PR #784} |
| 52 | +- A2A part 1 -- coll init {PR #790} |
| 53 | +- A2A part 2 -- full collective {PR #802} |
| 54 | +- Revisit team and ctx init {PR #815} |
| 55 | +- Fix context create hang {PR #887} |
| 56 | +- Add librdmacm linkage {PR #910} |
| 57 | + |
| 58 | +### CORE |
| 59 | +- Fix score update when only score given {PR #779} |
| 60 | +- Coverity fixes {PR #809} |
| 61 | +- Additional Coverity fixes {PR #813} |
| 62 | +- Fix error handling for ctx create epilog {PR #818} |
| 63 | +- Skip zero size collectives {PR #787} |
| 64 | + |
| 65 | +### DOCS |
| 66 | +- Updating NEWS for v1.2 {PR #791} |
| 67 | +- Updating NEWS for v1.3 {PR #937} |
| 68 | + |
| 69 | +### BUILD and TEST |
| 70 | +- Updated build system to enable UCC with ROCm 6.x {PR #906 and #917} |
| 71 | +- Check op and dt compatibility {PR #773} |
| 72 | +- Fix barrier test {PR #799} |
| 73 | +- Propagate HIP_CXXFLAGS to gtest and mpi {PR #803} |
| 74 | + |
| 75 | + |
| 76 | + |
9 | 77 | ## 1.2.0 (June 6th, 2023)
|
10 | 78 |
|
11 | 79 | ## New Features and Enhancements
|
|