Add support for improved fault tolerance: non-blocking mode, new init function with config, and ncclCommFinalize function.
Reintroduce collnet+chain algorithm, alongside collnet+direct.
Add LL protocol for intra-node P2P (on by default) and network communication (off by default).
Use network instead of shared memory when performance is better.
Fix: wait for CUDA graph destroy before destroying comm with linked graph resources.
Remove aggressive polling during enqueue.
Fix DMABUF fallback on MOFED 5.4 and earlier.
7 lines
103 B
Makefile
##### version
NCCL_MAJOR := 2
NCCL_MINOR := 14
NCCL_PATCH := 3
NCCL_SUFFIX :=
PKG_REVISION := 1
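
The three numeric fields above are typically folded into one integer version code so that client code can compare versions with a single test. The sketch below shows one way to do that. It assumes the encoding used by the NCCL_VERSION macro in nccl.h (major * 10000 + minor * 100 + patch, with an older major * 1000 scheme for versions up to 2.8); the helper name nccl_version_code is hypothetical and is not part of NCCL.

```c
#include <assert.h>

/* Values copied from the version file above. */
#define NCCL_MAJOR 2
#define NCCL_MINOR 14
#define NCCL_PATCH 3

/*
 * Hypothetical helper (not an NCCL function): packs a version triple into
 * a single integer. The formula is assumed to match the NCCL_VERSION macro
 * in nccl.h, which switched from major*1000 to major*10000 after 2.8 so
 * that two-digit minor versions stay unambiguous.
 */
static int nccl_version_code(int major, int minor, int patch) {
  assert(major >= 0 && minor >= 0 && patch >= 0);
  if (major <= 2 && minor <= 8) {
    return major * 1000 + minor * 100 + patch;   /* legacy encoding */
  }
  return major * 10000 + minor * 100 + patch;    /* current encoding */
}
```

Under that assumption, nccl_version_code(NCCL_MAJOR, NCCL_MINOR, NCCL_PATCH) evaluates to 21403 for this release, so a guard such as "version code >= 21403" could be used to check that the non-blocking and ncclCommFinalize features named in the commit message are available.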