• 2.14.3-1

    Released 2022-08-18 17:53:17 +08:00

    * Add support for improved fault tolerance: non-blocking mode, new
      init function with config, and ncclCommFinalize function.
    * Reintroduce collnet+chain algorithm, alongside collnet+direct.
    * Add LL protocol for intra-node P2P (on by default) and network
      communication (off by default).
    * Use network instead of shared memory when performance is better.
    * Fix: wait for CUDA graph destroy before destroying comm with linked
      graph resources.
    * Remove aggressive polling during enqueue.
    * Fix DMABUF fallback on MOFED 5.4 and earlier.
