Sylvain Jeaugey
|
8bb06c94be
|
Improved allreduce segmentation for small sizes
|
2016-10-07 12:42:23 -07:00 |
|
Sylvain Jeaugey
|
ca330b110a
|
Add scan tests
v1.3.0-1
|
2016-09-22 11:58:33 -07:00 |
|
Sylvain Jeaugey
|
6c77476cc1
|
Make tests check for deltas and report bandwidth
|
2016-09-22 11:58:28 -07:00 |
|
Sylvain Jeaugey
|
cabd6848e4
|
Heavy code refactoring to remove a lot of code in collectives (~1000 lines).
Have all collectives use the same args, the same ring, and the same primitives for synchronization between threads with the same pattern.
|
2016-09-22 11:57:56 -07:00 |
|
Sylvain Jeaugey
|
e3dbc6110e
|
Add profiling API
|
2016-09-22 11:56:51 -07:00 |
|
Sylvain Jeaugey
|
1d6715fe20
|
Fix MPI test path
|
2016-09-22 11:56:20 -07:00 |
|
Sylvain Jeaugey
|
9ee6189bf9
|
Merge pull request #41 from jia-kai/master
Some minor fixes for compile/usage
|
2016-09-15 09:45:52 -07:00 |
|
Sylvain Jeaugey
|
939b0a4297
|
Merge pull request #45 from NVIDIA/cw-update-copyright-year
Update LICENSE.txt
|
2016-08-26 15:44:00 -07:00 |
|
Cliff Woolley
|
234c8c9ef3
|
Update LICENSE.txt
|
2016-08-26 15:39:21 -07:00 |
|
Sylvain Jeaugey
|
75bad643bd
|
Updated LICENSE.txt
|
2016-08-26 15:08:20 -07:00 |
|
jiakai
|
47b0797fe1
|
pass devlist as const int* rather than int* in ncclCommInitAll
|
2016-08-19 19:00:14 +08:00 |
|
jiakai
|
ed401cc29b
|
link library with -lrt; otherwise there is an undefined reference to shm_open
|
2016-08-19 18:58:56 +08:00 |
|
Sylvain Jeaugey
|
b3a9e1333d
|
Remove unneeded deb build script
|
2016-07-27 17:58:00 -07:00 |
|
Sylvain Jeaugey
|
428ec5b2a3
|
Merge remote-tracking branch 'github/master' into public
|
2016-07-25 10:53:01 -07:00 |
|
Nathan Luehr
|
55c42ad681
|
Fixed redundant contexts in multi-process apps
Change-Id: If787014450fd281304f0c7baf01d25963e40905d
|
2016-07-25 10:10:30 -07:00 |
|
Sylvain Jeaugey
|
7a1aa6b563
|
Improved Deb generation
|
2016-07-07 16:31:57 +02:00 |
|
Sylvain Jeaugey
|
9ae84f5d6b
|
Fix version number
|
2016-06-16 17:07:42 -07:00 |
|
Sylvain Jeaugey
|
e51e922924
|
Add a debug level to NCCL and CUDA versions at init
|
2016-06-16 17:04:41 -07:00 |
|
Sylvain Jeaugey
|
9fcc523485
|
Increased version to 1.2.3
|
2016-06-15 19:18:13 -07:00 |
|
Sylvain Jeaugey
|
67d1ab9106
|
Packaging : Generate shlibs.local
|
2016-06-15 19:03:08 -07:00 |
|
Sylvain Jeaugey
|
da6d2009e0
|
Move deb to build directory
|
2016-06-15 18:20:10 -07:00 |
|
Sylvain Jeaugey
|
155132d336
|
Fix make install to use BUILDDIR
|
2016-06-15 18:20:02 -07:00 |
|
Sylvain Jeaugey
|
08ddfe03d2
|
Rework debian packaging
|
2016-06-15 18:18:44 -07:00 |
|
Sylvain Jeaugey
|
5d4716a8a3
|
Include link to blog post in README.md
|
2016-06-15 10:54:19 -07:00 |
|
Boris Fomitchev
|
aa8f669a3d
|
Updating for .deb rebuild
v1.2.3-1+cuda7.5
|
2016-06-13 02:01:49 -07:00 |
|
Sylvain Jeaugey
|
d5e507fc7f
|
Only call the CUDA runtime. That may fix #27.
|
2016-06-07 16:27:51 -07:00 |
|
Sylvain Jeaugey
|
620491a649
|
Merge remote-tracking branch 'github/master' into HEAD
|
2016-06-06 14:35:57 -07:00 |
|
Sylvain Jeaugey
|
7edfc57228
|
Make NCCL collectives work on communicators with only one rank
|
2016-06-06 14:35:00 -07:00 |
|
Sylvain Jeaugey
|
bd3cf73e6e
|
Changed CURAND generator to work on a wider set of platforms.
|
2016-06-06 14:34:03 -07:00 |
|
Boris Fomitchev
|
177505b757
|
Gencodes changed to NV recommended
v1.2.1-2+cuda7.5
v1.2.2-1+cuda7.5
|
2016-06-06 00:06:18 -07:00 |
|
Sylvain Jeaugey
|
9d9d8cd59f
|
Bump to 1.2.2
|
2016-06-03 17:21:53 -07:00 |
|
Sylvain Jeaugey
|
1657af1567
|
Better name for GENCODE
|
2016-06-03 10:25:37 -07:00 |
|
Sylvain Jeaugey
|
acb93d1aed
|
Removing unneeded includes
|
2016-06-02 17:33:43 -07:00 |
|
Sylvain Jeaugey
|
889ad3d4e6
|
Makefile improvements
- Use standard CXX env var
- Permit redefinition of more env
- Separate lib from tests
|
2016-06-02 15:01:03 -07:00 |
|
Boris Fomitchev
|
93538def65
|
Merge pull request #22 from borisfom/master
Fixed version in ChangeLog
v1.2.1-1+cuda7.5
|
2016-04-21 18:58:44 -07:00 |
|
Boris Fomitchev
|
e5067b6611
|
Fixed version in ChangeLog
|
2016-04-21 16:28:13 -07:00 |
|
Boris Fomitchev
|
0629fb62d7
|
Merge pull request #21 from borisfom/master
Fixed install location, new .deb version
|
2016-04-21 14:46:41 -07:00 |
|
Boris Fomitchev
|
0177cf3ea4
|
Fixed install location, new .deb version
|
2016-04-21 14:10:31 -07:00 |
|
Nathan Luehr
|
658aca1469
|
Merge pull request #17 from Hopobcn/master
Enable compilation with specific g++
|
2016-04-21 13:25:18 -07:00 |
|
Nathan Luehr
|
03df4c7759
|
Moved no-as-needed flag to link rule.
Avoids link errors for tests linked with nvcc.
|
2016-04-19 14:51:03 -07:00 |
|
Nathan Luehr
|
0d4f8f4e95
|
Merge pull request #18 from apaszke/master
Add --no-as-needed to make sure that cudart library gets linked
|
2016-04-19 11:11:39 -07:00 |
|
Sylvain Jeaugey
|
ddd3f2084d
|
Fix readme to reflect the new test paths
|
2016-04-19 11:09:25 -07:00 |
|
Sylvain Jeaugey
|
dba3ec9428
|
Fix random deadlock during ncclCommInitRank.
|
2016-04-19 10:47:27 -07:00 |
|
Sylvain Jeaugey
|
9de361a1b9
|
Fix MPI test usage
Only display usage from rank 0 and exit instead of continuing (and seg fault).
|
2016-04-19 10:43:38 -07:00 |
|
Adam Paszke
|
c0c959b1be
|
Add --no-as-needed to make sure that cudart library gets linked
|
2016-04-13 10:04:38 -04:00 |
|
Pau Farré
|
e30bf95989
|
Enable compilation with old g++ when the default g++ is not supported (+5.0)
|
2016-04-12 12:49:13 +02:00 |
|
Boris Fomitchev
|
b16cc5d197
|
Merge pull request #16 from borisfom/master
Removed Tegra, fixed + format.
v1.2.1-1+cuda7.5-1
|
2016-03-17 17:35:04 -07:00 |
|
Boris Fomitchev
|
e6f4a83da6
|
Removing Tegra
|
2016-03-17 17:25:27 -07:00 |
|
Boris Fomitchev
|
1a8bae5b2f
|
fixed version format
|
2016-03-17 17:13:45 -07:00 |
|
Boris Fomitchev
|
e8eb285a59
|
Merge pull request #15 from borisfom/master
Fixing version number and compile param for 5.3
|
2016-03-17 16:03:05 -07:00 |
|