Fixed buffer overflow in ReduceOrCopy
Bug caused AllGathers and ReduceScatters of less than 8 bytes to fail in certain cases.

Change-Id: I33e1beb50805bfdb457ae16a90e3f91c1b283b9b
Reviewed-on: http://git-master/r/1011505
Reviewed-by: Przemek Tredak <ptredak@nvidia.com>
Tested-by: Przemek Tredak <ptredak@nvidia.com>
parent caa40b8dd3
commit 9442285526
@@ -1,5 +1,5 @@
 /*************************************************************************
- * Copyright (c) 2015, NVIDIA CORPORATION. All rights reserved.
+ * Copyright (c) 2015-2016, NVIDIA CORPORATION. All rights reserved.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
@@ -274,7 +274,7 @@ __device__ inline void ReduceOrCopy(const int tid,
   const int UNROLL2 = (UNROLL >= 2) ? (UNROLL / 2) : 1;
   const bool NOUNROLL2 = ((UNROLL / 2) == 0);
 
-  int Npreamble = AlignUp(dest0, alignof(PackType)) - dest0;
+  int Npreamble = (N<alignof(PackType)) ? N : AlignUp(dest0, alignof(PackType)) - dest0;
 
   // stage 0: check if we'll be able to use the fast, 64-bit aligned path.
   // If not, we'll just use the slow preamble path for the whole operation
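The effect of the one-line change can be illustrated with a small host-side C++ sketch. This is not the NCCL code. It assumes one-byte elements, so element counts equal byte counts, and uses a constant `kPackAlign = 8` in place of `alignof(PackType)`. The helpers `AlignUpAddr`, `PreambleOld`, and `PreambleNew` are hypothetical names introduced only for this illustration. The point it shows: the old expression depends only on the destination address, so for a short buffer it can yield a preamble longer than `N`, while the patched expression clamps the preamble to `N`.

```cpp
#include <cstddef>
#include <cstdint>

// Assumption: stand-in for alignof(PackType), the 64-bit pack alignment.
constexpr std::size_t kPackAlign = 8;

// Hypothetical helper: round an address up to the next multiple of `align`
// (align must be a power of two).
inline std::uintptr_t AlignUpAddr(std::uintptr_t addr, std::size_t align) {
  return (addr + align - 1) & ~static_cast<std::uintptr_t>(align - 1);
}

// Pre-patch expression: distance from dest to the next aligned address.
// It ignores n, so the result can exceed the number of elements available.
inline int PreambleOld(std::uintptr_t dest, int n) {
  (void)n;  // unused on purpose: this is the shape of the old computation
  return static_cast<int>(AlignUpAddr(dest, kPackAlign) - dest);
}

// Patched expression: buffers shorter than the pack alignment are handled
// entirely by the preamble, so the preamble never exceeds n in that case.
inline int PreambleNew(std::uintptr_t dest, int n) {
  return (n < static_cast<int>(kPackAlign)) ? n : PreambleOld(dest, n);
}
```

For example, with a destination address of `0x1003` and `N = 2`, the old expression gives a preamble of 5 elements, three more than the buffer holds, while the patched one gives 2. For `N` of at least 8 the two expressions agree.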