tensorflow

mirror of https://github.com/tensorflow/tensorflow.git synced 2024-11-22 06:15:44 +00:00

Author	SHA1	Message	Date
dependabot[bot]	8603f6ae0a	Bump certifi from 2024.6.2 to 2024.7.4 Bumps [certifi](https://github.com/certifi/python-certifi) from 2024.6.2 to 2024.7.4. - [Commits](https://github.com/certifi/python-certifi/compare/2024.06.02...2024.07.04) --- updated-dependencies: - dependency-name: certifi dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com>	2024-07-11 17:04:16 +00:00
Dragan Mladjenovic	05797a7187	PR #13879 : [ROCm] Pin amdhip64 soversion in dso loader Imported from GitHub PR https://github.com/openxla/xla/pull/13879 This prevents us from accidentally loading a second copy of the HIP runtime in local_config_rocm. Do the same for rocblas to guard against an ABI break in rocm 6.0. Merging this change closes #13879 PiperOrigin-RevId: 651388560	2024-07-11 09:57:58 -07:00
Adrian Kuegel	8ab1fc3e38	Add missing header include. This is needed for M_LN2l. Without the include, the build fails on MacOS. PiperOrigin-RevId: 651387236	2024-07-11 09:28:16 -07:00
Oleg Shyshkov	c56871e192	[XLA:GPU] Return early in RemoveUnusedSymbols/Dimensions. `DetectUnusedVariables` can be expensive, but often we don't have symbols in the indexing map at all, so there is nothing to remove. PiperOrigin-RevId: 651385393	2024-07-11 09:21:17 -07:00
A. Unique TensorFlower	c4fb138843	[XLA:GPU] Remove sparse pass from ROCm Triton emitter PiperOrigin-RevId: 651379019	2024-07-11 09:14:22 -07:00
Victor Stone	1c7e37a219	Improve HloVerifier's error message when the size of minor-to-major and the size of dimensions mismatch. PiperOrigin-RevId: 651378879	2024-07-11 08:54:18 -07:00
pemeliya	ca4c83e3d3	PR #13479 : [ROCM] adding new ROCM-6.2 features: hipGetFuncBySymbol and error codes Imported from GitHub PR https://github.com/openxla/xla/pull/13479 In this PR we enable some new rocm-6.2 features: mainly the previously missing hipGetFuncBySymbol in rocm_runtime, for which we had to use a workaround. This affects only rocm-specific files. @xla-rotation: could you have a look please? Copybara import of the project: -- bcd2b2341887d305583161a592c23750f5ee584c by Pavel Emeliyanenko <pavel.emeliyanenko@amd.com>: adding new ROCM-6.2 features -- 3eb5aa9c69e8905d9f408ab2a141130084670d29 by Pavel Emeliyanenko <pavel.emeliyanenko@amd.com>: solving conflicts after rebase -- 09938d6c6b9a358e6571fb13acdc1623fe205a4e by Pavel Emeliyanenko <pavel.emeliyanenko@amd.com>: added blas get_version test -- 215b92dddd440f5bacee8d7f678e1f5138761e00 by Pavel Emeliyanenko <pavel.emeliyanenko@amd.com>: added runtime_version from DeviceDescription Merging this change closes #13479 PiperOrigin-RevId: 651378287	2024-07-11 08:46:58 -07:00
Oleg Shyshkov	a4c7375c85	[XLA:GPU] Use llvm::SmallVector instead of std::vector. PiperOrigin-RevId: 651371534	2024-07-11 08:39:25 -07:00
A. Unique TensorFlower	ac2fd97dc6	Automated Code Change PiperOrigin-RevId: 651370675	2024-07-11 08:32:24 -07:00
Tori Baker	483085cdf7	Add gpu.thread_id conversion to nvvm after sparse dot lowering We already converted triton gpu dialect to nvvm in TritonGPUTOLLVMPass but since we need to lower SparseDot afterwards and we generate a gpu.thread_id in the lowering, add a pattern to also convert that to nvvm. PiperOrigin-RevId: 651369703	2024-07-11 08:23:16 -07:00
A. Unique TensorFlower	070cc89fd3	Automated Code Change PiperOrigin-RevId: 651368824	2024-07-11 08:16:19 -07:00
A. Unique TensorFlower	d24e5e9c17	Automated Code Change PiperOrigin-RevId: 651368594	2024-07-11 08:09:28 -07:00
A. Unique TensorFlower	bf622e8572	Automated Code Change PiperOrigin-RevId: 651368231	2024-07-11 08:02:11 -07:00
A. Unique TensorFlower	3cb9c1c060	Automated Code Change PiperOrigin-RevId: 651366444	2024-07-11 07:51:45 -07:00
A. Unique TensorFlower	a404eb3239	Automated Code Change PiperOrigin-RevId: 651365254	2024-07-11 07:44:43 -07:00
Christian Sigg	a892116f56	[XLA:GPU] Move SparseWGMMAOpPattern from Triton to OpenXLA. PiperOrigin-RevId: 651361331	2024-07-11 07:37:34 -07:00
Sergey Kozub	016a0a596d	PR #14796 : Fix gemm_fusion_autotuner_test on Hopper Imported from GitHub PR https://github.com/openxla/xla/pull/14796 Updated result type and error thresholds for the SelectsSplitK test. Previously this failed on Hopper. Copybara import of the project: -- 5005f288b67a2a34ec643cfcc3fbae815b5f0ef6 by Sergey Kozub <skozub@nvidia.com>: Fix gemm_fusion_autotuner_test on Hopper Merging this change closes #14796 PiperOrigin-RevId: 651359673	2024-07-11 07:30:45 -07:00
A. Unique TensorFlower	9aea579e75	Automated Code Change PiperOrigin-RevId: 651358878	2024-07-11 07:21:42 -07:00
A. Unique TensorFlower	f8975ea946	Automated Code Change PiperOrigin-RevId: 651355024	2024-07-11 07:14:47 -07:00
A. Unique TensorFlower	53184cf551	Automated Code Change PiperOrigin-RevId: 651353992	2024-07-11 07:07:51 -07:00
A. Unique TensorFlower	068d60380f	Automated Code Change PiperOrigin-RevId: 651352887	2024-07-11 07:00:55 -07:00
Adrian Kuegel	7a0ab26bce	Mark gloo_collectives_test with tag nomac. gloo is not supported on MacOS. PiperOrigin-RevId: 651352354	2024-07-11 06:40:41 -07:00
A. Unique TensorFlower	a590708f67	Automated Code Change PiperOrigin-RevId: 651350115	2024-07-11 06:30:21 -07:00
A. Unique TensorFlower	ad220e4377	Automated Code Change PiperOrigin-RevId: 651347794	2024-07-11 06:23:49 -07:00
A. Unique TensorFlower	29e6d860b0	Automated Code Change PiperOrigin-RevId: 651347343	2024-07-11 06:17:10 -07:00
Oleg Shyshkov	5479bd7ebc	[XLA:GPU] Replace block_id_to_tile_offsets_indexing with N-d tile_offsets_indexing map. Currently we compute an indexing map from 1-d block_id to N-d tile offset for each TiledHloInstruction. We use that indexing map to deduplicate identical tiles. To get the map we compute delinearization of block_id in SymbolicTileAnalysis. Composition and simplification of `block_id_to_tile_offsets_indexing` is actually very computationally intensive, because the expression has a lot of mods and floordivs from delinearization. This is not necessary for our purposes. After this change, `TiledHloComputation` will have an N-d to M-d map from N-d tile indexing into M-d tile offsets of the instruction. This way expressions in the map are much smaller and easier to simplify (see changes in symbolic_tile_analysis_test). This change has the additional benefit that we don't enforce a 1-d launch grid at an early stage. PiperOrigin-RevId: 651344451	2024-07-11 06:10:20 -07:00
A. Unique TensorFlower	a53acb0d0a	Automated Code Change PiperOrigin-RevId: 651343668	2024-07-11 05:57:06 -07:00
A. Unique TensorFlower	88f418747f	Automated Code Change PiperOrigin-RevId: 651343567	2024-07-11 05:51:28 -07:00
A. Unique TensorFlower	1e2db29600	Automated Code Change PiperOrigin-RevId: 651343129	2024-07-11 05:45:49 -07:00
A. Unique TensorFlower	ba76840342	Integrate LLVM at llvm/llvm-project@694b132177 Updates LLVM usage to match [694b132177a9](https://github.com/llvm/llvm-project/commit/694b132177a9) PiperOrigin-RevId: 651340955	2024-07-11 05:40:07 -07:00
Benjamin Chetioui	1028724c1f	[XLA:GPU][NFC] Delete path to reduce from legacy Triton emitter. Reductions only arise when using the new Triton generic emitter now. PiperOrigin-RevId: 651335997	2024-07-11 04:54:44 -07:00
A. Unique TensorFlower	8116978a26	Automated Code Change PiperOrigin-RevId: 651334675	2024-07-11 04:49:20 -07:00
A. Unique TensorFlower	da6b16843e	Automated Code Change PiperOrigin-RevId: 651334435	2024-07-11 04:36:47 -07:00
Chao	fcfd6083cb	PR #14792 : [ROCM ] hotfix ROCm build Imported from GitHub PR https://github.com/openxla/xla/pull/14792 related rocm part change is missing and internal CL is merged without check due to this `c40dbf2b3c` @xla-rotation @gflegar @beckerhe Thanks in advance! Copybara import of the project: -- 0f4236ca8a3767666ce03713fd7ae9e4d1254e5c by Chao Chen <cchen104@amd.com>: fixed build due to `c40dbf2b3c` Merging this change closes #14792 PiperOrigin-RevId: 651333429	2024-07-11 04:29:04 -07:00
Johannes Reifferscheid	8ec15da58d	Simplifier optimizations. - minimize storage uniquer invocations - don't allocate std::functions - don't put symbol and dims ranges in dense map in RangeEvaluator, also don't put them in a vector first. After this, the biggest thing left to do is to remove the MLIR simplifier, which is now responsible for 2/3 or so of the runtime of simplify. PiperOrigin-RevId: 651330275	2024-07-11 04:21:40 -07:00
Benjamin Chetioui	c6a85cf3f5	[XLA:GPU] Add support for multidimensional tiles in Triton reduction lowering rule. PiperOrigin-RevId: 651327801	2024-07-11 04:12:27 -07:00
A. Unique TensorFlower	64f9e5f814	Automated Code Change PiperOrigin-RevId: 651327215	2024-07-11 04:05:44 -07:00
A. Unique TensorFlower	61de8c3d40	compat: Update forward compatibility horizon to 2024-07-11 PiperOrigin-RevId: 651323797	2024-07-11 03:58:35 -07:00
A. Unique TensorFlower	4ab30daf71	Update GraphDef version to 1920. PiperOrigin-RevId: 651323780	2024-07-11 03:51:33 -07:00
Adrian Kuegel	366a25f834	Fix remaining build issues in the gpu directory for non-gpu builds. PiperOrigin-RevId: 651319915	2024-07-11 03:40:04 -07:00
Henning Becker	3b0732c5c4	Add support for libnvjitlink This is preparing the CUDA backend for linking and compiling with libnvjitlink. The plan is to replace ptxas and nvlink command line tools eventually. This change is so far only adding a function `CompileAndLinkUsingLibNvJitLink`, but it's not yet being used (outside of the corresponding unit tests). PiperOrigin-RevId: 651319016	2024-07-11 03:33:34 -07:00
Alexander Belyaev	9d353fbe06	[XLA:GPU][MLIR-based emitter] Set unroll factor to 1 for scatters. PiperOrigin-RevId: 651318862	2024-07-11 03:27:00 -07:00
Benjamin Chetioui	9b84e02372	[XLA:GPU][NFC] Refactor code to reduce `SoftmaxRewriterTriton::FindAllFusibleDiamondChains`'s complexity. PiperOrigin-RevId: 651318662	2024-07-11 03:20:19 -07:00
Johannes Reifferscheid	2b29dd5fe0	Reduce number of simplify calls in multi-result affine map simplifier. Currently, we rerun the simplifier for all results, even when only one changes. Also, we rerun our simplifier in the last round (when the upstream simplifier does not find any more changes), but it's not necessary, since Simplify is idempotent. PiperOrigin-RevId: 651317828	2024-07-11 03:01:27 -07:00
A. Unique TensorFlower	33bcc06d0f	Automated Code Change PiperOrigin-RevId: 651307920	2024-07-11 02:54:16 -07:00
Shaogang Wang	5981600814	PR #14725 : [XLA:GPU] Lowering FusedMHABackward thunk to command buffer Imported from GitHub PR https://github.com/openxla/xla/pull/14725 This PR lowers FusedMHABackwardThunk into command buffer, the command buffer lowering knob is DebugOptions::CUDNN. Copybara import of the project: -- ff9156f57569cb5e88a4671a110365e79c9f857f by Shawn Wang <shawnw@nvidia.com>: support lowering fusedMHABackward to command buffer -- 83ddf0cbadf5f0f9513c67e7bbdd7ecea4f3404c by Shawn Wang <shawnw@nvidia.com>: fix rebase conflicts -- 9dd82651d0434beab21bed16ab2edea06611f8a0 by Shawn Wang <shawnw@nvidia.com>: remove duplicated inclusion Merging this change closes #14725 PiperOrigin-RevId: 651306567	2024-07-11 02:41:00 -07:00
Adrian Kuegel	3b7b99c1b3	Avoid compile errors in builds without GPU configured. Currently, triton_test_util depends on ir_emitter_triton unconditionally, but ir_emitter_triton only gives access to the ir_emitter_triton.h header in builds with a GPU configured. We can make the ir_emitter_triton.h header available in all builds if we add a stub implementation that returns errors. PiperOrigin-RevId: 651303559	2024-07-11 02:32:40 -07:00
A. Unique TensorFlower	535e270f4c	Automated Code Change PiperOrigin-RevId: 651296179	2024-07-11 02:24:59 -07:00
A. Unique TensorFlower	63a7d95df2	Automated Code Change PiperOrigin-RevId: 651294944	2024-07-11 02:14:58 -07:00
A. Unique TensorFlower	1384ec54a7	Automated Code Change PiperOrigin-RevId: 651294385	2024-07-11 02:07:58 -07:00

166848 Commits