In some situations, this meant also changing unrelated files to directly include tsl/platform/statusor.h to get the definitions for TF_ASSIGN_OR_RETURN, etc., where they were getting transitively included for free.
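For example (with `Config` and `LoadConfig` as hypothetical stand-ins for affected code), a file now spells out the include instead of relying on a transitive one:
```
#include "absl/status/status.h"
#include "absl/status/statusor.h"
#include "tsl/platform/statusor.h"  // included directly: defines TF_ASSIGN_OR_RETURN

struct Config {
  int version = 1;
};

absl::StatusOr<Config> LoadConfig() { return Config{}; }

absl::Status ProcessConfig() {
  // Previously this macro often compiled only because some other header
  // pulled in tsl/platform/statusor.h transitively.
  TF_ASSIGN_OR_RETURN(Config config, LoadConfig());
  (void)config.version;
  return absl::OkStatus();
}
```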
PiperOrigin-RevId: 644129488
A device's slice_index depends on its LocalTopologyProto's boot_id. Currently, all devices in a mocked GPU client will have the same slice_index due to the boot_id being identical, which breaks hybrid mesh construction in AOT compilation.
This change sets a distinct boot_id for each LocalTopology.
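A rough sketch (not the actual patch) of the idea: give each mocked LocalTopology a distinct boot_id so its devices resolve to distinct slice_index values. The loop and naming scheme below are illustrative.
```
#include <vector>

#include "absl/strings/str_cat.h"
#include "xla/pjrt/distributed/protocol.pb.h"  // assumed LocalTopologyProto location

void SetDistinctBootIds(std::vector<xla::LocalTopologyProto>& topologies) {
  for (int i = 0; i < static_cast<int>(topologies.size()); ++i) {
    // Identical boot_ids would collapse all devices into a single slice.
    topologies[i].set_boot_id(absl::StrCat("mock_boot_id:", i));
  }
}
```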
PiperOrigin-RevId: 644118436
This CL adds more convolution benchmarks. The benchmarks are based on shapes from XLA convolution tests, TF convolution benchmarks, and Eigen spatial convolution benchmarks.
PiperOrigin-RevId: 644104994
In some situations, this meant also changing unrelated files to directly include tsl/platform/statusor.h to get the definitions for TF_ASSIGN_OR_RETURN, etc., where they were getting transitively included for free.
PiperOrigin-RevId: 644097216
The IFRT API now distinguishes reshard vs. copy, so this CL reflects that semantic change in the JAX Python binding. Since pjit input sharding is relatively easy to batch, the relevant code path was also rewritten to leverage the batched `CopyArrays` API, as sketched below.
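A hedged sketch of the batching idea on the binding's C++ side; the exact `ifrt::Client::CopyArrays` signature and enum values are assumptions here and may differ from the real API.
```
#include <optional>
#include <utility>
#include <vector>

#include "absl/types/span.h"
#include "xla/python/ifrt/array.h"   // assumed header locations
#include "xla/python/ifrt/client.h"

// Copy all pjit inputs to the destination devices in one batched call,
// instead of issuing one copy per array.
absl::StatusOr<std::vector<tsl::RCReference<xla::ifrt::Array>>>
CopyPjitInputs(xla::ifrt::Client* client,
               std::vector<tsl::RCReference<xla::ifrt::Array>> inputs,
               std::optional<xla::ifrt::DeviceList> dst_devices) {
  return client->CopyArrays(absl::MakeSpan(inputs), std::move(dst_devices),
                            /*memory_kind=*/std::nullopt,
                            xla::ifrt::ArrayCopySemantics::kReuseInput);
}
```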
PiperOrigin-RevId: 644048754
Imported from GitHub PR https://github.com/openxla/xla/pull/13190
This patch recognises pipelined while loops using the rotate-right pattern: it matches a rotate-right pattern on sharded inputs and labels the surrounding while loop as a pipelined while loop. Because this is an unsafe optimization, it is hidden behind a debug flag so that it won't be triggered unexpectedly in TPU pipelines or other GPU pipelines. The data movement being matched is illustrated below.
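For intuition, rotating a buffer right by k, i.e. concat(slice(buf, n-k, n), slice(buf, 0, n-k)), can be written out in plain C++; this is only an illustration of the arithmetic, not the actual HLO matcher:
```
#include <vector>

// Rotate `buf` right by k positions (assumes 0 <= k).
std::vector<int> RotateRight(const std::vector<int>& buf, int k) {
  const int n = static_cast<int>(buf.size());
  std::vector<int> out(n);
  for (int i = 0; i < n; ++i) {
    out[(i + k) % n] = buf[i];  // element i moves right by k, wrapping around
  }
  return out;
}
```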
Copybara import of the project:
--
b643b17a5d25d46e838a37fe87a4134b2cc512aa by Shraiysh Vaishay <svaishay@nvidia.com>:
Add pipelined while loop annotator
This patch recognises pipelined while loops using the rotate-right pattern: it matches a rotate-right pattern on sharded inputs and labels the surrounding while loop as a pipelined while loop. Because this is an unsafe optimization, it is hidden behind a debug flag so that it won't be triggered unexpectedly in TPU pipelines or other GPU pipelines.
Merging this change closes #13190
PiperOrigin-RevId: 644012174
Imported from GitHub PR https://github.com/openxla/xla/pull/13462
This PR https://github.com/openxla/xla/pull/11514 added workspace allocation to cublas-lt. In doing so, it essentially duplicated the implementation of a number of functions in gpu/cu/hipblas-lt, so we now have:
```
DoMatmul(..., std::optional<DeviceMemoryBase> workspace)
DoMatmul(..., std::optional<ScratchAllocator*> scratch_allocator)
DoMatmul(..., std::optional<DeviceMemoryBase> workspace, std::optional<ScratchAllocator*> scratch_allocator)
```
and the same holds for `ExecuteOnStream`. This makes the gpublas_lt interface barely readable. The first two functions outlined above simply forward to the third, most generic one, so there is no need to implement them inside the derived classes (hip_blas_lt.h and cuda_blas_lt.h); the forwarding can instead be handled once in the gpu_blas_lt.h interface, as sketched below.
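A minimal sketch of that layout, with simplified stand-in types rather than the real XLA signatures: the two convenience overloads are implemented once in the base interface and forward to the single generic virtual that the backends override.
```
#include <optional>
#include <utility>

#include "absl/status/status.h"

struct DeviceMemoryBase {};  // stand-in for se::DeviceMemoryBase
class ScratchAllocator {};   // stand-in for se::ScratchAllocator

class BlasLtInterface {
 public:
  virtual ~BlasLtInterface() = default;

  // Convenience overloads, implemented once here: they only forward.
  absl::Status DoMatmul(std::optional<DeviceMemoryBase> workspace) {
    return DoMatmul(std::move(workspace), /*scratch_allocator=*/std::nullopt);
  }
  absl::Status DoMatmul(std::optional<ScratchAllocator*> scratch_allocator) {
    return DoMatmul(/*workspace=*/std::nullopt, scratch_allocator);
  }

  // The single generic entry point each derived backend implements.
  virtual absl::Status DoMatmul(
      std::optional<DeviceMemoryBase> workspace,
      std::optional<ScratchAllocator*> scratch_allocator) = 0;
};
```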
@xla-rotation: could you please have a look?
Copybara import of the project:
--
6d3700a7b4141dee82a3b3f4d6be492a0a67d92b by Pavel Emeliyanenko <pavel.emeliyanenko@amd.com>:
refactoring
--
495b2cc7b5a4e944804acddc9abc9442d9cce32a by Pavel Emeliyanenko <pavel.emeliyanenko@amd.com>:
cuda side adaptions
--
4078221daebb8cb88faebe9423e87a1a781a765b by Pavel Emeliyanenko <pavel.emeliyanenko@amd.com>:
parameter fix
Merging this change closes #13462
PiperOrigin-RevId: 644003051
We have substantial changes coming soon to TSL logging; add tests
to make sure we will not introduce regressions.
Also remove vmodule_test from tensorflow/core/platform since
it is now redundant. Besides, TSL is a more appropriate place for
these tests.
PiperOrigin-RevId: 643994337
Now, in order to query support information about the legacy Triton emitters, it is necessary to call functions in the `xla::gpu::legacy_triton` namespace. This helps clarify our tests and, most notably, lets us evolve the new and the old Triton emitters independently, without having to port logic from one to the other. A sketch of the new call-site shape follows below.
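A hedged sketch of the new call-site shape; the function name, header path, and bool conversion below are assumptions for illustration, not verified API.
```
#include "xla/service/gpu/triton_support.h"  // assumed header location

bool SupportedByLegacyEmitters(const xla::HloInstruction& instr,
                               const se::GpuComputeCapability& gpu_version) {
  // Support queries for the legacy emitters are explicitly namespaced so the
  // legacy logic can diverge from the new emitters.
  auto decision = xla::gpu::legacy_triton::IsTritonSupportedInstruction(
      instr, gpu_version);
  return static_cast<bool>(decision);  // assumed convertible to bool
}
```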
PiperOrigin-RevId: 643988589
Imported from GitHub PR https://github.com/openxla/xla/pull/13771
Some algsimp patterns used the `IsPositive()` check on their operands instead of `IsNonNegative()`. As a result, the patterns did not fire for operands that are only known to be non-negative (>= 0), such as `abs(x)`, even though they should.
## Fixes:
### Fix for pattern `rsqrt(B) * rsqrt(B) => 1/B`
Issue: the pattern did not fire for B >= 0, for example when B = abs(x).
Solution: fixed by checking that B `IsNonNegative()`.
Validation:
- If `B == 0`, the result is `inf`
- If `B > 0`, the result is `> 0`
- If `B` is `inf`, the result is `0`
- If `B` is `nan`, the result is `nan`
### Fix for pattern `rsqrt(pow(A, -2)) => A`
Issue: the pattern did not fire for A >= 0, for example when A = abs(x).
Solution: fixed by checking that A `IsNonNegative()`.
Validation (before and after the simplification):
- If `A == 0`, the result is `0`
- If `A > 0`, the result is `> 0`
- If `A` is `inf`, the result is `inf`
- If `A` is `nan`, the result is `nan`
Additional fix: since we know that A is non-negative, we can use A directly without wrapping it in `abs()`.
### Fix for pattern `rsqrt(1/A) => sqrt(A)`
Issue: the pattern did not fire for A >= 0, for example when A = abs(x).
Solution: fixed by checking that A `IsNonNegative()`.
Validation (before and after the simplification; see the standalone numerical check below):
- If `A == 0`, the result is `0`
- If `A > 0`, the result is `> 0`
- If `A` is `inf`, the result is `inf`
- If `A` is `nan`, the result is `nan`
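These edge cases are easy to spot-check numerically outside of XLA. A standalone sketch using rsqrt(x) = 1/sqrt(x) that reproduces the tables above (plain C++, not XLA code):
```
#include <cmath>
#include <cstdio>
#include <limits>

double Rsqrt(double x) { return 1.0 / std::sqrt(x); }

int main() {
  const double inf = std::numeric_limits<double>::infinity();
  const double nan = std::numeric_limits<double>::quiet_NaN();
  // rsqrt(B) * rsqrt(B) => 1/B: inf at 0, positive for B > 0,
  // 0 at inf, nan at nan.
  for (double b : {0.0, 2.0, inf, nan}) {
    std::printf("B=%g  rsqrt(B)^2=%g  1/B=%g\n", b, Rsqrt(b) * Rsqrt(b),
                1.0 / b);
  }
  // rsqrt(pow(A, -2)) => A and rsqrt(1/A) => sqrt(A), for A >= 0.
  for (double a : {0.0, 2.0, inf, nan}) {
    std::printf("A=%g  rsqrt(A^-2)=%g  rsqrt(1/A)=%g  sqrt(A)=%g\n", a,
                Rsqrt(std::pow(a, -2.0)), Rsqrt(1.0 / a), std::sqrt(a));
  }
  return 0;
}
```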
Copybara import of the project:
--
e0251aa875c38442834be6f93ae534e481d06e2a by Alexander Pivovarov <pivovaa@amazon.com>:
Fix algebraic_simplifier for rsqrt
Merging this change closes #13771
PiperOrigin-RevId: 643982420
Imported from GitHub PR https://github.com/openxla/xla/pull/13779
Here we enable the XLA-managed workspace buffer for rocblas.
@xla-rotation: would you please have a look?
Copybara import of the project:
--
548701656df3a9e12bc6bae201113c5a7410f9b4 by Pavel Emeliyanenko <pavel.emeliyanenko@amd.com>:
added memory management functions
--
b01f6c03b0ef1b9f9588fa438fa278aac68d727b by Pavel Emeliyanenko <pavel.emeliyanenko@amd.com>:
addressing reviewer comments
--
3b96edbee38cbfcc30779f8820956c219a512da7 by Pavel Emeliyanenko <pavel.emeliyanenko@amd.com>:
beautified rocblas wrapper
--
d65b695a1444f59d89095031ac8e00e3e1556d61 by Pavel Emeliyanenko <pavel.emeliyanenko@amd.com>:
added explicit nullptr
Merging this change closes #13779
PiperOrigin-RevId: 643967546
Imported from GitHub PR https://github.com/openxla/xla/pull/13722
Some number types (like complex64) are generally supported by rocBLAS, but the library does not provide any solutions for autotuning them. In that case, we fall back to the default solution (`kDefaultAlgorithm`); a sketch of this logic follows below.
I have also added the workspace buffer provided by the XLA runtime, which was previously ignored by rocBLAS.
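A sketch of the fallback with simplified stand-in types: `AlgorithmId` and the local `kDefaultAlgorithm` sentinel below are illustrative, not the exact stream_executor definitions.
```
#include <cstdint>
#include <vector>

using AlgorithmId = int64_t;                   // stand-in for the real ID type
constexpr AlgorithmId kDefaultAlgorithm = -1;  // assumed sentinel value

// Pick an algorithm from the library's autotuning candidates, falling back
// to the default algorithm when the library offers none (e.g. complex64).
AlgorithmId PickGemmAlgorithm(const std::vector<AlgorithmId>& solutions) {
  if (solutions.empty()) {
    return kDefaultAlgorithm;
  }
  return solutions.front();  // placeholder for the real autotuned choice
}
```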
@xla-rotation: would you please have a look?
Copybara import of the project:
--
57645a2bfb357b9d00f6b85ae3b0e77a4b00fb61 by Pavel Emeliyanenko <pavel.emeliyanenko@amd.com>:
use fallback default algorithm if no solutions are provided by the library
--
be8327f8e2e11f0eaf2383171340f69a23759260 by Pavel Emeliyanenko <pavel.emeliyanenko@amd.com>:
adding space
Merging this change closes #13722
PiperOrigin-RevId: 643925837