LLVM/project 0a0cac6 llvm/lib/Target/SystemZ SystemZISelLowering.cpp, llvm/test/CodeGen/SystemZ atomic-store-08.ll atomic-load-08.ll

[SystemZ] Simplify f128 atomic load/store (#90977)

Change definition of expandBitCastI128ToF128 and expandBitCastF128ToI128
to allow for simplified use in atomic load/store.

Update logic to split 128-bit loads and stores in DAGCombine to also
handle the f128 case where appropriate. This fixes the regressions
introduced by recent atomic load/store patches.
Delta  File
+155 -116  llvm/lib/Target/SystemZ/SystemZISelLowering.cpp
+8 -26  llvm/test/CodeGen/SystemZ/atomic-store-08.ll
+5 -16  llvm/test/CodeGen/SystemZ/atomic-load-08.ll
+2 -4  llvm/test/CodeGen/SystemZ/atomicrmw-fmin-03.ll
+2 -4  llvm/test/CodeGen/SystemZ/atomicrmw-fmax-03.ll
+172 -166  5 files

LLVM/project 522b4bf llvm/include/llvm/CodeGen SDPatternMatch.h, llvm/lib/CodeGen/SelectionDAG DAGCombiner.cpp

[DAG] Fold bitreverse(shl/srl(bitreverse(x),y)) -> srl/shl(x,y) (#89897)

Noticed while investigating GFNI per-element vector shifts (we can form SHL but not SRL/SRA)

Alive2: https://alive2.llvm.org/ce/z/fSH-rf
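
The fold can be sanity-checked outside the compiler. The sketch below is not the DAGCombiner code. It is a small Python model that exhaustively checks both directions of the identity for in-range shift amounts. The helper names `bitreverse`, `shl`, `srl`, and `fold_holds` and the 8-bit width are assumptions chosen for illustration.

```python
WIDTH = 8  # assumed bit width, small enough to check exhaustively
MASK = (1 << WIDTH) - 1


def bitreverse(x: int) -> int:
    """Reverse the low WIDTH bits of x."""
    out = 0
    for _ in range(WIDTH):
        out = (out << 1) | (x & 1)
        x >>= 1
    return out


def shl(x: int, y: int) -> int:
    """Logical shift left, truncated to WIDTH bits."""
    return (x << y) & MASK


def srl(x: int, y: int) -> int:
    """Logical shift right on a WIDTH-bit value."""
    return (x & MASK) >> y


def fold_holds() -> bool:
    """Check both folds for every value and every shift amount below WIDTH."""
    for x in range(1 << WIDTH):
        for y in range(WIDTH):
            if bitreverse(shl(bitreverse(x), y)) != srl(x, y):
                return False
            if bitreverse(srl(bitreverse(x), y)) != shl(x, y):
                return False
    return True
```

Shift amounts of WIDTH or more are left out on purpose, since such shifts do not produce a defined value in LLVM IR.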
Delta  File
+11 -285  llvm/test/CodeGen/X86/combine-bitreverse.ll
+27 -112  llvm/test/CodeGen/RISCV/bitreverse-shift.ll
+14 -0  llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
+5 -0  llvm/include/llvm/CodeGen/SDPatternMatch.h
+57 -397  4 files

LLVM/project 0933a7a llvm/lib/Target/LoongArch LoongArchOptWInstrs.cpp, llvm/test/CodeGen/LoongArch prefer-w-inst.ll

[LoongArch] Rename some OptWInstrs functions. NFC
Delta  File
+25 -21  llvm/lib/Target/LoongArch/LoongArchOptWInstrs.cpp
+7 -7  llvm/test/CodeGen/LoongArch/prefer-w-inst.ll
+32 -28  2 files

LLVM/project 69d740e clang/lib/AST/Interp ByteCodeEmitter.cpp, clang/test/AST/Interp cxx23.cpp

[clang][Interp] Fix creating functions with explicit instance parameters
Delta  File
+7 -5  clang/lib/AST/Interp/ByteCodeEmitter.cpp
+7 -0  clang/test/AST/Interp/cxx23.cpp
+1 -0  clang/test/SemaCXX/cxx2b-deducing-this-constexpr.cpp
+15 -5  3 files

LLVM/project d98a785 llvm/test/CodeGen/LoongArch rotl-rotr.ll

[LoongArch] Mark data type i32 are sign-extended. NFC
Delta  File
+9 -9  llvm/test/CodeGen/LoongArch/rotl-rotr.ll
+9 -9  1 files

LLVM/project e9bcd2b llvm/lib/Target/LoongArch LoongArchOptWInstrs.cpp, llvm/test/CodeGen/LoongArch sextw-removal.ll opt-pipeline.ll

[LoongArch] Optimize *W Instructions at MI level (#90463)

Following RISC-V, add an MI-level pass that optimizes *W instructions
for LoongArch.

First, it removes unneeded sext(addi.w rd, rs, 0) instructions, either
because the sign-extended bits aren't consumed or because the input was
already sign-extended by an earlier instruction.
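
The two removal conditions can be illustrated with a small model. This is not the pass itself, only a Python sketch of the register arithmetic. It assumes the LA64 behavior of addi.w (a 32-bit add whose result is sign-extended to 64 bits). The names `sext32`, `addi_w`, and `low_word` are hypothetical helpers.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def sext32(x: int) -> int:
    """Sign-extend the low 32 bits of x into a 64-bit register value."""
    x &= MASK32
    return (x | (MASK64 ^ MASK32)) if x & 0x80000000 else x


def addi_w(rs: int, imm: int) -> int:
    """Assumed model of addi.w on LA64: 32-bit add, sign-extended result."""
    return sext32(rs + imm)


def low_word(x: int) -> int:
    """The bits a consumer sees when it reads only the lower word."""
    return x & MASK32


# Case 1: the input is already sign-extended, so addi.w rd, rs, 0 is a no-op.
already_extended = sext32(0x80000001)
case1_redundant = addi_w(already_extended, 0) == already_extended

# Case 2: the upper bits differ, but a consumer of the low word cannot tell.
raw = 0x1234567880000001
case2_redundant = low_word(addi_w(raw, 0)) == low_word(raw)
```

In the second case the full 64-bit values differ, so the removal is only valid when every user ignores the upper bits.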

Then:
1. Unless explicitly disabled or the target prefers instructions with W
suffix, it removes the -w suffix from opw instructions whenever all
users are dependent only on the lower word of the result of the
instruction. The cases handled are:
* addi.w because it helps reduce test differences between LA32 and LA64
without being a pessimization.

2. Or if explicitly enabled or the target prefers instructions with W
suffix, it adds the W suffix to the instruction whenever all users are

    [4 lines not shown]
Delta  File
+815 -0  llvm/lib/Target/LoongArch/LoongArchOptWInstrs.cpp
+554 -32  llvm/test/CodeGen/LoongArch/sextw-removal.ll
+164 -163  llvm/test/CodeGen/LoongArch/opt-pipeline.ll
+121 -121  llvm/test/CodeGen/LoongArch/atomicrmw-uinc-udec-wrap.ll
+50 -96  llvm/test/CodeGen/LoongArch/ir-instruction/atomic-cmpxchg.ll
+0 -80  llvm/test/CodeGen/LoongArch/ir-instruction/atomicrmw-minmax.ll
+1,704 -492  15 files not shown
+1,797 -627  21 files

LLVM/project 9a521e2 clang/lib/AST/Interp ByteCodeExprGen.cpp, clang/test/AST/Interp lambda.cpp

[clang][Interp] Fix primitive lambda capture defaults

We need to use InitField here, not SetField.
Delta  File
+16 -0  clang/test/AST/Interp/lambda.cpp
+1 -1  clang/lib/AST/Interp/ByteCodeExprGen.cpp
+17 -1  2 files

LLVM/project 8a65ee8 llvm/test/Analysis/UniformityAnalysis/AMDGPU/MIR temporal-divergence.mir, llvm/test/CodeGen/AMDGPU/GlobalISel divergence-structurizer.mir divergence-divergent-i1-used-outside-loop.mir

[AMDGPU] don't mark control-flow intrinsics as convergent (#90026)

This is really a workaround to allow control flow lowering in the
presence of convergence control tokens. Control-flow intrinsics in LLVM
IR are convergent because they indirectly represent the wave CFG, i.e.,
sets of threads that are "converged" or "execute in lock-step". But they
exist during a small window in the lowering process, inserted after the
structurizer and then translated to equivalent MIR pseudos. So rather
than create convergence tokens for these builtins, we simply mark them
as not convergent.

The corresponding MIR pseudos are marked as having side effects, which
is sufficient to prevent optimizations without having to mark them as
convergent.
Delta  File
+67 -67  llvm/test/CodeGen/AMDGPU/GlobalISel/divergence-structurizer.mir
+56 -56  llvm/test/CodeGen/AMDGPU/GlobalISel/divergence-divergent-i1-used-outside-loop.mir
+20 -20  llvm/test/CodeGen/AMDGPU/GlobalISel/divergence-temporal-divergent-i1.mir
+14 -14  llvm/test/Analysis/UniformityAnalysis/AMDGPU/MIR/temporal-divergence.mir
+12 -12  llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-amdgcn.if-invalid.mir
+12 -12  llvm/test/CodeGen/AMDGPU/GlobalISel/divergence-divergent-i1-phis-no-lane-mask-merging.mir
+181 -181  11 files not shown
+244 -232  17 files

LLVM/project d3dad7a llvm/lib/Transforms/InstCombine InstCombineCompares.cpp, llvm/test/Transforms/InstCombine icmp-of-trunc-ext.ll

[InstCombine] Fix miscompilation caused by #90436 (#91133)

Proof: https://alive2.llvm.org/ce/z/iRnJ4i

Fixes https://github.com/llvm/llvm-project/issues/91127.
Delta  File
+74 -0  llvm/test/Transforms/InstCombine/icmp-of-trunc-ext.ll
+1 -0  llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp
+75 -0  2 files

LLVM/project 30367cb lldb/include/lldb/API SBType.h, lldb/source/API SBType.cpp

[lldb] Add SBType::GetByteAlign (#90960)

lldb already mostly(*) tracks this information. This just makes it
available to the SB users.

(*) It does not do that for typedefs right now; see llvm.org/pr90958
Delta  File
+21 -0  lldb/test/API/python_api/type/TestTypeList.py
+13 -0  lldb/source/API/SBType.cpp
+3 -0  lldb/test/API/python_api/type/main.cpp
+2 -0  lldb/include/lldb/API/SBType.h
+39 -0  4 files

LLVM/project eb75af2 llvm/lib/Target/SystemZ SystemZInstrInfo.cpp SystemZInstrInfo.h, llvm/test/CodeGen/SystemZ fold-copy-vector-immediate.mir

Reapply "SystemZ: Fold copy of vector immediate to gr128" (#91099)

This reverts commit a415b4dfcc02e3e82b8c8a7836f7c04b9d65dc9b.

Modify the instruction in place to transform it into a REG_SEQUENCE,
which is what other implementations of foldImmediate do. Also start
erasing the def instruction if there are no other uses.

Fixes #91110.
Delta  File
+206 -0  llvm/test/CodeGen/SystemZ/fold-copy-vector-immediate.mir
+55 -0  llvm/lib/Target/SystemZ/SystemZInstrInfo.cpp
+3 -0  llvm/lib/Target/SystemZ/SystemZInstrInfo.h
+264 -0  3 files

LLVM/project e2c8925 llvm/lib/Target/AMDGPU AMDGPUPostLegalizerCombiner.cpp AMDGPUCombine.td

[AMDGPU] Fix typo in function name
Delta  File
+3 -3  llvm/lib/Target/AMDGPU/AMDGPUPostLegalizerCombiner.cpp
+1 -1  llvm/lib/Target/AMDGPU/AMDGPUCombine.td
+4 -4  2 files

LLVM/project 4b61d04 llvm/test/CodeGen/SystemZ frame-26.mir frame-28.mir

SystemZ: Remove unnecessary REQUIRES asserts from tests
Delta  File
+10 -11  llvm/test/CodeGen/SystemZ/frame-26.mir
+3 -4  llvm/test/CodeGen/SystemZ/frame-28.mir
+0 -1  llvm/test/CodeGen/SystemZ/memcmp-03.ll
+13 -16  3 files

LLVM/project 181e821 llvm/test/CodeGen/SystemZ zos-ppa2.ll

SystemZ: Remove redundant REQUIRES systemz from test
Delta  File
+0 -1  llvm/test/CodeGen/SystemZ/zos-ppa2.ll
+0 -1  1 files

LLVM/project ef8d814 llvm/include/llvm/ExecutionEngine/Orc LLJIT.h IndirectionUtils.h

Revert "Remove redundant move in return statement" (#91169)

Reverts llvm/llvm-project#90546

This broke some bots; it seems some toolchains don't consider the
implicit move here.
Delta  File
+4 -4  llvm/include/llvm/ExecutionEngine/Orc/LLJIT.h
+1 -1  llvm/include/llvm/ExecutionEngine/Orc/IndirectionUtils.h
+5 -5  2 files

LLVM/project 0140ba0 clang/include/clang/Basic LangOptions.h, clang/test/AST ast-dump-fpfeatures.cpp ast-dump-late-parsing.cpp

[clang] Enable FPContract with optnone (#91061)

Previously, the treatment of the `optnone` attribute was modified in
https://github.com/llvm/llvm-project/pull/85605 ([clang] Set correct
FPOptions if attribute 'optnone' presents). As a side effect, FPContract
was disabled for optnone. This created an unneeded divergence from the
behavior of -O0, which enables this optimization.

In the discussion
https://github.com/llvm/llvm-project/pull/85605#issuecomment-2089350379
it was pointed out that FP contraction should be enabled even if all
optimizations are turned off, otherwise results of calculations would be
different. This change enables FPContract at optnone.
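
To see why contraction can change results, compare a multiply-add that rounds twice with one that rounds once. This is an illustrative Python sketch, not clang's behavior. The fused form is modeled with exact rational arithmetic as a stand-in for a hardware fused multiply-add. The function names and operand values are assumptions chosen to make the difference visible.

```python
from fractions import Fraction


def mul_add_unfused(a: float, b: float, c: float) -> float:
    """a*b is rounded to double, then the sum is rounded again."""
    return a * b + c


def mul_add_fused(a: float, b: float, c: float) -> float:
    """Model of a contracted a*b+c: exact arithmetic, one final rounding."""
    return float(Fraction(a) * Fraction(b) + Fraction(c))


# Operands chosen so the exact product 1 - 2**-60 rounds to 1.0 in double.
a = 1.0 + 2.0 ** -30
b = 1.0 - 2.0 ** -30
c = -1.0

unfused = mul_add_unfused(a, b, c)
fused = mul_add_fused(a, b, c)
```

With these operands the unfused form loses the low-order term when the product is rounded, while the fused form keeps it. The same source expression therefore yields different values depending on whether contraction is enabled.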
Delta  File
+9 -9  clang/test/AST/ast-dump-fpfeatures.cpp
+4 -4  clang/test/AST/ast-dump-late-parsing.cpp
+1 -4  clang/include/clang/Basic/LangOptions.h
+2 -2  clang/test/AST/ast-dump-fpfeatures.m
+16 -19  4 files

LLVM/project d654278 llvm/docs AMDGPUUsage.rst LangRef.rst, llvm/lib/Target/AMDGPU SIModeRegisterDefaults.cpp SIISelLowering.cpp

Reapply "AMDGPU: Implement llvm.set.rounding (#88587)" series (#91113)

Revert "Revert 4 last AMDGPU commits to unbreak Windows bots"

This reverts commit 0d493ed2c6e664849a979b357a606dcd8273b03f.

MSVC does not like constexpr on the definition after an extern
declaration of a global.
Delta  File
+1,665 -0  llvm/test/CodeGen/AMDGPU/llvm.set.rounding.ll
+119 -0  llvm/lib/Target/AMDGPU/SIModeRegisterDefaults.cpp
+88 -0  llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+7 -0  llvm/lib/Target/AMDGPU/SIModeRegisterDefaults.h
+6 -0  llvm/docs/AMDGPUUsage.rst
+2 -0  llvm/docs/LangRef.rst
+1,887 -0  2 files not shown
+1,890 -0  8 files

LLVM/project db532ff llvm/include/llvm/ExecutionEngine/Orc LLJIT.h IndirectionUtils.h

Remove redundant move in return statement (#90546)

This pull request removes an unnecessary move in the return statement to
suppress compilation warnings.

Co-authored-by: Xiaolei Shi <xiaoleis at nvidia.com>
Delta  File
+4 -4  llvm/include/llvm/ExecutionEngine/Orc/LLJIT.h
+1 -1  llvm/include/llvm/ExecutionEngine/Orc/IndirectionUtils.h
+5 -5  2 files

LLVM/project 1500dc0 llvm/test/CodeGen/RISCV/rvv coalesce-vsetvli.mir

[RISCV] Use virtual registers for AVL instrs in coalesce-vsetvli.mir. NFC

All GPR registers will still be virtual at this stage, so update the test
to reflect that.
Delta  File
+11 -7  llvm/test/CodeGen/RISCV/rvv/coalesce-vsetvli.mir
+11 -7  1 files

LLVM/project 0348e71 clang/lib/Analysis/FlowSensitive Transfer.cpp, clang/unittests/Analysis/FlowSensitive TransferTest.cpp

[clang][dataflow] Fix crash when `operator=` result type is not destination type. (#90898)

The existing code was full of comments about how we assume this is always the
case, but it's not mandated by the standard, and there is code out there that
returns a different type. So check that the result type is in fact the same as
the destination type before attempting to copy to the result.

To make sure that we don't bail out in more cases than intended, I've extended
existing tests to verify that in the common case, we do return the destination
object (by reference or value, as the case may be).
Delta  File
+71 -2  clang/unittests/Analysis/FlowSensitive/TransferTest.cpp
+16 -7  clang/lib/Analysis/FlowSensitive/Transfer.cpp
+87 -9  2 files

LLVM/project d70267f clang/test/Driver riscv-option-arch.c riscv-option-arch.s, llvm/lib/Target/RISCV/AsmParser RISCVAsmParser.cpp

[RISCV] Teach .option arch to support experimental extensions. (#89727)

Previously, `.option arch` rejected extensions whose names did not match
a RISC-V feature. Because experimental features carry an experimental-
prefix, `.option arch` could not be used with experimental extensions.
This patch uses the features of extensions to identify whether an
extension exists.
Delta  File
+15 -10  llvm/lib/Target/RISCV/AsmParser/RISCVAsmParser.cpp
+10 -3  llvm/test/MC/RISCV/option-arch.s
+7 -0  clang/test/Driver/riscv-option-arch.c
+5 -0  clang/test/Driver/riscv-option-arch.s
+37 -13  4 files

LLVM/project 947b062 clang/include/clang/Serialization ASTBitCodes.h SourceLocationEncoding.h, clang/lib/Serialization ASTReader.cpp ASTWriter.cpp

Reland "[Modules] No transitive source location change (#86912)"

This relands 6c31104.

The patch was reverted because it introduced an incorrect alignment. It
is recommitted here with the alignment issue fixed.

The original message follows:

This is part of the "no transitive change" patch series, "no transitive
source location change". I discussed this with @Bigcheese at the WG21
meeting in Tokyo.

The idea comes from a post by @jyknight on the LLVM Discourse. For:

```
// A.cppm
export module A;
...

    [246 lines not shown]
Delta  File
+57 -60  clang/include/clang/Serialization/ASTBitCodes.h
+65 -26  clang/include/clang/Serialization/SourceLocationEncoding.h
+87 -0  clang/test/Modules/no-transitive-source-location-change.cppm
+22 -44  clang/lib/Serialization/ASTReader.cpp
+31 -17  clang/include/clang/Serialization/ASTReader.h
+35 -8  clang/lib/Serialization/ASTWriter.cpp
+297 -155  8 files not shown
+326 -174  14 files

LLVM/project b944b54 llvm/test/CodeGen/RISCV/rvv coalesce-vsetvli.mir

[RISCV] Add RISCVCoalesceVSETVLI tests for removing dead AVLs. NFC
Delta  File
+62 -0  llvm/test/CodeGen/RISCV/rvv/coalesce-vsetvli.mir
+62 -0  1 files

LLVM/project db0ed55 clang/lib/Format UnwrappedLineParser.cpp, clang/unittests/Format FormatTest.cpp

[clang-format] Don't remove parentheses of fold expressions (#91045)

Fixes #90966.
Delta  File
+9 -0  clang/unittests/Format/FormatTest.cpp
+6 -1  clang/lib/Format/UnwrappedLineParser.cpp
+15 -1  2 files

LLVM/project c609043 clang/lib/Format UnwrappedLineParser.cpp, clang/unittests/Format TokenAnnotatorTest.cpp

[clang-format] Don't allow comma in front of structural enum (#91056)

Assume that a comma in front of `enum` means it is actually a part of an
elaborated type in a template parameter list.

Fixes https://github.com/llvm/llvm-project/issues/47782
Delta  File
+3 -2  clang/lib/Format/UnwrappedLineParser.cpp
+4 -0  clang/unittests/Format/TokenAnnotatorTest.cpp
+7 -2  2 files

LLVM/project 774b7eb llvm/include/llvm/ADT StringRef.h

[ADT] Reimplement operator==(StringRef, StringRef) (NFC) (#91139)

I'm planning to deprecate and eventually remove StringRef::equals in
favor of operator==.  This patch reimplements operator== without using
StringRef::equals.

I'm not sure if there is a good way to make StringRef::compareMemory
available to operator==, which is not a member function.  "friend"
works to some extent but breaks corner cases, which is why I've chosen
to "inline" compareMemory.
Delta  File
+5 -1  llvm/include/llvm/ADT/StringRef.h
+5 -1  1 files

LLVM/project f7bfb07 llvm/lib/Target/X86 X86ISelLowering.cpp, llvm/test/CodeGen/X86 pr91005.ll

[X86][FP16] Do not create VBROADCAST_LOAD for f16 without AVX2 (#91125)

AVX doesn't provide 16-bit BROADCAST instruction.

Fixes #91005
Delta  File
+39 -0  llvm/test/CodeGen/X86/pr91005.ll
+1 -1  llvm/lib/Target/X86/X86ISelLowering.cpp
+40 -1  2 files

LLVM/project 3d6cf53 mlir/docs/DefiningDialects Operations.md

fix formatting issues with ODS docs around assembly format directives (#91149)

- Some sentences are incorrectly split across list items.
- Some pre-formatted syntax is left in plaintext.
- Some lines end in spaces

Co-authored-by: Jeremy Kun <j2kun at users.noreply.github.com>
Delta  File
+15 -14  mlir/docs/DefiningDialects/Operations.md
+15 -14  1 files

LLVM/project ddecada clang/lib/Basic/Targets AArch64.cpp, clang/test/OpenMP distribute_parallel_for_simd_num_threads_codegen.cpp distribute_parallel_for_num_threads_codegen.cpp

[clang backend] In AArch64's DataLayout, specify a minimum function alignment of 4. (#90702)

This addresses an issue where the explicit alignment of 2 (for C++ ABI
reasons) was being propagated to the back end and causing under-aligned
functions (in special sections).

This is an alternate approach suggested by @efriedma-quic in PR #90415.

Fixes #90358
Delta  File
+15 -15  clang/test/OpenMP/distribute_parallel_for_simd_num_threads_codegen.cpp
+10 -10  clang/test/OpenMP/distribute_parallel_for_num_threads_codegen.cpp
+6 -6  clang/lib/Basic/Targets/AArch64.cpp
+6 -3  llvm/unittests/Bitcode/DataLayoutUpgradeTest.cpp
+8 -0  llvm/lib/IR/AutoUpgrade.cpp
+4 -4  llvm/lib/Target/AArch64/AArch64TargetMachine.cpp
+49 -38  4 files not shown
+57 -46  10 files

LLVM/project e123643 llvm/lib/Target/AArch64 AArch64ISelLowering.cpp, llvm/test/CodeGen/AArch64 mul_pow2.ll

[AArch64][SelectionDAG] Lower multiplication by a constant to shl+sub+shl+sub (#90199)

Change the costmodel to lower a = b * C where C = 1 - (1 - 2^m) * 2^n to
              sub  w8, w0, w0, lsl #m
              sub  w0, w0, w8, lsl #n
Fix https://github.com/llvm/llvm-project/issues/89430
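
As a worked check of the decomposition: the first sub computes b - (b << m) = b * (1 - 2^m), and the second computes b - ((b * (1 - 2^m)) << n) = b * (1 - (1 - 2^m) * 2^n) = b * C. The Python sketch below models the two instructions with 32-bit wraparound. The function names are hypothetical and this is not the lowering code in AArch64ISelLowering.cpp.

```python
MASK32 = 0xFFFFFFFF


def constant_for(m: int, n: int) -> int:
    """C = 1 - (1 - 2^m) * 2^n, the multiplier the two subs implement."""
    return 1 - (1 - (1 << m)) * (1 << n)


def mul_by_two_subs(b: int, m: int, n: int) -> int:
    """Model of: sub w8, w0, w0, lsl #m ; sub w0, w0, w8, lsl #n."""
    w8 = (b - (b << m)) & MASK32     # w8 = b * (1 - 2^m), wrapped to 32 bits
    return (b - (w8 << n)) & MASK32  # result = b - (w8 << n), wrapped


def matches_multiply(b: int, m: int, n: int) -> bool:
    """Compare the two-sub sequence against a plain 32-bit multiply."""
    return mul_by_two_subs(b, m, n) == (b * constant_for(m, n)) & MASK32
```

For example, m = 2 and n = 3 give C = 1 - (1 - 4) * 8 = 25, so `mul_by_two_subs(7, 2, 3)` yields 175.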
Delta  File
+73 -2  llvm/test/CodeGen/AArch64/mul_pow2.ll
+30 -0  llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
+103 -2  2 files