[mlir][OpenMP] Extend `omp.private` with a `dealloc` region (#90456)
Extends `omp.private` with a new region: `dealloc` where deallocation
logic for Fortran deallocatables will be outlined (this will happen in
later PRs).
[C++20] [Modules] Don't skip pragma diagnostic mappings
Close https://github.com/llvm/llvm-project/issues/75057
Previously, I thought the diagnostic mappings is not meaningful with
modules incorrectly. And this problem get revealed by another change
recently. So this patch tried to rever the previous "optimization"
partially.
[RISCV] Add DAG combine for (vmv_s_x_vl (undef) (vmv_x_s X). (#90524)
We can use the original vector as long as the type of X matches the
result type of the vmv_s_x_vl.
Revert "[C++20] [Modules] Don't skip pragma diagnostic mappings"
and "[NFC] [C++20] [Modules] Use new class CXX20ModulesGenerator to
generate module file for C++20 modules instead of PCHGenerator"
This reverts commit fb21343473e33e9a886b42d2fe95d1cec1cd0030.
and commit 18268ac0f48d93c2bcddb69732761971669c09ab.
It looks like there are some problems about linking the compiler
[RISCV] Remove hasSideEffects=1 for saturating/fault-only-first instructions
Marking them as `hasSideEffects=1` stops some optimizations.
According to `Target.td`:
> // Does the instruction have side effects that are not captured by any
> // operands of the instruction or other flags?
> bit hasSideEffects = ?;
It seems we don't need to set `hasSideEffects` for vleNff since we have
modelled `vl` as an output operand.
As for saturating instructions, I think that explicit Def/Use list
is kind of side effects captured by any operands of the instruction,
so we don't need to set `hasSideEffects` either. And I have just
investigated AArch64's implementation, they don't set this flag and
don't add `Def` list.
[12 lines not shown]
[C++20] [Modules] Don't skip pragma diagnostic mappings
Close https://github.com/llvm/llvm-project/issues/75057
Previously, I thought the diagnostic mappings is not meaningful with
modules incorrectly. And this problem get revealed by another change
recently. So this patch tried to rever the previous "optimization"
partially.
[SelectionDAG][RISCV] Move VP_REDUCE* legalization to LegalizeDAG.cpp. (#90522)
LegalizeVectorType is responsible for legalizing nodes that perform an
operation on each element may need to scalarize.
This is not true for nodes like VP_REDUCE.*, BUILD_VECTOR,
SHUFFLE_VECTOR, EXTRACT_SUBVECTOR, etc.
This patch drops any nodes with a scalar result from LegalizeVectorOps
and handles them in LegalizeDAG instead.
This required moving the reduction promotion to LegalizeDAG. I have
removed the support integer promotion as it was incorrect for integer
min/max reductions. Since it was untested, it was best to assert on it
until it was really needed.
There are a couple regressions that can be fixed with a small DAG
combine which I will do as a follow up.
[NFC] [C++20] [Modules] Use new class CXX20ModulesGenerator to generate module file for C++20 modules instead of PCHGenerator
Previously we're re-using PCHGenerator to generate the module file for
C++20 modules. But this is slighty more or less odd. This patch tries
to use a new class 'CXX20ModulesGenerator' to generate the module file
for C++20 modules.
[clang-tidy] fix false-negative for macros in `readability-math-missing-parentheses` (#90279)
When a binary operator is the last operand of a macro, the end location
that is past the `BinaryOperator` will be inside the macro and therefore
an
invalid location to insert a `FixIt` into, which is why the check bails
when encountering such a pattern.
However, the end location is only required for the `FixIt` and the
diagnostic can still be emitted, just without an attached fix.
[ELF] --compress-debug-sections=zstd: replace ZSTD_c_nbWorkers parallelism with multi-frame parallelism
https://reviews.llvm.org/D133679 utilizes zstd's multithread API to
create one single frame. This provides a higher compression ratio but is
significantly slower than concatenating multiple frames.
With manual parallelism, it is easier to parallelize memcpy in
OutputSection::writeTo for parallel memcpy.
In addition, as the individual allocated decompression buffers are much
smaller, we can make a wild guess (compressed_size/4) without worrying
about a resize (due to wrong guess) would waste memory.
Disable test for lsan and x86_64h (#90483)
Disable this test on x86_64h for LSan.
This test is failing with malformed object only on x86_64h.
Disabling for now.
rdar://125052424
[C++20] [Modules] [Reduced BMI] Avoid force writing static declarations
within module purview
Close https://github.com/llvm/llvm-project/issues/90259
Technically, the static declarations shouldn't be leaked from the module
interface, otherwise it is an illegal program according to the spec. So
we can get rid of the static declarations from the reduced BMI
technically. Then we can close the above issue.
However, there are too many `static inline` codes in existing headers.
So it will be a pretty big breaking change if we do this globally.
[BOLT] Avoid reference updates for non-JT symbol operands (#88838)
Skip updating references for operands that do not directly
refer to jump table symbols but fall within a jump table's
address range to prevent unintended modifications.