[flang][cuda] Avoid crash by exiting the check if assignment is not usable (#89149)
In the presence of other semantic error, `GetAssignment` would return a
nullptr and therefore would make the rest of the check crash when trying
to collect symbols.
Exiting early when we have a nullptr so the compiler doesn't crash and
user can get the meaningful semantic error.
[Support] Report EISDIR when opening a directory (#79880)
The test `llvm/unittests/Support/CommandLineTest.cpp` that handles
errors in expansion of response files was previously disabled for AIX.
Originally the code was dependent on `read` returning `EISDIR` which
occurs on platforms such as Linux. However, other platforms such as AIX
allow use of `read` on file descriptors for directories. This change
updates `readNativeFile` to produce `EISDIR` on AIX and z/OS when used
on a directory (instead of relying on the call to `read` to do so).
---------
Co-authored-by: Alison Zhang <alisonzhang at ibm.com>
Co-authored-by: James Henderson <46713263+jh7370 at users.noreply.github.com>
[bazel] Improve liblldb building (#89095)
On Linux using --version-script doesn't force loading of the underlying
archives that contain the symbols. By setting alwayslink=True on the API
cc_library we virtually get this behavior. This also allows downstream
users to use the exports files used by cmake. We could build more
configurability into this but there are also a lot of possible
variations users might want.
[flang][cuda] Lower DEALLOCATE for device variables (#89091)
Replace the runtime call to `AllocatableDeallocate` for CUDA device
variable to the newly added `fir.cuda_deallocate` operation.
This is similar with #88980
A third patch will handle the case of automatic dealloctaion of device
allocatable variables
[InstCombine] Don't use dominating conditions to transform sub into xor. (#88566)
Other passes are unable to reverse this transform if we use dominating
conditions.
Fixes #88239.
[lldb][DynamicLoader] Fix lldb unable to stop at _dl_debug_state if user set it before the process launched. (#88792)
If user sets a breakpoint at `_dl_debug_state` before the process
launched, the breakpoint is not resolved yet. When lldb loads dynamic
loader module, it's created with `Target::GetOrCreateModule` which
notifies any pending breakpoint to resolve. However, the module's
sections are not loaded at this time. They are loaded after returned
from
[Target::GetOrCreateModule](https://github.com/llvm/llvm-project/blob/0287a5cc4e2a5ded1ae2e4079f91052e6a6b8d9b/lldb/source/Plugins/DynamicLoader/POSIX-DYLD/DynamicLoaderPOSIXDYLD.cpp#L574-L577).
This change fixes it by manually resolving breakpoints after creating
dynamic loader module.
[LiveIns] Improve recomputeLiveIns() (#88951)
Some small changes to recomputeLiveIns() to improve performance:
- Instead of copying the list of old live-ins, and then clearing
them, a new method swaps the list for an empty one.
- getLiveIns() now returns a constant reference to the list
As result, the list-data is never copied. Depending on the
implementation
details of the vector container, it can also save calls to allocate
and deallocate memory.
I see a small improvement on CTMark with these changes.
---------
Co-authored-by: Nikita Popov <github at npopov.com>
CodeGenPrepare: Add support for llvm.threadlocal.address address-mode sinking (#87844)
Depending on the TLSMode many thread-local accesses on x86 can be
expressed by adding a %fs: segment register to an addressing mode. Even
if there are mutliple users of a `llvm.threadlocal.address` intrinsic it
is generally not worth sharing the value in a register but instead fold
the %fs access into multiple addressing modes.
Hence this changes CodeGenPrepare to duplicate the
`llvm.threadlocal.address` intrinsic as necessary.
Introduces a new `TargetLowering::addressingModeSupportsTLS` callback
that allows targets to indicate whether TLS accesses can be part of an
addressing mode.
This is fixing a performance problem, as this folding of TLS-accesses
into multiple addressing modes happened naturally before the
introduction of the `llvm.threadlocal.address` intrinsic, but regressed
due to `SelectionDAG` keeping things in registers when accessed across
[3 lines not shown]
AMDGPU: Fix not handling atomicrmw fadd in exotic address spaces correctly
We try to interpret unknown address space numbers as aliases of global,
but this wasn't applied here. Also improve test coverage for the
buffer fat pointer address space.
[X86] Always use 64-bit relocations in no-PIC large code model (#89101)
This matches other types of relocations, e.g. to constant pool. And
makes things more consistent with PIC large code model.
Some users of the large code model may not place small data in the lower
2GB of the address space (e.g.
https://github.com/ClangBuiltLinux/linux/issues/2016), so just
unconditionally use 64-bit relocations in the large code model.
So now functions in a section not marked large will use 64-bit
relocations to reference everything when using the large code model.
This also fixes some lldb tests broken by #88172
(https://lab.llvm.org/buildbot/#/builders/68/builds/72458).
[clang]Treat arguments to builtin type traits as template type arguments (#87132)
This change improves error messages for builtins in case of empty
parentheses.
Fixes llvm#86997
[libc++][TZDB] Adds local_info formatter.
Note the code using a local_info object will be done in a separate
commit.
Implements parts of:
- P0355 Extending to Calendars and Time Zones
- P1361 Integration of chrono with text formatting
[libc++][TZDB] Adds sys_info formatter. (#85896)
Implements parts of:
- P0355 Extending <chrono> to Calendars and Time Zones
- P1361 Integration of chrono with text formatting
[llvm][NVPTX] Don't emit unused var 'temp_param_reg' (NFC) (#89004)
Don't emit unused variable 'temp_param_reg' which has been around since
ae556d3ef72dfe5f40a337b7071f42b7bf5b66a4 .
[AArch64] Update latencies for Cortex-A510 scheduling model (#87293)
Updated according to the Software Optimization Guide for Arm®
Cortex®‑A510 Core Revision: r1p3 Issue 6.0.
[CostModel][X86] Recognise vector rotation by uniform constant patterns
Adds suitable costs for AVX512 targets (we still rely on default expansion for AVX2 and earlier)