Linux/linux 8d025e2 . MAINTAINERS, fs/erofs super.c

Merge tag 'erofs-for-6.9-rc2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs

Pull erofs fixes from Gao Xiang:

 - Add a new reviewer Sandeep Dhavale to build a healthier community

 - Drop experimental warning for FSDAX

* tag 'erofs-for-6.9-rc2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs:
  MAINTAINERS: erofs: add myself as reviewer
  erofs: drop experimental warning for FSDAX
Delta      File
+1 -0      MAINTAINERS
+0 -1      fs/erofs/super.c
+1 -1      2 files

Linux/linux 4076fa1 fs/9p vfs_inode.c vfs_inode_dotl.c

Merge tag '9p-fixes-for-6.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs

Pull 9p fixes from Eric Van Hensbergen:
 "Two of these fix syzbot reported issues, and the other fixes an unused
  variable in some configurations"

* tag '9p-fixes-for-6.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs:
  fs/9p: fix uninitialized values during inode evict
  fs/9p: remove redundant pointer v9ses
  fs/9p: fix uaf in v9fs_stat2inode_dotl
Delta      File
+10 -6     fs/9p/vfs_inode.c
+1 -5      fs/9p/vfs_inode_dotl.c
+11 -11    2 files

Linux/linux 400dd45 fs/btrfs volumes.c extent_map.c

Merge tag 'for-6.9-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux

Pull btrfs fixes from David Sterba:

 - fix race when reading extent buffer and 'uptodate' status is missed
   by one thread (introduced in 6.5)

 - do additional validation of devices using major:minor numbers

 - zoned mode fixes:
     - use zone-aware super block access during scrub
     - fix use-after-free during device replace (found by KASAN)
     - also delete zones that are 100% unusable to reclaim space

 - extent unpinning fixes:
     - fix extent map leak after error handling
     - print correct range in error message

 - error code and message updates

    [12 lines not shown]
Delta      File
+22 -5     fs/btrfs/volumes.c
+8 -8      fs/btrfs/extent_map.c
+7 -7      fs/btrfs/zoned.c
+13 -0     fs/btrfs/extent_io.c
+11 -1     fs/btrfs/scrub.c
+2 -1      fs/btrfs/block-group.c
+63 -22    6 files

Linux/linux dc189b8 arch/arm/include/asm mman.h, arch/parisc/include/asm mman.h

Merge tag 'mm-hotfixes-stable-2024-03-27-11-25' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull misc fixes from Andrew Morton:
 "Various hotfixes. About half are cc:stable and the remainder address
  post-6.8 issues or aren't considered suitable for backporting.

  zswap figures prominently in the post-6.8 issues - followup against the
  large number of changes we have just made to that code.

  Apart from that, all over the map"

* tag 'mm-hotfixes-stable-2024-03-27-11-25' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (21 commits)
  crash: use macro to add crashk_res into iomem early for specific arch
  mm: zswap: fix data loss on SWP_SYNCHRONOUS_IO devices
  selftests/mm: fix ARM related issue with fork after pthread_create
  hexagon: vmlinux.lds.S: handle attributes section
  userfaultfd: fix deadlock warning when locking src and dst VMAs
  tmpfs: fix race on handling dquot rbtree
  selftests/mm: sigbus-wp test requires UFFD_FEATURE_WP_HUGETLBFS_SHMEM

    [14 lines not shown]
Delta      File
+39 -6     mm/zswap.c
+23 -10    mm/page_owner.c
+16 -0     mm/filemap.c
+14 -0     arch/parisc/include/asm/mman.h
+14 -0     arch/arm/include/asm/mman.h
+12 -1     tools/testing/selftests/mm/uffd-unit-tests.c
+118 -17   19 files not shown
+177 -40   25 files

Linux/linux 9624905 kernel/trace trace_probe.c

Merge tag 'probes-fixes-v6.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

Pull probes fixlet from Masami Hiramatsu:

 - tracing/probes: initialize a 'val' local variable with zero.

   This variable is read by FETCH_OP_ST_EDATA in a loop, and is
   initialized by FETCH_OP_ARG in the same loop. Since this
   initialization is not obvious, smatch warns about it.

   Explicitly initializing 'val' with zero fixes this warning.
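
A minimal C sketch of the pattern described above, not the code in
kernel/trace/trace_probe.c: one loop in which FETCH_OP_ARG writes 'val'
and FETCH_OP_ST_EDATA reads it.  The struct layout, the *_sketch names and
the 'slot' field are assumptions made for illustration only.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified instruction set; only the two op names come
 * from the commit message. */
enum fetch_op_sketch { FETCH_OP_ARG, FETCH_OP_ST_EDATA, FETCH_OP_END };

struct fetch_insn_sketch {
    enum fetch_op_sketch op;
    unsigned long value;   /* payload for FETCH_OP_ARG */
    int slot;              /* destination index for FETCH_OP_ST_EDATA */
};

void run_fetch_ops(const struct fetch_insn_sketch *code, unsigned long *edata)
{
    /* Written by FETCH_OP_ARG and read by FETCH_OP_ST_EDATA in the same
     * loop.  A static checker cannot easily prove the write comes first,
     * so the explicit zero keeps the read well defined either way. */
    unsigned long val = 0;

    assert(code != NULL && edata != NULL);

    for (; code->op != FETCH_OP_END; code++) {
        switch (code->op) {
        case FETCH_OP_ARG:
            val = code->value;
            break;
        case FETCH_OP_ST_EDATA:
            edata[code->slot] = val;
            break;
        default:
            break;
        }
    }
}
```

If a program ever placed the store before the load, the sketch would store
0 rather than an indeterminate value.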

* tag 'probes-fixes-v6.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
  tracing: probes: Fix to zero initialize a local variable
Delta      File
+1 -1      kernel/trace/trace_probe.c
+1 -1      1 file

Linux/linux f4a4329 fs binfmt_elf_fdpic.c, tools/testing/selftests/exec recursion-depth.c load_address.c

Merge tag 'execve-v6.9-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux

Pull execve fixes from Kees Cook:

 - Fix selftests to conform to the TAP output format (Muhammad Usama
   Anjum)

 - Fix NOMMU linux_binprm::exec pointer in auxv (Max Filippov)

 - Replace deprecated strncpy usage (Justin Stitt)

 - Replace another /bin/sh instance in selftests

* tag 'execve-v6.9-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  binfmt: replace deprecated strncpy
  exec: Fix NOMMU linux_binprm::exec in transfer_args_to_stack()
  selftests/exec: Convert remaining /bin/sh to /bin/bash
  selftests/exec: execveat: Improve debug reporting
  selftests/exec: recursion-depth: conform test to TAP format output

    [2 lines not shown]
Delta      File
+26 -27    tools/testing/selftests/exec/recursion-depth.c
+16 -20    tools/testing/selftests/exec/load_address.c
+7 -5      tools/testing/selftests/exec/execveat.c
+9 -1      tools/testing/selftests/exec/binfmt_script.py
+2 -2      tools/testing/selftests/exec/Makefile
+1 -1      fs/binfmt_elf_fdpic.c
+61 -56    1 file not shown
+62 -56    7 files

Linux/linux 498e47c drivers/uio uio_dmem_genirq.c uio.c

Fix build errors due to new UIO_MEM_DMA_COHERENT mess

Commit 576882ef5e7f ("uio: introduce UIO_MEM_DMA_COHERENT type")
introduced a new use-case for 'struct uio_mem' where the 'mem' field now
contains a kernel virtual address when 'memtype' is set to
UIO_MEM_DMA_COHERENT.

That in turn causes build errors, because 'mem' is of type
'phys_addr_t', and a virtual address is a pointer type.  When the code
just blindly uses casts to mix the two, it causes problems when
phys_addr_t isn't the same size as a pointer - notably on 32-bit
architectures with PHYS_ADDR_T_64BIT.

The proper thing to do would probably be to use a union member, and not
have any casts, and make the 'mem' member be a union of 'mem.physaddr'
and 'mem.vaddr', based on 'memtype'.

This is not that proper thing.  This is just fixing the ugly casts to be
even uglier, but at least not cause build errors on 32-bit platforms

    [11 lines not shown]
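
A user-space C sketch of the two approaches mentioned above, under stated
assumptions: the struct, enum and helper names are hypothetical, and
phys_addr_t is modelled as a 64-bit integer to mimic PHYS_ADDR_T_64BIT.
The commit text shown here does not spell out the exact casts it uses;
routing through uintptr_t is one common way to avoid a size-mismatch
diagnostic and is an assumption of this sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: physical addresses are 64-bit even if pointers are 32-bit. */
typedef uint64_t phys_addr_t;

enum memtype_sketch { SKETCH_MEM_PHYS, SKETCH_MEM_DMA_COHERENT };

/* The "proper" layout suggested in the message: a union selected by
 * memtype, so no casts are needed at all. */
struct uio_mem_sketch {
    enum memtype_sketch memtype;
    union {
        phys_addr_t physaddr;   /* valid for SKETCH_MEM_PHYS */
        void *vaddr;            /* valid for SKETCH_MEM_DMA_COHERENT */
    } mem;
};

void set_vaddr(struct uio_mem_sketch *m, void *p)
{
    m->memtype = SKETCH_MEM_DMA_COHERENT;
    m->mem.vaddr = p;
}

void *get_vaddr(const struct uio_mem_sketch *m)
{
    assert(m->memtype == SKETCH_MEM_DMA_COHERENT);
    return m->mem.vaddr;
}

/* The cast-based workaround: go through uintptr_t so the integer has
 * pointer width before it is widened to, or narrowed from, phys_addr_t. */
phys_addr_t vaddr_to_field(void *p)
{
    return (phys_addr_t)(uintptr_t)p;
}

void *field_to_vaddr(phys_addr_t v)
{
    return (void *)(uintptr_t)v;
}
```

The union keeps the compiler aware of which interpretation is active; the
cast helpers only make the conversion explicit.
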
Delta      File
+2 -2      drivers/uio/uio_dmem_genirq.c
+1 -1      drivers/uio/uio.c
+1 -1      drivers/uio/uio_pruss.c
+4 -4      3 files

Linux/linux 5b4cdd9 kernel/time posix-clock.c

Fix memory leak in posix_clock_open()

If the clk ops.open() function returns an error, we don't release the
pccontext we allocated for this clock.

Re-organize the code slightly to make it all more obvious.
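
A user-space C sketch of the leak and its fix, assuming a context object
allocated before a driver-supplied open callback runs.  All names here
(clock_ctx_sketch, clock_open_sketch, failing_open, working_open) are
hypothetical; this is not the code in kernel/time/posix-clock.c.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

struct clock_ctx_sketch { void *private_clkdata; };

typedef int (*clock_open_fn)(struct clock_ctx_sketch *ctx);

/* Two stand-in callbacks so both paths can be exercised. */
int failing_open(struct clock_ctx_sketch *ctx) { (void)ctx; return -EIO; }
int working_open(struct clock_ctx_sketch *ctx) { (void)ctx; return 0; }

int clock_open_sketch(clock_open_fn open_op, struct clock_ctx_sketch **out)
{
    struct clock_ctx_sketch *ctx;
    int err;

    assert(out != NULL);

    ctx = calloc(1, sizeof(*ctx));
    if (!ctx)
        return -ENOMEM;

    err = open_op ? open_op(ctx) : 0;
    if (err) {
        /* The step the buggy version skipped: release the context when
         * the callback reports an error. */
        free(ctx);
        return err;
    }

    *out = ctx;   /* ownership passes to the caller only on success */
    return 0;
}
```

Keeping the allocation, the callback and the single error exit next to
each other makes the ownership rule easy to check by eye.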

Reported-by: Rohit Keshri <rkeshri at redhat.com>
Acked-by: Oleg Nesterov <oleg at redhat.com>
Fixes: 60c6946675fc ("posix-clock: introduce posix_clock_context concept")
Cc: Jakub Kicinski <kuba at kernel.org>
Cc: David S. Miller <davem at davemloft.net>
Cc: Thomas Gleixner <tglx at linutronix.de>
Signed-off-by: Linus Torvalds <torvalds at linuxfoundation.org>
Delta      File
+9 -7      kernel/time/posix-clock.c
+9 -7      1 file

Linux/linux 32fbe52 arch/x86/include/asm crash_reserve.h, kernel crash_reserve.c

crash: use macro to add crashk_res into iomem early for specific arch

There are regression reports[1][2] that the crashkernel region on x86_64
sometimes can't be added into the iomem tree.  This causes kdump loading
to fail later.

This happened after commit 4a693ce65b18 ("kdump: defer the insertion of
crashkernel resources") was merged.

Even though these reported issues have proved to be related to other
components and were merely exposed after the above commit was applied, I
would still like to keep crashk_res and crashk_low_res being added into
iomem early as before, because the early adding has always been there on
x86_64 and has worked very well.  For the safety of kdump, let's change it
back.

Here, add a macro HAVE_ARCH_ADD_CRASH_RES_TO_IOMEM_EARLY so that only
architectures defining the macro add crashk_res/_low_res into iomem
early.  Then define HAVE_ARCH_ADD_CRASH_RES_TO_IOMEM_EARLY on x86 to
enable it.

    [23 lines not shown]
Delta      File
+7 -0      kernel/crash_reserve.c
+2 -0      arch/x86/include/asm/crash_reserve.h
+9 -0      2 files

Linux/linux 25cd241 mm zswap.c

mm: zswap: fix data loss on SWP_SYNCHRONOUS_IO devices

Zhongkun He reports data corruption when combining zswap with zram.

The issue is the exclusive loads we're doing in zswap. They assume
that all reads are going into the swapcache, which can assume
authoritative ownership of the data and so the zswap copy can go.

However, zram files are marked SWP_SYNCHRONOUS_IO, and faults will try to
bypass the swapcache.  This results in an optimistic read of the swap data
into a page that will be dismissed if the fault fails due to races.  In
this case, zswap mustn't drop its authoritative copy.

Link: https://lore.kernel.org/all/CACSyD1N+dUvsu8=zV9P691B9bVq33erwOXNTmEaUbi9DrDeJzw@mail.gmail.com/
Fixes: b9c91c43412f ("mm: zswap: support exclusive loads")
Link: https://lkml.kernel.org/r/20240324210447.956973-1-hannes@cmpxchg.org
Signed-off-by: Johannes Weiner <hannes at cmpxchg.org>
Reported-by: Zhongkun He <hezhongkun.hzk at bytedance.com>
Tested-by: Zhongkun He <hezhongkun.hzk at bytedance.com>

    [7 lines not shown]
Delta      File
+19 -4     mm/zswap.c
+19 -4     1 file

Linux/linux 8c86437 tools/testing/selftests/mm uffd-unit-tests.c uffd-common.c

selftests/mm: fix ARM related issue with fork after pthread_create

The following issue was observed while running the uffd-unit-tests selftest
on ARM devices.  On x86_64 no issues were detected:

pthread_create followed by fork caused a deadlock in certain cases wherein
fork required some work to be completed by the created thread.  Use
synchronization to ensure that the created thread's start function has
started before invoking fork.
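
A minimal C sketch of that synchronization, assuming a flag that the
thread's start function sets and the parent polls before calling fork().
The function and variable names are hypothetical and the selftest's own
code may differ; only the use of atomic_bool is taken from the note below.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static atomic_bool worker_started;

static void *worker(void *arg)
{
    (void)arg;
    /* Announce that the start function is running. */
    atomic_store(&worker_started, true);
    return NULL;
}

/* Returns the child's exit status (0 here), or -1 on any failure. */
int fork_after_thread_ready(void)
{
    pthread_t thread;
    pid_t pid;
    int status = -1;

    atomic_store(&worker_started, false);
    if (pthread_create(&thread, NULL, worker, NULL) != 0)
        return -1;

    /* Do not fork until the new thread has actually started. */
    while (!atomic_load(&worker_started))
        sched_yield();

    pid = fork();
    if (pid == 0)
        _exit(0);   /* child: nothing to do in this sketch */

    pthread_join(thread, NULL);
    if (pid < 0)
        return -1;
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A condition variable or barrier would also work; polling with sched_yield()
just keeps the sketch short.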

[edliaw at google.com: refactored to use atomic_bool]
Link: https://lkml.kernel.org/r/20240325194100.775052-1-edliaw@google.com
Fixes: 760aee0b71e3 ("selftests/mm: add tests for RO pinning vs fork()")
Signed-off-by: Lokesh Gidra <lokeshgidra at google.com>
Signed-off-by: Edward Liaw <edliaw at google.com>
Cc: Peter Xu <peterx at redhat.com>
Cc: <stable at vger.kernel.org>
Signed-off-by: Andrew Morton <akpm at linux-foundation.org>
Delta      File
+10 -0     tools/testing/selftests/mm/uffd-unit-tests.c
+3 -0      tools/testing/selftests/mm/uffd-common.c
+2 -0      tools/testing/selftests/mm/uffd-common.h
+15 -0     3 files

Linux/linux 30af24f mm userfaultfd.c

userfaultfd: fix deadlock warning when locking src and dst VMAs

Use down_read_nested() to avoid the warning.

Link: https://lkml.kernel.org/r/20240321235818.125118-1-lokeshgidra@google.com
Fixes: 867a43a34ff8 ("userfaultfd: use per-vma locks in userfaultfd operations")
Reported-by: syzbot+49056626fe41e01f2ba7 at syzkaller.appspotmail.com
Signed-off-by: Lokesh Gidra <lokeshgidra at google.com>
Cc: Andrea Arcangeli <aarcange at redhat.com>
Cc: Axel Rasmussen <axelrasmussen at google.com>
Cc: Brian Geffon <bgeffon at google.com>
Cc: David Hildenbrand <david at redhat.com>
Cc: Hillf Danton <hdanton at sina.com>
Cc: Jann Horn <jannh at google.com> [Bug #2]
Cc: Kalesh Singh <kaleshsingh at google.com>
Cc: Lokesh Gidra <lokeshgidra at google.com>
Cc: Mike Rapoport (IBM) <rppt at kernel.org>
Cc: Nicolas Geoffray <ngeoffray at google.com>
Cc: Peter Xu <peterx at redhat.com>

    [2 lines not shown]
Delta      File
+2 -1      mm/userfaultfd.c
+2 -1      1 file

Linux/linux 549aa96 arch/hexagon/kernel vmlinux.lds.S

hexagon: vmlinux.lds.S: handle attributes section

After the linked LLVM change, the build fails with
CONFIG_LD_ORPHAN_WARN_LEVEL="error", which happens with allmodconfig:

  ld.lld: error: vmlinux.a(init/main.o):(.hexagon.attributes) is being placed in '.hexagon.attributes'

Handle the attributes section in a similar manner as arm and riscv by
adding it after the primary ELF_DETAILS grouping in vmlinux.lds.S, which
fixes the error.

Link: https://lkml.kernel.org/r/20240319-hexagon-handle-attributes-section-vmlinux-lds-s-v1-1-59855dab8872@kernel.org
Fixes: 113616ec5b64 ("hexagon: select ARCH_WANT_LD_ORPHAN_WARN")
Link: https://github.com/llvm/llvm-project/commit/31f4b329c8234fab9afa59494d7f8bdaeaefeaad
Signed-off-by: Nathan Chancellor <nathan at kernel.org>
Reviewed-by: Brian Cain <bcain at quicinc.com>
Cc: Bill Wendling <morbo at google.com>
Cc: Justin Stitt <justinstitt at google.com>
Cc: Nick Desaulniers <ndesaulniers at google.com>

    [2 lines not shown]
Delta      File
+1 -0      arch/hexagon/kernel/vmlinux.lds.S
+1 -0      1 file

Linux/linux 0a69b6b mm shmem_quota.c

tmpfs: fix race on handling dquot rbtree

A syzkaller reproducer found a race while attempting to remove dquot
information from the rb tree.

Fetching the rb_tree root node must also be protected by the
dqopt->dqio_sem, otherwise, given the right timing, shmem_release_dquot()
will trigger a warning because it couldn't find a node in the tree, when
the real reason was the root node changing before the search starts:

Thread 1                                Thread 2
- shmem_release_dquot()                 - shmem_{acquire,release}_dquot()

- fetch ROOT                            - Fetch ROOT

                                        - acquire dqio_sem
- wait dqio_sem

                                        - do something, trigger a tree rebalance

    [15 lines not shown]
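
A user-space C sketch of the rule stated above: read the tree root only
after the lock is held.  It swaps in a pthread mutex and a plain binary
search tree for dqio_sem and the kernel rbtree; the struct and function
names are hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

struct node_sketch {
    int id;
    struct node_sketch *left, *right;
};

struct tree_sketch {
    pthread_mutex_t lock;          /* stands in for dqio_sem */
    struct node_sketch *root;
};

struct node_sketch *find_locked(struct tree_sketch *tree, int id)
{
    struct node_sketch *node;

    assert(tree != NULL);

    pthread_mutex_lock(&tree->lock);
    /* Fetch the root inside the critical section.  Reading it before
     * taking the lock could leave us searching from a stale root if
     * another thread rebalanced the tree while we waited. */
    node = tree->root;
    while (node && node->id != id)
        node = id < node->id ? node->left : node->right;
    pthread_mutex_unlock(&tree->lock);

    return node;
}
```

The buggy ordering corresponds to assigning node = tree->root before the
pthread_mutex_lock() call.
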
Delta      File
+7 -3      mm/shmem_quota.c
+7 -3      1 file

Linux/linux 105840e tools/testing/selftests/mm uffd-unit-tests.c

selftests/mm: sigbus-wp test requires UFFD_FEATURE_WP_HUGETLBFS_SHMEM

The sigbus-wp test requires the UFFD_FEATURE_WP_HUGETLBFS_SHMEM flag for
shmem and hugetlb targets.  Otherwise it is not backwards compatible with
kernels <5.19 and fails with EINVAL.

Link: https://lkml.kernel.org/r/20240321232023.2064975-1-edliaw@google.com
Fixes: 73c1ea939b65 ("selftests/mm: move uffd sig/events tests into uffd unit tests")
Signed-off-by: Edward Liaw <edliaw at google.com>
Cc: Shuah Khan <shuah at kernel.org>
Cc: Peter Xu <peterx at redhat.com>
Cc: <stable at vger.kernel.org>
Signed-off-by: Andrew Morton <akpm at linux-foundation.org>
Delta      File
+2 -1      tools/testing/selftests/mm/uffd-unit-tests.c
+2 -1      1 file

Linux/linux 30fb6a8 mm zswap.c

mm: zswap: fix writeback shrinker GFP_NOIO/GFP_NOFS recursion

Kent forwards this bug report of zswap re-entering the block layer
from an IO request allocation and locking up:

[10264.128242] sysrq: Show Blocked State
[10264.128268] task:kworker/20:0H   state:D stack:0     pid:143   tgid:143   ppid:2      flags:0x00004000
[10264.128271] Workqueue: bcachefs_io btree_write_submit [bcachefs]
[10264.128295] Call Trace:
[10264.128295]  <TASK>
[10264.128297]  __schedule+0x3e6/0x1520
[10264.128303]  schedule+0x32/0xd0
[10264.128304]  schedule_timeout+0x98/0x160
[10264.128308]  io_schedule_timeout+0x50/0x80
[10264.128309]  wait_for_completion_io_timeout+0x7f/0x180
[10264.128310]  submit_bio_wait+0x78/0xb0
[10264.128313]  swap_writepage_bdev_sync+0xf6/0x150
[10264.128317]  zswap_writeback_entry+0xf2/0x180
[10264.128319]  shrink_memcg_cb+0xe7/0x2f0

    [44 lines not shown]
Delta      File
+8 -0      mm/zswap.c
+8 -0      1 file

Linux/linux d5aad4c arch/parisc/include/asm mman.h, include/linux mman.h

prctl: generalize PR_SET_MDWE support check to be per-arch

Patch series "ARM: prctl: Reject PR_SET_MDWE where not supported".

I noticed after a recent kernel update that my ARM926 system started
segfaulting on any execve() after calling prctl(PR_SET_MDWE).  After some
investigation it appears that ARMv5 is incapable of providing the
appropriate protections for MDWE, since any readable memory is also
implicitly executable.

The prctl_set_mdwe() function already had some special-case logic added
disabling it on PARISC (commit 793838138c15, "prctl: Disable
prctl(PR_SET_MDWE) on parisc"); this patch series (1) generalizes that
check to use an arch_*() function, and (2) adds a corresponding override
for ARM to disable MDWE on pre-ARMv6 CPUs.

With the series applied, prctl(PR_SET_MDWE) is rejected on ARMv5 and
subsequent execve() calls (as well as mmap(PROT_READ|PROT_WRITE)) can
succeed instead of unconditionally failing; on ARMv6 the prctl works as it

    [34 lines not shown]
Delta      File
+14 -0     arch/parisc/include/asm/mman.h
+8 -0      include/linux/mman.h
+5 -2      kernel/sys.c
+27 -2     3 files

Linux/linux 166ce84 arch/arm/include/asm mman.h

ARM: prctl: reject PR_SET_MDWE on pre-ARMv6

On v5 and lower CPUs we can't provide MDWE protection, so ensure we fail
any attempt to enable it via prctl(PR_SET_MDWE).

Previously such an attempt would misleadingly succeed, leading to any
subsequent mmap(PROT_READ|PROT_WRITE) or execve() failing unconditionally
(the latter somewhat violently via force_fatal_sig(SIGSEGV) due to
READ_IMPLIES_EXEC).
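
From user space, the change means callers should be prepared for the
prctl to fail.  A minimal C sketch follows; try_enable_mdwe() is a
hypothetical helper name, and the fallback constant values are provided
only for libc headers that predate PR_SET_MDWE.

```c
#include <assert.h>
#include <errno.h>
#include <sys/prctl.h>

/* Fallbacks for older headers; values as in the Linux uapi prctl.h. */
#ifndef PR_SET_MDWE
#define PR_SET_MDWE 65
#endif
#ifndef PR_MDWE_REFUSE_EXEC_GAIN
#define PR_MDWE_REFUSE_EXEC_GAIN 1
#endif

/* Returns 0 if MDWE was enabled, otherwise the errno reported by the
 * kernel (for example when the request is rejected as unsupported). */
int try_enable_mdwe(void)
{
    if (prctl(PR_SET_MDWE, PR_MDWE_REFUSE_EXEC_GAIN, 0L, 0L, 0L) == 0)
        return 0;
    return errno;
}
```

A caller that treats a non-zero return as "MDWE unavailable" can keep
running without the protection instead of assuming it is active.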

Link: https://lkml.kernel.org/r/20240227013546.15769-6-zev@bewilderbeest.net
Signed-off-by: Zev Weiss <zev at bewilderbeest.net>
Cc: <stable at vger.kernel.org>    [6.3+]
Cc: Borislav Petkov <bp at alien8.de>
Cc: David Hildenbrand <david at redhat.com>
Cc: Florent Revest <revest at chromium.org>
Cc: Helge Deller <deller at gmx.de>
Cc: "James E.J. Bottomley" <James.Bottomley at HansenPartnership.com>
Cc: Josh Triplett <josh at joshtriplett.org>

    [12 lines not shown]
Delta      File
+14 -0     arch/arm/include/asm/mman.h
+14 -0     1 file

Linux/linux 950bf45 tools Makefile

tools/Makefile: remove cgroup target

The tools/cgroup directory no longer contains a Makefile.  This patch
updates the top-level tools/Makefile to remove references to building and
installing cgroup components.  This change reflects the current structure
of the tools directory and fixes the build failure when building tools in
the top-level directory.

linux/tools$ make cgroup
  DESCEND cgroup
make[1]: *** No targets specified and no makefile found.  Stop.
make: *** [Makefile:73: cgroup] Error 2

Link: https://lkml.kernel.org/r/20240315012249.439639-1-liucong2@kylinos.cn
Signed-off-by: Cong Liu <liucong2 at kylinos.cn>
Acked-by: Stanislav Fomichev <sdf at google.com>
Reviewed-by: Dmitry Rokosov <ddrokosov at salutedevices.com>
Cc: Cong Liu <liucong2 at kylinos.cn>
Signed-off-by: Andrew Morton <akpm at linux-foundation.org>
Delta      File
+6 -7      tools/Makefile
+6 -7      1 file

Linux/linux db09f2d . MAINTAINERS

MAINTAINERS: remove incorrect M: tag for dm-devel at lists.linux.dev

The dm-devel at lists.linux.dev mailing list should only be listed under the
L: (List) tag in the MAINTAINERS file.  However, it was incorrectly listed
under both L: and M: (Maintainers) tags, which is not accurate.  Remove
the M: tag for dm-devel at lists.linux.dev in the MAINTAINERS file to reflect
the correct categorization.

Link: https://lkml.kernel.org/r/20240319181842.249547-1-visitorckw@gmail.com
Signed-off-by: Kuan-Wei Chiu <visitorckw at gmail.com>
Cc: Ching-Chun (Jim) Huang <jserv at ccns.ncku.edu.tw>
Cc: Matthew Sakai <msakai at redhat.com>
Cc: Michael Sclafani <dm-devel at lists.linux.dev>
Signed-off-by: Andrew Morton <akpm at linux-foundation.org>
Delta      File
+0 -1      MAINTAINERS
+0 -1      1 file

Linux/linux 9c50083 mm zswap.c

mm: zswap: fix kernel BUG in sg_init_one

sg_init_one() relies on linearly mapped low memory for the safe
utilization of virt_to_page().  Otherwise, we trigger a kernel BUG,

kernel BUG at include/linux/scatterlist.h:187!
Internal error: Oops - BUG: 0 [#1] PREEMPT SMP ARM
Modules linked in:
CPU: 0 PID: 2997 Comm: syz-executor198 Not tainted 6.8.0-syzkaller #0
Hardware name: ARM-Versatile Express
PC is at sg_set_buf include/linux/scatterlist.h:187 [inline]
PC is at sg_init_one+0x9c/0xa8 lib/scatterlist.c:143
LR is at sg_init_table+0x2c/0x40 lib/scatterlist.c:128
Backtrace:
[<807e16ac>] (sg_init_one) from [<804c1824>] (zswap_decompress+0xbc/0x208 mm/zswap.c:1089)
 r7:83471c80 r6:def6d08c r5:844847d0 r4:ff7e7ef4
[<804c1768>] (zswap_decompress) from [<804c4468>] (zswap_load+0x15c/0x198 mm/zswap.c:1637)
 r9:8446eb80 r8:8446eb80 r7:8446eb84 r6:def6d08c r5:00000001 r4:844847d0
[<804c430c>] (zswap_load) from [<804b9644>] (swap_read_folio+0xa8/0x498 mm/page_io.c:518)

    [56 lines not shown]
Delta      File
+12 -2     mm/zswap.c
+12 -2     1 file

Linux/linux c52eb6d tools/testing/selftests/mm protection_keys.c

selftests: mm: restore settings from only parent process

The atexit() handler runs in the parent process as well as in forked
processes.  Hence a child restores the settings at exit while the parent
is still executing.  Fix this by checking the pid of the process running
the atexit() handler and only restoring the THP number from the parent.
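
A minimal C sketch of that check, assuming the parent's pid is recorded
when the handler is registered.  The names are hypothetical, and the
counter is a placeholder for the real restore work.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

static pid_t parent_pid;
int restore_calls;   /* counts how often the restore body really ran */

void restore_settings(void)
{
    /* Forked children inherit the atexit() registration; only the
     * process that registered the handler may restore the settings. */
    if (getpid() != parent_pid)
        return;
    restore_calls++;   /* placeholder for writing the saved value back */
}

int register_restore(void)
{
    parent_pid = getpid();
    return atexit(restore_settings);
}
```

A child created with fork() after register_restore() sees a different
getpid() value, so its exit leaves the settings untouched.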

Link: https://lkml.kernel.org/r/20240314094045.157149-1-usama.anjum@collabora.com
Fixes: c23ea61726d5 ("selftests/mm: protection_keys: save/restore nr_hugepages settings")
Signed-off-by: Muhammad Usama Anjum <usama.anjum at collabora.com>
Tested-by: Joey Gouly <joey.gouly at arm.com>
Cc: Shuah Khan <shuah at kernel.org>
Signed-off-by: Andrew Morton <akpm at linux-foundation.org>
Delta      File
+5 -1      tools/testing/selftests/mm/protection_keys.c
+5 -1      1 file

Linux/linux 9cecde8 include/linux pagevec.h

mm: increase folio batch size

On a 104 thread, 2 socket Skylake system, Intel report a 4.7% performance
reduction with will-it-scale page_fault2.  This was due to reducing the
size of the batch from 32 to 15.  Increasing the folio batch size from 15
to 31 gives a performance increase of 12.5% relative to the original, or
17.2% relative to the reduced performance commit.

The penalty of this commit is an additional 128 bytes of stack usage.  Six
folio_batches are also allocated from percpu memory in cpu_fbatches so
that will be an additional 768 bytes of percpu memory (per CPU).  Tim Chen
originally submitted a patch like this in 2020:
https://lore.kernel.org/linux-mm/d1cc9f12a8ad6c2a52cb600d93b06b064f2bbc57.1593205965.git.tim.c.chen@linux.intel.com/
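
The memory figures above follow from the batch sizes.  A small C sketch of
the arithmetic, assuming each batch slot is one pointer and pointers are
8 bytes (a 64-bit build); the helper names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

enum { OLD_BATCH = 15, NEW_BATCH = 31, PERCPU_BATCHES = 6 };

static_assert(NEW_BATCH > OLD_BATCH, "the batch is expected to grow");

/* Extra bytes needed by one batch when it grows from 15 to 31 slots. */
size_t extra_bytes_per_batch(size_t pointer_size)
{
    return (size_t)(NEW_BATCH - OLD_BATCH) * pointer_size;
}

/* Extra per-CPU bytes for the six batches kept in cpu_fbatches. */
size_t extra_percpu_bytes(size_t pointer_size)
{
    return PERCPU_BATCHES * extra_bytes_per_batch(pointer_size);
}
```

With 8-byte pointers, extra_bytes_per_batch(8) is 16 * 8 = 128 and
extra_percpu_bytes(8) is 6 * 128 = 768, matching the numbers quoted in the
commit message.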

Link: https://lkml.kernel.org/r/20240315140823.2478146-1-willy@infradead.org
Fixes: 99fbb6bfc16f ("mm: make folios_put() the basis of release_pages()")
Signed-off-by: Matthew Wilcox (Oracle) <willy at infradead.org>
Tested-by: Yujie Liu <yujie.liu at intel.com>
Reported-by: kernel test robot <oliver.sang at intel.com>

    [2 lines not shown]
Delta      File
+2 -2      include/linux/pagevec.h
+2 -2      1 file

Linux/linux d5d39c7 mm filemap.c

mm: cachestat: fix two shmem bugs

When cachestat on shmem races with swapping and invalidation, there
are two possible bugs:

1) A swapin error can have resulted in a poisoned swap entry in the
   shmem inode's xarray. Calling get_shadow_from_swap_cache() on it
   will result in an out-of-bounds access to swapper_spaces[].

   Validate the entry with non_swap_entry() before going further.

2) When we find a valid swap entry in the shmem's inode, the shadow
   entry in the swapcache might not exist yet: swap IO is still in
   progress and we're before __remove_mapping; swapin, invalidation,
   or swapoff have removed the shadow from swapcache after we saw the
   shmem swap entry.

   This will send a NULL to workingset_test_recent(). The latter
   purely operates on pointer bits, so it won't crash - node 0, memcg

    [18 lines not shown]
Delta      File
+16 -0     mm/filemap.c
+16 -0     1 file

Linux/linux 3290032 . .mailmap

mailmap: update entry for Leonard Crestez

Put my personal email first because NXP employment ended some time ago.
Also add my old Intel email address.

Link: https://lkml.kernel.org/r/f568faa0-2380-4e93-a312-b80c1e367645@gmail.com
Signed-off-by: Leonard Crestez <cdleonard at gmail.com>
Cc: Florian Fainelli <f.fainelli at gmail.com>
Signed-off-by: Andrew Morton <akpm at linux-foundation.org>
Delta      File
+2 -1      .mailmap
+2 -1      1 file

Linux/linux 7844c01 mm page_owner.c

mm,page_owner: fix recursion

Prior to 217b2119b9e2 ("mm,page_owner: implement the tracking of the
stacks count") the only place where page_owner could potentially go into
recursion due to its need of allocating more memory was in save_stack(),
which ends up calling into stackdepot code with the possibility of
allocating memory.

We made sure to guard against that by signaling that the current task was
already in page_owner code, so in case a recursion attempt was made, we
could catch that and return dummy_handle.

After the above commit, a new place in page_owner code was introduced where
we could allocate memory, meaning we could go into recursion should we take
that path.

Make sure to signal that we are in page_owner in that codepath as well.
Move the guard code into two helpers {un}set_current_in_page_owner() and
use them prior to calling into the two functions that might allocate memory.
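
A user-space C sketch of such a re-entrancy guard, assuming a per-thread
flag in place of the per-task one.  The helper names echo the ones in the
message but everything here is a hypothetical stand-in, not the code in
mm/page_owner.c.

```c
#include <assert.h>
#include <stdbool.h>

static _Thread_local bool in_tracker;   /* "current is in page_owner" */
int tracking_runs;                      /* times the tracked body ran */

static void set_in_tracker(void)   { in_tracker = true; }
static void unset_in_tracker(void) { in_tracker = false; }

/* allocations_inside simulates allocations made while tracking, each of
 * which would re-enter the tracker if nothing stopped it. */
void track_allocation(int allocations_inside)
{
    assert(allocations_inside >= 0);

    if (in_tracker)
        return;              /* re-entered from our own allocation */

    set_in_tracker();
    tracking_runs++;
    while (allocations_inside-- > 0)
        track_allocation(0); /* nested call returns early at the guard */
    unset_in_tracker();
}
```

Wrapping every path that may allocate in the same set/unset pair is what
keeps a newly added allocation site from reintroducing the recursion.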

    [11 lines not shown]
Delta      File
+23 -10    mm/page_owner.c
+23 -10    1 file

Linux/linux 4624b34 init initramfs.c

init: open /initrd.image with O_LARGEFILE

If initrd data is larger than 2GB, we'll eventually fail to write to the
/initrd.image file when we hit that limit, unless O_LARGEFILE is set.

Link: https://lkml.kernel.org/r/20240317221522.896040-1-jsperbeck@google.com
Signed-off-by: John Sperbeck <jsperbeck at google.com>
Cc: Jens Axboe <axboe at kernel.dk>
Cc: Nick Desaulniers <ndesaulniers at google.com>
Cc: Peter Zijlstra <peterz at infradead.org>
Cc: Thomas Gleixner <tglx at linutronix.de>
Cc: <stable at vger.kernel.org>
Signed-off-by: Andrew Morton <akpm at linux-foundation.org>
Delta      File
+1 -1      init/initramfs.c
+1 -1      1 file

Linux/linux f857236 mm memory.c

mm/memory: fix missing pte marker for !page on pte zaps

Commit 0cf18e839f64 of large folio zap work broke uffd-wp.  Now mm's uffd
unit test "wp-unpopulated" will trigger this WARN_ON_ONCE().

The WARN_ON_ONCE() asserts that a VMA cannot be registered with
userfaultfd-wp if it contains a !normal page, but it's actually possible.
One example is an anonymous VMA registered with uffd-wp: reading anything
will install a zero page.  Zapping it then triggers the warning.

What's more, removing that WARN_ON_ONCE may not be enough either, because
we should also not rely on "whether it's a normal page" to decide whether
pte marker is needed.  For example, one can register wr-protect over some
DAX regions to track writes when UFFD_FEATURE_WP_ASYNC is enabled, in which
case it can have page==NULL for a devmap but we may want to keep the
marker around.

Link: https://lkml.kernel.org/r/20240313213107.235067-1-peterx@redhat.com
Fixes: 0cf18e839f64 ("mm/memory: handle !page case in zap_present_pte() separately")

    [4 lines not shown]
Delta      File
+3 -1      mm/memory.c
+3 -1      1 file

Linux/linux 8b65ef5 tools/testing/selftests/mm gup_test.c soft-dirty.c

selftests/mm: Fix build with _FORTIFY_SOURCE

Add missing mode argument to open(2) call with O_CREAT.

Some tests fail to compile if _FORTIFY_SOURCE is defined (to any valid
value) (together with -O), resulting in similar error messages such as:

  In file included from /usr/include/fcntl.h:342,
                   from gup_test.c:1:
  In function 'open',
      inlined from 'main' at gup_test.c:206:10:
  /usr/include/bits/fcntl2.h:50:11: error: call to '__open_missing_mode' declared with attribute error: open with O_CREAT or O_TMPFILE in second argument needs 3 arguments
     50 |           __open_missing_mode ();
        |           ^~~~~~~~~~~~~~~~~~~~~~

_FORTIFY_SOURCE is enabled by default in some distributions, so the
tests are not built by default and are skipped.

open(2) man-page warns about missing mode argument: "if it is not

    [16 lines not shown]
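
A minimal C sketch of the corrected call shape: whenever O_CREAT is
passed, open(2) takes a third argument giving the file mode.  The helper
name and the 0600 mode are assumptions for illustration; the selftests may
use a different mode.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/* Creates (or opens) path for reading and writing.
 * Returns a file descriptor, or -1 with errno set. */
int create_scratch_file(const char *path)
{
    assert(path != NULL);
    /* With O_CREAT the mode argument is required; omitting it is what
     * _FORTIFY_SOURCE rejects at compile time. */
    return open(path, O_CREAT | O_RDWR, 0600);
}
```

Usage: call create_scratch_file() with a writable path, then close() the
descriptor and unlink() the file once the test is done.
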
Delta      File
+1 -1      tools/testing/selftests/mm/gup_test.c
+1 -1      tools/testing/selftests/mm/soft-dirty.c
+1 -1      tools/testing/selftests/mm/split_huge_page_test.c
+3 -3      3 files

Linux/linux 7033999 kernel/printk printk.c

Merge tag 'printk-for-6.9-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux

Pull printk fix from Petr Mladek:

 - Prevent scheduling in an atomic context when printk() takes over the
   console flushing duty

* tag 'printk-for-6.9-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux:
  printk: Update @console_may_schedule in console_trylock_spinning()
Delta      File
+6 -0      kernel/printk/printk.c
+6 -0      1 file