On MacOS, UI event loop has to be ran in the main thread of a process.
Because of that restriction, on this platform, qemu main event loop is
ran on another thread [1].
This breaks record/replay feature, which expects thread running qemu_init
to initialize hold this lock, breaking associated functional tests on
MacOS.
Thus, as a generalization, and similar to how BQL is handled, we release
it after init, and reacquire the lock before entering main event loop,
avoiding a special case if a separate thread is used.
Tested on MacOS with:
$ meson test -C build --setup thorough --print-errorlogs \
func-x86_64-x86_64_replay func-arm-arm_replay func-aarch64-aarch64_replay
$ ./build/qemu-system-x86_64 -nographic -icount shift=auto,rr=record,rrfile=replay.log
$ ./build/qemu-system-x86_64 -nographic -icount shift=auto,rr=replay,rrfile=replay.log
[1] f5ab12caba
Fixes: https://gitlab.com/qemu-project/qemu/-/issues/2907
Signed-off-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Tested-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20250410225550.46807-2-pierrick.bouvier@linaro.org>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
With aux-ram-share=off, booting an SNP guest fails with:
../util/error.c:68: error_setv: Assertion `*errp == NULL' failed.
This is because a CPR blocker for the guest_memfd ramblock is added
twice, once in ram_block_add_cpr_blocker because aux-ram-share=off so
rb->fd < 0, and once in ram_block_add for a specific guest_memfd blocker.
To fix, add the guest_memfd blocker iff a generic one would not be
added by ram_block_add_cpr_blocker.
Fixes: 094a3dbc55df ("migration: ram block cpr blockers")
Reported-by: Tom Lendacky <thomas.lendacky@amd.com>
Reported-by: Michael Roth <michael.roth@amd.com>
Tested-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Message-ID: <1743087130-429075-1-git-send-email-steven.sistare@oracle.com>
[reword subject line]
Signed-off-by: Fabiano Rosas <farosas@suse.de>
In the past a single AioContext was used for block I/O and it was
fetched using blk_get_aio_context(). Nowadays the block layer supports
running I/O from any AioContext and multiple AioContexts at the same
time. Remove the dma_blk_io() AioContext argument and use the current
AioContext instead.
This makes calling the function easier and enables multiple IOThreads to
use dma_blk_io() concurrently for the same block device.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20250311132616.1049687-3-stefanha@redhat.com>
Tested-by: Peter Krempa <pkrempa@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
qemu_arch_available() is a bit simpler to understand while
reviewing than the undocumented arch_type variable.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20250305005225.95051-5-philmd@linaro.org>
We shouldn't use target specific globals for machine properties.
These ones could be desugarized, as explained in [*]. While
certainly doable, not trivial nor my priority for now. Just move
them to a different file to clarify they are *globals*, like the
generic globals residing in system/globals.c.
Since arch_init.c was introduced using the MIT license (see commit
ad96090a01d), retain the same license for the new globals-target.c
file.
[*] https://lore.kernel.org/qemu-devel/e514d6db-781d-4afe-b057-9046c70044dc@redhat.com/
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20250305005225.95051-2-philmd@linaro.org>
Unlike cpr-reboot mode, cpr-transfer mode cannot save volatile ram blocks
in the migration stream file and recreate them later, because the physical
memory for the blocks is pinned and registered for vfio. Add a blocker
for volatile ram blocks.
Also add a blocker for RAM_GUEST_MEMFD. Preserving guest_memfd may be
sufficient for CPR, but it has not been tested yet.
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Message-ID: <1740667681-257312-1-git-send-email-steven.sistare@oracle.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Now that watchpoint.c uses cputlb.h instead of exec-all.h,
it can be built once.
Tested-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Move CPU TLB related methods to "exec/cputlb.h".
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Message-ID: <20241114011310.3615-19-philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Move CPU TLB related methods to "exec/cputlb.h".
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Message-ID: <20241114011310.3615-14-philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Mostly revert commit c80cafa0c73 ("system: Add qemu_init_arch_modules")
but using target_name() instead of the target specific 'TARGET_NAME'
definition.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20250305005225.95051-3-philmd@linaro.org>
The heavily imported "system/cpus.h" header includes "accel-ops.h"
to get AccelOpsClass type declaration. Reduce headers pressure by
forward declaring it in "qemu/typedefs.h", where we already
declare the AccelCPUState type.
Reduce "system/cpus.h" inclusions by only including
"system/accel-ops.h" when necessary.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20250123234415.59850-14-philmd@linaro.org>
TCGCPUOps structure makes more sense in the accelerator context
rather than hardware emulation. Move it under the accel/tcg/ scope.
Mechanical change doing:
$ sed -i -e 's,hw/core/tcg-cpu-ops.h,accel/tcg/cpu-ops.h,g' \
$(git grep -l hw/core/tcg-cpu-ops.h)
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20250123234415.59850-11-philmd@linaro.org>
Since commit 740b1759734 ("cpu-timers, icount: new modules")
we don't need to expose icount_align_option to all the
system code, we can restrict it to TCG. Since it is used as
a boolean, declare it as 'bool' type.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20250123234415.59850-10-philmd@linaro.org>
Currently if the user requests via -machine dumpdtb=file.dtb that we
dump the DTB, but the machine doesn't have a DTB, we silently ignore
the option. This is confusing to users, and is a legacy of the old
board-specific implementation of the option, where if the execution
codepath didn't go via a call to qemu_fdt_dumpdtb() we would never
handle the option.
Now we handle the option in one place in machine.c, we can provide
the user with a useful message if they asked us to dump a DTB when
none exists. qmp_dumpdtb() already produces this error; remove the
logic in handle_machine_dumpdtb() that was there specifically to
avoid hitting it.
While we're here, beef up the error message a bit with a hint, and
make it consistent about "an FDT" rather than "a FDT". (In the
qmp_dumpdtb() case this needs an ERRP_GUARD to make
error_append_hint() work when the caller passes error_fatal.)
Note that the three places where we might report "doesn't have an
FDT" are hit in different situations:
(1) in handle_machine_dumpdtb(), if CONFIG_FDT is not set: this is
because the QEMU binary was built without libfdt at all. The
build system will not let you build with a machine type that
needs an FDT but no libfdt, so here we know both that the machine
doesn't use FDT and that QEMU doesn't have the support:
(2) in the device_tree-stub.c qmp_dumpdtb(): this is used when
we had libfdt at build time but the target architecture didn't
enable any machines which did "select DEVICE_TREE", so here we
know that the machine doesn't use FDT.
(3) in qmp_dumpdtb(), if current_machine->fdt is NULL all we know
is that this machine never set it. That might be because it doesn't
use FDT, or it might be because the user didn't pass an FDT
on the command line and the machine doesn't autogenerate an FDT.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2733
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20250206151214.2947842-7-peter.maydell@linaro.org
It is possible to start QEMU with a confidential-guest-support object
even in TCG mode. While there is already a check in qemu_machine_creation_done:
if (machine->cgs && !machine->cgs->ready) {
error_setg(errp, "accelerator does not support confidential guest %s",
object_get_typename(OBJECT(machine->cgs)));
exit(1);
}
the creation of RAMBlocks happens earlier, in qemu_init_board(), if
the command line does not override the default memory backend with
-M memdev. Then the RAMBlock will try to use guest_memfd (because
machine_require_guest_memfd correctly returns true; at least correctly
according to the current implementation) and trigger the assertion
failure for kvm_enabled(). This happend with a command line as
simple as the following:
qemu-system-x86_64 -m 512 -nographic -object sev-snp-guest,reduced-phys-bits=48,id=sev0 \
-M q35,kernel-irqchip=split,confidential-guest-support=sev0
qemu-system-x86_64: ../system/physmem.c:1871: ram_block_add: Assertion `kvm_enabled()' failed.
Cc: Xiaoyao Li <xiaoyao.li@intel.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Pankaj Gupta <pankaj.gupta@amd.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Link: https://lore.kernel.org/r/20250217120812.396522-1-pbonzini@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Currently we handle the 'dumpdtb' machine sub-option ad-hoc in every
board model that has an FDT. It's up to the board code to make sure
it calls qemu_fdt_dumpdtb() in the right place.
This means we're inconsistent and often just ignore the user's
command line argument:
* if the board doesn't have an FDT at all
* if the board supports FDT, but there happens not to be one
present (usually because of a missing -fdt option)
This isn't very helpful because it gives the user no clue why their
option was ignored.
However, in order to support the QMP/HMP dumpdtb commands we require
now that every FDT machine stores a pointer to the FDT in
MachineState::fdt. This means we can handle -machine dumpdtb
centrally by calling the qmp_dumpdtb() function, unifying its
handling with the QMP/HMP commands. All the board code calls to
qemu_fdt_dumpdtb() can then be removed.
For this commit we retain the existing behaviour that if there
is no FDT we silently ignore the -machine dumpdtb option.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
v2 changelog:
- Fix Mac (and possibly some other) build issues for two patches
- os: add an ability to lock memory on_fault
- memory: pass MemTxAttrs to memory_access_is_direct()
List of features:
- William's fix on ram hole punching when with file offset
- Daniil's patchset to introduce mem-lock=on-fault
- William's hugetlb hwpoison fix for size report & remap
- David's series to allow qemu debug writes to MMIOs
-----BEGIN PGP SIGNATURE-----
iIgEABYKADAWIQS5GE3CDMRX2s990ak7X8zN86vXBgUCZ6zcQBIccGV0ZXJ4QHJl
ZGhhdC5jb20ACgkQO1/MzfOr1wbL3wEAqx94NpB/tEEBj6WXE3uV9LqQ0GCTYmV+
MbM51Vep8ksA/35yFn3ltM2yoSnUf9WJW6LXEEKhQlwswI0vChQERgkE
=++O1
-----END PGP SIGNATURE-----
Merge tag 'mem-next-pull-request' of https://gitlab.com/peterx/qemu into staging
Memory pull request for 10.0
v2 changelog:
- Fix Mac (and possibly some other) build issues for two patches
- os: add an ability to lock memory on_fault
- memory: pass MemTxAttrs to memory_access_is_direct()
List of features:
- William's fix on ram hole punching when with file offset
- Daniil's patchset to introduce mem-lock=on-fault
- William's hugetlb hwpoison fix for size report & remap
- David's series to allow qemu debug writes to MMIOs
# -----BEGIN PGP SIGNATURE-----
#
# iIgEABYKADAWIQS5GE3CDMRX2s990ak7X8zN86vXBgUCZ6zcQBIccGV0ZXJ4QHJl
# ZGhhdC5jb20ACgkQO1/MzfOr1wbL3wEAqx94NpB/tEEBj6WXE3uV9LqQ0GCTYmV+
# MbM51Vep8ksA/35yFn3ltM2yoSnUf9WJW6LXEEKhQlwswI0vChQERgkE
# =++O1
# -----END PGP SIGNATURE-----
# gpg: Signature made Thu 13 Feb 2025 01:37:04 HKT
# gpg: using EDDSA key B9184DC20CC457DACF7DD1A93B5FCCCDF3ABD706
# gpg: issuer "peterx@redhat.com"
# gpg: Good signature from "Peter Xu <xzpeter@gmail.com>" [full]
# gpg: aka "Peter Xu <peterx@redhat.com>" [full]
# Primary key fingerprint: B918 4DC2 0CC4 57DA CF7D D1A9 3B5F CCCD F3AB D706
* tag 'mem-next-pull-request' of https://gitlab.com/peterx/qemu:
overcommit: introduce mem-lock=on-fault
system: introduce a new MlockState enum
system/vl: extract overcommit option parsing into a helper
os: add an ability to lock memory on_fault
system/physmem: poisoned memory discard on reboot
system/physmem: handle hugetlb correctly in qemu_ram_remap()
physmem: teach cpu_memory_rw_debug() to write to more memory regions
hmp: use cpu_get_phys_page_debug() in hmp_gva2gpa()
memory: pass MemTxAttrs to memory_access_is_direct()
physmem: disallow direct access to RAM DEVICE in address_space_write_rom()
physmem: factor out direct access check into memory_region_supports_direct_access()
physmem: factor out RAM/ROMD check in memory_access_is_direct()
physmem: factor out memory_region_is_ram_device() check in memory_access_is_direct()
system/physmem: take into account fd_offset for file fallocate
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Using the auto_create_sdcard feature without SD Bus is irrelevant.
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20250204200934.65279-8-philmd@linaro.org>
Invert the 'no_sdcard' logic, renaming it as the more explicit
"auto_create_sdcard". Machines are supposed to create a SD Card
drive when this flag is set. In many cases it doesn't make much
sense (as boards don't expose SD Card host controller), but this
is patch only aims to expose that nonsense; so no logical change
intended (mechanical patch using gsed).
Most of the changes are:
- mc->no_sdcard = ON_OFF_AUTO_OFF;
+ mc->auto_create_sdcard = true;
Except in
. hw/core/null-machine.c
. hw/arm/xilinx_zynq.c
. hw/s390x/s390-virtio-ccw.c
where the disabled option is manually removed (since default):
- mc->no_sdcard = ON_OFF_AUTO_ON;
+ mc->auto_create_sdcard = false;
- mc->auto_create_sdcard = false;
and in system/vl.c we change the 'default_sdcard' type to boolean.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20250204200934.65279-4-philmd@linaro.org>
Update MachineClass::no_sdcard default implicit AUTO
initialization to explicit OFF. This flag is consumed
in system/vl.c::qemu_disable_default_devices(). Use
this place to assert we don't have anymore AUTO state.
In hw/ppc/e500.c we add the ppce500_machine_class_init()
method to initialize once all the inherited classes.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20250204200934.65279-3-philmd@linaro.org>
MachineClass::no_sdcard is initialized as false by default.
To catch all uses, convert it to a tri-state, having the
current default (false) becoming AUTO.
No logical change intended.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20250204200934.65279-2-philmd@linaro.org>
Locking the memory without MCL_ONFAULT instantly prefaults any mmaped
anonymous memory with a write-fault, which introduces a lot of extra
overhead in terms of memory usage when all you want to do is to prevent
kcompactd from migrating and compacting QEMU pages. Add an option to
only lock pages lazily as they're faulted by the process by using
MCL_ONFAULT if asked.
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Daniil Tatianin <d-tatianin@yandex-team.ru>
Link: https://lore.kernel.org/r/20250212143920.1269754-5-d-tatianin@yandex-team.ru
Signed-off-by: Peter Xu <peterx@redhat.com>
Replace the boolean value enable_mlock with an enum and add a helper to
decide whether we should be calling os_mlock.
This is a stepping stone towards introducing a new mlock mode, which
will be the third possible state of this enum.
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Daniil Tatianin <d-tatianin@yandex-team.ru>
Link: https://lore.kernel.org/r/20250212143920.1269754-4-d-tatianin@yandex-team.ru
Signed-off-by: Peter Xu <peterx@redhat.com>
This will be extended in the future commits, let's move it out of line
right away so that it's easier to read.
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Daniil Tatianin <d-tatianin@yandex-team.ru>
Link: https://lore.kernel.org/r/20250212143920.1269754-3-d-tatianin@yandex-team.ru
Signed-off-by: Peter Xu <peterx@redhat.com>
This will be used in the following commits to make it possible to only
lock memory on fault instead of right away.
Signed-off-by: Daniil Tatianin <d-tatianin@yandex-team.ru>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Link: https://lore.kernel.org/r/20250212143920.1269754-2-d-tatianin@yandex-team.ru
[peterx: fail os_mlock(on_fault=1) when not supported]
[peterx: use G_GNUC_UNUSED instead of "(void)on_fault", per Dan]
Signed-off-by: Peter Xu <peterx@redhat.com>
Repair poisoned memory location(s), calling ram_block_discard_range():
punching a hole in the backend file when necessary and regenerating
a usable memory.
If the kernel doesn't support the madvise calls used by this function
and we are dealing with anonymous memory, fall back to remapping the
location(s).
Signed-off-by: William Roche <william.roche@oracle.com>
Acked-by: David Hildenbrand <david@redhat.com>
Link: https://lore.kernel.org/r/20250211212707.302391-3-william.roche@oracle.com
Signed-off-by: Peter Xu <peterx@redhat.com>
The list of hwpoison pages used to remap the memory on reset
is based on the backend real page size.
To correctly handle hugetlb, we must mmap(MAP_FIXED) a complete
hugetlb page; hugetlb pages cannot be partially mapped.
Signed-off-by: William Roche <william.roche@oracle.com>
Co-developed-by: David Hildenbrand <david@redhat.com>
Acked-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Link: https://lore.kernel.org/r/20250211212707.302391-2-william.roche@oracle.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Right now, we only allow for writing to memory regions that allow direct
access using memcpy etc; all other writes are simply ignored. This
implies that debugging guests will not work as expected when writing
to MMIO device regions.
Let's extend cpu_memory_rw_debug() to write to more memory regions,
including MMIO device regions. Reshuffle the condition in
memory_access_is_direct() to make it easier to read and add a comment.
While this change implies that debug access can now also write to MMIO
devices, we now are also permit ELF image loads and similar users of
cpu_memory_rw_debug() to write to MMIO devices; currently we ignore
these writes.
Peter assumes [1] that there's probably a class of guest images, which
will start writing junk (likely zeroes) into device model registers; we
previously would silently ignore any such bogus ELF sections. Likely
these images are of questionable correctness and this can be ignored. If
ever a problem, we could make these cases use address_space_write_rom()
instead, which is left unchanged for now.
This patch is based on previous work by Stefan Zabka.
[1] https://lore.kernel.org/all/CAFEAcA_2CEJKFyjvbwmpt=on=GgMVamQ5hiiVt+zUr6AY3X=Xg@mail.gmail.com/
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/213
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Link: https://lore.kernel.org/r/20250210084648.33798-8-david@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
We want to pass another flag that will be stored in MemTxAttrs. So pass
MemTxAttrs directly.
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Link: https://lore.kernel.org/r/20250210084648.33798-6-david@redhat.com
[peterx: Fix MacOS builds]
Signed-off-by: Peter Xu <peterx@redhat.com>
As documented in commit 4a2e242bbb306 ("memory: Don't use memcpy for
ram_device regions"), we disallow direct access to RAM DEVICE regions.
This change implies that address_space_write_rom() and
cpu_memory_rw_debug() won't be able to write to RAM DEVICE regions. It
will also affect cpu_flush_icache_range(), but it's only used by
hw/core/loader.c after writing to ROM, so it is expected to not apply
here with RAM DEVICE.
This fixes direct access to these regions where we don't want direct
access. We'll extend cpu_memory_rw_debug() next to also be able to write to
these (and IO) regions.
This is a preparation for further changes.
Cc: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Link: https://lore.kernel.org/r/20250210084648.33798-5-david@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Punching a hole in a file with fallocate needs to take into account the
fd_offset value for a correct file location.
But guest_memfd internal use doesn't currently consider fd_offset.
Fixes: 4b870dc4d0c0 ("hostmem-file: add offset option")
Signed-off-by: William Roche <william.roche@oracle.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Link: https://lore.kernel.org/r/20250122194053.3103617-2-william.roche@oracle.com
Signed-off-by: Peter Xu <peterx@redhat.com>
- add a check-rust test to docker builds
- re-factor the qtest logic to be cleaner
- fix tests to not clock_step when no timers enabled
- roll-up log prefix into qtest_send
- cleaner error reporting when qtest_clock_set fails
- revert old deadlock fix now tests are updated
- only run full set of migration tests under HW acceleration
- support late attachment to user-mode gdbstubs
-----BEGIN PGP SIGNATURE-----
iQEzBAABCgAdFiEEZoWumedRZ7yvyN81+9DbCVqeKkQFAmeqBSsACgkQ+9DbCVqe
KkQS/Af+K0hpdGc1msiuMsqmuESBvhoQniYZFLN1/pwe2KpG8i/+fq2fsCuxJhJ1
2TzPH7aj54p9MGCZf2k9JLhO22XldN+oezZMc1crhoWK0AtrWhnLs58I2oEPIsUo
NmGO6Zfm98ge89o2y8GCvd0QXAtUf+jduDKnW0mfnOnw+w/mky5KzWS7/1091VGW
42LSY4KnqgdLSqLyuLBOrgADEjB1ChWS4/bSC+kEYSGrmNQB+n1KeIzzlJBGpOr0
Z9yzmhMCm7TWdkFNPmnVfYH/7ZUNcpv6PtQSpkku4f6b/gybyvJBknHpM4i+Gpb5
87wSjljrCpdNm/9KFRjiJuUWdS/jCg==
=UF0n
-----END PGP SIGNATURE-----
Merge tag 'pull-10.0-testing-and-gdstub-updates-100225-1' of https://gitlab.com/stsquad/qemu into staging
testing and gdbstub updates:
- add a check-rust test to docker builds
- re-factor the qtest logic to be cleaner
- fix tests to not clock_step when no timers enabled
- roll-up log prefix into qtest_send
- cleaner error reporting when qtest_clock_set fails
- revert old deadlock fix now tests are updated
- only run full set of migration tests under HW acceleration
- support late attachment to user-mode gdbstubs
# -----BEGIN PGP SIGNATURE-----
#
# iQEzBAABCgAdFiEEZoWumedRZ7yvyN81+9DbCVqeKkQFAmeqBSsACgkQ+9DbCVqe
# KkQS/Af+K0hpdGc1msiuMsqmuESBvhoQniYZFLN1/pwe2KpG8i/+fq2fsCuxJhJ1
# 2TzPH7aj54p9MGCZf2k9JLhO22XldN+oezZMc1crhoWK0AtrWhnLs58I2oEPIsUo
# NmGO6Zfm98ge89o2y8GCvd0QXAtUf+jduDKnW0mfnOnw+w/mky5KzWS7/1091VGW
# 42LSY4KnqgdLSqLyuLBOrgADEjB1ChWS4/bSC+kEYSGrmNQB+n1KeIzzlJBGpOr0
# Z9yzmhMCm7TWdkFNPmnVfYH/7ZUNcpv6PtQSpkku4f6b/gybyvJBknHpM4i+Gpb5
# 87wSjljrCpdNm/9KFRjiJuUWdS/jCg==
# =UF0n
# -----END PGP SIGNATURE-----
# gpg: Signature made Mon 10 Feb 2025 08:54:51 EST
# gpg: using RSA key 6685AE99E75167BCAFC8DF35FBD0DB095A9E2A44
# gpg: Good signature from "Alex Bennée (Master Work Key) <alex.bennee@linaro.org>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 6685 AE99 E751 67BC AFC8 DF35 FBD0 DB09 5A9E 2A44
* tag 'pull-10.0-testing-and-gdstub-updates-100225-1' of https://gitlab.com/stsquad/qemu:
tests/tcg: Add late gdbstub attach test
docs/user: Document the %d placeholder and suspend=n QEMU_GDB features
gdbstub: Allow late attachment
osdep: Introduce qemu_kill_thread()
user: Introduce host_interrupt_signal
user: Introduce user/signal.h
gdbstub: Try unlinking the unix socket before binding
gdbstub: Allow the %d placeholder in the socket path
tests/qtest/migration: Pick smoke tests
tests/qtest/migration: Add --full option
Revert "util/timer: avoid deadlock when shutting down"
tests/qtest: tighten up the checks on clock_step
tests/qtest: rename qtest_send_prefix and roll-up into qtest_send
tests/qtest: simplify qtest_process_inbuf
tests/qtest: don't step clock at start of npcm7xx periodic IRQ test
tests/qtest: don't attempt to clock_step while waiting for virtio ISR
tests/docker: replicate the check-rust-tools-nightly CI job
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The general expectation is that header files should follow the same
file/path naming scheme as the corresponding source file. There are
various historical exceptions to this practice in QEMU, with one of
the most notable being the include/qapi/qmp/ directory.
include/qapi/qmp/dispatch.h corresponds mostly to qapi/qmp-registry.c.
Move and rename it to include/qapi/qmp-registry.h.
Now just qerror.h is left in include/qapi/qmp/. Since it's deprecated
& (slowly) getting eliminated anyway, it isn't worth moving.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20241118151235.2665921-3-armbru@redhat.com>
The general expectation is that header files should follow the same
file/path naming scheme as the corresponding source file. There are
various historical exceptions to this practice in QEMU, with one of
the most notable being the include/qapi/qmp/ directory. Most of the
headers there correspond to source files in qobject/.
This patch corrects most of that inconsistency by creating
include/qobject/ and moving the headers for qobject/ there.
This also fixes MAINTAINERS for include/qapi/qmp/dispatch.h:
scripts/get_maintainer.pl now reports "QAPI" instead of "No
maintainers found".
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Acked-by: Halil Pasic <pasic@linux.ibm.com> #s390x
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20241118151235.2665921-2-armbru@redhat.com>
[Rebased]
It is invalid to call clock_step with an implied time to step forward
as if no timers are running we won't be able to advance.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20250207153112.3939799-7-alex.bennee@linaro.org>
qtest_send_prefix never actually sent something over the chardev, all
it does is print the timestamp to the QTEST_LOG when enabled. So
rename the function, make it static, remove the unused CharDev and
simplify all the call sites by handling that directly with
qtest_send (and qtest_log_send).
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20250207153112.3939799-6-alex.bennee@linaro.org>
Don't both creating a GString to temporarily hold our qtest command.
Instead do a simpler g_strndup and use autofree to clean up
afterwards.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20250207153112.3939799-5-alex.bennee@linaro.org>
The '-old-param' command line option is specific to Arm targets; it
is very briefly documented as "old param mode". What this option
actually does is change the behaviour when directly booting a guest
kernel, so that command line arguments are passed to the kernel using
the extremely old "param_struct" ABI, rather than the newer ATAGS or
even newer DTB mechanisms.
This support was added back in 2007 to support an old vendor kernel
on the akita/terrier board types:
https://mail.gnu.org/archive/html/qemu-devel/2007-07/msg00344.html
Even then, it was an out-of-date mechanism from the kernel's
point of view -- the kernel has had a comment since 2001 marking
it as deprecated. As of mid-2024, the kernel only retained
param_struct support for the RiscPC and Footbridge platforms:
https://lore.kernel.org/linux-arm-kernel/2831c5a6-cfbf-4fe0-b51c-0396e5b0aeb7@app.fastmail.com/
None of the board types QEMU supports need param_struct support;
mark this option as deprecated.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-id: 20250127123113.2947620-1-peter.maydell@linaro.org
qemu_ram_alloc_from_fd allocates space if file_size == 0. If non-zero,
it uses the existing space and verifies it is large enough, but the
verification was broken when the offset parameter was introduced. As
a result, a file smaller than offset passes the verification and causes
errors later. Fix that, and update the error message to include offset.
Peter provides this concise reproducer:
$ touch ramfile
$ truncate -s 64M ramfile
$ ./qemu-system-x86_64 -object memory-backend-file,mem-path=./ramfile,offset=128M,size=128M,id=mem1,prealloc=on
qemu-system-x86_64: qemu_prealloc_mem: preallocating memory failed: Bad address
With the fix, the error message is:
qemu-system-x86_64: mem1 backing store size 0x4000000 is too small for 'size' option 0x8000000 plus 'offset' option 0x8000000
Cc: qemu-stable@nongnu.org
Fixes: 4b870dc4d0c0 ("hostmem-file: add offset option")
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Acked-by: David Hildenbrand <david@redhat.com>
Link: https://lore.kernel.org/r/1736967650-129648-3-git-send-email-steven.sistare@oracle.com
Signed-off-by: Fabiano Rosas <farosas@suse.de>
(cherry picked from commit 719168fba7c3215cc996dcfd32a6e5e9c7b8eee0)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Add the cpr-transfer migration mode, which allows the user to transfer
a guest to a new QEMU instance on the same host with minimal guest pause
time, by preserving guest RAM in place, albeit with new virtual addresses
in new QEMU, and by preserving device file descriptors. Pages that were
locked in memory for DMA in old QEMU remain locked in new QEMU, because the
descriptor of the device that locked them remains open.
cpr-transfer preserves memory and devices descriptors by sending them to
new QEMU over a unix domain socket using SCM_RIGHTS. Such CPR state cannot
be sent over the normal migration channel, because devices and backends
are created prior to reading the channel, so this mode sends CPR state
over a second "cpr" migration channel. New QEMU reads the cpr channel
prior to creating devices or backends. The user specifies the cpr channel
in the channel arguments on the outgoing side, and in a second -incoming
command-line parameter on the incoming side.
The user must start old QEMU with the the '-machine aux-ram-share=on' option,
which allows anonymous memory to be transferred in place to the new process
by transferring a memory descriptor for each ram block. Memory-backend
objects must have the share=on attribute, but memory-backend-epc is not
supported.
The user starts new QEMU on the same host as old QEMU, with command-line
arguments to create the same machine, plus the -incoming option for the
main migration channel, like normal live migration. In addition, the user
adds a second -incoming option with channel type "cpr". This CPR channel
must support file descriptor transfer with SCM_RIGHTS, i.e. it must be a
UNIX domain socket.
To initiate CPR, the user issues a migrate command to old QEMU, adding
a second migration channel of type "cpr" in the channels argument.
Old QEMU stops the VM, saves state to the migration channels, and enters
the postmigrate state. New QEMU mmap's memory descriptors, and execution
resumes.
The implementation splits qmp_migrate into start and finish functions.
Start sends CPR state to new QEMU, which responds by closing the CPR
channel. Old QEMU detects the HUP then calls finish, which connects the
main migration channel.
In summary, the usage is:
qemu-system-$arch -machine aux-ram-share=on ...
start new QEMU with "-incoming <main-uri> -incoming <cpr-channel>"
Issue commands to old QEMU:
migrate_set_parameter mode cpr-transfer
{"execute": "migrate", ...
{"channel-type": "main"...}, {"channel-type": "cpr"...} ... }
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Acked-by: Markus Armbruster <armbru@redhat.com>
Link: https://lore.kernel.org/r/1736967650-129648-17-git-send-email-steven.sistare@oracle.com
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Extend the -incoming option to allow an @MigrationChannel to be specified.
This allows channels other than 'main' to be described on the command
line, which will be needed for CPR.
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Acked-by: Peter Xu <peterx@redhat.com>
Link: https://lore.kernel.org/r/1736967650-129648-13-git-send-email-steven.sistare@oracle.com
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Save the memfd for ramblocks in CPR state, along with a name that
uniquely identifies it. The block's idstr is not yet set, so it
cannot be used for this purpose. Find the saved memfd in new QEMU when
creating a block. If size of a resizable block is larger in new QEMU,
extend it via the file_ram_alloc truncate parameter, and the extra space
will be usable after a guest reset.
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Link: https://lore.kernel.org/r/1736967650-129648-9-git-send-email-steven.sistare@oracle.com
Signed-off-by: Fabiano Rosas <farosas@suse.de>