Andrea Fioraldi 21dda465fc
Fast mem and devices snapshots (#16)
* Run docker probe only if docker or podman are available

The docker probe uses "sudo -n" which can cause an e-mail with a security warning
each time when configure is run. Therefore run docker probe only if either docker
or podman are available.

That avoids the problematic "sudo -n" on build environments which have neither
docker nor podman installed.

Fixes: c4575b59155e2e00 ("configure: store container engine in config-host.mak")
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Message-Id: <20221030083510.310584-1-sw@weilnetz.de>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20221117172532.538149-2-alex.bennee@linaro.org>

* tests/avocado/machine_aspeed.py: Reduce noise on the console for SDK tests

The Aspeed SDK images are based on OpenBMC which starts a lot of
services. The output noise on the console can break from time to time
the test waiting for the logging prompt.

Change the U-Boot bootargs variable to add "quiet" to the kernel
command line and reduce the output volume. This also drops the test on
the CPU id which was nice to have but not essential.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20221104075347.370503-1-clg@kaod.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20221117172532.538149-3-alex.bennee@linaro.org>

* tests/docker: allow user to override check target

This is useful when trying to bisect a particular failing test behind
a docker run. For example:

  make docker-test-clang@fedora \
    TARGET_LIST=arm-softmmu \
    TEST_COMMAND="meson test qtest-arm/qos-test" \
    J=9 V=1

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20221117172532.538149-4-alex.bennee@linaro.org>

* docs/devel: add a maintainers section to development process

We don't currently have a clear place in the documentation to describe
the roles and responsibilities of a maintainer. Lets create one so we
can. I've moved a few small bits out of other files to try and keep
everything in one place.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20221117172532.538149-5-alex.bennee@linaro.org>

* docs/devel: make language a little less code centric

We welcome all sorts of patches.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20221117172532.538149-6-alex.bennee@linaro.org>

* docs/devel: simplify the minimal checklist

The bullet points are quite long and contain process tips. Move those
bits of the bullet to the relevant sections and link to them. Use a
table for nicer formatting of the checklist.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20221117172532.538149-7-alex.bennee@linaro.org>

* docs/devel: try and improve the language around patch review

It is important that contributors take the review process seriously
and we collaborate in a respectful way while avoiding personal
attacks. Try and make this clear in the language.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20221117172532.538149-8-alex.bennee@linaro.org>

* tests/avocado: Raise timeout for boot_linux.py:BootLinuxPPC64.test_pseries_tcg

On my machine, a debug build of QEMU takes about 260 seconds to
complete this test, so with the current timeout value of 180 seconds
it always times out.  Double the timeout value to 360 so the test
definitely has enough time to complete.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20221110142901.3832318-1-peter.maydell@linaro.org>
Message-Id: <20221117172532.538149-9-alex.bennee@linaro.org>

* tests/avocado: introduce alpine virt test for CI

The boot_linux tests download and run a full cloud image boot and
start a full distro. While the ability to test the full boot chain is
worthwhile it is perhaps a little too heavy weight and causes issues
in CI. Fix this by introducing a new alpine linux ISO boot in
machine_aarch64_virt.

This boots a fully loaded -cpu max with all the bells and whistles in
31s on my machine. A full debug build takes around 180s on my machine
so we set a more generous timeout to cover that.

We don't add a test for lesser GIC versions although there is some
coverage for that already in the boot_xen.py tests. If we want to
introduce more comprehensive testing we can do it with a custom kernel
and initrd rather than a full distro boot.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20221117172532.538149-10-alex.bennee@linaro.org>

* tests/avocado: skip aarch64 cloud TCG tests in CI

We now have a much lighter weight test in machine_aarch64_virt which
tests the full boot chain in less time. Rename the tests while we are
at it to make it clear it is a Fedora cloud image.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20221117172532.538149-11-alex.bennee@linaro.org>

* gitlab: integrate coverage report

This should hopefully give is nice coverage information about what our
tests (or at least the subset we are running) have hit. Ideally we
would want a way to trigger coverage on tests likely to be affected by
the current commit.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20221117172532.538149-12-alex.bennee@linaro.org>

* vhost: mask VIRTIO_F_RING_RESET for vhost and vhost-user devices

Commit 69e1c14aa2 ("virtio: core: vq reset feature negotation support")
enabled VIRTIO_F_RING_RESET by default for all virtio devices.

This feature is not currently emulated by QEMU, so for vhost and
vhost-user devices we need to make sure it is supported by the offloaded
device emulation (in-kernel or in another process).
To do this we need to add VIRTIO_F_RING_RESET to the features bitmap
passed to vhost_get_features(). This way it will be masked if the device
does not support it.

This issue was initially discovered with vhost-vsock and vhost-user-vsock,
and then also tested with vhost-user-rng which confirmed the same issue.
They fail when sending features through VHOST_SET_FEATURES ioctl or
VHOST_USER_SET_FEATURES message, since VIRTIO_F_RING_RESET is negotiated
by the guest (Linux >= v6.0), but not supported by the device.

Fixes: 69e1c14aa2 ("virtio: core: vq reset feature negotation support")
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1318
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Message-Id: <20221121101101.29400-1-sgarzare@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Acked-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
Acked-by: Jason Wang <jasowang@redhat.com>

* tests: acpi: whitelist DSDT before moving PRQx to _SB scope

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20221121153613.3972225-2-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

* acpi: x86: move RPQx field back to _SB scope

Commit 47a373faa6b2 (acpi: pc/q35: drop ad-hoc PCI-ISA bridge AML routines and let bus ennumeration generate AML)
moved ISA bridge AML generation to respective devices and was using
aml_alias() to provide PRQx fields in _SB. scope. However, it turned
out that SeaBIOS was not able to process Alias opcode when parsing DSDT,
resulting in lack of keyboard during boot (SeaBIOS console, grub, FreeDOS).

While fix for SeaBIOS is posted
  https://mail.coreboot.org/hyperkitty/list/seabios@seabios.org/thread/RGPL7HESH5U5JRLEO6FP77CZVHZK5J65/
fixed SeaBIOS might not make into QEMU-7.2 in time.
Hence this workaround that puts PRQx back into _SB scope
and gets rid of aliases in ISA bridge description, so
DSDT will be parsable by broken SeaBIOS.

That brings back hardcoded references to ISA bridge
  PCI0.S08.P40C/PCI0.SF8.PIRQ
where middle part now is auto generated based on slot it's
plugged in, but it should be fine as bridge initialization
also hardcodes PCI address of the bridge so it can't ever
move. Once QEMU tree has fixed SeaBIOS blob, we should be able
to drop this part and revert back to alias based approach

Reported-by: Volker Rümelin <vr_qemu@t-online.de>
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20221121153613.3972225-3-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

* tests: acpi: x86: update expected DSDT after moving PRQx fields in _SB scope

Expected DSDT changes,
pc:
  -                Field (P40C, ByteAcc, NoLock, Preserve)
  +                Scope (\_SB)
                   {
  -                    PRQ0,   8,
  -                    PRQ1,   8,
  -                    PRQ2,   8,
  -                    PRQ3,   8
  +                    Field (PCI0.S08.P40C, ByteAcc, NoLock, Preserve)
  +                    {
  +                        PRQ0,   8,
  +                        PRQ1,   8,
  +                        PRQ2,   8,
  +                        PRQ3,   8
  +                    }
                   }

  -                Alias (PRQ0, \_SB.PRQ0)
  -                Alias (PRQ1, \_SB.PRQ1)
  -                Alias (PRQ2, \_SB.PRQ2)
  -                Alias (PRQ3, \_SB.PRQ3)

q35:
  -                Field (PIRQ, ByteAcc, NoLock, Preserve)
  -                {
  -                    PRQA,   8,
  -                    PRQB,   8,
  -                    PRQC,   8,
  -                    PRQD,   8,
  -                    Offset (0x08),
  -                    PRQE,   8,
  -                    PRQF,   8,
  -                    PRQG,   8,
  -                    PRQH,   8
  +                Scope (\_SB)
  +                {
  +                    Field (PCI0.SF8.PIRQ, ByteAcc, NoLock, Preserve)
  +                    {
  +                        PRQA,   8,
  +                        PRQB,   8,
  +                        PRQC,   8,
  +                        PRQD,   8,
  +                        Offset (0x08),
  +                        PRQE,   8,
  +                        PRQF,   8,
  +                        PRQG,   8,
  +                        PRQH,   8
  +                    }
                   }

  -                Alias (PRQA, \_SB.PRQA)
  -                Alias (PRQB, \_SB.PRQB)
  -                Alias (PRQC, \_SB.PRQC)
  -                Alias (PRQD, \_SB.PRQD)
  -                Alias (PRQE, \_SB.PRQE)
  -                Alias (PRQF, \_SB.PRQF)
  -                Alias (PRQG, \_SB.PRQG)
  -                Alias (PRQH, \_SB.PRQH)

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20221121153613.3972225-4-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

* MAINTAINERS: add mst to list of biosbits maintainers

Adding Michael's name to the list of bios bits maintainers so that all changes
and fixes into biosbits framework can go through his tree and he is notified.

Suggested-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Ani Sinha <ani@anisinha.ca>
Message-Id: <20221111151138.36988-1-ani@anisinha.ca>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

* tests/avocado: configure acpi-bits to use avocado timeout

Instead of using a hardcoded timeout, just rely on Avocado's built-in
test case timeout. This helps avoid timeout issues on machines where 60
seconds is not sufficient.

Signed-off-by: John Snow <jsnow@redhat.com>
Message-Id: <20221115212759.3095751-1-jsnow@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Ani Sinha <ani@anisinha.ca>

* acpi/tests/avocado/bits: keep the work directory when BITS_DEBUG is set in env

Debugging bits issue often involves running the QEMU command line manually
outside of the avocado environment with the generated ISO. Hence, its
inconvenient if the iso gets cleaned up after the test has finished. This change
makes sure that the work directory is kept after the test finishes if the test
is run with BITS_DEBUG=1 in the environment so that the iso is available for use
with the QEMU command line.

CC: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Ani Sinha <ani@anisinha.ca>
Message-Id: <20221117113630.543495-1-ani@anisinha.ca>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

* virtio: disable error for out of spec queue-enable

Virtio 1.0 is pretty clear that features have to be
negotiated before enabling VQs. Unfortunately Seabios
ignored this ever since gaining 1.0 support (UEFI is ok).
Comment the error out for now, and add a TODO.

Fixes: 3c37f8b8d1 ("virtio: introduce virtio_queue_enable()")
Cc: "Kangjie Xu" <kangjie.xu@linux.alibaba.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20221121200339.362452-1-mst@redhat.com>

* hw/loongarch: Add default stdout uart in fdt

Add "chosen" subnode into LoongArch fdt, and set it's
"stdout-path" prop to uart node.

Signed-off-by: Xiaojuan Yang <yangxiaojuan@loongson.cn>
Reviewed-by: Song Gao <gaosong@loongson.cn>
Message-Id: <20221115114923.3372414-1-yangxiaojuan@loongson.cn>
Signed-off-by: Song Gao <gaosong@loongson.cn>

* hw/loongarch: Fix setprop_sized method in fdt rtc node.

Fix setprop_sized method in fdt rtc node.

Signed-off-by: Xiaojuan Yang <yangxiaojuan@loongson.cn>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Song Gao <gaosong@loongson.cn>
Message-Id: <20221116040300.3459818-1-yangxiaojuan@loongson.cn>
Signed-off-by: Song Gao <gaosong@loongson.cn>

* hw/loongarch: Replace the value of uart info with macro

Using macro to replace the value of uart info such as addr, size
in acpi_build method.

Signed-off-by: Xiaojuan Yang <yangxiaojuan@loongson.cn>
Reviewed-by: Song Gao <gaosong@loongson.cn>
Message-Id: <20221115115008.3372489-1-yangxiaojuan@loongson.cn>
Signed-off-by: Song Gao <gaosong@loongson.cn>

* target/arm: Don't do two-stage lookup if stage 2 is disabled

In get_phys_addr_with_struct(), we call get_phys_addr_twostage() if
the CPU supports EL2.  However, we don't check here that stage 2 is
actually enabled.  Instead we only check that inside
get_phys_addr_twostage() to skip stage 2 translation.  This means
that even if stage 2 is disabled we still tell the stage 1 lookup to
do its page table walks via stage 2.

This works by luck for normal CPU accesses, but it breaks for debug
accesses, which are used by the disassembler and also by semihosting
file reads and writes, because the debug case takes a different code
path inside S1_ptw_translate().

This means that setups that use semihosting for file loads are broken
(a regression since 7.1, introduced in recent ptw refactoring), and
that sometimes disassembly in debug logs reports "unable to read
memory" rather than showing the guest insns.

Fix the bug by hoisting the "is stage 2 enabled?" check up to
get_phys_addr_with_struct(), so that we handle S2 disabled the same
way we do the "no EL2" case, with a simple single stage lookup.

Reported-by: Jens Wiklander <jens.wiklander@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20221121212404.1450382-1-peter.maydell@linaro.org

* target/arm: Use signed quantity to represent VMSAv8-64 translation level

The LPA2 extension implements 52-bit virtual addressing for 4k and 16k
translation granules, and for the former, this means an additional level
of translation is needed. This means we start counting at -1 instead of
0 when doing a walk, and so 'level' is now a signed quantity, and should
be typed as such. So turn it from uint32_t into int32_t.

This avoids a level of -1 getting misinterpreted as being >= 3, and
terminating a page table walk prematurely with a bogus output address.

Cc: Peter Maydell <peter.maydell@linaro.org>
Cc: Philippe Mathieu-Daudé <f4bug@amsat.org>
Cc: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

* Update VERSION for v7.2.0-rc2

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>

* tests/avocado: Update the URLs of the advent calendar images

The qemu-advent-calendar.org server will be decommissioned soon.
I've mirrored the images that we use for the QEMU CI to gitlab,
so update their URLs to point to the new location.

Message-Id: <20221121102436.78635-1-thuth@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>

* tests/qtest: Decrease the amount of output from the qom-test

The logs in the gitlab-CI have a size constraint, and sometimes
we already hit this limit. The biggest part of the log then seems
to be filled by the qom-test, so we should decrease the size of
the output - which can be done easily by not printing the path
for each property, since the path has already been logged at the
beginning of each node that we handle here.

However, if we omit the path, we should make sure to not recurse
into child nodes in between, so that it is clear to which node
each property belongs. Thus store the children and links in a
temporary list and recurse only at the end of each node, when
all properties have already been printed.

Message-Id: <20221121194240.149268-1-thuth@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>

* tests/avocado: use new rootfs for orangepi test

The old URL wasn't stable. I suspect the current URL will only be
stable for a few months so maybe we need another strategy for hosting
rootfs snapshots?

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20221118113309.1057790-1-alex.bennee@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>

* Revert "usbredir: avoid queuing hello packet on snapshot restore"

Run state is also in RUN_STATE_PRELAUNCH while "-S" is used.

This reverts commit 0631d4b448454ae8a1ab091c447e3f71ab6e088a

Signed-off-by: Joelle van Dyne <j@getutm.app>
Reviewed-by: Ján Tomko <jtomko@redhat.com>

The original commit broke the usage of usbredir with libvirt, which
starts every domain with "-S".

This workaround is no longer needed because the usbredir behavior
has been fixed in the meantime:
https://gitlab.freedesktop.org/spice/usbredir/-/merge_requests/61

Signed-off-by: Ján Tomko <jtomko@redhat.com>
Message-Id: <1689cec3eadcea87255e390cb236033aca72e168.1669193161.git.jtomko@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>

* gtk: disable GTK Clipboard with a new meson option

The GTK Clipboard implementation may cause guest hangs.

Therefore implement new configure switch: --enable-gtk-clipboard,

as a meson option disabled by default, which warns in the help
text about the experimental nature of the feature.
Regenerate the meson build options to include it.

The initialization of the clipboard is gtk.c, as well as the
compilation of gtk-clipboard.c are now conditional on this new
option to be set.

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1150
Signed-off-by: Claudio Fontana <cfontana@suse.de>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Jim Fehlig <jfehlig@suse.com>
Message-Id: <20221121135538.14625-1-cfontana@suse.de>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>

* hw/usb/hcd-xhci.c: spelling: tranfer

Fixes: effaf5a240e03020f4ae953e10b764622c3e87cc
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Stefan Weil <sw@weilnetz.de>
Message-Id: <20221105114851.306206-1-mjt@msgid.tls.msk.ru>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>

* ui/gtk: prevent ui lock up when dpy_gl_update called again before current draw event occurs

A warning, "qemu: warning: console: no gl-unblock within" followed by
guest scanout lockup can happen if dpy_gl_update is called in a row
and the second call is made before gd_draw_event scheduled by the first
call is taking place. This is because draw call returns without decrementing
gl_block ref count if the dmabuf was already submitted as shown below.

(gd_gl_area_draw/gd_egl_draw)

        if (dmabuf) {
            if (!dmabuf->draw_submitted) {
                return;
            } else {
                dmabuf->draw_submitted = false;
            }
        }

So it should not schedule any redundant draw event in case draw_submitted is
already set in gd_egl_fluch/gd_gl_area_scanout_flush.

Cc: Gerd Hoffmann <kraxel@redhat.com>
Cc: Vivek Kasireddy <vivek.kasireddy@intel.com>
Signed-off-by: Dongwon Kim <dongwon.kim@intel.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20221021192315.9110-1-dongwon.kim@intel.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>

* hw/usb/hcd-xhci: Reset the XHCIState with device_cold_reset()

Currently the hcd-xhci-pci and hcd-xhci-sysbus devices, which are
mostly wrappers around the TYPE_XHCI device, which is a direct
subclass of TYPE_DEVICE.  Since TYPE_DEVICE devices are not on any
qbus and do not get automatically reset, the wrapper devices both
reset the TYPE_XHCI device in their own reset functions.  However,
they do this using device_legacy_reset(), which will reset the device
itself but not any bus it has.

Switch to device_cold_reset(), which avoids using a deprecated
function and also propagates reset along any child buses.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20221014145423.2102706-1-peter.maydell@linaro.org>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>

* hw/audio/intel-hda: don't reset codecs twice

Currently the intel-hda device has a reset method which manually
resets all the codecs by calling device_legacy_reset() on them.  This
means they get reset twice, once because child devices on a qbus get
reset before the parent device's reset method is called, and then
again because we're manually resetting them.

Drop the manual reset call, and ensure that codecs are still reset
when the guest does a reset via ICH6_GCTL_RESET by using
device_cold_reset() (which resets all the devices on the qbus as well
as the device itself) instead of a direct call to the reset function.

This is a slight ordering change because the (only) codec reset now
happens before the controller registers etc are reset, rather than
once before and then once after, but the codec reset function
hda_audio_reset() doesn't care.

This lets us drop a use of device_legacy_reset(), which is
deprecated.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20221014142632.2092404-2-peter.maydell@linaro.org>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>

* hw/audio/intel-hda: Drop unnecessary prototype

The only use of intel_hda_reset() is after its definition, so we
don't need to separately declare its prototype at the top of the
file; drop the unnecessary line.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20221014142632.2092404-3-peter.maydell@linaro.org>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>

* add syx snapshot extras

* it compiles!

* virtiofsd: Add `sigreturn` to the seccomp whitelist

The virtiofsd currently crashes on s390x. This is because of a
`sigreturn` system call. See audit log below:

type=SECCOMP msg=audit(1669382477.611:459): auid=4294967295 uid=0 gid=0 ses=4294967295 subj=system_u:system_r:virtd_t:s0-s0:c0.c1023 pid=6649 comm="virtiofsd" exe="/usr/libexec/virtiofsd" sig=31 arch=80000016 syscall=119 compat=0 ip=0x3fff15f748a code=0x80000000AUID="unset" UID="root" GID="root" ARCH=s390x SYSCALL=sigreturn

Signed-off-by: Marc Hartmayer <mhartmay@linux.ibm.com>
Reviewed-by: German Maglione <gmaglione@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20221125143946.27717-1-mhartmay@linux.ibm.com>

* libvhost-user: Fix wrong type of argument to formatting function (reported by LGTM)

Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Message-Id: <20220422070144.1043697-2-sw@weilnetz.de>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20221126152507.283271-2-sw@weilnetz.de>

* libvhost-user: Fix format strings

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20220422070144.1043697-3-sw@weilnetz.de>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20221126152507.283271-3-sw@weilnetz.de>

* libvhost-user: Fix two more format strings

This fix is required for 32 bit hosts. The bug was detected by CI
for arm-linux, but is also relevant for i386-linux.

Reported-by: Stefan Hajnoczi <stefanha@gmail.com>
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20221126152507.283271-4-sw@weilnetz.de>

* libvhost-user: Add format attribute to local function vu_panic

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20220422070144.1043697-4-sw@weilnetz.de>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20221126152507.283271-5-sw@weilnetz.de>

* MAINTAINERS: Add subprojects/libvhost-user to section "vhost"

Signed-off-by: Stefan Weil <sw@weilnetz.de>
[Michael agreed to act as maintainer for libvhost-user via email in
https://lore.kernel.org/qemu-devel/20221123015218-mutt-send-email-mst@kernel.org/.
--Stefan]
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20221126152507.283271-6-sw@weilnetz.de>

* Add G_GNUC_PRINTF to function qemu_set_info_str and fix related issues

With the G_GNUC_PRINTF function attribute the compiler detects
two potential insecure format strings:

../../../net/stream.c:248:31: warning: format string is not a string literal (potentially insecure) [-Wformat-security]
    qemu_set_info_str(&s->nc, uri);
                              ^~~
../../../net/stream.c:322:31: warning: format string is not a string literal (potentially insecure) [-Wformat-security]
    qemu_set_info_str(&s->nc, uri);
                              ^~~

There are also two other warnings:

../../../net/socket.c:182:35: warning: zero-length gnu_printf format string [-Wformat-zero-length]
  182 |         qemu_set_info_str(&s->nc, "");
      |                                   ^~
../../../net/stream.c:170:35: warning: zero-length gnu_printf format string [-Wformat-zero-length]
  170 |         qemu_set_info_str(&s->nc, "");

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20221126152507.283271-7-sw@weilnetz.de>

* del ramfile

* update seabios source from 1.16.0 to 1.16.1

git shortlog rel-1.16.0..rel-1.16.1
===================================

Gerd Hoffmann (3):
      malloc: use variable for ZoneHigh size
      malloc: use large ZoneHigh when there is enough memory
      virtio-blk: use larger default request size

Igor Mammedov (1):
      acpi: parse Alias object

Volker Rümelin (2):
      pci: refactor the pci_config_*() functions
      reset: force standard PCI configuration access

Xiaofei Lee (1):
      virtio-blk: Fix incorrect type conversion in virtio_blk_op()

Xuan Zhuo (2):
      virtio-mmio: read/write the hi 32 features for mmio
      virtio: finalize features before using device

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>

* update seabios binaries to 1.16.1

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>

* fix for non i386 archs

* replay: Fix declaration of replay_read_next_clock

Fixes the build with gcc 13:

replay/replay-time.c:34:6: error: conflicting types for  \
  'replay_read_next_clock' due to enum/integer mismatch; \
  have 'void(ReplayClockKind)' [-Werror=enum-int-mismatch]
   34 | void replay_read_next_clock(ReplayClockKind kind)
      |      ^~~~~~~~~~~~~~~~~~~~~~
In file included from ../qemu/replay/replay-time.c:14:
replay/replay-internal.h:139:6: note: previous declaration of \
  'replay_read_next_clock' with type 'void(unsigned int)'
  139 | void replay_read_next_clock(unsigned int kind);
      |      ^~~~~~~~~~~~~~~~~~~~~~

Fixes: 8eda206e090 ("replay: recording and replaying clock ticks")
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Wilfred Mallawa <wilfred.mallawa@wdc.com>
Reviewed-by: Pavel Dovgalyuk <Pavel.Dovgalyuk@ispras.ru>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20221129010547.284051-1-richard.henderson@linaro.org>

* hw/display/qxl: Have qxl_log_command Return early if no log_cmd handler

Only 3 command types are logged: no need to call qxl_phys2virt()
for the other types. Using different cases will help to pass
different structure sizes to qxl_phys2virt() in a pair of commits.

Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20221128202741.4945-2-philmd@linaro.org>

* hw/display/qxl: Document qxl_phys2virt()

Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20221128202741.4945-3-philmd@linaro.org>

* hw/display/qxl: Pass requested buffer size to qxl_phys2virt()

Currently qxl_phys2virt() doesn't check for buffer overrun.
In order to do so in the next commit, pass the buffer size
as argument.

For QXLCursor in qxl_render_cursor() -> qxl_cursor() we
verify the size of the chunked data ahead, checking we can
access 'sizeof(QXLCursor) + chunk->data_size' bytes.
Since in the SPICE_CURSOR_TYPE_MONO case the cursor is
assumed to fit in one chunk, no change are required.
In SPICE_CURSOR_TYPE_ALPHA the ahead read is handled in
qxl_unpack_chunks().

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20221128202741.4945-4-philmd@linaro.org>

* hw/display/qxl: Avoid buffer overrun in qxl_phys2virt (CVE-2022-4144)

Have qxl_get_check_slot_offset() return false if the requested
buffer size does not fit within the slot memory region.

Similarly qxl_phys2virt() now returns NULL in such case, and
qxl_dirty_one_surface() aborts.

This avoids buffer overrun in the host pointer returned by
memory_region_get_ram_ptr().

Fixes: CVE-2022-4144 (out-of-bounds read)
Reported-by: Wenxu Yin (@awxylitol)
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1336
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20221128202741.4945-5-philmd@linaro.org>

* hw/display/qxl: Assert memory slot fits in preallocated MemoryRegion

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20221128202741.4945-6-philmd@linaro.org>

* block-backend: avoid bdrv_unregister_buf() NULL pointer deref

bdrv_*() APIs expect a valid BlockDriverState. Calling them with bs=NULL
leads to undefined behavior.

Jonathan Cameron reported this following NULL pointer dereference when a
VM with a virtio-blk device and a memory-backend-file object is
terminated:
1. qemu_cleanup() closes all drives, setting blk->root to NULL
2. qemu_cleanup() calls user_creatable_cleanup(), which results in a RAM
   block notifier callback because the memory-backend-file is destroyed.
3. blk_unregister_buf() is called by virtio-blk's BlockRamRegistrar
   notifier callback and undefined behavior occurs.

Fixes: baf422684d73 ("virtio-blk: use BDRV_REQ_REGISTERED_BUF optimization hint")
Co-authored-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20221121211923.1993171-1-stefanha@redhat.com>

* target/arm: Set TCGCPUOps.restore_state_to_opc for v7m

This setting got missed, breaking v7m.

Fixes: 56c6c98df85c ("target/arm: Convert to tcg_ops restore_state_to_opc")
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1347
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Evgeny Ermakov <evgeny.v.ermakov@gmail.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20221129204146.550394-1-richard.henderson@linaro.org>

* Update VERSION for v7.2.0-rc3

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>

* hooks are now post mem access

* tests/qtests: override "force-legacy" for gpio virtio-mmio tests

The GPIO device is a VIRTIO_F_VERSION_1 devices but running with a
legacy MMIO interface we miss out that feature bit causing confusion.
For the GPIO test force the mmio bus to support non-legacy so we can
properly test it.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1333
Message-Id: <20221130112439.2527228-2-alex.bennee@linaro.org>
Acked-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

* vhost: enable vrings in vhost_dev_start() for vhost-user devices

Commit 02b61f38d3 ("hw/virtio: incorporate backend features in features")
properly negotiates VHOST_USER_F_PROTOCOL_FEATURES with the vhost-user
backend, but we forgot to enable vrings as specified in
docs/interop/vhost-user.rst:

    If ``VHOST_USER_F_PROTOCOL_FEATURES`` has not been negotiated, the
    ring starts directly in the enabled state.

    If ``VHOST_USER_F_PROTOCOL_FEATURES`` has been negotiated, the ring is
    initialized in a disabled state and is enabled by
    ``VHOST_USER_SET_VRING_ENABLE`` with parameter 1.

Some vhost-user front-ends already did this by calling
vhost_ops.vhost_set_vring_enable() directly:
- backends/cryptodev-vhost.c
- hw/net/virtio-net.c
- hw/virtio/vhost-user-gpio.c

But most didn't do that, so we would leave the vrings disabled and some
backends would not work. We observed this issue with the rust version of
virtiofsd [1], which uses the event loop [2] provided by the
vhost-user-backend crate where requests are not processed if vring is
not enabled.

Let's fix this issue by enabling the vrings in vhost_dev_start() for
vhost-user front-ends that don't already do this directly. Same thing
also in vhost_dev_stop() where we disable vrings.

[1] https://gitlab.com/virtio-fs/virtiofsd
[2] https://github.com/rust-vmm/vhost/blob/240fc2966/crates/vhost-user-backend/src/event_loop.rs#L217

Fixes: 02b61f38d3 ("hw/virtio: incorporate backend features in features")
Reported-by: German Maglione <gmaglione@redhat.com>
Tested-by: German Maglione <gmaglione@redhat.com>
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Acked-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
Message-Id: <20221123131630.52020-1-sgarzare@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20221130112439.2527228-3-alex.bennee@linaro.org>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

* hw/virtio: add started_vu status field to vhost-user-gpio

As per the fix to vhost-user-blk in f5b22d06fb (vhost: recheck dev
state in the vhost_migration_log routine) we really should track the
connection and starting separately.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20221130112439.2527228-4-alex.bennee@linaro.org>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

* hw/virtio: generalise CHR_EVENT_CLOSED handling

..and use for both virtio-user-blk and virtio-user-gpio. This avoids
the circular close by deferring shutdown due to disconnection until a
later point. virtio-user-blk already had this mechanism in place so
generalise it as a vhost-user helper function and use for both blk and
gpio devices.

While we are at it we also fix up vhost-user-gpio to re-establish the
event handler after close down so we can reconnect later.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
Message-Id: <20221130112439.2527228-5-alex.bennee@linaro.org>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

* include/hw: VM state takes precedence in virtio_device_should_start

The VM status should always preempt the device status for these
checks. This ensures the device is in the correct state when we
suspend the VM prior to migrations. This restores the checks to the
order they where in before the refactoring moved things around.

While we are at it lets improve our documentation of the various
fields involved and document the two functions.

Fixes: 9f6bcfd99f (hw/virtio: move vm_running check to virtio_device_started)
Fixes: 259d69c00b (hw/virtio: introduce virtio_device_should_start)
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Tested-by: Christian Borntraeger <borntraeger@linux.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20221130112439.2527228-6-alex.bennee@linaro.org>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

* hw/nvme: fix aio cancel in format

There are several bugs in the async cancel code for the Format command.

Firstly, cancelling a format operation neglects to set iocb->ret as well
as clearing the iocb->aiocb after cancelling the underlying aiocb which
causes the aio callback to ignore the cancellation. Trivial fix.

Secondly, and worse, because the request is queued up for posting to the
CQ in a bottom half, if the cancellation is due to the submission queue
being deleted (which calls blk_aio_cancel), the req structure is
deallocated in nvme_del_sq prior to the bottom half being schedulued.

Fix this by simply removing the bottom half, there is no reason to defer
it anyway.

Fixes: 3bcf26d3d619 ("hw/nvme: reimplement format nvm to allow cancellation")
Reported-by: Jonathan Derrick <jonathan.derrick@linux.dev>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>

* hw/nvme: fix aio cancel in flush

Make sure that iocb->aiocb is NULL'ed when cancelling.

Fix a potential use-after-free by removing the bottom half and enqueuing
the completion directly.

Fixes: 38f4ac65ac88 ("hw/nvme: reimplement flush to allow cancellation")
Reviewed-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>

* hw/nvme: fix aio cancel in zone reset

If the zone reset operation is cancelled but the block unmap operation
completes normally, the callback will continue resetting the next zone
since it neglects to check iocb->ret which will have been set to
-ECANCELED. Make sure that this is checked and bail out if an error is
present.

Secondly, fix a potential use-after-free by removing the bottom half and
enqueuing the completion directly.

Fixes: 63d96e4ffd71 ("hw/nvme: reimplement zone reset to allow cancellation")
Reviewed-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>

* hw/nvme: fix aio cancel in dsm

When the DSM operation is cancelled asynchronously, we set iocb->ret to
-ECANCELED. However, the callback function only checks the return value
of the completed aio, which may have completed succesfully prior to the
cancellation and thus the callback ends up continuing the dsm operation
instead of bailing out. Fix this.

Secondly, fix a potential use-after-free by removing the bottom half and
enqueuing the completion directly.

Fixes: d7d1474fd85d ("hw/nvme: reimplement dsm to allow cancellation")
Reviewed-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>

* hw/nvme: remove copy bh scheduling

Fix a potential use-after-free by removing the bottom half and enqueuing
the completion directly.

Fixes: 796d20681d9b ("hw/nvme: reimplement the copy command to allow aio cancellation")
Reviewed-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>

* target/i386: allow MMX instructions with CR4.OSFXSR=0

MMX state is saved/restored by FSAVE/FRSTOR so the instructions are
not illegal opcodes even if CR4.OSFXSR=0.  Make sure that validate_vex
takes into account the prefix and only checks HF_OSFXSR_MASK in the
presence of an SSE instruction.

Fixes: 20581aadec5e ("target/i386: validate VEX prefixes via the instructions' exception classes", 2022-10-18)
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1350
Reported-by: Helge Konetzka (@hejko on gitlab.com)
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

* target/i386: Always completely initialize TranslateFault

In get_physical_address, the canonical address check failed to
set TranslateFault.stage2, which resulted in an uninitialized
read from the struct when reporting the fault in x86_cpu_tlb_fill.

Adjust all error paths to use structure assignment so that the
entire struct is always initialized.

Reported-by: Daniel Hoffman <dhoff749@gmail.com>
Fixes: 9bbcf372193a ("target/i386: Reorg GET_HPHYS")
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20221201074522.178498-1-richard.henderson@linaro.org>
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1324
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

* hw/loongarch/virt: Add cfi01 pflash device

Add cfi01 pflash device for LoongArch virt machine

Signed-off-by: Xiaojuan Yang <yangxiaojuan@loongson.cn>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20221130100647.398565-1-yangxiaojuan@loongson.cn>
Signed-off-by: Song Gao <gaosong@loongson.cn>

* Sync pc on breakpoints

* tests/qtest/migration-test: Fix unlink error and memory leaks

When running the migration test compiled with Clang from Fedora 37
and sanitizers enabled, there is an error complaining about unlink():

 ../tests/qtest/migration-test.c:1072:12: runtime error: null pointer
  passed as argument 1, which is declared to never be null
 /usr/include/unistd.h:858:48: note: nonnull attribute specified here
 SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior
  ../tests/qtest/migration-test.c:1072:12 in
 (test program exited with status code 1)
 TAP parsing error: Too few tests run (expected 33, got 20)

The data->clientcert and data->clientkey pointers can indeed be unset
in some tests, so we have to check them before calling unlink() with
those.

While we're at it, I also noticed that the code is only freeing
some but not all of the allocated strings in this function, and
indeed, valgrind is also complaining about memory leaks here.
So let's call g_free() on all allocated strings to avoid leaking
memory here.

Message-Id: <20221125083054.117504-1-thuth@redhat.com>
Tested-by: Bin Meng <bmeng@tinylab.org>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>

* target/s390x/tcg: Fix and improve the SACF instruction

The SET ADDRESS SPACE CONTROL FAST instruction is not privileged, it can be
used from problem space, too. Just the switching to the home address space
is privileged and should still generate a privilege exception. This bug is
e.g. causing programs like Java that use the "getcpu" vdso kernel function
to crash (see https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=990417#26 ).

While we're at it, also check if DAT is not enabled. In that case the
instruction is supposed to generate a special operation exception.

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/655
Message-Id: <20221201184443.136355-1-thuth@redhat.com>
Reviewed-by: Ilya Leoshkevich <iii@linux.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>

* hw/display/next-fb: Fix comment typo

Signed-off-by: Evgeny Ermakov <evgeny.v.ermakov@gmail.com>
Message-Id: <20221125160849.23711-1-evgeny.v.ermakov@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>

* fix dev snapshots

* working syx snaps

* Revert "hw/loongarch/virt: Add cfi01 pflash device"

This reverts commit 14dccc8ea6ece7ee63273144fb55e4770a05e0fd.

Signed-off-by: Song Gao <gaosong@loongson.cn>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20221205113007.683505-1-gaosong@loongson.cn>

* Update VERSION for v7.2.0-rc4

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Ani Sinha <ani@anisinha.ca>
Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: Xiaojuan Yang <yangxiaojuan@loongson.cn>
Signed-off-by: Song Gao <gaosong@loongson.cn>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Claudio Fontana <cfontana@suse.de>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Dongwon Kim <dongwon.kim@intel.com>
Signed-off-by: Marc Hartmayer <mhartmay@linux.ibm.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Evgeny Ermakov <evgeny.v.ermakov@gmail.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Co-authored-by: Stefan Weil <sw@weilnetz.de>
Co-authored-by: Cédric Le Goater <clg@kaod.org>
Co-authored-by: Alex Bennée <alex.bennee@linaro.org>
Co-authored-by: Peter Maydell <peter.maydell@linaro.org>
Co-authored-by: Stefano Garzarella <sgarzare@redhat.com>
Co-authored-by: Igor Mammedov <imammedo@redhat.com>
Co-authored-by: Ani Sinha <ani@anisinha.ca>
Co-authored-by: John Snow <jsnow@redhat.com>
Co-authored-by: Michael S. Tsirkin <mst@redhat.com>
Co-authored-by: Xiaojuan Yang <yangxiaojuan@loongson.cn>
Co-authored-by: Stefan Hajnoczi <stefanha@redhat.com>
Co-authored-by: Ard Biesheuvel <ardb@kernel.org>
Co-authored-by: Thomas Huth <thuth@redhat.com>
Co-authored-by: Joelle van Dyne <j@getutm.app>
Co-authored-by: Claudio Fontana <cfontana@suse.de>
Co-authored-by: Michael Tokarev <mjt@tls.msk.ru>
Co-authored-by: Dongwon Kim <dongwon.kim@intel.com>
Co-authored-by: Marc Hartmayer <mhartmay@linux.ibm.com>
Co-authored-by: Stefan Weil via <qemu-devel@nongnu.org>
Co-authored-by: Gerd Hoffmann <kraxel@redhat.com>
Co-authored-by: Richard Henderson <richard.henderson@linaro.org>
Co-authored-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Co-authored-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Co-authored-by: Evgeny Ermakov <evgeny.v.ermakov@gmail.com>
Co-authored-by: Klaus Jensen <k.jensen@samsung.com>
Co-authored-by: Paolo Bonzini <pbonzini@redhat.com>
Co-authored-by: Song Gao <gaosong@loongson.cn>
2022-12-08 10:32:18 +01:00

3320 lines
96 KiB
C

/*
* QEMU System Emulator
*
* Copyright (c) 2003-2008 Fabrice Bellard
* Copyright (c) 2009-2015 Red Hat Inc
*
* Authors:
* Juan Quintela <quintela@redhat.com>
*
* Permission is hereby granted, free of charge, to any person obtaining a copy
* of this software and associated documentation files (the "Software"), to deal
* in the Software without restriction, including without limitation the rights
* to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
* copies of the Software, and to permit persons to whom the Software is
* furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in
* all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
* OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
* THE SOFTWARE.
*/
#include "qemu/osdep.h"
#include "hw/boards.h"
#include "net/net.h"
#include "migration.h"
#include "migration/snapshot.h"
#include "migration/vmstate.h"
#include "migration/misc.h"
#include "migration/register.h"
#include "migration/global_state.h"
#include "migration/channel-block.h"
#include "ram.h"
#include "qemu-file.h"
#include "savevm.h"
#include "postcopy-ram.h"
#include "qapi/error.h"
#include "qapi/qapi-commands-migration.h"
#include "qapi/qmp/json-writer.h"
#include "qapi/clone-visitor.h"
#include "qapi/qapi-builtin-visit.h"
#include "qapi/qmp/qerror.h"
#include "qemu/error-report.h"
#include "sysemu/cpus.h"
#include "exec/memory.h"
#include "exec/target_page.h"
#include "trace.h"
#include "qemu/iov.h"
#include "qemu/main-loop.h"
#include "block/snapshot.h"
#include "qemu/cutils.h"
#include "io/channel-buffer.h"
#include "io/channel-file.h"
#include "sysemu/replay.h"
#include "sysemu/runstate.h"
#include "sysemu/sysemu.h"
#include "sysemu/xen.h"
#include "migration/colo.h"
#include "qemu/bitmap.h"
#include "net/announce.h"
#include "qemu/yank.h"
#include "yank_functions.h"
const unsigned int postcopy_ram_discard_version;
/* Subcommands for QEMU_VM_COMMAND */
enum qemu_vm_cmd {
MIG_CMD_INVALID = 0, /* Must be 0 */
MIG_CMD_OPEN_RETURN_PATH, /* Tell the dest to open the Return path */
MIG_CMD_PING, /* Request a PONG on the RP */
MIG_CMD_POSTCOPY_ADVISE, /* Prior to any page transfers, just
warn we might want to do PC */
MIG_CMD_POSTCOPY_LISTEN, /* Start listening for incoming
pages as it's running. */
MIG_CMD_POSTCOPY_RUN, /* Start execution */
MIG_CMD_POSTCOPY_RAM_DISCARD, /* A list of pages to discard that
were previously sent during
precopy but are dirty. */
MIG_CMD_PACKAGED, /* Send a wrapped stream within this stream */
MIG_CMD_ENABLE_COLO, /* Enable COLO */
MIG_CMD_POSTCOPY_RESUME, /* resume postcopy on dest */
MIG_CMD_RECV_BITMAP, /* Request for recved bitmap on dst */
MIG_CMD_MAX
};
#define MAX_VM_CMD_PACKAGED_SIZE UINT32_MAX
static struct mig_cmd_args {
ssize_t len; /* -1 = variable */
const char *name;
} mig_cmd_args[] = {
[MIG_CMD_INVALID] = { .len = -1, .name = "INVALID" },
[MIG_CMD_OPEN_RETURN_PATH] = { .len = 0, .name = "OPEN_RETURN_PATH" },
[MIG_CMD_PING] = { .len = sizeof(uint32_t), .name = "PING" },
[MIG_CMD_POSTCOPY_ADVISE] = { .len = -1, .name = "POSTCOPY_ADVISE" },
[MIG_CMD_POSTCOPY_LISTEN] = { .len = 0, .name = "POSTCOPY_LISTEN" },
[MIG_CMD_POSTCOPY_RUN] = { .len = 0, .name = "POSTCOPY_RUN" },
[MIG_CMD_POSTCOPY_RAM_DISCARD] = {
.len = -1, .name = "POSTCOPY_RAM_DISCARD" },
[MIG_CMD_POSTCOPY_RESUME] = { .len = 0, .name = "POSTCOPY_RESUME" },
[MIG_CMD_PACKAGED] = { .len = 4, .name = "PACKAGED" },
[MIG_CMD_RECV_BITMAP] = { .len = -1, .name = "RECV_BITMAP" },
[MIG_CMD_MAX] = { .len = -1, .name = "MAX" },
};
/* Note for MIG_CMD_POSTCOPY_ADVISE:
* The format of arguments is depending on postcopy mode:
* - postcopy RAM only
* uint64_t host page size
* uint64_t taget page size
*
* - postcopy RAM and postcopy dirty bitmaps
* format is the same as for postcopy RAM only
*
* - postcopy dirty bitmaps only
* Nothing. Command length field is 0.
*
* Be careful: adding a new postcopy entity with some other parameters should
* not break format self-description ability. Good way is to introduce some
* generic extendable format with an exception for two old entities.
*/
/***********************************************************/
/* savevm/loadvm support */
static QEMUFile *qemu_fopen_bdrv(BlockDriverState *bs, int is_writable)
{
if (is_writable) {
return qemu_file_new_output(QIO_CHANNEL(qio_channel_block_new(bs)));
} else {
return qemu_file_new_input(QIO_CHANNEL(qio_channel_block_new(bs)));
}
}
/* QEMUFile timer support.
* Not in qemu-file.c to not add qemu-timer.c as dependency to qemu-file.c
*/
void timer_put(QEMUFile *f, QEMUTimer *ts)
{
uint64_t expire_time;
expire_time = timer_expire_time_ns(ts);
qemu_put_be64(f, expire_time);
}
void timer_get(QEMUFile *f, QEMUTimer *ts)
{
uint64_t expire_time;
expire_time = qemu_get_be64(f);
if (expire_time != -1) {
timer_mod_ns(ts, expire_time);
} else {
timer_del(ts);
}
}
/* VMState timer support.
* Not in vmstate.c to not add qemu-timer.c as dependency to vmstate.c
*/
static int get_timer(QEMUFile *f, void *pv, size_t size,
const VMStateField *field)
{
QEMUTimer *v = pv;
timer_get(f, v);
return 0;
}
static int put_timer(QEMUFile *f, void *pv, size_t size,
const VMStateField *field, JSONWriter *vmdesc)
{
QEMUTimer *v = pv;
timer_put(f, v);
return 0;
}
const VMStateInfo vmstate_info_timer = {
.name = "timer",
.get = get_timer,
.put = put_timer,
};
typedef struct CompatEntry {
char idstr[256];
int instance_id;
} CompatEntry;
typedef struct SaveStateEntry {
QTAILQ_ENTRY(SaveStateEntry) entry;
char idstr[256];
uint32_t instance_id;
int alias_id;
int version_id;
/* version id read from the stream */
int load_version_id;
int section_id;
/* section id read from the stream */
int load_section_id;
const SaveVMHandlers *ops;
const VMStateDescription *vmsd;
void *opaque;
CompatEntry *compat;
int is_ram;
} SaveStateEntry;
typedef struct SaveState {
QTAILQ_HEAD(, SaveStateEntry) handlers;
SaveStateEntry *handler_pri_head[MIG_PRI_MAX + 1];
int global_section_id;
uint32_t len;
const char *name;
uint32_t target_page_bits;
uint32_t caps_count;
MigrationCapability *capabilities;
QemuUUID uuid;
} SaveState;
/* static */ SaveState savevm_state = {
.handlers = QTAILQ_HEAD_INITIALIZER(savevm_state.handlers),
.handler_pri_head = { [MIG_PRI_DEFAULT ... MIG_PRI_MAX] = NULL },
.global_section_id = 0,
};
static bool should_validate_capability(int capability)
{
assert(capability >= 0 && capability < MIGRATION_CAPABILITY__MAX);
/* Validate only new capabilities to keep compatibility. */
switch (capability) {
case MIGRATION_CAPABILITY_X_IGNORE_SHARED:
return true;
default:
return false;
}
}
static uint32_t get_validatable_capabilities_count(void)
{
MigrationState *s = migrate_get_current();
uint32_t result = 0;
int i;
for (i = 0; i < MIGRATION_CAPABILITY__MAX; i++) {
if (should_validate_capability(i) && s->enabled_capabilities[i]) {
result++;
}
}
return result;
}
static int configuration_pre_save(void *opaque)
{
SaveState *state = opaque;
const char *current_name = MACHINE_GET_CLASS(current_machine)->name;
MigrationState *s = migrate_get_current();
int i, j;
state->len = strlen(current_name);
state->name = current_name;
state->target_page_bits = qemu_target_page_bits();
state->caps_count = get_validatable_capabilities_count();
state->capabilities = g_renew(MigrationCapability, state->capabilities,
state->caps_count);
for (i = j = 0; i < MIGRATION_CAPABILITY__MAX; i++) {
if (should_validate_capability(i) && s->enabled_capabilities[i]) {
state->capabilities[j++] = i;
}
}
state->uuid = qemu_uuid;
return 0;
}
static int configuration_post_save(void *opaque)
{
SaveState *state = opaque;
g_free(state->capabilities);
state->capabilities = NULL;
state->caps_count = 0;
return 0;
}
static int configuration_pre_load(void *opaque)
{
SaveState *state = opaque;
/* If there is no target-page-bits subsection it means the source
* predates the variable-target-page-bits support and is using the
* minimum possible value for this CPU.
*/
state->target_page_bits = qemu_target_page_bits_min();
return 0;
}
static bool configuration_validate_capabilities(SaveState *state)
{
bool ret = true;
MigrationState *s = migrate_get_current();
unsigned long *source_caps_bm;
int i;
source_caps_bm = bitmap_new(MIGRATION_CAPABILITY__MAX);
for (i = 0; i < state->caps_count; i++) {
MigrationCapability capability = state->capabilities[i];
set_bit(capability, source_caps_bm);
}
for (i = 0; i < MIGRATION_CAPABILITY__MAX; i++) {
bool source_state, target_state;
if (!should_validate_capability(i)) {
continue;
}
source_state = test_bit(i, source_caps_bm);
target_state = s->enabled_capabilities[i];
if (source_state != target_state) {
error_report("Capability %s is %s, but received capability is %s",
MigrationCapability_str(i),
target_state ? "on" : "off",
source_state ? "on" : "off");
ret = false;
/* Don't break here to report all failed capabilities */
}
}
g_free(source_caps_bm);
return ret;
}
static int configuration_post_load(void *opaque, int version_id)
{
SaveState *state = opaque;
const char *current_name = MACHINE_GET_CLASS(current_machine)->name;
int ret = 0;
if (strncmp(state->name, current_name, state->len) != 0) {
error_report("Machine type received is '%.*s' and local is '%s'",
(int) state->len, state->name, current_name);
ret = -EINVAL;
goto out;
}
if (state->target_page_bits != qemu_target_page_bits()) {
error_report("Received TARGET_PAGE_BITS is %d but local is %d",
state->target_page_bits, qemu_target_page_bits());
ret = -EINVAL;
goto out;
}
if (!configuration_validate_capabilities(state)) {
ret = -EINVAL;
goto out;
}
out:
g_free((void *)state->name);
state->name = NULL;
state->len = 0;
g_free(state->capabilities);
state->capabilities = NULL;
state->caps_count = 0;
return ret;
}
static int get_capability(QEMUFile *f, void *pv, size_t size,
const VMStateField *field)
{
MigrationCapability *capability = pv;
char capability_str[UINT8_MAX + 1];
uint8_t len;
int i;
len = qemu_get_byte(f);
qemu_get_buffer(f, (uint8_t *)capability_str, len);
capability_str[len] = '\0';
for (i = 0; i < MIGRATION_CAPABILITY__MAX; i++) {
if (!strcmp(MigrationCapability_str(i), capability_str)) {
*capability = i;
return 0;
}
}
error_report("Received unknown capability %s", capability_str);
return -EINVAL;
}
static int put_capability(QEMUFile *f, void *pv, size_t size,
const VMStateField *field, JSONWriter *vmdesc)
{
MigrationCapability *capability = pv;
const char *capability_str = MigrationCapability_str(*capability);
size_t len = strlen(capability_str);
assert(len <= UINT8_MAX);
qemu_put_byte(f, len);
qemu_put_buffer(f, (uint8_t *)capability_str, len);
return 0;
}
static const VMStateInfo vmstate_info_capability = {
.name = "capability",
.get = get_capability,
.put = put_capability,
};
/* The target-page-bits subsection is present only if the
* target page size is not the same as the default (ie the
* minimum page size for a variable-page-size guest CPU).
* If it is present then it contains the actual target page
* bits for the machine, and migration will fail if the
* two ends don't agree about it.
*/
static bool vmstate_target_page_bits_needed(void *opaque)
{
return qemu_target_page_bits()
> qemu_target_page_bits_min();
}
static const VMStateDescription vmstate_target_page_bits = {
.name = "configuration/target-page-bits",
.version_id = 1,
.minimum_version_id = 1,
.needed = vmstate_target_page_bits_needed,
.fields = (VMStateField[]) {
VMSTATE_UINT32(target_page_bits, SaveState),
VMSTATE_END_OF_LIST()
}
};
static bool vmstate_capabilites_needed(void *opaque)
{
return get_validatable_capabilities_count() > 0;
}
static const VMStateDescription vmstate_capabilites = {
.name = "configuration/capabilities",
.version_id = 1,
.minimum_version_id = 1,
.needed = vmstate_capabilites_needed,
.fields = (VMStateField[]) {
VMSTATE_UINT32_V(caps_count, SaveState, 1),
VMSTATE_VARRAY_UINT32_ALLOC(capabilities, SaveState, caps_count, 1,
vmstate_info_capability,
MigrationCapability),
VMSTATE_END_OF_LIST()
}
};
static bool vmstate_uuid_needed(void *opaque)
{
return qemu_uuid_set && migrate_validate_uuid();
}
static int vmstate_uuid_post_load(void *opaque, int version_id)
{
SaveState *state = opaque;
char uuid_src[UUID_FMT_LEN + 1];
char uuid_dst[UUID_FMT_LEN + 1];
if (!qemu_uuid_set) {
/*
* It's warning because user might not know UUID in some cases,
* e.g. load an old snapshot
*/
qemu_uuid_unparse(&state->uuid, uuid_src);
warn_report("UUID is received %s, but local uuid isn't set",
uuid_src);
return 0;
}
if (!qemu_uuid_is_equal(&state->uuid, &qemu_uuid)) {
qemu_uuid_unparse(&state->uuid, uuid_src);
qemu_uuid_unparse(&qemu_uuid, uuid_dst);
error_report("UUID received is %s and local is %s", uuid_src, uuid_dst);
return -EINVAL;
}
return 0;
}
static const VMStateDescription vmstate_uuid = {
.name = "configuration/uuid",
.version_id = 1,
.minimum_version_id = 1,
.needed = vmstate_uuid_needed,
.post_load = vmstate_uuid_post_load,
.fields = (VMStateField[]) {
VMSTATE_UINT8_ARRAY_V(uuid.data, SaveState, sizeof(QemuUUID), 1),
VMSTATE_END_OF_LIST()
}
};
static const VMStateDescription vmstate_configuration = {
.name = "configuration",
.version_id = 1,
.pre_load = configuration_pre_load,
.post_load = configuration_post_load,
.pre_save = configuration_pre_save,
.post_save = configuration_post_save,
.fields = (VMStateField[]) {
VMSTATE_UINT32(len, SaveState),
VMSTATE_VBUFFER_ALLOC_UINT32(name, SaveState, 0, NULL, len),
VMSTATE_END_OF_LIST()
},
.subsections = (const VMStateDescription *[]) {
&vmstate_target_page_bits,
&vmstate_capabilites,
&vmstate_uuid,
NULL
}
};
static void dump_vmstate_vmsd(FILE *out_file,
const VMStateDescription *vmsd, int indent,
bool is_subsection);
static void dump_vmstate_vmsf(FILE *out_file, const VMStateField *field,
int indent)
{
fprintf(out_file, "%*s{\n", indent, "");
indent += 2;
fprintf(out_file, "%*s\"field\": \"%s\",\n", indent, "", field->name);
fprintf(out_file, "%*s\"version_id\": %d,\n", indent, "",
field->version_id);
fprintf(out_file, "%*s\"field_exists\": %s,\n", indent, "",
field->field_exists ? "true" : "false");
fprintf(out_file, "%*s\"size\": %zu", indent, "", field->size);
if (field->vmsd != NULL) {
fprintf(out_file, ",\n");
dump_vmstate_vmsd(out_file, field->vmsd, indent, false);
}
fprintf(out_file, "\n%*s}", indent - 2, "");
}
static void dump_vmstate_vmss(FILE *out_file,
const VMStateDescription **subsection,
int indent)
{
if (*subsection != NULL) {
dump_vmstate_vmsd(out_file, *subsection, indent, true);
}
}
static void dump_vmstate_vmsd(FILE *out_file,
const VMStateDescription *vmsd, int indent,
bool is_subsection)
{
if (is_subsection) {
fprintf(out_file, "%*s{\n", indent, "");
} else {
fprintf(out_file, "%*s\"%s\": {\n", indent, "", "Description");
}
indent += 2;
fprintf(out_file, "%*s\"name\": \"%s\",\n", indent, "", vmsd->name);
fprintf(out_file, "%*s\"version_id\": %d,\n", indent, "",
vmsd->version_id);
fprintf(out_file, "%*s\"minimum_version_id\": %d", indent, "",
vmsd->minimum_version_id);
if (vmsd->fields != NULL) {
const VMStateField *field = vmsd->fields;
bool first;
fprintf(out_file, ",\n%*s\"Fields\": [\n", indent, "");
first = true;
while (field->name != NULL) {
if (field->flags & VMS_MUST_EXIST) {
/* Ignore VMSTATE_VALIDATE bits; these don't get migrated */
field++;
continue;
}
if (!first) {
fprintf(out_file, ",\n");
}
dump_vmstate_vmsf(out_file, field, indent + 2);
field++;
first = false;
}
fprintf(out_file, "\n%*s]", indent, "");
}
if (vmsd->subsections != NULL) {
const VMStateDescription **subsection = vmsd->subsections;
bool first;
fprintf(out_file, ",\n%*s\"Subsections\": [\n", indent, "");
first = true;
while (*subsection != NULL) {
if (!first) {
fprintf(out_file, ",\n");
}
dump_vmstate_vmss(out_file, subsection, indent + 2);
subsection++;
first = false;
}
fprintf(out_file, "\n%*s]", indent, "");
}
fprintf(out_file, "\n%*s}", indent - 2, "");
}
static void dump_machine_type(FILE *out_file)
{
MachineClass *mc;
mc = MACHINE_GET_CLASS(current_machine);
fprintf(out_file, " \"vmschkmachine\": {\n");
fprintf(out_file, " \"Name\": \"%s\"\n", mc->name);
fprintf(out_file, " },\n");
}
void dump_vmstate_json_to_file(FILE *out_file)
{
GSList *list, *elt;
bool first;
fprintf(out_file, "{\n");
dump_machine_type(out_file);
first = true;
list = object_class_get_list(TYPE_DEVICE, true);
for (elt = list; elt; elt = elt->next) {
DeviceClass *dc = OBJECT_CLASS_CHECK(DeviceClass, elt->data,
TYPE_DEVICE);
const char *name;
int indent = 2;
if (!dc->vmsd) {
continue;
}
if (!first) {
fprintf(out_file, ",\n");
}
name = object_class_get_name(OBJECT_CLASS(dc));
fprintf(out_file, "%*s\"%s\": {\n", indent, "", name);
indent += 2;
fprintf(out_file, "%*s\"Name\": \"%s\",\n", indent, "", name);
fprintf(out_file, "%*s\"version_id\": %d,\n", indent, "",
dc->vmsd->version_id);
fprintf(out_file, "%*s\"minimum_version_id\": %d,\n", indent, "",
dc->vmsd->minimum_version_id);
dump_vmstate_vmsd(out_file, dc->vmsd, indent, false);
fprintf(out_file, "\n%*s}", indent - 2, "");
first = false;
}
fprintf(out_file, "\n}\n");
fclose(out_file);
g_slist_free(list);
}
static uint32_t calculate_new_instance_id(const char *idstr)
{
SaveStateEntry *se;
uint32_t instance_id = 0;
QTAILQ_FOREACH(se, &savevm_state.handlers, entry) {
if (strcmp(idstr, se->idstr) == 0
&& instance_id <= se->instance_id) {
instance_id = se->instance_id + 1;
}
}
/* Make sure we never loop over without being noticed */
assert(instance_id != VMSTATE_INSTANCE_ID_ANY);
return instance_id;
}
static int calculate_compat_instance_id(const char *idstr)
{
SaveStateEntry *se;
int instance_id = 0;
QTAILQ_FOREACH(se, &savevm_state.handlers, entry) {
if (!se->compat) {
continue;
}
if (strcmp(idstr, se->compat->idstr) == 0
&& instance_id <= se->compat->instance_id) {
instance_id = se->compat->instance_id + 1;
}
}
return instance_id;
}
static inline MigrationPriority save_state_priority(SaveStateEntry *se)
{
if (se->vmsd) {
return se->vmsd->priority;
}
return MIG_PRI_DEFAULT;
}
static void savevm_state_handler_insert(SaveStateEntry *nse)
{
MigrationPriority priority = save_state_priority(nse);
SaveStateEntry *se;
int i;
assert(priority <= MIG_PRI_MAX);
for (i = priority - 1; i >= 0; i--) {
se = savevm_state.handler_pri_head[i];
if (se != NULL) {
assert(save_state_priority(se) < priority);
break;
}
}
if (i >= 0) {
QTAILQ_INSERT_BEFORE(se, nse, entry);
} else {
QTAILQ_INSERT_TAIL(&savevm_state.handlers, nse, entry);
}
if (savevm_state.handler_pri_head[priority] == NULL) {
savevm_state.handler_pri_head[priority] = nse;
}
}
static void savevm_state_handler_remove(SaveStateEntry *se)
{
SaveStateEntry *next;
MigrationPriority priority = save_state_priority(se);
if (se == savevm_state.handler_pri_head[priority]) {
next = QTAILQ_NEXT(se, entry);
if (next != NULL && save_state_priority(next) == priority) {
savevm_state.handler_pri_head[priority] = next;
} else {
savevm_state.handler_pri_head[priority] = NULL;
}
}
QTAILQ_REMOVE(&savevm_state.handlers, se, entry);
}
/* TODO: Individual devices generally have very little idea about the rest
of the system, so instance_id should be removed/replaced.
Meanwhile pass -1 as instance_id if you do not already have a clearly
distinguishing id for all instances of your device class. */
int register_savevm_live(const char *idstr,
uint32_t instance_id,
int version_id,
const SaveVMHandlers *ops,
void *opaque)
{
SaveStateEntry *se;
se = g_new0(SaveStateEntry, 1);
se->version_id = version_id;
se->section_id = savevm_state.global_section_id++;
se->ops = ops;
se->opaque = opaque;
se->vmsd = NULL;
/* if this is a live_savem then set is_ram */
if (ops->save_setup != NULL) {
se->is_ram = 1;
}
pstrcat(se->idstr, sizeof(se->idstr), idstr);
if (instance_id == VMSTATE_INSTANCE_ID_ANY) {
se->instance_id = calculate_new_instance_id(se->idstr);
} else {
se->instance_id = instance_id;
}
assert(!se->compat || se->instance_id == 0);
savevm_state_handler_insert(se);
return 0;
}
void unregister_savevm(VMStateIf *obj, const char *idstr, void *opaque)
{
SaveStateEntry *se, *new_se;
char id[256] = "";
if (obj) {
char *oid = vmstate_if_get_id(obj);
if (oid) {
pstrcpy(id, sizeof(id), oid);
pstrcat(id, sizeof(id), "/");
g_free(oid);
}
}
pstrcat(id, sizeof(id), idstr);
QTAILQ_FOREACH_SAFE(se, &savevm_state.handlers, entry, new_se) {
if (strcmp(se->idstr, id) == 0 && se->opaque == opaque) {
savevm_state_handler_remove(se);
g_free(se->compat);
g_free(se);
}
}
}
int vmstate_register_with_alias_id(VMStateIf *obj, uint32_t instance_id,
const VMStateDescription *vmsd,
void *opaque, int alias_id,
int required_for_version,
Error **errp)
{
SaveStateEntry *se;
/* If this triggers, alias support can be dropped for the vmsd. */
assert(alias_id == -1 || required_for_version >= vmsd->minimum_version_id);
se = g_new0(SaveStateEntry, 1);
se->version_id = vmsd->version_id;
se->section_id = savevm_state.global_section_id++;
se->opaque = opaque;
se->vmsd = vmsd;
se->alias_id = alias_id;
if (obj) {
char *id = vmstate_if_get_id(obj);
if (id) {
if (snprintf(se->idstr, sizeof(se->idstr), "%s/", id) >=
sizeof(se->idstr)) {
error_setg(errp, "Path too long for VMState (%s)", id);
g_free(id);
g_free(se);
return -1;
}
g_free(id);
se->compat = g_new0(CompatEntry, 1);
pstrcpy(se->compat->idstr, sizeof(se->compat->idstr), vmsd->name);
se->compat->instance_id = instance_id == VMSTATE_INSTANCE_ID_ANY ?
calculate_compat_instance_id(vmsd->name) : instance_id;
instance_id = VMSTATE_INSTANCE_ID_ANY;
}
}
pstrcat(se->idstr, sizeof(se->idstr), vmsd->name);
if (instance_id == VMSTATE_INSTANCE_ID_ANY) {
se->instance_id = calculate_new_instance_id(se->idstr);
} else {
se->instance_id = instance_id;
}
assert(!se->compat || se->instance_id == 0);
savevm_state_handler_insert(se);
return 0;
}
void vmstate_unregister(VMStateIf *obj, const VMStateDescription *vmsd,
void *opaque)
{
SaveStateEntry *se, *new_se;
QTAILQ_FOREACH_SAFE(se, &savevm_state.handlers, entry, new_se) {
if (se->vmsd == vmsd && se->opaque == opaque) {
savevm_state_handler_remove(se);
g_free(se->compat);
g_free(se);
}
}
}
static int vmstate_load(QEMUFile *f, SaveStateEntry *se)
{
trace_vmstate_load(se->idstr, se->vmsd ? se->vmsd->name : "(old)");
if (!se->vmsd) { /* Old style */
return se->ops->load_state(f, se->opaque, se->load_version_id);
}
return vmstate_load_state(f, se->vmsd, se->opaque, se->load_version_id);
}
static void vmstate_save_old_style(QEMUFile *f, SaveStateEntry *se,
JSONWriter *vmdesc)
{
int64_t old_offset, size;
old_offset = qemu_file_total_transferred_fast(f);
se->ops->save_state(f, se->opaque);
size = qemu_file_total_transferred_fast(f) - old_offset;
if (vmdesc) {
json_writer_int64(vmdesc, "size", size);
json_writer_start_array(vmdesc, "fields");
json_writer_start_object(vmdesc, NULL);
json_writer_str(vmdesc, "name", "data");
json_writer_int64(vmdesc, "size", size);
json_writer_str(vmdesc, "type", "buffer");
json_writer_end_object(vmdesc);
json_writer_end_array(vmdesc);
}
}
//// --- Begin LibAFL code ---
void save_section_header(QEMUFile *f, SaveStateEntry *se, uint8_t section_type);
void save_section_footer(QEMUFile *f, SaveStateEntry *se);
int vmstate_save(QEMUFile *f, SaveStateEntry *se, JSONWriter *vmdesc);
//// --- End LibAFL code ---
/* static */ int vmstate_save(QEMUFile *f, SaveStateEntry *se,
JSONWriter *vmdesc)
{
trace_vmstate_save(se->idstr, se->vmsd ? se->vmsd->name : "(old)");
if (!se->vmsd) {
vmstate_save_old_style(f, se, vmdesc);
return 0;
}
return vmstate_save_state(f, se->vmsd, se->opaque, vmdesc);
}
/*
* Write the header for device section (QEMU_VM_SECTION START/END/PART/FULL)
*/
/* static */ void save_section_header(QEMUFile *f, SaveStateEntry *se,
uint8_t section_type)
{
qemu_put_byte(f, section_type);
qemu_put_be32(f, se->section_id);
if (section_type == QEMU_VM_SECTION_FULL ||
section_type == QEMU_VM_SECTION_START) {
/* ID string */
size_t len = strlen(se->idstr);
qemu_put_byte(f, len);
qemu_put_buffer(f, (uint8_t *)se->idstr, len);
qemu_put_be32(f, se->instance_id);
qemu_put_be32(f, se->version_id);
}
}
/*
* Write a footer onto device sections that catches cases misformatted device
* sections.
*/
/* static */ void save_section_footer(QEMUFile *f, SaveStateEntry *se)
{
if (migrate_get_current()->send_section_footer) {
qemu_put_byte(f, QEMU_VM_SECTION_FOOTER);
qemu_put_be32(f, se->section_id);
}
}
/**
* qemu_savevm_command_send: Send a 'QEMU_VM_COMMAND' type element with the
* command and associated data.
*
* @f: File to send command on
* @command: Command type to send
* @len: Length of associated data
* @data: Data associated with command.
*/
static void qemu_savevm_command_send(QEMUFile *f,
enum qemu_vm_cmd command,
uint16_t len,
uint8_t *data)
{
trace_savevm_command_send(command, len);
qemu_put_byte(f, QEMU_VM_COMMAND);
qemu_put_be16(f, (uint16_t)command);
qemu_put_be16(f, len);
qemu_put_buffer(f, data, len);
qemu_fflush(f);
}
void qemu_savevm_send_colo_enable(QEMUFile *f)
{
trace_savevm_send_colo_enable();
qemu_savevm_command_send(f, MIG_CMD_ENABLE_COLO, 0, NULL);
}
void qemu_savevm_send_ping(QEMUFile *f, uint32_t value)
{
uint32_t buf;
trace_savevm_send_ping(value);
buf = cpu_to_be32(value);
qemu_savevm_command_send(f, MIG_CMD_PING, sizeof(value), (uint8_t *)&buf);
}
void qemu_savevm_send_open_return_path(QEMUFile *f)
{
trace_savevm_send_open_return_path();
qemu_savevm_command_send(f, MIG_CMD_OPEN_RETURN_PATH, 0, NULL);
}
/* We have a buffer of data to send; we don't want that all to be loaded
* by the command itself, so the command contains just the length of the
* extra buffer that we then send straight after it.
* TODO: Must be a better way to organise that
*
* Returns:
* 0 on success
* -ve on error
*/
int qemu_savevm_send_packaged(QEMUFile *f, const uint8_t *buf, size_t len)
{
uint32_t tmp;
if (len > MAX_VM_CMD_PACKAGED_SIZE) {
error_report("%s: Unreasonably large packaged state: %zu",
__func__, len);
return -1;
}
tmp = cpu_to_be32(len);
trace_qemu_savevm_send_packaged();
qemu_savevm_command_send(f, MIG_CMD_PACKAGED, 4, (uint8_t *)&tmp);
qemu_put_buffer(f, buf, len);
return 0;
}
/* Send prior to any postcopy transfer */
void qemu_savevm_send_postcopy_advise(QEMUFile *f)
{
if (migrate_postcopy_ram()) {
uint64_t tmp[2];
tmp[0] = cpu_to_be64(ram_pagesize_summary());
tmp[1] = cpu_to_be64(qemu_target_page_size());
trace_qemu_savevm_send_postcopy_advise();
qemu_savevm_command_send(f, MIG_CMD_POSTCOPY_ADVISE,
16, (uint8_t *)tmp);
} else {
qemu_savevm_command_send(f, MIG_CMD_POSTCOPY_ADVISE, 0, NULL);
}
}
/* Sent prior to starting the destination running in postcopy, discard pages
* that have already been sent but redirtied on the source.
* CMD_POSTCOPY_RAM_DISCARD consist of:
* byte version (0)
* byte Length of name field (not including 0)
* n x byte RAM block name
* byte 0 terminator (just for safety)
* n x Byte ranges within the named RAMBlock
* be64 Start of the range
* be64 Length
*
* name: RAMBlock name that these entries are part of
* len: Number of page entries
* start_list: 'len' addresses
* length_list: 'len' addresses
*
*/
void qemu_savevm_send_postcopy_ram_discard(QEMUFile *f, const char *name,
uint16_t len,
uint64_t *start_list,
uint64_t *length_list)
{
uint8_t *buf;
uint16_t tmplen;
uint16_t t;
size_t name_len = strlen(name);
trace_qemu_savevm_send_postcopy_ram_discard(name, len);
assert(name_len < 256);
buf = g_malloc0(1 + 1 + name_len + 1 + (8 + 8) * len);
buf[0] = postcopy_ram_discard_version;
buf[1] = name_len;
memcpy(buf + 2, name, name_len);
tmplen = 2 + name_len;
buf[tmplen++] = '\0';
for (t = 0; t < len; t++) {
stq_be_p(buf + tmplen, start_list[t]);
tmplen += 8;
stq_be_p(buf + tmplen, length_list[t]);
tmplen += 8;
}
qemu_savevm_command_send(f, MIG_CMD_POSTCOPY_RAM_DISCARD, tmplen, buf);
g_free(buf);
}
/* Get the destination into a state where it can receive postcopy data. */
void qemu_savevm_send_postcopy_listen(QEMUFile *f)
{
trace_savevm_send_postcopy_listen();
qemu_savevm_command_send(f, MIG_CMD_POSTCOPY_LISTEN, 0, NULL);
}
/* Kick the destination into running */
void qemu_savevm_send_postcopy_run(QEMUFile *f)
{
trace_savevm_send_postcopy_run();
qemu_savevm_command_send(f, MIG_CMD_POSTCOPY_RUN, 0, NULL);
}
void qemu_savevm_send_postcopy_resume(QEMUFile *f)
{
trace_savevm_send_postcopy_resume();
qemu_savevm_command_send(f, MIG_CMD_POSTCOPY_RESUME, 0, NULL);
}
void qemu_savevm_send_recv_bitmap(QEMUFile *f, char *block_name)
{
size_t len;
char buf[256];
trace_savevm_send_recv_bitmap(block_name);
buf[0] = len = strlen(block_name);
memcpy(buf + 1, block_name, len);
qemu_savevm_command_send(f, MIG_CMD_RECV_BITMAP, len + 1, (uint8_t *)buf);
}
bool qemu_savevm_state_blocked(Error **errp)
{
SaveStateEntry *se;
QTAILQ_FOREACH(se, &savevm_state.handlers, entry) {
if (se->vmsd && se->vmsd->unmigratable) {
error_setg(errp, "State blocked by non-migratable device '%s'",
se->idstr);
return true;
}
}
return false;
}
void qemu_savevm_non_migratable_list(strList **reasons)
{
SaveStateEntry *se;
QTAILQ_FOREACH(se, &savevm_state.handlers, entry) {
if (se->vmsd && se->vmsd->unmigratable) {
QAPI_LIST_PREPEND(*reasons,
g_strdup_printf("non-migratable device: %s",
se->idstr));
}
}
}
void qemu_savevm_state_header(QEMUFile *f)
{
trace_savevm_state_header();
qemu_put_be32(f, QEMU_VM_FILE_MAGIC);
qemu_put_be32(f, QEMU_VM_FILE_VERSION);
if (migrate_get_current()->send_configuration) {
qemu_put_byte(f, QEMU_VM_CONFIGURATION);
vmstate_save_state(f, &vmstate_configuration, &savevm_state, 0);
}
}
bool qemu_savevm_state_guest_unplug_pending(void)
{
SaveStateEntry *se;
QTAILQ_FOREACH(se, &savevm_state.handlers, entry) {
if (se->vmsd && se->vmsd->dev_unplug_pending &&
se->vmsd->dev_unplug_pending(se->opaque)) {
return true;
}
}
return false;
}
void qemu_savevm_state_setup(QEMUFile *f)
{
SaveStateEntry *se;
Error *local_err = NULL;
int ret;
trace_savevm_state_setup();
QTAILQ_FOREACH(se, &savevm_state.handlers, entry) {
if (!se->ops || !se->ops->save_setup) {
continue;
}
if (se->ops->is_active) {
if (!se->ops->is_active(se->opaque)) {
continue;
}
}
save_section_header(f, se, QEMU_VM_SECTION_START);
ret = se->ops->save_setup(f, se->opaque);
save_section_footer(f, se);
if (ret < 0) {
qemu_file_set_error(f, ret);
break;
}
}
if (precopy_notify(PRECOPY_NOTIFY_SETUP, &local_err)) {
error_report_err(local_err);
}
}
int qemu_savevm_state_resume_prepare(MigrationState *s)
{
SaveStateEntry *se;
int ret;
trace_savevm_state_resume_prepare();
QTAILQ_FOREACH(se, &savevm_state.handlers, entry) {
if (!se->ops || !se->ops->resume_prepare) {
continue;
}
if (se->ops->is_active) {
if (!se->ops->is_active(se->opaque)) {
continue;
}
}
ret = se->ops->resume_prepare(s, se->opaque);
if (ret < 0) {
return ret;
}
}
return 0;
}
/*
* this function has three return values:
* negative: there was one error, and we have -errno.
* 0 : We haven't finished, caller have to go again
* 1 : We have finished, we can go to complete phase
*/
int qemu_savevm_state_iterate(QEMUFile *f, bool postcopy)
{
SaveStateEntry *se;
int ret = 1;
trace_savevm_state_iterate();
QTAILQ_FOREACH(se, &savevm_state.handlers, entry) {
if (!se->ops || !se->ops->save_live_iterate) {
continue;
}
if (se->ops->is_active &&
!se->ops->is_active(se->opaque)) {
continue;
}
if (se->ops->is_active_iterate &&
!se->ops->is_active_iterate(se->opaque)) {
continue;
}
/*
* In the postcopy phase, any device that doesn't know how to
* do postcopy should have saved it's state in the _complete
* call that's already run, it might get confused if we call
* iterate afterwards.
*/
if (postcopy &&
!(se->ops->has_postcopy && se->ops->has_postcopy(se->opaque))) {
continue;
}
if (qemu_file_rate_limit(f)) {
return 0;
}
trace_savevm_section_start(se->idstr, se->section_id);
save_section_header(f, se, QEMU_VM_SECTION_PART);
ret = se->ops->save_live_iterate(f, se->opaque);
trace_savevm_section_end(se->idstr, se->section_id, ret);
save_section_footer(f, se);
if (ret < 0) {
error_report("failed to save SaveStateEntry with id(name): "
"%d(%s): %d",
se->section_id, se->idstr, ret);
qemu_file_set_error(f, ret);
}
if (ret <= 0) {
/* Do not proceed to the next vmstate before this one reported
completion of the current stage. This serializes the migration
and reduces the probability that a faster changing state is
synchronized over and over again. */
break;
}
}
return ret;
}
static bool should_send_vmdesc(void)
{
MachineState *machine = MACHINE(qdev_get_machine());
bool in_postcopy = migration_in_postcopy();
return !machine->suppress_vmdesc && !in_postcopy;
}
/*
* Calls the save_live_complete_postcopy methods
* causing the last few pages to be sent immediately and doing any associated
* cleanup.
* Note postcopy also calls qemu_savevm_state_complete_precopy to complete
* all the other devices, but that happens at the point we switch to postcopy.
*/
void qemu_savevm_state_complete_postcopy(QEMUFile *f)
{
SaveStateEntry *se;
int ret;
QTAILQ_FOREACH(se, &savevm_state.handlers, entry) {
if (!se->ops || !se->ops->save_live_complete_postcopy) {
continue;
}
if (se->ops->is_active) {
if (!se->ops->is_active(se->opaque)) {
continue;
}
}
trace_savevm_section_start(se->idstr, se->section_id);
/* Section type */
qemu_put_byte(f, QEMU_VM_SECTION_END);
qemu_put_be32(f, se->section_id);
ret = se->ops->save_live_complete_postcopy(f, se->opaque);
trace_savevm_section_end(se->idstr, se->section_id, ret);
save_section_footer(f, se);
if (ret < 0) {
qemu_file_set_error(f, ret);
return;
}
}
qemu_put_byte(f, QEMU_VM_EOF);
qemu_fflush(f);
}
static
int qemu_savevm_state_complete_precopy_iterable(QEMUFile *f, bool in_postcopy)
{
SaveStateEntry *se;
int ret;
QTAILQ_FOREACH(se, &savevm_state.handlers, entry) {
if (!se->ops ||
(in_postcopy && se->ops->has_postcopy &&
se->ops->has_postcopy(se->opaque)) ||
!se->ops->save_live_complete_precopy) {
continue;
}
if (se->ops->is_active) {
if (!se->ops->is_active(se->opaque)) {
continue;
}
}
trace_savevm_section_start(se->idstr, se->section_id);
save_section_header(f, se, QEMU_VM_SECTION_END);
ret = se->ops->save_live_complete_precopy(f, se->opaque);
trace_savevm_section_end(se->idstr, se->section_id, ret);
save_section_footer(f, se);
if (ret < 0) {
qemu_file_set_error(f, ret);
return -1;
}
}
return 0;
}
int qemu_savevm_state_complete_precopy_non_iterable(QEMUFile *f,
bool in_postcopy,
bool inactivate_disks)
{
g_autoptr(JSONWriter) vmdesc = NULL;
int vmdesc_len;
SaveStateEntry *se;
int ret;
vmdesc = json_writer_new(false);
json_writer_start_object(vmdesc, NULL);
json_writer_int64(vmdesc, "page_size", qemu_target_page_size());
json_writer_start_array(vmdesc, "devices");
QTAILQ_FOREACH(se, &savevm_state.handlers, entry) {
if ((!se->ops || !se->ops->save_state) && !se->vmsd) {
continue;
}
if (se->vmsd && !vmstate_save_needed(se->vmsd, se->opaque)) {
trace_savevm_section_skip(se->idstr, se->section_id);
continue;
}
trace_savevm_section_start(se->idstr, se->section_id);
json_writer_start_object(vmdesc, NULL);
json_writer_str(vmdesc, "name", se->idstr);
json_writer_int64(vmdesc, "instance_id", se->instance_id);
save_section_header(f, se, QEMU_VM_SECTION_FULL);
ret = vmstate_save(f, se, vmdesc);
if (ret) {
qemu_file_set_error(f, ret);
return ret;
}
trace_savevm_section_end(se->idstr, se->section_id, 0);
save_section_footer(f, se);
json_writer_end_object(vmdesc);
}
if (inactivate_disks) {
/* Inactivate before sending QEMU_VM_EOF so that the
* bdrv_activate_all() on the other end won't fail. */
ret = bdrv_inactivate_all();
if (ret) {
error_report("%s: bdrv_inactivate_all() failed (%d)",
__func__, ret);
qemu_file_set_error(f, ret);
return ret;
}
}
if (!in_postcopy) {
/* Postcopy stream will still be going */
qemu_put_byte(f, QEMU_VM_EOF);
}
json_writer_end_array(vmdesc);
json_writer_end_object(vmdesc);
vmdesc_len = strlen(json_writer_get(vmdesc));
if (should_send_vmdesc()) {
qemu_put_byte(f, QEMU_VM_VMDESCRIPTION);
qemu_put_be32(f, vmdesc_len);
qemu_put_buffer(f, (uint8_t *)json_writer_get(vmdesc), vmdesc_len);
}
return 0;
}
int qemu_savevm_state_complete_precopy(QEMUFile *f, bool iterable_only,
bool inactivate_disks)
{
int ret;
Error *local_err = NULL;
bool in_postcopy = migration_in_postcopy();
if (precopy_notify(PRECOPY_NOTIFY_COMPLETE, &local_err)) {
error_report_err(local_err);
}
trace_savevm_state_complete_precopy();
cpu_synchronize_all_states();
if (!in_postcopy || iterable_only) {
ret = qemu_savevm_state_complete_precopy_iterable(f, in_postcopy);
if (ret) {
return ret;
}
}
if (iterable_only) {
goto flush;
}
ret = qemu_savevm_state_complete_precopy_non_iterable(f, in_postcopy,
inactivate_disks);
if (ret) {
return ret;
}
flush:
qemu_fflush(f);
return 0;
}
/* Give an estimate of the amount left to be transferred,
* the result is split into the amount for units that can and
* for units that can't do postcopy.
*/
void qemu_savevm_state_pending(QEMUFile *f, uint64_t threshold_size,
uint64_t *res_precopy_only,
uint64_t *res_compatible,
uint64_t *res_postcopy_only)
{
SaveStateEntry *se;
*res_precopy_only = 0;
*res_compatible = 0;
*res_postcopy_only = 0;
QTAILQ_FOREACH(se, &savevm_state.handlers, entry) {
if (!se->ops || !se->ops->save_live_pending) {
continue;
}
if (se->ops->is_active) {
if (!se->ops->is_active(se->opaque)) {
continue;
}
}
se->ops->save_live_pending(f, se->opaque, threshold_size,
res_precopy_only, res_compatible,
res_postcopy_only);
}
}
void qemu_savevm_state_cleanup(void)
{
SaveStateEntry *se;
Error *local_err = NULL;
if (precopy_notify(PRECOPY_NOTIFY_CLEANUP, &local_err)) {
error_report_err(local_err);
}
trace_savevm_state_cleanup();
QTAILQ_FOREACH(se, &savevm_state.handlers, entry) {
if (se->ops && se->ops->save_cleanup) {
se->ops->save_cleanup(se->opaque);
}
}
}
static int qemu_savevm_state(QEMUFile *f, Error **errp)
{
int ret;
MigrationState *ms = migrate_get_current();
MigrationStatus status;
if (migration_is_running(ms->state)) {
error_setg(errp, QERR_MIGRATION_ACTIVE);
return -EINVAL;
}
if (migrate_use_block()) {
error_setg(errp, "Block migration and snapshots are incompatible");
return -EINVAL;
}
migrate_init(ms);
memset(&ram_counters, 0, sizeof(ram_counters));
memset(&compression_counters, 0, sizeof(compression_counters));
ms->to_dst_file = f;
qemu_mutex_unlock_iothread();
qemu_savevm_state_header(f);
qemu_savevm_state_setup(f);
qemu_mutex_lock_iothread();
while (qemu_file_get_error(f) == 0) {
if (qemu_savevm_state_iterate(f, false) > 0) {
break;
}
}
ret = qemu_file_get_error(f);
if (ret == 0) {
qemu_savevm_state_complete_precopy(f, false, false);
ret = qemu_file_get_error(f);
}
qemu_savevm_state_cleanup();
if (ret != 0) {
error_setg_errno(errp, -ret, "Error while writing VM state");
}
if (ret != 0) {
status = MIGRATION_STATUS_FAILED;
} else {
status = MIGRATION_STATUS_COMPLETED;
}
migrate_set_state(&ms->state, MIGRATION_STATUS_SETUP, status);
/* f is outer parameter, it should not stay in global migration state after
* this function finished */
ms->to_dst_file = NULL;
return ret;
}
void qemu_savevm_live_state(QEMUFile *f)
{
/* save QEMU_VM_SECTION_END section */
qemu_savevm_state_complete_precopy(f, true, false);
qemu_put_byte(f, QEMU_VM_EOF);
}
int qemu_save_device_state(QEMUFile *f)
{
SaveStateEntry *se;
if (!migration_in_colo_state()) {
qemu_put_be32(f, QEMU_VM_FILE_MAGIC);
qemu_put_be32(f, QEMU_VM_FILE_VERSION);
}
cpu_synchronize_all_states();
QTAILQ_FOREACH(se, &savevm_state.handlers, entry) {
int ret;
if (se->is_ram) {
continue;
}
if ((!se->ops || !se->ops->save_state) && !se->vmsd) {
continue;
}
if (se->vmsd && !vmstate_save_needed(se->vmsd, se->opaque)) {
continue;
}
save_section_header(f, se, QEMU_VM_SECTION_FULL);
ret = vmstate_save(f, se, NULL);
if (ret) {
return ret;
}
save_section_footer(f, se);
}
qemu_put_byte(f, QEMU_VM_EOF);
return qemu_file_get_error(f);
}
static SaveStateEntry *find_se(const char *idstr, uint32_t instance_id)
{
SaveStateEntry *se;
QTAILQ_FOREACH(se, &savevm_state.handlers, entry) {
if (!strcmp(se->idstr, idstr) &&
(instance_id == se->instance_id ||
instance_id == se->alias_id))
return se;
/* Migrating from an older version? */
if (strstr(se->idstr, idstr) && se->compat) {
if (!strcmp(se->compat->idstr, idstr) &&
(instance_id == se->compat->instance_id ||
instance_id == se->alias_id))
return se;
}
}
return NULL;
}
enum LoadVMExitCodes {
/* Allow a command to quit all layers of nested loadvm loops */
LOADVM_QUIT = 1,
};
/* ------ incoming postcopy messages ------ */
/* 'advise' arrives before any transfers just to tell us that a postcopy
* *might* happen - it might be skipped if precopy transferred everything
* quickly.
*/
static int loadvm_postcopy_handle_advise(MigrationIncomingState *mis,
uint16_t len)
{
PostcopyState ps = postcopy_state_set(POSTCOPY_INCOMING_ADVISE);
uint64_t remote_pagesize_summary, local_pagesize_summary, remote_tps;
size_t page_size = qemu_target_page_size();
Error *local_err = NULL;
trace_loadvm_postcopy_handle_advise();
if (ps != POSTCOPY_INCOMING_NONE) {
error_report("CMD_POSTCOPY_ADVISE in wrong postcopy state (%d)", ps);
return -1;
}
switch (len) {
case 0:
if (migrate_postcopy_ram()) {
error_report("RAM postcopy is enabled but have 0 byte advise");
return -EINVAL;
}
return 0;
case 8 + 8:
if (!migrate_postcopy_ram()) {
error_report("RAM postcopy is disabled but have 16 byte advise");
return -EINVAL;
}
break;
default:
error_report("CMD_POSTCOPY_ADVISE invalid length (%d)", len);
return -EINVAL;
}
if (!postcopy_ram_supported_by_host(mis)) {
postcopy_state_set(POSTCOPY_INCOMING_NONE);
return -1;
}
remote_pagesize_summary = qemu_get_be64(mis->from_src_file);
local_pagesize_summary = ram_pagesize_summary();
if (remote_pagesize_summary != local_pagesize_summary) {
/*
* This detects two potential causes of mismatch:
* a) A mismatch in host page sizes
* Some combinations of mismatch are probably possible but it gets
* a bit more complicated. In particular we need to place whole
* host pages on the dest at once, and we need to ensure that we
* handle dirtying to make sure we never end up sending part of
* a hostpage on it's own.
* b) The use of different huge page sizes on source/destination
* a more fine grain test is performed during RAM block migration
* but this test here causes a nice early clear failure, and
* also fails when passed to an older qemu that doesn't
* do huge pages.
*/
error_report("Postcopy needs matching RAM page sizes (s=%" PRIx64
" d=%" PRIx64 ")",
remote_pagesize_summary, local_pagesize_summary);
return -1;
}
remote_tps = qemu_get_be64(mis->from_src_file);
if (remote_tps != page_size) {
/*
* Again, some differences could be dealt with, but for now keep it
* simple.
*/
error_report("Postcopy needs matching target page sizes (s=%d d=%zd)",
(int)remote_tps, page_size);
return -1;
}
if (postcopy_notify(POSTCOPY_NOTIFY_INBOUND_ADVISE, &local_err)) {
error_report_err(local_err);
return -1;
}
if (ram_postcopy_incoming_init(mis)) {
return -1;
}
return 0;
}
/* After postcopy we will be told to throw some pages away since they're
* dirty and will have to be demand fetched. Must happen before CPU is
* started.
* There can be 0..many of these messages, each encoding multiple pages.
*/
static int loadvm_postcopy_ram_handle_discard(MigrationIncomingState *mis,
uint16_t len)
{
int tmp;
char ramid[256];
PostcopyState ps = postcopy_state_get();
trace_loadvm_postcopy_ram_handle_discard();
switch (ps) {
case POSTCOPY_INCOMING_ADVISE:
/* 1st discard */
tmp = postcopy_ram_prepare_discard(mis);
if (tmp) {
return tmp;
}
break;
case POSTCOPY_INCOMING_DISCARD:
/* Expected state */
break;
default:
error_report("CMD_POSTCOPY_RAM_DISCARD in wrong postcopy state (%d)",
ps);
return -1;
}
/* We're expecting a
* Version (0)
* a RAM ID string (length byte, name, 0 term)
* then at least 1 16 byte chunk
*/
if (len < (1 + 1 + 1 + 1 + 2 * 8)) {
error_report("CMD_POSTCOPY_RAM_DISCARD invalid length (%d)", len);
return -1;
}
tmp = qemu_get_byte(mis->from_src_file);
if (tmp != postcopy_ram_discard_version) {
error_report("CMD_POSTCOPY_RAM_DISCARD invalid version (%d)", tmp);
return -1;
}
if (!qemu_get_counted_string(mis->from_src_file, ramid)) {
error_report("CMD_POSTCOPY_RAM_DISCARD Failed to read RAMBlock ID");
return -1;
}
tmp = qemu_get_byte(mis->from_src_file);
if (tmp != 0) {
error_report("CMD_POSTCOPY_RAM_DISCARD missing nil (%d)", tmp);
return -1;
}
len -= 3 + strlen(ramid);
if (len % 16) {
error_report("CMD_POSTCOPY_RAM_DISCARD invalid length (%d)", len);
return -1;
}
trace_loadvm_postcopy_ram_handle_discard_header(ramid, len);
while (len) {
uint64_t start_addr, block_length;
start_addr = qemu_get_be64(mis->from_src_file);
block_length = qemu_get_be64(mis->from_src_file);
len -= 16;
int ret = ram_discard_range(ramid, start_addr, block_length);
if (ret) {
return ret;
}
}
trace_loadvm_postcopy_ram_handle_discard_end();
return 0;
}
/*
* Triggered by a postcopy_listen command; this thread takes over reading
* the input stream, leaving the main thread free to carry on loading the rest
* of the device state (from RAM).
* (TODO:This could do with being in a postcopy file - but there again it's
* just another input loop, not that postcopy specific)
*/
static void *postcopy_ram_listen_thread(void *opaque)
{
MigrationIncomingState *mis = migration_incoming_get_current();
QEMUFile *f = mis->from_src_file;
int load_res;
MigrationState *migr = migrate_get_current();
object_ref(OBJECT(migr));
migrate_set_state(&mis->state, MIGRATION_STATUS_ACTIVE,
MIGRATION_STATUS_POSTCOPY_ACTIVE);
qemu_sem_post(&mis->thread_sync_sem);
trace_postcopy_ram_listen_thread_start();
rcu_register_thread();
/*
* Because we're a thread and not a coroutine we can't yield
* in qemu_file, and thus we must be blocking now.
*/
qemu_file_set_blocking(f, true);
load_res = qemu_loadvm_state_main(f, mis);
/*
* This is tricky, but, mis->from_src_file can change after it
* returns, when postcopy recovery happened. In the future, we may
* want a wrapper for the QEMUFile handle.
*/
f = mis->from_src_file;
/* And non-blocking again so we don't block in any cleanup */
qemu_file_set_blocking(f, false);
trace_postcopy_ram_listen_thread_exit();
if (load_res < 0) {
qemu_file_set_error(f, load_res);
dirty_bitmap_mig_cancel_incoming();
if (postcopy_state_get() == POSTCOPY_INCOMING_RUNNING &&
!migrate_postcopy_ram() && migrate_dirty_bitmaps())
{
error_report("%s: loadvm failed during postcopy: %d. All states "
"are migrated except dirty bitmaps. Some dirty "
"bitmaps may be lost, and present migrated dirty "
"bitmaps are correctly migrated and valid.",
__func__, load_res);
load_res = 0; /* prevent further exit() */
} else {
error_report("%s: loadvm failed: %d", __func__, load_res);
migrate_set_state(&mis->state, MIGRATION_STATUS_POSTCOPY_ACTIVE,
MIGRATION_STATUS_FAILED);
}
}
if (load_res >= 0) {
/*
* This looks good, but it's possible that the device loading in the
* main thread hasn't finished yet, and so we might not be in 'RUN'
* state yet; wait for the end of the main thread.
*/
qemu_event_wait(&mis->main_thread_load_event);
}
postcopy_ram_incoming_cleanup(mis);
if (load_res < 0) {
/*
* If something went wrong then we have a bad state so exit;
* depending how far we got it might be possible at this point
* to leave the guest running and fire MCEs for pages that never
* arrived as a desperate recovery step.
*/
rcu_unregister_thread();
exit(EXIT_FAILURE);
}
migrate_set_state(&mis->state, MIGRATION_STATUS_POSTCOPY_ACTIVE,
MIGRATION_STATUS_COMPLETED);
/*
* If everything has worked fine, then the main thread has waited
* for us to start, and we're the last use of the mis.
* (If something broke then qemu will have to exit anyway since it's
* got a bad migration state).
*/
migration_incoming_state_destroy();
qemu_loadvm_state_cleanup();
rcu_unregister_thread();
mis->have_listen_thread = false;
postcopy_state_set(POSTCOPY_INCOMING_END);
object_unref(OBJECT(migr));
return NULL;
}
/* After this message we must be able to immediately receive postcopy data */
static int loadvm_postcopy_handle_listen(MigrationIncomingState *mis)
{
PostcopyState ps = postcopy_state_set(POSTCOPY_INCOMING_LISTENING);
Error *local_err = NULL;
trace_loadvm_postcopy_handle_listen("enter");
if (ps != POSTCOPY_INCOMING_ADVISE && ps != POSTCOPY_INCOMING_DISCARD) {
error_report("CMD_POSTCOPY_LISTEN in wrong postcopy state (%d)", ps);
return -1;
}
if (ps == POSTCOPY_INCOMING_ADVISE) {
/*
* A rare case, we entered listen without having to do any discards,
* so do the setup that's normally done at the time of the 1st discard.
*/
if (migrate_postcopy_ram()) {
postcopy_ram_prepare_discard(mis);
}
}
trace_loadvm_postcopy_handle_listen("after discard");
/*
* Sensitise RAM - can now generate requests for blocks that don't exist
* However, at this point the CPU shouldn't be running, and the IO
* shouldn't be doing anything yet so don't actually expect requests
*/
if (migrate_postcopy_ram()) {
if (postcopy_ram_incoming_setup(mis)) {
postcopy_ram_incoming_cleanup(mis);
return -1;
}
}
trace_loadvm_postcopy_handle_listen("after uffd");
if (postcopy_notify(POSTCOPY_NOTIFY_INBOUND_LISTEN, &local_err)) {
error_report_err(local_err);
return -1;
}
mis->have_listen_thread = true;
postcopy_thread_create(mis, &mis->listen_thread, "postcopy/listen",
postcopy_ram_listen_thread, QEMU_THREAD_DETACHED);
trace_loadvm_postcopy_handle_listen("return");
return 0;
}
static void loadvm_postcopy_handle_run_bh(void *opaque)
{
Error *local_err = NULL;
MigrationIncomingState *mis = opaque;
trace_loadvm_postcopy_handle_run_bh("enter");
/* TODO we should move all of this lot into postcopy_ram.c or a shared code
* in migration.c
*/
cpu_synchronize_all_post_init();
trace_loadvm_postcopy_handle_run_bh("after cpu sync");
qemu_announce_self(&mis->announce_timer, migrate_announce_params());
trace_loadvm_postcopy_handle_run_bh("after announce");
/* Make sure all file formats throw away their mutable metadata.
* If we get an error here, just don't restart the VM yet. */
bdrv_activate_all(&local_err);
if (local_err) {
error_report_err(local_err);
local_err = NULL;
autostart = false;
}
trace_loadvm_postcopy_handle_run_bh("after invalidate cache");
dirty_bitmap_mig_before_vm_start();
if (autostart) {
/* Hold onto your hats, starting the CPU */
vm_start();
} else {
/* leave it paused and let management decide when to start the CPU */
runstate_set(RUN_STATE_PAUSED);
}
qemu_bh_delete(mis->bh);
trace_loadvm_postcopy_handle_run_bh("return");
}
/* After all discards we can start running and asking for pages */
static int loadvm_postcopy_handle_run(MigrationIncomingState *mis)
{
PostcopyState ps = postcopy_state_get();
trace_loadvm_postcopy_handle_run();
if (ps != POSTCOPY_INCOMING_LISTENING) {
error_report("CMD_POSTCOPY_RUN in wrong postcopy state (%d)", ps);
return -1;
}
postcopy_state_set(POSTCOPY_INCOMING_RUNNING);
mis->bh = qemu_bh_new(loadvm_postcopy_handle_run_bh, mis);
qemu_bh_schedule(mis->bh);
/* We need to finish reading the stream from the package
* and also stop reading anything more from the stream that loaded the
* package (since it's now being read by the listener thread).
* LOADVM_QUIT will quit all the layers of nested loadvm loops.
*/
return LOADVM_QUIT;
}
/* We must be with page_request_mutex held */
static gboolean postcopy_sync_page_req(gpointer key, gpointer value,
gpointer data)
{
MigrationIncomingState *mis = data;
void *host_addr = (void *) key;
ram_addr_t rb_offset;
RAMBlock *rb;
int ret;
rb = qemu_ram_block_from_host(host_addr, true, &rb_offset);
if (!rb) {
/*
* This should _never_ happen. However be nice for a migrating VM to
* not crash/assert. Post an error (note: intended to not use *_once
* because we do want to see all the illegal addresses; and this can
* never be triggered by the guest so we're safe) and move on next.
*/
error_report("%s: illegal host addr %p", __func__, host_addr);
/* Try the next entry */
return FALSE;
}
ret = migrate_send_rp_message_req_pages(mis, rb, rb_offset);
if (ret) {
/* Please refer to above comment. */
error_report("%s: send rp message failed for addr %p",
__func__, host_addr);
return FALSE;
}
trace_postcopy_page_req_sync(host_addr);
return FALSE;
}
static void migrate_send_rp_req_pages_pending(MigrationIncomingState *mis)
{
WITH_QEMU_LOCK_GUARD(&mis->page_request_mutex) {
g_tree_foreach(mis->page_requested, postcopy_sync_page_req, mis);
}
}
static int loadvm_postcopy_handle_resume(MigrationIncomingState *mis)
{
if (mis->state != MIGRATION_STATUS_POSTCOPY_RECOVER) {
error_report("%s: illegal resume received", __func__);
/* Don't fail the load, only for this. */
return 0;
}
/*
* Reset the last_rb before we resend any page req to source again, since
* the source should have it reset already.
*/
mis->last_rb = NULL;
/*
* This means source VM is ready to resume the postcopy migration.
*/
migrate_set_state(&mis->state, MIGRATION_STATUS_POSTCOPY_RECOVER,
MIGRATION_STATUS_POSTCOPY_ACTIVE);
trace_loadvm_postcopy_handle_resume();
/* Tell source that "we are ready" */
migrate_send_rp_resume_ack(mis, MIGRATION_RESUME_ACK_VALUE);
/*
* After a postcopy recovery, the source should have lost the postcopy
* queue, or potentially the requested pages could have been lost during
* the network down phase. Let's re-sync with the source VM by re-sending
* all the pending pages that we eagerly need, so these threads won't get
* blocked too long due to the recovery.
*
* Without this procedure, the faulted destination VM threads (waiting for
* page requests right before the postcopy is interrupted) can keep hanging
* until the pages are sent by the source during the background copying of
* pages, or another thread faulted on the same address accidentally.
*/
migrate_send_rp_req_pages_pending(mis);
/*
* It's time to switch state and release the fault thread to continue
* service page faults. Note that this should be explicitly after the
* above call to migrate_send_rp_req_pages_pending(). In short:
* migrate_send_rp_message_req_pages() is not thread safe, yet.
*/
qemu_sem_post(&mis->postcopy_pause_sem_fault);
if (migrate_postcopy_preempt()) {
/* The channel should already be setup again; make sure of it */
assert(mis->postcopy_qemufile_dst);
/* Kick the fast ram load thread too */
qemu_sem_post(&mis->postcopy_pause_sem_fast_load);
}
return 0;
}
/**
* Immediately following this command is a blob of data containing an embedded
* chunk of migration stream; read it and load it.
*
* @mis: Incoming state
* @length: Length of packaged data to read
*
* Returns: Negative values on error
*
*/
static int loadvm_handle_cmd_packaged(MigrationIncomingState *mis)
{
int ret;
size_t length;
QIOChannelBuffer *bioc;
length = qemu_get_be32(mis->from_src_file);
trace_loadvm_handle_cmd_packaged(length);
if (length > MAX_VM_CMD_PACKAGED_SIZE) {
error_report("Unreasonably large packaged state: %zu", length);
return -1;
}
bioc = qio_channel_buffer_new(length);
qio_channel_set_name(QIO_CHANNEL(bioc), "migration-loadvm-buffer");
ret = qemu_get_buffer(mis->from_src_file,
bioc->data,
length);
if (ret != length) {
object_unref(OBJECT(bioc));
error_report("CMD_PACKAGED: Buffer receive fail ret=%d length=%zu",
ret, length);
return (ret < 0) ? ret : -EAGAIN;
}
bioc->usage += length;
trace_loadvm_handle_cmd_packaged_received(ret);
QEMUFile *packf = qemu_file_new_input(QIO_CHANNEL(bioc));
ret = qemu_loadvm_state_main(packf, mis);
trace_loadvm_handle_cmd_packaged_main(ret);
qemu_fclose(packf);
object_unref(OBJECT(bioc));
return ret;
}
/*
* Handle request that source requests for recved_bitmap on
* destination. Payload format:
*
* len (1 byte) + ramblock_name (<255 bytes)
*/
static int loadvm_handle_recv_bitmap(MigrationIncomingState *mis,
uint16_t len)
{
QEMUFile *file = mis->from_src_file;
RAMBlock *rb;
char block_name[256];
size_t cnt;
cnt = qemu_get_counted_string(file, block_name);
if (!cnt) {
error_report("%s: failed to read block name", __func__);
return -EINVAL;
}
/* Validate before using the data */
if (qemu_file_get_error(file)) {
return qemu_file_get_error(file);
}
if (len != cnt + 1) {
error_report("%s: invalid payload length (%d)", __func__, len);
return -EINVAL;
}
rb = qemu_ram_block_by_name(block_name);
if (!rb) {
error_report("%s: block '%s' not found", __func__, block_name);
return -EINVAL;
}
migrate_send_rp_recv_bitmap(mis, block_name);
trace_loadvm_handle_recv_bitmap(block_name);
return 0;
}
static int loadvm_process_enable_colo(MigrationIncomingState *mis)
{
int ret = migration_incoming_enable_colo();
if (!ret) {
ret = colo_init_ram_cache();
if (ret) {
migration_incoming_disable_colo();
}
}
return ret;
}
/*
* Process an incoming 'QEMU_VM_COMMAND'
* 0 just a normal return
* LOADVM_QUIT All good, but exit the loop
* <0 Error
*/
static int loadvm_process_command(QEMUFile *f)
{
MigrationIncomingState *mis = migration_incoming_get_current();
uint16_t cmd;
uint16_t len;
uint32_t tmp32;
cmd = qemu_get_be16(f);
len = qemu_get_be16(f);
/* Check validity before continue processing of cmds */
if (qemu_file_get_error(f)) {
return qemu_file_get_error(f);
}
if (cmd >= MIG_CMD_MAX || cmd == MIG_CMD_INVALID) {
error_report("MIG_CMD 0x%x unknown (len 0x%x)", cmd, len);
return -EINVAL;
}
trace_loadvm_process_command(mig_cmd_args[cmd].name, len);
if (mig_cmd_args[cmd].len != -1 && mig_cmd_args[cmd].len != len) {
error_report("%s received with bad length - expecting %zu, got %d",
mig_cmd_args[cmd].name,
(size_t)mig_cmd_args[cmd].len, len);
return -ERANGE;
}
switch (cmd) {
case MIG_CMD_OPEN_RETURN_PATH:
if (mis->to_src_file) {
error_report("CMD_OPEN_RETURN_PATH called when RP already open");
/* Not really a problem, so don't give up */
return 0;
}
mis->to_src_file = qemu_file_get_return_path(f);
if (!mis->to_src_file) {
error_report("CMD_OPEN_RETURN_PATH failed");
return -1;
}
break;
case MIG_CMD_PING:
tmp32 = qemu_get_be32(f);
trace_loadvm_process_command_ping(tmp32);
if (!mis->to_src_file) {
error_report("CMD_PING (0x%x) received with no return path",
tmp32);
return -1;
}
migrate_send_rp_pong(mis, tmp32);
break;
case MIG_CMD_PACKAGED:
return loadvm_handle_cmd_packaged(mis);
case MIG_CMD_POSTCOPY_ADVISE:
return loadvm_postcopy_handle_advise(mis, len);
case MIG_CMD_POSTCOPY_LISTEN:
return loadvm_postcopy_handle_listen(mis);
case MIG_CMD_POSTCOPY_RUN:
return loadvm_postcopy_handle_run(mis);
case MIG_CMD_POSTCOPY_RAM_DISCARD:
return loadvm_postcopy_ram_handle_discard(mis, len);
case MIG_CMD_POSTCOPY_RESUME:
return loadvm_postcopy_handle_resume(mis);
case MIG_CMD_RECV_BITMAP:
return loadvm_handle_recv_bitmap(mis, len);
case MIG_CMD_ENABLE_COLO:
return loadvm_process_enable_colo(mis);
}
return 0;
}
/*
* Read a footer off the wire and check that it matches the expected section
*
* Returns: true if the footer was good
* false if there is a problem (and calls error_report to say why)
*/
static bool check_section_footer(QEMUFile *f, SaveStateEntry *se)
{
int ret;
uint8_t read_mark;
uint32_t read_section_id;
if (!migrate_get_current()->send_section_footer) {
/* No footer to check */
return true;
}
read_mark = qemu_get_byte(f);
ret = qemu_file_get_error(f);
if (ret) {
error_report("%s: Read section footer failed: %d",
__func__, ret);
return false;
}
if (read_mark != QEMU_VM_SECTION_FOOTER) {
error_report("Missing section footer for %s", se->idstr);
return false;
}
read_section_id = qemu_get_be32(f);
if (read_section_id != se->load_section_id) {
error_report("Mismatched section id in footer for %s -"
" read 0x%x expected 0x%x",
se->idstr, read_section_id, se->load_section_id);
return false;
}
/* All good */
return true;
}
static int
qemu_loadvm_section_start_full(QEMUFile *f, MigrationIncomingState *mis)
{
uint32_t instance_id, version_id, section_id;
SaveStateEntry *se;
char idstr[256];
int ret;
/* Read section start */
section_id = qemu_get_be32(f);
if (!qemu_get_counted_string(f, idstr)) {
error_report("Unable to read ID string for section %u",
section_id);
return -EINVAL;
}
instance_id = qemu_get_be32(f);
version_id = qemu_get_be32(f);
ret = qemu_file_get_error(f);
if (ret) {
error_report("%s: Failed to read instance/version ID: %d",
__func__, ret);
return ret;
}
trace_qemu_loadvm_state_section_startfull(section_id, idstr,
instance_id, version_id);
/* Find savevm section */
se = find_se(idstr, instance_id);
if (se == NULL) {
error_report("Unknown savevm section or instance '%s' %"PRIu32". "
"Make sure that your current VM setup matches your "
"saved VM setup, including any hotplugged devices",
idstr, instance_id);
return -EINVAL;
}
/* Validate version */
if (version_id > se->version_id) {
error_report("savevm: unsupported version %d for '%s' v%d",
version_id, idstr, se->version_id);
return -EINVAL;
}
se->load_version_id = version_id;
se->load_section_id = section_id;
/* Validate if it is a device's state */
if (xen_enabled() && se->is_ram) {
error_report("loadvm: %s RAM loading not allowed on Xen", idstr);
return -EINVAL;
}
ret = vmstate_load(f, se);
if (ret < 0) {
error_report("error while loading state for instance 0x%"PRIx32" of"
" device '%s'", instance_id, idstr);
return ret;
}
if (!check_section_footer(f, se)) {
return -EINVAL;
}
return 0;
}
static int
qemu_loadvm_section_part_end(QEMUFile *f, MigrationIncomingState *mis)
{
uint32_t section_id;
SaveStateEntry *se;
int ret;
section_id = qemu_get_be32(f);
ret = qemu_file_get_error(f);
if (ret) {
error_report("%s: Failed to read section ID: %d",
__func__, ret);
return ret;
}
trace_qemu_loadvm_state_section_partend(section_id);
QTAILQ_FOREACH(se, &savevm_state.handlers, entry) {
if (se->load_section_id == section_id) {
break;
}
}
if (se == NULL) {
error_report("Unknown savevm section %d", section_id);
return -EINVAL;
}
ret = vmstate_load(f, se);
if (ret < 0) {
error_report("error while loading state section id %d(%s)",
section_id, se->idstr);
return ret;
}
if (!check_section_footer(f, se)) {
return -EINVAL;
}
return 0;
}
static int qemu_loadvm_state_header(QEMUFile *f)
{
unsigned int v;
int ret;
v = qemu_get_be32(f);
if (v != QEMU_VM_FILE_MAGIC) {
error_report("Not a migration stream");
return -EINVAL;
}
v = qemu_get_be32(f);
if (v == QEMU_VM_FILE_VERSION_COMPAT) {
error_report("SaveVM v2 format is obsolete and don't work anymore");
return -ENOTSUP;
}
if (v != QEMU_VM_FILE_VERSION) {
error_report("Unsupported migration stream version");
return -ENOTSUP;
}
if (migrate_get_current()->send_configuration) {
if (qemu_get_byte(f) != QEMU_VM_CONFIGURATION) {
error_report("Configuration section missing");
qemu_loadvm_state_cleanup();
return -EINVAL;
}
ret = vmstate_load_state(f, &vmstate_configuration, &savevm_state, 0);
if (ret) {
qemu_loadvm_state_cleanup();
return ret;
}
}
return 0;
}
static int qemu_loadvm_state_setup(QEMUFile *f)
{
SaveStateEntry *se;
int ret;
trace_loadvm_state_setup();
QTAILQ_FOREACH(se, &savevm_state.handlers, entry) {
if (!se->ops || !se->ops->load_setup) {
continue;
}
if (se->ops->is_active) {
if (!se->ops->is_active(se->opaque)) {
continue;
}
}
ret = se->ops->load_setup(f, se->opaque);
if (ret < 0) {
qemu_file_set_error(f, ret);
error_report("Load state of device %s failed", se->idstr);
return ret;
}
}
return 0;
}
void qemu_loadvm_state_cleanup(void)
{
SaveStateEntry *se;
trace_loadvm_state_cleanup();
QTAILQ_FOREACH(se, &savevm_state.handlers, entry) {
if (se->ops && se->ops->load_cleanup) {
se->ops->load_cleanup(se->opaque);
}
}
}
/* Return true if we should continue the migration, or false. */
static bool postcopy_pause_incoming(MigrationIncomingState *mis)
{
int i;
trace_postcopy_pause_incoming();
assert(migrate_postcopy_ram());
/*
* Unregister yank with either from/to src would work, since ioc behind it
* is the same
*/
migration_ioc_unregister_yank_from_file(mis->from_src_file);
assert(mis->from_src_file);
qemu_file_shutdown(mis->from_src_file);
qemu_fclose(mis->from_src_file);
mis->from_src_file = NULL;
assert(mis->to_src_file);
qemu_file_shutdown(mis->to_src_file);
qemu_mutex_lock(&mis->rp_mutex);
qemu_fclose(mis->to_src_file);
mis->to_src_file = NULL;
qemu_mutex_unlock(&mis->rp_mutex);
/*
* NOTE: this must happen before reset the PostcopyTmpPages below,
* otherwise it's racy to reset those fields when the fast load thread
* can be accessing it in parallel.
*/
if (mis->postcopy_qemufile_dst) {
qemu_file_shutdown(mis->postcopy_qemufile_dst);
/* Take the mutex to make sure the fast ram load thread halted */
qemu_mutex_lock(&mis->postcopy_prio_thread_mutex);
migration_ioc_unregister_yank_from_file(mis->postcopy_qemufile_dst);
qemu_fclose(mis->postcopy_qemufile_dst);
mis->postcopy_qemufile_dst = NULL;
qemu_mutex_unlock(&mis->postcopy_prio_thread_mutex);
}
migrate_set_state(&mis->state, MIGRATION_STATUS_POSTCOPY_ACTIVE,
MIGRATION_STATUS_POSTCOPY_PAUSED);
/* Notify the fault thread for the invalidated file handle */
postcopy_fault_thread_notify(mis);
/*
* If network is interrupted, any temp page we received will be useless
* because we didn't mark them as "received" in receivedmap. After a
* proper recovery later (which will sync src dirty bitmap with receivedmap
* on dest) these cached small pages will be resent again.
*/
for (i = 0; i < mis->postcopy_channels; i++) {
postcopy_temp_page_reset(&mis->postcopy_tmp_pages[i]);
}
error_report("Detected IO failure for postcopy. "
"Migration paused.");
while (mis->state == MIGRATION_STATUS_POSTCOPY_PAUSED) {
qemu_sem_wait(&mis->postcopy_pause_sem_dst);
}
trace_postcopy_pause_incoming_continued();
return true;
}
int qemu_loadvm_state_main(QEMUFile *f, MigrationIncomingState *mis)
{
uint8_t section_type;
int ret = 0;
retry:
while (true) {
section_type = qemu_get_byte(f);
ret = qemu_file_get_error_obj_any(f, mis->postcopy_qemufile_dst, NULL);
if (ret) {
break;
}
trace_qemu_loadvm_state_section(section_type);
switch (section_type) {
case QEMU_VM_SECTION_START:
case QEMU_VM_SECTION_FULL:
ret = qemu_loadvm_section_start_full(f, mis);
if (ret < 0) {
goto out;
}
break;
case QEMU_VM_SECTION_PART:
case QEMU_VM_SECTION_END:
ret = qemu_loadvm_section_part_end(f, mis);
if (ret < 0) {
goto out;
}
break;
case QEMU_VM_COMMAND:
ret = loadvm_process_command(f);
trace_qemu_loadvm_state_section_command(ret);
if ((ret < 0) || (ret == LOADVM_QUIT)) {
goto out;
}
break;
case QEMU_VM_EOF:
/* This is the end of migration */
goto out;
default:
error_report("Unknown savevm section type %d", section_type);
ret = -EINVAL;
goto out;
}
}
out:
if (ret < 0) {
qemu_file_set_error(f, ret);
/* Cancel bitmaps incoming regardless of recovery */
dirty_bitmap_mig_cancel_incoming();
/*
* If we are during an active postcopy, then we pause instead
* of bail out to at least keep the VM's dirty data. Note
* that POSTCOPY_INCOMING_LISTENING stage is still not enough,
* during which we're still receiving device states and we
* still haven't yet started the VM on destination.
*
* Only RAM postcopy supports recovery. Still, if RAM postcopy is
* enabled, canceled bitmaps postcopy will not affect RAM postcopy
* recovering.
*/
if (postcopy_state_get() == POSTCOPY_INCOMING_RUNNING &&
migrate_postcopy_ram() && postcopy_pause_incoming(mis)) {
/* Reset f to point to the newly created channel */
f = mis->from_src_file;
goto retry;
}
}
return ret;
}
int qemu_loadvm_state(QEMUFile *f)
{
MigrationIncomingState *mis = migration_incoming_get_current();
Error *local_err = NULL;
int ret;
if (qemu_savevm_state_blocked(&local_err)) {
error_report_err(local_err);
return -EINVAL;
}
ret = qemu_loadvm_state_header(f);
if (ret) {
return ret;
}
if (qemu_loadvm_state_setup(f) != 0) {
return -EINVAL;
}
cpu_synchronize_all_pre_loadvm();
ret = qemu_loadvm_state_main(f, mis);
qemu_event_set(&mis->main_thread_load_event);
trace_qemu_loadvm_state_post_main(ret);
if (mis->have_listen_thread) {
/* Listen thread still going, can't clean up yet */
return ret;
}
if (ret == 0) {
ret = qemu_file_get_error(f);
}
/*
* Try to read in the VMDESC section as well, so that dumping tools that
* intercept our migration stream have the chance to see it.
*/
/* We've got to be careful; if we don't read the data and just shut the fd
* then the sender can error if we close while it's still sending.
* We also mustn't read data that isn't there; some transports (RDMA)
* will stall waiting for that data when the source has already closed.
*/
if (ret == 0 && should_send_vmdesc()) {
uint8_t *buf;
uint32_t size;
uint8_t section_type = qemu_get_byte(f);
if (section_type != QEMU_VM_VMDESCRIPTION) {
error_report("Expected vmdescription section, but got %d",
section_type);
/*
* It doesn't seem worth failing at this point since
* we apparently have an otherwise valid VM state
*/
} else {
buf = g_malloc(0x1000);
size = qemu_get_be32(f);
while (size > 0) {
uint32_t read_chunk = MIN(size, 0x1000);
qemu_get_buffer(f, buf, read_chunk);
size -= read_chunk;
}
g_free(buf);
}
}
qemu_loadvm_state_cleanup();
cpu_synchronize_all_post_init();
return ret;
}
int qemu_load_device_state(QEMUFile *f)
{
MigrationIncomingState *mis = migration_incoming_get_current();
int ret;
/* Load QEMU_VM_SECTION_FULL section */
ret = qemu_loadvm_state_main(f, mis);
if (ret < 0) {
error_report("Failed to load device state: %d", ret);
return ret;
}
cpu_synchronize_all_post_init();
return 0;
}
bool save_snapshot(const char *name, bool overwrite, const char *vmstate,
bool has_devices, strList *devices, Error **errp)
{
BlockDriverState *bs;
QEMUSnapshotInfo sn1, *sn = &sn1;
int ret = -1, ret2;
QEMUFile *f;
int saved_vm_running;
uint64_t vm_state_size;
g_autoptr(GDateTime) now = g_date_time_new_now_local();
AioContext *aio_context;
GLOBAL_STATE_CODE();
if (migration_is_blocked(errp)) {
return false;
}
if (!replay_can_snapshot()) {
error_setg(errp, "Record/replay does not allow making snapshot "
"right now. Try once more later.");
return false;
}
if (!bdrv_all_can_snapshot(has_devices, devices, errp)) {
return false;
}
/* Delete old snapshots of the same name */
if (name) {
if (overwrite) {
if (bdrv_all_delete_snapshot(name, has_devices,
devices, errp) < 0) {
return false;
}
} else {
ret2 = bdrv_all_has_snapshot(name, has_devices, devices, errp);
if (ret2 < 0) {
return false;
}
if (ret2 == 1) {
error_setg(errp,
"Snapshot '%s' already exists in one or more devices",
name);
return false;
}
}
}
bs = bdrv_all_find_vmstate_bs(vmstate, has_devices, devices, errp);
if (bs == NULL) {
return false;
}
aio_context = bdrv_get_aio_context(bs);
saved_vm_running = runstate_is_running();
ret = global_state_store();
if (ret) {
error_setg(errp, "Error saving global state");
return false;
}
vm_stop(RUN_STATE_SAVE_VM);
bdrv_drain_all_begin();
aio_context_acquire(aio_context);
memset(sn, 0, sizeof(*sn));
/* fill auxiliary fields */
sn->date_sec = g_date_time_to_unix(now);
sn->date_nsec = g_date_time_get_microsecond(now) * 1000;
sn->vm_clock_nsec = qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL);
if (replay_mode != REPLAY_MODE_NONE) {
sn->icount = replay_get_current_icount();
} else {
sn->icount = -1ULL;
}
if (name) {
pstrcpy(sn->name, sizeof(sn->name), name);
} else {
g_autofree char *autoname = g_date_time_format(now, "vm-%Y%m%d%H%M%S");
pstrcpy(sn->name, sizeof(sn->name), autoname);
}
/* save the VM state */
f = qemu_fopen_bdrv(bs, 1);
if (!f) {
error_setg(errp, "Could not open VM state file");
goto the_end;
}
ret = qemu_savevm_state(f, errp);
vm_state_size = qemu_file_total_transferred(f);
ret2 = qemu_fclose(f);
if (ret < 0) {
goto the_end;
}
if (ret2 < 0) {
ret = ret2;
goto the_end;
}
/* The bdrv_all_create_snapshot() call that follows acquires the AioContext
* for itself. BDRV_POLL_WHILE() does not support nested locking because
* it only releases the lock once. Therefore synchronous I/O will deadlock
* unless we release the AioContext before bdrv_all_create_snapshot().
*/
aio_context_release(aio_context);
aio_context = NULL;
ret = bdrv_all_create_snapshot(sn, bs, vm_state_size,
has_devices, devices, errp);
if (ret < 0) {
bdrv_all_delete_snapshot(sn->name, has_devices, devices, NULL);
goto the_end;
}
ret = 0;
the_end:
if (aio_context) {
aio_context_release(aio_context);
}
bdrv_drain_all_end();
if (saved_vm_running) {
vm_start();
}
return ret == 0;
}
void qmp_xen_save_devices_state(const char *filename, bool has_live, bool live,
Error **errp)
{
QEMUFile *f;
QIOChannelFile *ioc;
int saved_vm_running;
int ret;
if (!has_live) {
/* live default to true so old version of Xen tool stack can have a
* successful live migration */
live = true;
}
saved_vm_running = runstate_is_running();
vm_stop(RUN_STATE_SAVE_VM);
global_state_store_running();
ioc = qio_channel_file_new_path(filename, O_WRONLY | O_CREAT | O_TRUNC,
0660, errp);
if (!ioc) {
goto the_end;
}
qio_channel_set_name(QIO_CHANNEL(ioc), "migration-xen-save-state");
f = qemu_file_new_output(QIO_CHANNEL(ioc));
object_unref(OBJECT(ioc));
ret = qemu_save_device_state(f);
if (ret < 0 || qemu_fclose(f) < 0) {
error_setg(errp, QERR_IO_ERROR);
} else {
/* libxl calls the QMP command "stop" before calling
* "xen-save-devices-state" and in case of migration failure, libxl
* would call "cont".
* So call bdrv_inactivate_all (release locks) here to let the other
* side of the migration take control of the images.
*/
if (live && !saved_vm_running) {
ret = bdrv_inactivate_all();
if (ret) {
error_setg(errp, "%s: bdrv_inactivate_all() failed (%d)",
__func__, ret);
}
}
}
the_end:
if (saved_vm_running) {
vm_start();
}
}
void qmp_xen_load_devices_state(const char *filename, Error **errp)
{
QEMUFile *f;
QIOChannelFile *ioc;
int ret;
/* Guest must be paused before loading the device state; the RAM state
* will already have been loaded by xc
*/
if (runstate_is_running()) {
error_setg(errp, "Cannot update device state while vm is running");
return;
}
vm_stop(RUN_STATE_RESTORE_VM);
ioc = qio_channel_file_new_path(filename, O_RDONLY | O_BINARY, 0, errp);
if (!ioc) {
return;
}
qio_channel_set_name(QIO_CHANNEL(ioc), "migration-xen-load-state");
f = qemu_file_new_input(QIO_CHANNEL(ioc));
object_unref(OBJECT(ioc));
ret = qemu_loadvm_state(f);
qemu_fclose(f);
if (ret < 0) {
error_setg(errp, QERR_IO_ERROR);
}
migration_incoming_state_destroy();
}
bool load_snapshot(const char *name, const char *vmstate,
bool has_devices, strList *devices, Error **errp)
{
BlockDriverState *bs_vm_state;
QEMUSnapshotInfo sn;
QEMUFile *f;
int ret;
AioContext *aio_context;
MigrationIncomingState *mis = migration_incoming_get_current();
if (!bdrv_all_can_snapshot(has_devices, devices, errp)) {
return false;
}
ret = bdrv_all_has_snapshot(name, has_devices, devices, errp);
if (ret < 0) {
return false;
}
if (ret == 0) {
error_setg(errp, "Snapshot '%s' does not exist in one or more devices",
name);
return false;
}
bs_vm_state = bdrv_all_find_vmstate_bs(vmstate, has_devices, devices, errp);
if (!bs_vm_state) {
return false;
}
aio_context = bdrv_get_aio_context(bs_vm_state);
/* Don't even try to load empty VM states */
aio_context_acquire(aio_context);
ret = bdrv_snapshot_find(bs_vm_state, &sn, name);
aio_context_release(aio_context);
if (ret < 0) {
return false;
} else if (sn.vm_state_size == 0) {
error_setg(errp, "This is a disk-only snapshot. Revert to it "
" offline using qemu-img");
return false;
}
/*
* Flush the record/replay queue. Now the VM state is going
* to change. Therefore we don't need to preserve its consistency
*/
replay_flush_events();
/* Flush all IO requests so they don't interfere with the new state. */
bdrv_drain_all_begin();
ret = bdrv_all_goto_snapshot(name, has_devices, devices, errp);
if (ret < 0) {
goto err_drain;
}
/* restore the VM state */
f = qemu_fopen_bdrv(bs_vm_state, 0);
if (!f) {
error_setg(errp, "Could not open VM state file");
goto err_drain;
}
qemu_system_reset(SHUTDOWN_CAUSE_SNAPSHOT_LOAD);
mis->from_src_file = f;
if (!yank_register_instance(MIGRATION_YANK_INSTANCE, errp)) {
ret = -EINVAL;
goto err_drain;
}
aio_context_acquire(aio_context);
ret = qemu_loadvm_state(f);
migration_incoming_state_destroy();
aio_context_release(aio_context);
bdrv_drain_all_end();
if (ret < 0) {
error_setg(errp, "Error %d while loading VM state", ret);
return false;
}
return true;
err_drain:
bdrv_drain_all_end();
return false;
}
bool delete_snapshot(const char *name, bool has_devices,
strList *devices, Error **errp)
{
if (!bdrv_all_can_snapshot(has_devices, devices, errp)) {
return false;
}
if (bdrv_all_delete_snapshot(name, has_devices, devices, errp) < 0) {
return false;
}
return true;
}
void vmstate_register_ram(MemoryRegion *mr, DeviceState *dev)
{
qemu_ram_set_idstr(mr->ram_block,
memory_region_name(mr), dev);
qemu_ram_set_migratable(mr->ram_block);
}
void vmstate_unregister_ram(MemoryRegion *mr, DeviceState *dev)
{
qemu_ram_unset_idstr(mr->ram_block);
qemu_ram_unset_migratable(mr->ram_block);
}
void vmstate_register_ram_global(MemoryRegion *mr)
{
vmstate_register_ram(mr, NULL);
}
bool vmstate_check_only_migratable(const VMStateDescription *vmsd)
{
/* check needed if --only-migratable is specified */
if (!only_migratable) {
return true;
}
return !(vmsd && vmsd->unmigratable);
}
typedef struct SnapshotJob {
Job common;
char *tag;
char *vmstate;
strList *devices;
Coroutine *co;
Error **errp;
bool ret;
} SnapshotJob;
static void qmp_snapshot_job_free(SnapshotJob *s)
{
g_free(s->tag);
g_free(s->vmstate);
qapi_free_strList(s->devices);
}
static void snapshot_load_job_bh(void *opaque)
{
Job *job = opaque;
SnapshotJob *s = container_of(job, SnapshotJob, common);
int orig_vm_running;
job_progress_set_remaining(&s->common, 1);
orig_vm_running = runstate_is_running();
vm_stop(RUN_STATE_RESTORE_VM);
s->ret = load_snapshot(s->tag, s->vmstate, true, s->devices, s->errp);
if (s->ret && orig_vm_running) {
vm_start();
}
job_progress_update(&s->common, 1);
qmp_snapshot_job_free(s);
aio_co_wake(s->co);
}
static void snapshot_save_job_bh(void *opaque)
{
Job *job = opaque;
SnapshotJob *s = container_of(job, SnapshotJob, common);
job_progress_set_remaining(&s->common, 1);
s->ret = save_snapshot(s->tag, false, s->vmstate,
true, s->devices, s->errp);
job_progress_update(&s->common, 1);
qmp_snapshot_job_free(s);
aio_co_wake(s->co);
}
static void snapshot_delete_job_bh(void *opaque)
{
Job *job = opaque;
SnapshotJob *s = container_of(job, SnapshotJob, common);
job_progress_set_remaining(&s->common, 1);
s->ret = delete_snapshot(s->tag, true, s->devices, s->errp);
job_progress_update(&s->common, 1);
qmp_snapshot_job_free(s);
aio_co_wake(s->co);
}
static int coroutine_fn snapshot_save_job_run(Job *job, Error **errp)
{
SnapshotJob *s = container_of(job, SnapshotJob, common);
s->errp = errp;
s->co = qemu_coroutine_self();
aio_bh_schedule_oneshot(qemu_get_aio_context(),
snapshot_save_job_bh, job);
qemu_coroutine_yield();
return s->ret ? 0 : -1;
}
static int coroutine_fn snapshot_load_job_run(Job *job, Error **errp)
{
SnapshotJob *s = container_of(job, SnapshotJob, common);
s->errp = errp;
s->co = qemu_coroutine_self();
aio_bh_schedule_oneshot(qemu_get_aio_context(),
snapshot_load_job_bh, job);
qemu_coroutine_yield();
return s->ret ? 0 : -1;
}
static int coroutine_fn snapshot_delete_job_run(Job *job, Error **errp)
{
SnapshotJob *s = container_of(job, SnapshotJob, common);
s->errp = errp;
s->co = qemu_coroutine_self();
aio_bh_schedule_oneshot(qemu_get_aio_context(),
snapshot_delete_job_bh, job);
qemu_coroutine_yield();
return s->ret ? 0 : -1;
}
static const JobDriver snapshot_load_job_driver = {
.instance_size = sizeof(SnapshotJob),
.job_type = JOB_TYPE_SNAPSHOT_LOAD,
.run = snapshot_load_job_run,
};
static const JobDriver snapshot_save_job_driver = {
.instance_size = sizeof(SnapshotJob),
.job_type = JOB_TYPE_SNAPSHOT_SAVE,
.run = snapshot_save_job_run,
};
static const JobDriver snapshot_delete_job_driver = {
.instance_size = sizeof(SnapshotJob),
.job_type = JOB_TYPE_SNAPSHOT_DELETE,
.run = snapshot_delete_job_run,
};
void qmp_snapshot_save(const char *job_id,
const char *tag,
const char *vmstate,
strList *devices,
Error **errp)
{
SnapshotJob *s;
s = job_create(job_id, &snapshot_save_job_driver, NULL,
qemu_get_aio_context(), JOB_MANUAL_DISMISS,
NULL, NULL, errp);
if (!s) {
return;
}
s->tag = g_strdup(tag);
s->vmstate = g_strdup(vmstate);
s->devices = QAPI_CLONE(strList, devices);
job_start(&s->common);
}
void qmp_snapshot_load(const char *job_id,
const char *tag,
const char *vmstate,
strList *devices,
Error **errp)
{
SnapshotJob *s;
s = job_create(job_id, &snapshot_load_job_driver, NULL,
qemu_get_aio_context(), JOB_MANUAL_DISMISS,
NULL, NULL, errp);
if (!s) {
return;
}
s->tag = g_strdup(tag);
s->vmstate = g_strdup(vmstate);
s->devices = QAPI_CLONE(strList, devices);
job_start(&s->common);
}
void qmp_snapshot_delete(const char *job_id,
const char *tag,
strList *devices,
Error **errp)
{
SnapshotJob *s;
s = job_create(job_id, &snapshot_delete_job_driver, NULL,
qemu_get_aio_context(), JOB_MANUAL_DISMISS,
NULL, NULL, errp);
if (!s) {
return;
}
s->tag = g_strdup(tag);
s->devices = QAPI_CLONE(strList, devices);
job_start(&s->common);
}