The fact that the MMIO handler is not re-entrant causes an infinite
loop under certain conditions:
Guest write to TDT -> Loopback -> RX (DMA to TDT) -> TX
We now eliminate the effect of this problem locally in e1000 by adding
a boolean to struct E1000State that indicates when the TX side is busy.
Any new call that arrives while TX is busy returns early instead of
interfering with the ongoing work, which eliminates any risk of looping.
This is intended to address CVE-2021-20257.
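As a rough illustration of the guard (a toy model, not the actual e1000
code), the sketch below uses a hypothetical ToyNicState with a tx_busy
flag. All names are made up for the example; only the idea of returning
early on re-entry is taken from the description above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the device state; only the guard flag matters. */
struct ToyNicState {
    bool tx_busy;      /* set while the transmit path is running */
    int  tx_runs;      /* how many times the transmit body actually ran */
    int  loop_budget;  /* how often the toy loopback may re-trigger TX */
};

static void toy_start_xmit(struct ToyNicState *s);

/* Models loopback: receiving the frame writes the tail register again. */
static void toy_loopback_rx(struct ToyNicState *s)
{
    if (s->loop_budget-- > 0) {
        toy_start_xmit(s);   /* re-entrant call into the TX path */
    }
}

static void toy_start_xmit(struct ToyNicState *s)
{
    assert(s);
    if (s->tx_busy) {
        return;              /* a transmit is already in progress */
    }
    s->tx_busy = true;
    s->tx_runs++;
    toy_loopback_rx(s);
    s->tx_busy = false;
}
```

Without the tx_busy test, the two functions above would keep calling
each other until loop_budget is exhausted.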
Signed-off-by: Jon Maloy <jmaloy@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
(cherry picked from commit 25ddb946e6301f42cff3094ea1c25fb78813e7e9)
Signed-off-by: Michael Roth <michael.roth@amd.com>
While activating the device, vmxnet3_activate_device() does not
validate guest-supplied configuration values against the predefined
minimum and maximum limits. This may lead to integer overflow or
OOB access issues. Add checks to avoid this.
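A minimal sketch of this kind of validation follows. The limits, the
ToyGuestConfig struct and the field names are hypothetical and do not
come from the vmxnet3 sources; the point is only that each
guest-supplied value is range-checked before it is used.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical limits; the real device model defines its own constants. */
#define TOY_MIN_MTU    60u
#define TOY_MAX_MTU    9000u
#define TOY_MAX_QUEUES 8u

struct ToyGuestConfig {
    uint32_t mtu;
    uint32_t num_queues;
};

static bool toy_in_range(uint32_t v, uint32_t min, uint32_t max)
{
    return v >= min && v <= max;
}

/* Returns true only if every guest-supplied value is within its limits. */
static bool toy_validate_config(const struct ToyGuestConfig *cfg)
{
    assert(cfg);
    return toy_in_range(cfg->mtu, TOY_MIN_MTU, TOY_MAX_MTU) &&
           toy_in_range(cfg->num_queues, 1, TOY_MAX_QUEUES);
}
```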
Fixes: CVE-2021-20203
Buglink: https://bugs.launchpad.net/qemu/+bug/1913873
Reported-by: Gaoning Pan <pgn@zju.edu.cn>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Signed-off-by: Jason Wang <jasowang@redhat.com>
(cherry picked from commit d05dcd94aee88728facafb993c7280547eb4d645)
Signed-off-by: Michael Roth <michael.roth@amd.com>
The commit "virtio-blk: Configure all host notifiers in a single MR
transaction" introduced a second loop variable to perform cleanup in
a second loop, but mistakenly still refers to the first loop variable
within the second loop body.
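The pattern can be sketched as follows. This is a toy model with
hypothetical names, not the virtio-blk code: the rollback loop has its
own variable j, and its body must index with j rather than i.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_NUM_QUEUES 4

/* Hypothetical notifier setup: fails for queue fail_at, else marks it set. */
static bool toy_set_notifier(bool *set, int n, int fail_at)
{
    if (n == fail_at) {
        return false;
    }
    set[n] = true;
    return true;
}

static int toy_setup_all(bool *set, int fail_at)
{
    int i, j;

    for (i = 0; i < TOY_NUM_QUEUES; i++) {
        if (!toy_set_notifier(set, i, fail_at)) {
            /*
             * Roll back queues i-1 .. 0. Indexing with i here would
             * touch only the failed queue and leave the others set.
             */
            for (j = i; j-- > 0; ) {
                set[j] = false;
            }
            return -1;
        }
    }
    return 0;
}
```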
Fixes: d0267da61489 ("virtio-blk: Configure all host notifiers in a single MR transaction")
Signed-off-by: Mark Mielke <mark.mielke@gmail.com>
Message-id: CALm7yL08qarOu0dnQkTN+pa=BSRC92g31YpQQNDeAiT4yLZWQQ@mail.gmail.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 5b807181c27a940a3a7ad1f221a2e76a132cbdc0)
Signed-off-by: Michael Roth <michael.roth@amd.com>
The guest might select another drive on the bus by setting the
DRIVE_SEL bit of the DIGITAL OUTPUT REGISTER (DOR).
The current controller model doesn't expect a BlockBackend
to be NULL. A simple way to fix CVE-2021-20196 is to create
an empty BlockBackend when it is missing. All further
accesses will be safely handled, and the controller state
machines keep behaving correctly.
Cc: qemu-stable@nongnu.org
Fixes: CVE-2021-20196
Reported-by: Gaoning Pan (Ant Security Light-Year Lab) <pgn@zju.edu.cn>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Reviewed-by: Hanna Reitz <hreitz@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20211124161536.631563-3-philmd@redhat.com
BugLink: https://bugs.launchpad.net/qemu/+bug/1912780
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/338
Signed-off-by: John Snow <jsnow@redhat.com>
(cherry picked from commit 1ab95af033a419e7a64e2d58e67dd96b20af5233)
Signed-off-by: Michael Roth <michael.roth@amd.com>
We are going to re-use this code in the next commit,
so extract it as a new blk_create_empty_drive() function.
Inspired-by: Hanna Reitz <hreitz@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20211124161536.631563-2-philmd@redhat.com
Signed-off-by: John Snow <jsnow@redhat.com>
(cherry picked from commit b154791e7b6d4ca5cdcd54443484d97360bd7ad2)
Signed-off-by: Michael Roth <michael.roth@amd.com>
hostwin is allocated and added to hostwin_list in vfio_host_win_add, but
it is only deleted from hostwin_list in vfio_host_win_del, which causes
a memory leak. Also, freeing all elements in hostwin_list is missing in
vfio_disconnect_container.
Fixes: 2e4109de8e58 ("vfio/spapr: Create DMA window dynamically (SPAPR IOMMU v2)")
CC: qemu-stable@nongnu.org
Signed-off-by: Peng Liang <liangpeng10@huawei.com>
Link: https://lore.kernel.org/r/20211117014739.1839263-1-liangpeng10@huawei.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
(cherry picked from commit f3bc3a73c908df15966e66f88d5a633bd42fd029)
Signed-off-by: Michael Roth <michael.roth@amd.com>
We used to access the packed descriptor event and off_wrap fields via
address_space_{write|read}_cached(). When we hit the cache, memcpy()
is used, which is not atomic and may lead to a wrong value being read
or written.
This patch fixes this by switching to
virtio_{stw|lduw}_phys_cached() to make sure the access is atomic.
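To illustrate the difference, here is a sketch using plain C11 atomics
rather than QEMU's own helpers. The toy_flags variable is a hypothetical
stand-in for a 16-bit field shared with the guest. A single 16-bit
atomic access cannot be observed half-written, which a byte-wise copy
does not guarantee.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for a 16-bit field shared with another agent. */
static _Atomic uint16_t toy_flags;

/* One indivisible 16-bit store; no reader can see a torn value. */
static void toy_store_flags(uint16_t v)
{
    atomic_store_explicit(&toy_flags, v, memory_order_relaxed);
}

/* One indivisible 16-bit load. */
static uint16_t toy_load_flags(void)
{
    return atomic_load_explicit(&toy_flags, memory_order_relaxed);
}
```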
Fixes: 683f7665679c1 ("virtio: event suppression support for packed ring")
Cc: qemu-stable@nongnu.org
Signed-off-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20211111063854.29060-2-jasowang@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit d152cdd6f6fad381e804c8185f0ba938030ccac9)
Signed-off-by: Michael Roth <michael.roth@amd.com>
We used to access the packed descriptor flags via
address_space_{write|read}_cached(). When we hit the cache, memcpy()
is used, which is not an atomic operation and may lead to a wrong value
being read or written.
So this patch switches to virtio_{stw|lduw}_phys_cached() to make
sure the access is atomic.
Fixes: 86044b24e865f ("virtio: basic packed virtqueue support")
Cc: qemu-stable@nongnu.org
Signed-off-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20211111063854.29060-1-jasowang@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit f463e761a41ee71e59892121e1c74d9c25c985d2)
Signed-off-by: Michael Roth <michael.roth@amd.com>
Mark the property as experimental/internal by adding an 'x-' prefix.
The property was introduced in 6.1 and should have provided the
ability to turn on native PCIe hotplug on a port even when
ACPI PCI hotplug is in use, if the user explicitly sets the property
on the CLI. However, that never worked since the slot is wired to
the ACPI hotplug controller.
Another unintended use case: disabling native hotplug on a slot
when ACPI-based hotplug is disabled, which works, but the slot has a
'hotplug' property for this task.
It should be relatively safe to rename it to experimental,
as no users should exist for it, and given that the property
is broken we don't really want to leave it around for much
longer lest users start using it.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Ani Sinha <ani@anisinha.ca>
Message-Id: <20211112110857.3116853-2-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 2aa1842d6d79dcd1b84c58eeb44591a99a9e56df)
Signed-off-by: Michael Roth <michael.roth@amd.com>
This avoids an off-by-one read of 'mode_sense_valid' buffer in
hw/scsi/scsi-disk.c:mode_sense_page().
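For readers unfamiliar with this class of bug, a generic sketch of an
off-by-one table lookup follows. It is not the actual patch; the table,
its size and the function name are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical table of supported pages, indexed by page code 0x00..0x3e. */
static const int toy_page_valid[0x3f] = { [0x01] = 1, [0x08] = 1 };

static bool toy_page_supported(unsigned page)
{
    /*
     * '>=' rather than '>': an index equal to the array size is
     * already one element past the end of the table.
     */
    if (page >= TOY_ARRAY_SIZE(toy_page_valid)) {
        return false;
    }
    return toy_page_valid[page] != 0;
}
```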
Fixes: CVE-2021-3930
Cc: qemu-stable@nongnu.org
Reported-by: Alexander Bulekov <alxndr@bu.edu>
Fixes: a8f4bbe2900 ("scsi-disk: store valid mode pages in a table")
Fixes: #546
Reported-by: Qiuhao Li <Qiuhao.Li@outlook.com>
Signed-off-by: Mauro Matteo Cascella <mcascell@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit b3af7fdf9cc537f8f0dd3e2423d83f5c99a457e8)
Signed-off-by: Michael Roth <michael.roth@amd.com>
The PCI resource reserve capability should use LE format, like all other
PCI things. If it doesn't, SeaBIOS won't boot:
=== PCI new allocation pass #1 ===
PCI: check devices
PCI: QEMU resource reserve cap: size 10000000000000 type io
PCI: secondary bus 1 size 10000000000000 type io
PCI: secondary bus 1 size 00200000 type mem
PCI: secondary bus 1 size 00200000 type prefmem
=== PCI new allocation pass #2 ===
PCI: out of I/O address space
This became more important once we started reserving IO by default;
previously no one noticed.
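The effect can be reproduced with a small sketch. The helper names are
hypothetical, and the value 0x1000 is only an assumption that happens to
be consistent with the size printed in the log above: stored big-endian
and read back as little-endian, it becomes 0x10000000000000.

```c
#include <assert.h>
#include <stdint.h>

/* Store a 64-bit value in little-endian byte order, on any host. */
static void toy_store_le64(uint8_t *buf, uint64_t v)
{
    for (int i = 0; i < 8; i++) {
        buf[i] = (uint8_t)(v >> (8 * i));
    }
}

/* Store a 64-bit value in big-endian byte order, on any host. */
static void toy_store_be64(uint8_t *buf, uint64_t v)
{
    for (int i = 0; i < 8; i++) {
        buf[i] = (uint8_t)(v >> (8 * (7 - i)));
    }
}

/* Read a 64-bit little-endian value, as a firmware consumer would. */
static uint64_t toy_load_le64(const uint8_t *buf)
{
    uint64_t v = 0;

    for (int i = 0; i < 8; i++) {
        v |= (uint64_t)buf[i] << (8 * i);
    }
    return v;
}
```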
Fixes: e2a6290aab ("hw/pcie-root-port: Fix hotplug for PCI devices requiring IO")
Cc: marcel.apfelbaum@gmail.com
Fixes: 226263fb5c ("hw/pci: add QEMU-specific PCI capability to the Generic PCI Express Root Port")
Cc: zuban32s@gmail.com
Fixes: 6755e618d0 ("hw/pci: add PCI resource reserve capability to legacy PCI bridge")
Cc: jing2.liu@linux.intel.com
Tested-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
(cherry picked from commit 0e464f7d993113119f0fd17b890831440734ce15)
Signed-off-by: Michael Roth <michael.roth@amd.com>
Linux limits the size of iovecs to 1024 (UIO_MAXIOV in the kernel
sources, IOV_MAX in POSIX). Because of this, on some host adapters
requests with many iovecs are rejected with -EINVAL by the
io_submit() or readv()/writev() system calls.
In fact, the same limit applies to SG_IO as well. To fix both the
EINVAL and the possible performance issues from using fewer iovecs
than allowed by Linux (some HBAs have max_segments as low as 128),
introduce a separate entry in BlockLimits to hold the max_segments
value from sysfs. This new limit is used only for SG_IO and clamped
to bs->bl.max_iov anyway, just like max_hw_transfer is clamped to
bs->bl.max_transfer.
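The clamping can be sketched like this. The function names are
hypothetical, and treating 0 as "no limit set" is an assumption of the
sketch, not something stated above.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Smaller of two limits, where 0 is taken to mean "no limit set".
 * That convention is an assumption made for this example.
 */
static uint32_t toy_min_non_zero(uint32_t a, uint32_t b)
{
    if (a == 0) {
        return b;
    }
    if (b == 0) {
        return a;
    }
    return a < b ? a : b;
}

/* Segment limit for SG_IO: the hardware limit, clamped to the iovec limit. */
static uint32_t toy_sg_io_max_segments(uint32_t hw_max_segments,
                                       uint32_t max_iov)
{
    return toy_min_non_zero(hw_max_segments, max_iov);
}
```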
Reported-by: Halil Pasic <pasic@linux.ibm.com>
Cc: Hanna Reitz <hreitz@redhat.com>
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: qemu-block@nongnu.org
Cc: qemu-stable@nongnu.org
Fixes: 18473467d5 ("file-posix: try BLKSECTGET on block devices too, do not round to power of 2", 2021-06-25)
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20210923130436.1187591-1-pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit cc071629539dc1f303175a7e2d4ab854c0a8b20f)
Signed-off-by: Michael Roth <michael.roth@amd.com>
Since commit d8fb7d0969d5 ("vl: switch -M parsing to keyval"), machine
parameter definitions cannot use underscores, because keyval_dashify()
transforms them to dashes and the parser doesn't find the parameter.
This affects option default_bus_bypass_iommu which was introduced in the
same release:
$ qemu-system-x86_64 -M q35,default_bus_bypass_iommu=on
qemu-system-x86_64: Property 'pc-q35-6.1-machine.default-bus-bypass-iommu' not found
Rename the parameter to "default-bus-bypass-iommu". Passing
"default_bus_bypass_iommu" is still valid since underscores are
transformed automatically.
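A toy equivalent of the transformation, to show why a property defined
with underscores can never match. The function name toy_dashify is
hypothetical; this is not the keyval_dashify() implementation.

```c
#include <assert.h>
#include <string.h>

/* Replace every '_' in a key with '-', in place. */
static void toy_dashify(char *key)
{
    for (char *p = key; *p; p++) {
        if (*p == '_') {
            *p = '-';
        }
    }
}
```

After this step the parser looks up "default-bus-bypass-iommu", so only
a property registered under the dashed name is found.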
Fixes: c9e96b04fc19 ("hw/i386: Add a default_bus_bypass_iommu pc machine option")
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Message-Id: <20211025104737.1560274-1-jean-philippe@linaro.org>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 739b38630c45585cd9d372d44537f69c0b2b4346)
Signed-off-by: Michael Roth <michael.roth@amd.com>
Since commit d8fb7d0969d5 ("vl: switch -M parsing to keyval"), machine
parameter definitions cannot use underscores, because keyval_dashify()
transforms them to dashes and the parser doesn't find the parameter.
This affects option default_bus_bypass_iommu which was introduced in the
same release:
$ qemu-system-aarch64 -M virt,default_bus_bypass_iommu=on
qemu-system-aarch64: Property 'virt-6.1-machine.default-bus-bypass-iommu' not found
Rename the parameter to "default-bus-bypass-iommu". Passing
"default_bus_bypass_iommu" is still valid since underscores are
transformed automatically.
Fixes: 6d7a85483a06 ("hw/arm/virt: Add default_bus_bypass_iommu machine option")
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Tested-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20211026093733.2144161-1-jean-philippe@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
(cherry picked from commit 9dad363a223df8269175d218413aa8cd265e078e)
Signed-off-by: Michael Roth <michael.roth@amd.com>
Commit 1e08fd0a46 ("vhost-vsock: SOCK_SEQPACKET feature bit support")
enabled the SEQPACKET feature bit.
That commit was released with QEMU 6.1, so if we try to migrate a VM where
the host kernel supports SEQPACKET but the machine type version is less
than 6.1, we get the following errors:
Features 0x130000002 unsupported. Allowed features: 0x179000000
Failed to load virtio-vhost_vsock:virtio
error while loading state for instance 0x0 of device '0000:00:05.0/virtio-vhost_vsock'
load of migration failed: Operation not permitted
Let's disable the feature bit for machine types < 6.1.
We add a new OnOffAuto property for this, called `seqpacket`.
When it is `auto` (default), QEMU behaves as before and tries to enable
the feature; when it is `on`, QEMU will fail if the backend (vhost-vsock
kernel module) doesn't support it.
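The three states can be sketched as below. The enum and function names
are hypothetical; the behaviour for `off` (feature disabled) is inferred
from the purpose of the property rather than spelled out above.

```c
#include <assert.h>
#include <stdbool.h>

enum ToyOnOffAuto { TOY_AUTO, TOY_ON, TOY_OFF };

/*
 * Decide whether to offer the feature bit.
 * Returns 0 and sets *enable, or -1 if 'on' was requested but the
 * backend cannot provide the feature.
 */
static int toy_resolve_seqpacket(enum ToyOnOffAuto prop,
                                 bool backend_supports, bool *enable)
{
    assert(enable);
    switch (prop) {
    case TOY_OFF:
        *enable = false;
        return 0;
    case TOY_ON:
        if (!backend_supports) {
            return -1;
        }
        *enable = true;
        return 0;
    case TOY_AUTO:
    default:
        *enable = backend_supports;
        return 0;
    }
}
```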
Fixes: 1e08fd0a46 ("vhost-vsock: SOCK_SEQPACKET feature bit support")
Cc: qemu-stable@nongnu.org
Reported-by: Jiang Wang <jiang.wang@bytedance.com>
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Message-Id: <20210921161642.206461-2-sgarzare@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit d6a9378f47515c6d70dbff4912c5740c98709880)
Signed-off-by: Michael Roth <michael.roth@amd.com>
v9fs_walk() utilizes the v9fs_co_run_in_worker({...}) macro to run the
supplied fs driver code block on a background worker thread.
When the 'Twalk' client request was interrupted, or when the fid
requested by the client for that 'Twalk' request caused a stat error,
the fs driver code block was left with the 'break' keyword, with the
intention of returning from the worker thread back to the main thread
as well:
    v9fs_co_run_in_worker({
        if (v9fs_request_cancelled(pdu)) {
            err = -EINTR;
            break;
        }
        err = s->ops->lstat(&s->ctx, &dpath, &fidst);
        if (err < 0) {
            err = -errno;
            break;
        }
        ...
    });
However, that 'break;' statement also skipped the v9fs_co_run_in_worker()
macro's final and mandatory
    /* re-enter back to qemu thread */
    qemu_coroutine_yield();
call, and thus caused the rest of v9fs_walk() to continue executing
on the worker thread instead of the main thread, eventually
leading to a crash in the virtio transport driver.
To fix this issue and to prevent the same error from happening again by
other users of v9fs_co_run_in_worker() in future, auto wrap the supplied
code block into its own
    do { } while (0);
loop inside the 'v9fs_co_run_in_worker' macro definition.
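The effect of the extra loop can be shown with a toy macro pair. The
names are hypothetical and a counter stands in for the mandatory final
qemu_coroutine_yield(); this is a model of the problem, not the 9p
code.

```c
#include <assert.h>
#include <stdbool.h>

static int toy_epilogue_calls;  /* counts how often the mandatory tail ran */

/* A 'break' in code_block leaves the macro's own loop, skipping the tail. */
#define TOY_RUN_UNWRAPPED(code_block) \
    do {                              \
        code_block;                   \
        toy_epilogue_calls++;         \
    } while (0)

/* A 'break' in code_block only leaves the inner loop; the tail still runs. */
#define TOY_RUN_WRAPPED(code_block)   \
    do {                              \
        do {                          \
            code_block;               \
        } while (0);                  \
        toy_epilogue_calls++;         \
    } while (0)

static int toy_walk(bool cancelled, bool wrapped)
{
    int err = 0;

    if (wrapped) {
        TOY_RUN_WRAPPED({
            if (cancelled) {
                err = -1;
                break;
            }
            err = 1;
        });
    } else {
        TOY_RUN_UNWRAPPED({
            if (cancelled) {
                err = -1;
                break;
            }
            err = 1;
        });
    }
    return err;
}
```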
Full discussion and backtrace:
https://lists.gnu.org/archive/html/qemu-devel/2021-08/msg05209.html
https://lists.gnu.org/archive/html/qemu-devel/2021-09/msg00174.html
Fixes: 8d6cb100731c4d28535adbf2a3c2d1f29be3fef4
Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Cc: qemu-stable@nongnu.org
Reviewed-by: Greg Kurz <groug@kaod.org>
Message-Id: <E1mLTBg-0002Bh-2D@lizzy.crudebyte.com>
(cherry picked from commit f83df00900816476cca41bb536e4d532b297d76e)
Signed-off-by: Michael Roth <michael.roth@amd.com>
The CDE desktop on HP-UX 10 shows wrongly rendered pixels when the local screen
menu is closed. This bug was introduced by commit c7050f3f167b
("hw/display/artist: Refactor x/y coordination extraction") which converted the
coordinate extraction in artist_vram_read() and artist_vram_write() to use the
ADDR_TO_X and ADDR_TO_Y macros, but forgot to right-shift the address by 2 as
it was done before.
Signed-off-by: Helge Deller <deller@gmx.de>
Fixes: c7050f3f167b ("hw/display/artist: Refactor x/y coordination extraction")
Cc: Philippe Mathieu-Daudé <f4bug@amsat.org>
Cc: Richard Henderson <richard.henderson@linaro.org>
Cc: Sven Schnelle <svens@stackframe.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <YK1aPb8keur9W7h2@ls3530>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 01f750f5fef1afd8f6abc0548910f87d473e26d5)
Signed-off-by: Michael Roth <michael.roth@amd.com>
When a device resumes after suspend, the VQ notifier MR is still valid.
Duplicated registrations explode the memory block list and slow down
device resume.
Fixes: 44866521bd6e ("vhost-user: support registering external host notifiers")
Cc: tiwei.bie@intel.com
Cc: qemu-stable@nongnu.org
Cc: Yuwei Zhang <zhangyuwei.9149@bytedance.com>
Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Message-Id: <20211008080215.590292-1-xuemingl@nvidia.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit a1ed9ef1de87c3e86ff68589604298ec90875a14)
Signed-off-by: Michael Roth <michael.roth@amd.com>
The device uses the guest-supplied stream number unchecked, which can
lead to guest-triggered out-of-bounds access to the UASDevice->data3 and
UASDevice->status3 fields. Add the missing checks.
Fixes: CVE-2021-3713
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reported-by: Chen Zhe <chenzhe@huawei.com>
Reported-by: Tan Jingguo <tanjingguo@huawei.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20210818120505.1258262-2-kraxel@redhat.com>
(cherry picked from commit 13b250b12ad3c59114a6a17d59caf073ce45b33a)
Signed-off-by: Michael Roth <michael.roth@amd.com>
Apparently, we don't have to duplicate the string.
Fixes: 722a3c783ef4 ("virtio-pci: Send qapi events when the virtio-mem size changes")
Cc: qemu-stable@nongnu.org
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20210929162445.64060-2-david@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 75b98cb9f6456ccf194211beffcbf93b0a995fa4)
Signed-off-by: Michael Roth <michael.roth@amd.com>
When mergeable buffers are enabled, we try to set num_buffers after
the virtqueue elem has been unmapped. This leads to several issues,
e.g. a use after free when the descriptor has an address that belongs
to a non-direct-access region. In this case we use a bounce buffer
that is allocated during address_space_map() and freed during
address_space_unmap().
Fix this by storing the elems temporarily in an array and delaying the
unmap until after we set num_buffers.
This addresses CVE-2021-3748.
Reported-by: Alexander Bulekov <alxndr@bu.edu>
Fixes: fbe78f4f55c6 ("virtio-net support")
Cc: qemu-stable@nongnu.org
Signed-off-by: Jason Wang <jasowang@redhat.com>
(cherry picked from commit bedd7e93d01961fcb16a97ae45d93acf357e11f6)
Signed-off-by: Michael Roth <michael.roth@amd.com>
Postcopy never worked properly with 'free-page-hint=on', as there are
at least two issues:
1) With postcopy, the guest will never receive a VIRTIO_BALLOON_CMD_ID_DONE
and consequently won't release free pages back to the OS once
migration finishes.
The issue is that for postcopy, we won't do a final bitmap sync while
the guest is stopped on the source and
virtio_balloon_free_page_hint_notify() will only call
virtio_balloon_free_page_done() on the source during
PRECOPY_NOTIFY_CLEANUP, after the VM state was already migrated to
the destination.
2) Once the VM touches a page on the destination that has been excluded
from migration on the source via qemu_guest_free_page_hint() while
postcopy is active, that thread will stall until postcopy finishes
and all threads are woken up. (with older Linux kernels that won't
retry faults when woken up via userfaultfd, we might actually get a
SEGFAULT)
The issue is that the source will refuse to migrate any pages that
are not marked as dirty in the dirty bmap -- for example, because the
page might just have been sent. Consequently, the faulting thread will
stall, waiting for the page to be migrated -- which could take quite
a while and result in guest OS issues.
While we could fix 1) comparatively easily, 2) is harder to get right and
might require more involved RAM migration changes on source and destination
[1].
As it never worked properly, let's not start free page hinting in the
precopy notifier if the postcopy migration capability was enabled to fix
it easily. Capabilities cannot be enabled once migration is already
running.
Note 1: in the future we might either adjust migration code on the source
to track pages that have actually been sent or adjust
migration code on source and destination to eventually send
pages multiple times from the source and deal with pages
that are sent multiple times on the destination.
Note 2: virtio-mem has similar issues, however, access to "unplugged"
memory by the guest is very rare and we would have to be very
lucky for it to happen during migration. The spec states
"The driver SHOULD NOT read from unplugged memory blocks ..."
and "The driver MUST NOT write to unplugged memory blocks".
virtio-mem will move away from virtio_balloon_free_page_done()
soon and handle this case explicitly on the destination.
[1] https://lkml.kernel.org/r/e79fd18c-aa62-c1d8-c7f3-ba3fc2c25fc8@redhat.com
Fixes: c13c4153f76d ("virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_HINT")
Cc: qemu-stable@nongnu.org
Cc: Wei Wang <wei.w.wang@intel.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Philippe Mathieu-Daudé <philmd@redhat.com>
Cc: Alexander Duyck <alexander.duyck@gmail.com>
Cc: Juan Quintela <quintela@redhat.com>
Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Cc: Peter Xu <peterx@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210708095339.20274-2-david@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
(cherry picked from commit fd51e54fa10221e5a8add894c38cc1cf199f4bc4)
Signed-off-by: Michael Roth <michael.roth@amd.com>
machine_set_smp() mistakenly checks 'errp' not '*errp',
and so thinks there is an error every single time it runs.
This causes it to jump to the end of the method, skipping
the max CPUs checks. The caller meanwhile sees no error
and so carries on execution. The result of all this is:
$ qemu-system-x86_64 -smp -1
qemu-system-x86_64: GLib: ../glib/gmem.c:142: failed to allocate 481036337048 bytes
instead of
$ qemu-system-x86_64 -smp -1
qemu-system-x86_64: Invalid SMP CPUs -1. The max CPUs supported by machine 'pc-i440fx-6.1' is 255
This is a regression from
commit fe68090e8fbd6e831aaf3fc3bb0459c5cccf14cf
Author: Paolo Bonzini <pbonzini@redhat.com>
Date: Thu May 13 09:03:48 2021 -0400
machine: add smp compound property
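The difference between the two checks can be modelled as follows.
ToyError and the function names are hypothetical stand-ins, not QEMU's
Error API. The `buggy` flag selects the check on the pointer-to-pointer
itself, which is true for every caller that passes an error slot.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct ToyError {
    const char *msg;
} ToyError;

static ToyError toy_invalid_cpus = { "Invalid SMP CPUs" };

/* Hypothetical parse step that never reports an error. */
static void toy_parse(int cpus, ToyError **errp)
{
    (void)cpus;
    (void)errp;
}

static void toy_set_smp(int cpus, int max_cpus, ToyError **errp, bool buggy)
{
    bool failed;

    assert(errp);   /* this sketch assumes callers always pass a slot */
    toy_parse(cpus, errp);

    /* Buggy: tests the slot's address. Fixed: tests the stored error. */
    failed = buggy ? (errp != NULL) : (*errp != NULL);
    if (failed) {
        return;     /* the buggy form always leaves here, skipping the check */
    }
    if (cpus < 0 || cpus > max_cpus) {
        *errp = &toy_invalid_cpus;
    }
}
```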
Closes: https://gitlab.com/qemu-project/qemu/-/issues/524
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20210812175353.4128471-1-berrange@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
If dies is not supported by this machine's CPU topology, don't
keep processing options and return directly.
Fixes: 0aebebb561c ("machine: reject -smp dies!=1 for non-PC machines")
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20210813112608.1452541-2-philmd@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Coverity found that 'uuid', 'csi' and 'eui64' are uninitialized. While
we set most of the fields, we do not explicitly set the rsvd2 field in
the NvmeIdNsDescr header.
Fix this by explicitly zero-initializing the variables.
Reported-by: Coverity (CID 1458835, 1459295 and 1459580)
Fixes: 6870cfb8140d ("hw/nvme: namespace parameter for EUI-64")
Suggested-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Since commit 9894dc0cdcc397ee5b26370bc53da6d360a363c2 "char: convert
from GIOChannel to QIOChannel", the first argument to the watch callback
can actually be a QIOChannel, which is not a GIOChannel (but a QEMU
Object).
Even though we never used that pointer, change the callback type to warn
the users. As a possibly better fix later, we may want to store the
callback and call it from intermediary functions.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
This reverts commit 0cf8882fd06ba0aeb1e90fa6f23fce85504d7e14.
With this commit, PCI devices with IO ports do not work on aarch64
when using EFI. The reason is that EFI creates I/O port mappings below
0x1000 (in fact, at 0). However Linux, for legacy reasons, does not
support I/O ports <= 0x1000 on PCI, so the I/O assignment created by EFI
is rejected.
EFI creates the mappings primarily for itself, and up until DSM #5
started to be enforced, all PCI resource allocations that existed at
boot were ignored by Linux and recreated from scratch.
Also, the commit in question looks dubious - it seems unlikely that
Linux would fail to create a resource tree. What does
happen is that BARs get moved around, which may cause trouble in some
cases: for instance, Linux had to add special code to the EFI framebuffer
driver to cope with framebuffer BARs being relocated.
DSM #5 has a long history of debate and misinterpretation.
Link: https://lore.kernel.org/r/20210724185234.GA2265457@roeck-us.net/
Fixes: 0cf8882fd06 ("acpi/gpex: Inform os to keep firmware resource map")
Reported-by: Guenter Roeck <linux@roeck-us.net>
Suggested-by: Ard Biesheuvel <ardb@kernel.org>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Commit [1] switched PCI hotplug from native to ACPI by default.
That, however, breaks hotplug with the following CLI, which used to work:
-nodefaults -machine q35 \
-device pcie-root-port,id=pcie-root-port-0,multifunction=on,bus=pcie.0,addr=0x1,chassis=1 \
-device pcie-root-port,id=pcie-root-port-1,port=0x1,addr=0x1.0x1,bus=pcie.0,chassis=2
where a PCI device is hotplugged to pcie-root-port-1, with this error on the guest side:
ACPI BIOS Error (bug): Could not resolve symbol [^S0B.PCNT], AE_NOT_FOUND (20201113/psargs-330)
ACPI Error: Aborting method \_SB.PCI0.PCNT due to previous error (AE_NOT_FOUND) (20201113/psparse-531)
ACPI Error: Aborting method \_GPE._E01 due to previous error (AE_NOT_FOUND) (20201113/psparse-531)
ACPI Error: AE_NOT_FOUND, while evaluating GPE method [_E01] (20201113/evgpe-515)
The cause is that QEMU's ACPI hotplug never supported functions other than 0,
and due to a bug it was generating notification entries for functions
that are not described.
Technically there is no reason not to describe cold-plugged bridges
(root ports) on functions other than 0, as they, similarly to a bridge
on function 0, are unpluggable.
So, since we need to describe multifunction devices, iterate over
functions as well, but describe only cold-plugged bridges (root ports)
on functions other than 0.
[1]
Fixes: 17858a169508609ca9063c544833e5a1adeb7b52 (hw/acpi/ich9: Set ACPI PCI hot-plug as default on Q35)
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reported-by: Laurent Vivier <lvivier@redhat.com>
Message-Id: <20210723090424.2092226-1-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Q35 now has ACPI hotplug enabled by default for PCI(e) devices.
As opposed to native PCIe hotplug, guests like Fedora 34
will not assign an IO range to pcie-root-ports not supporting
native hotplug, resulting in a regression.
Reproduce by:
qemu-bin -M q35 -device pcie-root-port,id=p1 -monitor stdio
device_add e1000,bus=p1
In the Guest OS the respective pcie-root-port will have the IO range
disabled.
Fix it by setting the "reserve-io" hint capability of the
pcie-root-ports so the firmware will allocate the IO range instead.
Acked-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
Message-Id: <20210802090057.1709775-1-marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
OSS-Fuzz found that sending illegal addresses when querying the write
protection bits triggers the assertion added in commit 84816fb63e5
("hw/sd/sdcard: Assert if accessing an illegal group"):
qemu-fuzz-i386-target-generic-fuzz-sdhci-v3: ../hw/sd/sd.c:824: uint32_t sd_wpbits(SDState *, uint64_t):
Assertion `wpnum < sd->wpgrps_size' failed.
#3 0x7f62a8b22c91 in __assert_fail
#4 0x5569adcec405 in sd_wpbits hw/sd/sd.c:824:9
#5 0x5569adce5f6d in sd_normal_command hw/sd/sd.c:1389:38
#6 0x5569adce3870 in sd_do_command hw/sd/sd.c:1737:17
#7 0x5569adcf1566 in sdbus_do_command hw/sd/core.c:100:16
#8 0x5569adcfc192 in sdhci_send_command hw/sd/sdhci.c:337:12
#9 0x5569adcfa3a3 in sdhci_write hw/sd/sdhci.c:1186:9
#10 0x5569adfb3447 in memory_region_write_accessor softmmu/memory.c:492:5
It is legal for the CMD30 to query for out-of-range addresses.
Such invalid addresses are simply ignored in the response (write
protection bits set to 0).
In commit 84816fb63e5 ("hw/sd/sdcard: Assert if accessing an illegal
group") we misplaced the assertion *before* we test the address is
in range. Move it *after*.
Include the qtest reproducer provided by Alexander Bulekov:
$ make check-qtest-i386
...
Running test qtest-i386/fuzz-sdcard-test
qemu-system-i386: ../hw/sd/sd.c:824: sd_wpbits: Assertion `wpnum < sd->wpgrps_size' failed.
Cc: qemu-stable@nongnu.org
Reported-by: OSS-Fuzz (Issue 29225)
Suggested-by: Peter Maydell <peter.maydell@linaro.org>
Fixes: 84816fb63e5 ("hw/sd/sdcard: Assert if accessing an illegal group")
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/495
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20210802235524.3417739-3-f4bug@amsat.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Tested-by: Alexander Bulekov <alxndr@bu.edu>
Per the 'Physical Layer Simplified Specification Version 3.01',
Table 4-22: 'Block Oriented Write Protection Commands'
SEND_WRITE_PROT (CMD30)
If the card provides write protection features, this command asks
the card to send the status of the write protection bits [1].
[1] 32 write protection bits (representing 32 write protect groups
starting at the specified address) [...]
The last (least significant) bit of the protection bits corresponds
to the first addressed group. If the addresses of the last groups
are outside the valid range, then the corresponding write protection
bits shall be set to 0.
Split the if() statement (without changing the behaviour of the code)
to better position the description comment.
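The rule quoted from the specification can be modelled as below. This
sketch works on group numbers rather than byte addresses, and the group
count and all names are hypothetical; it only shows that out-of-range
groups yield 0 bits and that any assertion belongs after the range
test.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_WPGRPS 40                     /* hypothetical number of groups */
static uint8_t toy_wp_group[TOY_WPGRPS];  /* 1 = group is write protected */

/*
 * Return 32 protection bits for the groups starting at first_group.
 * Bit 0 corresponds to first_group; out-of-range groups read as 0.
 */
static uint32_t toy_wpbits(uint32_t first_group)
{
    uint32_t ret = 0;

    for (uint32_t i = 0; i < 32; i++) {
        /* 64-bit sum so that a huge first_group cannot wrap around. */
        uint64_t wpnum = (uint64_t)first_group + i;

        if (wpnum >= TOY_WPGRPS) {
            continue;                     /* out of range: bit stays 0 */
        }
        assert(wpnum < TOY_WPGRPS);       /* asserted only after the test */
        if (toy_wp_group[wpnum]) {
            ret |= UINT32_C(1) << i;
        }
    }
    return ret;
}
```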
Reviewed-by: Alexander Bulekov <alxndr@bu.edu>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20210802235524.3417739-2-f4bug@amsat.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Tested-by: Alexander Bulekov <alxndr@bu.edu>
If the user provides both a BIOS/firmware image and also a guest
kernel filename, arm_setup_firmware_boot() will pass the
kernel image to the firmware via the fw_cfg device. However we
weren't checking whether there really was a fw_cfg device present,
and if there wasn't we would crash.
This crash can be provoked with a command line such as
qemu-system-aarch64 -M raspi3 -kernel /dev/null -bios /dev/null -display none
It is currently only possible on the raspi3 machine, because unless
the machine sets info->firmware_loaded we won't call
arm_setup_firmware_boot(), and the only machines which set that are:
* virt (has a fw-cfg device)
* sbsa-ref (checks itself for kernel_filename && firmware_loaded)
* raspi3 (crashes)
But this is an unfortunate beartrap to leave for future machine
model implementors, so we should handle this situation in boot.c.
Check in arm_setup_firmware_boot() whether the fw-cfg device exists
before trying to load files into it, and if it doesn't exist then
exit with a hopefully helpful error message.
Because we now handle this check in a machine-agnostic way, we
can remove the check from sbsa-ref.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/503
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20210726163351.32086-1-peter.maydell@linaro.org
In the legacy RX descriptor mode, the VLAN tag was saved to d->special
by e1000e_build_rx_metadata() in e1000e_write_lgcy_rx_descr(), but
it was then zeroed out again at the end of the call, which is wrong.
Fixes: c89d416a2b0f ("e1000e: Don't zero out buffer address in rx descriptor")
Reported-by: Markus Carlstedt <markus.carlstedt@windriver.com>
Signed-off-by: Christina Wang <christina.wang@windriver.com>
Signed-off-by: Bin Meng <bin.meng@windriver.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
The initial value of the VLAN Ether Type (VET) register is 0x8100, as per
the manual and real hardware.
While the Linux e1000e driver always writes 0x8100 to the VET register,
not every driver does. Drivers relying on the reset value
of VET won't be able to transmit and receive VLAN frames in QEMU.
Unlike e1000 in QEMU, e1000e uses a field 'vet' in "struct E1000Core"
to cache the value of the VET register, but the cache only gets updated
when the VET register is written. To always get a consistent VET value,
regardless of whether VET has been written or still holds its reset
value, drop the 'vet' field and use 'core->mac[VET]' directly.
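
A minimal sketch of the idea, assuming a much reduced register file:
struct core_stub, the VET index value and the helper names below are
hypothetical and only model the behaviour described above. Because the
VLAN check reads the register array directly, the reset value is honoured
even when the guest never writes VET.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical register index and register file size for this sketch. */
enum { VET = 0, MAC_SIZE = 1 };

struct core_stub {
    uint32_t mac[MAC_SIZE];   /* register file; no separate cached 'vet' */
};

void core_reset(struct core_stub *core)
{
    core->mac[VET] = 0x8100;  /* reset value per the manual */
}

void core_write_vet(struct core_stub *core, uint32_t val)
{
    core->mac[VET] = val & 0xffff;
}

bool is_vlan_ethertype(const struct core_stub *core, uint16_t ethertype)
{
    /* Read the register itself, so reset and written values both work. */
    return ethertype == (uint16_t)core->mac[VET];
}
```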
Reported-by: Markus Carlstedt <markus.carlstedt@windriver.com>
Signed-off-by: Christina Wang <christina.wang@windriver.com>
Signed-off-by: Bin Meng <bin.meng@windriver.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
The initial value of the VLAN Ether Type (VET) register is 0x8100, as per
the manual and real hardware.
While the Linux e1000 driver always writes 0x8100 to the VET register,
not every driver does. Drivers relying on the reset value
of VET won't be able to transmit and receive VLAN frames in QEMU.
Reported-by: Markus Carlstedt <markus.carlstedt@windriver.com>
Signed-off-by: Christina Wang <christina.wang@windriver.com>
Signed-off-by: Bin Meng <bin.meng@windriver.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Problem reported by the openEuler fuzz-sig group.
The buff2frame_bas function (hw/net/can/can_sja1000.c) can cause an
information leak (QEMU 5.x to 6.x) or a stack overflow (QEMU 4.x).
Reported-by: Qiang Ning <ningqiang1@huawei.com>
Cc: qemu-stable@nongnu.org
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Pavel Pisa <pisa@cmp.felk.cvut.cz>
Signed-off-by: Jason Wang <jasowang@redhat.com>
QEMU should never terminate unexpectedly just because the guest is
doing something wrong like specifying wrong queue numbers. Let's
simply refuse to set the device active in this case.
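
The approach can be sketched like this, assuming hypothetical limits and
a hypothetical struct nic_state; none of these names are taken from the
device model itself. Invalid guest input leaves the device inactive
instead of tripping an assertion.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical limits for this sketch. */
#define MAX_TX_QUEUES 8
#define MAX_RX_QUEUES 8

struct nic_state {
    uint32_t txq_num;
    uint32_t rxq_num;
    bool device_active;
};

static bool queue_count_ok(uint32_t n, uint32_t max)
{
    return n >= 1 && n <= max;
}

/* Returns false and leaves the device inactive on bad guest input. */
bool activate_device(struct nic_state *s, uint32_t txq, uint32_t rxq)
{
    if (!queue_count_ok(txq, MAX_TX_QUEUES) ||
        !queue_count_ok(rxq, MAX_RX_QUEUES)) {
        s->device_active = false;
        return false;
    }
    s->txq_num = txq;
    s->rxq_num = rxq;
    s->device_active = true;
    return true;
}
```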
Buglink: https://bugs.launchpad.net/qemu/+bug/1890160
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
data might point into the middle of a larger buffer, so a separate
free_on_destroy pointer is passed into bufp_alloc() to handle that. It is
only used in the normal workflow though, not when dropping packets due
to the queue being full. Fix that.
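
A minimal sketch of the ownership rule, using a hypothetical fixed-size
queue: struct packet, struct queue, queue_push() and QUEUE_LIMIT are
illustrative names, and treating a NULL free_on_destroy as "data owns the
allocation" is an assumption of this sketch. The point is that the drop
path must free the same pointer the normal destroy path would.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define QUEUE_LIMIT 2   /* hypothetical queue size */

struct packet {
    uint8_t *data;          /* may point into the middle of a buffer */
    size_t len;
    void *free_on_destroy;  /* start of the allocation, or NULL */
};

struct queue {
    struct packet slots[QUEUE_LIMIT];
    size_t count;
};

/* The pointer that must be passed to free() for this packet. */
static void *owning_pointer(uint8_t *data, void *free_on_destroy)
{
    return free_on_destroy ? free_on_destroy : data;
}

bool queue_push(struct queue *q, uint8_t *data, size_t len,
                void *free_on_destroy)
{
    if (q->count >= QUEUE_LIMIT) {
        /* Dropping: free the allocation, not the interior pointer. */
        free(owning_pointer(data, free_on_destroy));
        return false;
    }
    q->slots[q->count++] = (struct packet){ data, len, free_on_destroy };
    return true;
}

void queue_destroy(struct queue *q)
{
    for (size_t i = 0; i < q->count; i++) {
        free(owning_pointer(q->slots[i].data, q->slots[i].free_on_destroy));
    }
    q->count = 0;
}
```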
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/491
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20210722072756.647673-1-kraxel@redhat.com>
On Windows we can't wait on file descriptors.
Poll libusb using a timer instead.
Fixes a long-standing FIXME.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/431
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-Id: <20210623085249.1151901-2-kraxel@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Coverity reported issues which are caused by mixing of signed return codes
from DTC and unsigned return codes of the client interface.
This introduces PROM_ERROR and makes a distinction between the error types.
This fixes NEGATIVE_RETURNS, OVERRUN issues reported by Coverity.
This adds a comment about the return parameters number in the VOF hcall.
The reason for such counting is to keep the numbers looking the same in
vof_client_handle() and in Linux (an OF client).
vmc->client_architecture_support() returns target_ulong and we want to
propagate this to the client (for example H_MULTI_THREADS_ACTIVE).
The VOF path to do_client_architecture_support() needs the top 32 bits
chopped off but SLOF's H_CAS does not; either way the return values
are either 0 or a 32-bit negative error code. For now this chops
the top 32 bits.
This makes "claim" fail if the allocated address is above 4GB, as
the client interface is 32-bit. This still allows claiming memory above
4GB as potentially initrd can be put there and the client can read
the address from the FDT's "available" property.
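
The signed/unsigned distinction can be sketched as follows. The value
chosen for PROM_ERROR here (all ones in 32 bits) and the helper names are
assumptions of this sketch, not a copy of the VOF code: negative signed
return codes are mapped to one unsigned error value, and a 64-bit hcall
return is truncated to the 32 bits the client interface can carry.

```c
#include <stdint.h>

/* Assumed value for this sketch: all ones in the 32-bit client word. */
#define PROM_ERROR UINT32_MAX

/* Map a signed library-style return code to a client interface value. */
uint32_t prom_result(int signed_ret)
{
    return signed_ret < 0 ? PROM_ERROR : (uint32_t)signed_ret;
}

/* Keep only the low 32 bits of a 64-bit hypercall return value. */
uint32_t chop_top_32_bits(uint64_t hcall_ret)
{
    return (uint32_t)(hcall_ret & UINT32_MAX);
}
```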
Fixes: CID 1458139, 1458138, 1458137, 1458133, 1458132
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Message-Id: <20210720050726.2737405-1-aik@ozlabs.ru>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Add the differential clock input feature bit to the generated SPD
data. Most guests don't seem to care, but pegasos2 firmware version 1.2
checks for this bit and stops with an unsupported module type error if
it's not present. Since this feature is likely present on real memory
modules, add it in the general code rather than patching the generated
SPD data in the pegasos2 board only.
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Message-Id: <19d42ade295d5297aa624a9eb757b8df18cf64d6.1626367844.git.balaton@eik.bme.hu>
Acked-by: Corey Minyard <cminyard@mvista.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The -append option is currently not compatible with -bios (as we don't
yet emulate nvram, we can only put it in the environment with VOF).
Therefore a warning is printed if -append is used with -bios. However,
because the default value of kernel_cmdline seems to be an empty
string instead of NULL, this warning was printed even without -append
when -bios was used. Only print the warning if -append is given.
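
The condition can be sketched as below. The function name and its
parameters are hypothetical; the point is only that an empty string must
be treated the same as NULL when deciding whether -append was given.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper: warn only when a BIOS image is in use and the
 * command line is both non-NULL and non-empty.
 */
bool should_warn_about_append(const char *bios_name,
                              const char *kernel_cmdline)
{
    return bios_name != NULL &&
           kernel_cmdline != NULL &&
           kernel_cmdline[0] != '\0';
}
```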
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Message-Id: <483ac599a1407b766179aaea2794aed60cc09f53.1626367844.git.balaton@eik.bme.hu>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>