The function qemu_madvise() and the QEMU_MADV_* constants associated
with it are used in only 10 files. Move them out of osdep.h to a new
qemu/madvise.h header that is included where it is needed.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20220208200856.3558249-2-peter.maydell@linaro.org
Let's sense support and use it for preallocation. MADV_POPULATE_WRITE
does not require a SIGBUS handler, doesn't actually touch page content,
and avoids context switches; it is, therefore, faster and easier to handle
than our current approach.
While MADV_POPULATE_WRITE is, in general, faster than manual
prefaulting, and especially faster with 4k pages, there is still value in
prefaulting using multiple threads to speed up preallocation.
More details on MADV_POPULATE_WRITE can be found in the Linux commits
4ca9b3859dac ("mm/madvise: introduce MADV_POPULATE_(READ|WRITE) to prefault
page tables") and eb2faa513c24 ("mm/madvise: report SIGBUS as -EFAULT for
MADV_POPULATE_(READ|WRITE)"), and in the man page proposal [1].
This resolves the TODO in do_touch_pages().
In the future, we might want to look into using fallocate(), eventually
combined with MADV_POPULATE_READ, when dealing with shared file/fd
mappings and not caring about memory bindings.
[1] https://lkml.kernel.org/r/20210816081922.5155-1-david@redhat.com
Reviewed-by: Pankaj Gupta <pankaj.gupta@ionos.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20211217134611.31172-3-david@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
While config-host.mak entries are expanded to "1" for compatibility with
create-config.sh, tests done directly in meson.build expand to the empty
string and cannot be placed to the right of the && operator. Adjust
osdep.h after commit e46bd55d9c ("configure: convert HAVE_BROKEN_SIZE_MAX
to meson", 2021-07-06) changed the way HAVE_BROKEN_SIZE_MAX is defined.
Reported-by: Frederic Bezies <fredbezies@gmail.com>
Fixes: e46bd55d9c ("configure: convert HAVE_BROKEN_SIZE_MAX to meson", 2021-07-06)
Resolves: #463
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Introduce a convenient macro, that works for qemu_memalign() like
g_autofree works with g_malloc.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20210628121133.193984-2-vsementsov@virtuozzo.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
osdep.h provides a ROUND_UP macro to hide bitwise operations for the
purpose of rounding a number up to a power of two; add a ROUND_DOWN
macro that does the same with truncation towards zero.
While at it, change the formatting of some comments.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Let's support RAM_NORESERVE via MAP_NORESERVE on Linux. The flag has no
effect on most shared mappings - except for hugetlbfs and anonymous memory.
Linux man page:
"MAP_NORESERVE: Do not reserve swap space for this mapping. When swap
space is reserved, one has the guarantee that it is possible to modify
the mapping. When swap space is not reserved one might get SIGSEGV
upon a write if no physical memory is available. See also the discussion
of the file /proc/sys/vm/overcommit_memory in proc(5). In kernels before
2.6, this flag had effect only for private writable mappings."
Note that the "guarantee" part is wrong with memory overcommit in Linux.
Also, in Linux hugetlbfs is treated differently - we configure reservation
of huge pages from the pool, not reservation of swap space (huge pages
cannot be swapped).
The rough behavior is [1]:
a) !Hugetlbfs:
1) Without MAP_NORESERVE *or* with memory overcommit under Linux
disabled ("/proc/sys/vm/overcommit_memory == 2"), the following
accounting/reservation happens:
For a file backed map
SHARED or READ-only - 0 cost (the file is the map not swap)
PRIVATE WRITABLE - size of mapping per instance
For an anonymous or /dev/zero map
SHARED - size of mapping
PRIVATE READ-only - 0 cost (but of little use)
PRIVATE WRITABLE - size of mapping per instance
2) With MAP_NORESERVE, no accounting/reservation happens.
b) Hugetlbfs:
1) Without MAP_NORESERVE, huge pages are reserved.
2) With MAP_NORESERVE, no huge pages are reserved.
Note: With "/proc/sys/vm/overcommit_memory == 0", we were already able
to configure it for !hugetlbfs globally; this toggle now allows
configuring it more fine-grained, not for the whole system.
The target use case is virtio-mem, which dynamically exposes memory
inside a large, sparse memory area to the VM.
[1] https://www.kernel.org/doc/Documentation/vm/overcommit-accounting
Reviewed-by: Peter Xu <peterx@redhat.com>
Acked-by: Eduardo Habkost <ehabkost@redhat.com> for memory backend and machine core
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210510114328.21835-10-david@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Let's introduce RAM_NORESERVE, allowing mmap'ing with MAP_NORESERVE. The
new flag has the following semantics:
"
RAM is mmap-ed with MAP_NORESERVE. When set, reserving swap space (or huge
pages if applicable) is skipped: will bail out if not supported. When not
set, the OS will do the reservation, if supported for the memory type.
"
Allow passing it into:
- memory_region_init_ram_nomigrate()
- memory_region_init_resizeable_ram()
- memory_region_init_ram_from_file()
... and teach qemu_ram_mmap() and qemu_anon_ram_alloc() about the flag.
Bail out if the flag is not supported, which is the case right now for
both, POSIX and win32. We will add Linux support next and allow specifying
RAM_NORESERVE via memory backends.
The target use case is virtio-mem, which dynamically exposes memory
inside a large, sparse memory area to the VM.
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Acked-by: Eduardo Habkost <ehabkost@redhat.com> for memory backend and machine core
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210510114328.21835-9-david@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Let's pass flags instead of bools to prepare for passing other flags and
update the documentation of qemu_ram_mmap(). Introduce new QEMU_MAP_
flags that abstract the mmap() PROT_ and MAP_ flag handling and simplify
it.
We expose only flags that are currently supported by qemu_ram_mmap().
Maybe, we'll see qemu_mmap() in the future as well that can implement these
flags.
Note: We don't use MAP_ flags as some flags (e.g., MAP_SYNC) are only
defined for some systems and we want to always be able to identify
these flags reliably inside qemu_ram_mmap() -- for example, to properly
warn when some future flags are not available or effective on a system.
Also, this way we can simplify PROT_ handling as well.
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Acked-by: Eduardo Habkost <ehabkost@redhat.com> for memory backend and machine core
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210510114328.21835-8-david@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
We can create shared anonymous memory via
"-object memory-backend-ram,share=on,..."
which is, for example, required by PVRDMA for mremap() to work.
Shared anonymous memory is weird, though. Instead of MADV_DONTNEED, we
have to use MADV_REMOVE: MADV_DONTNEED will only remove / zap all
relevant page table entries of the current process, the backend storage
will not get removed, resulting in no reduced memory consumption and
a repopulation of previous content on next access.
Shared anonymous memory is internally really just shmem, but without a
fd exposed. As we cannot use fallocate() without the fd to discard the
backing storage, MADV_REMOVE gets the same job done without a fd as
documented in "man 2 madvise". Removing backing storage implicitly
invalidates all page table entries with relevant mappings - an additional
MADV_DONTNEED is not required.
Fixes: 06329ccecfa0 ("mem: add share parameter to memory-backend-ram")
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210406080126.24010-3-david@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
For --enable-tcg-interpreter on Windows, we will need this.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Luis Pires <luis.pires@eldorado.org.br>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Both os-win32.h and os-posix.h include system header files. Instead
of having osdep.h include them inside its 'extern "C"' block, make
these headers handle that themselves, so that we don't include the
system headers inside 'extern "C"'.
This doesn't fix any current problems, but it's conceptually the
right way to handle system headers.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Mostly osdep.h puts the system includes at the top of the file; but
there are a couple of exceptions where we include a system header
halfway through the file. Move these up to the top with the rest
so that all the system headers we include are included before
we include os-win32.h or os-posix.h.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210416135543.20382-4-peter.maydell@linaro.org
Message-id: 20210414184343.26235-1-peter.maydell@linaro.org
System headers may include templates if compiled with a C++ compiler,
which cause the compiler to complain if qemu/osdep.h is included
within a C++ source file's 'extern "C"' block. Add
an 'extern "C"' block directly to qemu/osdep.h, so that
system headers can be kept out of it.
There is a stray declaration early in qemu/osdep.h, which needs
to be special cased. Add a definition in qemu/compiler.h to
make it look nice.
config-host.h, CONFIG_TARGET, exec/poison.h and qemu/compiler.h
are included outside the 'extern "C"' block; that is not
an issue because they consist entirely of preprocessor directives.
This allows us to move the include of osdep.h in our two C++
source files outside the extern "C" block they were previously
using for it, which in turn means that they compile successfully
against newer versions of glib which insist that glib.h is
*not* inside an extern "C" block.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210416135543.20382-3-peter.maydell@linaro.org
[PMM: Moved disas/arm-a64.cc osdep.h include out of its extern "C" block;
explained in commit message why we're doing this]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
glib-compat.h is sort of like a system header, and it needs to include
system headers (glib.h) that may dislike being included under
'extern "C"'. Move it right after all system headers and before
all other QEMU headers.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210416135543.20382-2-peter.maydell@linaro.org
[PMM: Added comment about why glib-compat.h is special]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Build without error on hosts without a working system(). If system()
is called, return -1 with ENOSYS.
Signed-off-by: Joelle van Dyne <j@getutm.app>
Message-id: 20210126012457.39046-6-j@getutm.app
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Pages can't be both write and executable at the same time on Apple
Silicon. macOS provides public API to switch write protection [1] for
JIT applications, like TCG.
1. https://developer.apple.com/documentation/apple_silicon/porting_just-in-time_compilers_to_apple_silicon
Tested-by: Alexander Graf <agraf@csgraf.de>
Signed-off-by: Roman Bolshakov <r.bolshakov@yadro.com>
Message-Id: <20210113032806.18220-1-r.bolshakov@yadro.com>
[rth: Inline the qemu_thread_jit_* functions;
drop the MAP_JIT change for a follow-on patch.]
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Prior to 2a4b472c3c, sys/signal.h was only included on OpenBSD
(apart from two .c files). The POSIX standard location for this
header is just <signal.h> and in fact, OpenBSD's signal.h includes
sys/signal.h itself.
Unconditionally including <sys/signal.h> on musl causes warnings
for just about every source file:
/usr/include/sys/signal.h:1:2: warning: #warning redirecting incorrect #include <sys/signal.h> to <signal.h> [-Wcpp]
1 | #warning redirecting incorrect #include <sys/signal.h> to <signal.h>
| ^~~~~~~
Since there don't seem to be any platforms which require including
<sys/signal.h> in addition to <signal.h>, and some platforms like
Haiku lack it completely, just remove it.
Tested building on OpenBSD after removing this include.
Signed-off-by: Michael Forney <mforney@mforney.org>
Tested-by: John Snow <jsnow@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20210113215600.16100-1-mforney@mforney.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Just return the directory without requiring the caller to free it.
This also removes a bogus check for NULL in os_find_datadir and
module_load_one; g_strdup of a static variable cannot return NULL.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* Some minor qtest improvements
* Fix the unit tests to work on MSYS2, too
* Enable building and testing on MSYS2 in the Cirrus-CI
* Build FreeBSD with one task again in the Cirrus-CI
-----BEGIN PGP SIGNATURE-----
iQJFBAABCAAvFiEEJ7iIR+7gJQEY8+q5LtnXdP5wLbUFAl9h9e0RHHRodXRoQHJl
ZGhhdC5jb20ACgkQLtnXdP5wLbXXMA/6AvNOOEgYeW+YrkIjMgh+jjgrmBK5FH0J
REJiJ5CxQBh9v3gPV5ehWv4/R9pmaEPtbsZ4Bc1jmRwLHcAWIJ/JTYo11M4vTYa3
IjS9+dlqgznzxZHFavwJ8USjcyeVjkqyaUTE7CNPgzE2b0237oQ8MHzFGlsHwGZV
AiRhDHI0StCE3QeKICnpB91Us+KF/+UjZnCwSaC/SM8Sq+6LnTF0bEYYUH44SfZe
AX3ax9kxzWFtzpXXh/3qL0gdGwiVqwv35V7MYpQWZJAPA3TdxVnUDE7/XP1RTOjL
hhJLf6IqgPwbRWLszmYmTiUCDGE8kqO8wj5MkKlJcjLY9n4zv0ErOjy6Nhnr8b5Q
TA9hjRfkRkUoquVRm7ZBOE9l2jIkWV9olxYFqBipqBMujSlt9T0seUi+eaY6NuAA
Z8NOQslqi8xP7wN4Lw3DpGOfbeTvtOlDtA7O7HwwTChTlhCJX7FCoNmpqhCiFRpH
s7VkNCXoc6l8NDI+Py5sjpRRHMQIsFWUCnZLWJQ+UJWZvfnNoLTM3ErdqzIasVLt
vW/behHRd7L/hGMa7zNtQa+wv2bgXY/hbFFpNK6RUEaPBzUq3ZixFrMW2Fw6X7mg
eIVPNrh/LloiJGQfpUuNkqiZ4vdgUeBq7Z89TCU49xskQAgHb0KglnveU42nP8Yf
pO8OCBOjfJg=
=ErBp
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/huth-gitlab/tags/pull-request-2020-09-16' into staging
* Fix "readlink -f" problem in iotests on macOS (to fix the Cirrus-CI tests)
* Some minor qtest improvements
* Fix the unit tests to work on MSYS2, too
* Enable building and testing on MSYS2 in the Cirrus-CI
* Build FreeBSD with one task again in the Cirrus-CI
# gpg: Signature made Wed 16 Sep 2020 12:24:29 BST
# gpg: using RSA key 27B88847EEE0250118F3EAB92ED9D774FE702DB5
# gpg: issuer "thuth@redhat.com"
# gpg: Good signature from "Thomas Huth <th.huth@gmx.de>" [full]
# gpg: aka "Thomas Huth <thuth@redhat.com>" [full]
# gpg: aka "Thomas Huth <huth@tuxfamily.org>" [full]
# gpg: aka "Thomas Huth <th.huth@posteo.de>" [unknown]
# Primary key fingerprint: 27B8 8847 EEE0 2501 18F3 EAB9 2ED9 D774 FE70 2DB5
* remotes/huth-gitlab/tags/pull-request-2020-09-16: (24 commits)
cirrus: Building freebsd in a single shot
ci: Enable msys2 ci in cirrus
tests: Fixes test-qdev-global-props.c
tests: fix test-util-sockets.c
tests: Fixes test-io-channel-file by mask only owner file state mask bits
tests: fixes aio-win32 about aio_remove_fd_handler, get it consistence with aio-posix.c
tests: Fixes test-io-channel-socket.c tests under msys2/mingw
vmstate: Fixes test-vmstate.c on msys2/mingw
meson: remove empty else and duplicated gio deps
meson: Use -b to ignore CR vs. CR-LF issues on Windows
osdep: file locking functions are not available on Win32
tests: test-replication disable /replication/secondary/* on msys2/mingw.
tests: Fixes test-replication.c on msys2/mingw.
meson: disable crypto tests are empty under win32
meson: Disable test-char on msys2/mingw for fixing tests stuck
rcu: fixes test-logging.c by call drain_call_rcu before rmdir_full
tests: Convert g_free to g_autofree macro in test-logging.c
rcu: Implement drain_call_rcu
qga/commands-win32: Fix problem with redundant protype declaration
Simplify the .gitignore file
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
qemu_open_old() works like open(): set errno and return -1 on failure.
It has even more failure modes, though. Reporting the error clearly
to users is basically impossible for many of them.
Our standard cure for "errno is too coarse" is the Error object.
Introduce two new helper methods:
int qemu_open(const char *name, int flags, Error **errp);
int qemu_create(const char *name, int flags, mode_t mode, Error **errp);
Note that with this design we no longer require or even accept the
O_CREAT flag. Avoiding overloading the two distinct operations
means we can avoid variable arguments which would prevent 'errp' from
being the last argument. It also gives us a guarantee that the 'mode' is
given when creating files, avoiding a latent security bug.
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
We want to introduce a new version of qemu_open() that uses an Error
object for reporting problems and make this it the preferred interface.
Rename the existing method to release the namespace for the new impl.
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Currently code has to call monitor_fdset_get_fd, then dup
the return fd, and then add the duplicate FD back into the
fdset. This dance is overly verbose for the caller and
introduces extra failure modes which can be avoided by
folding all the logic into monitor_fdset_dup_fd_add and
removing monitor_fdset_get_fd entirely.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Do not declare the following locking functions on Win32:
int qemu_lock_fd(int fd, int64_t start, int64_t len, bool exclusive);
int qemu_unlock_fd(int fd, int64_t start, int64_t len);
int qemu_lock_fd_test(int fd, int64_t start, int64_t len, bool exclusive);
bool qemu_has_ofd_lock(void);
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20200915121318.247-10-luoyonggang@gmail.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Plain MAP_FIXED has the undesirable behaviour of splatting exiting
maps so we don't actually achieve what we want when looking for gaps.
We should be using MAP_FIXED_NOREPLACE. As this isn't always available
we need to potentially check the returned address to see if the kernel
gave us what we asked for.
Fixes: ad592e37dfc ("linux-user: provide fallback pgd_find_hole for bare chroots")
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20200724064509.331-9-alex.bennee@linaro.org>
This will be used in a future patch. For POSIX systems _SC_PHYS_PAGES
isn't standardised but at least appears in the man pages for
Open/FreeBSD. The result is advisory so any users of it shouldn't just
fail if we can't work it out.
The win32 stub currently returns 0 until someone with a Windows system
can develop and test a patch.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Cc: BALATON Zoltan <balaton@eik.bme.hu>
Cc: Christian Ehrhardt <christian.ehrhardt@canonical.com>
Message-Id: <20200724064509.331-5-alex.bennee@linaro.org>
This comment is confuse, reword it a bit.
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Michael Rolnik <mrolnik@gmail.com>
Tested-by: Michael Rolnik <mrolnik@gmail.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20200714164257.23330-3-f4bug@amsat.org>
This function offers operating system agnostic way to fetch host
name. It is implemented for both POSIX-like and Windows systems.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Haiku doesn't provide SIGIO; fix this up in osdep.h by defining it as
equal to SIGPOLL.
Signed-off-by: David Carlier <devnexen@gmail.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20200703145614.16684-6-peter.maydell@linaro.org
[PMM: Expanded commit message]
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Regularize our handling of <sys/signal.h>: currently we include it in
osdep.h, but only for OpenBSD, and we include it without an ifdef
guard in a couple of C files. This causes problems for Haiku, which
doesn't have that header.
Instead, check in configure whether sys/signal.h exists, and if it
does then always include it from osdep.h.
Signed-off-by: David Carlier <devnexen@gmail.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20200703145614.16684-5-peter.maydell@linaro.org
[PMM: Expanded commit message; rename to HAVE_SYS_SIGNAL_H]
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Coverity has problems seeing through __builtin_choose_expr, which
result in it abandoning analysis of later functions that utilize a
definition that used MIN_CONST or MAX_CONST, such as in qemu-file.c:
50 DECLARE_BITMAP(may_free, MAX_IOV_SIZE);
CID 1429992 (#1 of 1): Unrecoverable parse warning (PARSE_ERROR)1.
expr_not_constant: expression must have a constant value
As has been done in the past (see 07d66672), it's okay to dumb things
down when compiling for static analyzers. (Of course, now the
syntax-checker has a false positive on our reference to
__COVERITY__...)
Reported-by: Peter Maydell <peter.maydell@linaro.org>
Fixes: CID 1429992, CID 1429995, CID 1429997, CID 1429999
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20200629162804.1096180-1-eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
I'm not aware of any immediate bugs in qemu where a second runtime
evaluation of the arguments to MIN() or MAX() causes a problem, but
proactively preventing such abuse is easier than falling prey to an
unintended case down the road. At any rate, here's the conversation
that sparked the current patch:
https://lists.gnu.org/archive/html/qemu-devel/2018-12/msg05718.html
Update the MIN/MAX macros to only evaluate their argument once at
runtime; this uses typeof(1 ? (a) : (b)) to ensure that we are
promoting the temporaries to the same type as the final comparison (we
have to trigger type promotion, as typeof(bitfield) won't compile; and
we can't use typeof((a) + (b)) or even typeof((a) + 0), as some of our
uses of MAX are on void* pointers where such addition is undefined).
However, we are unable to work around gcc refusing to compile ({}) in
a constant context (such as the array length of a static variable),
even when only used in the dead branch of a __builtin_choose_expr(),
so we have to provide a second macro pair MIN_CONST and MAX_CONST for
use when both arguments are known to be compile-time constants and
where the result must also be usable as a constant; this second form
evaluates arguments multiple times but that doesn't matter for
constants. By using a void expression as the expansion if a
non-constant is presented to this second form, we can enlist the
compiler to ensure the double evaluation is not attempted on
non-constants.
Alas, as both macros now rely on compiler intrinsics, they are no
longer usable in preprocessor #if conditions; those will just have to
be open-coded or the logic rewritten into #define or runtime 'if'
conditions (but where the compiler dead-code-elimination will probably
still apply).
I tested that both gcc 10.1.1 and clang 10.0.0 produce errors for all
forms of macro mis-use. As the errors can sometimes be cryptic, I'm
demonstrating the gcc output:
Use of MIN when MIN_CONST is needed:
In file included from /home/eblake/qemu/qemu-img.c:25:
/home/eblake/qemu/include/qemu/osdep.h:249:5: error: braced-group within expression allowed only inside a function
249 | ({ \
| ^
/home/eblake/qemu/qemu-img.c:92:12: note: in expansion of macro ‘MIN’
92 | char array[MIN(1, 2)] = "";
| ^~~
Use of MIN_CONST when MIN is needed:
/home/eblake/qemu/qemu-img.c: In function ‘is_allocated_sectors’:
/home/eblake/qemu/qemu-img.c:1225:15: error: void value not ignored as it ought to be
1225 | i = MIN_CONST(i, n);
| ^
Use of MIN in the preprocessor:
In file included from /home/eblake/qemu/accel/tcg/translate-all.c:20:
/home/eblake/qemu/accel/tcg/translate-all.c: In function ‘page_check_range’:
/home/eblake/qemu/include/qemu/osdep.h:249:6: error: token "{" is not valid in preprocessor expressions
249 | ({ \
| ^
Fix the resulting callsites that used #if or computed a compile-time
constant min or max to use the new macros. cpu-defs.h is interesting,
as CPU_TLB_DYN_MAX_BITS is sometimes used as a constant and sometimes
dynamic.
It may be worth improving glib's MIN/MAX definitions to be saner, but
that is a task for another day.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20200625162602.700741-1-eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
macOS API for dealing with serial ports/ttys is identical to BSDs.
Signed-off-by: Mikhail Gusarov <dottedmag@dottedmag.net>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20200426210956.17324-1-dottedmag@dottedmag.net>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
In commit a1a98357e3fd in 2018 we added some workarounds for Coverity
not being able to handle the _Float* types introduced by recent
glibc. Newer versions of the Coverity scan tools have support for
these types, and will fail with errors about duplicate typedefs if we
have our workaround. Remove our copy of the typedefs.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200319193323.2038-2-peter.maydell@linaro.org
Add a helper function to match qemu_open() which may return files
under the /dev/fdset prefix. Those shouldn't be removed, since it's
only a qemu namespace.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
There are three page size in qemu:
real host page size
host page size
target page size
All of them have dedicate variable to represent. For the last two, we
use the same form in the whole qemu project, while for the first one we
use two forms: qemu_real_host_page_size and getpagesize().
qemu_real_host_page_size is defined to be a replacement of
getpagesize(), so let it serve the role.
[Note] Not fully tested for some arch or device.
Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
Message-Id: <20191013021145.16011-3-richardw.yang@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Neither stat(2) nor lseek(2) report the size of Linux devdax pmem
character device nodes. Commit 314aec4a6e06844937f1677f6cba21981005f389
("hostmem-file: reject invalid pmem file sizes") added code to
hostmem-file.c to fetch the size from sysfs and compare against the
user-provided size=NUM parameter:
if (backend->size > size) {
error_setg(errp, "size property %" PRIu64 " is larger than "
"pmem file \"%s\" size %" PRIu64, backend->size,
fb->mem_path, size);
return;
}
It turns out that exec.c:qemu_ram_alloc_from_fd() already has an
equivalent size check but it skips devdax pmem character devices because
lseek(2) returns 0:
if (file_size > 0 && file_size < size) {
error_setg(errp, "backing store %s size 0x%" PRIx64
" does not match 'size' option 0x" RAM_ADDR_FMT,
mem_path, file_size, size);
return NULL;
}
This patch moves the devdax pmem file size code into get_file_size() so
that we check the memory size in a single place:
qemu_ram_alloc_from_fd(). This simplifies the code and makes it more
general.
This also fixes the problem that hostmem-file only checks the devdax
pmem file size when the pmem=on parameter is given. An unchecked
size=NUM parameter can lead to SIGBUS in QEMU so we must always fetch
the file size for Linux devdax pmem character device nodes.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20190830093056.12572-1-stefanha@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
I encountered the following compilation error on mingw:
/mnt/d/qemu/include/qemu/osdep.h:97:9: error: '__USE_MINGW_ANSI_STDIO' macro redefined [-Werror,-Wmacro-redefined]
#define __USE_MINGW_ANSI_STDIO 1
^
/mnt/d/llvm-mingw/aarch64-w64-mingw32/include/_mingw.h:433:9: note: previous definition is here
#define __USE_MINGW_ANSI_STDIO 0 /* was not defined so it should be 0 */
It turns out that __USE_MINGW_ANSI_STDIO must be set before any
system headers are included, not just before stdio.h.
Signed-off-by: Cao Jiaxi <driver1998@foxmail.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Stefan Weil <sw@weilnetz.de>
Message-id: 20190503003719.10233-1-driver1998@foxmail.com
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Guests started with NVDIMMs larger than the underlying host file produce
confusing errors inside the guest. This happens because the guest
accesses pages beyond the end of the file.
Check the pmem file size on startup and print a clear error message if
the size is invalid.
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1669053
Cc: Wei Yang <richardw.yang@linux.intel.com>
Cc: Zhang Yi <yi.z.zhang@linux.intel.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20190214031004.32522-3-stefanha@redhat.com>
Reviewed-by: Wei Yang <richardw.yang@linux.intel.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Pankaj Gupta <pagupta@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
On FreeBSD 11.2:
$ nbdkit memory size=1M --run './qemu-io -f raw -c "aio_write 0 512" $nbd'
Parsing error: non-numeric argument, or extraneous/unrecognized suffix -- aio_write
After main option parsing, we reinitialize optind so we can parse each
command. However reinitializing optind to 0 does not work on FreeBSD.
What happens when you do this is optind remains 0 after the option
parsing loop, and the result is we try to parse argv[optind] ==
argv[0] == "aio_write" as if it was the first parameter.
The FreeBSD manual page says:
In order to use getopt() to evaluate multiple sets of arguments, or to
evaluate a single set of arguments multiple times, the variable optreset
must be set to 1 before the second and each additional set of calls to
getopt(), and the variable optind must be reinitialized.
(From the rest of the man page it is clear that optind must be
reinitialized to 1).
The glibc man page says:
A program that scans multiple argument vectors, or rescans the same
vector more than once, and wants to make use of GNU extensions such as
'+' and '-' at the start of optstring, or changes the value of
POSIXLY_CORRECT between scans, must reinitialize getopt() by resetting
optind to 0, rather than the traditional value of 1. (Resetting to 0
forces the invocation of an internal initialization routine that
rechecks POSIXLY_CORRECT and checks for GNU extensions in optstring.)
This commit introduces an OS-portability function called
qemu_reset_optind which provides a way of resetting optind that works
on FreeBSD and platforms that use optreset, while keeping it the same
as now on other platforms.
Note that the qemu codebase sets optind in many other places, but in
those other places it's setting a local variable and not using getopt.
This change is only needed in places where we are using getopt and the
associated global variable optind.
Signed-off-by: Richard W.M. Jones <rjones@redhat.com>
Message-id: 20190118101114.11759-2-rjones@redhat.com
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Both qemu & qga build with Vista API by default already, by defining
_WIN32_WINNT 0x0600. Set it globally in osdep.h instead.
This replaces WINVER by _WIN32_WINNT in osdep.h. WINVER doesn't seem
to be really useful these days.
(see also https://blogs.msdn.microsoft.com/oldnewthing/20070411-00/?p=27283)
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20181122110039.15972-4-marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This removes some clutter in compilation logging, and allows some
easier tweaking per compilation unit/CFLAGS overriding.
Note that we can't move those define in os-win32.h, since they must be
set before the first system headers are included.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20181122110039.15972-3-marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
In several places we use assert(FEATURE), and assume that if FEATURE
is disabled, all following code is removed as unreachable. Which allows
us to compile-out functions that are only present with FEATURE, and
have a link-time failure if the functions remain used.
MinGW does not mark its internal function _assert() as noreturn, so the
compiler cannot see when code is unreachable, which leads to link errors
for this host that are not present elsewhere.
The current build-time failure concerns 62823083b8a2, but I remember
having seen this same error before. Fix it once and for all for MinGW.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20181022181623.8810-1-richard.henderson@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Coverity does not like the new _Float* types that are used by
recent glibc, and croaks on every single file that includes
stdlib.h. Add dummy typedefs to please it.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This allows KVM with the Book3S radix MMU mode to take advantage of
THP and install larger pages in the partition scope page tables (the
host translation).
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Man page for WCOREDUMP says:
WCOREDUMP(wstatus) returns true if the child produced a core dump.
This macro should be employed only if WIFSIGNALED returned true.
This macro is not specified in POSIX.1-2001 and is not
available on some UNIX implementations (e.g., AIX, SunOS). Therefore,
enclose its use inside #ifdef WCOREDUMP ... #endif.
Let's do exactly this.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Currently only file backed memory backend can
be created with a "share" flag in order to allow
sharing guest RAM with other processes in the host.
Add the "share" flag also to RAM Memory Backend
in order to allow remapping parts of the guest RAM
to different host virtual addresses. This is needed
by the RDMA devices in order to remap non-contiguous
QEMU virtual addresses to a contiguous virtual address range.
Moved the "share" flag to the Host Memory base class,
modified phys_mem_alloc to include the new parameter
and a new interface memory_region_init_ram_shared_nomigrate.
There are no functional changes if the new flag is not used.
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>