6135 Commits

Author SHA1 Message Date
Peter Maydell
9e654e1019 block: Make block-copy API thread-safe
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEi5wmzbL9FHyIDoahVh8kwfGfefsFAmDVzrgACgkQVh8kwfGf
 efuoKRAAsHqE46P2xwjjPROtwZC6HP/Ny5xfbF2CglfC4eX4ff89dwWSagjHfBig
 1TQjHemOo5Fs8gyBGec0WPn0HaBVFthq75VGObt+QkHy7+owGDmtVghkXnEO8z2c
 gldPoVeOXAbU4DZXgITQYfN1ljCOdrdKQMrMYmKqXLmw/vMzAfKlBKqsOFQiyVys
 4egn25QxyiOh+T29zyVwmGVABaH2TIuJqkDr+iMh4IypYZf0BlqJRw++kLhGdxGq
 RIpXQiXExy3lRm8htuh+GDAigSXNz93XKU3ZHe8RPBtPfdS0siGWwVTW2nLsH+Rc
 vfUvfkqBSGcFaFjGHg/eNGvoMTYAZKHx8yq72voqrfFIPtz3NAoS2ahz/hIU/NiL
 YLOmOcRZVx+xJ5lxjaxi/SvbTHVtZoBym/Aje/YiT3v4A8rziG6BeBUP9ec/bk+D
 YYxMZNfzW2jVq0Vl4TZyYmV8e/H8Ha3HbLLJip3tiLXsBIjajfNti9iJ/xac5NzD
 jV5pR27yIXilYHPR7GCYaMRp5LJv8uGiW704yAt2dwBizqonm7cgTg5Fcx0HFDyk
 +HyPwu/TGI3cEF7dl+8V+AmwKug+jFj3VFVh5UMtf4bqNeliYNx+QpCUuWPuPMFc
 U1nXWYBtcklD4JNPERUgUW7x2tmAJN5dGDY0LAa2brrOnKVTTwU=
 =JXBO
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/vsementsov/tags/pull-jobs-2021-06-25' into staging

block: Make block-copy API thread-safe

# gpg: Signature made Fri 25 Jun 2021 13:40:24 BST
# gpg:                using RSA key 8B9C26CDB2FD147C880E86A1561F24C1F19F79FB
# gpg: Good signature from "Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 8B9C 26CD B2FD 147C 880E  86A1 561F 24C1 F19F 79FB

* remotes/vsementsov/tags/pull-jobs-2021-06-25:
  block-copy: atomic .cancelled and .finished fields in BlockCopyCallState
  block-copy: add CoMutex lock
  block-copy: move progress_set_remaining in block_copy_task_end
  block-copy: streamline choice of copy_range vs. read/write
  block-copy: small refactor in block_copy_task_entry and block_copy_common
  co-shared-resource: protect with a mutex
  progressmeter: protect with a mutex
  blockjob: let ratelimit handle a speed of 0
  block-copy: let ratelimit handle a speed of 0
  ratelimit: treat zero speed as unlimited

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2021-06-28 18:58:19 +01:00
Emanuele Giuseppe Esposito
149009bef4 block-copy: atomic .cancelled and .finished fields in BlockCopyCallState
By adding acquire/release pairs, we ensure that .ret and .error_is_read
fields are written by block_copy_dirty_clusters before .finished is true,
and that they are read by API user after .finished is true.

The atomic here are necessary because the fields are concurrently modified
in coroutines, and read outside.

Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Message-Id: <20210624072043.180494-6-eesposit@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
2021-06-25 14:33:51 +03:00
Emanuele Giuseppe Esposito
d0c389d2ce block-copy: add CoMutex lock
Group various structures fields, to better understand what we need to
protect with a lock and what doesn't need it.
Then, add a CoMutex to protect concurrent access of block-copy
data structures. This mutex also protects .copy_bitmap, because its thread-safe
API does not prevent it from assigning two tasks to the same
bitmap region.

Exceptions to the lock:
- .sleep_state is handled in the series "coroutine: new sleep/wake API"
and thus here left as TODO.

- .finished, .cancelled and reads to .ret and .error_is_read will be
protected in the following patch, because are used also outside
coroutines.

- .skip_unallocated is atomic. Including it under the mutex would
increase the critical sections and make them also much more complex.
We can have it as atomic since it is only written from outside and
read by block-copy coroutines.

Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Message-Id: <20210624072043.180494-5-eesposit@redhat.com>
  [vsementsov: fix typo in comment]
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
2021-06-25 14:33:39 +03:00
Emanuele Giuseppe Esposito
e3dd339fee block-copy: move progress_set_remaining in block_copy_task_end
Moving this function in task_end ensures to update the progress
anyways, even if there is an error.

It also helps in next patch, allowing task_end to have only
one critical section.

Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Message-Id: <20210624072043.180494-4-eesposit@redhat.com>
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
2021-06-25 14:33:35 +03:00
Paolo Bonzini
05d5e12b24 block-copy: streamline choice of copy_range vs. read/write
Put the logic to determine the copy size in a separate function, so
that there is a simple state machine for the possible methods of
copying data from one BlockDriverState to the other.

Use .method instead of .copy_range as in-out argument, and
include also .zeroes as an additional copy method.

While at it, store the common computation of block_copy_max_transfer
into a new field of BlockCopyState, and make sure that we always
obey max_transfer; that's more efficient even for the
COPY_RANGE_READ_WRITE case.

Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20210624072043.180494-3-eesposit@redhat.com>
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
2021-06-25 14:33:33 +03:00
Emanuele Giuseppe Esposito
c6a3e3df30 block-copy: small refactor in block_copy_task_entry and block_copy_common
Use a local variable instead of referencing BlockCopyState through a
BlockCopyCallState or BlockCopyTask every time.
This is in preparation for next patches.

No functional change intended.

Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Message-Id: <20210624072043.180494-2-eesposit@redhat.com>
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
2021-06-25 14:32:09 +03:00
Emanuele Giuseppe Esposito
a7b4f8fc09 progressmeter: protect with a mutex
Progressmeter is protected by the AioContext mutex, which
is taken by the block jobs and their caller (like blockdev).

We would like to remove the dependency of block layer code on the
AioContext mutex, since most drivers and the core I/O code are already
not relying on it.

Create a new C file to implement the ProgressMeter API, but keep the
struct as public, to avoid forcing allocation on the heap.

Also add a mutex to be able to provide an accurate snapshot of the
progress values to the caller.

Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20210614081130.22134-5-eesposit@redhat.com>
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
2021-06-25 14:24:24 +03:00
Paolo Bonzini
ca657c99e6 block-copy: let ratelimit handle a speed of 0
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Message-Id: <20210614081130.22134-3-eesposit@redhat.com>
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
2021-06-25 14:24:16 +03:00
Paolo Bonzini
bd80936a4f file-posix: handle EINTR during ioctl
Similar to other handle_aiocb_* functions, handle_aiocb_ioctl needs to cater
for the possibility that ioctl is interrupted by a signal.  Otherwise, the
I/O is incorrectly reported as a failure to the guest.

Reported-by: Gordon Watson <gwatson@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-06-25 10:54:13 +02:00
Joelle van Dyne
09e20abdda block: detect DKIOCGETBLOCKCOUNT/SIZE before use
iOS hosts do not have these defined so we fallback to the
default behaviour.

Co-authored-by: Warner Losh <imp@bsdimp.com>
Signed-off-by: Joelle van Dyne <j@getutm.app>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-06-25 10:54:13 +02:00
Paolo Bonzini
267cd53f5f block: try BSD disk size ioctls one after another
Try all the possible ioctls for disk size as long as they are
supported, to keep the #if ladder simple.

Extracted and cleaned up from a patch by Joelle van Dyne and
Warner Losh.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-06-25 10:54:13 +02:00
Joelle van Dyne
14176c8d05 block: feature detection for host block support
On Darwin (iOS), there are no system level APIs for directly accessing
host block devices. We detect this at configure time.

Signed-off-by: Joelle van Dyne <j@getutm.app>
Message-Id: <20210315180341.31638-2-j@getutm.app>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-06-25 10:54:13 +02:00
Paolo Bonzini
18473467d5 file-posix: try BLKSECTGET on block devices too, do not round to power of 2
bs->sg is only true for character devices, but block devices can also
be used with scsi-block and scsi-generic.  Unfortunately BLKSECTGET
returns bytes in an int for /dev/sgN devices, and sectors in a short
for block devices, so account for that in the code.

The maximum transfer also need not be a power of 2 (for example I have
seen disks with 1280 KiB maximum transfer) so there's no need to pass
the result through pow2floor.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-06-25 10:54:13 +02:00
Paolo Bonzini
24b36e9813 block: add max_hw_transfer to BlockLimits
For block host devices, I/O can happen through either the kernel file
descriptor I/O system calls (preadv/pwritev, io_submit, io_uring)
or the SCSI passthrough ioctl SG_IO.

In the latter case, the size of each transfer can be limited by the
HBA, while for file descriptor I/O the kernel is able to split and
merge I/O in smaller pieces as needed.  Applying the HBA limits to
file descriptor I/O results in more system calls and suboptimal
performance, so this patch splits the max_transfer limit in two:
max_transfer remains valid and is used in general, while max_hw_transfer
is limited to the maximum hardware size.  max_hw_transfer can then be
included by the scsi-generic driver in the block limits page, to ensure
that the stricter hardware limit is used.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-06-25 10:54:13 +02:00
Paolo Bonzini
b99f7fa08a block-backend: align max_transfer to request alignment
Block device requests must be aligned to bs->bl.request_alignment.
It makes sense for drivers to align bs->bl.max_transfer the same
way; however when there is no specified limit, blk_get_max_transfer
just returns INT_MAX.  Since the contract of the function does not
specify that INT_MAX means "no maximum", just align the outcome
of the function (whether INT_MAX or bs->bl.max_transfer) before
returning it.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-06-25 10:54:13 +02:00
Paolo Bonzini
01ef8185b8 scsi-generic: pass max_segments via max_iov field in BlockLimits
I/O to a disk via read/write is not limited by the number of segments allowed
by the host adapter; the kernel can split requests if needed, and the limit
imposed by the host adapter can be very low (256k or so) to avoid that SG_IO
returns EINVAL if memory is heavily fragmented.

Since this value is only interesting for SG_IO-based I/O, do not include
it in the max_transfer and only take it into account when patching the
block limits VPD page in the scsi-generic device.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2021-06-25 10:54:12 +02:00
Paolo Bonzini
8ad5ab6148 file-posix: fix max_iov for /dev/sg devices
Even though it was only called for devices that have bs->sg set (which
must be character devices), sg_get_max_segments looked at /sys/dev/block
which only works for block devices.

On Linux the sg driver has its own way to provide the maximum number of
iovecs in a scatter/gather list, so add support for it.  The block device
path is kept because it will be reinstated in the next patches.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2021-06-25 10:54:12 +02:00
Max Reitz
32a9a245d7 block/snapshot: Clarify goto fallback behavior
In the bdrv_snapshot_goto() fallback code, we work with a pointer to
either bs->file or bs->backing.  We detach that child, close the node
(with .bdrv_close()), apply the snapshot on the child node, and then
re-open the node (with .bdrv_open()).

In order for .bdrv_open() to attach the same child node that we had
before, we pass "file={child-node}" or "backing={child-node}" to it.
Therefore, when .bdrv_open() has returned success, we can assume that
bs->file or bs->backing (respectively) points to our original child
again.  This is verified by an assertion.

All of this is not immediately clear from a quick glance at the code,
so add a comment to the assertion what it is for, and why it is valid.
It certainly confused Coverity.

Reported-by: Coverity (CID 1452774)
Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-Id: <20210503095418.31521-1-mreitz@redhat.com>
[mreitz: s/close/detach/]
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
2021-06-24 09:49:04 +02:00
Vladimir Sementsov-Ogievskiy
bbfb7c2f35 block/nbd: safer transition to receiving request
req->receiving is a flag of request being in one concrete yield point
in nbd_co_do_receive_one_chunk().

Such kind of boolean flag is always better to unset before scheduling
the coroutine, to avoid double scheduling. So, let's be more careful.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20210610100802.5888-33-vsementsov@virtuozzo.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2021-06-18 12:21:22 -05:00
Vladimir Sementsov-Ogievskiy
91e0998f5a block/nbd: add nbd_client_connected() helper
We already have two similar helpers for other state. Let's add another
one for convenience.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20210610100802.5888-32-vsementsov@virtuozzo.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2021-06-18 12:21:22 -05:00
Vladimir Sementsov-Ogievskiy
a71d597b98 block/nbd: reuse nbd_co_do_establish_connection() in nbd_open()
The only last step we need to reuse the function is coroutine-wrapper.
nbd_open() may be called from non-coroutine context. So, generate the
wrapper and use it.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20210610100802.5888-31-vsementsov@virtuozzo.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2021-06-18 12:21:22 -05:00
Vladimir Sementsov-Ogievskiy
97cf89259e nbd/client-connection: add option for non-blocking connection attempt
We'll need a possibility of non-blocking nbd_co_establish_connection(),
so that it returns immediately, and it returns success only if a
connections was previously established in background.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20210610100802.5888-30-vsementsov@virtuozzo.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2021-06-18 12:21:22 -05:00
Vladimir Sementsov-Ogievskiy
51edbf537d block/nbd: split nbd_co_do_establish_connection out of nbd_reconnect_attempt
Split out the part that we want to reuse for nbd_open().

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20210610100802.5888-29-vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2021-06-18 12:21:21 -05:00
Vladimir Sementsov-Ogievskiy
43cb34dede nbd/client-connection: return only one io channel
block/nbd doesn't need underlying sioc channel anymore. So, we can
update nbd/client-connection interface to return only one top-most io
channel, which is more straight forward.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20210610100802.5888-27-vsementsov@virtuozzo.com>
[eblake: squash in Vladimir's fixes for uninit usage caught by clang]
Signed-off-by: Eric Blake <eblake@redhat.com>
2021-06-18 12:20:53 -05:00
Vladimir Sementsov-Ogievskiy
95a078ea3e block/nbd: drop BDRVNBDState::sioc
Currently sioc pointer is used just to pass from socket-connection to
nbd negotiation. Drop the field, and use local variables instead. With
next commit we'll update nbd/client-connection.c to behave
appropriately (return only top-most ioc, not two channels).

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20210610100802.5888-26-vsementsov@virtuozzo.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2021-06-18 10:59:54 -05:00
Vladimir Sementsov-Ogievskiy
c2405af0e4 block/nbd: don't touch s->sioc in nbd_teardown_connection()
Negotiation during reconnect is now done in a thread, and s->sioc is
not available during negotiation. Negotiation in thread will be
cancelled by nbd_client_connection_release() called from
nbd_clear_bdrvstate().  So, we don't need this code chunk anymore.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20210610100802.5888-25-vsementsov@virtuozzo.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2021-06-18 10:59:54 -05:00
Vladimir Sementsov-Ogievskiy
6d2b0332d3 block/nbd: use negotiation of NBDClientConnection
Now that we can opt in to negotiation as part of the client connection
thread, use that to simplify connection_co.  This is another step on
the way to moving all reconnect code into NBDClientConnection.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20210610100802.5888-24-vsementsov@virtuozzo.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2021-06-18 10:59:54 -05:00
Vladimir Sementsov-Ogievskiy
e9ba7788b0 block/nbd: split nbd_handle_updated_info out of nbd_client_handshake()
To be reused in the following patch.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Roman Kagan <rvkagan@yandex-team.ru>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20210610100802.5888-23-vsementsov@virtuozzo.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2021-06-18 10:59:54 -05:00
Vladimir Sementsov-Ogievskiy
130d49baa5 nbd/client-connection: add possibility of negotiation
Add arguments and logic to support nbd negotiation in the same thread
after successful connection.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20210610100802.5888-20-vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2021-06-18 10:59:53 -05:00
Vladimir Sementsov-Ogievskiy
5276c87c12 nbd: move connection code from block/nbd to nbd/client-connection
We now have bs-independent connection API, which consists of four
functions:

  nbd_client_connection_new()
  nbd_client_connection_release()
  nbd_co_establish_connection()
  nbd_co_establish_connection_cancel()

Move them to a separate file together with NBDClientConnection
structure which becomes private to the new API.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20210610100802.5888-18-vsementsov@virtuozzo.com>
[eblake: comment tweaks]
Signed-off-by: Eric Blake <eblake@redhat.com>
2021-06-18 10:59:53 -05:00
Vladimir Sementsov-Ogievskiy
248d470198 block/nbd: introduce nbd_client_connection_release()
This is a last step of creating bs-independent nbd connection
interface. With next commit we can finally move it to separate file.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20210610100802.5888-17-vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2021-06-18 10:59:53 -05:00
Vladimir Sementsov-Ogievskiy
f68729747d block/nbd: introduce nbd_client_connection_new()
This is a step of creating bs-independent nbd connection interface.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Roman Kagan <rvkagan@yandex-team.ru>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20210610100802.5888-16-vsementsov@virtuozzo.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2021-06-18 10:59:53 -05:00
Vladimir Sementsov-Ogievskiy
90ddc64fb2 block/nbd: rename NBDConnectThread to NBDClientConnection
We are going to move the connection code to its own file, and want
clear names and APIs first.

The structure is shared between user and (possibly) several runs of
connect-thread. So it's wrong to call it "thread". Let's rename to
something more generic.

Appropriately rename connect_thread and thr variables to conn.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Roman Kagan <rvkagan@yandex-team.ru>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20210610100802.5888-15-vsementsov@virtuozzo.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2021-06-18 10:59:53 -05:00
Vladimir Sementsov-Ogievskiy
c3e7730485 block/nbd: make nbd_co_establish_connection_cancel() bs-independent
nbd_co_establish_connection_cancel() actually needs only pointer to
NBDConnectThread. So, make it clean.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Roman Kagan <rvkagan@yandex-team.ru>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20210610100802.5888-14-vsementsov@virtuozzo.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2021-06-18 10:59:53 -05:00
Vladimir Sementsov-Ogievskiy
d33833d7af block/nbd: bs-independent interface for nbd_co_establish_connection()
We are going to split connection code to a separate file. Now we are
ready to give nbd_co_establish_connection() clean and bs-independent
interface.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Roman Kagan <rvkagan@yandex-team.ru>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20210610100802.5888-13-vsementsov@virtuozzo.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2021-06-18 10:59:53 -05:00
Vladimir Sementsov-Ogievskiy
b8e8a3d116 block/nbd: drop thr->state
We don't need all these states. The code refactored to use two boolean
variables looks simpler.

While moving the comment in nbd_co_establish_connection() rework it to
give better information. Also, we are going to move the connection code
to separate file and mentioning drained section would be confusing.

Improve also the comment in NBDConnectThread, while dropping removed
state names from it.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20210610100802.5888-12-vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
[eblake: comment tweak]
Signed-off-by: Eric Blake <eblake@redhat.com>
2021-06-18 10:59:53 -05:00
Vladimir Sementsov-Ogievskiy
08ea55d068 block/nbd: simplify waking of nbd_co_establish_connection()
Instead of managing connect_bh, bh_ctx, and wait_connect fields, we
can use a single link to the waiting coroutine with proper mutex
protection.

So new logic is:

nbd_co_establish_connection() sets wait_co under the mutex, releases
the mutex, then yield()s.  Note that wait_co may be scheduled by the
thread immediately after unlocking the mutex.  Still, the main thread
(or iothread) will not reach the code for entering the coroutine until
the yield(), so we are safe.

connect_thread_func() and nbd_co_establish_connection_cancel() do
the following to handle wait_co:

Under the mutex, if thr->wait_co is not NULL, make it NULL and
schedule it. This way, we avoid scheduling the coroutine twice.

Still scheduling is a bit different:

In connect_thread_func() we can just call aio_co_wake under mutex,
after commit
   [async: the main AioContext is only "current" if under the BQL]
we are sure that aio_co_wake() will not try to acquire the aio context
and do qemu_aio_coroutine_enter() but simply schedule the coroutine by
aio_co_schedule().

nbd_co_establish_connection_cancel() will be called from non-coroutine
context in further patch and will be able to go through
qemu_aio_coroutine_enter() path of aio_co_wake(). So keep current
behavior of waking the coroutine after the critical section.

Also, this commit reduces the dependence of
nbd_co_establish_connection() on the internals of bs (we now use a
generic pointer to the coroutine, instead of direct use of
s->connection_co).  This is a step towards splitting the connection
API out of nbd.c.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20210610100802.5888-11-vsementsov@virtuozzo.com>
Reviewied-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2021-06-18 10:59:53 -05:00
Vladimir Sementsov-Ogievskiy
2def3edb4b block/nbd: BDRVNBDState: drop unused connect_err and connect_status
These fields are write-only. Drop them.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Roman Kagan <rvkagan@yandex-team.ru>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20210610100802.5888-10-vsementsov@virtuozzo.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2021-06-18 10:59:53 -05:00
Vladimir Sementsov-Ogievskiy
2a25def4be block/nbd: nbd_client_handshake(): fix leak of s->ioc
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Roman Kagan <rvkagan@yandex-team.ru>
Message-Id: <20210610100802.5888-9-vsementsov@virtuozzo.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2021-06-18 10:59:53 -05:00
Roman Kagan
e8b35bf5dc block/nbd: ensure ->connection_thread is always valid
Simplify lifetime management of BDRVNBDState->connect_thread by
delaying the possible cleanup of it until the BDRVNBDState itself goes
away.

This also reverts
 0267101af6 "block/nbd: fix possible use after free of s->connect_thread"
as now s->connect_thread can't be cleared until the very end.

Signed-off-by: Roman Kagan <rvkagan@yandex-team.ru>
 [vsementsov: rebase, revert 0267101af6 changes]
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
 [eblake: tweak comment]
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20210610100802.5888-8-vsementsov@virtuozzo.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2021-06-18 10:59:53 -05:00
Vladimir Sementsov-Ogievskiy
6cc702beac block/nbd: call socket_address_parse_named_fd() in advance
Detecting monitor by current coroutine works bad when we are not in
coroutine context. And that's exactly so in nbd reconnect code, where
qio_channel_socket_connect_sync() is called from thread.

Monitor is needed only to parse named file descriptor. So, let's just
parse it during nbd_open(), so that all further users of s->saddr don't
need to access monitor.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20210610100802.5888-7-vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2021-06-18 10:59:53 -05:00
Vladimir Sementsov-Ogievskiy
fb392b548e block/nbd: connect_thread_func(): do qio_channel_set_delay(false)
nbd_open() does it (through nbd_establish_connection()).
Actually we lost that call on reconnect path in 1dc4718d849e1a1fe
"block/nbd: use non-blocking connect: fix vm hang on connect()"
when we have introduced reconnect thread.

Fixes: 1dc4718d849e1a1fe665ce5241ed79048cfa2cfc
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20210610100802.5888-5-vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2021-06-18 10:59:53 -05:00
Vladimir Sementsov-Ogievskiy
bbba1c376b block/nbd: fix how state is cleared on nbd_open() failure paths
We have two "return error" paths in nbd_open() after
nbd_process_options(). Actually we should call nbd_clear_bdrvstate()
on these paths. Interesting that nbd_process_options() calls
nbd_clear_bdrvstate() by itself.

Let's fix leaks and refactor things to be more obvious:

- intialize yank at top of nbd_open()
- move yank cleanup to nbd_clear_bdrvstate()
- refactor nbd_open() so that all failure paths except for
  yank-register goes through nbd_clear_bdrvstate()

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Roman Kagan <rvkagan@yandex-team.ru>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20210610100802.5888-4-vsementsov@virtuozzo.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2021-06-18 10:59:53 -05:00
Roman Kagan
3687ad4903 block/nbd: fix channel object leak
nbd_free_connect_thread leaks the channel object if it hasn't been
stolen.

Unref it and fix the leak.

Signed-off-by: Roman Kagan <rvkagan@yandex-team.ru>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20210610100802.5888-3-vsementsov@virtuozzo.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2021-06-18 10:59:52 -05:00
Daniel P. Berrangé
39683553f9 block: use GDateTime for formatting timestamp when dumping snapshot info
The GDateTime APIs provided by GLib avoid portability pitfalls, such
as some platforms where 'struct timeval.tv_sec' field is still 'long'
instead of 'time_t'. When combined with automatic cleanup, GDateTime
often results in simpler code too.

Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2021-06-14 13:28:50 +01:00
Daniel P. Berrangé
99be1ac366 block: remove duplicate trace.h include
Reviewed-by: Connor Kuehl <ckuehl@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2021-06-14 13:28:50 +01:00
Daniel P. Berrangé
60ff2ae2a2 block: add trace point when fdatasync fails
A flush failure is a critical failure scenario for some operations.
For example, it will prevent migration from completing, as it will
make vm_stop() report an error. Thus it is important to have a
trace point present for debugging.

Reviewed-by: Connor Kuehl <ckuehl@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2021-06-14 13:28:50 +01:00
Daniel P. Berrangé
c7ddc8821d block: preserve errno from fdatasync failures
When fdatasync() fails on a file backend we set a flag that
short-circuits any future attempts to call fdatasync(). The
first failure returns the true errno, but the later short-
circuited calls return a generic EIO. The latter is unhelpful
because fdatasync() can return a variety of errnos, including
EACCESS.

Reviewed-by: Connor Kuehl <ckuehl@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2021-06-14 13:28:50 +01:00
Paolo Bonzini
7fa1c63553 iscsi: link libm into the module
Depending on the configuration of QEMU, some binaries might not need libm
at all.  In that case libiscsi, which uses exp(), will fail to load.
Link it in the module explicitly.

Reported-by: Yi Sun <yisun@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-06-04 13:47:07 +02:00
Paolo Bonzini
96acfb1f25 meson: allow optional dependencies for block modules
Right now all dependencies for block modules are passed to
module_ss.add(when: ...), so they are mandatory.  In the next patch we
will need to add a libm dependency to a module, but libm does not exist
on all systems.  So, modify the creation of module_ss and modsrc so that
dependencies can also be passed to module_ss.add(if_true: ...).

While touching the array, remove the useless dependency of the curl
module on glib.  glib is always linked in QEMU and in fact all other
block modules also need it, but they don't have to specify it.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-06-04 13:47:07 +02:00