This patch introduces VHOST_USER_PROTOCOL_F_HOST_NOTIFIER. With this
feature negotiated, the vhost-user backend can register memory region
based host notifiers, which allow the guest driver in the VM to notify
the hardware accelerator at the vhost-user backend directly.

Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

Vhost-user Protocol
===================

Copyright (c) 2014 Virtual Open Systems Sarl.

This work is licensed under the terms of the GNU GPL, version 2 or later.
See the COPYING file in the top-level directory.
===================

This protocol aims to complement the ioctl interface used to control the
vhost implementation in the Linux kernel. It implements the control plane needed
to establish virtqueue sharing with a user space process on the same host. It
uses communication over a Unix domain socket to share file descriptors in the
ancillary data of the message.

The protocol defines 2 sides of the communication, master and slave. Master is
the application that shares its virtqueues, in our case QEMU. Slave is the
consumer of the virtqueues.

In the current implementation QEMU is the Master, and the Slave is intended to
be a software Ethernet switch running in user space, such as Snabbswitch.

Master and slave can be either a client (i.e. connecting) or server (listening)
in the socket communication.

Message Specification
---------------------

Note that all numbers are in the machine native byte order. A vhost-user message
consists of 3 header fields and a payload:

------------------------------------
| request | flags | size | payload |
------------------------------------

 * Request: 32-bit type of the request
 * Flags: 32-bit bit field:
   - Lower 2 bits are the version (currently 0x01)
   - Bit 2 is the reply flag - needs to be sent on each reply from the slave
   - Bit 3 is the need_reply flag - see VHOST_USER_PROTOCOL_F_REPLY_ACK for
     details.
 * Size: 32-bit size of the payload

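As an illustration of the flags field described above, the following is a
minimal sketch of how the version, reply and need_reply bits could be composed
and checked. The macro and function names are hypothetical and are not taken
from any implementation; only the bit positions come from the list above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names; the bit layout follows the flags description. */
#define VU_FLAG_VERSION_MASK 0x3u       /* lower 2 bits hold the version */
#define VU_FLAG_VERSION      0x1u       /* current protocol version */
#define VU_FLAG_REPLY        (1u << 2)  /* bit 2: set on each slave reply */
#define VU_FLAG_NEED_REPLY   (1u << 3)  /* bit 3: see REPLY_ACK */

/* Flags for a request, optionally asking for an explicit reply. */
static inline uint32_t vu_request_flags(bool need_reply)
{
    return VU_FLAG_VERSION | (need_reply ? VU_FLAG_NEED_REPLY : 0u);
}

/* Flags for a reply sent by the slave. */
static inline uint32_t vu_reply_flags(void)
{
    return VU_FLAG_VERSION | VU_FLAG_REPLY;
}

/* True if the version bits match the current version. */
static inline bool vu_flags_valid_version(uint32_t flags)
{
    return (flags & VU_FLAG_VERSION_MASK) == VU_FLAG_VERSION;
}

/* True if the reply bit is set. */
static inline bool vu_flags_is_reply(uint32_t flags)
{
    return (flags & VU_FLAG_REPLY) != 0;
}
```

A receiver would typically check the version bits first and drop the
connection on a mismatch, in line with the error handling described in the
Communication section.
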
Depending on the request type, payload can be:

 * A single 64-bit integer
   -------
   | u64 |
   -------

   u64: a 64-bit unsigned integer

 * A vring state description
   ---------------
   | index | num |
   ---------------

   Index: a 32-bit index
   Num: a 32-bit number

 * A vring address description
   --------------------------------------------------------------
   | index | flags | size | descriptor | used | available | log |
   --------------------------------------------------------------

   Index: a 32-bit vring index
   Flags: a 32-bit vring flags
   Descriptor: a 64-bit ring address of the vring descriptor table
   Used: a 64-bit ring address of the vring used ring
   Available: a 64-bit ring address of the vring available ring
   Log: a 64-bit guest address for logging

   Note that a ring address is an IOVA if VIRTIO_F_IOMMU_PLATFORM has been
   negotiated.  Otherwise it is a user address.

 * Memory regions description
   ---------------------------------------------------
   | num regions | padding | region0 | ... | region7 |
   ---------------------------------------------------

   Num regions: a 32-bit number of regions
   Padding: 32-bit

   A region is:
   -----------------------------------------------------
   | guest address | size | user address | mmap offset |
   -----------------------------------------------------

   Guest address: a 64-bit guest address of the region
   Size: a 64-bit size
   User address: a 64-bit user address
   mmap offset: 64-bit offset where region starts in the mapped memory

 * Log description
   -------------------------
   | log size | log offset |
   -------------------------

   log size: size of area used for logging
   log offset: offset from start of supplied file descriptor
       where logging starts (i.e. where guest address 0 would be logged)

 * An IOTLB message
   ---------------------------------------------------------
   | iova | size | user address | permissions flags | type |
   ---------------------------------------------------------

   IOVA: a 64-bit I/O virtual address programmed by the guest
   Size: a 64-bit size
   User address: a 64-bit user address
   Permissions: an 8-bit value:
    - 0: No access
    - 1: Read access
    - 2: Write access
    - 3: Read/Write access
   Type: an 8-bit IOTLB message type:
    - 1: IOTLB miss
    - 2: IOTLB update
    - 3: IOTLB invalidate
    - 4: IOTLB access fail

 * Virtio device config space
   -----------------------------------
   | offset | size | flags | payload |
   -----------------------------------

   Offset: a 32-bit offset of virtio device's configuration space
   Size: a 32-bit configuration space access size in bytes
   Flags: a 32-bit value:
    - 0: Vhost master messages used for writeable fields
    - 1: Vhost master messages used for live migration
   Payload: Size bytes array holding the contents of the virtio
       device's configuration space

 * Vring area description
   -----------------------
   | u64 | size | offset |
   -----------------------

   u64: a 64-bit integer that contains the vring index and flags
   Size: a 64-bit size of this area
   Offset: a 64-bit offset of this area from the start of the
       supplied file descriptor

In QEMU the vhost-user message is implemented with the following struct:

typedef struct VhostUserMsg {
    VhostUserRequest request;
    uint32_t flags;
    uint32_t size;
    union {
        uint64_t u64;
        struct vhost_vring_state state;
        struct vhost_vring_addr addr;
        VhostUserMemory memory;
        VhostUserLog log;
        struct vhost_iotlb_msg iotlb;
        VhostUserConfig config;
        VhostUserVringArea area;
    };
} QEMU_PACKED VhostUserMsg;

Communication
-------------

The protocol for vhost-user is based on the existing implementation of vhost
for the Linux Kernel. Most messages that can be sent via the Unix domain socket
implementing vhost-user have an equivalent ioctl to the kernel implementation.

The communication consists of master sending message requests and slave sending
message replies. Most of the requests don't require replies. Here is a list of
the ones that do:

 * VHOST_USER_GET_FEATURES
 * VHOST_USER_GET_PROTOCOL_FEATURES
 * VHOST_USER_GET_VRING_BASE
 * VHOST_USER_SET_LOG_BASE (if VHOST_USER_PROTOCOL_F_LOG_SHMFD)

[ Also see the section on REPLY_ACK protocol extension. ]

There are several messages that the master sends with file descriptors passed
in the ancillary data:

 * VHOST_USER_SET_MEM_TABLE
 * VHOST_USER_SET_LOG_BASE (if VHOST_USER_PROTOCOL_F_LOG_SHMFD)
 * VHOST_USER_SET_LOG_FD
 * VHOST_USER_SET_VRING_KICK
 * VHOST_USER_SET_VRING_CALL
 * VHOST_USER_SET_VRING_ERR
 * VHOST_USER_SET_SLAVE_REQ_FD

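File descriptors travel as SCM_RIGHTS ancillary data on the Unix domain
socket. The sketch below shows one way to send and receive a message buffer
together with a single descriptor, using the standard sendmsg()/recvmsg()
calls. The helper names send_with_fd() and recv_with_fd() are hypothetical,
and a real implementation would also handle several descriptors per message,
short reads and truncated control data.

```c
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Send len bytes from buf on Unix socket sock, attaching fd as
 * SCM_RIGHTS ancillary data. Returns the sendmsg() result. */
static inline ssize_t send_with_fd(int sock, const void *buf, size_t len,
                                   int fd)
{
    union {
        char buf[CMSG_SPACE(sizeof(int))];
        struct cmsghdr align;           /* forces correct alignment */
    } control;
    struct iovec iov = { .iov_base = (void *)buf, .iov_len = len };
    struct msghdr msg;
    struct cmsghdr *cmsg;

    memset(&msg, 0, sizeof(msg));
    memset(&control, 0, sizeof(control));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = control.buf;
    msg.msg_controllen = sizeof(control.buf);

    cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

    return sendmsg(sock, &msg, 0);
}

/* Receive up to len bytes into buf. If the message carried a
 * descriptor it is stored in *fd, otherwise *fd is set to -1. */
static inline ssize_t recv_with_fd(int sock, void *buf, size_t len, int *fd)
{
    union {
        char buf[CMSG_SPACE(sizeof(int))];
        struct cmsghdr align;
    } control;
    struct iovec iov = { .iov_base = buf, .iov_len = len };
    struct msghdr msg;
    struct cmsghdr *cmsg;
    ssize_t n;

    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = control.buf;
    msg.msg_controllen = sizeof(control.buf);

    *fd = -1;
    n = recvmsg(sock, &msg, 0);
    if (n < 0) {
        return n;
    }
    for (cmsg = CMSG_FIRSTHDR(&msg); cmsg; cmsg = CMSG_NXTHDR(&msg, cmsg)) {
        if (cmsg->cmsg_level == SOL_SOCKET && cmsg->cmsg_type == SCM_RIGHTS) {
            memcpy(fd, CMSG_DATA(cmsg), sizeof(int));
            break;
        }
    }
    return n;
}
```

The received descriptor is a new descriptor in the receiving process that
refers to the same open file description as the one that was sent.
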
If Master is unable to send the full message or receives a wrong reply it will
close the connection. An optional reconnection mechanism can be implemented.

Any protocol extensions are gated by protocol feature bits,
which allows full backwards compatibility on both master
and slave.
As older slaves don't support negotiating protocol features,
a feature bit was dedicated for this purpose:
#define VHOST_USER_F_PROTOCOL_FEATURES 30

Starting and stopping rings
---------------------------

Client must only process each ring when it is started.

Client must only pass data between the ring and the
backend, when the ring is enabled.

If ring is started but disabled, client must process the
ring without talking to the backend.

For example, for a networking device, in the disabled state
client must not supply any new RX packets, but must process
and discard any TX packets.

If VHOST_USER_F_PROTOCOL_FEATURES has not been negotiated, the ring is
initialized in an enabled state.

If VHOST_USER_F_PROTOCOL_FEATURES has been negotiated, the ring is initialized
in a disabled state. Client must not pass data to/from the backend until ring
is enabled by VHOST_USER_SET_VRING_ENABLE with parameter 1, or after it has
been disabled by VHOST_USER_SET_VRING_ENABLE with parameter 0.

Each ring is initialized in a stopped state; client must not process it until
ring is started, or after it has been stopped.

Client must start ring upon receiving a kick (that is, detecting that file
descriptor is readable) on the descriptor specified by
VHOST_USER_SET_VRING_KICK, and stop ring upon receiving
VHOST_USER_GET_VRING_BASE.

While processing the rings (whether they are enabled or not), client must
support changing some configuration aspects on the fly.

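The started/stopped and enabled/disabled rules above amount to two
independent flags per ring. The following is a minimal sketch of that state
tracking; the struct and function names are hypothetical and only illustrate
the transitions, not how any particular backend implements them.

```c
#include <stdbool.h>

/* Hypothetical per-ring state kept by the client. */
struct ring_state {
    bool started;   /* set on kick, cleared by VHOST_USER_GET_VRING_BASE */
    bool enabled;   /* toggled by VHOST_USER_SET_VRING_ENABLE */
};

/* Rings start stopped; they start disabled only if
 * VHOST_USER_F_PROTOCOL_FEATURES has been negotiated. */
static inline void ring_init(struct ring_state *r, bool protocol_features)
{
    r->started = false;
    r->enabled = !protocol_features;
}

/* Kick fd became readable: start the ring. */
static inline void ring_on_kick(struct ring_state *r)
{
    r->started = true;
}

/* VHOST_USER_GET_VRING_BASE received: stop the ring. */
static inline void ring_on_get_vring_base(struct ring_state *r)
{
    r->started = false;
}

/* VHOST_USER_SET_VRING_ENABLE received with parameter num (1 or 0). */
static inline void ring_on_set_vring_enable(struct ring_state *r,
                                            unsigned int num)
{
    r->enabled = (num != 0);
}

/* The ring may be processed whenever it is started. */
static inline bool ring_may_process(const struct ring_state *r)
{
    return r->started;
}

/* Data may reach the backend only if the ring is started and enabled. */
static inline bool ring_may_pass_data(const struct ring_state *r)
{
    return r->started && r->enabled;
}
```

In the networking example, a ring for which ring_may_process() is true but
ring_may_pass_data() is false would still consume TX descriptors and discard
the packets.
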
Multiple queue support
----------------------

Multiple queue is treated as a protocol extension, hence the slave has to
implement protocol features first. The multiple queues feature is supported
only when the protocol feature VHOST_USER_PROTOCOL_F_MQ (bit 0) is set.

The max number of queue pairs the slave supports can be queried with message
VHOST_USER_GET_QUEUE_NUM. Master should stop when the number of
requested queues is bigger than that.

As all queues share one connection, the master uses a unique index for each
queue in the sent message to identify a specified queue. One queue pair
is enabled initially. More queues are enabled dynamically, by sending
message VHOST_USER_SET_VRING_ENABLE.

Migration
---------

During live migration, the master may need to track the modifications
the slave makes to the memory mapped regions. The client should mark
the dirty pages in a log. Once it complies with this logging, it may
declare the VHOST_F_LOG_ALL vhost feature.

To start/stop logging of data/used ring writes, server may send messages
VHOST_USER_SET_FEATURES with VHOST_F_LOG_ALL and VHOST_USER_SET_VRING_ADDR with
VHOST_VRING_F_LOG in ring's flags set to 1/0, respectively.

All the modifications to memory pointed by vring "descriptor" should
be marked. Modifications to "used" vring should be marked if
VHOST_VRING_F_LOG is part of ring's flags.

Dirty pages are of size:
#define VHOST_LOG_PAGE 0x1000

The log memory fd is provided in the ancillary data of
VHOST_USER_SET_LOG_BASE message when the slave has
VHOST_USER_PROTOCOL_F_LOG_SHMFD protocol feature.

The size of the log is supplied as part of VhostUserMsg
which should be large enough to cover all known guest
addresses. Log starts at the supplied offset in the
supplied file descriptor.
The log covers from address 0 to the maximum of guest
regions. In pseudo-code, to mark page at "addr" as dirty:

page = addr / VHOST_LOG_PAGE
log[page / 8] |= 1 << (page % 8)

Where addr is the guest physical address.

Use atomic operations, as the log may be concurrently manipulated.

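The pseudo-code above can be written with C11 atomics roughly as follows.
This is a sketch under assumptions: the function names are hypothetical,
log_base is assumed to already point at the start of the log area (that is,
the mapping plus the supplied log offset), and treating the shared mapping as
an array of _Atomic uint8_t is a simplification.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define VHOST_LOG_PAGE 0x1000

/* Mark the page containing guest physical address addr as dirty. */
static inline void log_mark_dirty(_Atomic uint8_t *log_base, uint64_t addr)
{
    uint64_t page = addr / VHOST_LOG_PAGE;

    /* Atomic OR, because other writers may update the same byte. */
    atomic_fetch_or(&log_base[page / 8], (uint8_t)(1u << (page % 8)));
}

/* Mark every page touched by the byte range [addr, addr + len). */
static inline void log_mark_range(_Atomic uint8_t *log_base, uint64_t addr,
                                  uint64_t len)
{
    uint64_t first, last, page;

    if (len == 0) {
        return;
    }
    first = addr / VHOST_LOG_PAGE;
    last = (addr + len - 1) / VHOST_LOG_PAGE;
    for (page = first; page <= last; page++) {
        atomic_fetch_or(&log_base[page / 8], (uint8_t)(1u << (page % 8)));
    }
}
```

A caller is still responsible for checking that the resulting byte index
falls within the log size supplied in the message.
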
Note that when logging modifications to the used ring (when VHOST_VRING_F_LOG
is set for this ring), log_guest_addr should be used to calculate the log
offset: the write to first byte of the used ring is logged at this offset from
log start. Also note that this value might be outside the legal guest physical
address range (i.e. does not have to be covered by the VhostUserMemory table),
but the bit offset of the last byte of the ring must fall within
the size supplied by VhostUserLog.

VHOST_USER_SET_LOG_FD is an optional message with an eventfd in
ancillary data; it may be used to inform the master that the log has
been modified.

Once the source has finished migration, rings will be stopped by
the source. No further update must be done before rings are
restarted.

In postcopy migration the slave is started before all the memory has been
received from the source host, and care must be taken to avoid accessing pages
that have yet to be received.  The slave opens a 'userfault'-fd and registers
the memory with it; this fd is then passed back over to the master.
The master services requests on the userfaultfd for pages that are accessed
and when the page is available it performs WAKE ioctl's on the userfaultfd
to wake the stalled slave.  The client indicates support for this via the
VHOST_USER_PROTOCOL_F_PAGEFAULT feature.

Memory access
-------------

The master sends a list of vhost memory regions to the slave using the
VHOST_USER_SET_MEM_TABLE message.  Each region has two base addresses: a guest
address and a user address.

Messages contain guest addresses and/or user addresses to reference locations
within the shared memory.  The mapping of these addresses works as follows.

User addresses map to the vhost memory region containing that user address.

When the VIRTIO_F_IOMMU_PLATFORM feature has not been negotiated:

 * Guest addresses map to the vhost memory region containing that guest
   address.

When the VIRTIO_F_IOMMU_PLATFORM feature has been negotiated:

 * Guest addresses are also called I/O virtual addresses (IOVAs).  They are
   translated to user addresses via the IOTLB.

 * The vhost memory region guest address is not used.

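For the case without VIRTIO_F_IOMMU_PLATFORM, the lookup described above
could be sketched as below. The struct and function names are hypothetical.
The sketch also assumes that the slave mapped each region's file descriptor
from offset 0, so that the region begins at mmap_base + mmap_offset; the
text above does not mandate that mapping strategy.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical slave-side record for one region received with
 * VHOST_USER_SET_MEM_TABLE. */
struct mem_region {
    uint64_t guest_addr;    /* guest address of the region */
    uint64_t size;          /* size of the region in bytes */
    uint64_t user_addr;     /* user address of the region */
    uint64_t mmap_offset;   /* offset of the region within the mapping */
    uint8_t *mmap_base;     /* assumption: fd mapped from offset 0 here */
};

/* Translate a guest address to a local pointer, or NULL if no region
 * contains it. */
static inline void *guest_to_local(const struct mem_region *regions,
                                   size_t n, uint64_t guest_addr)
{
    size_t i;

    for (i = 0; i < n; i++) {
        const struct mem_region *r = &regions[i];

        if (guest_addr >= r->guest_addr &&
            guest_addr - r->guest_addr < r->size) {
            return r->mmap_base + r->mmap_offset +
                   (guest_addr - r->guest_addr);
        }
    }
    return NULL;
}

/* Translate a user address to a local pointer, or NULL if no region
 * contains it. */
static inline void *user_to_local(const struct mem_region *regions,
                                  size_t n, uint64_t user_addr)
{
    size_t i;

    for (i = 0; i < n; i++) {
        const struct mem_region *r = &regions[i];

        if (user_addr >= r->user_addr &&
            user_addr - r->user_addr < r->size) {
            return r->mmap_base + r->mmap_offset +
                   (user_addr - r->user_addr);
        }
    }
    return NULL;
}
```

With VIRTIO_F_IOMMU_PLATFORM negotiated, an IOVA would first be translated to
a user address through the IOTLB and then resolved with user_to_local().
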
IOMMU support
-------------

When the VIRTIO_F_IOMMU_PLATFORM feature has been negotiated, the master
sends IOTLB entries update & invalidation by sending VHOST_USER_IOTLB_MSG
requests to the slave with a struct vhost_iotlb_msg as payload. For update
events, the iotlb payload has to be filled with the update message type (2),
the I/O virtual address, the size, the user virtual address, and the
permissions flags. Addresses and size must be within vhost memory regions set
via the VHOST_USER_SET_MEM_TABLE request. For invalidation events, the iotlb
payload has to be filled with the invalidation message type (3), the I/O virtual
address and the size. On success, the slave is expected to reply with a zero
payload, non-zero otherwise.

The slave relies on the slave communication channel (see "Slave communication"
section below) to send IOTLB miss and access failure events, by sending
VHOST_USER_SLAVE_IOTLB_MSG requests to the master with a struct vhost_iotlb_msg
as payload. For miss events, the iotlb payload has to be filled with the miss
message type (1), the I/O virtual address and the permissions flags. For access
failure events, the iotlb payload has to be filled with the access failure
message type (4), the I/O virtual address and the permissions flags.
For synchronization purposes, the slave may rely on the reply-ack feature,
so the master may send a reply when operation is completed if the reply-ack
feature is negotiated and the slave requests a reply. For miss events, completed
operation means either master sent an update message containing the IOTLB entry
containing requested address and permission, or master sent nothing if the IOTLB
miss message is invalid (invalid IOVA or permission).

The master isn't expected to take the initiative to send IOTLB update messages,
as the slave sends IOTLB miss messages for the guest virtual memory areas it
needs to access.

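To make the per-event field requirements concrete, here is a sketch that
fills an IOTLB payload for miss, update and invalidate events. It uses a
local struct that mirrors the fields listed in the message specification
rather than the kernel's struct vhost_iotlb_msg, and all names in it are
hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Local stand-in mirroring the IOTLB message fields described in the
 * message specification; not the kernel header definition. */
struct iotlb_msg {
    uint64_t iova;
    uint64_t size;
    uint64_t uaddr;
    uint8_t perm;
    uint8_t type;
};

enum { IOTLB_PERM_NONE = 0, IOTLB_PERM_RO = 1,
       IOTLB_PERM_WO = 2, IOTLB_PERM_RW = 3 };
enum { IOTLB_MISS = 1, IOTLB_UPDATE = 2,
       IOTLB_INVALIDATE = 3, IOTLB_ACCESS_FAIL = 4 };

/* Slave to master: miss needs type, IOVA and permissions. */
static inline struct iotlb_msg iotlb_make_miss(uint64_t iova, uint8_t perm)
{
    struct iotlb_msg m;

    memset(&m, 0, sizeof(m));
    m.type = IOTLB_MISS;
    m.iova = iova;
    m.perm = perm;
    return m;
}

/* Master to slave: update needs type, IOVA, size, user address and
 * permissions. */
static inline struct iotlb_msg iotlb_make_update(uint64_t iova, uint64_t size,
                                                 uint64_t uaddr, uint8_t perm)
{
    struct iotlb_msg m;

    memset(&m, 0, sizeof(m));
    m.type = IOTLB_UPDATE;
    m.iova = iova;
    m.size = size;
    m.uaddr = uaddr;
    m.perm = perm;
    return m;
}

/* Master to slave: invalidate needs type, IOVA and size. */
static inline struct iotlb_msg iotlb_make_invalidate(uint64_t iova,
                                                     uint64_t size)
{
    struct iotlb_msg m;

    memset(&m, 0, sizeof(m));
    m.type = IOTLB_INVALIDATE;
    m.iova = iova;
    m.size = size;
    return m;
}
```

Fields that an event does not require are left zeroed here; that choice is a
convention of this sketch, not a requirement stated above.
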
Slave communication
-------------------

An optional communication channel is provided if the slave declares
VHOST_USER_PROTOCOL_F_SLAVE_REQ protocol feature, to allow the slave to make
requests to the master.

The fd is provided via VHOST_USER_SET_SLAVE_REQ_FD ancillary data.

A slave may then send VHOST_USER_SLAVE_* messages to the master
using this fd communication channel.

If VHOST_USER_PROTOCOL_F_SLAVE_SEND_FD protocol feature is negotiated,
slave can send file descriptors (at most 8 descriptors in each message)
to master via ancillary data using this fd communication channel.

Protocol features
-----------------

#define VHOST_USER_PROTOCOL_F_MQ             0
#define VHOST_USER_PROTOCOL_F_LOG_SHMFD      1
#define VHOST_USER_PROTOCOL_F_RARP           2
#define VHOST_USER_PROTOCOL_F_REPLY_ACK      3
#define VHOST_USER_PROTOCOL_F_MTU            4
#define VHOST_USER_PROTOCOL_F_SLAVE_REQ      5
#define VHOST_USER_PROTOCOL_F_CROSS_ENDIAN   6
#define VHOST_USER_PROTOCOL_F_CRYPTO_SESSION 7
#define VHOST_USER_PROTOCOL_F_PAGEFAULT      8
#define VHOST_USER_PROTOCOL_F_CONFIG         9
#define VHOST_USER_PROTOCOL_F_SLAVE_SEND_FD  10
#define VHOST_USER_PROTOCOL_F_HOST_NOTIFIER  11

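The values above are bit positions in a 64-bit mask, not mask values. A
minimal sketch of testing and negotiating them follows; the helper names are
hypothetical, and "negotiated" is taken here to mean the intersection of what
the slave offers and what the master supports.

```c
#include <stdbool.h>
#include <stdint.h>

#define VHOST_USER_F_PROTOCOL_FEATURES       30
#define VHOST_USER_PROTOCOL_F_MQ             0
#define VHOST_USER_PROTOCOL_F_REPLY_ACK      3
#define VHOST_USER_PROTOCOL_F_HOST_NOTIFIER  11

/* True if bit number bit is set in mask. */
static inline bool feature_has(uint64_t mask, unsigned int bit)
{
    return ((mask >> bit) & 1u) != 0;
}

/* Protocol features may only be queried if the slave reported
 * VHOST_USER_F_PROTOCOL_FEATURES in its VHOST_USER_GET_FEATURES reply. */
static inline bool can_query_protocol_features(uint64_t features)
{
    return feature_has(features, VHOST_USER_F_PROTOCOL_FEATURES);
}

/* Assumption of this sketch: the negotiated set is the intersection of
 * the slave's offer and the master's supported set. */
static inline uint64_t negotiate_protocol_features(uint64_t offered,
                                                   uint64_t supported)
{
    return offered & supported;
}
```

A master would send the result with VHOST_USER_SET_PROTOCOL_FEATURES and
afterwards gate each extension on feature_has() of the negotiated mask.
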
Master message types
--------------------

 * VHOST_USER_GET_FEATURES

      Id: 1
      Equivalent ioctl: VHOST_GET_FEATURES
      Master payload: N/A
      Slave payload: u64

      Get from the underlying vhost implementation the features bitmask.
      Feature bit VHOST_USER_F_PROTOCOL_FEATURES signals slave support for
      VHOST_USER_GET_PROTOCOL_FEATURES and VHOST_USER_SET_PROTOCOL_FEATURES.

 * VHOST_USER_SET_FEATURES

      Id: 2
      Equivalent ioctl: VHOST_SET_FEATURES
      Master payload: u64

      Enable features in the underlying vhost implementation using a bitmask.
      Feature bit VHOST_USER_F_PROTOCOL_FEATURES signals slave support for
      VHOST_USER_GET_PROTOCOL_FEATURES and VHOST_USER_SET_PROTOCOL_FEATURES.

 * VHOST_USER_GET_PROTOCOL_FEATURES

      Id: 15
      Equivalent ioctl: VHOST_GET_FEATURES
      Master payload: N/A
      Slave payload: u64

      Get the protocol feature bitmask from the underlying vhost implementation.
      Only legal if feature bit VHOST_USER_F_PROTOCOL_FEATURES is present in
      VHOST_USER_GET_FEATURES.
      Note: slave that reported VHOST_USER_F_PROTOCOL_FEATURES must support
      this message even before VHOST_USER_SET_FEATURES was called.

 * VHOST_USER_SET_PROTOCOL_FEATURES

      Id: 16
      Equivalent ioctl: VHOST_SET_FEATURES
      Master payload: u64

      Enable protocol features in the underlying vhost implementation.
      Only legal if feature bit VHOST_USER_F_PROTOCOL_FEATURES is present in
      VHOST_USER_GET_FEATURES.
      Note: slave that reported VHOST_USER_F_PROTOCOL_FEATURES must support
      this message even before VHOST_USER_SET_FEATURES was called.

 * VHOST_USER_SET_OWNER

      Id: 3
      Equivalent ioctl: VHOST_SET_OWNER
      Master payload: N/A

      Issued when a new connection is established. It sets the current Master
      as an owner of the session. This can be used on the Slave as a
      "session start" flag.

 * VHOST_USER_RESET_OWNER

      Id: 4
      Master payload: N/A

      This is no longer used. Used to be sent to request disabling
      all rings, but some clients interpreted it to also discard
      connection state (this interpretation would lead to bugs).
      It is recommended that clients either ignore this message,
      or use it to disable all rings.

 * VHOST_USER_SET_MEM_TABLE

      Id: 5
      Equivalent ioctl: VHOST_SET_MEM_TABLE
      Master payload: memory regions description
      Slave payload: (postcopy only) memory regions description

      Sets the memory map regions on the slave so it can translate the vring
      addresses. In the ancillary data there is an array of file descriptors
      for each memory mapped region. The size and ordering of the fds matches
      the number and ordering of memory regions.

      When VHOST_USER_POSTCOPY_LISTEN has been received, SET_MEM_TABLE replies
      with the bases of the memory mapped regions to the master.  The slave
      must have mmap'd the regions but not yet accessed them and should not yet
      generate a userfault event. Note NEED_REPLY_MASK is not set in this case.
      QEMU will then reply back to the list of mappings with an empty
      VHOST_USER_SET_MEM_TABLE as an acknowledgment; only upon reception of this
      message may the guest start accessing the memory and generating faults.

 * VHOST_USER_SET_LOG_BASE

      Id: 6
      Equivalent ioctl: VHOST_SET_LOG_BASE
      Master payload: u64
      Slave payload: N/A

      Sets logging shared memory space.
      When slave has VHOST_USER_PROTOCOL_F_LOG_SHMFD protocol
      feature, the log memory fd is provided in the ancillary data of
      VHOST_USER_SET_LOG_BASE message, the size and offset of shared
      memory area provided in the message.

 * VHOST_USER_SET_LOG_FD

      Id: 7
      Equivalent ioctl: VHOST_SET_LOG_FD
      Master payload: N/A

      Sets the logging file descriptor, which is passed as ancillary data.

 * VHOST_USER_SET_VRING_NUM

      Id: 8
      Equivalent ioctl: VHOST_SET_VRING_NUM
      Master payload: vring state description

      Set the size of the queue.

 * VHOST_USER_SET_VRING_ADDR

      Id: 9
      Equivalent ioctl: VHOST_SET_VRING_ADDR
      Master payload: vring address description
      Slave payload: N/A

      Sets the addresses of the different aspects of the vring.

 * VHOST_USER_SET_VRING_BASE

      Id: 10
      Equivalent ioctl: VHOST_SET_VRING_BASE
      Master payload: vring state description

      Sets the base offset in the available vring.

 * VHOST_USER_GET_VRING_BASE

      Id: 11
      Equivalent ioctl: VHOST_GET_VRING_BASE
      Master payload: vring state description
      Slave payload: vring state description

      Get the available vring base offset.

 * VHOST_USER_SET_VRING_KICK

      Id: 12
      Equivalent ioctl: VHOST_SET_VRING_KICK
      Master payload: u64

      Set the event file descriptor for adding buffers to the vring. It
      is passed in the ancillary data.
      Bits (0-7) of the payload contain the vring index. Bit 8 is the
      invalid FD flag. This flag is set when there is no file descriptor
      in the ancillary data. This signals that polling should be used
      instead of waiting for a kick.

 * VHOST_USER_SET_VRING_CALL

      Id: 13
      Equivalent ioctl: VHOST_SET_VRING_CALL
      Master payload: u64

      Set the event file descriptor to signal when buffers are used. It
      is passed in the ancillary data.
      Bits (0-7) of the payload contain the vring index. Bit 8 is the
      invalid FD flag. This flag is set when there is no file descriptor
      in the ancillary data. This signals that polling will be used
      instead of waiting for the call.

 * VHOST_USER_SET_VRING_ERR

      Id: 14
      Equivalent ioctl: VHOST_SET_VRING_ERR
      Master payload: u64

      Set the event file descriptor to signal when error occurs. It
      is passed in the ancillary data.
      Bits (0-7) of the payload contain the vring index. Bit 8 is the
      invalid FD flag. This flag is set when there is no file descriptor
      in the ancillary data.

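VHOST_USER_SET_VRING_KICK, VHOST_USER_SET_VRING_CALL and
VHOST_USER_SET_VRING_ERR share the same u64 payload layout. The following
sketch packs and unpacks it; the macro and function names are illustrative
assumptions, and only the bit positions come from the descriptions above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative names for the payload layout. */
#define VRING_PAYLOAD_IDX_MASK   0xffULL        /* bits 0-7: vring index */
#define VRING_PAYLOAD_NOFD_MASK  (1ULL << 8)    /* bit 8: invalid FD flag */

/* Build the u64 payload for a given vring index. */
static inline uint64_t vring_payload_make(unsigned int index, bool has_fd)
{
    uint64_t u64 = (uint64_t)index & VRING_PAYLOAD_IDX_MASK;

    if (!has_fd) {
        u64 |= VRING_PAYLOAD_NOFD_MASK;
    }
    return u64;
}

/* Extract the vring index from a received payload. */
static inline unsigned int vring_payload_index(uint64_t u64)
{
    return (unsigned int)(u64 & VRING_PAYLOAD_IDX_MASK);
}

/* True if the message is expected to carry a file descriptor. */
static inline bool vring_payload_has_fd(uint64_t u64)
{
    return (u64 & VRING_PAYLOAD_NOFD_MASK) == 0;
}
```

A receiver that sees the invalid FD flag on a kick message would fall back
to polling the ring, as described for VHOST_USER_SET_VRING_KICK.
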
 * VHOST_USER_GET_QUEUE_NUM

      Id: 17
      Equivalent ioctl: N/A
      Master payload: N/A
      Slave payload: u64

      Query how many queues the backend supports. This request should be
      sent only when VHOST_USER_PROTOCOL_F_MQ is set in queried protocol
      features by VHOST_USER_GET_PROTOCOL_FEATURES.

 * VHOST_USER_SET_VRING_ENABLE

      Id: 18
      Equivalent ioctl: N/A
      Master payload: vring state description

      Signal slave to enable or disable corresponding vring.
      This request should be sent only when VHOST_USER_F_PROTOCOL_FEATURES
      has been negotiated.

 * VHOST_USER_SEND_RARP

      Id: 19
      Equivalent ioctl: N/A
      Master payload: u64

      Ask vhost user backend to broadcast a fake RARP to notify that the
      migration has terminated, for guests that do not support GUEST_ANNOUNCE.
      Only legal if feature bit VHOST_USER_F_PROTOCOL_FEATURES is present in
      VHOST_USER_GET_FEATURES and protocol feature bit VHOST_USER_PROTOCOL_F_RARP
      is present in VHOST_USER_GET_PROTOCOL_FEATURES.
      The first 6 bytes of the payload contain the mac address of the guest to
      allow the vhost user backend to construct and broadcast the fake RARP.

 * VHOST_USER_NET_SET_MTU

      Id: 20
      Equivalent ioctl: N/A
      Master payload: u64

      Set host MTU value exposed to the guest.
      This request should be sent only when VIRTIO_NET_F_MTU feature has been
      successfully negotiated, VHOST_USER_F_PROTOCOL_FEATURES is present in
      VHOST_USER_GET_FEATURES and protocol feature bit
      VHOST_USER_PROTOCOL_F_NET_MTU is present in
      VHOST_USER_GET_PROTOCOL_FEATURES.
      If VHOST_USER_PROTOCOL_F_REPLY_ACK is negotiated, slave must respond
      with zero in case the specified MTU is valid, or non-zero otherwise.

 * VHOST_USER_SET_SLAVE_REQ_FD
 | 
						|
 | 
						|
      Id: 21
 | 
						|
      Equivalent ioctl: N/A
 | 
						|
      Master payload: N/A
 | 
						|
 | 
						|
      Set the socket file descriptor for slave initiated requests. It is passed
 | 
						|
      in the ancillary data.
 | 
						|
      This request should be sent only when VHOST_USER_F_PROTOCOL_FEATURES
 | 
						|
      has been negotiated, and protocol feature bit VHOST_USER_PROTOCOL_F_SLAVE_REQ
 | 
						|
      bit is present in VHOST_USER_GET_PROTOCOL_FEATURES.
 | 
						|
      If VHOST_USER_PROTOCOL_F_REPLY_ACK is negotiated, slave must respond
 | 
						|
      with zero for success, non-zero otherwise.

 * VHOST_USER_IOTLB_MSG

      Id: 22
      Equivalent ioctl: N/A (equivalent to VHOST_IOTLB_MSG message type)
      Master payload: struct vhost_iotlb_msg
      Slave payload: u64

      Send IOTLB messages with struct vhost_iotlb_msg as payload.
      Master sends such requests to update and invalidate entries in the device
      IOTLB. The slave has to acknowledge the request by sending zero as u64
      payload for success, non-zero otherwise.
      This request should be sent only when VIRTIO_F_IOMMU_PLATFORM feature
      has been successfully negotiated.
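
      The following sketch packs an IOTLB update and checks the u64
      acknowledgement. The field order (iova, size, user address, permission,
      type) and the numeric values follow the IOTLB message description in
      this document. The 6 trailing pad bytes imitate the size of the C
      struct on common 64-bit ABIs; that padding and the helper names are
      assumptions of the example, not something this text specifies.

```python
import struct

VHOST_USER_IOTLB_MSG = 22
VHOST_USER_VERSION = 0x1

VHOST_ACCESS_RW = 3          # read/write permission
VHOST_IOTLB_UPDATE = 2       # message type: IOTLB update
VHOST_IOTLB_INVALIDATE = 3   # message type: IOTLB invalidate

# iova, size, uaddr (u64 each), perm, type (u8 each), then assumed padding.
IOTLB_FMT = "=QQQBB6x"


def build_iotlb_msg(iova, size, uaddr, perm, msg_type) -> bytes:
    """Return a complete IOTLB request: header followed by the payload."""
    payload = struct.pack(IOTLB_FMT, iova, size, uaddr, perm, msg_type)
    header = struct.pack("=III", VHOST_USER_IOTLB_MSG,
                         VHOST_USER_VERSION, len(payload))
    return header + payload


def iotlb_ack_ok(reply_payload: bytes) -> bool:
    """The slave acknowledges with a u64 payload; zero means success."""
    (status,) = struct.unpack("=Q", reply_payload)
    return status == 0


update = build_iotlb_msg(0x1000, 0x2000, 0x7F0000000000,
                         VHOST_ACCESS_RW, VHOST_IOTLB_UPDATE)
```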

 * VHOST_USER_SET_VRING_ENDIAN

      Id: 23
      Equivalent ioctl: VHOST_SET_VRING_ENDIAN
      Master payload: vring state description

      Set the endianness of a VQ for legacy devices. Little-endian is indicated
      with state.num set to 0 and big-endian is indicated with state.num set
      to 1. Other values are invalid.
      This request should be sent only when VHOST_USER_PROTOCOL_F_CROSS_ENDIAN
      has been negotiated.
      Backends that negotiated this feature should handle both endiannesses
      and expect this message once (per VQ) during device configuration
      (i.e. before the master starts the VQ).
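
      A minimal sketch of the message, assuming the vring state description
      is a 32-bit index followed by a 32-bit num as described earlier in this
      document. The constant names for the two endianness values and the
      helper name are hypothetical.

```python
import struct

VHOST_USER_SET_VRING_ENDIAN = 23
VHOST_USER_VERSION = 0x1
VRING_LITTLE_ENDIAN = 0   # state.num value for little-endian
VRING_BIG_ENDIAN = 1      # state.num value for big-endian


def build_set_vring_endian(index: int, num: int) -> bytes:
    """Pack header + vring state; reject any num other than 0 or 1."""
    if num not in (VRING_LITTLE_ENDIAN, VRING_BIG_ENDIAN):
        raise ValueError("state.num must be 0 (little) or 1 (big)")
    payload = struct.pack("=II", index, num)
    header = struct.pack("=III", VHOST_USER_SET_VRING_ENDIAN,
                         VHOST_USER_VERSION, len(payload))
    return header + payload


endian_msg = build_set_vring_endian(0, VRING_BIG_ENDIAN)
```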

 * VHOST_USER_GET_CONFIG

      Id: 24
      Equivalent ioctl: N/A
      Master payload: virtio device config space
      Slave payload: virtio device config space

      When VHOST_USER_PROTOCOL_F_CONFIG is negotiated, this message is
      submitted by the vhost-user master to fetch the contents of the virtio
      device configuration space. The vhost-user slave's payload size MUST
      match the master's request; the slave uses a zero-length payload to
      indicate an error to the vhost-user master. The vhost-user master may
      cache the contents to avoid repeated VHOST_USER_GET_CONFIG calls.
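
      The size rule above can be checked on the master side roughly as
      follows. The sketch assumes the config space payload is three 32-bit
      fields (offset, size, flags) followed by the data, as described earlier
      in this document; the helper names are hypothetical.

```python
import struct

CONFIG_HDR = struct.Struct("=III")   # offset, size, flags


def build_get_config_payload(offset: int, size: int) -> bytes:
    """Request payload: config header plus room for `size` bytes of data."""
    return CONFIG_HDR.pack(offset, size, 0) + bytes(size)


def parse_get_config_reply(request_payload: bytes,
                           reply_payload: bytes) -> bytes:
    """Return the config bytes, or raise if the slave signalled an error."""
    if len(reply_payload) == 0:
        raise RuntimeError("slave reported an error (zero-length payload)")
    if len(reply_payload) != len(request_payload):
        raise RuntimeError("reply size does not match the request")
    _offset, size, _flags = CONFIG_HDR.unpack_from(reply_payload)
    return reply_payload[CONFIG_HDR.size:CONFIG_HDR.size + size]


request = build_get_config_payload(0, 4)
reply = CONFIG_HDR.pack(0, 4, 0) + b"\x01\x02\x03\x04"
config = parse_get_config_reply(request, reply)
```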

 * VHOST_USER_SET_CONFIG

      Id: 25
      Equivalent ioctl: N/A
      Master payload: virtio device config space
      Slave payload: N/A

      When VHOST_USER_PROTOCOL_F_CONFIG is negotiated, this message is
      submitted by the vhost-user master when the guest changes the virtio
      device configuration space. It can also be used for live migration
      on the destination host. The vhost-user slave must check the flags
      field, and slaves MUST NOT accept SET_CONFIG for read-only
      configuration space fields unless the live migration bit is set.
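
      On the slave side, the flags check might look like the sketch below.
      The flag values 0 (writable fields) and 1 (live migration) follow the
      config space description earlier in this document. The constant names,
      the helper name and the writable range of the imaginary device are
      assumptions made for illustration.

```python
CONFIG_FLAG_WRITABLE = 0         # hypothetical name: normal guest write
CONFIG_FLAG_LIVE_MIGRATION = 1   # hypothetical name: migration restore

# Hypothetical device: only bytes 0..5 of its config space are writable.
WRITABLE_RANGES = [(0, 6)]


def may_apply_set_config(offset: int, size: int, flags: int) -> bool:
    """Accept writes to read-only fields only during live migration."""
    if flags & CONFIG_FLAG_LIVE_MIGRATION:
        return True
    end = offset + size
    return any(start <= offset and end <= stop
               for start, stop in WRITABLE_RANGES)
```

      With these assumptions, may_apply_set_config(4, 4, 0) is rejected
      because the write runs past the writable range, while the same write
      with the live migration flag set is accepted.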

 * VHOST_USER_CREATE_CRYPTO_SESSION

      Id: 26
      Equivalent ioctl: N/A
      Master payload: crypto session description
      Slave payload: crypto session description

      Create a session for crypto operation. The server side must return the
      session id, 0 or positive for success, negative for failure.
      This request should be sent only when VHOST_USER_PROTOCOL_F_CRYPTO_SESSION
      feature has been successfully negotiated.
      It's a required feature for crypto devices.

 * VHOST_USER_CLOSE_CRYPTO_SESSION

      Id: 27
      Equivalent ioctl: N/A
      Master payload: u64

      Close a session for crypto operation which was previously
      created by VHOST_USER_CREATE_CRYPTO_SESSION.
      This request should be sent only when VHOST_USER_PROTOCOL_F_CRYPTO_SESSION
      feature has been successfully negotiated.
      It's a required feature for crypto devices.

 * VHOST_USER_POSTCOPY_ADVISE
      Id: 28
      Master payload: N/A
      Slave payload: userfault fd

      When VHOST_USER_PROTOCOL_F_PAGEFAULT is supported, the
      master advises the slave that a migration with postcopy enabled is
      underway; the slave must open a userfaultfd for later use.
      Note that at this stage the migration is still in precopy mode.

 * VHOST_USER_POSTCOPY_LISTEN
      Id: 29
      Master payload: N/A

      Master advises slave that a transition to postcopy mode has happened.
      The slave must ensure that shared memory is registered with userfaultfd
      to cause faulting of non-present pages.

      This is always sent sometime after a VHOST_USER_POSTCOPY_ADVISE, and
      thus only when VHOST_USER_PROTOCOL_F_PAGEFAULT is supported.

 * VHOST_USER_POSTCOPY_END
      Id: 30
      Slave payload: u64

      Master advises that postcopy migration has now completed.  The
      slave must disable the userfaultfd. The response is an acknowledgement
      only.
      When VHOST_USER_PROTOCOL_F_PAGEFAULT is supported, this message
      is sent at the end of the migration, after VHOST_USER_POSTCOPY_LISTEN
      was previously sent.
      The value returned is an error indication; 0 is success.

Slave message types
-------------------

 * VHOST_USER_SLAVE_IOTLB_MSG

      Id: 1
      Equivalent ioctl: N/A (equivalent to VHOST_IOTLB_MSG message type)
      Slave payload: struct vhost_iotlb_msg
      Master payload: N/A

      Send IOTLB messages with struct vhost_iotlb_msg as payload.
      Slave sends such requests to notify of an IOTLB miss, or an IOTLB
      access failure. If VHOST_USER_PROTOCOL_F_REPLY_ACK is negotiated,
      and the slave sets the VHOST_USER_NEED_REPLY flag, master must respond
      with zero when operation is successfully completed, or non-zero otherwise.
      This request should be sent only when VIRTIO_F_IOMMU_PLATFORM feature
      has been successfully negotiated.

 * VHOST_USER_SLAVE_CONFIG_CHANGE_MSG

      Id: 2
      Equivalent ioctl: N/A
      Slave payload: N/A
      Master payload: N/A

      When VHOST_USER_PROTOCOL_F_CONFIG is negotiated, the vhost-user slave
      sends such messages to notify that the virtio device's configuration
      space has changed. For host devices that support this feature, the host
      driver can send a VHOST_USER_GET_CONFIG message to the slave to get the
      latest content. If VHOST_USER_PROTOCOL_F_REPLY_ACK is negotiated, and
      the slave sets the VHOST_USER_NEED_REPLY flag, master must respond with
      zero when operation is successfully completed, or non-zero otherwise.

 * VHOST_USER_SLAVE_VRING_HOST_NOTIFIER_MSG

      Id: 3
      Equivalent ioctl: N/A
      Slave payload: vring area description
      Master payload: N/A

      Sets host notifier for a specified queue. The queue index is contained
      in the u64 field of the vring area description. The host notifier is
      described by the file descriptor (typically a VFIO device fd), which is
      passed as ancillary data, and by the size (the mmap size, which should
      be the same as the host page size) and offset (the mmap offset) carried
      in the vring area description. QEMU can mmap the file descriptor based
      on the size and offset to get a memory range. Registering a host notifier
      means mapping this memory range to the VM as the specified queue's notify
      MMIO region. Slave sends this request to tell QEMU to de-register the
      existing notifier, if any, and to register the new notifier if the
      request is sent with a file descriptor.
      This request should be sent only when VHOST_USER_PROTOCOL_F_HOST_NOTIFIER
      protocol feature has been successfully negotiated.
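
      The mapping step on the master side can be pictured with the sketch
      below. It is not QEMU's implementation. It assumes the vring area
      description is three 64-bit fields (u64, size, offset) and that the
      queue index sits in bits 0-7 of the u64, as for the other vring
      messages. A temporary file stands in for the VFIO device fd, and the
      helper name is hypothetical.

```python
import mmap
import struct
import tempfile

VRING_AREA = struct.Struct("=QQQ")   # u64 (index and flags), size, offset
VRING_IDX_MASK = 0xFF                # assumed: bits 0-7 hold the queue index


def map_host_notifier(payload: bytes, fd: int):
    """Map the notifier area described by payload; return (index, region)."""
    u64, size, offset = VRING_AREA.unpack(payload)
    if size != mmap.PAGESIZE:
        raise ValueError("notifier area must be exactly one host page")
    region = mmap.mmap(fd, size, flags=mmap.MAP_SHARED,
                       prot=mmap.PROT_READ | mmap.PROT_WRITE, offset=offset)
    return u64 & VRING_IDX_MASK, region


# Demo: a two-page temporary file replaces the device fd; queue 1 uses the
# second page as its notify area.
with tempfile.TemporaryFile() as f:
    f.truncate(2 * mmap.PAGESIZE)
    area = VRING_AREA.pack(1, mmap.PAGESIZE, mmap.PAGESIZE)
    queue_index, region = map_host_notifier(area, f.fileno())
    region[0:2] = struct.pack("=H", 1)   # a guest notification would land here
    region.flush()
    region_len = len(region)
    region.close()
    f.seek(mmap.PAGESIZE)
    written = f.read(2)
```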

VHOST_USER_PROTOCOL_F_REPLY_ACK:
--------------------------------
The original vhost-user specification only demands replies for certain
commands. This differs from the vhost protocol implementation where commands
are sent over an ioctl() call and block until the client has completed.

With this protocol extension negotiated, the sender (QEMU) can set the
"need_reply" [Bit 3] flag on any command. This indicates that
the client MUST respond with a Payload VhostUserMsg indicating success or
failure. The payload should be set to zero on success or non-zero on failure,
unless the message already has an explicit reply body.

The response payload gives QEMU a deterministic indication of the result
of the command. Today, QEMU is expected to terminate the main vhost-user
loop upon receiving such errors. In the future, QEMU could be taught to be
more resilient for selected requests.

For the message types that already solicit a reply from the client, the
presence of VHOST_USER_PROTOCOL_F_REPLY_ACK or need_reply bit being set brings
no behavioural change. (See the 'Communication' section for details.)
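
To make the flag handling concrete, here is a small sketch of a sender that
sets need_reply on a u64 request and interprets the acknowledgement. The flag
bit positions (version in the lower 2 bits, reply in bit 2, need_reply in
bit 3) follow the message specification earlier in this document. The mask and
helper names are chosen for the example, and VHOST_USER_NET_SET_MTU merely
serves as a sample request.

```python
import struct

VHOST_USER_VERSION = 0x1
REPLY_MASK = 1 << 2        # set by the peer on every reply
NEED_REPLY_MASK = 1 << 3   # "need_reply", Bit 3
VHOST_USER_NET_SET_MTU = 20

U64_MSG = struct.Struct("=IIIQ")   # request, flags, size, u64 payload


def build_u64_request(request: int, value: int, need_reply: bool) -> bytes:
    """Pack a request with a u64 payload, optionally asking for an ack."""
    flags = VHOST_USER_VERSION
    if need_reply:
        flags |= NEED_REPLY_MASK
    return U64_MSG.pack(request, flags, 8, value)


def ack_succeeded(reply: bytes, request: int) -> bool:
    """Return True if the peer acknowledged `request` with a zero payload."""
    req, flags, size, status = U64_MSG.unpack(reply)
    if req != request or not flags & REPLY_MASK or size != 8:
        raise ValueError("unexpected reply")
    return status == 0


mtu_msg = build_u64_request(VHOST_USER_NET_SET_MTU, 1500, need_reply=True)
```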