FRET-qemu/docs/colo-proxy.txt
Romain Malmain 7c3c7877d8 Update to QEMU 9.0.0 (#67)
* Update to QEMU v9.0.0

---------

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Zheyu Ma <zheyuma97@gmail.com>
Signed-off-by: Ido Plat <ido.plat@ibm.com>
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Signed-off-by: Gregory Price <gregory.price@memverge.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Lorenz Brun <lorenz@brun.one>
Signed-off-by: Yao Xingtao <yaoxt.fnst@fujitsu.com>
Signed-off-by: Arnaud Minier <arnaud.minier@telecom-paris.fr>
Signed-off-by: Inès Varhol <ines.varhol@telecom-paris.fr>
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Sven Schnelle <svens@stackframe.org>
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Benjamin Gray <bgray@linux.ibm.com>
Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Joonas Kankaala <joonas.a.kankaala@gmail.com>
Signed-off-by: Marcin Juszkiewicz <marcin.juszkiewicz@linaro.org>
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Glenn Miles <milesg@linux.ibm.com>
Signed-off-by: Oleg Sviridov <oleg.sviridov@red-soft.ru>
Signed-off-by: Artem Chernyshev <artem.chernyshev@red-soft.ru>
Signed-off-by: Yajun Wu <yajunw@nvidia.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: Pierre-Clément Tosi <ptosi@google.com>
Signed-off-by: Lei Wang <lei4.wang@intel.com>
Signed-off-by: Wei Wang <wei.w.wang@intel.com>
Signed-off-by: Martin Hundebøll <martin@geanix.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Manos Pitsidianakis <manos.pitsidianakis@linaro.org>
Signed-off-by: Wafer <wafer@jaguarmicro.com>
Signed-off-by: Yuxue Liu <yuxue.liu@jaguarmicro.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Nguyen Dinh Phi <phind.uet@gmail.com>
Signed-off-by: Zack Buhman <zack@buhman.org>
Signed-off-by: Keith Packard <keithp@keithp.com>
Signed-off-by: Yuquan Wang wangyuquan1236@phytium.com.cn
Signed-off-by: Matheus Tavares Bernardino <quic_mathbern@quicinc.com>
Signed-off-by: Cindy Lu <lulu@redhat.com>
Co-authored-by: Peter Maydell <peter.maydell@linaro.org>
Co-authored-by: Fabiano Rosas <farosas@suse.de>
Co-authored-by: Peter Xu <peterx@redhat.com>
Co-authored-by: Thomas Huth <thuth@redhat.com>
Co-authored-by: Cédric Le Goater <clg@redhat.com>
Co-authored-by: Zheyu Ma <zheyuma97@gmail.com>
Co-authored-by: Ido Plat <ido.plat@ibm.com>
Co-authored-by: Ilya Leoshkevich <iii@linux.ibm.com>
Co-authored-by: Markus Armbruster <armbru@redhat.com>
Co-authored-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Co-authored-by: Paolo Bonzini <pbonzini@redhat.com>
Co-authored-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Co-authored-by: David Hildenbrand <david@redhat.com>
Co-authored-by: Kevin Wolf <kwolf@redhat.com>
Co-authored-by: Stefan Reiter <s.reiter@proxmox.com>
Co-authored-by: Fiona Ebner <f.ebner@proxmox.com>
Co-authored-by: Gregory Price <gregory.price@memverge.com>
Co-authored-by: Lorenz Brun <lorenz@brun.one>
Co-authored-by: Yao Xingtao <yaoxt.fnst@fujitsu.com>
Co-authored-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Co-authored-by: Arnaud Minier <arnaud.minier@telecom-paris.fr>
Co-authored-by: BALATON Zoltan <balaton@eik.bme.hu>
Co-authored-by: Igor Mammedov <imammedo@redhat.com>
Co-authored-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Co-authored-by: Richard Henderson <richard.henderson@linaro.org>
Co-authored-by: Sven Schnelle <svens@stackframe.org>
Co-authored-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Co-authored-by: Helge Deller <deller@kernel.org>
Co-authored-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Co-authored-by: Benjamin Gray <bgray@linux.ibm.com>
Co-authored-by: Nicholas Piggin <npiggin@gmail.com>
Co-authored-by: Avihai Horon <avihaih@nvidia.com>
Co-authored-by: Michael Tokarev <mjt@tls.msk.ru>
Co-authored-by: Joonas Kankaala <joonas.a.kankaala@gmail.com>
Co-authored-by: Marcin Juszkiewicz <marcin.juszkiewicz@linaro.org>
Co-authored-by: Stefan Weil <sw@weilnetz.de>
Co-authored-by: Dayu Liu <liu.dayu@zte.com.cn>
Co-authored-by: Zhao Liu <zhao1.liu@intel.com>
Co-authored-by: Glenn Miles <milesg@linux.vnet.ibm.com>
Co-authored-by: Artem Chernyshev <artem.chernyshev@red-soft.ru>
Co-authored-by: Yajun Wu <yajunw@nvidia.com>
Co-authored-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Co-authored-by: Pierre-Clément Tosi <ptosi@google.com>
Co-authored-by: Wei Wang <wei.w.wang@intel.com>
Co-authored-by: Martin Hundebøll <martin@geanix.com>
Co-authored-by: Michael S. Tsirkin <mst@redhat.com>
Co-authored-by: Manos Pitsidianakis <manos.pitsidianakis@linaro.org>
Co-authored-by: Wafer <wafer@jaguarmicro.com>
Co-authored-by: lyx634449800 <yuxue.liu@jaguarmicro.com>
Co-authored-by: Gerd Hoffmann <kraxel@redhat.com>
Co-authored-by: Nguyen Dinh Phi <phind.uet@gmail.com>
Co-authored-by: Zack Buhman <zack@buhman.org>
Co-authored-by: Keith Packard <keithp@keithp.com>
Co-authored-by: Yuquan Wang <wangyuquan1236@phytium.com.cn>
Co-authored-by: Matheus Tavares Bernardino <quic_mathbern@quicinc.com>
Co-authored-by: Cindy Lu <lulu@redhat.com>
2024-05-01 16:10:20 +02:00

218 lines
11 KiB
Plaintext

COLO-proxy
----------
Copyright (c) 2016 Intel Corporation
Copyright (c) 2016 HUAWEI TECHNOLOGIES CO., LTD.
Copyright (c) 2016 Fujitsu, Corp.
This work is licensed under the terms of the GNU GPL, version 2 or later.
See the COPYING file in the top-level directory.
This document gives an overview of COLO proxy's design.
== Background ==
COLO-proxy is a part of COLO project. It is used
to compare the network package to help COLO decide
whether to do checkpoint. With COLO-proxy's help,
COLO greatly improves the performance.
The filter-redirector, filter-mirror, colo-compare
and filter-rewriter compose the COLO-proxy.
== Architecture ==
COLO-Proxy is based on qemu netfilter and it's a plugin for qemu netfilter
(except colo-compare). It keep Secondary VM connect normally to
client and compare packets sent by PVM with sent by SVM.
If the packet difference, notify COLO-frame to do checkpoint and send
all primary packet has queued. Otherwise just send the queued primary
packet and drop the queued secondary packet.
Below is a COLO proxy ascii figure:
Primary qemu Secondary qemu
+--------------------------------------------------------------+ +----------------------------------------------------------------+
| +----------------------------------------------------------+ | | +-----------------------------------------------------------+ |
| | | | | | | |
| | guest | | | | guest | |
| | | | | | | |
| +-------^--------------------------+-----------------------+ | | +---------------------+--------+----------------------------+ |
| | | | | ^ | |
| | | | | | | |
| | +------------------------------------------------------+ | | | |
|netfilter| | | | | | netfilter | | |
| +----------+ +----------------------------+ | | | +-----------------------------------------------------------+ |
| | | | | | out | | | | | | filter execute order | |
| | | | +-----------------------------+ | | | | | | +-------------------> | |
| | | | | | | | | | | | | | TCP | |
| | +-----+--+-+ +-----v----+ +-----v----+ |pri +----+----+sec| | | | +------------+ +---+----+---v+rewriter++ +------------+ | |
| | | | | | | | |in | |in | | | | | | | | | | | | |
| | | filter | | filter | | filter +------> colo <------+ +--------> filter +--> adjust | adjust +--> filter | | |
| | | mirror | |redirector| |redirector| | | compare | | | | | | redirector | | ack | seq | | redirector | | |
| | | | | | | | | | | | | | | | | | | | | | | |
| | +----^-----+ +----+-----+ +----------+ | +---------+ | | | | +------------+ +--------+--------------+ +---+--------+ | |
| | | tx | rx rx | | | | | tx all | rx | |
| | | | | | | | +-----------------------------------------------------------+ |
| | | +--------------+ | | | | | |
| | | filter execute order | | | | | | |
| | | +----------------> | | | +--------------------------------------------------------+ |
| +-----------------------------------------+ | | |
| | | | | |
+--------------------------------------------------------------+ +----------------------------------------------------------------+
|guest receive | guest send
| |
+--------+----------------------------v------------------------+
| | NOTE: filter direction is rx/tx/all
| tap | rx:receive packets sent to the netdev
| | tx:receive packets sent by the netdev
+--------------------------------------------------------------+
1.Guest receive packet route:
Primary:
Tap --> Mirror Client Filter
Mirror client will send packet to guest,at the
same time, copy and forward packet to secondary
mirror server.
Secondary:
Mirror Server Filter --> TCP Rewriter
If receive packet is TCP packet,we will adjust ack
and update TCP checksum, then send to secondary
guest. Otherwise directly send to guest.
2.Guest send packet route:
Primary:
Guest --> Redirect Server Filter
Redirect server filter receive primary guest packet
but do nothing, just pass to next filter.
Redirect Server Filter --> COLO-Compare
COLO-compare receive primary guest packet then
waiting secondary redirect packet to compare it.
If packet same,send queued primary packet and clear
queued secondary packet, Otherwise send primary packet
and do checkpoint.
COLO-Compare --> Another Redirector Filter
The redirector get packet from colo-compare by use
chardev socket.
Redirector Filter --> Tap
Send the packet.
Secondary:
Guest --> TCP Rewriter Filter
If the packet is TCP packet,we will adjust seq
and update TCP checksum. Then send it to
redirect client filter. Otherwise directly send to
redirect client filter.
Redirect Client Filter --> Redirect Server Filter
Forward packet to primary.
== Components introduction ==
Filter-mirror is a netfilter plugin.
It gives qemu the ability to mirror
packets to a chardev.
Filter-redirector is a netfilter plugin.
It gives qemu the ability to redirect net packet.
Redirector can redirect filter's net packet to outdev,
and redirect indev's packet to filter.
filter
+
redirector |
+--------------+
| | |
| | |
| | |
indev +---------+ +----------> outdev
| | |
| | |
| | |
+--------------+
|
v
filter
COLO-compare, we do packet comparing job.
Packets coming from the primary char indev will be sent to outdev.
Packets coming from the secondary char dev will be dropped after comparing.
COLO-compare needs two input chardevs and one output chardev:
primary_in=chardev1-id (source: primary send packet)
secondary_in=chardev2-id (source: secondary send packet)
outdev=chardev3-id
Filter-rewriter will rewrite some of secondary packet to make
secondary guest's tcp connection established successfully.
In this module we will rewrite tcp packet's ack to the secondary
from primary,and rewrite tcp packet's seq to the primary from
secondary.
== Usage ==
Here is an example using demonstration IP and port addresses to more
clearly describe the usage.
Primary(ip:3.3.3.3):
-netdev tap,id=hn0,vhost=off
-device e1000,id=e0,netdev=hn0,mac=52:a4:00:12:78:66
-chardev socket,id=mirror0,host=3.3.3.3,port=9003,server=on,wait=off
-chardev socket,id=compare1,host=3.3.3.3,port=9004,server=on,wait=off
-chardev socket,id=compare0,host=3.3.3.3,port=9001,server=on,wait=off
-chardev socket,id=compare0-0,host=3.3.3.3,port=9001
-chardev socket,id=compare_out,host=3.3.3.3,port=9005,server=on,wait=off
-chardev socket,id=compare_out0,host=3.3.3.3,port=9005
-object iothread,id=iothread1
-object filter-mirror,id=m0,netdev=hn0,queue=tx,outdev=mirror0
-object filter-redirector,netdev=hn0,id=redire0,queue=rx,indev=compare_out
-object filter-redirector,netdev=hn0,id=redire1,queue=rx,outdev=compare0
-object colo-compare,id=comp0,primary_in=compare0-0,secondary_in=compare1,outdev=compare_out0,iothread=iothread1
Secondary(ip:3.3.3.8):
-netdev tap,id=hn0,vhost=off
-device e1000,netdev=hn0,mac=52:a4:00:12:78:66
-chardev socket,id=red0,host=3.3.3.3,port=9003
-chardev socket,id=red1,host=3.3.3.3,port=9004
-object filter-redirector,id=f1,netdev=hn0,queue=tx,indev=red0
-object filter-redirector,id=f2,netdev=hn0,queue=rx,outdev=red1
-object filter-rewriter,id=f3,netdev=hn0,queue=all
If you want to use virtio-net-pci or other driver with vnet_header:
Primary(ip:3.3.3.3):
-netdev tap,id=hn0,vhost=off,script=/etc/qemu-ifup,downscript=/etc/qemu-ifdown
-device e1000,id=e0,netdev=hn0,mac=52:a4:00:12:78:66
-chardev socket,id=mirror0,host=3.3.3.3,port=9003,server=on,wait=off
-chardev socket,id=compare1,host=3.3.3.3,port=9004,server=on,wait=off
-chardev socket,id=compare0,host=3.3.3.3,port=9001,server=on,wait=off
-chardev socket,id=compare0-0,host=3.3.3.3,port=9001
-chardev socket,id=compare_out,host=3.3.3.3,port=9005,server=on,wait=off
-chardev socket,id=compare_out0,host=3.3.3.3,port=9005
-object filter-mirror,id=m0,netdev=hn0,queue=tx,outdev=mirror0,vnet_hdr_support
-object filter-redirector,netdev=hn0,id=redire0,queue=rx,indev=compare_out,vnet_hdr_support
-object filter-redirector,netdev=hn0,id=redire1,queue=rx,outdev=compare0,vnet_hdr_support
-object colo-compare,id=comp0,primary_in=compare0-0,secondary_in=compare1,outdev=compare_out0,vnet_hdr_support
Secondary(ip:3.3.3.8):
-netdev tap,id=hn0,vhost=off
-device e1000,netdev=hn0,mac=52:a4:00:12:78:66
-chardev socket,id=red0,host=3.3.3.3,port=9003
-chardev socket,id=red1,host=3.3.3.3,port=9004
-object filter-redirector,id=f1,netdev=hn0,queue=tx,indev=red0,vnet_hdr_support
-object filter-redirector,id=f2,netdev=hn0,queue=rx,outdev=red1,vnet_hdr_support
-object filter-rewriter,id=f3,netdev=hn0,queue=all,vnet_hdr_support
Note:
a.COLO-proxy must work with COLO-frame and Block-replication.
b.Primary COLO must be started firstly, because COLO-proxy needs
chardev socket server running before secondary started.
c.Filter-rewriter only rewrite tcp packet.