102 Commits

Author SHA1 Message Date
Romain Malmain
5682a6d841 v10.0.0 release
-----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCgAdFiEEhpWov9P5fNqsNXdanKSrs4Grc8gFAmgHmpAACgkQnKSrs4Gr
 c8h82wf/fVN/ZlYKLX7VJz0z+u3UB5MKuDUd+7LUwSGse9uIOH3K8PITkMyYgIti
 Sh8EKg9rhVzBEpiL9ZJfqCJjQTgJFk0O4xt3dPSGNsI2pZZcDwvQXFit7e/fafrY
 tUaTPdGuZ+i7s8Ooa+Z5tacI7n8KniQQkgf90oTnKhatmDmUbsVE0fma/2EmgqdI
 fO2mJKp5YiDsRf3vmuVKx/ltHYfL2tOvBOojeWBk9Zwr+czI2ku6Fy1Suu+tWeZ5
 setxSOCfY3G+qVsTm3n0d9OW/GPoQBsSVbSYua/74nQneNivTDAncndLFbFdj60g
 Q9n4t7tHN35Nh4XqkE0DhMGqPsQ3Og==
 =CFYe
 -----END PGP SIGNATURE-----
gpgsig -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQSq9xYmtep25y1RrMYC5KE/dBVGigUCaBCxXQAKCRAC5KE/dBVG
 imXmAP0WaWyc2kmipvGyhdGor7F4PlG9LRHL0jM4Om5SM4lkzAD/WnyFAXtErEwl
 eK0c2d980jdVHS5h9tVDK5TpzcPCRA0=
 =Zk18
 -----END PGP SIGNATURE-----

Merge tag 'v10.0.0' into update_qemu_v10_0_0

v10.0.0 release
2025-04-29 13:00:44 +02:00
Romain Malmain
2a676d9cd8 v9.2.2 release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEZKoqtTHVaQM2a/75gqpKJDselHgFAme8B8gACgkQgqpKJDse
 lHjzqxAAl9+xkHoXtgsnMhENO8dNznCPFh3AGKacxrahv1/XP/ghjPF8NNV0tGDK
 us73n0rNJG88dW2RIQVTjZJ5WYXaMwFBYrPBD2F0MROpiLmjXkHTr/fuH9Z7GkXI
 DOAfzf9Hf2BgKlolLAxvL55LckolAM7C87DNE0gtg/OT+d+XXfFcCpQf6wn+v+B7
 vAj5v7ir96rBffjjbRm2wItIsBDhzSxUxdaSnefC3CT8O2hbD6OcPa9o8WH2fLIR
 HHBLsW+2JTxv01iKRwPLfA00RIbxvC9QaaxTdkyBcnWIwbJy7LIWDvy37pnfHOHS
 XBp/AXEiQ7CXWat2451CAx2WPA/Vbcz4ekNSlBFk4tGNAZTJc9gL/doTXaAOl1SM
 8URJpe/gIUVENICkZe17UXG1L2zdMclAUCrFwgzPv6Ljth8ctFC8Gdk2xvYw5etY
 wQaILuXtzl0RgGVHrVLRL3q1w51YKv7aii6v+czHjwgDRDchc1h3m2+33UPERVZe
 ymSs1R5Vvmh8kE7v0coJDtR2BLRb4++AvBKiJ6ty6UqHA/F5JLCSE7dwwUuim9YY
 7E2jI2cNX+HO8yfwNoqZQ2cr2gAtMIm4hHE4hs0iqamfi/RGk8xw9HrRPlXorj9y
 +KWDYTqYAXOtd+qZyQtbppHKGOEAKXjg9qdYNy9N5KyAe5jrd/8=
 =06yL
 -----END PGP SIGNATURE-----
gpgsig -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQSq9xYmtep25y1RrMYC5KE/dBVGigUCZ9mEEAAKCRAC5KE/dBVG
 isziAP9tS6m4jKmDiYyLoYHT5tQ8+gI0R3kMl5U8VNGOx+/kfgD/X11dFM7VaVDo
 fecgc4U1dVPRguh5WO1cjEL3k8IDQAU=
 =RdqL
 -----END PGP SIGNATURE-----

Merge tag 'v9.2.2' into update_qemu_v9_2_2

v9.2.2 release
2025-03-18 15:32:47 +01:00
Stefan Hajnoczi
871af84dd5 * target/i386: optimize string instructions
* target/i386: new Sierra Forest and Clearwater Forest models
 * rust: type-safe vmstate implementation
 * rust: use interior mutability for PL011
 * rust: clean ups
 * memtxattrs: remove usage of bitfields from MEMTXATTRS_UNSPECIFIED
 * gitlab-ci: enable Rust backtraces
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmeZ6VYUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroMjbQgApuooMOp0z/8Ky4/ux8M8/vrlcNCH
 V1Pm6WzrjEzd9TIMLGr6npOyLOkWI31Aa4o/TuW09SeKE3dpCf/7LYA5VDEtkH79
 F57MgnSj56sMNgu+QZ/SiGvkKJXl+3091jIianrrI0dtX8hPonm6bt55woDvQt3z
 p94+4zzv5G0nc+ncITCDho8sn5itdZWVOjf9n6VCOumMjF4nRSoMkJKYIvjNht6n
 GtjMhYA70tzjkIi4bPyYkhFpMNlAqEDIp2TvPzp6klG5QoUErHIzdzoRTAtE4Dpb
 7240r6jarQX41TBXGOFq0NrxES1cm5zO/6159D24qZGHGm2hG4nDx+t2jw==
 =ZKFy
 -----END PGP SIGNATURE-----

Merge tag 'for-upstream' of https://gitlab.com/bonzini/qemu into staging

* target/i386: optimize string instructions
* target/i386: new Sierra Forest and Clearwater Forest models
* rust: type-safe vmstate implementation
* rust: use interior mutability for PL011
* rust: clean ups
* memtxattrs: remove usage of bitfields from MEMTXATTRS_UNSPECIFIED
* gitlab-ci: enable Rust backtraces

# -----BEGIN PGP SIGNATURE-----
#
# iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmeZ6VYUHHBib256aW5p
# QHJlZGhhdC5jb20ACgkQv/vSX3jHroMjbQgApuooMOp0z/8Ky4/ux8M8/vrlcNCH
# V1Pm6WzrjEzd9TIMLGr6npOyLOkWI31Aa4o/TuW09SeKE3dpCf/7LYA5VDEtkH79
# F57MgnSj56sMNgu+QZ/SiGvkKJXl+3091jIianrrI0dtX8hPonm6bt55woDvQt3z
# p94+4zzv5G0nc+ncITCDho8sn5itdZWVOjf9n6VCOumMjF4nRSoMkJKYIvjNht6n
# GtjMhYA70tzjkIi4bPyYkhFpMNlAqEDIp2TvPzp6klG5QoUErHIzdzoRTAtE4Dpb
# 7240r6jarQX41TBXGOFq0NrxES1cm5zO/6159D24qZGHGm2hG4nDx+t2jw==
# =ZKFy
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 29 Jan 2025 03:39:50 EST
# gpg:                using RSA key F13338574B662389866C7682BFFBD25F78C7AE83
# gpg:                issuer "pbonzini@redhat.com"
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [full]
# gpg:                 aka "Paolo Bonzini <pbonzini@redhat.com>" [full]
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4  E2F7 7E15 100C CD36 69B1
#      Subkey fingerprint: F133 3857 4B66 2389 866C  7682 BFFB D25F 78C7 AE83

* tag 'for-upstream' of https://gitlab.com/bonzini/qemu: (49 commits)
  gitlab-ci: include full Rust backtraces in test runs
  rust: qemu-api: add sub-subclass to the integration tests
  rust/zeroable: Implement Zeroable with const_zero macro
  rust: qdev: make reset take a shared reference
  rust: pl011: drop use of ControlFlow
  rust: pl011: pull device-specific code out of MemoryRegionOps callbacks
  rust: pl011: remove duplicate definitions
  rust: pl011: wrap registers with BqlRefCell
  rust: pl011: extract PL011Registers
  rust: pl011: pull interrupt updates out of read/write ops
  rust: pl011: extract CharBackend receive logic into a separate function
  rust: pl011: extract conversion to RegisterOffset
  rust: pl011: hide unnecessarily "pub" items from outside pl011::device
  rust: pl011: remove unnecessary "extern crate"
  rust: prefer NonNull::new to assertions
  rust: vmstate: make order of parameters consistent in vmstate_clock
  rust: vmstate: remove translation of C vmstate macros
  rust: pl011: switch vmstate to new-style macros
  rust: qemu_api: add vmstate_struct
  rust: vmstate: add public utility macros to implement VMState
  ...

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2025-01-29 09:51:03 -05:00
Paolo Bonzini
d8d552d459 target/i386: unify choice between single and repeated string instructions
The same "if" is present in all generator functions for string instructions.
Push it inside gen_repz() and gen_repz_nz() instead.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Link: https://lore.kernel.org/r/20241215090613.89588-5-pbonzini@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2025-01-23 11:35:33 +01:00
Paolo Bonzini
e604be4fb4 target/i386: remove trailing 1 from gen_{j, cmov, set}cc1
This is not needed anymore now that gen_jcc has been eliminated
(merged into the similarly-named gen_Jcc, where the uppercase letter
gives away that it is an emission function).

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Link: https://lore.kernel.org/r/20241215090613.89588-3-pbonzini@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2025-01-23 11:35:33 +01:00
Paolo Bonzini
6ace2d5163 target/i386: inline gen_jcc into sole caller
The code of gen_Jcc is very similar to gen_LOOP* and gen_JCXZ, but this
is hidden by gen_jcc.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Link: https://lore.kernel.org/r/20241215090613.89588-2-pbonzini@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2025-01-23 11:35:33 +01:00
Stefan Hajnoczi
32a97c5d05 tcg:
- Add TCGOP_TYPE, TCGOP_FLAGS.
   - Pass type and flags to tcg_op_supported, tcg_target_op_def.
   - Split out tcg-target-has.h and unexport from tcg.h.
   - Reorg constraint processing; constify TCGOpDef.
   - Make extract, sextract, deposit opcodes mandatory.
   - Merge ext{8,16,32}{s,u} opcodes into {s}extract.
 tcg/mips: Expand bswap unconditionally
 tcg/riscv: Use SRAIW, SRLIW for {s}extract_i64
 tcg/riscv: Use BEXTI for single-bit extractions
 tcg/sparc64: Use SRA, SRL for {s}extract_i64
 
 disas/riscv: Guard dec->cfg dereference for host disassemble
 util/cpuinfo-riscv: Detect Zbs
 accel/tcg: Call tcg_tb_insert() for one-insn TBs
 linux-user: Add missing /proc/cpuinfo fields for sparc
 -----BEGIN PGP SIGNATURE-----
 
 iQFRBAABCgA7FiEEekgeeIaLTbaoWgXAZN846K9+IV8FAmeKnzUdHHJpY2hhcmQu
 aGVuZGVyc29uQGxpbmFyby5vcmcACgkQZN846K9+IV+Kvgf+LG9UjXlWF9GK923E
 TllBL2rLf1OOdtTXWO15VcvGMoWDwB3tVBdhihdvXmnWju+WbfMk6mct5NhzsKn9
 LmuugMIZs+hMROj+bgMK8x47jRIh5N2rDYxcEgmyfIpYb2o9qvyqKecGVRlSJTCE
 bmt5UFbvPThBb8upoMfq3F6evuMx0szBP7wrOwSR/VGpmzIr20UTEWo6I1ALp4uj
 paFaysYol4em3dIhkiuV9cL7E0EIObaNa7l9RUci/BmTq+JaVxUnW1Y2i0PEwKwG
 FJSfYTJk3wBgAVxC2zC2g3ZM7uKuecSXMpiFopTiuyQLp7Q61i9kCNvEq0qY5tdb
 DaqR/g==
 =cv4O
 -----END PGP SIGNATURE-----

Merge tag 'pull-tcg-20250117' of https://gitlab.com/rth7680/qemu into staging

tcg:
  - Add TCGOP_TYPE, TCGOP_FLAGS.
  - Pass type and flags to tcg_op_supported, tcg_target_op_def.
  - Split out tcg-target-has.h and unexport from tcg.h.
  - Reorg constraint processing; constify TCGOpDef.
  - Make extract, sextract, deposit opcodes mandatory.
  - Merge ext{8,16,32}{s,u} opcodes into {s}extract.
tcg/mips: Expand bswap unconditionally
tcg/riscv: Use SRAIW, SRLIW for {s}extract_i64
tcg/riscv: Use BEXTI for single-bit extractions
tcg/sparc64: Use SRA, SRL for {s}extract_i64

disas/riscv: Guard dec->cfg dereference for host disassemble
util/cpuinfo-riscv: Detect Zbs
accel/tcg: Call tcg_tb_insert() for one-insn TBs
linux-user: Add missing /proc/cpuinfo fields for sparc

# -----BEGIN PGP SIGNATURE-----
#
# iQFRBAABCgA7FiEEekgeeIaLTbaoWgXAZN846K9+IV8FAmeKnzUdHHJpY2hhcmQu
# aGVuZGVyc29uQGxpbmFyby5vcmcACgkQZN846K9+IV+Kvgf+LG9UjXlWF9GK923E
# TllBL2rLf1OOdtTXWO15VcvGMoWDwB3tVBdhihdvXmnWju+WbfMk6mct5NhzsKn9
# LmuugMIZs+hMROj+bgMK8x47jRIh5N2rDYxcEgmyfIpYb2o9qvyqKecGVRlSJTCE
# bmt5UFbvPThBb8upoMfq3F6evuMx0szBP7wrOwSR/VGpmzIr20UTEWo6I1ALp4uj
# paFaysYol4em3dIhkiuV9cL7E0EIObaNa7l9RUci/BmTq+JaVxUnW1Y2i0PEwKwG
# FJSfYTJk3wBgAVxC2zC2g3ZM7uKuecSXMpiFopTiuyQLp7Q61i9kCNvEq0qY5tdb
# DaqR/g==
# =cv4O
# -----END PGP SIGNATURE-----
# gpg: Signature made Fri 17 Jan 2025 13:19:33 EST
# gpg:                using RSA key 7A481E78868B4DB6A85A05C064DF38E8AF7E215F
# gpg:                issuer "richard.henderson@linaro.org"
# gpg: Good signature from "Richard Henderson <richard.henderson@linaro.org>" [full]
# Primary key fingerprint: 7A48 1E78 868B 4DB6 A85A  05C0 64DF 38E8 AF7E 215F

* tag 'pull-tcg-20250117' of https://gitlab.com/rth7680/qemu: (68 commits)
  softfloat: Constify helpers returning float_status field
  accel/tcg: Call tcg_tb_insert() for one-insn TBs
  tcg: Document tb_lookup() and tcg_tb_lookup()
  linux-user: Add missing /proc/cpuinfo fields for sparc
  tcg/riscv: Use BEXTI for single-bit extractions
  util/cpuinfo-riscv: Detect Zbs
  tcg: Remove TCG_TARGET_HAS_deposit_{i32,i64}
  tcg: Remove TCG_TARGET_HAS_{s}extract_{i32,i64}
  tcg/tci: Remove assertions for deposit and extract
  tcg/tci: Provide TCG_TARGET_{s}extract_valid
  tcg/sparc64: Use SRA, SRL for {s}extract_i64
  tcg/s390x: Fold the ext{8,16,32}[us] cases into {s}extract
  tcg/riscv: Use SRAIW, SRLIW for {s}extract_i64
  tcg/riscv64: Fold the ext{8,16,32}[us] cases into {s}extract
  tcg/ppc: Fold the ext{8,16,32}[us] cases into {s}extract
  tcg/mips: Fold the ext{8,16,32}[us] cases into {s}extract
  tcg/loongarch64: Fold the ext{8,16,32}[us] cases into {s}extract
  tcg/arm: Add full [US]XT[BH] into {s}extract
  tcg/aarch64: Expand extract with offset 0 with andi
  tcg/aarch64: Provide TCG_TARGET_{s}extract_valid
  ...

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2025-01-21 08:28:33 -05:00
Richard Henderson
a4ca7f4a3e target/i386: Use tcg_op_supported
Do not reference TCG_TARGET_HAS_* directly.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2025-01-16 20:57:16 -08:00
Richard Henderson
34220513bb target/i386: Use tcg_op_deposit_valid
Avoid direct usage of TCG_TARGET_deposit_*_valid.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2025-01-16 20:57:16 -08:00
Richard Henderson
20fab3c210 target/i386: Remove TCG_TARGET_extract_tl_valid
This macro is unused.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2025-01-16 20:57:16 -08:00
Paolo Bonzini
ef682b08a0 target/i386: use shr to load high-byte registers into T0/T1
Using a sextract or extract operation is only necessary if a
sign or zero extended value is needed.  If not, a shift is
enough.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2025-01-10 23:34:44 +01:00
Paolo Bonzini
88716ae79f target/i386: improve code generation for BT
Because BT does not write back to the source operand, it can modify it to
ensure that one of the operands of TSTNE is a constant (after either gen_BT
or the optimizer's constant propagation).  This produces better and more
optimizable TCG ops.  For example, the sequence

  movl $0x60013f, %ebx
  btl %ecx, %ebx

becomes just

  and_i32 tmp1,ecx,$0x1f                   dead: 1 2  pref=0xffff
  shr_i32 tmp0,$0x60013f,tmp1              dead: 1 2  pref=0xffff
  and_i32 tmp16,tmp0,$0x1                  dead: 1  pref=0xbf80

On s390x, it can use four instructions to isolate bit 0 of 0x60013f >> (ecx & 31):

  nilf     %r12, 0x1f
  lgfi     %r11, 0x60013f
  srlk     %r12, %r11, 0(%r12)
  nilf     %r12, 1

Previously, it used five instructions to build 1 << (ecx & 31) and compute
TSTEQ, and also needed two more to construct the result of setcond:

  nilf     %r12, 0x1f
  lghi     %r11, 1
  sllk     %r12, %r11, 0(%r12)
  lgfi     %r9, 0x60013f
  nrk      %r0, %r12, %r9
  lghi     %r12, 0
  locghilh %r12, 1

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2025-01-10 23:34:44 +01:00
Richard Henderson
1f7f72bdc4 target/i386: Wrap cc_op_live with a validity check
Assert that op is known and that cc_op_live_ is populated.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-10-31 18:28:33 +01:00
Richard Henderson
f359b2fb71 target/i386: Introduce cc_op_size
Replace arithmetic on cc_op with a helper function.
Assert that the op has a size and that it is valid
for the configuration.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Link: https://lore.kernel.org/r/20240701025115.1265117-6-richard.henderson@linaro.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-10-31 18:28:33 +01:00
Paolo Bonzini
e09447c39f target/i386: remove CC_OP_CLR
Just use CC_OP_EFLAGS; it is not that likely that the flags computed by
CC_OP_CLR survive the end of the basic block, in which case there is no
need to spill cc_op_src.

cc_op_src now does need spilling if the XOR is followed by a memory
operation, but this only costs 0.2% extra TCG ops.  They will be recouped
by simplifications in how QEMU evaluates ZF at runtime, which are even
greater with this change.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-10-31 18:28:33 +01:00
Romain Malmain
b01a0bc334
Fix helper function calls & support for new x86 decoder (#92)
* fix helper function calls

* cmp hooks: support for new x86 decoder
2024-10-31 16:31:54 +01:00
Paolo Bonzini
fcd16539eb target/i386: convert CMPXCHG8B/CMPXCHG16B to new decoder
The gen_cmpxchg8b and gen_cmpxchg16b functions even have the correct
prototype already; the only thing that needs to be done is removing the
gen_lea_modrm() call.

This moves the last LOCK-enabled instructions to the new decoder.  It is
now possible to assume that gen_multi0F is called only after checking
that PREFIX_LOCK was not specified.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-10-17 19:41:29 +02:00
Paolo Bonzini
a2e2c78d2a target/i386: decode address before going back to translate.c
There are now relatively few unconverted opcodes in translate.c (there
are 13 of them including 8 for x87), and all of them have the same
format with a mod/rm byte and no immediate.  A good next step is
to remove the early bail out to disas_insn_x87/disas_insn_old,
instead giving these legacy translator functions the same prototype
as the other gen_* functions.

To do this, the X86DecodeInsn can be passed down to the places that
used to fetch address bytes from the instruction stream.  To make
sure that everything is done cleanly, the CPUX86State* argument is
removed.

As part of the unification, the gen_lea_modrm() name is now free,
so rename gen_load_ea() to gen_lea_modrm().  This is as good a name
and it makes the changes to translate.c easier to review.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-10-17 19:41:29 +02:00
Paolo Bonzini
10eae89937 target/i386: convert bit test instructions to new decoder
Code generation was rewritten; it reuses the same trick to use the
CC_OP_SAR values for cc_op, but it tries to use CC_OP_ADCX or CC_OP_ADCOX
instead of CC_OP_EFLAGS.  This is a tiny bit more efficient in the
common case where only CF is checked in the resulting flags.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-10-17 19:41:29 +02:00
Richard Henderson
83a3a20e59 target/i386: Fix carry flag for BLSI
BLSI has inverted semantics for C as compared to the other two
BMI1 instructions, BLSMSK and BLSR.  Introduce CC_OP_BLSI* for
this purpose.

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2175
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20240801075845.573075-3-richard.henderson@linaro.org>
2024-08-21 09:11:26 +10:00
Richard Henderson
7700d2293c target/i386: Assert MMX and XMM registers in range
The mmx assert would fire without the fix for #2495.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Link: https://lore.kernel.org/r/20240812025844.58956-4-richard.henderson@linaro.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-08-13 16:35:43 +02:00
Paolo Bonzini
3afc6539a8 target/i386/tcg: fix POP to memory in long mode
In long mode, POP to memory will write a full 64-bit value.  However,
the call to gen_writeback() in gen_POP will use MO_32 because the
decoding table is incorrect.

The bug was latent until commit aea49fbb01a ("target/i386: use gen_writeback()
within gen_POP()", 2024-06-08), and then became visible because gen_op_st_v
now receives op->ot instead of the "ot" returned by gen_pop_T0.

Analyzed-by: Clément Chigot <chigot@adacore.com>
Fixes: 5e9e21bcc4d ("target/i386: move 60-BF opcodes to new decoder", 2024-05-07)
Tested-by: Clément Chigot <chigot@adacore.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-07-16 18:18:24 +02:00
Paolo Bonzini
944f400134 target/i386: use cpu_cc_dst for CC_OP_POPCNT
It is the only CCOp, among those that compute ZF from one of the cc_op_*
registers, that uses cpu_cc_src.  Do not make it the odd one off,
instead use cpu_cc_dst like the others.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-06-28 14:44:52 +02:00
Paolo Bonzini
0c4da54883 target/i386: convert CMPXCHG to new decoder
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-06-17 09:47:39 +02:00
Paolo Bonzini
7b1f25ac3a target/i386: convert XADD to new decoder
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-06-17 09:47:39 +02:00
Paolo Bonzini
11ffaf8c73 target/i386: convert LZCNT/TZCNT/BSF/BSR/POPCNT to new decoder
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-06-17 09:47:39 +02:00
Paolo Bonzini
6476902740 target/i386: convert SHLD/SHRD to new decoder
Use the same flag generation code as SHL and SHR, but use
the existing gen_shiftd_rm_T1 function to compute the result
as well as CC_SRC.

Decoding-wise, SHLD/SHRD by immediate count as a 4 operand
instruction because s->T0 and s->T1 actually occupy three op
slots.  The infrastructure used by opcodes in the 0F 3A table
works fine.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-06-17 09:47:39 +02:00
Paolo Bonzini
e4e5981daf target/i386: adapt gen_shift_count for SHLD/SHRD
SHLD/SHRD can have 3 register operands - s->T0, s->T1 and either
1 or CL - and therefore decode->op[2] is taken by the low part
of the register being shifted.  Pass X86_OP_* to gen_shift_count
from its current callers and hardcode cpu_regs[R_ECX] as the
shift count.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-06-17 09:47:39 +02:00
Paolo Bonzini
ae541c0eb4 target/i386: convert non-grouped, helper-based 2-byte opcodes
These have very simple generators and no need for complex group
decoding.  Apart from LAR/LSL which are simplified to use
gen_op_deposit_reg_v and movcond, the code is generally lifted
from translate.c into the generators.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-06-17 09:47:39 +02:00
Paolo Bonzini
ea89aa895e target/i386: finish converting 0F AE to the new decoder
This is already partly implemented due to VLDMXCSR and VSTMXCSR; finish
the job.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-06-17 09:47:39 +02:00
Paolo Bonzini
e0448caebf target/i386: replace read_crN helper with read_cr8
All other control registers are stored plainly in CPUX86State.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-06-17 09:47:39 +02:00
Paolo Bonzini
a1af7fba5a target/i386: convert MOV from/to CR and DR to new decoder
Complete implementation of C and D operand types, then the operations
are just MOVs.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-06-17 09:47:35 +02:00
Paolo Bonzini
c0df9563a3 target/i386: replace NoSeg special with NoLoadEA
This is a bit more generic, as it can be applied to MPX as well.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-06-11 14:29:22 +02:00
Paolo Bonzini
4e2dc59cf9 target/i386: change X86_ENTRYwr to use T0, use it for moves
Just like X86_ENTRYr, X86_ENTRYwr is easily changed to use only T0.
In this case, the motivation is to use it for the MOV instruction
family.  The case when you need to preserve the input value is the
odd one, as it is used basically only for BLS* instructions.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-06-11 14:29:22 +02:00
Paolo Bonzini
c2b6b6a65a target/i386: change X86_ENTRYr to use T0
I am not sure why I made it use T1.  It is a bit more symmetric with
respect to X86_ENTRYwr (which uses T0 for the "w"ritten operand
and T1 for the "r"ead operand), but it is also less flexible because it
does not let you apply zextT0/sextT0.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-06-11 14:29:22 +02:00
Paolo Bonzini
e628387cf9 target/i386: put BLS* input in T1, use generic flag writeback
This makes for easier cpu_cc_* setup, and not using set_cc_op()
should come in handy if QEMU ever implements APX.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-06-11 14:29:22 +02:00
Paolo Bonzini
cc155f1971 target/i386: rewrite flags writeback for ADCX/ADOX
Avoid using set_cc_op() in preparation for implementing APX; treat
CC_OP_EFLAGS similar to the case where we have the "opposite" cc_op
(CC_OP_ADOX for ADCX and CC_OP_ADCX for ADOX), except the resulting
cc_op is not CC_OP_ADCOX. This is written easily as two "if"s, whose
conditions are both false for CC_OP_EFLAGS, both true for CC_OP_ADCOX,
and one each true for CC_OP_ADCX/ADOX.

The new logic also makes it easy to drop usage of tmp0.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-06-11 14:29:22 +02:00
Paolo Bonzini
4228eb8cc6 target/i386: remove CPUX86State argument from generator functions
CPUX86State argument would only be used to fetch bytes, but that has to be
done before the generator function is called.  So remove it, and all
temptation together with it.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-06-11 14:29:22 +02:00
Mark Cave-Ayland
f1b8613da3 target/i386: fix SP when taking a memory fault during POP
When OS/2 Warp configures its segment descriptors, many of them are configured with
the P flag clear to allow for a fault-on-demand implementation. In the case where
the stack value is POPped into the segment registers, the SP is incremented before
calling gen_helper_load_seg() to validate the segment descriptor:

IN:
0xffef2c0c:  66 07                    popl     %es

OP:
 ld_i32 loc9,env,$0xfffffffffffffff8
 sub_i32 loc9,loc9,$0x1
 brcond_i32 loc9,$0x0,lt,$L0
 st16_i32 loc9,env,$0xfffffffffffffff8
 st8_i32 $0x1,env,$0xfffffffffffffffc

 ---- 0000000000000c0c 0000000000000000
 ext16u_i64 loc0,rsp
 add_i64 loc0,loc0,ss_base
 ext32u_i64 loc0,loc0
 qemu_ld_a64_i64 loc0,loc0,noat+un+leul,5
 add_i64 loc3,rsp,$0x4
 deposit_i64 rsp,rsp,loc3,$0x0,$0x10
 extrl_i64_i32 loc5,loc0
 call load_seg,$0x0,$0,env,$0x0,loc5
 add_i64 rip,rip,$0x2
 ext16u_i64 rip,rip
 exit_tb $0x0
 set_label $L0
 exit_tb $0x7fff58000043

If helper_load_seg() generates a fault when validating the segment descriptor then as
the SP has already been incremented, the topmost word of the stack is overwritten by
the arguments pushed onto the stack by the CPU before taking the fault handler. As a
consequence things rapidly go wrong upon return from the fault handler due to the
corrupted stack.

Update the logic for the existing writeback condition so that a POP into the segment
registers also calls helper_load_seg() first before incrementing the SP, so that if a
fault occurs the SP remains unaltered.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2198
Message-ID: <20240606095319.229650-4-mark.cave-ayland@ilande.co.uk>
Fixes: cc1d28bdbe0 ("target/i386: move 00-5F opcodes to new decoder", 2024-05-07)
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-06-08 10:33:38 +02:00
Mark Cave-Ayland
aea49fbb01 target/i386: use gen_writeback() within gen_POP()
Instead of directly implementing the writeback using gen_op_st_v(), use the
existing gen_writeback() function.

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Message-ID: <20240606095319.229650-3-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-06-08 10:33:38 +02:00
Mark Cave-Ayland
f41990f552 target/i386: use local X86DecodedOp in gen_POP()
This will make subsequent changes a little easier to read.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Message-ID: <20240606095319.229650-2-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-06-08 10:33:38 +02:00
Paolo Bonzini
330e6adc1a target/i386: cleanup PAUSE helpers
Use decode.c's support for intercepts, doing the check in TCG-generated
code rather than the helper.  This is cleaner because it allows removing
the eip_addend argument to helper_pause(), even though it adds a bit of
bloat for opcode 0x90's new decoding function.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-06-08 10:33:38 +02:00
Paolo Bonzini
536032566b target/i386: cleanup HLT helpers
Use decode.c's support for intercepts, doing the check in TCG-generated
code rather than the helper.  This is cleaner because it allows removing
the eip_addend argument to helper_hlt().

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-06-08 10:33:37 +02:00
Paolo Bonzini
73fb7b3c49 target/i386: fix implementation of ICEBP
ICEBP generates a trap-like exception, while gen_exception() produces
a fault.  Resurrect gen_update_eip_next() to implement the desired
semantics.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-06-08 10:33:37 +02:00
Paolo Bonzini
56da568f88 target/i386: remove aflag argument of gen_lea_v_seg
It is always s->aflag.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-25 13:28:02 +02:00
Paolo Bonzini
6605817b1a target/i386: clean up repeated string operations
Do not bother generating inline wrappers for gen_repz and gen_repz2;
use s->prefix to separate REPZ from REPNZ in the case of SCAS and
CMPS.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-25 13:28:02 +02:00
Paolo Bonzini
f0e754d3ce target/i386: introduce gen_lea_ss_ofs
Generalize gen_stack_A0() to include an initial add and to use an arbitrary
destination.  This is a common pattern and it is not a huge burden to
add the extra arguments to the only caller of gen_stack_A0().

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-25 13:28:01 +02:00
Paolo Bonzini
d0e31d6d37 target/i386: inline gen_add_A0_ds_seg
It is only used in MONITOR, where a direct call of gen_lea_v_seg
is simpler, and in XLAT.  Inline it in the latter.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-25 13:28:01 +02:00
Paolo Bonzini
2512f786bf target/i386: avoid calling gen_eob_inhibit_irq before tb_stop
sti only has one exit, so it does not need to generate the
end-of-translation code inline.  It can be deferred to tb_stop.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-25 13:28:01 +02:00
Paolo Bonzini
abdcc5c8ef target/i386: set CC_OP in helpers if they want CC_OP_EFLAGS
Mark cc_op as clean and do not spill it at the end of the translation block.
Technically this is a tiny bit less efficient, but:

* it results in translations that are a tiny bit smaller

* for most of these instructions, it is not unlikely that they are close to
the end of the basic block, in which case cc_op would not be overwritten

* anyway the cost is probably dwarfed by that of computing flags.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-25 13:28:01 +02:00