627 Commits

Author SHA1 Message Date
Romain Malmain
5682a6d841 v10.0.0 release
-----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCgAdFiEEhpWov9P5fNqsNXdanKSrs4Grc8gFAmgHmpAACgkQnKSrs4Gr
 c8h82wf/fVN/ZlYKLX7VJz0z+u3UB5MKuDUd+7LUwSGse9uIOH3K8PITkMyYgIti
 Sh8EKg9rhVzBEpiL9ZJfqCJjQTgJFk0O4xt3dPSGNsI2pZZcDwvQXFit7e/fafrY
 tUaTPdGuZ+i7s8Ooa+Z5tacI7n8KniQQkgf90oTnKhatmDmUbsVE0fma/2EmgqdI
 fO2mJKp5YiDsRf3vmuVKx/ltHYfL2tOvBOojeWBk9Zwr+czI2ku6Fy1Suu+tWeZ5
 setxSOCfY3G+qVsTm3n0d9OW/GPoQBsSVbSYua/74nQneNivTDAncndLFbFdj60g
 Q9n4t7tHN35Nh4XqkE0DhMGqPsQ3Og==
 =CFYe
 -----END PGP SIGNATURE-----
gpgsig -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQSq9xYmtep25y1RrMYC5KE/dBVGigUCaBCxXQAKCRAC5KE/dBVG
 imXmAP0WaWyc2kmipvGyhdGor7F4PlG9LRHL0jM4Om5SM4lkzAD/WnyFAXtErEwl
 eK0c2d980jdVHS5h9tVDK5TpzcPCRA0=
 =Zk18
 -----END PGP SIGNATURE-----

Merge tag 'v10.0.0' into update_qemu_v10_0_0

v10.0.0 release
2025-04-29 13:00:44 +02:00
Romain Malmain
2a676d9cd8 v9.2.2 release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEZKoqtTHVaQM2a/75gqpKJDselHgFAme8B8gACgkQgqpKJDse
 lHjzqxAAl9+xkHoXtgsnMhENO8dNznCPFh3AGKacxrahv1/XP/ghjPF8NNV0tGDK
 us73n0rNJG88dW2RIQVTjZJ5WYXaMwFBYrPBD2F0MROpiLmjXkHTr/fuH9Z7GkXI
 DOAfzf9Hf2BgKlolLAxvL55LckolAM7C87DNE0gtg/OT+d+XXfFcCpQf6wn+v+B7
 vAj5v7ir96rBffjjbRm2wItIsBDhzSxUxdaSnefC3CT8O2hbD6OcPa9o8WH2fLIR
 HHBLsW+2JTxv01iKRwPLfA00RIbxvC9QaaxTdkyBcnWIwbJy7LIWDvy37pnfHOHS
 XBp/AXEiQ7CXWat2451CAx2WPA/Vbcz4ekNSlBFk4tGNAZTJc9gL/doTXaAOl1SM
 8URJpe/gIUVENICkZe17UXG1L2zdMclAUCrFwgzPv6Ljth8ctFC8Gdk2xvYw5etY
 wQaILuXtzl0RgGVHrVLRL3q1w51YKv7aii6v+czHjwgDRDchc1h3m2+33UPERVZe
 ymSs1R5Vvmh8kE7v0coJDtR2BLRb4++AvBKiJ6ty6UqHA/F5JLCSE7dwwUuim9YY
 7E2jI2cNX+HO8yfwNoqZQ2cr2gAtMIm4hHE4hs0iqamfi/RGk8xw9HrRPlXorj9y
 +KWDYTqYAXOtd+qZyQtbppHKGOEAKXjg9qdYNy9N5KyAe5jrd/8=
 =06yL
 -----END PGP SIGNATURE-----
gpgsig -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQSq9xYmtep25y1RrMYC5KE/dBVGigUCZ9mEEAAKCRAC5KE/dBVG
 isziAP9tS6m4jKmDiYyLoYHT5tQ8+gI0R3kMl5U8VNGOx+/kfgD/X11dFM7VaVDo
 fecgc4U1dVPRguh5WO1cjEL3k8IDQAU=
 =RdqL
 -----END PGP SIGNATURE-----

Merge tag 'v9.2.2' into update_qemu_v9_2_2

v9.2.2 release
2025-03-18 15:32:47 +01:00
Philippe Mathieu-Daudé
6ff5da1600 exec: Declare tlb_flush*() in 'exec/cputlb.h'
Move CPU TLB related methods to "exec/cputlb.h".

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Message-ID: <20241114011310.3615-19-philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2025-03-08 07:56:14 -08:00
Philippe Mathieu-Daudé
2809e2d6c4 exec: Declare tlb_set_page_with_attrs() in 'exec/cputlb.h'
Move CPU TLB related methods to "exec/cputlb.h".

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-ID: <20241114011310.3615-17-philmd@linaro.org>
2025-03-08 07:56:14 -08:00
Philippe Mathieu-Daudé
b12a0f8566 accel: Rename 'hw/core/accel-cpu.h' -> 'accel/accel-cpu-target.h'
AccelCPUClass is for accelerator to initialize target specific
features of a vCPU. Not really related to hardware emulation,
rename "hw/core/accel-cpu.h" as "accel/accel-cpu-target.h"
(using the explicit -target suffix).

More importantly, target specific header often access the
target specific definitions which are in each target/FOO/cpu.h
header, usually included generically as "cpu.h" relative to
target/FOO/. However, there is already a "cpu.h" in hw/core/
which takes precedence. This change allows "accel-cpu-target.h"
to include a target "cpu.h".

Mechanical change doing:

 $  git mv include/hw/core/accel-cpu.h \
           include/accel/accel-cpu-target.h
 $  sed -i -e 's,hw/core/accel-cpu.h,accel/accel-cpu-target.h,' \
   $(git grep -l hw/core/accel-cpu.h)

and renaming header guard 'ACCEL_CPU_TARGET_H'.

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20250123234415.59850-12-philmd@linaro.org>
2025-03-06 15:46:17 +01:00
Philippe Mathieu-Daudé
1501743654 accel/tcg: Rename 'hw/core/tcg-cpu-ops.h' -> 'accel/tcg/cpu-ops.h'
TCGCPUOps structure makes more sense in the accelerator context
rather than hardware emulation. Move it under the accel/tcg/ scope.

Mechanical change doing:

 $  sed -i -e 's,hw/core/tcg-cpu-ops.h,accel/tcg/cpu-ops.h,g' \
   $(git grep -l hw/core/tcg-cpu-ops.h)

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20250123234415.59850-11-philmd@linaro.org>
2025-03-06 15:46:17 +01:00
Peter Maydell
765fe845cc fpu: Pass float_status to floatx80_invalid_encoding()
The definition of which floatx80 encodings are invalid is
target-specific.  Currently we handle this with an ifdef, but we
would like to defer this decision to runtime.  In preparation, pass a
float_status argument to floatx80_invalid_encoding().

We will change the implementation from ifdef to looking at
the status argument in the following commit.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20250224111524.1101196-7-peter.maydell@linaro.org
2025-02-25 15:32:57 +00:00
Peter Maydell
9ea6d1f141 fpu: Pass float_status to floatx80_is_infinity()
Unlike the other float formats, whether a floatx80 value is
considered to be an Infinity is target-dependent.  (On x86 if the
explicit integer bit is clear this is a "pseudo-infinity" and not a
valid infinity; m68k does not care about the value of the integer
bit.)

Currently we select this target-specific logic at compile time with
an ifdef.  We're going to want to do this at runtime, so change the
floatx80_is_infinity() function to take a float_status.

This commit doesn't change any logic; we'll do that in the
next commit.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20250224111524.1101196-5-peter.maydell@linaro.org
2025-02-25 15:32:57 +00:00
Peter Maydell
165ce008d7 target/i386: Avoid using floatx80_infinity global const
The global const floatx80_infinity is (unlike all the other
float*_infinity values) target-specific, because whether the explicit
Integer bit is set or not varies between m68k and i386.  We want to
be able to compile softfloat once for multiple targets, so we can't
continue to use a single global whose value needs to be different
between targets.

Replace the direct uses of floatx80_infinity in target/i386 with
calls to the new floatx80_default_inf() function. Note that because
we can ask the function for either a negative or positive infinity,
we don't need to change the sign of a positive infinity via
floatx80_chs() for the negative-Inf case.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20250224111524.1101196-4-peter.maydell@linaro.org
Message-id: 20250217125055.160887-4-peter.maydell@linaro.org
2025-02-25 15:32:57 +00:00
Peter Maydell
28f13bccbe fpu: allow flushing of output denormals to be after rounding
Currently we handle flushing of output denormals in uncanon_normal
always before we deal with rounding.  This works for architectures
that detect tininess before rounding, but is usually not the right
place when the architecture detects tininess after rounding.  For
example, for x86 the SDM states that the MXCSR FTZ control bit causes
outputs to be flushed to zero "when it detects a floating-point
underflow condition".  This means that we mustn't flush to zero if
the input is such that after rounding it is no longer tiny.

At least one of our guest architectures does underflow detection
after rounding but flushing of denormals before rounding (MIPS MSA);
this means we need to have a config knob for this that is separate
from our existing tininess_before_rounding setting.

Add an ftz_detection flag.  For consistency with
tininess_before_rounding, we make it default to "detect ftz after
rounding"; this means that we need to explicitly set the flag to
"detect ftz before rounding" on every existing architecture that sets
flush_to_zero, so that this commit has no behaviour change.
(This means more code change here but for the long term a less
confusing API.)

For several architectures the current behaviour is either
definitely or possibly wrong; annotate those with TODO comments.
These architectures are definitely wrong (and should detect
ftz after rounding):
 * x86
 * Alpha

For these architectures the spec is unclear:
 * MIPS (for non-MSA)
 * RX
 * SH4

PA-RISC makes ftz detection IMPDEF, but we aren't setting the
"tininess before rounding" setting that we ought to.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2025-02-11 16:22:07 +00:00
Peter Maydell
2b3bfbb21b target/i386: Do not raise Invalid for 0 * Inf + QNaN
In commit 8adcff4ae7 ("fpu: handle raising Invalid for infzero in
pick_nan_muladd") we changed the handling of 0 * Inf + QNaN to always
raise the Invalid exception regardless of target architecture.  (This
was a change affecting hppa, i386, sh4 and tricore.) However, this
was incorrect for i386, which documents in the SDM section 14.5.2
that for the 0 * Inf + NaN case that it will only raise the Invalid
exception when the input is an SNaN.  (This is permitted by the IEEE
754-2008 specification, which documents that whether we raise Invalid
for 0 * Inf + QNaN is implementation defined.)

Adjust the softfloat pick_nan_muladd code to allow the target to
suppress the raising of Invalid for the inf * zero + NaN case (as an
extra flag orthogonal to its choice for when to use the default NaN),
and enable that for x86.

We do not revert here the behaviour change for hppa, sh4 or tricore:
 * The sh4 manual is clear that it should signal Invalid
 * The tricore manual is a bit vague but doesn't say it shouldn't
 * The hppa manual doesn't talk about fused multiply-add corner
   cases at all

Cc: qemu-stable@nongnu.org
Fixes: 8adcff4ae7 (""fpu: handle raising Invalid for infzero in pick_nan_muladd")
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Link: https://lore.kernel.org/r/20250116112536.4117889-2-peter.maydell@linaro.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2025-02-07 15:51:01 +01:00
Stefan Hajnoczi
871af84dd5 * target/i386: optimize string instructions
* target/i386: new Sierra Forest and Clearwater Forest models
 * rust: type-safe vmstate implementation
 * rust: use interior mutability for PL011
 * rust: clean ups
 * memtxattrs: remove usage of bitfields from MEMTXATTRS_UNSPECIFIED
 * gitlab-ci: enable Rust backtraces
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmeZ6VYUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroMjbQgApuooMOp0z/8Ky4/ux8M8/vrlcNCH
 V1Pm6WzrjEzd9TIMLGr6npOyLOkWI31Aa4o/TuW09SeKE3dpCf/7LYA5VDEtkH79
 F57MgnSj56sMNgu+QZ/SiGvkKJXl+3091jIianrrI0dtX8hPonm6bt55woDvQt3z
 p94+4zzv5G0nc+ncITCDho8sn5itdZWVOjf9n6VCOumMjF4nRSoMkJKYIvjNht6n
 GtjMhYA70tzjkIi4bPyYkhFpMNlAqEDIp2TvPzp6klG5QoUErHIzdzoRTAtE4Dpb
 7240r6jarQX41TBXGOFq0NrxES1cm5zO/6159D24qZGHGm2hG4nDx+t2jw==
 =ZKFy
 -----END PGP SIGNATURE-----

Merge tag 'for-upstream' of https://gitlab.com/bonzini/qemu into staging

* target/i386: optimize string instructions
* target/i386: new Sierra Forest and Clearwater Forest models
* rust: type-safe vmstate implementation
* rust: use interior mutability for PL011
* rust: clean ups
* memtxattrs: remove usage of bitfields from MEMTXATTRS_UNSPECIFIED
* gitlab-ci: enable Rust backtraces

# -----BEGIN PGP SIGNATURE-----
#
# iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmeZ6VYUHHBib256aW5p
# QHJlZGhhdC5jb20ACgkQv/vSX3jHroMjbQgApuooMOp0z/8Ky4/ux8M8/vrlcNCH
# V1Pm6WzrjEzd9TIMLGr6npOyLOkWI31Aa4o/TuW09SeKE3dpCf/7LYA5VDEtkH79
# F57MgnSj56sMNgu+QZ/SiGvkKJXl+3091jIianrrI0dtX8hPonm6bt55woDvQt3z
# p94+4zzv5G0nc+ncITCDho8sn5itdZWVOjf9n6VCOumMjF4nRSoMkJKYIvjNht6n
# GtjMhYA70tzjkIi4bPyYkhFpMNlAqEDIp2TvPzp6klG5QoUErHIzdzoRTAtE4Dpb
# 7240r6jarQX41TBXGOFq0NrxES1cm5zO/6159D24qZGHGm2hG4nDx+t2jw==
# =ZKFy
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 29 Jan 2025 03:39:50 EST
# gpg:                using RSA key F13338574B662389866C7682BFFBD25F78C7AE83
# gpg:                issuer "pbonzini@redhat.com"
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [full]
# gpg:                 aka "Paolo Bonzini <pbonzini@redhat.com>" [full]
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4  E2F7 7E15 100C CD36 69B1
#      Subkey fingerprint: F133 3857 4B66 2389 866C  7682 BFFB D25F 78C7 AE83

* tag 'for-upstream' of https://gitlab.com/bonzini/qemu: (49 commits)
  gitlab-ci: include full Rust backtraces in test runs
  rust: qemu-api: add sub-subclass to the integration tests
  rust/zeroable: Implement Zeroable with const_zero macro
  rust: qdev: make reset take a shared reference
  rust: pl011: drop use of ControlFlow
  rust: pl011: pull device-specific code out of MemoryRegionOps callbacks
  rust: pl011: remove duplicate definitions
  rust: pl011: wrap registers with BqlRefCell
  rust: pl011: extract PL011Registers
  rust: pl011: pull interrupt updates out of read/write ops
  rust: pl011: extract CharBackend receive logic into a separate function
  rust: pl011: extract conversion to RegisterOffset
  rust: pl011: hide unnecessarily "pub" items from outside pl011::device
  rust: pl011: remove unnecessary "extern crate"
  rust: prefer NonNull::new to assertions
  rust: vmstate: make order of parameters consistent in vmstate_clock
  rust: vmstate: remove translation of C vmstate macros
  rust: pl011: switch vmstate to new-style macros
  rust: qemu_api: add vmstate_struct
  rust: vmstate: add public utility macros to implement VMState
  ...

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2025-01-29 09:51:03 -05:00
Peter Maydell
7af64d103d fpu: Rename float_flag_output_denormal to float_flag_output_denormal_flushed
Our float_flag_output_denormal exception flag is set when
the fpu code flushes an output denormal to zero. Rename
it to float_flag_output_denormal_flushed:
 * this keeps it parallel with the flag for flushing
   input denormals, which we just renamed
 * it makes it clearer that it doesn't mean "set when
   the output is a denormal"

Commit created with
 for f in `git grep -l float_flag_output_denormal`; do sed -i -e 's/float_flag_output_denormal/float_flag_output_denormal_flushed/' $f; done

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20250124162836.2332150-21-peter.maydell@linaro.org
2025-01-28 18:40:19 +00:00
Peter Maydell
584b7aec81 fpu: Rename float_flag_input_denormal to float_flag_input_denormal_flushed
Our float_flag_input_denormal exception flag is set when the fpu code
flushes an input denormal to zero.  This is what many guest
architectures (eg classic Arm behaviour) require, but it is not the
only donarmal-related reason we might want to set an exception flag.
The x86 behaviour (which we do not currently model correctly) wants
to see an exception flag when a denormal input is *not* flushed to
zero and is actually used in an arithmetic operation. Arm's FEAT_AFP
also wants these semantics.

Rename float_flag_input_denormal to float_flag_input_denormal_flushed
to make it clearer when it is set and to allow us to add a new
float_flag_input_denormal_used next to it for the x86/FEAT_AFP
semantics.

Commit created with
 for f in `git grep -l float_flag_input_denormal`; do sed -i -e 's/float_flag_input_denormal/float_flag_input_denormal_flushed/' $f; done

and manual editing of softfloat-types.h and softfloat.c to clean
up the indentation afterwards and to fix a comment which wasn't
using the full name of the flag.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20250124162836.2332150-20-peter.maydell@linaro.org
2025-01-28 18:40:19 +00:00
Paolo Bonzini
22063f03a7 target/i386: avoid using s->tmp0 for add to implicit registers
For updates to implicit registers (RCX in LOOP instructions, RSI or RDI
in string instructions, or the stack pointer) do the add directly using
the registers (with no temporary) if 32-bit or 64-bit, or use a temporary
created for the occasion if 16-bit.  This is more efficient and removes
move instructions for the MO_TL case.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Link: https://lore.kernel.org/r/20241215090613.89588-14-pbonzini@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2025-01-23 11:50:53 +01:00
Paolo Bonzini
82290c7647 target/i386: extract common bits of gen_repz/gen_repz_nz
Now that everything has been cleaned up, look at DF and prefixes
in a single function, and call that one from gen_repz and gen_repz_nz.

Suggested-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2025-01-23 11:50:44 +01:00
Paolo Bonzini
4f094e27f3 target/i386: pull computation of string update value out of loop
This is a common operation that is executed many times in rep
movs or rep stos loops.  It can improve performance by several
percentage points.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Link: https://lore.kernel.org/r/20241215090613.89588-13-pbonzini@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2025-01-23 11:35:33 +01:00
Paolo Bonzini
456709db50 target/i386: execute multiple REP/REPZ iterations without leaving TB
Use a TCG loop so that it is not necessary to go through the setup steps
of REP and through the I/O check on every iteration.  Interestingly, this
is not a particularly effective optimization on its own, though it avoids
the cost of correct RF emulation that was added in the previous patch.
The main benefit lies in allowing the hoisting of loop invariants outside
the loop, which will happen separately.

The loop exits when the low 16 bits of CX/ECX/RCX are zero (so generally
speaking the string operation runs in 65536 iteration batches) to give
the main loop an opportunity to pick up interrupts.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Link: https://lore.kernel.org/r/20241215090613.89588-12-pbonzini@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2025-01-23 11:35:33 +01:00
Paolo Bonzini
0360b78187 target/i386: optimize CX handling in repeated string operations
In a repeated string operation, CX/ECX will be decremented until it
is 0 but never underflow.  Use this observation to avoid a deposit or
zero-extend operation if the address size of the operation is smaller
than MO_TL.

As in the previous patch, the patch is structured to include some
preparatory work for subsequent changes.  In particular, introducing
cx_next prepares for when ECX will be decremented *before* calling
fn(s, ot), and therefore cannot yet be written back to cpu_regs.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Link: https://lore.kernel.org/r/20241215090613.89588-11-pbonzini@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2025-01-23 11:35:33 +01:00
Paolo Bonzini
3658116025 target/i386: do not use gen_op_jz_ecx for repeated string operations
Explicitly generate a TSTEQ branch (which is optimized to NE x,0 if possible).
This does not make much sense yet, but later we will add more checks and some
will use a temporary to check on the decremented value of CX/ECX/RCX; it will
be clearer for all checks to share the same logic using TSTEQ(reg, cx_mask).

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Link: https://lore.kernel.org/r/20241215090613.89588-10-pbonzini@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2025-01-23 11:35:33 +01:00
Paolo Bonzini
6986cf0032 target/i386: make cc_op handling more explicit for repeated string instructions.
Since the cost of gen_update_cc_op() must be paid anyway, it's easier
to place them manually and not rely on spilling that is buried under
multiple levels of function calls.  While at it, clarify the circumstances
in which the gen_update_cc_op() is needed, and why it is not for REPxx
SCAS and REPxx CMPS.

And since cc_op will have been spilled at the point of a fault, just
make the whole insn CC_OP_DYNAMIC.  Once repz_opt is reintroduced,
a fault could happen either before or after the first execution of
CMPS/SCAS, and CC_OP_DYNAMIC sidesteps the complicated matter of what
x86_restore_state_to_opc would do.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Link: https://lore.kernel.org/r/20241215090613.89588-9-pbonzini@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2025-01-23 11:35:33 +01:00
Paolo Bonzini
0d82d9e846 target/i386: fix RF handling for string instructions
RF must be set on traps and interrupts from a string instruction,
except if they occur after the last iteration.  Ensure it is set
before giving the main loop a chance to execute.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Link: https://lore.kernel.org/r/20241215090613.89588-8-pbonzini@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2025-01-23 11:35:33 +01:00
Paolo Bonzini
4d7704ebc5 target/i386: tcg: move gen_set/reset_* earlier in the file
Allow using them in the code that translates REP/REPZ, without
forward declarations.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Link: https://lore.kernel.org/r/20241215090613.89588-7-pbonzini@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2025-01-23 11:35:33 +01:00
Paolo Bonzini
0eb7046e1b target/i386: reorganize ops emitted by do_gen_rep, drop repz_opt
The condition for optimizing repeat instruction is more or less the
opposite of what you imagine: almost always the string instruction
was _not_ optimized and optimizing the loop relied on goto_tb.
This is obviously not great for performance, due to the cost of the
exit-to-main-loop check, but also wrong.  In fact, after expanding
dc->jmp_opt and simplifying "!!x" to "x", the condition for looping used
to be:

   ((cflags & CF_NO_GOTO_TB) ||
    (flags & (HF_RF_MASK | HF_TF_MASK | HF_INHIBIT_IRQ_MASK))) && !(cflags & CF_USE_ICOUNT)

In other words, setting aside RF (it requires special handling for REP
instructions and it was completely missing), repeat instruction were
being optimized if TF or inhibit IRQ flags were set.  This is certainly
wrong for TF, because string instructions trap after every execution,
and probably for interrupt shadow too.

Get rid of repz_opt completely.  The next patches will reintroduce the
optimization, applying it in the common case instead of the unlikely
and wrong one.

While at it, place the CX/ECX/RCX=0 case is at the end of the function,
which saves a label and is clearer when reading the generated ops.
For clarity, mark the cc_op explicitly as DYNAMIC even if at the end
of the translation block; the cc_op can come from either the previous
instruction or the string instruction, and currently we rely on
a gen_update_cc_op() that is hidden in the bowels of gen_jcc() to
spill cc_op and mark it clean.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Link: https://lore.kernel.org/r/20241215090613.89588-6-pbonzini@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2025-01-23 11:35:33 +01:00
Paolo Bonzini
d8d552d459 target/i386: unify choice between single and repeated string instructions
The same "if" is present in all generator functions for string instructions.
Push it inside gen_repz() and gen_repz_nz() instead.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Link: https://lore.kernel.org/r/20241215090613.89588-5-pbonzini@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2025-01-23 11:35:33 +01:00
Paolo Bonzini
b519556f58 target/i386: unify REP and REPZ/REPNZ generation
It only differs in a single call to gen_jcc, so use a "bool" argument
to distinguish the two cases; do not duplicate code.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Link: https://lore.kernel.org/r/20241215090613.89588-4-pbonzini@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2025-01-23 11:35:33 +01:00
Paolo Bonzini
e604be4fb4 target/i386: remove trailing 1 from gen_{j, cmov, set}cc1
This is not needed anymore now that gen_jcc has been eliminated
(merged into the similarly-named gen_Jcc, where the uppercase letter
gives away that it is an emission function).

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Link: https://lore.kernel.org/r/20241215090613.89588-3-pbonzini@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2025-01-23 11:35:33 +01:00
Paolo Bonzini
6ace2d5163 target/i386: inline gen_jcc into sole caller
The code of gen_Jcc is very similar to gen_LOOP* and gen_JCXZ, but this
is hidden by gen_jcc.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Link: https://lore.kernel.org/r/20241215090613.89588-2-pbonzini@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2025-01-23 11:35:33 +01:00
Stefan Hajnoczi
32a97c5d05 tcg:
- Add TCGOP_TYPE, TCGOP_FLAGS.
   - Pass type and flags to tcg_op_supported, tcg_target_op_def.
   - Split out tcg-target-has.h and unexport from tcg.h.
   - Reorg constraint processing; constify TCGOpDef.
   - Make extract, sextract, deposit opcodes mandatory.
   - Merge ext{8,16,32}{s,u} opcodes into {s}extract.
 tcg/mips: Expand bswap unconditionally
 tcg/riscv: Use SRAIW, SRLIW for {s}extract_i64
 tcg/riscv: Use BEXTI for single-bit extractions
 tcg/sparc64: Use SRA, SRL for {s}extract_i64
 
 disas/riscv: Guard dec->cfg dereference for host disassemble
 util/cpuinfo-riscv: Detect Zbs
 accel/tcg: Call tcg_tb_insert() for one-insn TBs
 linux-user: Add missing /proc/cpuinfo fields for sparc
 -----BEGIN PGP SIGNATURE-----
 
 iQFRBAABCgA7FiEEekgeeIaLTbaoWgXAZN846K9+IV8FAmeKnzUdHHJpY2hhcmQu
 aGVuZGVyc29uQGxpbmFyby5vcmcACgkQZN846K9+IV+Kvgf+LG9UjXlWF9GK923E
 TllBL2rLf1OOdtTXWO15VcvGMoWDwB3tVBdhihdvXmnWju+WbfMk6mct5NhzsKn9
 LmuugMIZs+hMROj+bgMK8x47jRIh5N2rDYxcEgmyfIpYb2o9qvyqKecGVRlSJTCE
 bmt5UFbvPThBb8upoMfq3F6evuMx0szBP7wrOwSR/VGpmzIr20UTEWo6I1ALp4uj
 paFaysYol4em3dIhkiuV9cL7E0EIObaNa7l9RUci/BmTq+JaVxUnW1Y2i0PEwKwG
 FJSfYTJk3wBgAVxC2zC2g3ZM7uKuecSXMpiFopTiuyQLp7Q61i9kCNvEq0qY5tdb
 DaqR/g==
 =cv4O
 -----END PGP SIGNATURE-----

Merge tag 'pull-tcg-20250117' of https://gitlab.com/rth7680/qemu into staging

tcg:
  - Add TCGOP_TYPE, TCGOP_FLAGS.
  - Pass type and flags to tcg_op_supported, tcg_target_op_def.
  - Split out tcg-target-has.h and unexport from tcg.h.
  - Reorg constraint processing; constify TCGOpDef.
  - Make extract, sextract, deposit opcodes mandatory.
  - Merge ext{8,16,32}{s,u} opcodes into {s}extract.
tcg/mips: Expand bswap unconditionally
tcg/riscv: Use SRAIW, SRLIW for {s}extract_i64
tcg/riscv: Use BEXTI for single-bit extractions
tcg/sparc64: Use SRA, SRL for {s}extract_i64

disas/riscv: Guard dec->cfg dereference for host disassemble
util/cpuinfo-riscv: Detect Zbs
accel/tcg: Call tcg_tb_insert() for one-insn TBs
linux-user: Add missing /proc/cpuinfo fields for sparc

# -----BEGIN PGP SIGNATURE-----
#
# iQFRBAABCgA7FiEEekgeeIaLTbaoWgXAZN846K9+IV8FAmeKnzUdHHJpY2hhcmQu
# aGVuZGVyc29uQGxpbmFyby5vcmcACgkQZN846K9+IV+Kvgf+LG9UjXlWF9GK923E
# TllBL2rLf1OOdtTXWO15VcvGMoWDwB3tVBdhihdvXmnWju+WbfMk6mct5NhzsKn9
# LmuugMIZs+hMROj+bgMK8x47jRIh5N2rDYxcEgmyfIpYb2o9qvyqKecGVRlSJTCE
# bmt5UFbvPThBb8upoMfq3F6evuMx0szBP7wrOwSR/VGpmzIr20UTEWo6I1ALp4uj
# paFaysYol4em3dIhkiuV9cL7E0EIObaNa7l9RUci/BmTq+JaVxUnW1Y2i0PEwKwG
# FJSfYTJk3wBgAVxC2zC2g3ZM7uKuecSXMpiFopTiuyQLp7Q61i9kCNvEq0qY5tdb
# DaqR/g==
# =cv4O
# -----END PGP SIGNATURE-----
# gpg: Signature made Fri 17 Jan 2025 13:19:33 EST
# gpg:                using RSA key 7A481E78868B4DB6A85A05C064DF38E8AF7E215F
# gpg:                issuer "richard.henderson@linaro.org"
# gpg: Good signature from "Richard Henderson <richard.henderson@linaro.org>" [full]
# Primary key fingerprint: 7A48 1E78 868B 4DB6 A85A  05C0 64DF 38E8 AF7E 215F

* tag 'pull-tcg-20250117' of https://gitlab.com/rth7680/qemu: (68 commits)
  softfloat: Constify helpers returning float_status field
  accel/tcg: Call tcg_tb_insert() for one-insn TBs
  tcg: Document tb_lookup() and tcg_tb_lookup()
  linux-user: Add missing /proc/cpuinfo fields for sparc
  tcg/riscv: Use BEXTI for single-bit extractions
  util/cpuinfo-riscv: Detect Zbs
  tcg: Remove TCG_TARGET_HAS_deposit_{i32,i64}
  tcg: Remove TCG_TARGET_HAS_{s}extract_{i32,i64}
  tcg/tci: Remove assertions for deposit and extract
  tcg/tci: Provide TCG_TARGET_{s}extract_valid
  tcg/sparc64: Use SRA, SRL for {s}extract_i64
  tcg/s390x: Fold the ext{8,16,32}[us] cases into {s}extract
  tcg/riscv: Use SRAIW, SRLIW for {s}extract_i64
  tcg/riscv64: Fold the ext{8,16,32}[us] cases into {s}extract
  tcg/ppc: Fold the ext{8,16,32}[us] cases into {s}extract
  tcg/mips: Fold the ext{8,16,32}[us] cases into {s}extract
  tcg/loongarch64: Fold the ext{8,16,32}[us] cases into {s}extract
  tcg/arm: Add full [US]XT[BH] into {s}extract
  tcg/aarch64: Expand extract with offset 0 with andi
  tcg/aarch64: Provide TCG_TARGET_{s}extract_valid
  ...

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2025-01-21 08:28:33 -05:00
Richard Henderson
a4ca7f4a3e target/i386: Use tcg_op_supported
Do not reference TCG_TARGET_HAS_* directly.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2025-01-16 20:57:16 -08:00
Richard Henderson
34220513bb target/i386: Use tcg_op_deposit_valid
Avoid direct usage of TCG_TARGET_deposit_*_valid.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2025-01-16 20:57:16 -08:00
Richard Henderson
20fab3c210 target/i386: Remove TCG_TARGET_extract_tl_valid
This macro is unused.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2025-01-16 20:57:16 -08:00
Xiaoyao Li
d3bb5d0d4f i386/cpu: Extract a common fucntion to setup value of MSR_CORE_THREAD_COUNT
There are duplicated code to setup the value of MSR_CORE_THREAD_COUNT.
Extract a common function for it.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20241219110125.1266461-2-xiaoyao.li@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2025-01-10 23:34:44 +01:00
Paolo Bonzini
ef682b08a0 target/i386: use shr to load high-byte registers into T0/T1
Using a sextract or extract operation is only necessary if a
sign or zero extended value is needed.  If not, a shift is
enough.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2025-01-10 23:34:44 +01:00
Paolo Bonzini
88716ae79f target/i386: improve code generation for BT
Because BT does not write back to the source operand, it can modify it to
ensure that one of the operands of TSTNE is a constant (after either gen_BT
or the optimizer's constant propagation).  This produces better and more
optimizable TCG ops.  For example, the sequence

  movl $0x60013f, %ebx
  btl %ecx, %ebx

becomes just

  and_i32 tmp1,ecx,$0x1f                   dead: 1 2  pref=0xffff
  shr_i32 tmp0,$0x60013f,tmp1              dead: 1 2  pref=0xffff
  and_i32 tmp16,tmp0,$0x1                  dead: 1  pref=0xbf80

On s390x, it can use four instructions to isolate bit 0 of 0x60013f >> (ecx & 31):

  nilf     %r12, 0x1f
  lgfi     %r11, 0x60013f
  srlk     %r12, %r11, 0(%r12)
  nilf     %r12, 1

Previously, it used five instructions to build 1 << (ecx & 31) and compute
TSTEQ, and also needed two more to construct the result of setcond:

  nilf     %r12, 0x1f
  lghi     %r11, 1
  sllk     %r12, %r11, 0(%r12)
  lgfi     %r9, 0x60013f
  nrk      %r0, %r12, %r9
  lghi     %r12, 0
  locghilh %r12, 1

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2025-01-10 23:34:44 +01:00
Romain Malmain
ace364678a
Nyx api support (#97)
* add nyx support

* target independent helper call.
2025-01-06 16:13:11 +01:00
Richard Henderson
e4a8e093dc accel/tcg: Move gen_intermediate_code to TCGCPUOps.translate_core
Convert all targets simultaneously, as the gen_intermediate_code
function disappears from the target.  While there are possible
workarounds, they're larger than simply performing the conversion.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-12-24 08:32:15 -08:00
Philippe Mathieu-Daudé
a9ca97ea9e accel/tcg: Un-inline translator_is_same_page()
Remove the single target-specific definition used in
"exec/translator.h" (TARGET_PAGE_MASK) by un-inlining
is_same_page().
Rename the method as translator_is_same_page() and
improve its documentation.
Use it in translator_use_goto_tb().

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20241218154145.71353-1-philmd@linaro.org>
2024-12-20 17:44:57 +01:00
Philippe Mathieu-Daudé
68df8c8dba accel/tcg: Include missing 'exec/translation-block.h' header
TB compile flags, tb_page_addr_t type, tb_cflags() and few
other methods are defined in "exec/translation-block.h".

All these files don't include "exec/translation-block.h" but
include "exec/exec-all.h" which include it. Explicitly include
"exec/translation-block.h" to be able to remove it from
"exec/exec-all.h" later when it won't be necessary. Otherwise
we'd get errors such:

  accel/tcg/internal-target.h:59:20: error: a parameter list without types is only allowed in a function definition
     59 | void tb_lock_page0(tb_page_addr_t);
        |                    ^
  accel/tcg/tb-hash.h:64:23: error: unknown type name 'tb_page_addr_t'
     64 | uint32_t tb_hash_func(tb_page_addr_t phys_pc, vaddr pc,
        |                       ^
  accel/tcg/tcg-accel-ops.c:62:36: error: use of undeclared identifier 'CF_CLUSTER_SHIFT'
     62 |     cflags = cpu->cluster_index << CF_CLUSTER_SHIFT;
        |                                    ^
  accel/tcg/watchpoint.c:102:47: error: use of undeclared identifier 'CF_NOIRQ'
    102 |                     cpu->cflags_next_tb = 1 | CF_NOIRQ | curr_cflags(cpu);
        |                                               ^
  target/i386/helper.c:536:28: error: use of undeclared identifier 'CF_PCREL'
    536 |     if (tcg_cflags_has(cs, CF_PCREL)) {
        |                            ^
  target/rx/cpu.c:51:21: error: incomplete definition of type 'struct TranslationBlock'
     51 |     cpu->env.pc = tb->pc;
        |                   ~~^
  system/physmem.c:2977:9: error: call to undeclared function 'tb_invalidate_phys_range'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration]
   2977 |         tb_invalidate_phys_range(addr, addr + length - 1);
        |         ^
  plugins/api.c:96:12: error: call to undeclared function 'tb_cflags'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration]
     96 |     return tb_cflags(tcg_ctx->gen_tb) & CF_MEMI_ONLY;
        |            ^

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20241114011310.3615-5-philmd@linaro.org>
2024-12-20 17:44:57 +01:00
Philippe Mathieu-Daudé
32cad1ffb8 include: Rename sysemu/ -> system/
Headers in include/sysemu/ are not only related to system
*emulation*, they are also used by virtualization. Rename
as system/ which is clearer.

Files renamed manually then mechanical change using sed tool.

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Tested-by: Lei Yang <leiyang@redhat.com>
Message-Id: <20241203172445.28576-1-philmd@linaro.org>
2024-12-20 17:44:56 +01:00
Peter Maydell
f69da79196 target/i386: Set default NaN pattern explicitly
Set the default NaN pattern explicitly, and remove the ifdef from
parts64_default_nan().

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20241202131347.498124-38-peter.maydell@linaro.org
2024-12-11 15:31:04 +00:00
Peter Maydell
703990100a target/i386: Set Float3NaNPropRule explicitly
Set the Float3NaNPropRule explicitly for i386.  We had no
i386-specific behaviour in the old ifdef ladder, so we were using the
default "prefer a then b then c" fallback; this is actually the
correct per-the-spec handling for i386.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20241202131347.498124-25-peter.maydell@linaro.org
2024-12-11 15:30:58 +00:00
Peter Maydell
390df9046b target/x86: Set FloatInfZeroNaNRule explicitly
Set the FloatInfZeroNaNRule explicitly for the x86 target.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20241202131347.498124-12-peter.maydell@linaro.org
2024-12-11 15:30:55 +00:00
Pierrick Bouvier
7ba055b49b target/i386: fix hang when using slow path for ptw_setl
When instrumenting memory accesses for plugin, we force memory accesses
to use the slow path for mmu [1]. This create a situation where we end
up calling ptw_setl_slow. This was fixed recently in [2] but the issue
still could appear out of plugins use case.

Since this function gets called during a cpu_exec, start_exclusive then
hangs. This exclusive section was introduced initially for security
reasons [3].

I suspect this code path was never triggered, because ptw_setl_slow
would always be called transitively from cpu_exec, resulting in a hang.

[1] 6d03226b42
[2] 115ade42d5
[3] https://gitlab.com/qemu-project/qemu/-/issues/279

Fixes: https://gitlab.com/qemu-project/qemu/-/issues/2566
Signed-off-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-ID: <20241025175857.2554252-2-pierrick.bouvier@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-11-16 08:42:25 -08:00
Alexander Graf
8fa11a4df3 target/i386: Fix legacy page table walk
Commit b56617bbcb4 ("target/i386: Walk NPT in guest real mode") added
logic to run the page table walker even in real mode if we are in NPT
mode.  That function then determined whether real mode or paging is
active based on whether the pg_mode variable was 0.

Unfortunately pg_mode is 0 in two situations:

  1) Paging is disabled (real mode)
  2) Paging is in 2-level paging mode (32bit without PAE)

That means the walker now assumed that 2-level paging mode was real
mode, breaking NetBSD as well as Windows XP.

To fix that, this patch adds a new PG flag to pg_mode which indicates
whether paging is active at all and uses that to determine whether we
are in real mode or not.

Cc: qemu-stable@nongnu.org
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2654
Fixes: b56617bbcb4 ("target/i386: Walk NPT in guest real mode")
Signed-off-by: Alexander Graf <graf@amazon.com>
Reported-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Link: https://lore.kernel.org/r/20241106154329.67218-1-graf@amazon.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-11-07 16:54:02 +01:00
Peter Maydell
62d39b28ef target/i386: Set 2-NaN propagation rule explicitly
Set the NaN propagation rule explicitly for the float_status words
used in the x86 target.

This is a no-behaviour-change commit, so we retain the existing
behaviour of using the x87-style "prefer QNaN over SNaN, then prefer
the NaN with the larger significand" for MMX and SSE.  This is
however not the documented hardware behaviour, so we leave a TODO
note about what we should be doing instead.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20241025141254.2141506-16-peter.maydell@linaro.org
2024-11-05 10:09:56 +00:00
Paolo Bonzini
6d8623b5c0 target/i386: use + to put flags together
This gives greater opportunity for reassociation on x86 targets,
since addition can use the LEA instruction.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-10-31 18:28:33 +01:00
Paolo Bonzini
134ffcb276 target/i386: use higher-precision arithmetic to compute CF
If the operands of the arithmetic instruction fit within a half-register,
it's easiest to use a comparison instruction to compute the carry.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-10-31 18:28:33 +01:00
Paolo Bonzini
24899cdcd2 target/i386: use compiler builtin to compute PF
This removes the 256 byte parity table from the executable.

Suggested-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-10-31 18:28:33 +01:00
Paolo Bonzini
46c04e4bcf target/i386: make flag variables unsigned
This makes it easier for the compiler to understand which bits are set,
and it also removes "cltq" instructions to canonicalize the output value
as 32-bit signed.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-10-31 18:28:33 +01:00