627 Commits

Author SHA1 Message Date
Richard Henderson
6b1b736bae target/i386: Convert do_xsave_* to X86Access
The body of do_xsave is now fully converted.

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-05-26 12:51:50 -07:00
Richard Henderson
6d030aab29 tagret/i386: Convert do_fxsave, do_fxrstor to X86Access
Move the alignment fault from do_* to helper_*, as it need
not apply to usage from within user-only signal handling.

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-05-26 12:51:50 -07:00
Richard Henderson
e41d2eaf17 target/i386: Convert do_xrstor_{fpu,mxcr,sse} to X86Access
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-05-26 12:51:50 -07:00
Richard Henderson
b7e6d3ad30 target/i386: Convert do_xsave_{fpu,mxcr,sse} to X86Access
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-05-26 12:51:50 -07:00
Richard Henderson
94f60f8f1c target/i386: Convert do_fsave, do_frstor to X86Access
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-05-26 12:51:50 -07:00
Richard Henderson
505e2ef744 target/i386: Convert do_fstenv to X86Access
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-05-26 12:51:50 -07:00
Richard Henderson
bc13c2dd01 target/i386: Convert do_fldenv to X86Access
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-05-26 12:51:50 -07:00
Richard Henderson
4526f58a27 target/i386: Convert helper_{fbld,fbst}_ST0 to X86Access
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-05-26 12:51:50 -07:00
Richard Henderson
d3e8b648ab target/i386: Convert do_fldt, do_fstt to X86Access
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-05-26 12:51:50 -07:00
Richard Henderson
24f6813924 target/i386: Add tcg/access.[ch]
Provide a method to amortize page lookup across large blocks.

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-05-26 12:51:50 -07:00
Paolo Bonzini
56da568f88 target/i386: remove aflag argument of gen_lea_v_seg
It is always s->aflag.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-25 13:28:02 +02:00
Paolo Bonzini
6605817b1a target/i386: clean up repeated string operations
Do not bother generating inline wrappers for gen_repz and gen_repz2;
use s->prefix to separate REPZ from REPNZ in the case of SCAS and
CMPS.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-25 13:28:02 +02:00
Paolo Bonzini
f0e754d3ce target/i386: introduce gen_lea_ss_ofs
Generalize gen_stack_A0() to include an initial add and to use an arbitrary
destination.  This is a common pattern and it is not a huge burden to
add the extra arguments to the only caller of gen_stack_A0().

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-25 13:28:01 +02:00
Paolo Bonzini
20237d4070 target/i386: use mo_stacksize more
Use mo_stacksize for all stack accesses, including when
a 64-bit code segment is impossible and the code is
therefore checking only for SS32(s).

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-25 13:28:01 +02:00
Paolo Bonzini
d0e31d6d37 target/i386: inline gen_add_A0_ds_seg
It is only used in MONITOR, where a direct call of gen_lea_v_seg
is simpler, and in XLAT.  Inline it in the latter.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-25 13:28:01 +02:00
Paolo Bonzini
420d60caad target/i386: split gen_ldst_modrm for load and store
The is_store argument of gen_ldst_modrm has only ever been passed
a constant.  Just split the function in two.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-25 13:28:01 +02:00
Paolo Bonzini
b3c49e654a target/i386: reg in gen_ldst_modrm is always OR_TMP0
Values other than OR_TMP0 were only ever used by MOV and MOVNTI
opcodes.  Now that these have been converted to the new decoder,
remove the argument.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-25 13:28:01 +02:00
Paolo Bonzini
f5bd6a48ee target/i386: raze the gen_eob* jungle
Make gen_eob take the DISAS_* constant as an argument, so that
it is not necessary to have wrappers around it.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-25 13:28:01 +02:00
Paolo Bonzini
ad8f2ad77e target/i386: assert that gen_update_eip_cur and gen_update_eip_next are the same in tb_stop
This is an invariant now that there are no calls to gen_eob_inhibit_irq()
outside tb_stop.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-25 13:28:01 +02:00
Paolo Bonzini
2512f786bf target/i386: avoid calling gen_eob_inhibit_irq before tb_stop
sti only has one exit, so it does not need to generate the
end-of-translation code inline.  It can be deferred to tb_stop.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-25 13:28:01 +02:00
Paolo Bonzini
c8494cb8b1 target/i386: avoid calling gen_eob_syscall before tb_stop
syscall and sysret only have one exit, so they do not need to
generate the end-of-translation code inline.  It can be
deferred to tb_stop.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-25 13:28:01 +02:00
Paolo Bonzini
9594b59331 target/i386: document and group DISAS_* constants
Place DISAS_* constants that update cpu_eip first, and
the "jump" ones last.  Add comments explaining the differences
and usage.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-25 13:28:01 +02:00
Paolo Bonzini
abdcc5c8ef target/i386: set CC_OP in helpers if they want CC_OP_EFLAGS
Mark cc_op as clean and do not spill it at the end of the translation block.
Technically this is a tiny bit less efficient, but:

* it results in translations that are a tiny bit smaller

* for most of these instructions, it is not unlikely that they are close to
the end of the basic block, in which case cc_op would not be overwritten

* anyway the cost is probably dwarfed by that of computing flags.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-25 13:28:01 +02:00
Paolo Bonzini
a0625efd4d target/i386: cpu_load_eflags already sets cc_op
No need to set it again at the end of the translation block, cc_op_dirty
can be set to false.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-25 13:28:01 +02:00
Paolo Bonzini
f6ac77eab6 target/i386: remove unnecessary gen_update_cc_op before gen_eob*
This is already handled in gen_eob().  Before adding another DISAS_*
case, remove the double calls.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-25 13:28:01 +02:00
Paolo Bonzini
69d7281262 target/i386: cleanup eob handling of RSM
gen_helper_rsm cannot generate an exception, and reloads the flags.
So there's no need to spill cc_op and update cpu_eip, but on the
other hand cc_op must be reset to CC_OP_EFLAGS before returning.

It all works by chance, because by spilling cc_op before the call
to the helper, it becomes non-dirty and gen_eob will not overwrite
the CC_OP_EFLAGS value that is placed there by the helper.  But
let's clean it up.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-25 13:28:01 +02:00
Paolo Bonzini
f0f0136abb target/i386: no single-step exception after MOV or POP SS
Intel SDM 18.3.1.4 "If an occurrence of the MOV or POP instruction
loads the SS register executes with EFLAGS.TF = 1, no single-step debug
exception occurs following the MOV or POP instruction."

Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-25 13:27:54 +02:00
Paolo Bonzini
8225bff7c5 target/i386: disable jmp_opt if EFLAGS.RF is 1
If EFLAGS.RF is 1, special processing in gen_eob_worker() is needed and
therefore goto_tb cannot be used.

Suggested-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-25 10:00:12 +02:00
Paolo Bonzini
ec56891984 target/i386: clean up AAM/AAD
The 32-bit AAM/AAD opcodes are using helpers that read and write flags and
env->regs[R_EAX].  Clean them up so that the table correctly includes AX
as a 16-bit input and output.

No real reason to do it to be honest, but they are nice one-output helpers
and it removes the masking of env->regs[R_EAX] that generic load/writeback
code already does.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-ID: <20240522123912.608497-1-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-22 15:53:30 +02:00
Paolo Bonzini
d0414d71f6 target/i386: generate simpler code for ROL/ROR with immediate count
gen_rot_carry and gen_rot_overflow are meant to be called with count == NULL
if the count cannot be zero.  However this is not done in gen_ROL and gen_ROR,
and writing everywhere "can_be_zero ? count : NULL" is burdensome and less
readable.  Just pass can_be_zero as a separate argument.

gen_RCL and gen_RCR use a conditional branch to skip the computation
if count is zero, so they can pass false unconditionally to gen_rot_overflow.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-ID: <20240522123914.608516-1-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-22 15:53:30 +02:00
Richard Henderson
dfc7228be3 target/i386: Use translator_ldub for everything
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-05-15 08:55:19 +02:00
Richard Henderson
962a145cdc accel/tcg: Provide default implementation of disas_log
Almost all of the disas_log implementations are identical.
Unify them within translator_loop.

Drop extra Priv/Virt logging from target/riscv.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-05-15 08:55:18 +02:00
Paolo Bonzini
3fabbe0b7d target/i386: move prefetch and multi-byte UD/NOP to new decoder
These are trivial to add, and moving them to the new decoder fixes some
corner cases: raising #UD instead of an instruction fetch page fault for
the undefined opcodes, and incorrectly rejecting 0F 18 prefetches with
register operands (which are treated as reserved NOPs).

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-10 15:45:14 +02:00
Paolo Bonzini
40a3ec7b5f target/i386: rdpkru/wrpkru are no-prefix instructions
Reject 0x66/0xf3/0xf2 in front of them.

Cc: qemu-stable@nongnu.org
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-10 15:45:14 +02:00
Paolo Bonzini
41c685dc59 target/i386: fix operand size for DATA16 REX.W POPCNT
According to the manual, 32-bit vs 64-bit is governed by REX.W
and REX ignores the 0x66 prefix.  This can be confirmed with this
program:

    #include <stdio.h>
    int main()
    {
       int x = 0x12340000;
       int y;
       asm("popcntl %1, %0" : "=r" (y) : "r" (x)); printf("%x\n", y);
       asm("mov $-1, %0; .byte 0x66; popcntl %1, %0" : "+r" (y) : "r" (x)); printf("%x\n", y);
       asm("mov $-1, %0; .byte 0x66; popcntq %q1, %q0" : "+r" (y) : "r" (x)); printf("%x\n", y);
    }

which prints 5/ffff0000/5 on real hardware and 5/ffff0000/ffff0000
on QEMU.

Cc: qemu-stable@nongnu.org
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-10 15:45:14 +02:00
Paolo Bonzini
9f07e47a5e target/i386: remove PCOMMIT from TCG, deprecate property
The PCOMMIT instruction was never included in any physical processor.
TCG implements it as a no-op instruction, but its utility is debatable
to say the least.  Drop it from the decoder since it is only available
with "-cpu max", which does not guarantee migration compatibility
across versions, and deprecate the property just in case someone is
using it as "pcommit=off".

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-10 15:45:14 +02:00
Ruihan Li
97ffb29998 target/i386: Give IRQs a chance when resetting HF_INHIBIT_IRQ_MASK
When emulated with QEMU, interrupts will never come in the following
loop. However, if the NOP instruction is uncommented, interrupts will
fire as normal.

	loop:
		cli
    		call do_sti
		jmp loop

	do_sti:
		sti
		# nop
		ret

This behavior is different from that of a real processor. For example,
if KVM is enabled, interrupts will always fire regardless of whether the
NOP instruction is commented or not. Also, the Intel Software Developer
Manual states that after the STI instruction is executed, the interrupt
inhibit should end as soon as the next instruction (e.g., the RET
instruction if the NOP instruction is commented) is executed.

This problem is caused because the previous code may choose not to end
the TB even if the HF_INHIBIT_IRQ_MASK has just been reset (e.g., in the
case where the STI instruction is immediately followed by the RET
instruction), so that IRQs may not have a change to trigger. This commit
fixes the problem by always terminating the current TB to give IRQs a
chance to trigger when HF_INHIBIT_IRQ_MASK is reset.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Ruihan Li <lrh2000@pku.edu.cn>
Message-ID: <20240415064518.4951-4-lrh2000@pku.edu.cn>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 6a5a63f74ba5c5355b7a8468d3d814bfffe928fb)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2024-05-09 16:35:02 +03:00
Paolo Bonzini
d4e6d40c36 target/i386: remove duplicate prefix decoding
Now that a bulk of opcodes go through the new decoder, it is sensible
to do some cleanup.  Go immediately through disas_insn_new and only jump
back after parsing the prefixes.

disas_insn() now only contains the three sigsetjmp cases, and they
are more easily managed if they are inlined into i386_tr_translate_insn.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-07 08:53:26 +02:00
Paolo Bonzini
ef309ec2a6 target/i386: split legacy decoder into a separate function
Split the bits that have some duplication with disas_insn_new, from
those that should be the main topic of the conversion.  This is the
first step towards removing duplicate decoding of prefixes between
disas_insn and disas_insn_new.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-07 08:53:26 +02:00
Paolo Bonzini
7795b455dd target/i386: decode x87 instructions in a separate function
These are unlikely to be converted to the table-based decoding
soon (perhaps there could be generic ESC decoding in decode-new.c.inc
for the Mod/RM byte, but not operand decoding), so keep them separate
from the remaining legacy-decoded instructions.

Acked-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-07 08:53:26 +02:00
Paolo Bonzini
aef4f4affd target/i386: remove now-converted opcodes from old decoder
Send all converted opcodes to disas_insn_new() directly from the big
decoding switch statement; once more, the debugging/bisecting logic
disappears.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-07 08:53:26 +02:00
Paolo Bonzini
3519b813e1 target/i386: port extensions of one-byte opcodes to new decoder
A few two-byte opcodes are simple extensions of existing one-byte opcodes;
they are easy to decode and need no change to emit.c.inc.  Port them to
the new decoder.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-07 08:53:26 +02:00
Paolo Bonzini
37861fa519 target/i386: move BSWAP to new decoder
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-07 08:53:26 +02:00
Paolo Bonzini
2b8046f361 target/i386: move remaining conditional operations to new decoder
Move long-displacement Jcc, SETcc and CMOVcc to the new decoder.
While filling in the tables makes the code seem longer, the new
emitters are all just one line of code.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-07 08:53:26 +02:00
Paolo Bonzini
40c4b92f1c target/i386: merge and enlarge a few ranges for call to disas_insn_new
Since new opcodes are not going to be added in translate.c, round the
case labels that call to disas_insn_new(), including whole sets of
eight opcodes when possible.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-07 08:53:26 +02:00
Paolo Bonzini
d7c41a60d0 target/i386: move C0-FF opcodes to new decoder (except for x87)
The shift instructions are rewritten instead of reusing code from the old
decoder.  Rotates use CC_OP_ADCOX more extensively and generally rely
more on the optimizer, so that the code generators are shared between
the immediate-count and variable-count cases.

In particular, this makes gen_RCL and gen_RCR pretty efficient for the
count == 1 case, which becomes (apart from a few extra movs) something like:

  (compute_cc_all if needed)
  // save old value for OF calculation
  mov     cc_src2, T0
  // the bulk of RCL is just this!
  deposit T0, cc_src, T0, 1, TARGET_LONG_BITS - 1
  // compute carry
  shr     cc_dst, cc_src2, length - 1
  and     cc_dst, cc_dst, 1
  // compute overflow
  xor     cc_src2, cc_src2, T0
  extract cc_src2, cc_src2, length - 1, 1

32-bit MUL and IMUL are also slightly more efficient on 64-bit hosts.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-07 08:53:26 +02:00
Paolo Bonzini
b603136402 target/i386: generalize gen_movl_seg_T0
In the new decoder it is sometimes easier to put the segment
in T1 instead of T0, usually because another operand was loaded
by common code in T0.  Genrealize gen_movl_seg_T0 to allow
using any source.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-07 08:53:26 +02:00
Paolo Bonzini
5e9e21bcc4 target/i386: move 60-BF opcodes to new decoder
Compared to the old decoder, the main differences in translation
are for the little-used ARPL instruction.  IMUL is adjusted a bit
to share more code to produce flags, but is otherwise very similar.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-07 08:53:14 +02:00
Paolo Bonzini
2666fbd271 target/i386: allow instructions with more than one immediate
While keeping decode->immediate for convenience and for 4-operand instructions,
store the immediate in X86DecodedOp as well.  This enables instructions
with more than one immediate such as ENTER.  It can also be used for far
calls and jumps.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-07 08:52:19 +02:00
Paolo Bonzini
442e38c4fb target/i386: extract gen_far_call/jmp, reordering temporaries
Extract the code into new functions, and swap T0/T1 so that T0 corresponds
to the first immediate in the instruction stream.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-07 08:52:19 +02:00