-
Notifications
You must be signed in to change notification settings - Fork 0
Accumulator instructions #77
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
target/ppc/insn32.decode
Outdated
@@ -133,6 +133,10 @@ | |||
%xx_xt 0:1 21:5 | |||
%xx_xb 1:1 11:5 | |||
%xx_xa 2:1 16:5 | |||
|
|||
&X_a ra | |||
@X_a ...... ra:3 -- ..... ..... .......... . &X_a |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think there shouldn't be any --
here. It should be in the instructions line later in the file
@X_a ...... ra:3 .. ..... ..... .......... . &X_a
...
XXMFACC 011111 ...-- 00000 ----- 0010110001 - @X_a
} \ | ||
} | ||
|
||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could remove this extra blank line here
target/ppc/fpu_helper.c
Outdated
excp_ptr = &env->fp_status; | ||
excp_flags = get_float_exception_flags(excp_ptr); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would suggest to not use excp_ptr
and just do get_float_exception_flags(&env->fp_status);
target/ppc/fpu_helper.c
Outdated
if (unlikely(excp_flags & float_flag_inexact)) { | ||
fpscr_flags |= FP_FX; | ||
fpscr_flags |= FP_XX; | ||
fpscr_flags |= (fpscr & FP_XE) ? FP_FEX : 0; | ||
} | ||
if (unlikely(excp_flags & float_flag_invalid)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mferst and I had a discussion yesterday on be2e904#r68851624 about not setting inexact when the operation (that case a conversion) is invalid. Do you think this is also the case here?
Also, I think the inexact case is not 'unlikely' as it tends to happen very frequently
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a bit different here. The description of these MMA instructions says that
For every multiply-add operation that is performed as part of the execution of this instruction, the operation is performed as if all exception enable bits are zero, and the trap-disabled result is returned. [...] Exception status is accumulated and the appropriate exception status bits in the FPSCR are updated at the completion of execution of the instruction.
IIUC, some multiply-add operation may set inexact and other may set invalid, and both bits would be set in FPSCR.
However, considering that softfpu accumulates float_exception_flags
(see float_raise
implementation in include/fpu/softfloat.h
), I wonder if we need this method at all. Shouldn't it be enough to move the call to helper_reset_fpstatus
outside the loop and call do_float_check_status
at the end of the helper?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
However, considering that softfpu accumulates float_exception_flags (see float_raise implementation in include/fpu/softfloat.h), I wonder if we need this method at all. Shouldn't it be enough to move the call to helper_reset_fpstatus outside the loop and call do_float_check_status at the end of the helper?
I don't think calling do_float_check_status
is enough because it doesn't deal with invalid
exceptions now (only overflow/underflow and inexact).
But I think moving the call outside the loop can still be done creating a temporary float_status
instead of a temporary fpscr
. (might not even be necessary, would need to check if any of the float32_*
relies on the status value being cleared from the previous iteration)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think calling
do_float_check_status
is enough because it doesn't deal withinvalid
exceptions now (only overflow/underflow and inexact).
Oh, right, we'd need to call something like float_invalid_op_madd
too.
But I think moving the call outside the loop can still be done creating a temporary
float_status
instead of a temporaryfpscr
. (might not even be necessary, would need to check if any of thefloat32_*
relies on the status value being cleared from the previous iteration)
AFAICT, the only place in softfpu that checks float_exception_flags
is the hardfloat test can_use_fpu
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought it also checked more, but having a quick look at the code it seems only float32_is_quiet_nan
and float32_is_signaling_nan
checks float_exception_flags
, I'll move the check outside the loop
target/ppc/fpu_helper.c
Outdated
bool acc, neg_mul, neg_acc; \ | ||
xmsk = mask & 0x0F; \ | ||
ymsk = (mask >> 4) & 0x0F; \ | ||
ymax = MIN(4, 128/(sizeof(TARGET_T) * 8)); \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Spaces around /
(there's a lot of checkpatch issues on this patch, might want to run it and address them)
target/ppc/insn32.decode
Outdated
&XX2 xt xb uim:uint8_t | ||
@XX2 ...... ..... ... uim:2 ..... ......... .. &XX2 xt=%xx_xt xb=%xx_xb | ||
|
||
&XX3 xt xa xb | ||
@XX3 ...... ..... ..... ..... ........ ... &XX3 xt=%xx_xt xa=%xx_xa xb=%xx_xb | ||
|
||
%xx_at 23:3 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
align this line with the dots in the line below
target/ppc/fpu_helper.c
Outdated
psum = float32_zero; \ | ||
va = !(pmsk & 0x2) ? float32_zero : \ | ||
GET_VSR(Vsr##OR_EL, a, \ | ||
2*i, ORIG_T, float32); \ | ||
vb = !(pmsk & 0x2) ? float32_zero : \ | ||
GET_VSR(Vsr##OR_EL, b, \ | ||
2*j, ORIG_T, float32); \ | ||
vc = !(pmsk & 0x1) ? float32_zero : \ | ||
GET_VSR(Vsr##OR_EL, a, \ | ||
2*i + 1, ORIG_T, float32); \ | ||
vd = !(pmsk & 0x1) ? float32_zero : \ | ||
GET_VSR(Vsr##OR_EL, b, \ | ||
2*j + 1, ORIG_T, float32); \ | ||
psum = float32_mul(va, vb, excp_ptr); \ | ||
psum = float32_muladd(vc, vd, psum,0, excp_ptr); \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like psum = float32_zero;
has no effect here
target/ppc/fpu_helper.c
Outdated
} \ | ||
} \ | ||
env->fpscr |= fpscr; \ | ||
helper_fpscr_check_status(env); \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you shouldn't call helper_*
from inside another helper (like here). @mferst explained it in another PR: #76 (comment)
I'm trying to tackle this situation in my exceptions rework in #76 but it's not really usable right now. As I see that patch is a mix between your compute_fp_flags()
and helper_fpscr_check_status()
.
Some options could be wait/help to make my rework work here too;
or create a do_fpscr_check_status
that receive a raddr
(like do_float_check_status()
), move helper_fpscr_check_status()
logic to it and make helper_fpscr_check_status()
call do_fpscr_check_status()
target/ppc/fpu_helper.c
Outdated
j++, ymsk_bit >>= 1) { \ | ||
if ((xmsk_bit & xmsk) && (ymsk_bit & ymsk)) { \ | ||
helper_reset_fpstatus(env); \ | ||
psum = TYPE##_zero; \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this line could be removed too
target/ppc/fpu_helper.c
Outdated
if (acc) { \ | ||
psum = TYPE##_muladd(va, vb, aux_acc, \ | ||
op_flags, excp_ptr); \ | ||
} else { \ | ||
psum = TYPE##_mul(va, vb, excp_ptr); \ | ||
} \ | ||
\ | ||
at->Vsr##EL(j) = psum; \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you use a ternary operator you could even remove psum
from this function
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the objective is removing the psum you don't even need a ternary operator, psum is there mostly to imitate the ISA's pseudocode, but I'll change to a ternary
9eb65b2
to
b7c0382
Compare
target/ppc/translate/vsx-impl.c.inc
Outdated
set_cpu_vsr(a->ra * 4 + i, tcg_constant_i64(0), false); | ||
set_cpu_vsr(a->ra * 4 + i, tcg_constant_i64(0), true); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might be better if you save tcg_constant_i64(0)
in a variable.
target/ppc/translate/vsx-impl.c.inc
Outdated
return true; | ||
} | ||
|
||
op_flag = op; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not use op
directly?
target/ppc/translate/vsx-impl.c.inc
Outdated
TRANS(XVI4GER8 , do_ger_XX3, GER_NOP, gen_helper_XVI4GER8) | ||
TRANS(XVI4GER8PP , do_ger_XX3, GER_PP , gen_helper_XVI4GER8) | ||
TRANS(XVI8GER4 , do_ger_XX3, GER_NOP, gen_helper_XVI8GER4) | ||
TRANS(XVI8GER4PP , do_ger_XX3, GER_PP , gen_helper_XVI8GER4) | ||
TRANS(XVI8GER4SPP , do_ger_XX3, GER_SPP, gen_helper_XVI8GER4) | ||
TRANS(XVI16GER2 , do_ger_XX3, GER_NOP, gen_helper_XVI16GER2) | ||
TRANS(XVI16GER2PP , do_ger_XX3, GER_PP , gen_helper_XVI16GER2) | ||
TRANS(XVI16GER2S , do_ger_XX3, GER_SAT, gen_helper_XVI16GER2) | ||
TRANS(XVI16GER2SPP, do_ger_XX3, GER_SPP, gen_helper_XVI16GER2) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TRANS(XVI4GER8 , do_ger_XX3, GER_NOP, gen_helper_XVI4GER8) | |
TRANS(XVI4GER8PP , do_ger_XX3, GER_PP , gen_helper_XVI4GER8) | |
TRANS(XVI8GER4 , do_ger_XX3, GER_NOP, gen_helper_XVI8GER4) | |
TRANS(XVI8GER4PP , do_ger_XX3, GER_PP , gen_helper_XVI8GER4) | |
TRANS(XVI8GER4SPP , do_ger_XX3, GER_SPP, gen_helper_XVI8GER4) | |
TRANS(XVI16GER2 , do_ger_XX3, GER_NOP, gen_helper_XVI16GER2) | |
TRANS(XVI16GER2PP , do_ger_XX3, GER_PP , gen_helper_XVI16GER2) | |
TRANS(XVI16GER2S , do_ger_XX3, GER_SAT, gen_helper_XVI16GER2) | |
TRANS(XVI16GER2SPP, do_ger_XX3, GER_SPP, gen_helper_XVI16GER2) | |
TRANS(XVI4GER8, do_ger_XX3, GER_NOP, gen_helper_XVI4GER8) | |
TRANS(XVI4GER8PP, do_ger_XX3, GER_PP, gen_helper_XVI4GER8) | |
TRANS(XVI8GER4, do_ger_XX3, GER_NOP, gen_helper_XVI8GER4) | |
TRANS(XVI8GER4PP, do_ger_XX3, GER_PP, gen_helper_XVI8GER4) | |
TRANS(XVI8GER4SPP, do_ger_XX3, GER_SPP, gen_helper_XVI8GER4) | |
TRANS(XVI16GER2, do_ger_XX3, GER_NOP, gen_helper_XVI16GER2) | |
TRANS(XVI16GER2PP, do_ger_XX3, GER_PP, gen_helper_XVI16GER2) | |
TRANS(XVI16GER2S, do_ger_XX3, GER_SAT, gen_helper_XVI16GER2) | |
TRANS(XVI16GER2SPP, do_ger_XX3, GER_SPP, gen_helper_XVI16GER2) |
target/ppc/translate/vsx-impl.c.inc
Outdated
TRANS64(PMXVI4GER8 , do_ger_MMIRR_XX3, GER_NOP, gen_helper_XVI4GER8) | ||
TRANS64(PMXVI4GER8PP , do_ger_MMIRR_XX3, GER_PP , gen_helper_XVI4GER8) | ||
TRANS64(PMXVI8GER4 , do_ger_MMIRR_XX3, GER_NOP, gen_helper_XVI8GER4) | ||
TRANS64(PMXVI8GER4PP , do_ger_MMIRR_XX3, GER_PP , gen_helper_XVI8GER4) | ||
TRANS64(PMXVI8GER4SPP , do_ger_MMIRR_XX3, GER_SPP, gen_helper_XVI8GER4) | ||
TRANS64(PMXVI16GER2 , do_ger_MMIRR_XX3, GER_NOP, gen_helper_XVI16GER2) | ||
TRANS64(PMXVI16GER2PP , do_ger_MMIRR_XX3, GER_PP , gen_helper_XVI16GER2) | ||
TRANS64(PMXVI16GER2S , do_ger_MMIRR_XX3, GER_SAT, gen_helper_XVI16GER2) | ||
TRANS64(PMXVI16GER2SPP, do_ger_MMIRR_XX3, GER_SPP, gen_helper_XVI16GER2) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TRANS64(PMXVI4GER8 , do_ger_MMIRR_XX3, GER_NOP, gen_helper_XVI4GER8) | |
TRANS64(PMXVI4GER8PP , do_ger_MMIRR_XX3, GER_PP , gen_helper_XVI4GER8) | |
TRANS64(PMXVI8GER4 , do_ger_MMIRR_XX3, GER_NOP, gen_helper_XVI8GER4) | |
TRANS64(PMXVI8GER4PP , do_ger_MMIRR_XX3, GER_PP , gen_helper_XVI8GER4) | |
TRANS64(PMXVI8GER4SPP , do_ger_MMIRR_XX3, GER_SPP, gen_helper_XVI8GER4) | |
TRANS64(PMXVI16GER2 , do_ger_MMIRR_XX3, GER_NOP, gen_helper_XVI16GER2) | |
TRANS64(PMXVI16GER2PP , do_ger_MMIRR_XX3, GER_PP , gen_helper_XVI16GER2) | |
TRANS64(PMXVI16GER2S , do_ger_MMIRR_XX3, GER_SAT, gen_helper_XVI16GER2) | |
TRANS64(PMXVI16GER2SPP, do_ger_MMIRR_XX3, GER_SPP, gen_helper_XVI16GER2) | |
TRANS64(PMXVI4GER8, do_ger_MMIRR_XX3, GER_NOP, gen_helper_XVI4GER8) | |
TRANS64(PMXVI4GER8PP, do_ger_MMIRR_XX3, GER_PP, gen_helper_XVI4GER8) | |
TRANS64(PMXVI8GER4, do_ger_MMIRR_XX3, GER_NOP, gen_helper_XVI8GER4) | |
TRANS64(PMXVI8GER4PP, do_ger_MMIRR_XX3, GER_PP, gen_helper_XVI8GER4) | |
TRANS64(PMXVI8GER4SPP, do_ger_MMIRR_XX3, GER_SPP, gen_helper_XVI8GER4) | |
TRANS64(PMXVI16GER2, do_ger_MMIRR_XX3, GER_NOP, gen_helper_XVI16GER2) | |
TRANS64(PMXVI16GER2PP, do_ger_MMIRR_XX3, GER_PP, gen_helper_XVI16GER2) | |
TRANS64(PMXVI16GER2S, do_ger_MMIRR_XX3, GER_SAT, gen_helper_XVI16GER2) | |
TRANS64(PMXVI16GER2SPP, do_ger_MMIRR_XX3, GER_SPP, gen_helper_XVI16GER2) |
target/ppc/fpu_helper.c
Outdated
/* | ||
* This function should be changed before sending the patches, | ||
* as it's here as a placeholder since the functions that set | ||
* FPSCR also raise an exception. | ||
*/ | ||
static inline void compute_fp_flags(CPUPPCState *env, uintptr_t retaddr) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this comment needs an update? This method is also raising the exception at the end.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe It'd be good to rewrite it to make it clearer, but the problem is that other functions would raise an exception before setting all FPSCR bits (and possibly wrongfully unsetting FR/FI), so this function seeks to avoid that
target/ppc/insn32.decode
Outdated
@@ -143,8 +143,9 @@ | |||
&XX3 xt xa xb | |||
@XX3 ...... ..... ..... ..... ........ ... &XX3 xt=%xx_xt xa=%xx_xa xb=%xx_xb | |||
|
|||
%xx_at 23:3 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This fix belong to the second commit ("target/ppc: Implemented xviger instructions")
target/ppc/translate/vsx-impl.c.inc
Outdated
TRANS(XVF16GER2, do_ger_XX3, GER_NOP, gen_helper_XVF16GER2) | ||
TRANS(XVF16GER2PP, do_ger_XX3, GER_PP , gen_helper_XVF16GER2) | ||
TRANS(XVF16GER2PN, do_ger_XX3, GER_PN , gen_helper_XVF16GER2) | ||
TRANS(XVF16GER2NP, do_ger_XX3, GER_NP , gen_helper_XVF16GER2) | ||
TRANS(XVF16GER2NN, do_ger_XX3, GER_NN , gen_helper_XVF16GER2) | ||
|
||
TRANS(XVF32GER, do_ger_XX3, GER_NOP, gen_helper_XVF32GER) | ||
TRANS(XVF32GERPP, do_ger_XX3, GER_PP , gen_helper_XVF32GER) | ||
TRANS(XVF32GERPN, do_ger_XX3, GER_PN , gen_helper_XVF32GER) | ||
TRANS(XVF32GERNP, do_ger_XX3, GER_NP , gen_helper_XVF32GER) | ||
TRANS(XVF32GERNN, do_ger_XX3, GER_NN , gen_helper_XVF32GER) | ||
|
||
TRANS(XVF64GER, do_ger_XX3, GER_NOP, gen_helper_XVF64GER) | ||
TRANS(XVF64GERPP, do_ger_XX3, GER_PP , gen_helper_XVF64GER) | ||
TRANS(XVF64GERPN, do_ger_XX3, GER_PN , gen_helper_XVF64GER) | ||
TRANS(XVF64GERNP, do_ger_XX3, GER_NP , gen_helper_XVF64GER) | ||
TRANS(XVF64GERNN, do_ger_XX3, GER_NN , gen_helper_XVF64GER) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TRANS(XVF16GER2, do_ger_XX3, GER_NOP, gen_helper_XVF16GER2) | |
TRANS(XVF16GER2PP, do_ger_XX3, GER_PP , gen_helper_XVF16GER2) | |
TRANS(XVF16GER2PN, do_ger_XX3, GER_PN , gen_helper_XVF16GER2) | |
TRANS(XVF16GER2NP, do_ger_XX3, GER_NP , gen_helper_XVF16GER2) | |
TRANS(XVF16GER2NN, do_ger_XX3, GER_NN , gen_helper_XVF16GER2) | |
TRANS(XVF32GER, do_ger_XX3, GER_NOP, gen_helper_XVF32GER) | |
TRANS(XVF32GERPP, do_ger_XX3, GER_PP , gen_helper_XVF32GER) | |
TRANS(XVF32GERPN, do_ger_XX3, GER_PN , gen_helper_XVF32GER) | |
TRANS(XVF32GERNP, do_ger_XX3, GER_NP , gen_helper_XVF32GER) | |
TRANS(XVF32GERNN, do_ger_XX3, GER_NN , gen_helper_XVF32GER) | |
TRANS(XVF64GER, do_ger_XX3, GER_NOP, gen_helper_XVF64GER) | |
TRANS(XVF64GERPP, do_ger_XX3, GER_PP , gen_helper_XVF64GER) | |
TRANS(XVF64GERPN, do_ger_XX3, GER_PN , gen_helper_XVF64GER) | |
TRANS(XVF64GERNP, do_ger_XX3, GER_NP , gen_helper_XVF64GER) | |
TRANS(XVF64GERNN, do_ger_XX3, GER_NN , gen_helper_XVF64GER) | |
TRANS(XVF16GER2, do_ger_XX3, GER_NOP, gen_helper_XVF16GER2) | |
TRANS(XVF16GER2PP, do_ger_XX3, GER_PP, gen_helper_XVF16GER2) | |
TRANS(XVF16GER2PN, do_ger_XX3, GER_PN, gen_helper_XVF16GER2) | |
TRANS(XVF16GER2NP, do_ger_XX3, GER_NP, gen_helper_XVF16GER2) | |
TRANS(XVF16GER2NN, do_ger_XX3, GER_NN, gen_helper_XVF16GER2) | |
TRANS(XVF32GER, do_ger_XX3, GER_NOP, gen_helper_XVF32GER) | |
TRANS(XVF32GERPP, do_ger_XX3, GER_PP, gen_helper_XVF32GER) | |
TRANS(XVF32GERPN, do_ger_XX3, GER_PN, gen_helper_XVF32GER) | |
TRANS(XVF32GERNP, do_ger_XX3, GER_NP, gen_helper_XVF32GER) | |
TRANS(XVF32GERNN, do_ger_XX3, GER_NN, gen_helper_XVF32GER) | |
TRANS(XVF64GER, do_ger_XX3, GER_NOP, gen_helper_XVF64GER) | |
TRANS(XVF64GERPP, do_ger_XX3, GER_PP, gen_helper_XVF64GER) | |
TRANS(XVF64GERPN, do_ger_XX3, GER_PN, gen_helper_XVF64GER) | |
TRANS(XVF64GERNP, do_ger_XX3, GER_NP, gen_helper_XVF64GER) | |
TRANS(XVF64GERNN, do_ger_XX3, GER_NN, gen_helper_XVF64GER) |
target/ppc/translate/vsx-impl.c.inc
Outdated
TRANS64(PMXVF16GER2, do_ger_MMIRR_XX3, GER_NOP, gen_helper_XVF16GER2) | ||
TRANS64(PMXVF16GER2PP, do_ger_MMIRR_XX3, GER_PP , gen_helper_XVF16GER2) | ||
TRANS64(PMXVF16GER2PN, do_ger_MMIRR_XX3, GER_PN , gen_helper_XVF16GER2) | ||
TRANS64(PMXVF16GER2NP, do_ger_MMIRR_XX3, GER_NP , gen_helper_XVF16GER2) | ||
TRANS64(PMXVF16GER2NN, do_ger_MMIRR_XX3, GER_NN , gen_helper_XVF16GER2) | ||
|
||
TRANS64(PMXVF32GER, do_ger_MMIRR_XX3_NO_PMSK, GER_NOP, gen_helper_XVF32GER) | ||
TRANS64(PMXVF32GERPP, do_ger_MMIRR_XX3_NO_PMSK, GER_PP , gen_helper_XVF32GER) | ||
TRANS64(PMXVF32GERPN, do_ger_MMIRR_XX3_NO_PMSK, GER_PN , gen_helper_XVF32GER) | ||
TRANS64(PMXVF32GERNP, do_ger_MMIRR_XX3_NO_PMSK, GER_NP , gen_helper_XVF32GER) | ||
TRANS64(PMXVF32GERNN, do_ger_MMIRR_XX3_NO_PMSK, GER_NN , gen_helper_XVF32GER) | ||
|
||
TRANS64(PMXVF64GER, do_ger_MMIRR_XX3_NO_PMSK, GER_NOP, gen_helper_XVF64GER) | ||
TRANS64(PMXVF64GERPP, do_ger_MMIRR_XX3_NO_PMSK, GER_PP , gen_helper_XVF64GER) | ||
TRANS64(PMXVF64GERPN, do_ger_MMIRR_XX3_NO_PMSK, GER_PN , gen_helper_XVF64GER) | ||
TRANS64(PMXVF64GERNP, do_ger_MMIRR_XX3_NO_PMSK, GER_NP , gen_helper_XVF64GER) | ||
TRANS64(PMXVF64GERNN, do_ger_MMIRR_XX3_NO_PMSK, GER_NN , gen_helper_XVF64GER) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TRANS64(PMXVF16GER2, do_ger_MMIRR_XX3, GER_NOP, gen_helper_XVF16GER2) | |
TRANS64(PMXVF16GER2PP, do_ger_MMIRR_XX3, GER_PP , gen_helper_XVF16GER2) | |
TRANS64(PMXVF16GER2PN, do_ger_MMIRR_XX3, GER_PN , gen_helper_XVF16GER2) | |
TRANS64(PMXVF16GER2NP, do_ger_MMIRR_XX3, GER_NP , gen_helper_XVF16GER2) | |
TRANS64(PMXVF16GER2NN, do_ger_MMIRR_XX3, GER_NN , gen_helper_XVF16GER2) | |
TRANS64(PMXVF32GER, do_ger_MMIRR_XX3_NO_PMSK, GER_NOP, gen_helper_XVF32GER) | |
TRANS64(PMXVF32GERPP, do_ger_MMIRR_XX3_NO_PMSK, GER_PP , gen_helper_XVF32GER) | |
TRANS64(PMXVF32GERPN, do_ger_MMIRR_XX3_NO_PMSK, GER_PN , gen_helper_XVF32GER) | |
TRANS64(PMXVF32GERNP, do_ger_MMIRR_XX3_NO_PMSK, GER_NP , gen_helper_XVF32GER) | |
TRANS64(PMXVF32GERNN, do_ger_MMIRR_XX3_NO_PMSK, GER_NN , gen_helper_XVF32GER) | |
TRANS64(PMXVF64GER, do_ger_MMIRR_XX3_NO_PMSK, GER_NOP, gen_helper_XVF64GER) | |
TRANS64(PMXVF64GERPP, do_ger_MMIRR_XX3_NO_PMSK, GER_PP , gen_helper_XVF64GER) | |
TRANS64(PMXVF64GERPN, do_ger_MMIRR_XX3_NO_PMSK, GER_PN , gen_helper_XVF64GER) | |
TRANS64(PMXVF64GERNP, do_ger_MMIRR_XX3_NO_PMSK, GER_NP , gen_helper_XVF64GER) | |
TRANS64(PMXVF64GERNN, do_ger_MMIRR_XX3_NO_PMSK, GER_NN , gen_helper_XVF64GER) | |
TRANS64(PMXVF16GER2, do_ger_MMIRR_XX3, GER_NOP, gen_helper_XVF16GER2) | |
TRANS64(PMXVF16GER2PP, do_ger_MMIRR_XX3, GER_PP, gen_helper_XVF16GER2) | |
TRANS64(PMXVF16GER2PN, do_ger_MMIRR_XX3, GER_PN, gen_helper_XVF16GER2) | |
TRANS64(PMXVF16GER2NP, do_ger_MMIRR_XX3, GER_NP, gen_helper_XVF16GER2) | |
TRANS64(PMXVF16GER2NN, do_ger_MMIRR_XX3, GER_NN, gen_helper_XVF16GER2) | |
TRANS64(PMXVF32GER, do_ger_MMIRR_XX3_NO_PMSK, GER_NOP, gen_helper_XVF32GER) | |
TRANS64(PMXVF32GERPP, do_ger_MMIRR_XX3_NO_PMSK, GER_PP, gen_helper_XVF32GER) | |
TRANS64(PMXVF32GERPN, do_ger_MMIRR_XX3_NO_PMSK, GER_PN, gen_helper_XVF32GER) | |
TRANS64(PMXVF32GERNP, do_ger_MMIRR_XX3_NO_PMSK, GER_NP, gen_helper_XVF32GER) | |
TRANS64(PMXVF32GERNN, do_ger_MMIRR_XX3_NO_PMSK, GER_NN, gen_helper_XVF32GER) | |
TRANS64(PMXVF64GER, do_ger_MMIRR_XX3_NO_PMSK, GER_NOP, gen_helper_XVF64GER) | |
TRANS64(PMXVF64GERPP, do_ger_MMIRR_XX3_NO_PMSK, GER_PP, gen_helper_XVF64GER) | |
TRANS64(PMXVF64GERPN, do_ger_MMIRR_XX3_NO_PMSK, GER_PN, gen_helper_XVF64GER) | |
TRANS64(PMXVF64GERNP, do_ger_MMIRR_XX3_NO_PMSK, GER_NP, gen_helper_XVF64GER) | |
TRANS64(PMXVF64GERNN, do_ger_MMIRR_XX3_NO_PMSK, GER_NN, gen_helper_XVF64GER) |
44bd674
to
0b52213
Compare
target/ppc/insn32.decode
Outdated
@@ -155,6 +155,9 @@ | |||
&XX2 xt xb | |||
@XX2 ...... ..... ..... ..... ......... .. &XX2 xt=%xx_xt xb=%xx_xb | |||
|
|||
&X_a ra | |||
@X_a ...... ra:3 .. ..... ..... .......... . &X_a |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remove extra space before &X_a
|
||
## Vector GER instruction | ||
|
||
XVI4GER8 111011 ... -- ..... ..... 00100011 ..- @XX3_at xa=%xx_xa |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting, I didn't know this xa=%xx_xa
worked here
target/ppc/translate/vsx-impl.c.inc
Outdated
|
||
mask = ger_pack_masks(a->pmsk, a->ymsk, a->xmsk); | ||
helper(cpu_env, tcg_constant_i32(a->xa), tcg_constant_i32(a->xb), | ||
tcg_constant_i32(a->xt * 4), tcg_constant_i32(mask), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't have a strong opinion on this, but would this be the case to use times_4
from translate.c
? Does it even work with prefixed instructions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, the new version has times_4 for comparison
56b8efe
to
87df5e7
Compare
87df5e7
to
2986290
Compare
According to Power ISA, the FI bit in FPSCR is non-sticky. This means that if an instruction is said to modify the FI bit, then it should be set or cleared depending on the result of the instruction. Otherwise, it should be kept as was before. However, the following inconsistency was found when comparing results from the hardware (tested on both a Power 9 processor and in Power 10 Mambo): (FI bit is set before the execution of the instruction) Hardware: xscmpeqdp(0xff..ff, 0xff..ff) = FI: SET -> SET QEMU: xscmpeqdp(0xff..ff, 0xff..ff) = FI: SET -> CLEARED As the FI bit is non-sticky, and xscmpeqdp does not list it as a field that is changed by the instruction, it should not be changed after its execution. This is happening to multiple instructions in the vsx implementations. If the ISA does not list the FI bit as altered for a particular instruction, then it should be kept as it was before the instruction. QEMU is not following this behavior. Affected instructions include: - xv* (all vsx-vector instructions); - xscmp*, xsmax*, xsmin*; - xstdivdp and similars; (to identify the affected instructions, just search in the ISA for the instructions that does not list FI in "Special Registers Altered") Most instructions use the function do_float_check_status() to commit changes in the inexact flag. So the fix is to add a parameter to it that will control if the bit FI should be changed or not. All users of do_float_check_status() are then modified to provide this argument, controlling if that specific instruction changes bit FI or not. Some macro helpers are responsible for both instructions that change and instructions that aren't suposed to change FI. This seems to always overlap with the sfprf flag. So, reuse this flag for this purpose when applicable. Signed-off-by: Víctor Colombo <[email protected]> Reviewed-by: Richard Henderson <[email protected]>
This patch fixes another not-so-clear situation in Power ISA regarding the inexact bits in FPSCR. The ISA states that: """ When Overflow Exception is disabled (OE=0) and an Overflow Exception occurs, the following actions are taken: ... 2. Inexact Exception is set XX <- 1 ... FI is set to 1 ... """ However, when tested on a Power 9 hardware, some instructions that trigger an OX don't set the FI bit: xvcvdpsp(0x4050533fcdb7b95ff8d561c40bf90996) = FI: CLEARED -> CLEARED xvnmsubmsp(0xf3c0c1fc8f3230, 0xbeaab9c5) = FI: CLEARED -> CLEARED (just a few examples. Other instructions are also affected) The root cause for this seems to be that only instructions that list the bit FI in the "Special Registers Altered" should modify it. QEMU is, today, not working like the hardware: xvcvdpsp(0x4050533fcdb7b95ff8d561c40bf90996) = FI: CLEARED -> SET xvnmsubmsp(0xf3c0c1fc8f3230, 0xbeaab9c5) = FI: CLEARED -> SET (all tests assume FI is cleared beforehand) Fix this by making float_overflow_excp() return float_flag_inexact if it should update the inexact flags. Signed-off-by: Víctor Colombo <[email protected]> Reviewed-by: Richard Henderson <[email protected]> Reviewed-by: Rashmica Gupta <[email protected]>
The bit FI fix used the sfprf flag as a flag for the set_fi parameter in do_float_check_status where applicable. Now, this patch rename this flag to sfifprf to state this dual usage. Signed-off-by: Víctor Colombo <[email protected]> Reviewed-by: Richard Henderson <[email protected]> Reviewed-by: Rashmica Gupta <[email protected]>
Implement the following PowerISA v3.1 instructions: xxmfacc: VSX Move From Accumulator xxmtacc: VSX Move To Accumulator xxsetaccz: VSX Set Accumulator to Zero The PowerISA 3.1 mentions that for the current version of the architecture, "the hardware implementation provides the effect of ACC[i] and VSRs 4*i to 4*i + 3 logically containing the same data" and "The Accumulators introduce no new logical state at this time" (page 501). For now it seems unnecessary to create new structures, so this patch just uses ACC[i] as VSRs 4*i to 4*i+3 and therefore move to and from accumulators are no-ops. Signed-off-by: Lucas Mateus Castro (alqotel) <[email protected]> Reviewed-by: Richard Henderson <[email protected]>
Implement the following PowerISA v3.1 instructions: xvi4ger8: VSX Vector 8-bit Signed/Unsigned Integer GER (rank-4 update) xvi4ger8pp: VSX Vector 8-bit Signed/Unsigned Integer GER (rank-4 update) Positive multiply, Positive accumulate xvi8ger4: VSX Vector 4-bit Signed Integer GER (rank-8 update) xvi8ger4pp: VSX Vector 4-bit Signed Integer GER (rank-8 update) Positive multiply, Positive accumulate xvi8ger4spp: VSX Vector 8-bit Signed/Unsigned Integer GER (rank-4 update) with Saturate Positive multiply, Positive accumulate xvi16ger2: VSX Vector 16-bit Signed Integer GER (rank-2 update) xvi16ger2pp: VSX Vector 16-bit Signed Integer GER (rank-2 update) Positive multiply, Positive accumulate xvi16ger2s: VSX Vector 16-bit Signed Integer GER (rank-2 update) with Saturation xvi16ger2spp: VSX Vector 16-bit Signed Integer GER (rank-2 update) with Saturation Positive multiply, Positive accumulate Signed-off-by: Lucas Mateus Castro (alqotel) <[email protected]> Reviewed-by: Richard Henderson <[email protected]>
Implement the following PowerISA v3.1 instructions: pmxvi4ger8: Prefixed Masked VSX Vector 8-bit Signed/Unsigned Integer GER (rank-4 update) pmxvi4ger8pp: Prefixed Masked VSX Vector 8-bit Signed/Unsigned Integer GER (rank-4 update) Positive multiply, Positive accumulate pmxvi8ger4: Prefixed Masked VSX Vector 4-bit Signed Integer GER (rank-8 update) pmxvi8ger4pp: Prefixed Masked VSX Vector 4-bit Signed Integer GER (rank-8 update) Positive multiply, Positive accumulate pmxvi8ger4spp: Prefixed Masked VSX Vector 8-bit Signed/Unsigned Integer GER (rank-4 update) with Saturate Positive multiply, Positive accumulate pmxvi16ger2: Prefixed Masked VSX Vector 16-bit Signed Integer GER (rank-2 update) pmxvi16ger2pp: Prefixed Masked VSX Vector 16-bit Signed Integer GER (rank-2 update) Positive multiply, Positive accumulate pmxvi16ger2s: Prefixed Masked VSX Vector 16-bit Signed Integer GER (rank-2 update) with Saturation pmxvi16ger2spp: Prefixed Masked VSX Vector 16-bit Signed Integer GER (rank-2 update) with Saturation Positive multiply, Positive accumulate Signed-off-by: Lucas Mateus Castro (alqotel) <[email protected]> Reviewed-by: Richard Henderson <[email protected]>
Implement the following PowerISA v3.1 instructions: xvf32ger: VSX Vector 32-bit Floating-Point GER (rank-1 update) xvf32gernn: VSX Vector 32-bit Floating-Point GER (rank-1 update) Negative multiply, Negative accumulate xvf32gernp: VSX Vector 32-bit Floating-Point GER (rank-1 update) Negative multiply, Positive accumulate xvf32gerpn: VSX Vector 32-bit Floating-Point GER (rank-1 update) Positive multiply, Negative accumulate xvf32gerpp: VSX Vector 32-bit Floating-Point GER (rank-1 update) Positive multiply, Positive accumulate xvf64ger: VSX Vector 64-bit Floating-Point GER (rank-1 update) xvf64gernn: VSX Vector 64-bit Floating-Point GER (rank-1 update) Negative multiply, Negative accumulate xvf64gernp: VSX Vector 64-bit Floating-Point GER (rank-1 update) Negative multiply, Positive accumulate xvf64gerpn: VSX Vector 64-bit Floating-Point GER (rank-1 update) Positive multiply, Negative accumulate xvf64gerpp: VSX Vector 64-bit Floating-Point GER (rank-1 update) Positive multiply, Positive accumulate Signed-off-by: Lucas Mateus Castro (alqotel) <[email protected]> Reviewed-by: Richard Henderson <[email protected]>
Implement the following PowerISA v3.1 instructions: xvf16ger2: VSX Vector 16-bit Floating-Point GER (rank-2 update) xvf16ger2nn: VSX Vector 16-bit Floating-Point GER (rank-2 update) Negative multiply, Negative accumulate xvf16ger2np: VSX Vector 16-bit Floating-Point GER (rank-2 update) Negative multiply, Positive accumulate xvf16ger2pn: VSX Vector 16-bit Floating-Point GER (rank-2 update) Positive multiply, Negative accumulate xvf16ger2pp: VSX Vector 16-bit Floating-Point GER (rank-2 update) Positive multiply, Positive accumulate Signed-off-by: Lucas Mateus Castro (alqotel) <[email protected]>
Implement the following PowerISA v3.1 instructions: pmxvf16ger2: Prefixed Masked VSX Vector 16-bit Floating-Point GER (rank-2 update) pmxvf16ger2nn: Prefixed Masked VSX Vector 16-bit Floating-Point GER (rank-2 update) Negative multiply, Negative accumulate pmxvf16ger2np: Prefixed Masked VSX Vector 16-bit Floating-Point GER (rank-2 update) Negative multiply, Positive accumulate pmxvf16ger2pn: Prefixed Masked VSX Vector 16-bit Floating-Point GER (rank-2 update) Positive multiply, Negative accumulate pmxvf16ger2pp: Prefixed Masked VSX Vector 16-bit Floating-Point GER (rank-2 update) Positive multiply, Positive accumulate pmxvf32ger: Prefixed Masked VSX Vector 32-bit Floating-Point GER (rank-1 update) pmxvf32gernn: Prefixed Masked VSX Vector 32-bit Floating-Point GER (rank-1 update) Negative multiply, Negative accumulate pmxvf32gernp: Prefixed Masked VSX Vector 32-bit Floating-Point GER (rank-1 update) Negative multiply, Positive accumulate pmxvf32gerpn: Prefixed Masked VSX Vector 32-bit Floating-Point GER (rank-1 update) Positive multiply, Negative accumulate pmxvf32gerpp: Prefixed Masked VSX Vector 32-bit Floating-Point GER (rank-1 update) Positive multiply, Positive accumulate pmxvf64ger: Prefixed Masked VSX Vector 64-bit Floating-Point GER (rank-1 update) pmxvf64gernn: Prefixed Masked VSX Vector 64-bit Floating-Point GER (rank-1 update) Negative multiply, Negative accumulate pmxvf64gernp: Prefixed Masked VSX Vector 64-bit Floating-Point GER (rank-1 update) Negative multiply, Positive accumulate pmxvf64gerpn: Prefixed Masked VSX Vector 64-bit Floating-Point GER (rank-1 update) Positive multiply, Negative accumulate pmxvf64gerpp: Prefixed Masked VSX Vector 64-bit Floating-Point GER (rank-1 update) Positive multiply, Positive accumulate Signed-off-by: Lucas Mateus Castro (alqotel) <[email protected]> Reviewed-by: Richard Henderson <[email protected]>
Implement the following PowerISA v3.1 instructions: xvbf16ger2: VSX Vector bfloat16 GER (rank-2 update) xvbf16ger2nn: VSX Vector bfloat16 GER (rank-2 update) Negative multiply, Negative accumulate xvbf16ger2np: VSX Vector bfloat16 GER (rank-2 update) Negative multiply, Positive accumulate xvbf16ger2pn: VSX Vector bfloat16 GER (rank-2 update) Positive multiply, Negative accumulate xvbf16ger2pp: VSX Vector bfloat16 GER (rank-2 update) Positive multiply, Positive accumulate pmxvbf16ger2: Prefixed Masked VSX Vector bfloat16 GER (rank-2 update) pmxvbf16ger2nn: Prefixed Masked VSX Vector bfloat16 GER (rank-2 update) Negative multiply, Negative accumulate pmxvbf16ger2np: Prefixed Masked VSX Vector bfloat16 GER (rank-2 update) Negative multiply, Positive accumulate pmxvbf16ger2pn: Prefixed Masked VSX Vector bfloat16 GER (rank-2 update) Positive multiply, Negative accumulate pmxvbf16ger2pp: Prefixed Masked VSX Vector bfloat16 GER (rank-2 update) Positive multiply, Positive accumulate Signed-off-by: Lucas Mateus Castro (alqotel) <[email protected]> Reviewed-by: Richard Henderson <[email protected]>
These are new hwcap bits added for power10. Signed-off-by: Joel Stanley <[email protected]> Signed-off-by: Lucas Mateus Castro (alqotel) <[email protected]> Reviewed-by: Richard Henderson <[email protected]>
2986290
to
6f23817
Compare
No description provided.