LLVM MCA claims Skylake can issue 6 instructions per clock #82908

Open
@SeeSpring

.att_syntax
.loop:
	nop
	nop
	nop
	nop
	nop
	nop
	nop
	nop
	nop
	nop
	nop
	nop
	nop
	nop
	nop
	nop
	incq	%rax
	cmpq	$999, %rax
	jne	.loop

Godbolt

uiCA predicts 4.50 cycles per iteration due to an issue bottleneck; llvm-mca claims 3.2.
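For reference, here is a rough sketch of the arithmetic behind the "6 instructions per clock" in the title. The instruction count comes from the loop above (16 `nop` plus `incq`, `cmpq`, `jne`). The fused-uop count assumes that `cmpq` and `jne` macro-fuse into a single uop and that every other instruction is one uop. That is an assumption for illustration, not something either tool reported here.

```python
# Instruction counts taken from the loop body in this report.
NOPS = 16
OTHER = 3  # incq, cmpq, jne

instructions = NOPS + OTHER  # 19 instructions per iteration

# Assumption: cmpq + jne macro-fuse into one uop, so the fused-domain
# uop count is one less than the instruction count.
fused_uops = instructions - 1  # 18


def per_cycle(count, cycles_per_iteration):
    """Average number of items retired per cycle for one loop iteration."""
    return count / cycles_per_iteration


mca_rate = per_cycle(instructions, 3.2)   # llvm-mca's reported cycles/iteration
uica_rate = per_cycle(fused_uops, 4.50)   # uiCA's reported cycles/iteration

print(f"llvm-mca implies {mca_rate:.2f} instructions/cycle")
print(f"uiCA implies {uica_rate:.2f} fused uops/cycle")
```

Under those assumptions, the llvm-mca figure works out to roughly 6 instructions per cycle, while the uiCA figure works out to 4 fused uops per cycle.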

According to Agner Fog, on Skylake:

The maximum throughput from the decoders is four instructions or five μops per clock cycle

There is a similar problem with Ice Lake. Again from Agner:

The maximum throughput is improved to five instructions per clock cycle, where Skylake has four.
