Skip to content

Parse cpu a76 core as a55 result in serious performance problem #195

Open
@xiaotongnii

Description

@xiaotongnii

Platform:android_arm64-android13
SOC: cortex A 4a55 and 4a76
Cpuinfo:core_siblings_list:0-7
HW:design 4a55 and 4a76 as a cluster on SOC and configure dtsi cpu 8core as a cluster.

Problem:Use benchmark tools test CPU performance,find cpuinfo parser cpu a76 core as a55 result in serious performance problem on Arm CPU

The cpuinfo call the main function as the following:
https://github.com/pytorch/cpuinfo/blob/76d5e8f5b563daa65340a60fce0e9aec73a936df/src/arm/linux/init.c#L34C37-L34C37

Use processors[0].package_leader_id update processors[sibling].package_leader_id(siblings value form 0 to 7).
Result in processors[i].package_leader_id value is 0.(i form 0 to 7)

//init.c:34
static bool cluster_siblings_parser(
	uint32_t processor, uint32_t siblings_start, uint32_t siblings_end,
	struct cpuinfo_arm_linux_processor* processors)
{
	for(int i = 0; i < 8; i++)
	   {cpuinfo_log_debug("init.c.39.processors[i].package_leader_id:%d, ",processors[i].package_leader_id);}
	processors[processor].flags |= CPUINFO_LINUX_FLAG_PACKAGE_CLUSTER;
	uint32_t package_leader_id = processors[processor].package_leader_id;
    cpuinfo_log_debug("File: %s, Function: %s, Line: %d, processor:%d, package_leader_id:%d,processors[processor].package_leader_id:%d,siblings_start:%d, siblings_end:%d\n", __FILE__, __func__, __LINE__,processor, package_leader_id,processors[processor].package_leader_id,siblings_start,siblings_end);

	for (uint32_t sibling = siblings_start; sibling < siblings_end; sibling++) {
		if (!bitmask_all(processors[sibling].flags, CPUINFO_LINUX_FLAG_VALID)) {
			cpuinfo_log_info("invalid processor %"PRIu32" reported as a sibling for processor %"PRIu32,
				sibling, processor);
			continue;
		}

		const uint32_t sibling_package_leader_id = processors[sibling].package_leader_id;
		if (sibling_package_leader_id < package_leader_id) {
			package_leader_id = sibling_package_leader_id;
		}
		// use processors[0].package_leader_id update processors[sibling].package_leader_id.
               //  processors[sibling].package_leader_id value is 0.
		processors[sibling].package_leader_id = package_leader_id;
		processors[sibling].flags |= CPUINFO_LINUX_FLAG_PACKAGE_CLUSTER;
		cpuinfo_log_debug("File: %s, Function: %s, Line: %d, sibling:%d, sibling_package_leader_id:%d, package_leader_id:%d\n", __FILE__, __func__, __LINE__,sibling,sibling_package_leader_id,package_leader_id);

	}

	processors[processor].package_leader_id = package_leader_id;

	return true;
}

Due to processors[processor].package_leader_id (processor from 0 to 7) is 0, cluster_leader value is 0 all the time.

if (cluster_leader == i) {

	/* Initialize core vendor, uarch, MIDR, and frequency for every logical processor */
	for (uint32_t i = 0; i < arm_linux_processors_count; i++) {
		if (bitmask_all(arm_linux_processors[i].flags, CPUINFO_LINUX_FLAG_VALID)) {
			const uint32_t cluster_leader = arm_linux_processors[i].package_leader_id;
			if (cluster_leader == i) {
				/* Cluster leader: decode core vendor and uarch */
				cpuinfo_arm_decode_vendor_uarch(
				arm_linux_processors[cluster_leader].midr,
#if CPUINFO_ARCH_ARM
				!!(arm_linux_processors[cluster_leader].features & CPUINFO_ARM_LINUX_FEATURE_VFPV4),
#endif
				&arm_linux_processors[cluster_leader].vendor,
				&arm_linux_processors[cluster_leader].uarch);
			} else {
				/* Cluster non-leader: copy vendor, uarch, MIDR, and frequency from cluster leader */
				arm_linux_processors[i].flags |= arm_linux_processors[cluster_leader].flags &
					(CPUINFO_ARM_LINUX_VALID_MIDR | CPUINFO_LINUX_FLAG_MAX_FREQUENCY);
				arm_linux_processors[i].midr = arm_linux_processors[cluster_leader].midr;
				arm_linux_processors[i].vendor = arm_linux_processors[cluster_leader].vendor;
				arm_linux_processors[i].uarch = arm_linux_processors[cluster_leader].uarch;
				arm_linux_processors[i].max_frequency = arm_linux_processors[cluster_leader].max_frequency;
			}
		}
	}

arm_linux_processors[i].uarch = arm_linux_processors[cluster_leader].uarch;

arm_linux_processors[i].uarch = arm_linux_processors[cluster_leader].uarch;

Use arm_linux_processors[0].uarch update arm_linux_processors[i].uarch(i from o0 to 7),Reulst in

arm_linux_processors[i].uarch is the same as update arm_linux_processors[0].uarch which is a55(0x00300355 in the cpuinfo.h)

cpuinfo_uarch_cortex_a55 = 0x00300355,

Please give me some advices.
BR.
Xiaotong

Metadata

Metadata

Assignees

No one assigned

    Labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions