summaryrefslogtreecommitdiffstats
path: root/UefiCpuPkg
Commit message (Collapse)AuthorAgeFilesLines
* UefiCpuPkg: PiSmmCpuDxeSmm: Check buffer size before accessingKun Qin2021-04-122-2/+9
| | | | | | | | | | | | | | | | | | | | | REF: https://bugzilla.tianocore.org/show_bug.cgi?id=3283 Current SMM Save State routine does not check the number of bytes to be read, when it comse to read IO_INFO, before casting the incoming buffer to EFI_SMM_SAVE_STATE_IO_INFO. This could potentially cause memory corruption due to extra bytes are written out of buffer boundary. This change adds a width check before copying IoInfo into output buffer. Cc: Eric Dong <eric.dong@intel.com> Cc: Ray Ni <ray.ni@intel.com> Cc: Laszlo Ersek <lersek@redhat.com> Cc: Rahul Kumar <rahul1.kumar@intel.com> Signed-off-by: Kun Qin <kuqin12@gmail.com> Reviewed-by: Ray Ni <ray.ni@intel.com> Reviewed-by: Laszlo Ersek <lersek@redhat.com> Message-Id: <20210406195254.1018-2-kuqin12@gmail.com>
* UefiCpuPkg/CpuTimerLib: Update LIBRARY_CLASS of Base instance.Lou, Yun2021-04-121-2/+2
| | | | | | | | | | | | | REF: https://bugzilla.tianocore.org/show_bug.cgi?id=2832 Update LIBRARY_CLASS of BaseCpuTimerLib to remove the usage limitation, otherwise the Base instance cannot be used in some types of modules. Signed-off-by: Jason Lou <yun.lou@intel.com> Reviewed-by: Ray Ni <ray.ni@intel.com> Cc: Eric Dong <eric.dong@intel.com> Cc: Laszlo Ersek <lersek@redhat.com> Cc: Rahul Kumar <rahul1.kumar@intel.com>
* UefiCpuPkg/PiSmmCpuDxeSmm: Support detect SMM shadow stack overflowSheng, W2021-04-091-1/+8
| | | | | | | | | | | | | | Use SMM stack guard feature to detect SMM shadow stack overflow. REF: https://bugzilla.tianocore.org/show_bug.cgi?id=3280 Signed-off-by: Sheng Wei <w.sheng@intel.com> Cc: Eric Dong <eric.dong@intel.com> Reviewed-by: Ray Ni <ray.ni@intel.com> Cc: Laszlo Ersek <lersek@redhat.com> Cc: Rahul Kumar <rahul1.kumar@intel.com> Reviewed-by: Jiewen Yao <jiewen.yao@intel.com> Cc: Roger Feng <roger.feng@intel.com>
* UefiCpuPkg/MpInitLib: Consume MicrocodeLib to remove duplicated codeRay Ni2021-04-094-391/+96
| | | | | | | Signed-off-by: Ray Ni <ray.ni@intel.com> Reviewed-by: Eric Dong <eric.dong@intel.com> Acked-by: Laszlo Ersek <lersek@redhat.com> Cc: Rahul Kumar <rahul1.kumar@intel.com>
* UefiCpuPkg: Add MicrocodeLib for loading microcodeRay Ni2021-04-095-1/+480
| | | | | | | | REF: https://bugzilla.tianocore.org/show_bug.cgi?id=3303 Signed-off-by: Ray Ni <ray.ni@intel.com> Reviewed-by: Eric Dong <eric.dong@intel.com> Acked-by: Laszlo Ersek <lersek@redhat.com> Cc: Rahul Kumar <rahul1.kumar@intel.com>
* UefiCpuPkg: Remove PEI/DXE instances of CpuTimerLib.Jason Lou2021-04-097-252/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | REF: https://bugzilla.tianocore.org/show_bug.cgi?id=2832 1. Remove PEI instance(PeiCpuTimerLib). PeiCpuTimerLib is currently designed to save time by getting CPU TSC frequency from Hob. BaseCpuTimerLib is designed to calculate TSC frequency by using CPUID[15h] each time. The time it takes to find CpuCrystalFrequencyHob (about 2000ns) is much longer than it takes to calculate TSC frequency with CPUID[15h] (about 450ns), which means using BaseCpuTimerLib to trigger a delay is more accurate than using PeiCpuTimerLib, recommend to use BaseCpuTimerLib instead of PeiCpuTimerLib. 2. Remove DXE instance(DxeCpuTimerLib). DxeCpuTimerLib is designed to calculate TSC frequency with CPUID[15h] in its constructor function, then save it in a global variable. For this design, once the driver containing this instance is running, this constructor function is called, it will take extra time to calculate TSC frequency. The time it takes to get TSC frequency from global variable is shorter than it takes to calculate TSC frequency with CPUID[15h], but 450ns is a short time, the impact on the platform is very limited. In addition, in order to simplify the code, recommend to use BaseCpuTimerLib instead of DxeCpuTimerLib. I did some experiments on one server platform and collected following data: 1. Average time required to find CpuCrystalFrequencyHob: about 2000 ns. 2. Average time required to find the last Hob: about 2700 ns. 2. Average time required to calculate TSC frequency: about 450 ns. Reference code: // // Calculate average time required to find Hob. // DEBUG((DEBUG_ERROR, "[PeiCpuTimerLib] GetPerformanceCounterFrequency - GetFirstGuidHob (1000 cycles)\n")); Ticks1 = AsmReadTsc(); for (i = 0; i < 1000; i++) { GuidHob = GetFirstGuidHob (&mCpuCrystalFrequencyHobGuid); } Ticks2 = AsmReadTsc(); if (GuidHob == NULL) { DEBUG((DEBUG_ERROR, "[PeiCpuTimerLib] - CpuCrystalFrequencyHob can not be found!\n")); } else { DEBUG((DEBUG_ERROR, "[PeiCpuTimerLib] - Average time required to find Hob = %d ns\n", \ DivU64x32(DivU64x64Remainder(MultU64x32((Ticks2 - Ticks1), 1000000000), *CpuCrystalCounterFrequency, NULL), 1000))); } // // Calculate average time required to calculate CPU frequency. // DEBUG((DEBUG_ERROR, "[PeiCpuTimerLib] GetPerformanceCounterFrequency - CpuidCoreClockCalculateTscFrequency (1000 cycles)\n")); Ticks1 = AsmReadTsc(); for (i = 0; i < 1000; i++) { Freq = CpuidCoreClockCalculateTscFrequency (); } Ticks2 = AsmReadTsc(); DEBUG((DEBUG_ERROR, "[PeiCpuTimerLib] - Average time required to calculate TSC frequency = %d ns\n", \ DivU64x32(DivU64x64Remainder(MultU64x32((Ticks2 - Ticks1), 1000000000), *CpuCrystalCounterFrequency, NULL), 1000))); Signed-off-by: Jason Lou <yun.lou@intel.com> Reviewed-by: Ray Ni <ray.ni@intel.com> Cc: Eric Dong <eric.dong@intel.com> Cc: Laszlo Ersek <lersek@redhat.com> Cc: Rahul Kumar <rahul1.kumar@intel.com>
* UefiCpuPkg: Consume MdeLibs.dsc.inc for RegisterFilterLibDandan Bi2021-03-311-1/+3
| | | | | | | | | | | | | | | | | | | | REF: https://bugzilla.tianocore.org/show_bug.cgi?id=3246 MdeLibs.dsc.inc was added for some basic/default library instances provided by MdePkg and RegisterFilterLibNull Library was also added into it as the first version of MdeLibs.dsc.inc. So update platform dsc to consume MdeLibs.dsc.inc for RegisterFilterLibNull which will be consumed by IoLib and BaseLib. Cc: Eric Dong <eric.dong@intel.com> Cc: Ray Ni <ray.ni@intel.com> Cc: Laszlo Ersek <lersek@redhat.com> Cc: Rahul Kumar <rahul1.kumar@intel.com> Signed-off-by: Dandan Bi <dandan.bi@intel.com> Reviewed-by: Michael D Kinney <michael.d.kinney@intel.com> Reviewed-by: Liming Gao <liming.gao@intel.com> Acked-by: Ard Biesheuvel <ardb@kernel.org>
* UefiCpuPkg/SmmCommunication: Remove out-dated commentsNi, Ray2021-03-251-11/+5
| | | | | | | | | | | | | | | | The comments in PiSmmCommunicationPei.c describe the whole memory layout of the SMRAM regarding the SMM communication. But SHA-1: 8b1d14939053b63d80355465649c50f9f391a64a PiSmmCommunicationSmm: Deprecate SMM Communication ACPI Table removed the code that produces the ACPI Table. This change updates the accordingly comments. Signed-off-by: Ray Ni <ray.ni@intel.com> Reviewed-by: Eric Dong <eric.dong@intel.com> Acked-by: Laszlo Ersek <lersek@redhat.com> Cc: Rahul Kumar <rahul1.kumar@intel.com>
* UefiCpuPkg/CpuDxe: Guarantee GDT is below 4GBRay Ni2021-03-181-5/+16
| | | | | | | | | | | | | | | | | | | | | | REF: https://bugzilla.tianocore.org/show_bug.cgi?id=3233 GDT needs to be allocated below 4GB in 64bit environment because AP needs it for entering to protected mode. CPU running in big real mode cannot access above 4GB GDT. But CpuDxe driver contains below code: gdt = AllocateRuntimePool (sizeof (GdtTemplate) + 8); ..... gdtPtr.Base = (UINT32)(UINTN)(VOID*) gdt; The AllocateRuntimePool() may allocate memory above 4GB. Thus, we cannot use AllocateRuntimePool (), instead, we should use AllocatePages() to make sure GDT is below 4GB space. Signed-off-by: Ray Ni <ray.ni@intel.com> Reviewed-by: Eric Dong <eric.dong@intel.com> Reviewed-by: Laszlo Ersek <lersek@redhat.com> Cc: Rahul Kumar <rahul1.kumar@intel.com>
* UefiCpuPkg/CpuDxe: Rename variables to follow EDKII coding standardRay Ni2021-03-181-12/+11
| | | | | | | | | The change doesn't impact any functionality. Signed-off-by: Ray Ni <ray.ni@intel.com> Reviewed-by: Eric Dong <eric.dong@intel.com> Reviewed-by: Laszlo Ersek <lersek@redhat.com> Cc: Rahul Kumar <rahul1.kumar@intel.com>
* UefiCpuPkg/MpInitLib: avoid printing debug messages in APRay Ni2021-03-173-10/+27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | MpInitLib contains a function MicrocodeDetect() which is called by all threads as an AP procedure. Today this function contains below code: if (CurrentRevision != LatestRevision) { AcquireSpinLock(&CpuMpData->MpLock); DEBUG (( EFI_D_ERROR, "Updated microcode signature [0x%08x] does not match \ loaded microcode signature [0x%08x]\n", CurrentRevision, LatestRevision )); ReleaseSpinLock(&CpuMpData->MpLock); } When the if-check is passed, the code may call into PEI services: 1. AcquireSpinLock When the PcdSpinTimeout is not 0, TimerLib GetPerformanceCounterProperties() is called. And some of the TimerLib implementations would get the information cached in HOB. But AP procedure cannot call PEI services to retrieve the HOB list. 2. DEBUG Certain DebugLib relies on ReportStatusCode services and the ReportStatusCode PPI is retrieved through the PEI services. DebugLibSerialPort should be used. But when SerialPortLib is implemented to depend on PEI services, even using DebugLibSerialPort can still cause AP calls PEI services resulting hang. It causes a lot of debugging effort on the platform side. There are 2 options to fix the problem: 1. make sure platform DSC chooses the proper DebugLib and set the PcdSpinTimeout to 0. So that AcquireSpinLock and DEBUG don't call PEI services. 2. remove the AcquireSpinLock and DEBUG call from the procedure. Option #2 is preferred because it's not practical to ask every platform DSC to be written properly. Following option #2, there are two sub-options: 2.A. Just remove the if-check. 2.B. Capture the CurrentRevision and ExpectedRevision in the memory for each AP and print them together from BSP. The patch follows option 2.B. Signed-off-by: Ray Ni <ray.ni@intel.com> Reviewed-by: Eric Dong <eric.dong@intel.com> Acked-by: Laszlo Ersek <lersek@redhat.com> Cc: Rahul Kumar <rahul1.kumar@intel.com>
* UefiCpuPkg/CpuCacheInfoLib: Collect cache associative typeLou, Yun2021-03-173-26/+53
| | | | | | | | | | | | | | REF: https://bugzilla.tianocore.org/show_bug.cgi?id=3265 Support collecting cache associative type in CpuCacheInfoLib. This prevents the user from using additional code to obtain the same information. Signed-off-by: Jason Lou <yun.lou@intel.com> Reviewed-by: Ray Ni <ray.ni@intel.com> Reviewed-by: Eric Dong <eric.dong@intel.com> Cc: Laszlo Ersek <lersek@redhat.com> Cc: Rahul Kumar <rahul1.kumar@intel.com>
* UefiCpuPkg/PiSmmCpu: Don't allocate Token for SmmStartupThisApRay Ni2021-03-111-7/+23
| | | | | | | | | | | | | | | | | | | | | | | | REF: https://bugzilla.tianocore.org/show_bug.cgi?id=3199 When Token points to mSmmStartupThisApToken, this routine is called from SmmStartupThisAp() in non-blocking mode due to PcdCpuSmmBlockStartupThisAp == FALSE. In this case, caller wants to startup AP procedure in non-blocking mode and cannot get the completion status from the Token because there is no way to return the Token to caller from SmmStartupThisAp(). Caller needs to use its specific way to query the completion status. There is no need to allocate a token for such case so the 3 overheads can be avoided: 1. Call AllocateTokenBuffer() when there is no free token. 2. Get a free token from the token buffer. 3. Call ReleaseToken() in APHandler(). Signed-off-by: Ray Ni <ray.ni@intel.com> Reviewed-by: Eric Dong <eric.dong@intel.com> Reviewed-by: Laszlo Ersek <lersek@redhat.com> Cc: Rahul Kumar <rahul1.kumar@intel.com>
* UefiCpuPkg/SmmCpuFeaturesLib: Add Standalone MM supportMichael Kubacki2021-03-089-6/+95
| | | | | | | | | | | | | | | | | | | | | | | | | REF:https://bugzilla.tianocore.org/show_bug.cgi?id=3218 Adds an INF for StandaloneMmCpuFeaturesLib, which supports building the SmmCpuFeaturesLib code for Standalone MM. Minimal code changes are made to allow reuse of existing code for Standalone MM. The original INF file names are left intact (continue to use SMM terminology) to retain backward compatibility with platforms that use those INFs. Similarly, the pre-existing C file names are unchanged to be consistent with the INF file names. Note that all references in library source files to PiSmm.h have been changed to PiMm.h for consistency. Cc: Eric Dong <eric.dong@intel.com> Cc: Ray Ni <ray.ni@intel.com> Cc: Laszlo Ersek <lersek@redhat.com> Cc: Rahul Kumar <rahul1.kumar@intel.com> Signed-off-by: Michael Kubacki <michael.kubacki@microsoft.com> Message-Id: <20210217213227.1277-6-mikuback@linux.microsoft.com> Reviewed-by: Eric Dong <eric.dong@intel.com> Reviewed-by: Laszlo Ersek <lersek@redhat.com>
* UefiCpuPkg/SmmCpuFeaturesLib: Abstract PcdCpuMaxLogicalProcessorNumberMichael Kubacki2021-03-085-1/+45
| | | | | | | | | | | | | | | | | | | | | REF:https://bugzilla.tianocore.org/show_bug.cgi?id=3218 Adds a new function called GetCpuMaxLogicalProcessorNumber() to return the number of maximum CPU logical processors (currently gUefiCpuPkgTokenSpaceGuid.PcdCpuMaxLogicalProcessorNumber). This allows the the mechanism used to retrieve the CPU maximum logical processor number to be abstracted from the logic that needs the value. Cc: Eric Dong <eric.dong@intel.com> Cc: Ray Ni <ray.ni@intel.com> Cc: Laszlo Ersek <lersek@redhat.com> Cc: Rahul Kumar <rahul1.kumar@intel.com> Signed-off-by: Michael Kubacki <michael.kubacki@microsoft.com> Reviewed-by: Laszlo Ersek <lersek@redhat.com> Message-Id: <20210217213227.1277-5-mikuback@linux.microsoft.com> Reviewed-by: Eric Dong <eric.dong@intel.com>
* UefiCpuPkg/SmmCpuFeaturesLib: Cleanup library constructorsMichael Kubacki2021-03-085-32/+54
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There's currently two library instances: 1. SmmCpuFeaturesLib 2. SmmCpuFeaturesLibStm There's two constructor functions: 1. SmmCpuFeaturesLibConstructor() 2. SmmCpuFeaturesLibStmConstructor() SmmCpuFeaturesLibConstructor() is called by SmmCpuFeaturesLibStmConstructor() since the functionality in that function is required by both library instances. The declaration for SmmCpuFeaturesLibConstructor() is embedded in "SmmStm.c" instead of being declared in a header file. Further, that constructor function is called by the STM specific constructor. This change moves the common code to a function called CpuFeaturesLibInitialization() which is declared in an internal library header file "CpuFeaturesLib.h". Each constructor simply calls this function to perform the common functionality. Additionally, SmmCpuFeaturesLibConstructor() is moved from SmmCpuFeaturesLibNoStm.c into a instance-specific file allowing SmmCpuFeaturesLibNoStm.c to contain no STM implementation agnostic to a particular library instance. Cc: Eric Dong <eric.dong@intel.com> Cc: Ray Ni <ray.ni@intel.com> Cc: Laszlo Ersek <lersek@redhat.com> Cc: Rahul Kumar <rahul1.kumar@intel.com> Signed-off-by: Michael Kubacki <michael.kubacki@microsoft.com> Reviewed-by: Laszlo Ersek <lersek@redhat.com> Message-Id: <20210217213227.1277-4-mikuback@linux.microsoft.com> Reviewed-by: Eric Dong <eric.dong@intel.com>
* UefiCpuPkg/SmmCpuFeaturesLib: Rename SmmCpuFeaturesLib.cMichael Kubacki2021-03-083-3/+3
| | | | | | | | | | | | | | | This change renames SmmCpuFeaturesLib.c to SmmCpuFeaturesLibCommon.c to better convey that this file contains library implementation common to all library instances. Cc: Eric Dong <eric.dong@intel.com> Cc: Ray Ni <ray.ni@intel.com> Cc: Laszlo Ersek <lersek@redhat.com> Cc: Rahul Kumar <rahul1.kumar@intel.com> Signed-off-by: Michael Kubacki <michael.kubacki@microsoft.com> Message-Id: <20210217213227.1277-3-mikuback@linux.microsoft.com> Reviewed-by: Eric Dong <eric.dong@intel.com> Reviewed-by: Laszlo Ersek <lersek@redhat.com>
* UefiCpuPkg/SmmCpuFeaturesLib: Move multi-instance function decl to headerMichael Kubacki2021-03-086-10/+27
| | | | | | | | | | | | | | | | | | | | | FinishSmmCpuFeaturesInitializeProcessor() is a multi-instance internal library function that is currently not declared in a header file but embedded in "SmmCpuFeaturesLib.c". This change cleans up the declaration moving it to a new header file "CpuFeaturesLib.h" and removing the local declaration in "SmmCpuFeaturesLib.c". Cc: Eric Dong <eric.dong@intel.com> Cc: Ray Ni <ray.ni@intel.com> Cc: Laszlo Ersek <lersek@redhat.com> Cc: Rahul Kumar <rahul1.kumar@intel.com> Signed-off-by: Michael Kubacki <michael.kubacki@microsoft.com> Reviewed-by: Laszlo Ersek <lersek@redhat.com> Message-Id: <20210217213227.1277-2-mikuback@linux.microsoft.com> Reviewed-by: Eric Dong <eric.dong@intel.com> [lersek@redhat.com: replace the guard macro "_CPU_FEATURES_LIB_H_" with "CPU_FEATURES_LIB_H_", for fixing ECC 8003, per commit 6ffbb3581ab7]
* UefiCpuPkg/MpInitLib: Remove unused Lock from MP_CPU_EXCHANGE_INFORay Ni2021-03-085-15/+1
| | | | | | | | | | The Lock is no longer needed since "LOCK XADD" was used in MpFuncs.nasm for ApIndex atomic increment. Signed-off-by: Ray Ni <ray.ni@intel.com> Reviewed-by: Eric Dong <eric.dong@intel.com> Acked-by: Laszlo Ersek <lersek@redhat.com> Cc: Rahul Kumar <rahul1.kumar@intel.com>
* UefiCpuPkg/MpInitLib: Use NASM struc to avoid hardcode offsetRay Ni2021-03-087-180/+193
| | | | | | | | | | In Windows environment, "dumpbin /disasm" is used to verify the disassembly before and after using NASM struc doesn't change. Signed-off-by: Ray Ni <ray.ni@intel.com> Reviewed-by: Eric Dong <eric.dong@intel.com> Acked-by: Laszlo Ersek <lersek@redhat.com> Cc: Rahul Kumar <rahul1.kumar@intel.com>
* UefiCpuPkg/PiSmmCpuDxeSmm: Fix SMM stack offset is not correctedk2-stable202102Sheng Wei2021-03-023-4/+8
| | | | | | | | | | | | | | | | | | | | | | | In function InitGdt(), SmiPFHandler() and Gen4GPageTable(), it uses CpuIndex * mSmmStackSize to get the SMM stack address offset for multi processor. It misses the SMM Shadow Stack Size. Each processor will use mSmmStackSize + mSmmShadowStackSize in the memory. It should use CpuIndex * (mSmmStackSize + mSmmShadowStackSize) to get this SMM stack address offset. If mSmmShadowStackSize > 0 and multi processor enabled, it will get the wrong offset value. CET shadow stack feature will set the value of mSmmShadowStackSize. REF: https://bugzilla.tianocore.org/show_bug.cgi?id=3237 Signed-off-by: Sheng Wei <w.sheng@intel.com> Cc: Eric Dong <eric.dong@intel.com> Cc: Ray Ni <ray.ni@intel.com> Cc: Laszlo Ersek <lersek@redhat.com> Cc: Rahul Kumar <rahul1.kumar@intel.com> Cc: Jiewen Yao <jiewen.yao@intel.com> Cc: Roger Feng <roger.feng@intel.com> Reviewed-by: Jiewen Yao <jiewen.yao@intel.com> Reviewed-by: Ray Ni <ray.ni@intel.com>
* UefiCpuPkg/CpuExceptionHandlerLib: Clear CET shadow stack token busy bitSheng Wei2021-03-027-3/+75
| | | | | | | | | | | | | | | | | | | | | | | | | If CET shadows stack feature enabled in SMM and stack switch is enabled. When code execute from SMM handler to SMM exception, CPU will check SMM exception shadow stack token busy bit if it is cleared or not. If it is set, it will trigger #DF exception. If it is not set, CPU will set the busy bit when enter SMM exception. So, the busy bit should be cleared when return back form SMM exception to SMM handler. Otherwise, keeping busy bit 1 will cause to trigger #DF exception when enter SMM exception next time. So, we use instruction SAVEPREVSSP, CLRSSBSY and RSTORSSP to clear the shadow stack token busy bit before RETF instruction in SMM exception. REF: https://bugzilla.tianocore.org/show_bug.cgi?id=3192 Signed-off-by: Sheng Wei <w.sheng@intel.com> Cc: Eric Dong <eric.dong@intel.com> Cc: Ray Ni <ray.ni@intel.com> Cc: Laszlo Ersek <lersek@redhat.com> Cc: Rahul Kumar <rahul1.kumar@intel.com> Cc: Jiewen Yao <jiewen.yao@intel.com> Cc: Roger Feng <roger.feng@intel.com> Reviewed-by: Jiewen Yao <jiewen.yao@intel.com> Reviewed-by: Ray Ni <ray.ni@intel.com>
* UefiCpuPkg/MpInitLib: Use XADD to avoid lock acquire/releaseRay Ni2021-02-262-26/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When AP firstly wakes up, MpFuncs.nasm contains below logic to assign an unique ApIndex to each AP according to who comes first: ---ASM--- TestLock: xchg [edi], eax cmp eax, NotVacantFlag jz TestLock mov ecx, esi add ecx, ApIndexLocation inc dword [ecx] mov ebx, [ecx] Releaselock: mov eax, VacantFlag xchg [edi], eax ---ASM END--- "lock inc" cannot be used to increase ApIndex because not only the global ApIndex should be increased, but also the result should be stored to a local general purpose register EBX. This patch learns from the NASM implementation of InternalSyncIncrement() to use "XADD" instruction which can increase the global ApIndex and store the original ApIndex to EBX in one instruction. With this patch, OVMF when running in a 255 threads QEMU spends about one second to wakeup all APs. Original implementation needs more than 10 seconds. Signed-off-by: Ray Ni <ray.ni@intel.com> Cc: Eric Dong <eric.dong@intel.com> Cc: Laszlo Ersek <lersek@redhat.com> Cc: Rahul Kumar <rahul1.kumar@intel.com> Reviewed-by: Michael D Kinney <michael.d.kinney@intel.com> Acked-by: Laszlo Ersek <lersek@redhat.com> Reviewed-by: Eric Dong <eric.dong@intel.com>
* UefiCpuPkg: Move MigrateGdt from DiscoverMemory to TempRamDone. (CVE-2019-11098)Guomin Jiang2021-02-045-46/+46
| | | | | | | | | | | | | | | | | | | | REF: https://bugzilla.tianocore.org/show_bug.cgi?id=1614 REF: https://bugzilla.tianocore.org/show_bug.cgi?id=3160 The GDT still in flash with commit 60b12e69fb1c8c7180fdda92f008248b9ec83db1 after TempRamDone So move the action to TempRamDone event to avoid reading GDT from flash. Signed-off-by: Guomin Jiang <guomin.jiang@intel.com> Cc: Eric Dong <eric.dong@intel.com> Cc: Ray Ni <ray.ni@intel.com> Cc: Laszlo Ersek <lersek@redhat.com> Cc: Rahul Kumar <rahul1.kumar@intel.com> Cc: Debkumar De <debkumar.de@intel.com> Cc: Harry Han <harry.han@intel.com> Cc: Catharine West <catharine.west@intel.com> Reviewed-by: Ray Ni <ray.ni@intel.com>
* UefiCpuPkg/CpuCacheInfoLib: Support no enabled AP case in DxeLibLou, Yun2021-02-031-0/+7
| | | | | | | | | | | | | | | REF: https://bugzilla.tianocore.org/show_bug.cgi?id=3195 Support system has no enabled AP case in DxeCpuCacheInfoLib. Otherwise, if the system only has 1 BSP without any enabled AP, UEFI POST hangs when invoking StartupAllAPs protocol because EFI_NOT_STARTED is returned. Signed-off-by: Jason Lou <yun.lou@intel.com> Reviewed-by: Ray Ni <ray.ni@intel.com> Cc: Eric Dong <eric.dong@intel.com> Cc: Laszlo Ersek <lersek@redhat.com> Cc: Rahul Kumar <rahul1.kumar@intel.com>
* UefiCpuPkg/CpuCacheInfoLib: Add MpService dependencyLou, Yun2021-02-034-10/+3
| | | | | | | | | | | | REF: https://bugzilla.tianocore.org/show_bug.cgi?id=3190 Add MpService dependency to enforce the executability of CpuCacheInfoLib. Signed-off-by: Jason Lou <yun.lou@intel.com> Reviewed-by: Ray Ni <ray.ni@intel.com> Cc: Eric Dong <eric.dong@intel.com> Cc: Laszlo Ersek <lersek@redhat.com> Cc: Rahul Kumar <rahul1.kumar@intel.com>
* UefiCpuPkg: SmmCpuExceptionHandlerLib: Added StandaloneMm module supportKun Qin2021-02-011-1/+1
| | | | | | | | | | | | | | | This change of SmmCpuExceptionHandlerLib adds support for StandaloneMm components to allow x64 StandaloneMm environment setting up exception handlers. Cc: Eric Dong <eric.dong@intel.com> Cc: Ray Ni <ray.ni@intel.com> Cc: Laszlo Ersek <lersek@redhat.com> Cc: Rahul Kumar <rahul1.kumar@intel.com> Signed-off-by: Kun Qin <kun.q@outlook.com> Reviewed-by: Laszlo Ersek <lersek@redhat.com> Reviewed-by: Ray Ni <ray.ni@intel.com>
* UefiCpuPkg: CpuIo2Smm: Support of CpuIo driver under StandaloneMmKun Qin2021-02-013-0/+82
| | | | | | | | | | | | | | This change adds a new CpuIo driver instance for MM_STANDALONE type. The new driver entrypoint is implemented in a separate file to match the interface definition of MM_STANDALONE modules. Cc: Eric Dong <eric.dong@intel.com> Cc: Ray Ni <ray.ni@intel.com> Cc: Laszlo Ersek <lersek@redhat.com> Cc: Rahul Kumar <rahul1.kumar@intel.com> Signed-off-by: Kun Qin <kun.q@outlook.com> Reviewed-by: Ray Ni <ray.ni@intel.com>
* UefiCpuPkg: CpuIo2Smm: Abstract SMM specific functions into separate fileKun Qin2021-02-014-381/+421
| | | | | | | | | | | | | | | This change abstracts CpuIo2Smm driver entrypoint into separate file and moves functions/definitions that are not substantially specific to Traditional MM (SMM) into CpuIo2Mm.* in order to set ways for Standalone MM support in the future. Cc: Eric Dong <eric.dong@intel.com> Cc: Ray Ni <ray.ni@intel.com> Cc: Laszlo Ersek <lersek@redhat.com> Cc: Rahul Kumar <rahul1.kumar@intel.com> Signed-off-by: Kun Qin <kun.q@outlook.com> Reviewed-by: Ray Ni <ray.ni@intel.com>
* UefiCpuPkg: CpuIo2Smm: Move CpuIo2Smm driver to consume gMmstKun Qin2021-02-014-5/+6
| | | | | | | | | | | | | | This change replaced gSmst with gMmst to support broader compatibility under MM environment for CpuIo2Smm driver. Cc: Eric Dong <eric.dong@intel.com> Cc: Ray Ni <ray.ni@intel.com> Cc: Laszlo Ersek <lersek@redhat.com> Cc: Rahul Kumar <rahul1.kumar@intel.com> Signed-off-by: Kun Qin <kun.q@outlook.com> Reviewed-by: Laszlo Ersek <lersek@redhat.com> Reviewed-by: Ray Ni <ray.ni@intel.com>
* UefiCpuPkg/MpInitLib: Don't increase CpuCount in ApWakeupFunctionRay Ni2021-01-291-11/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | REF: https://bugzilla.tianocore.org/show_bug.cgi?id=3179 When BSP first time wakes all APs, each AP atomically increases CpuMpData->CpuCount and CpuMpData->FinishedCount. Each AP atomically increases CpuMpData->NumApsExecuting in early assembly code and decreases it before it enters to HLT or MWAIT state. Putting them together, the 3 variables are changed in the following order: 1. NumApsExecuting++ // in assembly 2. CpuCpunt++ 4. FinishedCount++ 3. NumApsExecuting-- // in C BSP waits for a certain timeout and then polls NumApsExecuting until it drops to zero. It assumes all APs are waken up concurrently and NumApsExecuting only drops to zero when all APs have checked in. Then it additionally waits for FinishedCount == CpuCount - 1. (FinishedCount doesn't include BSP while CpuCount includes BSP.) There is no need to additionally wait for FinishedCount == CpuCount - 1 because when NumApsExecuting == 0, the number of increament of FinishedCount and CpuCount should equal. This patch simplifies the code to remove "CpuCount++" in ApWakeupFunction() and assigns FinishedCount + 1 to CpuCount after WakeUpAP(). Signed-off-by: Ray Ni <ray.ni@intel.com> Reviewed-by: Eric Dong <eric.dong@intel.com> Acked-by: Laszlo Ersek <lersek@redhat.com>
* MdePkg/Cpuid.h: Change and add some macro definitions.Lou, Yun2021-01-261-1/+1
| | | | | | | | | | | | | | | | REF: https://bugzilla.tianocore.org/show_bug.cgi?id=3105 Change and add some macro definitions about CPUID_HYBRID_INFORMATION Leaf(1Ah). Signed-off-by: Jason Lou <yun.lou@intel.com> Cc: Michael D Kinney <michael.d.kinney@intel.com> Reviewed-by: Liming Gao <gaoliming@byosoft.com.cn> Cc: Zhiguang Liu <zhiguang.liu@intel.com> Reviewed-by: Ray Ni <ray.ni@intel.com> Cc: Eric Dong <eric.dong@intel.com> Cc: Laszlo Ersek <lersek@redhat.com> Cc: Rahul Kumar <rahul1.kumar@intel.com>
* UefiCpuPkg/Library/MpInitLib: Fix AP VolatileRegisters race conditionMichael D Kinney2021-01-261-13/+18
| | | | | | | | | | | | | | | | | | | | | | | REF: https://bugzilla.tianocore.org/show_bug.cgi?id=3182 Fix the order of operations in ApWakeupFunction() when PcdCpuApLoopMode is set to HLT mode that uses INIT-SIPI-SIPI to wake APs. In this mode, volatile state is restored and saved each time a INIT-SIPI-SIPI is sent to an AP to request a function to be executed on the AP. When the function is completed the volatile state of the AP is saved. However, the counters NumApsExecuting and FinishedCount are updated before the volatile state is saved. This allows for a race condition window for the BSP that is waiting on these counters to request a new INIT-SIPI-SIPI before all the APs have completely saved their volatile state. The fix is to save the AP volatile state before updating the NumApsExecuting and FinishedCount counters. Cc: Eric Dong <eric.dong@intel.com> Reviewed-by: Ray Ni <ray.ni@intel.com> Reviewed-by: Laszlo Ersek <lersek@redhat.com> Cc: Rahul Kumar <rahul1.kumar@intel.com> Reviewed-by: Star Zeng <star.zeng@intel.com> Signed-off-by: Michael D Kinney <michael.d.kinney@intel.com>
* UefiCpuPkg/CpuMp: Fix hang when StackGuard is enabled in 16-core cpuRay Ni2021-01-223-9/+7
| | | | | | | | | | | | | | | | | | | REF: https://bugzilla.tianocore.org/show_bug.cgi?id=3167 When StackGuard is enabled, the CpuMp driver allocates known good stacks for all CPUs for DF# and PF# exceptions. It uses AllocatePool to do so. The size needed equals to 64KB = StackSize (2K) * ExceptionNumber (2) * NumberOfProcessors (16) However, AllocatePool max allocation size is less than 64K. To fix the issue, AllocatePages() is used. Signed-off-by: Ray Ni <ray.ni@intel.com> Reviewed-by: Eric Dong <eric.dong@intel.com> Reviewed-by: Laszlo Ersek <lersek@redhat.com> Cc: Rahul Kumar <rahul1.kumar@intel.com>
* UefiCpuPkg RegisterCpuFeaturesLib: NumberOfCpus may be uninitializedZeng, Star2021-01-211-0/+1
| | | | | | | | | | | | | | | | | | | | | | | NumberOfCpus local variable in GetAcpiCpuData will be uninitialized when CpuS3DataDxe runs before DxeRegisterCpuFeaturesLib (linked by CpuFeaturesDxe) because there is no code to initialize it at (AcpiCpuData != NULL) execution path. The issue is exposed after cefad282fb31aff3e1a6dcbd368cbbffc3fce900 and 38ee7bafa72f58982f99ac6f61eef160f80bad69. There was negligence in that code review. One further topic may be "Could EDK2 CI be enhanced to catch this kind of uninitialized local variable case?". :) This patch fixes this regression issue. Cc: Eric Dong <eric.dong@intel.com> Cc: Ray Ni <ray.ni@intel.com> Cc: Laszlo Ersek <lersek@redhat.com> Signed-off-by: Star Zeng <star.zeng@intel.com> Message-Id: <20210121093944.1621-1-star.zeng@intel.com> Reviewed-by: Ray Ni <ray.ni@intel.com> Reviewed-by: Laszlo Ersek <lersek@redhat.com>
* UefiCpuPkg/CpuS3DataDxe: do not allocate useless register tablesLaszlo Ersek2021-01-201-32/+0
| | | | | | | | | | | | | | | | | | | | | | | | | CpuS3DataDxe allocates the "RegisterTable" and "PreSmmInitRegisterTable" arrays in ACPI_CPU_DATA just so every processor in the system can have its own empty register table, matched by APIC ID. This has never been useful in practice. Given commit e992cc3f4859 ("UefiCpuPkg PiSmmCpuDxeSmm: Reduce SMRAM consumption in CpuS3.c", 2021-01-11), simply leave both "AcpiCpuData->RegisterTable" and "AcpiCpuData->PreSmmInitRegisterTable" initialized to the zero address. This simplifies the driver, and saves both normal RAM (boot services data type memory) and -- in PiSmmCpuDxeSmm -- SMRAM. Cc: Eric Dong <eric.dong@intel.com> Cc: Philippe Mathieu-Daudé <philmd@redhat.com> Cc: Rahul Kumar <rahul1.kumar@intel.com> Cc: Ray Ni <ray.ni@intel.com> Cc: Star Zeng <star.zeng@intel.com> Ref: https://bugzilla.tianocore.org/show_bug.cgi?id=3159 Signed-off-by: Laszlo Ersek <lersek@redhat.com> Reviewed-by: Ray Ni <ray.ni@intel.com> Message-Id: <20210119155440.2262-4-lersek@redhat.com> Reviewed-by: Star Zeng <star.zeng@intel.com>
* UefiCpuPkg/AcpiCpuData: update comments on register table fieldsLaszlo Ersek2021-01-201-0/+4
| | | | | | | | | | | | | | | | | | | | | | After commit e992cc3f4859 ("UefiCpuPkg PiSmmCpuDxeSmm: Reduce SMRAM consumption in CpuS3.c", 2021-01-11), it is valid for a CPU S3 Data DXE Driver to set "ACPI_CPU_DATA.PreSmmInitRegisterTable" and/or "ACPI_CPU_DATA.RegisterTable" to 0, in case none of the CPUs needs a register table of the corresponding kind, during S3 resume. Document this fact in the "UefiCpuPkg/Include/AcpiCpuData.h" header file. Cc: Eric Dong <eric.dong@intel.com> Cc: Philippe Mathieu-Daudé <philmd@redhat.com> Cc: Rahul Kumar <rahul1.kumar@intel.com> Cc: Ray Ni <ray.ni@intel.com> Cc: Star Zeng <star.zeng@intel.com> Ref: https://bugzilla.tianocore.org/show_bug.cgi?id=3159 Signed-off-by: Laszlo Ersek <lersek@redhat.com> Reviewed-by: Ray Ni <ray.ni@intel.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20210119155440.2262-3-lersek@redhat.com> Reviewed-by: Star Zeng <star.zeng@intel.com>
* UefiCpuPkg/CpuFeature: Don't assume CpuS3DataDxe alloc RegisterTableRay Ni2021-01-201-33/+39
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There are lots of fields in ACPI_CPU_DATA structure while only followings are accessed by CpuFeature infra: * NumberOfCpus * PreSmmInitRegisterTable // pointer * RegisterTable // pointer * CpuStatus * ApLocation // pointer So it's possible that an implementation of CpuS3DataDxe doesn't allocate memory for PreSmmInitRegisterTable/RegisterTable/ApLocation. This patch handles the case when CpuS3DataDxe doesn't allocate memory for PreSmmInitRegisterTable/RegisterTable. Cc: Eric Dong <eric.dong@intel.com> Cc: Philippe Mathieu-Daudé <philmd@redhat.com> Cc: Rahul Kumar <rahul1.kumar@intel.com> Cc: Ray Ni <ray.ni@intel.com> Cc: Star Zeng <star.zeng@intel.com> Ref: https://bugzilla.tianocore.org/show_bug.cgi?id=3159 Signed-off-by: Ray Ni <ray.ni@intel.com> [lersek@redhat.com: update CC list, add BZ reference, add my S-o-b] [lersek@redhat.com: deal with RegisterTable and PreSmmInitRegisterTable being zero independently of each other; replacing the ASSERT()] Signed-off-by: Laszlo Ersek <lersek@redhat.com> Message-Id: <20210119155440.2262-2-lersek@redhat.com> Reviewed-by: Star Zeng <star.zeng@intel.com>
* UefiCpuPkg/CpuCacheInfoLib: Add new CpuCacheInfoLib.Lou, Yun2021-01-1910-0/+1023
| | | | | | | | | | | | | | | | | REF: https://bugzilla.tianocore.org/show_bug.cgi?id=3105 This new library uses a platform agnostic algorithm to get CPU cache information. It provides user with an API(GetCpuCacheInfo) to get detailed CPU cache information by each package, each core type included in this package, and each cache level & type. This library can be used by code that produces SMBIOS_TABLE_TYPE7 SMBIOS table. Signed-off-by: Jason Lou <yun.lou@intel.com> Reviewed-by: Ray Ni <ray.ni@intel.com> Cc: Eric Dong <eric.dong@intel.com> Cc: Laszlo Ersek <lersek@redhat.com> Cc: Rahul Kumar <rahul1.kumar@intel.com>
* UefiCpuPkg/CpuDxe: Fix boot errorGuo Dong2021-01-121-5/+3
| | | | | | | | | | | | | | | | REF: https://bugzilla.tianocore.org/show_bug.cgi?id=3084 When DXE drivers are dispatched above 4GB memory in 64bit mode, the address setCodeSelectorLongJump in stack will be override by parameter. Jump to Qword is not supported by some processors. So use "o64 retf" instead. Signed-off-by: Guo Dong <guo.dong@intel.com> Acked-by: Laszlo Ersek <lersek@redhat.com> Tested-by: Tom Lendacky <thomas.lendacky@amd.com> Reviewed-by: Michael D Kinney <michael.d.kinney@intel.com> Tested-by: James Bottomley <jejb@linux.ibm.com> Reviewed-by: Ray Ni <ray.ni@intel.com>
* UefiCpuPkg/MpInitLib: Fix a hang in above 4GB caseGuo Dong2021-01-121-4/+4
| | | | | | | | | | | | This patch fixed the hang in UEFICpuPkg when it is dispatched above 4GB. In UEFI BIOS case CpuInfoInHob is provided to DXE under 4GB from PEI. When using UEFI payload and bootloaders, CpuInfoInHob will be allocated above 4GB since it is not provided from bootloader. so we need update the code to make sure this hob could be accessed correctly in this case. Signed-off-by: Guo Dong <guo.dong@intel.com> Reviewed-by: Laszlo Ersek <lersek@redhat.com> Reviewed-by: Ray Ni <ray.ni@intel.com>
* UefiCpuPkg PiSmmCpuDxeSmm: Reduce SMRAM consumption in CpuS3.cZeng, Star2021-01-111-17/+57
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch makes two refinements to reduce SMRAM consumption in CpuS3.c. 1. Only do CopyRegisterTable() when register table is not empty, IsRegisterTableEmpty() is created to check whether the register table is empty or not. Take empty PreSmmInitRegisterTable as example, about 24K SMRAM consumption could be reduced when mAcpiCpuData.NumberOfCpus=1024. sizeof (CPU_REGISTER_TABLE) = 24 mAcpiCpuData.NumberOfCpus = 1024 = 1K mAcpiCpuData.NumberOfCpus * sizeof (CPU_REGISTER_TABLE) = 24K 2. Only copy table entries buffer instead of whole buffer. AllocatedSize in SourceRegisterTableList is the whole buffer size. Actually, only the table entries buffer needs to be copied, and the size is TableLength * sizeof (CPU_REGISTER_TABLE_ENTRY). Take AllocatedSize=0x1000=4096, TableLength=100 and NumberOfCpus=1024 as example, about 1696K SMRAM consumption could be reduced. sizeof (CPU_REGISTER_TABLE_ENTRY) = 24 TableLength = 100 TableLength * sizeof (CPU_REGISTER_TABLE_ENTRY) = 2400 AllocatedSize = 0x1000 = 4096 AllocatedSize - TableLength * sizeof (CPU_REGISTER_TABLE_ENTRY) = 4096 - 2400 = 1696 NumberOfCpus = 1024 = 1K NumberOfCpus * (AllocatedSize - TableLength * sizeof (CPU_REGISTER_TABLE_ENTRY)) = 1696K This patch also corrects the CopyRegisterTable() function description. Signed-off-by: Star Zeng <star.zeng@intel.com> Reviewed-by: Ray Ni <ray.ni@intel.com> Reviewed-by: Laszlo Ersek <lersek@redhat.com> Cc: Ray Ni <ray.ni@intel.com> Cc: Eric Dong <eric.dong@intel.com> Cc: Laszlo Ersek <lersek@redhat.com> Message-Id: <20210111015419.28368-1-star.zeng@intel.com>
* Revert "UefiCpuPkg/CpuDxe: Fix boot error"Laszlo Ersek2020-12-181-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | This reverts commit cee5b0441af39dd6f76cc4e0447a1c7f788cbb00. Commit cee5b0441af3 ("UefiCpuPkg/CpuDxe: Fix boot error", 2020-12-08) breaks CpuDxe (and with it, OVMF boot) on AMD processors. AMD processors cannot do far jumps to 64-bit targets, as documented in the AMD64 Architecture Programmer's Manual. Revert the patch until a RETFQ-based substitute is posted. Cc: Eric Dong <eric.dong@intel.com> Cc: Guo Dong <guo.dong@intel.com> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Cc: Rahul Kumar <rahul1.kumar@intel.com> Cc: Ray Ni <ray.ni@intel.com> Cc: Thomas Lendacky <thomas.lendacky@amd.com> Ref: https://edk2.groups.io/g/devel/message/68597 Ref: https://www.redhat.com/archives/edk2-devel-archive/2020-December/msg00493.html Reported-by: Thomas Lendacky <thomas.lendacky@amd.com> Ref: https://edk2.groups.io/g/devel/message/68832 Ref: https://www.redhat.com/archives/edk2-devel-archive/2020-December/msg00737.html Reported-by: James Bottomley <James.Bottomley@HansenPartnership.com> Signed-off-by: Laszlo Ersek <lersek@redhat.com> Message-Id: <20201217085055.15131-1-lersek@redhat.com> Ref: https://bugzilla.tianocore.org/show_bug.cgi?id=3121 [lersek@redhat.com: add BZ link]
* UefiCpuPkg/CpuFeature: reduce time complexty to calc CpuInfo.FirstRay Ni2020-12-141-49/+47
| | | | | | | | | | | | | | | | | | | | | | CpuInfo.First stores whether the current thread belongs to the first package in the platform, first core in a package, first thread in a core. But the time complexity of original algorithm to calculate the CpuInfo.First is O (n) * O (p) * O (c). n: number of processors p: number of packages c: number of cores per package The patch trades time with space by storing the first package, first core per package, first thread per core in an array. The time complexity becomes O (n). Signed-off-by: Ray Ni <ray.ni@intel.com> Reviewed-by: Eric Dong <eric.dong@intel.com> Reviewed-by: Star Zeng <star.zeng@intel.com> Cc: Yun Lou <yun.lou@intel.com> Cc: Laszlo Ersek <lersek@redhat.com>
* UefiCpuPkg RegisterCpuFeaturesLib: Use AllocatePages() for InitOrderStar Zeng2020-12-141-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | The required buffer size for InitOrder will be 96K when NumberOfCpus=1024. sizeof (CPU_FEATURES_INIT_ORDER) = 96 NumberOfCpus = 1024 = 1K sizeof (CPU_FEATURES_INIT_ORDER) * NumberOfCpus = 96K AllocateZeroPool() will call to PeiServicesAllocatePool() which will use EFI_HOB_MEMORY_POOL to management memory pool. EFI_HOB_MEMORY_POOL.Header.HobLength is UINT16 type, so there is no way for AllocateZeroPool() to allocate > 64K memory. So AllocateZeroPool() could not be used anymore for the case above or even bigger required buffer size. This patch updates the code to use AllocatePages() instead of AllocateZeroPool() to allocate buffer for InitOrder. Signed-off-by: Star Zeng <star.zeng@intel.com> Reviewed-by: Ray Ni <ray.ni@intel.com> Cc: Ray Ni <ray.ni@intel.com> Cc: Eric Dong <eric.dong@intel.com> Cc: Laszlo Ersek <lersek@redhat.com>
* UefiCpuPkg/SmmCpuFeaturesLib: Add Tiger Lake supportGuo Dong2020-12-081-1/+2
| | | | | | | Add Tiger Lake ModelId support in the SMM CPU feature lib. Signed-off-by: Guo Dong <guo.dong@intel.com> Reviewed-by: Eric Dong <eric.dong@intel.com>
* UefiCpuPkg/CpuDxe: Fix boot errorGuo Dong2020-12-081-2/+2
| | | | | | | | | | | | | | REF: https://bugzilla.tianocore.org/show_bug.cgi?id=3084 When DXE drivers are dispatched above 4GB memory and the system is already in 64bit mode, the address setCodeSelectorLongJump in stack will be override by parameter. so change to use 64bit address and jump to qword address. Signed-off-by: Guo Dong <guo.dong@intel.com> Reviewed-by: Ray Ni <ray.ni@intel.com> Reviewed-by: Eric Dong <eric.dong@intel.com>
* UefiCpuPkg/Feature: Support different thread count per coreRay Ni2020-12-043-83/+119
| | | | | | | | | | | | | | | | | Today's code assumes every core contains the same number of threads. It's not always TRUE for certain model. Such assumption causes system hang when thread count per core is different and there is core or package dependency between CPU features (using CPU_FEATURE_CORE_BEFORE/AFTER, CPU_FEATURE_PACKAGE_BEFORE/AFTER). The change removes such assumption by calculating the actual thread count per package and per core. Signed-off-by: Ray Ni <ray.ni@intel.com> Reviewed-by: Eric Dong <eric.dong@intel.com> Cc: Yun Lou <yun.lou@intel.com> Acked-by: Laszlo Ersek <lersek@redhat.com>
* UefiCpuPkg/PiSmmCpuDxeSmm: Reflect page table depth with page table addressSheng Wei2020-11-184-37/+74
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When trying to get page table base, if mInternalCr3 is zero, it will use the page table from CR3, and reflect the page table depth by CR4 LA57 bit. If mInternalCr3 is non zero, it will use the page table from mInternalCr3 and reflect the page table depth of mInternalCr3 at same time. In the case of X64, we use m5LevelPagingNeeded to reflect the depth of the page table. And in the case of IA32, it will not the page table depth information. This patch is a bug fix when enable CET feature with 5 level paging. The SMM page tables are allocated / initialized in PiCpuSmmEntry(). When CET is enabled, PiCpuSmmEntry() must further modify the attribute of shadow stack pages. This page table is not set to CR3 in PiCpuSmmEntry(). So the page table base address is set to mInternalCr3 for modifty the page table attribute. It could not use CR4 LA57 bit to reflect the page table depth for mInternalCr3. So we create a architecture-specific implementation GetPageTable() with 2 output parameters. One parameter is used to output the page table address. Another parameter is used to reflect if it is 5 level paging or not. REF: https://bugzilla.tianocore.org/show_bug.cgi?id=3015 Signed-off-by: Sheng Wei <w.sheng@intel.com> Cc: Eric Dong <eric.dong@intel.com> Cc: Ray Ni <ray.ni@intel.com> Cc: Laszlo Ersek <lersek@redhat.com> Cc: Rahul Kumar <rahul1.kumar@intel.com> Cc: Jiewen Yao <jiewen.yao@intel.com> Reviewed-by: Laszlo Ersek <lersek@redhat.com> Reviewed-by: Eric Dong <eric.dong@intel.com>
* UefiCpuPkg/PiSmmCpuDxeSmm: Correct the Cr3 typoSheng Wei2020-11-181-5/+5
| | | | | | | | | | | | | | | | Change the variable name from mInternalGr3 to mInternalCr3. REF: https://bugzilla.tianocore.org/show_bug.cgi?id=3015 Signed-off-by: Sheng Wei <w.sheng@intel.com> Cc: Eric Dong <eric.dong@intel.com> Cc: Ray Ni <ray.ni@intel.com> Cc: Laszlo Ersek <lersek@redhat.com> Cc: Rahul Kumar <rahul1.kumar@intel.com> Cc: Jiewen Yao <jiewen.yao@intel.com> Reviewed-by: Laszlo Ersek <lersek@redhat.com> Reviewed-by: Ray Ni <ray.ni@intel.com> Reviewed-by: Eric Dong <eric.dong@intel.com>