summaryrefslogtreecommitdiffstats
path: root/drivers/edac/amd64_edac.c
Commit message (Collapse)AuthorAgeFilesLines
* amd64_edac: Erratum #637 workaroundBorislav Petkov2011-04-261-2/+50
| | | | | | | | | F15h CPUs may report a non-DRAM address when reporting an error address belonging to a CC6 state save area. Add a workaround to detect this condition and compute the actual DRAM address of the error as documented in the Revision Guide for AMD Family 15h Models 00h-0Fh Processors. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
* amd64_edac: Factor in CC6 save areaBorislav Petkov2011-04-261-1/+27
| | | | | | | | | | F15h and later use a portion of DRAM as a CC6 storage area. BIOS programs D18F1x[17C:140,7C:40] DRAM Base/Limit accordingly by subtracting the storage area from the DRAM limit setting. However, in order for edac to consider that part of DRAM too, we need to include it into the per-node range. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
* amd64_edac: Remove node interleave warningBorislav Petkov2011-04-261-5/+1
| | | | | | | | | This warning was wrongfully added for a normal condition - intlvsel actually selects the destination node when node interleaving is enabled and it is not a mismatch. For a detailed example, see section 2.8.10.2 "Node Interleaving" in F10h BKDG. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
* EDAC: Remove debugging output in scrub rate handlingMarkus Trippelsdorf2011-04-211-2/+0
| | | | | | | | | This patch removes superfluous debugging output in the sysfs scrub rate handler. It also consolidates the error handling in the scrub rate accessors. Signed-off-by: Markus Trippelsdorf <markus@trippelsdorf.de> Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
* amd64_edac: Fix potential memleakBorislav Petkov2011-03-291-1/+1
| | | | | | | | | | We check the pointers together but at least one of them could be invalid due to failed allocation. Since we cannot continue if either of the two allocations has failed, exit early by freeing them both. Cc: <stable@kernel.org> # 38.x Reported-by: Mauro Carvalho Chehab <mchehab@redhat.com> Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
* amd64_edac: Fix decode_syndrome typesBorislav Petkov2011-03-171-4/+4
| | | | | | Those should all be unsigned. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
* amd64_edac: Fix DCT argument typeBorislav Petkov2011-03-171-5/+4
| | | | | | | | Fix amd64_debug_display_dimm_sizes() arguments order per convention (pvt is always first). Also, the now second arg denotes the DCT so adjust its type. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
* amd64_edac: Fix ranges signednessBorislav Petkov2011-03-171-4/+5
| | | | | | The dram ranges make sense only as an unsigned type. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
* amd64_edac: Drop local variableBorislav Petkov2011-03-171-3/+2
| | | | | | Use the macro directly instead Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
* amd64_edac: Fix PCI config addressing typesBorislav Petkov2011-03-171-5/+5
| | | | | | Adjust argument types to the PCI config API's types. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
* amd64_edac: Fix DRAM base macrosBorislav Petkov2011-03-171-2/+1
| | | | | | Return unsigned u8 values only. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
* amd64_edac: Fix node id signednessBorislav Petkov2011-03-171-7/+9
| | | | | | | | A node id can never be negative since we use it as an index into the DRAM ranges array. This also makes one of the BUG_ON conditions redundant. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
* amd64_edac: Enable driver on F15hBorislav Petkov2011-03-171-7/+23
| | | | | | | | | Add the PCI device ids required for driver registration. Remove pvt->ctl_name and use the family descriptor directly, instead. Then, bump driver version and fixup its format. Finally, enable DRAM ECC decoding on F15h. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
* amd64_edac: Adjust ECC symbol size to F15hBorislav Petkov2011-03-171-17/+15
| | | | | | | F15h has the same ECC symbol size options as F10h revD and later so adjust checks to that. Simplify code a bit. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
* amd64_edac: Simplify scrubrate settingBorislav Petkov2011-03-171-3/+5
| | | | | | Drop per-instance variable and compute min scrubrate dynamically. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
* amd64_edac: Improve DRAM address mappingBorislav Petkov2011-03-171-68/+81
| | | | | | | | Drop static tables which map the bits in F2x80 to a chip select size in favor of functions doing the mapping with some bit fiddling. Also, add F15 support. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
* amd64_edac: Sanitize ->read_dram_ctl_registerBorislav Petkov2011-03-171-7/+7
| | | | | | | This function is relevant for F10h and higher, and it has only one callsite so drop its function pointer from the low_ops struct. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
* amd64_edac: Adjust sys_addr to chip select conversion routine to F15hBorislav Petkov2011-03-171-14/+15
| | | | | | | F15h sys_addr to chip select mapping is almost identical to F10h's so reuse that. Rename functions on that path accordingly. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
* amd64_edac: Beef up early exit reportingBorislav Petkov2011-03-171-1/+12
| | | | | | | Add paranoid checks for the sys address before going off and decoding it. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
* amd64_edac: Revamp online spare handlingBorislav Petkov2011-03-171-17/+13
| | | | | | | Replace per-DCT macros with smarter ones, drop hack and look for the spare rank on all chip selects on a channel. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
* amd64_edac: Fix channel interleave removalBorislav Petkov2011-03-171-9/+17
| | | | | | | Remove the channel interleave select bit properly. See F2x110[DctSelIntLvAddr] for details. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
* amd64_edac: Correct node interleaving removalBorislav Petkov2011-03-171-4/+4
| | | | | | | | | When node interleaving is enabled, a subset of the addr[14:12] bits has to be removed in order to get the normalized DCT address of the DRAM channel. The actual number of bits to remove is determined by F1x[1, 0][7C:40][IntlvEn]. Do this correctly. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
* amd64_edac: Add support for interleaved region swappingBorislav Petkov2011-03-171-0/+38
| | | | | | | | | On revC3 and revE Fam10h machines and later, non-interleaved graphics framebuffer memory under the 16G mark can be swapped with a region located at the bottom of memory so that the GPU can use the interleaved region and thus two channels. Add support for that. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
* amd64_edac: Unify get_error_addressBorislav Petkov2011-03-171-13/+13
| | | | | | | | | The address bits from MC4_STATUS differ only between K8 and the rest so no need for a per-family method. No functional change. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
* amd64_edac: Simplify decoding pathBorislav Petkov2011-03-171-49/+27
| | | | | | | | | Use the struct mce directly instead of copying from it into a custom struct err_regs. No functionality change. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
* amd64_edac: Adjust channel counting to F15hBorislav Petkov2011-03-171-7/+6
| | | | | | | The only difference is that F10h used to sport ganged DCTs and F15h doesn't so adjust the F10h routine and reuse it. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
* amd64_edac: Cleanup old defines cruftBorislav Petkov2011-03-171-13/+13
| | | | | | Remove unused defines, drop family names from define names. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
* amd64_edac: Cleanup NBSH cruftBorislav Petkov2011-03-171-13/+2
| | | | | | | | | Remove reporting of errors with UC bit set - this is done by the MCE decoding code anyway and this driver deals with DRAM ECC errors only. UC (NB uncorrectable error) doesn't necessarily mean it is a DRAM error. Remove unused macros while at it. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
* amd64_edac: Cleanup NBCFG handlingBorislav Petkov2011-03-171-24/+21
| | | | | | | | | The fact whether we are chipkill capable or not does not have any bearing when computing the channel index on a ganged DCT configuration so remove that. Also, simplify debug statements. Finally, remove old error injection leftovers, while at it. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
* amd64_edac: Cleanup NBCTL codeBorislav Petkov2011-03-171-7/+7
| | | | | | | | | Remove family names from macro names, drop single bit defines and comment their meaning instead. No functional change. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
* amd64_edac: Cleanup DCT Select Low/High codeBorislav Petkov2011-03-171-10/+10
| | | | | | | | | Shorten macro names, remove family name from macros, fix macro arguments, shorten debug strings. No functionality change. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
* amd64_edac: Cleanup Dram Configuration registers handlingBorislav Petkov2011-03-171-25/+17
| | | | | | | | | | | | | * Restrict DCT ganged mode check since only Fam10h supports it * Adjust DRAM type detection for BD since it only supports DDR3 * Remove second and thus unneeded DCLR read in k8_early_channel_count() - we do that in read_mc_regs() * Cleanup comments and remove family names from register macros * Remove unused defines There should be no functional change resulting from this patch. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
* amd64_edac: Cleanup DBAM handlingBorislav Petkov2011-03-171-19/+8
| | | | | | | | Do not read DBAM regs twice and simplify code around them. There should be no functional change resulting from this patch. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
* amd64_edac: Replace huge bitmasks with a macroBorislav Petkov2011-03-171-10/+9
| | | | | | Replace hard to read hex constants with a continuous masks macro. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
* amd64_edac: Sanitize f10_get_base_addr_offsetBorislav Petkov2011-03-171-38/+45
| | | | | | | | | | | | This function maps the system address to the normalized DCT address. Document what the code does for more clarity and wrap insane bitmasks in a more understandable macro which generates them. Also, reduce number of arguments passed to the function. Finally, rename this function to what it actually does. No functional change. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
* amd64_edac: Sanitize channel extractionBorislav Petkov2011-03-171-50/+34
| | | | | | | | | | Cleanup and simplify f10_determine_channel(); make it more readable. Also drop f10_map_intlv_en_to_shift() in favor of simply counting the bits in F1x124[DramIntlvEn] which is equivalent. There should be no functionality change resulting from this patch. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
* amd64_edac: Cleanup chipselect handlingBorislav Petkov2011-03-171-182/+111
| | | | | | | | | | | | Add a struct representing the DRAM chip select base/limit register pairs. Concentrate all CS handling in a single function. Also, add CS looping macros for cleaner, more readable code. While at it, adjust code to F15h. Finally, do smaller macro names cleanups (remove family names from register macros) and debug messages clarification. No functional change. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
* amd64_edac: Cleanup DHAR handlingBorislav Petkov2011-03-171-18/+18
| | | | | | Adjust to F15h, simplify code, fixup macros. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
* amd64_edac: Remove DRAM base/limit subfields cachingBorislav Petkov2011-03-171-166/+64
| | | | | | | | | | | | | | | | | Add a struct representing the DRAM base/limit range pairs and remove all cached subfields. Replace them with accessor functions, which actually saves us some space: text data bss dec hex filename 14712 1577 336 16625 40f1 drivers/edac/amd64_edac_mod.o.after 14831 1609 336 16776 4188 drivers/edac/amd64_edac_mod.o.before Also, it simplifies the code a lot allowing to merge the K8 and F10h routines. No functional change. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
* amd64_edac: Add support for F15h DCT PCI config accessesBorislav Petkov2011-03-171-47/+117
| | | | | | | | | | | | F15h "multiplexes" between the configuration space of the two DRAM controllers by toggling D18F1x10C[DctCfgSel] while F10h has a different set of registers for DCT0, and DCT1 in extended PCI config space. Add DCT configuration space accessors per family thus wrapping all the different access prerequisites. Clean up code while at it, shorten names. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
* amd64_edac: Fix DIMMs per DCTs outputBorislav Petkov2011-02-101-20/+8
| | | | | | | | | | | | amd64_debug_display_dimm_sizes() reports the distribution of the DIMMs on each DRAM controller and its chip select sizes. Thus, the last don't have anything to do with whether we're running in ganged DCT mode or not - their sizes don't change all of a sudden. Fix that by removing the ganged-check and dump DCT0's config for DCT1 when in ganged mode since they're identical. Reported-and-tested-by: Markus Trippelsdorf <markus@trippelsdorf.de> Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
* Merge branch 'mce-for-linus' of ↵Linus Torvalds2011-01-071-2/+2
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp * 'mce-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp: EDAC, MCE: Fix NB error formatting EDAC, MCE: Use BIT_64() to eliminate warnings on 32-bit EDAC, MCE: Enable MCE decoding on F15h EDAC, MCE: Allow F15h bank 6 MCE injection EDAC, MCE: Shorten error report formatting EDAC, MCE: Overhaul error fields extraction macros EDAC, MCE: Add F15h FP MCE decoder EDAC, MCE: Add F15 EX MCE decoder EDAC, MCE: Add an F15h NB MCE decoder EDAC, MCE: No F15h LS MCE decoder EDAC, MCE: Add F15h CU MCE decoder EDAC, MCE: Add F15h IC MCE decoder EDAC, MCE: Add F15h DC MCE decoder EDAC, MCE: Select extended error code mask
| * EDAC, MCE: Overhaul error fields extraction macrosBorislav Petkov2011-01-071-2/+2
| | | | | | | | | | | | Make macro names shorter thus making code shorter and more clear. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
* | amd64_edac: Disable DRAM ECC injection on K8Borislav Petkov2011-01-071-2/+3
| | | | | | | | | | | | | | K8 does not allow for an atomic RMW to a cacheline as F10h does so disable the error injection interface for it. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
* | EDAC: Fixup scrubrate manipulationBorislav Petkov2011-01-071-12/+12
| | | | | | | | | | | | | | | | | | Make the ->{get|set}_sdram_scrub_rate return the actual scrub rate bandwidth it succeeded setting and remove superfluous arg pointer used for that. A negative value returned still means that an error occurred while setting the scrubrate. Document this for future reference. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
* | amd64_edac: Remove two-stage initializationBorislav Petkov2011-01-071-100/+68
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Now that all prerequisites are in place, drop the two-stage driver instances initialization in favor of the following simple init sequence: 1. Probe PCI device: we only test ECC capabilities here and if none exit early. 2. If the hw supports ECC and it is/can be enabled, we init the per-node instance. Remove "amd64_" prefix from static functions touched, while at it. There actually should be no visible functional change resulting from this patch. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
* | amd64_edac: Check ECC capabilities initiallyBorislav Petkov2011-01-071-66/+75
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Rework the code to check the hardware ECC capabilities at PCI probing time. We do all further initialization only if we actually can/have ECC enabled. While at it: 0. Fix function naming. 1. Simplify/clarify debug output. 2. Remove amd64_ prefix from the static functions 3. Reorganize code. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
* | amd64_edac: Carve out ECC-related hw settingsBorislav Petkov2011-01-071-19/+40
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is in preparation for the init path reorganization where we want only to 1) test whether a particular node supports ECC 2) can it be enabled and only then do the necessary allocation/initialization. For that, we need to decouple the ECC settings of the node from the instance's descriptor. The should be no functional change introduced by this patch. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
* | amd64_edac: Remove PCI ECS enabling functionsBorislav Petkov2011-01-071-55/+0
| | | | | | | | | | | | | | PCI ECS is being enabled by default since 2.6.26 on AMD so this code is just superfluous now, remove it. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
* | amd64_edac: Allocate driver instances dynamicallyBorislav Petkov2011-01-071-16/+29
| | | | | | | | | | | | | | | | | | Remove static allocation in favor of dynamically allocating space for as many driver instances as northbridges present on the system. There should be no functional change resulting from this patch. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>