summaryrefslogtreecommitdiff
path: root/Documentation
diff options
context:
space:
mode:
authorAaron Miller <aaronmiller@fb.com>2016-11-03 15:01:53 -0700
committerBorislav Petkov <bp@suse.de>2017-01-19 10:29:40 +0100
commit4fb6fde74d6724dc6d64ec729f950fbdeefd7f07 (patch)
treed0030c8f6195834f3d410ca70ce5f4561bc34ec9 /Documentation
parent2287c63643f0f52d9d5452b9dc4079aec0889fe8 (diff)
EDAC: Expose per-DIMM error counts in sysfs
The old csrowX sysfs directories have per-csrow error counters, but the new dimmX directories do not currently expose error counts. EDAC already keeps these counts, add them to sysfs so per-DIMM counts are still available when CONFIG_EDAC_LEGACY_SYSFS=n. Signed-off-by: Aaron Miller <aaronmiller@fb.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/20161103220153.3997328-1-aaronmiller@fb.com Signed-off-by: Borislav Petkov <bp@suse.de>
Diffstat (limited to 'Documentation')
-rw-r--r--Documentation/ABI/testing/sysfs-devices-edac17
-rw-r--r--Documentation/admin-guide/ras.rst20
2 files changed, 37 insertions, 0 deletions
diff --git a/Documentation/ABI/testing/sysfs-devices-edac b/Documentation/ABI/testing/sysfs-devices-edac
index 6568e0010e1a..46ff929fd52a 100644
--- a/Documentation/ABI/testing/sysfs-devices-edac
+++ b/Documentation/ABI/testing/sysfs-devices-edac
@@ -138,3 +138,20 @@ Contact: Mauro Carvalho Chehab <m.chehab@samsung.com>
Description: This attribute file will display what type of memory is
currently on this csrow. Normally, either buffered or
unbuffered memory (for example, Unbuffered-DDR3).
+
+What: /sys/devices/system/edac/mc/mc*/(dimm|rank)*/dimm_ce_count
+Date: October 2016
+Contact: linux-edac@vger.kernel.org
+Description: This attribute file displays the total count of correctable
+ errors that have occurred on this DIMM. This count is very important
+ to examine. CEs provide early indications that a DIMM is beginning
+ to fail. This count field should be monitored for non-zero values
+ and report such information to the system administrator.
+
+What: /sys/devices/system/edac/mc/mc*/(dimm|rank)*/dimm_ue_count
+Date: October 2016
+Contact: linux-edac@vger.kernel.org
+Description: This attribute file displays the total count of uncorrectable
+ errors that have occurred on this DIMM. If panic_on_ue is set, this
+ counter will not have a chance to increment, since EDAC will panic the
+ system
diff --git a/Documentation/admin-guide/ras.rst b/Documentation/admin-guide/ras.rst
index d71340e86c27..9939348bd4a3 100644
--- a/Documentation/admin-guide/ras.rst
+++ b/Documentation/admin-guide/ras.rst
@@ -438,11 +438,13 @@ A typical EDAC system has the following structure under
│   │   ├── ce_count
│   │   ├── ce_noinfo_count
│   │   ├── dimm0
+ │   │   │   ├── dimm_ce_count
│   │   │   ├── dimm_dev_type
│   │   │   ├── dimm_edac_mode
│   │   │   ├── dimm_label
│   │   │   ├── dimm_location
│   │   │   ├── dimm_mem_type
+ │   │   │   ├── dimm_ue_count
│   │   │   ├── size
│   │   │   └── uevent
│   │   ├── max_location
@@ -457,11 +459,13 @@ A typical EDAC system has the following structure under
│   │   ├── ce_count
│   │   ├── ce_noinfo_count
│   │   ├── dimm0
+ │   │   │   ├── dimm_ce_count
│   │   │   ├── dimm_dev_type
│   │   │   ├── dimm_edac_mode
│   │   │   ├── dimm_label
│   │   │   ├── dimm_location
│   │   │   ├── dimm_mem_type
+ │   │   │   ├── dimm_ue_count
│   │   │   ├── size
│   │   │   └── uevent
│   │   ├── max_location
@@ -483,6 +487,22 @@ this ``X`` memory module:
This attribute file displays, in count of megabytes, the memory
that this csrow contains.
+- ``dimm_ue_count`` - Uncorrectable Errors count attribute file
+
+ This attribute file displays the total count of uncorrectable
+ errors that have occurred on this DIMM. If panic_on_ue is set
+ this counter will not have a chance to increment, since EDAC
+ will panic the system.
+
+- ``dimm_ce_count`` - Correctable Errors count attribute file
+
+ This attribute file displays the total count of correctable
+ errors that have occurred on this DIMM. This count is very
+ important to examine. CEs provide early indications that a
+ DIMM is beginning to fail. This count field should be
+ monitored for non-zero values and report such information
+ to the system administrator.
+
- ``dimm_dev_type`` - Device type attribute file
This attribute file will display what type of DRAM device is