# Dave_Hansen.txt -- pre-review rules derived from Dave Hansen's review history

Maintainer:   Dave Hansen <dave.hansen@linux.intel.com>
Domain:       x86 (mm, pkeys, CET/shadow stack, FPU/XSAVE, TLB/PTI, MSR/CPUID)
Source:       lore.kernel.org/all, replies authored by dave.hansen@{linux.intel.com,intel.com}
Methodology:  lei q -O https://lore.kernel.org/all/ 'f:dave.hansen@linux.intel.com' 's:Re:' 's:PATCH'
              (847 messages mined; rules below are themes Dave raises repeatedly,
              each with multiple lore citations as evidence)

These rules are meant to be applied *before* a patch reaches Dave.  An agent
that fixes the patch (or pushes back on the submitter) on these points will
remove the most common reasons Dave has to re-review.  Each rule lists lore
URLs for messages where Dave himself raised the point.

================================================================================
RULE 1.  The changelog must explain WHY, not just WHAT.  It must be readable
         standalone, years from now, without the patch in front of you.
================================================================================

Dave consistently rejects changelogs that are terse, that describe what the
code does (the diff already shows that), or that paraphrase the diff without
giving the motivation a future maintainer needs.  Treat the changelog as a
durable artifact for someone debugging this code five years from now with no
other context.

What to check / fix before sending:
  - The changelog states the *problem* being solved (bug, missing feature,
    architectural cleanup) in plain prose, not as a list of files touched.
  - It explains *why* this solution was chosen, not just what it does.
  - It is self-contained: it does not require the reader to read the cover
    letter, the previous patch in the series, or external documentation in
    order to make sense of this single commit.
  - It captures the rationale that lived in the email thread.  Once merged,
    the thread is effectively gone; only the commit message survives.

Evidence (Dave's own words):
  - "Could you spell this out a bit more the changelog?"
    https://lore.kernel.org/all/5727CD69.7040901@linux.intel.com/
  - "This commit message kinda skirts around the real reasons for this patch."
    https://lore.kernel.org/all/11735e2e-781f-492f-7a1a-71b91e0876dc@linux.intel.com/
  - "In the interests of standalone changelogs, I'd really appreciate an
     actual explanation of what's going on here."
    https://lore.kernel.org/all/1f69bdb6-df5e-d709-064a-4f6fdd6e11a7@linux.intel.com/
  - "Its connection to this series is also tenuous and not spelled out in
     the exceptionally terse changelog."
    https://lore.kernel.org/all/20167c74-9b98-6fa1-972e-bcd2c9c4a1c8@linux.intel.com/
  - "This also needs a lot more high-level context about why it is necessary."
    https://lore.kernel.org/all/76caafd5-c85d-61bb-62ec-8056cd6d95ac@linux.intel.com/
  - "I think there are also two lesser things that the changelog is missing..."
    https://lore.kernel.org/all/5073562d-9831-4270-14ce-c0e97c62c21e@linux.intel.com/
  - "I'm not sure we should lean solely on the commit message to record why
     we need this until the end of time."
    https://lore.kernel.org/all/57603CBE.7090702@linux.intel.com/
  - "I'm going to harp on the patch description a bit more."
    https://lore.kernel.org/all/5073562d-9831-4270-14ce-c0e97c62c21e@linux.intel.com/

================================================================================
RULE 2.  Put the *why* in the code as a comment, not only in the changelog.
         A reader looking at the function should not have to git-blame to a
         commit message to understand subtle behavior.
================================================================================

This is one of Dave's most repeated themes and is distinct from RULE 1.  Even
a great changelog dies once the patch is merged -- the code is what people
read.  If the rationale is non-obvious, it belongs in a code comment.

What to check / fix before sending:
  - For every non-obvious choice (a check, a barrier, a magic ordering, an
    early return, an unusual cast, an unusual constant), there is a *brief*
    comment in the source explaining *why*.
  - The comment explains the reason, not the mechanism.  Do not write
    "increments x by 1" next to "x++"; write the constraint or invariant
    that makes that increment necessary.
  - If you have a long, beautiful patch description and zero new comments
    in the affected hunks, that is a smell -- audit which parts of the
    description belong as comments instead.

Evidence:
  - "I'd much rather have code comments than changelog comments."
    https://lore.kernel.org/all/dd7b3dd7-9e10-4b9f-b931-915298bfd627@linux.intel.com/
  - "This is another case of gloriously comment-free code, but stuff that
     _was_ covered in the changelog."
    https://lore.kernel.org/all/dd7b3dd7-9e10-4b9f-b931-915298bfd627@linux.intel.com/
  - "There was quite a long patch description, which will be surely lost
     to the ages, but nothing in the code that folks _will_ be looking at
     for decades to come."
    https://lore.kernel.org/all/92b86ab6-6f51-97b0-337c-b7e98a30b6cb@linux.intel.com/
  - "Add comments to code, and avoid doing it solely in changelogs."
    https://lore.kernel.org/all/49178f48-6635-353c-678d-3db436d3f9c3@linux.intel.com/
  - "For the benefit of the poor souls looking at this code in the decades
     to come, could you please take a few minutes to write a couple of
     sentences about why this is being done?"
    https://lore.kernel.org/all/9fb55ad9-a69f-bcc3-9e57-d90df5257726@linux.intel.com/
  - "I know it wasn't commented before, but please add a comment about
     what this is doing."
    https://lore.kernel.org/all/fa579920-fe34-80fb-4244-5b840b99e72c@linux.intel.com/
  - "This needs commenting about why is_wruss() is special."
    https://lore.kernel.org/all/bbb487cc-ac1c-f734-eee3-2463a0ba7efc@linux.intel.com/
  - "It is missing some explanation in the *code* of why sparc is doing
     this and when/why other architectures might want to use these hooks."
    https://lore.kernel.org/all/8665482b-f808-e995-cda1-99011be6ee34@linux.intel.com/
  - "The content is getting there, but we need it next to the code, please."
    https://lore.kernel.org/all/3f158401-f0b6-7bf7-48ab-2958354b28ad@linux.intel.com/
  - "This is the kind of thing that would be really useful to point out
     to reviewers instead of requiring them to ferret it out of the code."
    https://lore.kernel.org/all/75c31c99-cff7-72dc-f593-012fe5acd405@linux.intel.com/

================================================================================
RULE 3.  One logical change per patch.  Split a refactor from the behavior
         change.  Split a move from a modification.
================================================================================

Dave will routinely ask for a single patch to be split when it is doing more
than one thing.  Code motion (move/rename) must be separate from semantic
changes, so reviewers can verify the move is a no-op.  Refactoring is its own
patch, distinct from new behavior.

What to check / fix before sending:
  - Each patch in a series does exactly one thing whose subject line
    summarizes the whole patch in <~70 chars.
  - If a patch both moves a function and changes it, split it.
  - If a patch introduces a helper and uses it in a new code path, the
    helper-introduction is fine to combine if the helper has no users yet;
    but if it touches existing call sites, factor that out as a refactor.
  - Cleanups that are not strictly required for the feature should land
    as their own preparatory patches at the start of the series, or as a
    follow-on series.

Evidence:
  - "I think it would make a lot of sense to do the move and the
     modification in two patches."
    https://lore.kernel.org/all/57D9C54B.3010105@linux.intel.com/
  - "Consolidate code refactoring into separate patches."
    https://lore.kernel.org/all/49178f48-6635-353c-678d-3db436d3f9c3@linux.intel.com/
  - "It would be nice to see the arch-independent and x86 portions broken
     out and explained in their own right, even if they're small patches."
    https://lore.kernel.org/all/f7985965-3e33-4352-0ffb-97a5407f7acc@linux.intel.com/
  - "I'll also point out that in a multi-hundred-line patch, adding
     arguments to a existing function would not be something I'd try
     to include in the patch."
    https://lore.kernel.org/all/92b86ab6-6f51-97b0-337c-b7e98a30b6cb@linux.intel.com/
  - "That all sounds like great stuff to do in a follow-on patchset
     after this XSAVES stuff."
    https://lore.kernel.org/all/57321AFA.3020104@linux.intel.com/
  - "As large as this patch set is, I'd really prefer to see you get
     shadow stacks merged and then move on to IBT."
    https://lore.kernel.org/all/25675609-9ea7-55fb-6e73-b4a4c49b6c35@linux.intel.com/

================================================================================
RULE 4.  Use the kernel's CPU-feature abstractions; do not open-code CPUID,
         and do not use raw #ifdef when a runtime check (or IS_ENABLED())
         would do.
================================================================================

For x86 patches Dave will almost always ask for cpu_feature_enabled() or
static_cpu_has() over open-coded CPUID, over boot_cpu_has() where the
alternatives infrastructure should be doing the patching, and over an
#ifdef where IS_ENABLED() compiles both paths and lets dead-code elimination
do its job.  X86_FEATURE_* (and X86_BUG_*) bits are the right way to express
"this CPU supports/needs X".

What to check / fix before sending:
  - New CPU-feature detection: add an X86_FEATURE_ bit; check it with
    cpu_feature_enabled() (or static_cpu_has() in hot paths).  Don't
    re-execute CPUID at every call site.
  - For hardware errata, use X86_BUG_ + static_cpu_has_bug() so the
    alternatives mechanism patches the check out on unaffected CPUs.
  - Replace "#ifdef CONFIG_FOO ... #endif" gating function bodies with
    "if (!IS_ENABLED(CONFIG_FOO)) return ..." where the compiler can
    eliminate the dead branch.
  - Do not use boot_cpu_has(X86_FEATURE_FOO) as a proxy for "we are using
    feature FOO" -- it tells you the CPU has it, not that the kernel
    chose to use it.  Track the kernel's choice separately.

Evidence:
  - "Is there a reason to add the cpuid detection like this instead of
     adding an X86_FEATURE_ for it and using cpu_has() and friends?"
    https://lore.kernel.org/all/55439734.1070709@linux.intel.com/
  - "On a higher level, we really should stop using
     'boot_cpu_has(X86_FEATURE_XSAVES)' as a proxy for 'the kernel XSAVE
     buffer is in the XSAVES format'."
    https://lore.kernel.org/all/5723C13B.8080408@linux.intel.com/
  - "Borislav suggested using static_cpu_has_bug(), which will do the
     alternatives patching."
    https://lore.kernel.org/all/576052E0.3050408@linux.intel.com/
  - "what's wrong with: if (cpu_feature_enabled(X86_FEATURE_IBT) && ibt) { ..."
    https://lore.kernel.org/all/3a7e9ce4-03c6-cc28-017b-d00108459e94@linux.intel.com/
  - "FWIW, you should be able to use cpu_feature_enabled() instead of an
     explicit #ifdef here."
    https://lore.kernel.org/all/76caafd5-c85d-61bb-62ec-8056cd6d95ac@linux.intel.com/
  - "Instead of an #ifdef, is there a reason we can't just do:
     if (!IS_ENABLED(THP_SWAP)) return 0; ?"
    https://lore.kernel.org/all/92b86ab6-6f51-97b0-337c-b7e98a30b6cb@linux.intel.com/
  - "I'd rather this be: ALTERNATIVE \"jmp .Lparanoid_entry_no_fsgsbase\",
     \"nop\", X86_FEATURE_FSGSBASE ..."
    https://lore.kernel.org/all/fa579920-fe34-80fb-4244-5b840b99e72c@linux.intel.com/

================================================================================
RULE 5.  A new x86 user-visible feature ships with selftests.
================================================================================

Dave will not merge new x86 user-visible features without selftests, and he
will explicitly point at tools/testing/selftests/x86/ as the place they belong.
"Did you run it on hardware?" is also a recurring question; have an answer
ready.

What to check / fix before sending:
  - For any new feature that adds a prctl/syscall/sysctl/ioctl/sysfs/proc
    interface, a selftest exists in tools/testing/selftests/x86/ (or the
    relevant selftest directory).
  - The selftest covers both happy path and at least one failure-mode case
    (invalid arg, unsupported CPU, disabled by boot option).
  - The cover letter or changelog states which hardware the patch was tested
    on, including the negative case where the feature is absent.

Evidence:
  - "BTW, where are all the selftests for this code?"
    https://lore.kernel.org/all/a4dae7e3-58a7-5c38-1071-2deee758bb98@linux.intel.com/
  - "I don't think this can or should get merged before we have selftests."
    https://lore.kernel.org/all/a4dae7e3-58a7-5c38-1071-2deee758bb98@linux.intel.com/
  - "For MPX and protection keys, I added something to
     tools/testing/selftests/x86, for instance."
    https://lore.kernel.org/all/8b0edd2e-3e9b-1148-6309-38b61307a523@linux.intel.com/
  - "Any idea why that didn't show up in any of the x86 selftests?"
    https://lore.kernel.org/all/e8149c9e-10f8-aa74-ff0e-e2de923b2128@linux.intel.com/
  - "Is this a part of your selftests/ for this feature?"
    https://lore.kernel.org/all/67d8a813-b46a-d1da-3897-c38dd5b46b8e@linux.intel.com/

================================================================================
RULE 6.  Performance claims require numbers.  Performance patches require
         a microbenchmark that demonstrates the win.
================================================================================

When a changelog says "this is faster" or "this avoids overhead", Dave wants
to see measurements.  When a patch trades complexity for performance, the
question "how much faster?" must already be answered in the changelog.

What to check / fix before sending:
  - The changelog gives concrete before/after numbers (cycles, ns, ops/sec,
    syscalls/sec) on a named workload or microbenchmark, including the
    machine class.
  - For controversial changes (more text, more state), the win must be
    *concrete* -- not a hand-wave.
  - If the patch is "obviously a small improvement", say so and justify
    why measuring would not be informative; Dave has accepted this when
    the case is strong but he asks for sanity checks in borderline cases.

Evidence:
  - "I'd love to see at least a sanity check with a microbenchmark (or
     something) that, yes, this does help *something*."
    https://lore.kernel.org/all/8bb352bc-4e1f-4e87-80e3-a8e65d618d2a@linux.intel.com/
  - "I really don't think we should be adding these without having
     _concrete_ reasons for it."
    https://lore.kernel.org/all/51F6F087.9060109@linux.intel.com/
  - "Series looks good to me, and the performance numbers speak for themselves."
    https://lore.kernel.org/all/109fa3c1-b0ce-39ca-f0dd-4e38ec29cadb@linux.intel.com/
  - Dave's own model for how to present numbers (cite, do not imitate
     style verbatim, just match the rigor):
    https://lore.kernel.org/all/0d6ea030-ec3b-d649-bad7-89ff54094e25@linux.intel.com/
    https://lore.kernel.org/all/be2e683c-bf0a-e9ce-2f02-4905f6bd56d3@linux.intel.com/
    https://lore.kernel.org/all/8ea53aa4-34ee-15ab-c28c-04cf3c2e979b@linux.intel.com/

================================================================================
RULE 7.  No magic numbers.  Constants get named #defines or enums, and the
         name says what they mean.
================================================================================

What to check / fix before sending:
  - Every literal that is not 0, 1, -1, or an obvious power-of-two used
    locally has a named definition somewhere appropriate.
  - The literal is not "explained" by an adjacent comment when it could
    instead be a self-describing identifier.
  - If a constant matches a hardware quantity (XSAVE component size,
    reserved-field width, MSR layout), pull it from a header next to the
    definition of the relevant struct or feature, not from thin air.

Evidence:
  - "Also, why the magic number?"
    https://lore.kernel.org/all/fa579920-fe34-80fb-4244-5b840b99e72c@linux.intel.com/
  - "Shouldn't this come out of a macro somewhere rather than having to
     hard-code and spell out the convention?"
    https://lore.kernel.org/all/fa579920-fe34-80fb-4244-5b840b99e72c@linux.intel.com/
  - "I have the feeling that there's way too much stuff that hard-codes
     assumptions about XCR0 inside the kernel and out."
    https://lore.kernel.org/all/d8503ac3-b6a4-d6cc-3db7-4e092724b79d@linux.intel.com/
  - "Could we add a BUILD_BUG_ON(sizeof(hdr->reserved) != 48); That way,
     the next hapless kernel developer can't miss updating this."
    https://lore.kernel.org/all/f5387b72-0c2d-0603-b630-2aca25d16ccb@linux.intel.com/
  - "I would rather not just hard-code it in a way that we say one vendor
     has never and will never be affected."
    https://lore.kernel.org/all/4ecfab84-ae81-2e00-9004-8d0ccf863f76@linux.intel.com/

================================================================================
RULE 8.  Use BUILD_BUG_ON / WARN_ON_ONCE to lock in invariants that future
         code changes could silently break.
================================================================================

Dave repeatedly suggests adding compile-time or runtime guards so that a
later, unrelated change that violates an assumption is caught immediately
rather than producing a subtle bug.

What to check / fix before sending:
  - If the patch relies on a struct field having a specific size, layout,
    or value, add a BUILD_BUG_ON() that enforces it next to the code that
    depends on it.
  - If a function must be called only in a specific state (preempt off,
    a lock held, a feature bit set), add the corresponding
    lockdep_assert_*/might_sleep/WARN_ON_ONCE check.
  - Prefer WARN_ON_ONCE over WARN_ON for things expected to never fire,
    so a recurring bug does not flood dmesg.
  - BUG_ON() is rarely the right answer -- prefer WARN_ON_ONCE plus
    error recovery unless continuing really would corrupt the system.

Evidence:
  - "Could we add a BUILD_BUG_ON(sizeof(hdr->reserved) != 48); That way,
     the next hapless kernel developer can't miss updating this."
    https://lore.kernel.org/all/f5387b72-0c2d-0603-b630-2aca25d16ccb@linux.intel.com/
  - "Worst-case, do WARN_ON_ONCE(bitmap & ~PAGE_MASK);"
    https://lore.kernel.org/all/f97ce234-52fa-e666-2250-098925cf3c39@linux.intel.com/
  - "I think we should probably also have an _explicit_ FPU_WARN_ON() in
     the XRSTORS path for this."
    https://lore.kernel.org/all/573124CC.8010505@linux.intel.com/
  - "The point of this was to make this function do the right thing no
     matter what, but warn if it gets called in an unexpected way."
    https://lore.kernel.org/all/d7db2bfd-e251-606f-a42f-55c9ef1aca55@linux.intel.com/
  - "How did lockdep not blow up and scream about this?"
    https://lore.kernel.org/all/5583234E.5080807@linux.intel.com/

================================================================================
RULE 9.  Vendor-specific code goes in vendor-specific files.  Don't put
         Intel-only code in a common path, and don't pay the cost of it on
         non-Intel (or non-AMD) builds.
================================================================================

What to check / fix before sending:
  - Code that only runs on Intel CPUs lives under arch/x86/kernel/cpu/intel*
    (or is guarded by CONFIG_CPU_SUP_INTEL), and similarly for AMD/Hygon/etc.
  - Vendor checks use the established X86_VENDOR_* / x86_match_cpu() /
    INTEL_FAM6_* infrastructure rather than open-coded family/model
    comparisons sprinkled in drivers.

Evidence:
  - "Can we please stick this in one of the intel.c files, so it's only
     present on CPU_SUP_INTEL builds?"
    https://lore.kernel.org/all/57603DC0.9070607@linux.intel.com/
  - "Maybe we should be providing the vendor/family/model sets from a
     common place to the drivers, instead of making them all repeat it
     individually."
    https://lore.kernel.org/all/57FE64CA.6080902@linux.intel.com/
  - "INTEL_FAM6_XEON_PHI_KNL is just for the model and doesn't check family."
    https://lore.kernel.org/all/57FE64CA.6080902@linux.intel.com/

================================================================================
RULE 10. Names must say what the thing is.  No misleading abbreviations,
         no count-of-mystery-thing, no ambiguous prefixes.
================================================================================

Naming is a recurring review point.  If a name does not communicate the
meaning, Dave will ask for a rename even on otherwise-good patches.

What to check / fix before sending:
  - Function/variable/struct names answer "what is it / what does it do",
    not "how is it implemented".
  - A "count" or "size" name says count *of what* and in what unit
    (e.g. cluster_count -> cluster_swap_count, *_size -> *_bytes).
  - Boolean flags read positively where possible (enabled rather than
    !disabled) unless the negative is genuinely the natural form.
  - File-scoped statics and helpers introduced for a feature carry the
    feature's prefix so the namespace makes sense (e.g. phi_*, pti_*).

Evidence:
  - "cluster_swapcount() continues the horrific naming of cluster_couunt(),
     not saying what the count is *of*."
    https://lore.kernel.org/all/dd7b3dd7-9e10-4b9f-b931-915298bfd627@linux.intel.com/
  - "Please be careful with the naming."
    https://lore.kernel.org/all/3c3a4dce-980d-0405-d269-1da9e62b1344@linux.intel.com/
  - "Looking at the name, though, it's pretty clear that the naming can
     easily trip folks up."
    https://lore.kernel.org/all/57FE64CA.6080902@linux.intel.com/
  - "Could you please give it a sane name like 'phi_r3_mwait_disabled'?"
    https://lore.kernel.org/all/57FE7104.6070608@linux.intel.com/
  - "I think that means that something could use some better naming."
    https://lore.kernel.org/all/ef6012c0-52b9-14bb-c581-97627fdd5cb5@linux.intel.com/
  - "Could you elaborate on why the name is being changed?"
    https://lore.kernel.org/all/76caafd5-c85d-61bb-62ec-8056cd6d95ac@linux.intel.com/
  - "'clobbers' is the normal terminology for this, not 'scratches'."
    https://lore.kernel.org/all/fa579920-fe34-80fb-4244-5b840b99e72c@linux.intel.com/

================================================================================
RULE 11. Match Intel SDM terminology when describing hardware behavior.
================================================================================

For hardware-touching patches Dave wants kernel comments and changelogs to
use the same terminology as the Intel Software Developer's Manual, so that
readers can move between code and architecture documentation without a
translation layer.

What to check / fix before sending:
  - When the SDM has a specific term for a concept (XSTATE_BV, IA32_*,
    PTE_*), use it verbatim in comments and changelogs.
  - If the kernel already has its own term that diverges from the SDM,
    add a comment noting both names (kernel-name == SDM-name) at the
    point where they meet.

Evidence:
  - "I'd really like to see the nomenclature match the SDM where it's
     sensible..."
    https://lore.kernel.org/all/554904A6.8040503@linux.intel.com/
  - "I really wish those docs had reused the established SDM language
     instead of reinventing their own way of saying things."
    https://lore.kernel.org/all/46784af0-6fbb-522d-6acb-c6248e5e0e0d@linux.intel.com/
  - "This is one of those things where the SDM language does not match
     what we use in the kernel.  I say add comments to clarify what the
     SDM calls it vs..."
    https://lore.kernel.org/all/498c8824-9255-96be-71c2-3ebfa684a9d3@linux.intel.com/

================================================================================
RULE 12. printk() messages get a subsystem prefix via pr_fmt.  Prefer
         pr_warn/pr_info/pr_err over bare printk.
================================================================================

What to check / fix before sending:
  - The file uses #define pr_fmt(fmt) "x86/<subsystem>: " fmt or similar,
    so every pr_*() carries a recognizable prefix.
  - Replace any new bare printk() with the appropriate pr_* level.
  - Use pr_warn_once / pr_info_once for messages that should not flood
    the log on every event.

Evidence:
  - "We also probably want to prefix the pr_info() with something like
     'x86/intel:'."
    https://lore.kernel.org/all/57603CBE.7090702@linux.intel.com/
  - "Maybe it could be a pr_warn/err() instead of a full warning?"
    https://lore.kernel.org/all/6f75823d-ce8b-d483-6046-4efbe20d0410@linux.intel.com/
  - "A single-time printk could also go a long way to keeping folks from
     getting confused."
    https://lore.kernel.org/all/b8d952c5-2803-eea2-cd9a-20463a48075e@linux.intel.com/

================================================================================
RULE 13. New features need user-visible enable/disable status and an entry
         in Documentation/.
================================================================================

When a feature lands on x86, there must be a way for userspace and admins
to tell whether it is on and to read about it.

What to check / fix before sending:
  - The feature exposes its state somewhere (/proc, /sys, cpuinfo flag,
    boot log message) so a user can confirm it is active.
  - Documentation/ is updated in the same series (or, if the bulk lives
    elsewhere, the series adds a pointer).
  - If the patch introduces a new boot option or sysctl, the option is
    documented in admin-guide/kernel-parameters.txt or the appropriate
    Documentation/ subdirectory.

Evidence:
  - "Please don't forget to add ways to tell if this feature is on/off
     in /proc or whatever."
    https://lore.kernel.org/all/0f08d89e-61e1-20e3-5c59-0b2f7b32bf0c@linux.intel.com/
  - "We do need a Documentation/ update now."
    https://lore.kernel.org/all/dffd1146-5cc5-ae14-ec10-3df312703790@linux.intel.com/
  - "Neither of those are deal killers, but it would be nice to document it."
    https://lore.kernel.org/all/2c0502d0-20ef-44ac-db5b-7f651a70b978@linux.intel.com/

================================================================================
RULE 14. The cover letter (0/N) carries the high-level story; an individual
         patch's changelog must still stand on its own.
================================================================================

Dave reads cover letters and *also* reads each commit message standalone.
Information that lives only in the cover letter is lost forever once the
series is merged.

What to check / fix before sending:
  - The cover letter exists for any series of >1 patch and explains the
    motivation, the overall design, what was tested, and what is *not*
    addressed.
  - No commit's changelog says "see the cover letter for context".
  - Outstanding questions raised by the cover letter are resolved before
    requesting acks on individual patches.

Evidence:
  - "I think you need to resolve your outstanding issues (from your 0/4
     patch) before anyone can really ack these."
    https://lore.kernel.org/all/8665482b-f808-e995-cda1-99011be6ee34@linux.intel.com/
  - "I think (partially) duplicating that in a cover letter would also
     be nice."
    https://lore.kernel.org/all/c483a325-d914-222f-f7ab-b3c19c5ff77b@linux.intel.com/
  - "First of all, thanks for the excellent patch headers." (positive
     reinforcement of doing this well)
    https://lore.kernel.org/all/de510df6-7ea9-edc6-9c49-2f80f16472b4@linux.intel.com/

================================================================================
RULE 15. For TLB / page-table / FPU / XSAVE / shadow-stack code, enumerate
         the cases.  Dave reviews these areas heavily and asks about edge
         cases that are not obviously in scope.
================================================================================

This is the territory where Dave is most senior and most strict.  Patches
in these areas should anticipate the cross-cutting questions:

  - What happens on the CPU that lacks the feature?
  - What happens with paravirt / Xen / KVM guest?
  - What happens with 32-bit / compat?
  - What happens when the relevant CR* bit / feature bit is forced off?
  - Does this interact with PTI / KAISER?
  - Is the cost paid on every kernel entry, or only when the feature is in use?
  - Does this regress the case where the feature is *not* present?

What to check / fix before sending:
  - The changelog explicitly addresses the negative case ("on CPUs without
     X86_FEATURE_FOO this patch is a no-op because ...").
  - For TLB-flushing changes: state which ASIDs/PCIDs are affected, what
    the cost is for kernel-only vs user mappings, and whether the change
    is "single page" or "full flush".
  - For FPU/XSAVE changes: state the relationship to XCR0, XSAVES vs
    XSAVE format, and supervisor vs user state.

Evidence:
  - "TLB invalidation bugs are hard enough to find as it stands, but
     silently turning every 'little hammer' single flush into an implicit
     'big hammer' full flush shouldn't be something that we do lightly."
    https://lore.kernel.org/all/5eae565e-758d-f78e-bcde-16a1f278db7e@linux.intel.com/
  - "Before this patch set, on !X86_FEATURE_PGE systems, we cleared
     _PAGE_GLOBAL in virtually all places due to masking via
     __supported_pte_mask."
    https://lore.kernel.org/all/503339cf-7e54-0888-1767-c8ac87ce2130@linux.intel.com/
  - "With PCIDs, do we need to be a bit more explicit about what kind
     of %CR3 reload we have?"
    https://lore.kernel.org/all/187ad4e2-d72b-2267-df40-dfb178ab0e47@linux.intel.com/
  - "What about if there is supervisor state in place?"
    https://lore.kernel.org/all/11347d3a-8491-1545-d47d-a1cb2fb9b410@linux.intel.com/
  - "But what if the hardware you are migrating to/from *has* holes?"
    https://lore.kernel.org/all/d8503ac3-b6a4-d6cc-3db7-4e092724b79d@linux.intel.com/
  - "One of the reasons for doing the runtime-disable was to get rid of
     the !PARAVIRT dependency."
    https://lore.kernel.org/all/24359653-5b93-7146-8f65-ac38c3af0069@linux.intel.com/

================================================================================
RULE 16. Use existing kernel helpers rather than introducing parallel
         mechanisms.  Don't bypass an existing abstraction to save a few lines.
================================================================================

What to check / fix before sending:
  - Before adding a new helper, grep for existing equivalents
    (per-cpu helpers, x86_match_cpu, alternatives, IS_ENABLED, READ_ONCE,
    WARN_ON_ONCE, etc.).
  - If the patch duplicates logic that already exists nearby, the
    duplication is either eliminated, or the changelog explains why it
    cannot be.

Evidence:
  - "I really wish those docs had reused the established SDM language
     instead of reinventing their own way of saying things."
    https://lore.kernel.org/all/46784af0-6fbb-522d-6acb-c6248e5e0e0d@linux.intel.com/
  - "You might be able to reuse a bit like PageReadahead."
    https://lore.kernel.org/all/531F2ABA.6060804@linux.intel.com/
  - "Please use the macros in here for the model id..."
    https://lore.kernel.org/all/57603CBE.7090702@linux.intel.com/
  - "I know it's just two sites, but I'd much rather spend the space on
     a helper function than a copy-and-pasted comment."
    https://lore.kernel.org/all/e7047fd5-8bcf-42aa-9729-125bb7304f35@linux.intel.com/

================================================================================
RULE 17. Be ready to answer "why" before the question is asked.
================================================================================

Statistically, the most common single word in Dave's reviews is "why".  Any
non-obvious decision in the patch should be pre-emptively justified in either
the code comment (RULE 2) or the changelog (RULE 1) -- ideally both.

Specific "why"s to pre-empt:
  - Why this design and not the simpler/obvious one?
  - Why this magic number?
  - Why is this only on Intel / only on this family/model?
  - Why is the kernel doing this and not the hardware?
  - Why hasn't this caused problems before now?
  - Why is this not in a separate patch?
  - Why is this not behind an X86_FEATURE_* / CONFIG_*?
  - Why are we OK with the cost on the negative path?

Evidence (representative examples; this is the single most common pattern):
  - "Why isn't that being used on s390?"
    https://lore.kernel.org/all/05493150-5e4e-30bd-f772-0c6d88240030@linux.intel.com/
  - "Is there a reason this didn't cause problems up to now?"
    https://lore.kernel.org/all/4a56313b-1184-56d0-e269-30d5f2ffa706@linux.intel.com/
  - "Is there a reason we're not using pointers here?"
    https://lore.kernel.org/all/3a7e9ce4-03c6-cc28-017b-d00108459e94@linux.intel.com/
  - "Why are we OR'ing the results into this MSR?"
    https://lore.kernel.org/all/3a7e9ce4-03c6-cc28-017b-d00108459e94@linux.intel.com/
  - "Why is ADI tag manipulation needed?"
    https://lore.kernel.org/all/b31a3aef-26d9-6d3a-109b-c8453a3a2aef@linux.intel.com/
  - "Why does this even matter?"
    https://lore.kernel.org/all/de510df6-7ea9-edc6-9c49-2f80f16472b4@linux.intel.com/
  - "Why are we not supporting this now, and what would someone have to do
     in the future in order to enable it?"
    https://lore.kernel.org/all/56D62C1C.3050806@linux.intel.com/

================================================================================
RULE 18. CC the right people; send to x86@kernel.org for x86 patches.
================================================================================

What to check / fix before sending:
  - Patches touching arch/x86 are sent to x86@kernel.org and to the x86
    maintainers listed in MAINTAINERS, in addition to the relevant
    subsystem list.
  - The submitter ran scripts/get_maintainer.pl for the patch and
    included the resulting Cc list.
  - The submitter does not strip Cc lines that prior versions accumulated,
    unless someone has explicitly asked to be removed.

Evidence:
  - "And please send to x86@kernel.org."
    https://lore.kernel.org/all/68769b20-2be5-85b7-f21c-cc9094de547c@linux.intel.com/
  - "Borislav, if you don't want to be cc'd on these patches, then please
     speak up."
    https://lore.kernel.org/all/5733928B.7030909@linux.intel.com/

================================================================================
RULE 19. Don't forge tags.  An Acked-by from Dave looks like:
         "Acked-by: Dave Hansen <dave.hansen@linux.intel.com>"
         -- but only after he has said so on-list.
================================================================================

Dave's affirmative "you can add my tag" comes in a recognizable form, and
agents should not invent it.  The legitimate phrasings are documented here
so that an agent can recognize a real ack when re-sending a patch and not
manufacture a fake one.

Legitimate ack phrasings observed:
  - "Feel free to add my Acked-by."
    https://lore.kernel.org/all/68769b20-2be5-85b7-f21c-cc9094de547c@linux.intel.com/
  - "Feel free to add my acked-by on all 3 patches."
    https://lore.kernel.org/all/f5387b72-0c2d-0603-b630-2aca25d16ccb@linux.intel.com/
  - "Feel free to add my Reviewed-by or Acked-by."
    https://lore.kernel.org/all/07d101b3-d17a-7781-f05e-96738e6d6848@linux.intel.com/
  - "Looks fine to me." / "Looks good to me." -- this is *not* by itself
    an Acked-by; do not add a tag without "feel free to add" or an
    explicit Acked-by line.
    https://lore.kernel.org/all/579A35FF.6050606@linux.intel.com/
    https://lore.kernel.org/all/a67943d1-f7e8-c465-ce8d-a9c9a0a6f653@linux.intel.com/

What to check / fix before sending:
  - When carrying a tag forward from an earlier version, the tag is from a
    real message; the URL of that message is verifiable on lore.
  - An agent never adds an Acked-by/Reviewed-by/Tested-by tag for Dave on
    its own.  It may carry forward a tag he gave to a *substantially
    identical* patch, but if the patch changed meaningfully since the
    tag was given, the agent drops the tag and notes the change in a
    notes-section under the --- line.

================================================================================
RULE 20. Don't break userspace.  Be paranoid about ABI surface.
================================================================================

Dave is sensitive to anything that becomes ABI by accident -- the format of
/proc/cpuinfo flags, the bits in si_code/error_code, sysfs files, prctl()
return values, and so on.

What to check / fix before sending:
  - If the patch adds anything user-visible (/proc, /sys, cpuinfo string,
    new prctl/syscall/ioctl, new signal info field), it is called out
    explicitly in the changelog as "new ABI", and Documentation/ABI/ is
    updated.
  - If the patch changes the *format* of an existing user-visible field,
    that is treated as an ABI break and the changelog explains why
    userspace will not regress.
  - For new boot options or module parameters, they have sensible default
    values and the failure mode of "old userspace + new kernel" is
    described.

Evidence:
  - "For one, I think we fill in signal info with error_code, which makes
     it nominally part of the ABI."
    https://lore.kernel.org/all/6f75823d-ce8b-d483-6046-4efbe20d0410@linux.intel.com/
  - "There is existing userspace out there which depends on the existing
     prctl() setup."
    https://lore.kernel.org/all/54A64E6D.1020304@linux.intel.com/
  - "That's nice because it means we don't initially need a 'NEXT' ABI."
    https://lore.kernel.org/all/0f08d89e-61e1-20e3-5c59-0b2f7b32bf0c@linux.intel.com/
  - "I think it's good practice to just expose only the *minimal* amount
     of data necessary."
    https://lore.kernel.org/all/f90cef60-1710-1d99-61ee-4527067b115f@linux.intel.com/

================================================================================
Applying these rules
================================================================================

An agent reviewing a patch in Dave Hansen's territory (anything in
arch/x86/, mm/ for x86 page-table interactions, FPU/XSAVE, CET/SHSTK,
TLB/PTI/PCID, MSR/CPUID handling, and adjacent areas) should:

  1. Walk through rules 1..20 against the patch.
  2. For each rule that is violated, either fix it locally and amend the
     patch, or write a one-line reviewer note in the cover letter / the
     "---" section explaining why it does not apply.
  3. Prefer fixing over explaining: if the fix is small (rename, add
     comment, split, add WARN_ON_ONCE), just do it.
  4. The rules that produce the highest review-time savings (in roughly
     descending order, based on how often Dave personally raises them):
     RULE 1 (changelog why), RULE 17 (anticipate why), RULE 2 (code
     comments), RULE 4 (CPU-feature abstractions), RULE 3 (split
     patches), RULE 5 (selftests), RULE 10 (naming), RULE 7 (magic
     numbers), RULE 6 (perf numbers), RULE 8 (BUILD_BUG_ON/WARN_ON_ONCE).